unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#13598: 24.3.50; url-http.el doesn't correctly parse headers when they are sent line-by-line
@ 2013-01-31 17:26 Jonas Hoersch
From: Jonas Hoersch @ 2013-01-31 17:26 UTC
  To: 13598

hej, everyone,

i just finished hunting down an improbable bug in url-http.el. It
appears when a url is retrieved from a server that sends the headers
line-by-line instead of in one chunk, as is the case for the
BaseHTTPServer classes that ship with python 2.

A simple test-case looks like the following (sorry for the long
non-emacs setup stuff, but it's the most minimal example i could come
up with):

cd into a directory containing only a single small text file and
start python's SimpleHTTPServer so that it serves the file.

$ cd $(mktemp -d)
$ echo "hello world" > textfile
$ python -m SimpleHTTPServer 8000  # works only for python 2.x
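
For anyone without python 2 at hand, the line-by-line behaviour can
probably be imitated with a raw socket server. The following is only a
sketch under assumptions: the names start_server, serve_once and
serve_forever, the "LineByLine/0.1" server string and the 50 ms pause
between lines are made up for illustration and are not taken from
SimpleHTTPServer. It ignores the request and always answers with the
same 12-byte body. Whether the client really sees each line as a
separate chunk depends on timing, so the pause only makes that likely.

```python
import socket
import time

# Same content as the "textfile" created above (12 bytes).
BODY = b"hello world\n"

# Each entry is written to the socket separately, header line by header line.
# The Server string is hypothetical.
HEADER_LINES = [
    b"HTTP/1.0 200 OK\r\n",
    b"Server: LineByLine/0.1\r\n",
    b"Content-Type: text/plain\r\n",
    b"Content-Length: " + str(len(BODY)).encode("ascii") + b"\r\n",
    b"\r\n",
]


def start_server(port=8000):
    """Return a listening socket bound to 127.0.0.1 (port 0 picks a free port)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", port))
    listener.listen(1)
    return listener


def serve_once(listener, delay=0.05):
    """Answer one connection, pausing after every header line."""
    conn, _ = listener.accept()
    with conn:
        # Disable Nagle's algorithm so small writes are not coalesced.
        conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        conn.recv(65536)  # read the request and ignore it
        for line in HEADER_LINES:
            conn.sendall(line)
            time.sleep(delay)
        conn.sendall(BODY)


def serve_forever(port=8000):
    """Keep answering requests until interrupted."""
    listener = start_server(port)
    while True:
        serve_once(listener)
```

Calling serve_forever() from a python 3 prompt should then stand in for
the SimpleHTTPServer command above, with the same url.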

(switch-to-buffer (url-retrieve-synchronously
"http://127.0.0.1:8000/textfile"))

This correctly retrieves the "hello world", but the buffer-local
variables url-http-content-type and url-http-content-length are nil in
the returned buffer, although one can see that python has transmitted
them.

adding an extra debug line to url-http's
url-http-wait-for-headers-change-function around line 1043,

------
	(when (re-search-forward "^\r*$" nil t)
	  ;; Saw the end of the headers
	  (url-http-debug "Saw end of headers... (%s)" (buffer-name))          
+         (url-http-debug "when the buffer contained...\n%s" (buffer-substring (point-min) (point-max)))

	  (setq url-http-end-of-headers (set-marker (make-marker)
						    (point))
		end-of-headers t)
-------

will show you in *URL-DEBUG* (url-debug being t)

-------
http -> Saw end of headers... ( *http 127.0.0.1:8000*-273882)
http -> when the buffer contained...
HTTP/1.0 200 OK
Server: SimpleHTTP/0.6 Python/2.7.3

http -> url-http-parse-response called in ( *http 127.0.0.1:8000*-273882)
http -> No content-length, being dumb.
-------

that is, the headers haven't completely arrived yet when url-http
decides it has seen the end of them.

changing the regex in (re-search-forward "^\r*$" nil t) to "^\r*\n"
solves the problem for me, but i'm unsure about what i might possibly be
breaking that way.
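
The difference between the two regexes can be illustrated outside of
emacs. The sketch below uses python's re module with re.MULTILINE as a
stand-in for the emacs regex engine, which is an assumption: the two
engines are not identical, but in both "$" also matches at the very end
of the text, so "^\r*$" matches the empty line that follows the last
header received so far. PARTIAL is the buffer content from the debug
output above; the extra header values in COMPLETE and the helper name
saw_end_of_headers are hypothetical.

```python
import re

# What the buffer held when url-http declared the headers finished.
PARTIAL = "HTTP/1.0 200 OK\r\nServer: SimpleHTTP/0.6 Python/2.7.3\r\n"

# A complete response; these two header values are assumed for illustration.
COMPLETE = (
    PARTIAL
    + "Content-type: text/plain\r\n"
    + "Content-Length: 12\r\n"
    + "\r\n"
    + "hello world\n"
)

# The current check: an empty line, where "$" may also be the end of the text.
OLD = re.compile(r"^\r*$", re.MULTILINE)

# The proposed check: the empty line must be terminated by a real newline.
NEW = re.compile(r"^\r*\n", re.MULTILINE)


def saw_end_of_headers(pattern, buffer_text):
    """Return True if PATTERN finds an end-of-headers line in BUFFER_TEXT."""
    return pattern.search(buffer_text) is not None


for name, pattern in (("old", OLD), ("new", NEW)):
    print(name, "partial:", saw_end_of_headers(pattern, PARTIAL),
          "complete:", saw_end_of_headers(pattern, COMPLETE))
```

With the old pattern the partial buffer already counts as finished,
while the new pattern waits until the blank line itself has arrived.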

thanks for looking into it,

jonas hörsch


In GNU Emacs 24.3.50.1 (x86_64-unknown-linux-gnu, X toolkit, Xaw3d scroll bars)
 of 2013-01-29 on kafka
Bzr revision: michael.albinus@gmx.de-20130129081211-mmthn9p4bh75h5pr
Windowing system distributor `The X.Org Foundation', version 11.0.11302000
Configured using:
 `configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var
 --libexecdir=/usr/lib --mandir=/usr/share/man --without-sound
 --with-xft --with-x-toolkit=lucid'

Important settings:
  value of $LANG: en_GB.UTF-8
  locale-coding-system: utf-8-unix
  default enable-multibyte-characters: t





Thread overview: 12+ messages
2013-01-31 17:26 bug#13598: 24.3.50; url-http.el doesn't correctly parse headers when they are sent line-by-line Jonas Hoersch
2013-02-07 18:13 ` Jonas Hörsch
2013-02-13 17:19   ` Bastien
2013-02-13 19:30     ` Stefan Monnier
2013-02-13 19:42     ` Glenn Morris
2013-02-13 21:38       ` Glenn Morris
2013-02-14  6:08         ` Bastien
2013-02-16  2:06         ` Glenn Morris
2014-02-26  9:32 ` bug#13598: 24.3.50; Blazej Adamczyk
2014-02-26 16:54 ` bug#13598: 24.3.50 Blazej Adamczyk
2014-02-27 22:43   ` Glenn Morris
2014-03-03  6:10     ` Blazej Adamczyk
