From: Jarosław Rzeszótko
Newsgroups: gmane.emacs.bugs
Subject: bug#16220: url-http.el: Not conforming to HTTP spec
Date: Sun, 22 Dec 2013 22:55:07 +0100
To: 16220@debbugs.gnu.org

Hi,

To turn this into a concrete proposal: I suggest this part in url-http.el (starting line 356 in trunk):

;; End request
"\r\n"
;; Any data
url-http-data
;; If `url-http-data' is nil, avoid two CRLFs (Bug#8931).
(if url-http-data "\r\n")))

Should read simply:

;; End request
"\r\n"
;; Any data
url-http-data))

I believe that the person who originally introduced the additional newline was most likely confused by some issue in mediawiki.el itself, a broken HTTP server, a broken PHP script, or something of this kind, and obscured the real bug instead of fixing it:

http://bzr.savannah.gnu.org/lh/emacs/trunk/revision/100681
http://article.gmane.org/gmane.emacs.diffs/105629

I have confirmed that HTTPS connections work just fine after removing the newline, which is unsurprising, because this is what valid HTTP looks like. In fact, anyone can use https://posttestserver.com/ to check that the use case that motivated the original change works just fine after removing that additional newline as I suggest here:

(let ((url-request-method "POST")
      (url-request-data "action=login"))
  (url-retrieve-synchronously "https://posttestserver.com/post.php"))

Furthermore, url-http-attempt-keepalives should be nil by default, or better yet should be removed completely, as true keepalive connections are not currently supported on the Emacs side anyway, are they?

Cheers,
Jarosław Rzeszótko


2013/12/22 Jarosław Rzeszótko <sztywny@gmail.com>
Hi,

At the end of every HTTP request made with url-http.el that contains a body, an unnecessary "\r\n" is appended, and those two characters are additionally not counted in the Content-Length header. Normally this would not matter, because a carefully built server will only read "Content-Length" bytes of the body anyway and ignore the final CRLF. But Emacs also defaults to using Connection: keep-alive, so the TCP traffic for what was meant to be a single request is interpreted as two separate HTTP requests: the first being roughly the intended one, and the second consisting only of CRLF. In particular, I am using the HTTP server from net/http in the Go language. That keepalive is enabled by default is strange, especially given how the variable that controls it is described:

(defvar url-http-attempt-keepalives t
  "Whether to use a single TCP connection multiple times in HTTP.
This is only useful when debugging the HTTP subsystem.  Setting to
nil will explicitly close the connection to the server after every
request.")
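
Going by that docstring, a caller can presumably sidestep the problem for a single request by let-binding the variable to nil. The following is only a minimal sketch of that workaround: `my-url-retrieve-without-keepalive' is a hypothetical helper name, and the behaviour is assumed from the docstring quoted above rather than verified against every Emacs version.

```elisp
(require 'url)
(require 'url-http)

(defun my-url-retrieve-without-keepalive (url)
  "Retrieve URL synchronously with keepalive attempts disabled.
Hypothetical helper; returns the response buffer."
  ;; `url-http-attempt-keepalives' is a special variable defined in
  ;; url-http.el, so this dynamic binding is visible to url-http for
  ;; the duration of the synchronous call.
  (let ((url-http-attempt-keepalives nil))
    (url-retrieve-synchronously url)))
```

With the asynchronous `url-retrieve' the binding may no longer be in effect by the time the request is actually sent, so this sketch only covers the synchronous case.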

Those issues have been somewhat discussed here, but it seems the people discussing unfortunately don't understand HTTP:

https://groups.google.com/forum/#!msg/gnu.emacs.bug/SF4P7gVI6IQ/SExtWzutKI4J

Please just compare this discussion to http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html

If you don't go into any fancy things like chunked encoding etc., an HTTP request is a:

- sequence of headers, each header followed with a newline
- a newline terminating the sequence of headers
- an optional request body

The request body will be read by the server only if one of the headers is a Content-Length header, and the value of that header is exactly the number of bytes being sent; there is no CRLF terminating the body.  So if there is no body, the request ends with two CRLFs; if there is a body, it just ends with [Content-Length] bytes of raw data. There is no possibility that a proper HTTP server could be confused by a request terminated with two CRLFs, if the request is otherwise correct. I think there must have been some confusion as to the cause of the original problem, which was then turned into this "fix".
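
The framing rule described above can be sketched in a few lines of Emacs Lisp. This is an illustration only, not the code in url-http.el: `my-http-build-request' and `my-http-split-first-request' are hypothetical helper names, example.org is a placeholder host, the strings are assumed to be unibyte/ASCII so that characters and bytes coincide, and chunked encoding is ignored.

```elisp
;; Illustrative sketch; helper names are hypothetical, not part of url-http.el.

(defun my-http-build-request (method path host body)
  "Serialize a minimal HTTP/1.1 request for METHOD, PATH and HOST.
BODY is a unibyte string or nil.  No CRLF is appended after BODY."
  (concat method " " path " HTTP/1.1\r\n"
          "Host: " host "\r\n"
          (if body
              (format "Content-Length: %d\r\n" (string-bytes body))
            "")
          ;; Blank line terminating the headers.
          "\r\n"
          ;; The body: exactly Content-Length bytes, nothing after it.
          (or body "")))

(defun my-http-split-first-request (stream)
  "Split STREAM the way a Content-Length based reader would.
Return a cons (REQUEST . LEFTOVER), where LEFTOVER is whatever follows
the first request on the same connection."
  (let* ((headers-end (or (string-match "\r\n\r\n" stream)
                          (error "Incomplete headers")))
         (body-start (+ headers-end 4))
         (headers (substring stream 0 headers-end))
         (case-fold-search t)
         (body-length
          (if (string-match "^Content-Length: *\\([0-9]+\\)" headers)
              (string-to-number (match-string 1 headers))
            0))
         (request-end (+ body-start body-length)))
    (cons (substring stream 0 request-end)
          (substring stream request-end))))

;; A well-formed request leaves nothing behind on the connection,
;; while an extra CRLF after the body is left over for the next read.
(let* ((good (my-http-build-request "POST" "/post.php" "example.org"
                                    "action=login"))
       (bad (concat good "\r\n")))
  (list (cdr (my-http-split-first-request good))    ; ""
        (cdr (my-http-split-first-request bad))))   ; "\r\n"
```

On a keepalive connection, that leftover "\r\n" is what the server sees where it expects the start of the next request.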

For reference, my code is roughly this, and as mentioned I am using net/http from the Go language on the other end:

(defun server-command (command)
  (let* ((url-request-method "POST")
         (url-request-extra-headers '(("Content-Type" . "application/x-www-form-urlencoded")))
         (url-request-data (concat (url-hexify-string "command") "=" (url-hexify-string (json-encode command))))
         (result-buffer (url-retrieve-synchronously server-url))
         (result
          (with-current-buffer result-buffer
            (goto-char (point-min))
            (delete-region (point-min) (search-forward "\n\n"))
            (buffer-string))))
    (kill-buffer result-buffer)
    result))

Normally I get from this function the contents of the first, "correct" response body from the server, but if I run it a few times in quick succession I additionally get the string "HTTP/1.1 400 Bad Request" at the end, which is actually the second HTTP response showing up at random in the buffer (although it is consistently sent by the server every time, as I can see in a sniffer).
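
Independently of the fix on the sending side, a caller could guard against such stray trailing data by cutting the body at Content-Length instead of taking the rest of the buffer. A rough sketch follows, working on a plain string for simplicity: `my-http-first-response-body' is a hypothetical name, the input is assumed to be unibyte/ASCII, and this is not how url.el itself parses responses.

```elisp
(defun my-http-first-response-body (raw)
  "Return the body of the first HTTP response found in the string RAW.
Hypothetical helper.  If a Content-Length header is present, the body
is cut at that many characters, so anything the server sent afterwards
on the same connection is dropped.  RAW is assumed to be unibyte."
  (unless (string-match "\r?\n\r?\n" raw)
    (error "No end of headers found"))
  (let* ((headers (substring raw 0 (match-beginning 0)))
         (body (substring raw (match-end 0)))
         (case-fold-search t))
    (if (string-match "^Content-Length: *\\([0-9]+\\)" headers)
        (substring body 0 (min (length body)
                               (string-to-number (match-string 1 headers))))
      ;; Without Content-Length there is nothing to cut at.
      body)))

;; The trailing "400 Bad Request" response is not part of the result.
(my-http-first-response-body
 (concat "HTTP/1.1 200 OK\nContent-Length: 2\n\nok"
         "HTTP/1.1 400 Bad Request\n\n"))
;; => "ok"
```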

Cheers,
Jarosław Rzeszótko
