all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: "Jarosław Rzeszótko" <sztywny@gmail.com>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: 16220@debbugs.gnu.org
Subject: bug#16220: url-http.el: Not conforming to HTTP spec
Date: Fri, 3 Jan 2014 19:06:22 +0100	[thread overview]
Message-ID: <CAHnYAZYxYWm2BkvH7he6GKGX24psQf340uOdt0dJD41JKuF0mg@mail.gmail.com> (raw)
In-Reply-To: <jwvmwjf2kx0.fsf-monnier+emacsbugs@gnu.org>

[-- Attachment #1: Type: text/plain, Size: 2487 bytes --]

I attach a patch that removes the extra "\r\n", adds a function to
merge two alists, and uses it to merge the extra-headers on top of the
default-headers. I added a few basic tests, too.

Cheers,
Jarosław Rzeszótko

2014/1/2 Stefan Monnier <monnier@iro.umontreal.ca>:
>> The problem is that url-http.el sets a lot of headers by default that
>> can not be overwritten in any other way then dynamically overshadowing
>> some variables.
>
> Indeed, this is ugly.  Improvements welcome.
>
>> For example, all connections are keep-alive by
>> default, which is confusing in itself already,
>
> Not sure why it should be a problem.
>
>> (let ((url-request-method "GET")
>>       (url-request-extra-headers '(("Connection" . "close"))))
>>   (url-retrieve-synchronously "http://www.google.com/"))
>> And what is sent is this:
>> GET / HTTP/1.1
>> Connection: keep-alive
>> ...
>> Connection: close
>
>> Which again isn't valid HTTP and the behaviour of the HTTP server in
>> this case is undefined and implementation specific. The only way to
>> workaround this is doing this:
>
>> (let ((url-http-attempt-keepalives nil)
>>       (url-request-method "GET")
>>       (url-request-extra-headers '(("Connection" . "close"))))
>>   (url-retrieve-synchronously "http://www.google.com/"))
>
> Yuck!  We can probably fix this fairly easily by letting
> url-request-extra-headers override (rather than just add to)
> other headers.
>
>> This is all the more irritating so many of the headers are set by
>> default using the variables url-vars.el. Why those things are at all
>> variables is a mystery to me.
>
> Probably partly historical evolution (there was no place to add new
> "parameters", so adding dynamic vars was an easy way to add more control
> without breaking existing code).
>
>> In the end it is much easier to do HTTP requests manually using
>> make-network-process then it is with the url library,
>
> I think that's misleading: the URL library is supposed to deal with
> things like proxies and redirections, which "manual requests via
> make-network-process" probably won't handle.
>
>> Didn't anyone else run into problems with it?
>
> Apparently not yet.  But I agree that the API might deserve a redesign
> (IIRC another problem is in the way headers in the answer are returned
> to the caller, which does not work consistently across different kinds
> of URLs (ftp, http, file, imap, ...)).
>
>
>         Stefan

[-- Attachment #2: url.patch --]
[-- Type: text/x-patch, Size: 15212 bytes --]

=== modified file 'lisp/subr.el'
--- old/lisp/subr.el	2014-01-01 07:43:34 +0000
+++ new/lisp/subr.el	2014-01-03 17:57:14 +0000
@@ -464,6 +464,30 @@
 	    (aset tree i (copy-tree (aref tree i) vecp)))
 	  tree)
       tree)))
+
+(defun merge-alists (alis1 alis2 keycmp)
+  "Merges alists ALIS1 and ALIS2, non-destructively, returning
+  another alist, containing the keys from both ALIS1 and ALIS2,
+  with values from ALIS2 taking precedence. For efficiency, a
+  comparision function KEYCMP have to be supplied, the lists will
+  be sorted and then merged by having two pointers traverse the
+  two lists simulateneously."
+  (let ((alist1 (sort (copy-list alis1) keycmp))
+        (alist2 (sort (copy-list alis2) keycmp))
+        (res (list)))
+    (while (and alist1 alist2)
+      (cond ((equal (car (car alist1)) (car (car alist2)))
+             (push (car alist1) res)
+             (setq alist1 (cdr alist1))
+             (setq alist2 (cdr alist2)))
+            ((funcall keycmp (car alist1) (car alist2))
+             (push (car alist1) res)
+             (setq alist1 (cdr alist1)))
+            (t
+             (push (car alist2) res)
+             (setq alist2 (cdr alist2)))))
+    (nconc res alist1 alist2)))
+
 \f
 ;;;; Various list-search functions.
 

=== modified file 'lisp/url/url-cookie.el'
--- old/lisp/url/url-cookie.el	2014-01-01 07:43:34 +0000
+++ new/lisp/url/url-cookie.el	2014-01-03 17:50:08 +0000
@@ -208,6 +208,7 @@
 		     (if retval
 			 (concat retval "; " chunk)
 		       (concat "Cookie: " chunk)))))
+    (message (prin1-to-string retval))
     (if retval
 	(concat retval "\r\n")
       "")))

=== modified file 'lisp/url/url-http.el'
--- old/lisp/url/url-http.el	2014-01-01 07:43:34 +0000
+++ new/lisp/url/url-http.el	2014-01-03 18:04:03 +0000
@@ -209,20 +209,23 @@
 	(url-http-mark-connection-as-busy host port connection))))
 
 ;; Building an HTTP request
-(defun url-http-user-agent-string ()
+(defun url-http-user-agent ()
   (if (or (eq url-privacy-level 'paranoid)
 	  (and (listp url-privacy-level)
 	       (memq 'agent url-privacy-level)))
-      ""
-    (format "User-Agent: %sURL/%s\r\n"
-	    (if url-package-name
-		(concat url-package-name "/" url-package-version " ")
-	      "")
-	    url-version)))
+      '()
+    `(("User-agent" .
+       ,(format "%sURL/%s"
+                (if url-package-name
+                    (concat url-package-name "/" url-package-version " ")
+                  "")
+                url-version)))))
 
 (defun url-http-create-request (&optional ref-url)
   "Create an HTTP request for `url-http-target-url', referred to by REF-URL."
-  (let* ((extra-headers)
+  (let* ((default-headers)
+         (extra-headers)
+         (headers-string)
 	 (request nil)
 	 (no-cache (cdr-safe (assoc "Pragma" url-http-extra-headers)))
 	 (using-proxy url-http-proxy)
@@ -235,19 +238,13 @@
 			 (url-get-authentication url-http-proxy nil 'any nil))))
 	 (real-fname (url-filename url-http-target-url))
 	 (host (url-host url-http-target-url))
-	 (auth (if (cdr-safe (assoc "Authorization" url-http-extra-headers))
-		   nil
-		 (url-get-authentication (or
-					  (and (boundp 'proxy-info)
-					       proxy-info)
-					  url-http-target-url) nil 'any nil))))
+	 (auth (url-get-authentication (or
+                                        (and (boundp 'proxy-info)
+                                             proxy-info)
+                                        url-http-target-url) nil 'any nil)))
     (if (equal "" real-fname)
 	(setq real-fname "/"))
     (setq no-cache (and no-cache (string-match "no-cache" no-cache)))
-    (if auth
-	(setq auth (concat "Authorization: " auth "\r\n")))
-    (if proxy-auth
-	(setq proxy-auth (concat "Proxy-Authorization: " proxy-auth "\r\n")))
 
     ;; Protection against stupid values in the referrer
     (if (and ref-url (stringp ref-url) (or (string= ref-url "file:nil")
@@ -260,14 +257,56 @@
 		 (memq 'lastloc url-privacy-level)))
 	(setq ref-url nil))
 
-    ;; url-http-extra-headers contains an assoc-list of
-    ;; header/value pairs that we need to put into the request.
-    (setq extra-headers (mapconcat
-			 (lambda (x)
-			   (concat (car x) ": " (cdr x)))
-			 url-http-extra-headers "\r\n"))
-    (if (not (equal extra-headers ""))
-	(setq extra-headers (concat extra-headers "\r\n")))
+    ;; Default-headers and url-http-extra-headers are both alists of
+    ;; header/value pairs
+    (setq default-headers
+          `(("Connection" . ,(if (or using-proxy
+                                     (not url-http-attempt-keepalives))
+                                 "close" "keep-alive"))
+            ("Host" . ,(if (/= (url-port url-http-target-url)
+                               (url-scheme-get-property
+                                (url-type url-http-target-url) 'default-port))
+                           (format "%s:%d" host (url-port url-http-target-url))
+                         host))
+            ("MIME-Version" . "1.0")
+            ("Accept" . ,(or url-mime-accept-string "*/*"))            
+            ,@(when (and (not no-cache)
+                         (member url-http-method '("GET" nil)))
+                (let ((tm (url-is-cached url-http-target-url)))
+                  (if tm
+                      `(("If-modified-since" . ,(url-get-normalized-date tm))))))
+            ,@(when ref-url
+                `(("Referer" . ,ref-url)))
+            ,@(when url-personal-mail-address
+                `(("From" . ,url-personal-mail-address)))
+            ,@(when url-mime-encoding-string
+                `(("Accept-encoding" . ,url-mime-encoding-string)))
+            ,@(when url-mime-charset-string
+                `(("Accept-charset" . ,url-mime-charset-string)))
+            ,@(when url-mime-language-string
+                `(("Accept-language" . ,url-mime-language-string)))
+            ,@(when auth
+                `(("Authorization" . ,auth)))
+            ,@(when proxy-auth
+                `(("Proxy-Authorization" . ,proxy-auth)))
+            ,@(when url-http-data
+                `(("Content-length" . ,(number-to-string (length url-http-data)))))
+            ,@(when url-extensions-header
+                `(("Extension" . ,url-extensions-header)))
+            ,@(url-http-user-agent)))
+
+    ;;; url-http-extra-headers are merged on top default-headers, any
+    ;;; headers specified in both will be sent as per value in
+    ;;; url-http-extra-headers
+    (setq headers-string
+          (concat
+           (mapconcat
+            (lambda (x)
+              (concat (car x) ": " (cdr x)))
+            (merge-alists default-headers url-http-extra-headers
+                          (lambda (a b) (string-lessp (car a) (car b))))
+            "\r\n")
+           "\r\n"))
 
     ;; This was done with a call to `format'.  Concatenating parts has
     ;; the advantage of keeping the parts of each header together and
@@ -287,78 +326,21 @@
            'string-as-unibyte
            (delq nil
             (list
-             ;; The request
+             ;; The request line
              (or url-http-method "GET") " "
              (if using-proxy (url-recreate-url url-http-target-url) real-fname)
              " HTTP/" url-http-version "\r\n"
-             ;; Version of MIME we speak
-             "MIME-Version: 1.0\r\n"
-             ;; (maybe) Try to keep the connection open
-             "Connection: " (if (or using-proxy
-                                    (not url-http-attempt-keepalives))
-                                "close" "keep-alive") "\r\n"
-                                ;; HTTP extensions we support
-             (if url-extensions-header
-                 (format
-                  "Extension: %s\r\n" url-extensions-header))
-             ;; Who we want to talk to
-             (if (/= (url-port url-http-target-url)
-                     (url-scheme-get-property
-                      (url-type url-http-target-url) 'default-port))
-                 (format
-                  "Host: %s:%d\r\n" host (url-port url-http-target-url))
-               (format "Host: %s\r\n" host))
-             ;; Who its from
-             (if url-personal-mail-address
-                 (concat
-                  "From: " url-personal-mail-address "\r\n"))
-             ;; Encodings we understand
-             (if url-mime-encoding-string
-                 (concat
-                  "Accept-encoding: " url-mime-encoding-string "\r\n"))
-             (if url-mime-charset-string
-                 (concat
-                  "Accept-charset: " url-mime-charset-string "\r\n"))
-             ;; Languages we understand
-             (if url-mime-language-string
-                 (concat
-                  "Accept-language: " url-mime-language-string "\r\n"))
-             ;; Types we understand
-             "Accept: " (or url-mime-accept-string "*/*") "\r\n"
-             ;; User agent
-             (url-http-user-agent-string)
-             ;; Proxy Authorization
-             proxy-auth
-             ;; Authorization
-             auth
+             ;; Headers
+             headers-string
              ;; Cookies
-	     (when (url-use-cookies url-http-target-url)
-	       (url-cookie-generate-header-lines
-		host real-fname
-		(equal "https" (url-type url-http-target-url))))
-             ;; If-modified-since
-             (if (and (not no-cache)
-                      (member url-http-method '("GET" nil)))
-                 (let ((tm (url-is-cached url-http-target-url)))
-                   (if tm
-                       (concat "If-modified-since: "
-                               (url-get-normalized-date tm) "\r\n"))))
-             ;; Whence we came
-             (if ref-url (concat
-                          "Referer: " ref-url "\r\n"))
-             extra-headers
-             ;; Length of data
-             (if url-http-data
-                 (concat
-                  "Content-length: " (number-to-string
-                                      (length url-http-data))
-                  "\r\n"))
-             ;; End request
+             (when (url-use-cookies url-http-target-url)
+               (url-cookie-generate-header-lines
+                host real-fname
+                (equal "https" (url-type url-http-target-url))))
+             ;; End of headers
              "\r\n"
-             ;; Any data
-             url-http-data
-	     ;; If `url-http-data' is nil, avoid two CRLFs (Bug#8931).
-	     (if url-http-data "\r\n")))
+             ;; Data
+             url-http-data))
            ""))
     (url-http-debug "Request is: \n%s" request)
     request))

=== added file 'test/automated/url-http-tests.el'
--- old/test/automated/url-http-tests.el	1970-01-01 00:00:00 +0000
+++ new/test/automated/url-http-tests.el	2014-01-03 18:06:21 +0000
@@ -0,0 +1,100 @@
+;;; url-http.el --- Test suite for url-http.
+
+;; Copyright (C) 2011-2014 Free Software Foundation, Inc.
+
+;; Author: Jarosław Rzeszótko <jrzeszotko@gmail.com>
+;; Keywords: data
+
+;; This file is part of GNU Emacs.
+
+;; GNU Emacs is free software: you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; GNU Emacs is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GNU Emacs.  If not, see <http://www.gnu.org/licenses/>.
+
+;;; Code:
+
+(require 'ert)
+(require 'url-future)
+
+(ert-deftest url-http-create-request/creates-valid-http-get-request ()
+  (let* ((url-http-extra-headers)
+         (url-http-proxy nil)      
+         (url-http-method "GET")
+         (url-http-target-url (url-generic-parse-url "http://www.gnu.org/"))
+         (url-http-data)
+         (url-package-name "XYZ")
+         (url-package-version "2.0"))
+    (with-temp-buffer
+      (insert (url-http-create-request))
+      (goto-char (point-min))
+      (should (looking-at "GET / HTTP/1.1\r\n"))
+      (should (search-forward "Accept: */*\r\n"))
+      (should (search-forward "Host: www.gnu.org\r\n"))
+      (should (search-forward "User-agent: XYZ/2.0 URL/Emacs\r\n")))))
+
+(ert-deftest url-http-create-request/sends-singleline-http-cookies ()
+  (let* ((url-http-extra-headers)
+         (url-http-proxy nil)      
+         (url-http-method "GET")
+         (url-http-target-url (url-generic-parse-url "http://www.url-http-test-host.com/"))
+         (url-http-data)
+         (url-cookie-multiple-line nil))
+    (setf (url-use-cookies url-http-target-url) t)
+    (setq url-cookie-storage nil)
+    (unwind-protect
+        (progn
+          (url-cookie-store "test1" "testvalue1testvalue1testvalue1testvalue1testvalue1" nil "www.url-http-test-host.com" "/")
+          (url-cookie-store "test2" "testvalue2" nil "www.url-http-test-host.com" "/")
+          (with-temp-buffer
+            (insert (url-http-create-request))
+            (goto-char (point-min))
+            (should (search-forward "Cookie: test1=testvalue1testvalue1testvalue1testvalue1testvalue1; test2=testvalue2\r\n"))))
+      (setq url-cookie-storage nil))))
+
+(ert-deftest url-http-create-request/sends-multiline-http-cookies ()
+  (let* ((url-http-extra-headers)
+         (url-http-proxy nil)      
+         (url-http-method "GET")
+         (url-http-target-url (url-generic-parse-url "http://www.url-http-test-host.com/"))
+         (url-http-data)
+         (url-cookie-multiple-line t)
+         (cookie-value))
+    (setf (url-use-cookies url-http-target-url) t)
+    (setq url-cookie-storage nil)
+    (unwind-protect
+        (progn
+          (url-cookie-store "test1" "testvalue1testvalue1testvalue1testvalue1testvalue1" nil "www.url-http-test-host.com" "/")
+          (url-cookie-store "test2" "testvalue2" nil "www.url-http-test-host.com" "/")
+          (with-temp-buffer
+            (insert (url-http-create-request))
+            (goto-char (point-min))
+            (should (search-forward "Cookie: test1=testvalue1testvalue1testvalue1testvalue1testvalue1\r\n"))
+            (should (search-forward "Cookie: test2=testvalue2\r\n"))))
+      (setq url-cookie-storage nil))))
+
+(ert-deftest url-http-create-request/creates-valid-http-post-request ()
+  (let* ((url-http-extra-headers)
+         (url-http-proxy nil)      
+         (url-http-method "POST")
+         (url-http-target-url (url-generic-parse-url "http://www.gnu.org/"))
+         (url-http-data "test"))
+    (with-temp-buffer
+      (insert (url-http-create-request))
+      (goto-char (point-min))
+      (should (looking-at "POST / HTTP/1.1\r\n"))
+      (should (search-forward "Accept: */*\r\n"))
+      (should (search-forward "Content-length: 4\r\n"))
+      (should (search-forward "Host: www.gnu.org\r\n"))
+      (goto-char (point-min))
+      (should (search-forward "\r\n\r\n"))
+      (should (search-forward "test"))
+      (should (equal (point) (point-max))))))


  reply	other threads:[~2014-01-03 18:06 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-12-22 20:52 bug#16220: url-http.el: Not conforming to HTTP spec Jarosław Rzeszótko
2013-12-22 21:55 ` Jarosław Rzeszótko
2013-12-22 22:33   ` Ted Zlatanov
2013-12-23  6:51     ` Jarosław Rzeszótko
2013-12-23 13:08       ` Ted Zlatanov
2013-12-24  7:50   ` Lars Ingebrigtsen
2013-12-24  8:28     ` Jarosław Rzeszótko
2013-12-24 14:43 ` Stefan Monnier
2013-12-24 16:31   ` Jarosław Rzeszótko
2014-01-02  2:21     ` Stefan Monnier
2014-01-03 18:06       ` Jarosław Rzeszótko [this message]
2014-01-05  9:57       ` Lars Magne Ingebrigtsen
2014-01-05 13:25         ` Jarosław Rzeszótko
2014-01-06 15:06         ` Stefan Monnier
2014-01-07 19:30           ` Jarosław Rzeszótko
2014-01-08 18:29             ` Stefan Monnier
2014-01-18 22:35 ` Paul Eggert

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAHnYAZYxYWm2BkvH7he6GKGX24psQf340uOdt0dJD41JKuF0mg@mail.gmail.com \
    --to=sztywny@gmail.com \
    --cc=16220@debbugs.gnu.org \
    --cc=monnier@iro.umontreal.ca \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.