From mboxrd@z Thu Jan 1 00:00:00 1970
From: Chong Yidong
Newsgroups: gmane.emacs.bugs
Subject: bug#7017: url-retrieve seems busted
Date: Tue, 08 May 2012 12:52:01 +0800
Message-ID: <87obpz9swe.fsf@gnu.org>
References: <87zk9jacda.fsf@edgecast.com>
In-Reply-To: <87zk9jacda.fsf@edgecast.com> (Seth Mason's message of "Mon, 07 May 2012 14:51:29 -0700")
To: Seth Mason
Cc: 7017@debbugs.gnu.org
X-GNU-PR-Message: followup 7017
X-GNU-PR-Package: emacs
X-GNU-PR-Keywords: fixed patch
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.96 (gnu/linux)

Seth Mason writes:

> If you put the following in a buffer and eval it, you'll get a 404:
>
> ;; http://httpbin.org/get?x=1
> ;; eval this buffer
> (url-retrieve (buffer-substring-no-properties 4 30) (lambda (&rest
> args) (switch-to-buffer (current-buffer))))
>
> If you curl/wget the same URL, it'll work fine.
>
> If you look at the request, it's going to "/get%3fx%3d1".  It seems to me
> that the URL is getting improperly encoded for multibyte strings.

Thanks for pointing this out.  Applying url-hexify-string to the entire
URL, as the previous patch did, is wrong.  We mustn't hexify reserved
characters that are being used in their special role.

Unfortunately, figuring out when those characters are being used in
their special role requires an implementation of RFC2396, which I don't
think we currently have in Emacs.

Alternatively, the following not-strictly-correct hack excludes the
reserved characters from hexification.

=== modified file 'lisp/url/url.el'
*** lisp/url/url.el	2012-04-26 12:43:28 +0000
--- lisp/url/url.el	2012-05-08 04:46:45 +0000
***************
*** 180,188 ****
    (url-gc-dead-buffers)
    (if (stringp url)
       (set-text-properties 0 (length url) nil url))
    (when (multibyte-string-p url)
!     (let ((url-unreserved-chars (append '(?: ?/) url-unreserved-chars)))
        (setq url (url-hexify-string url))))
    (if (not (vectorp url))
        (setq url (url-generic-parse-url url)))
    (if (not (functionp callback))
--- 180,193 ----
    (url-gc-dead-buffers)
    (if (stringp url)
       (set-text-properties 0 (length url) nil url))
+ 
    (when (multibyte-string-p url)
!     (let* ((reserved-chars '(?! ?# ?$ ?& ?' ?( ?) ?* ?+ ?, ?/ ?: ?\;
!                              ?= ?? ?@ ?[ ?]))
!            (url-unreserved-chars (append reserved-chars
!                                          url-unreserved-chars)))
        (setq url (url-hexify-string url))))
+ 
    (if (not (vectorp url))
        (setq url (url-generic-parse-url url)))
    (if (not (functionp callback))
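
The effect of the hack in the patch above can be tried outside url-retrieve.  What follows is a minimal sketch, not part of the patch: the names my-url-reserved-chars and my-url-hexify-keeping-reserved are hypothetical, and it assumes url-hexify-string consults the dynamically bound url-unreserved-chars, as the patch itself does.

```elisp
(require 'url-util)

;; Hypothetical name, used only for this illustration.  The list holds
;; the same reserved characters as the patch above.
(defvar my-url-reserved-chars
  '(?! ?# ?$ ?& ?' ?\( ?\) ?* ?+ ?, ?/ ?: ?\; ?= ?? ?@ ?\[ ?\])
  "Reserved URL characters that should be left unencoded.")

;; Hypothetical helper mirroring the `let*' form in the patch.
(defun my-url-hexify-keeping-reserved (url)
  "Percent-encode URL, leaving reserved characters as they are."
  (let ((url-unreserved-chars
         (append my-url-reserved-chars url-unreserved-chars)))
    (url-hexify-string url)))

;; Hexifying the whole URL with the default character set escapes
;; ":", "/", "?" and "=", which is what broke the original request.
(url-hexify-string "http://httpbin.org/get?x=1")

;; With the reserved characters allowed, this URL passes through
;; unchanged.
(my-url-hexify-keeping-reserved "http://httpbin.org/get?x=1")
;; => "http://httpbin.org/get?x=1"
```

This also shows why the hack is not strictly correct: "%" is in neither list, so a URL that already contains percent-escapes gets them encoded a second time, and a reserved character meant as literal data is left unescaped.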