unofficial mirror of bug-gnu-emacs@gnu.org 
From: Chong Yidong <cyd@gnu.org>
To: Seth Mason <seth@edgecast.com>
Cc: 7017@debbugs.gnu.org
Subject: bug#7017: url-retrieve seems busted
Date: Tue, 08 May 2012 12:52:01 +0800
Message-ID: <87obpz9swe.fsf@gnu.org>
In-Reply-To: <87zk9jacda.fsf@edgecast.com> (Seth Mason's message of "Mon, 07 May 2012 14:51:29 -0700")

Seth Mason <seth@edgecast.com> writes:

> If you put the following in a buffer and eval it, you'll get a 404:
>
>     ;; http://httpbin.org/get?x=1
>     ;; eval this buffer
>     (url-retrieve (buffer-substring-no-properties 4 30)
>                   (lambda (&rest args) (switch-to-buffer (current-buffer))))
>
> If you curl/wget the same URL, it'll work fine.
>
> If you look at the request, it's going to "/get%3fx%3d1". It seems to me
> that the URL is getting improperly encoded for multibyte strings.

Thanks for pointing this out.

Applying url-hexify-string to the entire URL, as the previous patch did,
is wrong.  We mustn't hexify reserved characters that are being used in
their special role.  Unfortunately, figuring out when those characters
are being used in their special role requires an implementation of
RFC 2396, which I don't think we currently have in Emacs.

Alternatively, the following not-strictly-correct hack exempts all
reserved characters from hexification.


=== modified file 'lisp/url/url.el'
*** lisp/url/url.el	2012-04-26 12:43:28 +0000
--- lisp/url/url.el	2012-05-08 04:46:45 +0000
***************
*** 180,188 ****
    (url-gc-dead-buffers)
    (if (stringp url)
         (set-text-properties 0 (length url) nil url))
    (when (multibyte-string-p url)
!     (let ((url-unreserved-chars (append '(?: ?/) url-unreserved-chars)))
        (setq url (url-hexify-string url))))
    (if (not (vectorp url))
        (setq url (url-generic-parse-url url)))
    (if (not (functionp callback))
--- 180,193 ----
    (url-gc-dead-buffers)
    (if (stringp url)
         (set-text-properties 0 (length url) nil url))
+ 
    (when (multibyte-string-p url)
!     (let* ((reserved-chars '(?! ?# ?$ ?& ?' ?( ?) ?* ?+ ?, ?/ ?: ?\;
! 			     ?= ?? ?@ ?[ ?]))
! 	   (url-unreserved-chars (append reserved-chars
! 					 url-unreserved-chars)))
        (setq url (url-hexify-string url))))
+ 
    (if (not (vectorp url))
        (setq url (url-generic-parse-url url)))
    (if (not (functionp callback))
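
To see the effect of the two bindings outside of url-retrieve, here is a
minimal sketch that can be evaluated in a scratch buffer.  It assumes
that url-hexify-string consults a dynamic binding of
url-unreserved-chars, as the patch above relies on.  The names
my-reserved-chars and my-hexify-with-extra-chars are made up for this
illustration and are not part of url.el.

```elisp
(require 'url-util)

;; Hypothetical name: the same reserved characters the patch exempts,
;; written as a string and converted to a list of characters.
(defvar my-reserved-chars (append "!#$&'()*+,/:;=?@[]" nil))

;; Hypothetical helper: hexify URL while treating EXTRA-CHARS as
;; unreserved, mirroring the let-binding used in the patch.
(defun my-hexify-with-extra-chars (url extra-chars)
  (let ((url-unreserved-chars (append extra-chars url-unreserved-chars)))
    (url-hexify-string url)))

;; Old behavior: only ?: and ?/ are exempt, so ?? and ?= get escaped
;; and the query separator is lost.
(my-hexify-with-extra-chars "http://httpbin.org/get?x=1" '(?: ?/))

;; Patched behavior: all reserved characters pass through untouched.
(my-hexify-with-extra-chars "http://httpbin.org/get?x=1" my-reserved-chars)

;; Non-ASCII characters are still escaped as UTF-8 bytes.
(my-hexify-with-extra-chars "http://example.com/caf\u00e9?q=1"
                            my-reserved-chars)
```

The first call should reproduce the "/get%3fx%3d1" path from the
report, the second should return the URL unchanged, and the third
should escape only the two bytes of the accented character.  The case
of the hex digits may differ between Emacs versions.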
