* bug#7017: Suggestion: (url-retrieve-internal) hexify multibyte URL string first
From: Lars Magne Ingebrigtsen @ 2012-04-10 11:22 UTC
To: William Xu; +Cc: 7017
William Xu <william.xwl@gmail.com> writes:
> Feeding the same url to `wget', it would first hexify it, then download
> it successfully. I suggest we do the same in url-retrieve, like this:
>
> (url-retrieve-internal): Hexify multibyte URL string first when necessary.
Thanks; applied to the Emacs trunk.
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog http://lars.ingebrigtsen.no/
* bug#7017: url-retrieve seems busted
From: Seth Mason @ 2012-05-07 21:51 UTC
To: 7017
If you put the following in a buffer and eval it, you'll get a 404:
;; http://httpbin.org/get?x=1
;; eval this buffer
(url-retrieve (buffer-substring-no-properties 4 30) (lambda (&rest args) (switch-to-buffer (current-buffer))))
If you curl/wget the same URL, it'll work fine.
If you look at the request, it's going to "/get%3fx%3d1". It seems to me
that the URL is getting improperly encoded for multibyte strings.
* bug#7017: url-retrieve seems busted
From: Chong Yidong @ 2012-05-08 4:52 UTC
To: Seth Mason; +Cc: 7017
Seth Mason <seth@edgecast.com> writes:
> If you put the following in a buffer and eval it, you'll get a 404:
>
> ;; http://httpbin.org/get?x=1
> ;; eval this buffer
> (url-retrieve (buffer-substring-no-properties 4 30) (lambda (&rest
> args) (switch-to-buffer (current-buffer))))
>
> If you curl/wget the same URL, it'll work fine.
>
> If you look at the request, it's going to "/get%3fx%3d1". It seems to me
> that the URL is getting improperly encoded for multibyte strings.
Thanks for pointing this out.
Applying url-hexify-string on the entire URL, as the previous patch did,
is wrong. We mustn't hexify reserved characters that are being used in
their special role. Unfortunately, figuring out when those characters
are being used in their special role requires an implementation of
RFC2396, which I don't think we currently have in Emacs.
Alternatively, the following not-strictly-correct hack exempts reserved
characters from hexification.
=== modified file 'lisp/url/url.el'
*** lisp/url/url.el 2012-04-26 12:43:28 +0000
--- lisp/url/url.el 2012-05-08 04:46:45 +0000
***************
*** 180,188 ****
(url-gc-dead-buffers)
(if (stringp url)
(set-text-properties 0 (length url) nil url))
(when (multibyte-string-p url)
! (let ((url-unreserved-chars (append '(?: ?/) url-unreserved-chars)))
(setq url (url-hexify-string url))))
(if (not (vectorp url))
(setq url (url-generic-parse-url url)))
(if (not (functionp callback))
--- 180,193 ----
(url-gc-dead-buffers)
(if (stringp url)
(set-text-properties 0 (length url) nil url))
+
(when (multibyte-string-p url)
! (let* ((reserved-chars '(?! ?# ?$ ?& ?' ?( ?) ?* ?+ ?, ?/ ?: ?\;
! ?= ?? ?@ ?[ ?]))
! (url-unreserved-chars (append reserved-chars
! url-unreserved-chars)))
(setq url (url-hexify-string url))))
+
(if (not (vectorp url))
(setq url (url-generic-parse-url url)))
(if (not (functionp callback))
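
To illustrate the effect of the hack, here is a minimal sketch. It assumes a version of Emacs in which `url-hexify-string' accepts an optional ALLOWED-CHARS argument (recent releases do; the function at the time of this thread took only the string). The names `my/reserved-chars' and `my/hexify-keeping-reserved' are hypothetical and exist only for this example.

```elisp
;; Sketch only: `my/reserved-chars' and `my/hexify-keeping-reserved'
;; are hypothetical names, not part of Emacs.
(require 'url-util)

(defconst my/reserved-chars
  (string-to-list "!#$&'()*+,/:;=?@[]")
  "Reserved characters, copied from the list in the patch above.")

(defun my/hexify-keeping-reserved (url)
  "Percent-encode URL, leaving reserved and unreserved characters alone."
  (url-hexify-string url (append my/reserved-chars url-unreserved-chars)))

;; With the default allowed set, the delimiters ":", "/", "?" and "="
;; are percent-encoded too, which is what broke the request above.
(url-hexify-string "http://httpbin.org/get?x=1")

;; With the reserved characters allowed, an all-ASCII URL comes back
;; unchanged, while a non-ASCII character is still encoded byte by byte
;; from its UTF-8 representation.
(my/hexify-keeping-reserved "http://httpbin.org/get?x=1")
(my/hexify-keeping-reserved "http://httpbin.org/get?x=\u00e9")
```

This is still not strictly correct, for the reason given above: a reserved character that appears as data (say, a literal "&" inside a query value) is left unencoded as well.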
* bug#7017: url-retrieve seems busted
From: Chong Yidong @ 2012-05-08 5:25 UTC
To: Seth Mason; +Cc: 7017
Chong Yidong <cyd@gnu.org> writes:
> Applying url-hexify-string on the entire URL, as the previous patch did,
> is wrong. We mustn't hexify reserved characters that are being used in
> their special role. Unfortunately, figuring out when those characters
> are being used in their special role requires an implementation of
> RFC2396, which I don't think we currently have in Emacs.
Actually, I think we could use url-generic-parse-url for this.
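
A rough sketch of that idea, not the change that was eventually committed: parse the URL first so the delimiters are already identified, then hexify only the contents of a component. `my/encode-query' is a hypothetical helper; it assumes `url-path-and-query' and the optional ALLOWED-CHARS argument of `url-hexify-string' are available (both are in recent Emacs), and it deliberately ignores user, port and fragment to stay short.

```elisp
;; Sketch only: `my/encode-query' is a hypothetical helper, not part of Emacs.
(require 'url-parse)
(require 'url-util)

(defun my/encode-query (url)
  "Return URL with only the contents of its query string percent-encoded.
Simplified: user, password, port and fragment are not handled."
  (let* ((urlobj (url-generic-parse-url url))
         ;; (PATH . QUERY), split at the first "?" in the filename slot.
         (path-and-query (url-path-and-query urlobj))
         (query (cdr path-and-query)))
    (concat (url-type urlobj) "://" (url-host urlobj)
            (car path-and-query)
            (when query
              ;; "=" and "&" delimit the query, so keep them as they are
              ;; and encode everything else outside the unreserved set.
              (concat "?" (url-hexify-string
                           query
                           (append '(?= ?&) url-unreserved-chars)))))))

;; The "?" is never passed to the encoder, so it cannot be escaped.
(my/encode-query "http://httpbin.org/get?x=1")
(my/encode-query "http://httpbin.org/get?x=\u00e9")
```

Because the parser has already separated scheme, host, path and query, the encoder no longer has to guess whether a "?" or "/" is a delimiter.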
* bug#7017: url-retrieve seems busted
From: Chong Yidong @ 2012-05-09 8:34 UTC
To: Seth Mason; +Cc: 7017
Chong Yidong <cyd@gnu.org> writes:
> Chong Yidong <cyd@gnu.org> writes:
>
>> Applying url-hexify-string on the entire URL, as the previous patch did,
>> is wrong. We mustn't hexify reserved characters that are being used in
>> their special role. Unfortunately, figuring out when those characters
>> are being used in their special role requires an implementation of
>> RFC2396, which I don't think we currently have in Emacs.
>
> Actually, I think we could use url-generic-parse-url for this.
Fixed in trunk (revision 108172).