* bug#7017: Suggestion: (url-retrieve-internal) hexify multibyte URL string first
From: Lars Magne Ingebrigtsen @ 2012-04-10 11:22 UTC
To: William Xu; +Cc: 7017

William Xu <william.xwl@gmail.com> writes:

> Feeding the same url to `wget', it would first hexify it, then download
> it successfully.  I suggest we do the same in url-retrieve, like this:
>
> (url-retrieve-internal): Hexify multibyte URL string first when necessary.

Thanks; applied to the Emacs trunk.

-- 
(domestic pets only, the antidote for overdose, milk.)
  bloggy blog http://lars.ingebrigtsen.no/
* bug#7017: url-retrieve seems busted
From: Seth Mason @ 2012-05-07 21:51 UTC
To: 7017

If you put the following in a buffer and eval it, you'll get a 404:

;; http://httpbin.org/get?x=1
;; eval this buffer
(url-retrieve (buffer-substring-no-properties 4 30)
              (lambda (&rest args)
                (switch-to-buffer (current-buffer))))

If you curl/wget the same URL, it'll work fine.

If you look at the request, it's going to "/get%3fx%3d1".  It seems to me
that the URL is getting improperly encoded for multibyte strings.
* bug#7017: url-retrieve seems busted
From: Chong Yidong @ 2012-05-08 4:52 UTC
To: Seth Mason; +Cc: 7017

Seth Mason <seth@edgecast.com> writes:

> If you put the following in a buffer and eval it, you'll get a 404:
>
> ;; http://httpbin.org/get?x=1
> ;; eval this buffer
> (url-retrieve (buffer-substring-no-properties 4 30) (lambda (&rest
> args) (switch-to-buffer (current-buffer))))
>
> If you curl/wget the same URL, it'll work fine.
>
> If you look at the request, it's going to "/get%3fx%3d1".  It seems to me
> that the URL is getting improperly encoded for multibyte strings.

Thanks for pointing this out.

Applying url-hexify-string on the entire URL, as the previous patch did,
is wrong.  We mustn't hexify reserved characters that are being used in
their special role.  Unfortunately, figuring out when those characters
are being used in their special role requires an implementation of
RFC2396, which I don't think we currently have in Emacs.

Or, the following not-strictly-correct hack leaves out reserved
characters from hexification.

=== modified file 'lisp/url/url.el'
*** lisp/url/url.el	2012-04-26 12:43:28 +0000
--- lisp/url/url.el	2012-05-08 04:46:45 +0000
***************
*** 180,188 ****
    (url-gc-dead-buffers)
    (if (stringp url)
        (set-text-properties 0 (length url) nil url))
    (when (multibyte-string-p url)
!     (let ((url-unreserved-chars (append '(?: ?/) url-unreserved-chars)))
        (setq url (url-hexify-string url))))
    (if (not (vectorp url))
        (setq url (url-generic-parse-url url)))
    (if (not (functionp callback))
--- 180,193 ----
    (url-gc-dead-buffers)
    (if (stringp url)
        (set-text-properties 0 (length url) nil url))
+ 
    (when (multibyte-string-p url)
!     (let* ((reserved-chars '(?! ?# ?$ ?& ?' ?( ?) ?* ?+ ?, ?/ ?: ?\;
!                              ?= ?? ?@ ?[ ?]))
!            (url-unreserved-chars (append reserved-chars
!                                          url-unreserved-chars)))
        (setq url (url-hexify-string url))))
+ 
    (if (not (vectorp url))
        (setq url (url-generic-parse-url url)))
    (if (not (functionp callback))
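The idea behind the hack above can be tried outside `url.el`. The following is only a minimal sketch, not the patch itself: `my-hexify-keeping-reserved` is a hypothetical helper name, and it assumes `url-unreserved-chars` is a list of characters, as it is in `url-util.el`. It percent-encodes the UTF-8 bytes of a string but passes reserved characters through untouched.

```elisp
(require 'url-util)

;; Hypothetical helper, for illustration only.
(defun my-hexify-keeping-reserved (str)
  "Percent-encode STR, leaving reserved and unreserved characters alone."
  (let ((reserved '(?! ?# ?$ ?& ?' ?\( ?\) ?* ?+ ?, ?/ ?: ?\;
                    ?= ?? ?@ ?\[ ?\])))
    (mapconcat (lambda (byte)
                 (if (or (memq byte reserved)
                         (memq byte url-unreserved-chars))
                     (char-to-string byte)
                   ;; Everything else becomes %XX, one escape per UTF-8 byte.
                   (format "%%%02X" byte)))
               (encode-coding-string str 'utf-8)
               "")))

;; The URL from the report contains only reserved and unreserved
;; characters, so it passes through unchanged.
(my-hexify-keeping-reserved "http://httpbin.org/get?x=1")

;; A non-ASCII character is escaped while "?" and "=" are kept.
(my-hexify-keeping-reserved "http://example.org/caf\u00e9?x=1")
```

This shares the limitation the message calls "not-strictly-correct": a reserved character that is meant as literal data (say, a "?" inside a path segment) is not escaped, and an already-escaped "%XX" sequence would be escaped a second time because "%" is in neither list.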
* bug#7017: url-retrieve seems busted
From: Chong Yidong @ 2012-05-08 5:25 UTC
To: Seth Mason; +Cc: 7017

Chong Yidong <cyd@gnu.org> writes:

> Applying url-hexify-string on the entire URL, as the previous patch did,
> is wrong.  We mustn't hexify reserved characters that are being used in
> their special role.  Unfortunately, figuring out when those characters
> are being used in their special role requires an implementation of
> RFC2396, which I don't think we currently have in Emacs.

Actually, I think we could use url-generic-parse-url for this.
* bug#7017: url-retrieve seems busted
From: Chong Yidong @ 2012-05-09 8:34 UTC
To: Seth Mason; +Cc: 7017

Chong Yidong <cyd@gnu.org> writes:

> Chong Yidong <cyd@gnu.org> writes:
>
>> Applying url-hexify-string on the entire URL, as the previous patch did,
>> is wrong.  We mustn't hexify reserved characters that are being used in
>> their special role.  Unfortunately, figuring out when those characters
>> are being used in their special role requires an implementation of
>> RFC2396, which I don't think we currently have in Emacs.
>
> Actually, I think we could use url-generic-parse-url for this.

Fixed in trunk (revision 108172).
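To see why `url-generic-parse-url` helps here, it can be run on the URL from the report. This is a rough sketch, not the content of revision 108172, which is not shown in this thread. `my-url-components` is a hypothetical name; the accessors `url-type`, `url-host` and `url-filename` come from `url-parse.el`. Once the URL is split into components, each one can be escaped on its own, so the "?" that separates path and query never reaches the hexifier.

```elisp
(require 'url-parse)
(require 'url-util)

;; Hypothetical helper, for illustration only.
(defun my-url-components (url)
  "Return the scheme, host, and path-plus-query of URL as a list."
  (let ((parsed (url-generic-parse-url url)))
    (list (url-type parsed)        ; scheme
          (url-host parsed)        ; host
          (url-filename parsed)))) ; path plus query, not yet escaped

;; Returns ("http" "httpbin.org" "/get?x=1").
(my-url-components "http://httpbin.org/get?x=1")

;; Emacs 24.3 and later also provide `url-encode-url', which escapes
;; a URL component by component and leaves the "?" and "=" delimiters
;; in place.
(url-encode-url "http://httpbin.org/get?x=1")
```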