* [happy@mcplaksin.org: HTTP redirects make url-retrieve-synchronously asynchronous]
@ 2006-01-17 19:59 Richard Stallman
  2006-01-18  3:13 ` Fwd: HTTP redirects make url-retrieve-synchronously asynchronous Stefan Monnier
  0 siblings, 1 reply; 9+ messages in thread
From: Richard Stallman @ 2006-01-17 19:59 UTC (permalink / raw)

[I sent this message a week ago but did not get a response.]

Would someone please look at this, and then ack?

------- Start of forwarded message -------
To: emacs-pretest-bug@gnu.org
From: Mark Plaksin <happy@mcplaksin.org>
Date: Mon, 09 Jan 2006 13:18:46 -0500
Subject: HTTP redirects make url-retrieve-synchronously asynchronous

url-retrieve-synchronously becomes asynchronous when HTTP redirects are
involved.  When it encounters a redirect, url-http-parse-headers calls
url-retrieve instead of url-retrieve-synchronously.  Naively switching to
the latter doesn't solve the problem, and I haven't been able to find a
fix.

I encountered the problem in Gnus using nnrss.  I have an old URL for
Slashdot's RSS feed and was experimenting with setting mm-url-use-external
to nil.
To reproduce, evaluate this in *scratch*:

(mm-url-insert "http://slashdot.org/index.rss")

You'll get this:

("http://slashdot.org/index.rss" 316)
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML><HEAD>
<TITLE>301 Moved Permanently</TITLE>
</HEAD><BODY>
<H1>Moved Permanently</H1>
The document has moved <A HREF="http://rss.slashdot.org/Slashdot/slashdot">here</A>.<P>
<HR>
<ADDRESS>Apache/1.3.33 Server at slashdot.org Port 80</ADDRESS>
</BODY></HTML>

To make the problem go away, add a breakpoint before "(when redirect-uri"
in url-http.el.  Then re-run the test and wait a few seconds after hitting
the breakpoint.  Tell the debugger to continue and you will get the
expected contents of Slashdot's RSS feed instead of the redirect message
above.

I'll keep trying to find a way to fix this, but maybe it's trivial for
somebody who already understands the URL library.

_______________________________________________
emacs-pretest-bug mailing list
emacs-pretest-bug@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-pretest-bug
------- End of forwarded message -------

^ permalink raw reply	[flat|nested] 9+ messages in thread
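The same misbehavior can be observed through the URL library directly, without going through mm-url. This is a sketch only; the Slashdot URL is the one from the report above, and the redirect it returned in 2006 no longer applies:

```elisp
;; Sketch: observe the bug without mm-url.  With the bug present, the
;; buffer returned by `url-retrieve-synchronously' holds the 301 page
;; (or is incomplete), because after the redirect the real download
;; continues asynchronously in a different buffer.
(require 'url)

(with-current-buffer
    (url-retrieve-synchronously "http://slashdot.org/index.rss")
  (buffer-string))
```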
* Re: Fwd: HTTP redirects make url-retrieve-synchronously asynchronous
  2006-01-17 19:59 [happy@mcplaksin.org: HTTP redirects make url-retrieve-synchronously asynchronous] Richard Stallman
@ 2006-01-18  3:13 ` Stefan Monnier
  2006-01-19 17:44   ` Richard M. Stallman
  0 siblings, 1 reply; 9+ messages in thread
From: Stefan Monnier @ 2006-01-18 3:13 UTC (permalink / raw)
Cc: emacs-devel

> [I sent this message a week ago but did not get a response.]

> Would someone please look at this, and then ack?

I've looked at it and understand the problem, but I can't think of a quick
fix, and the real fix will take some work (and enough changes that it'll
risk introducing bugs that'll need more testing than I can do myself).


        Stefan
* Re: Fwd: HTTP redirects make url-retrieve-synchronously asynchronous
  2006-01-18  3:13 ` Fwd: HTTP redirects make url-retrieve-synchronously asynchronous Stefan Monnier
@ 2006-01-19 17:44   ` Richard M. Stallman
  2006-01-20 21:56     ` Stefan Monnier
  0 siblings, 1 reply; 9+ messages in thread
From: Richard M. Stallman @ 2006-01-19 17:44 UTC (permalink / raw)
Cc: emacs-devel

    I've looked at it and understand the problem, but I can't think of a
    quickfix, and the real fix will take some work (and enough changes
    that it'll risk introducing bugs that'll need more testing than I can
    do myself).

Could you explain the issue to us?  Maybe someone will have an idea.
* Re: Fwd: HTTP redirects make url-retrieve-synchronously asynchronous
  2006-01-19 17:44 ` Richard M. Stallman
@ 2006-01-20 21:56   ` Stefan Monnier
  2006-01-22  3:59     ` Richard M. Stallman
  0 siblings, 1 reply; 9+ messages in thread
From: Stefan Monnier @ 2006-01-20 21:56 UTC (permalink / raw)
Cc: emacs-devel

> I've looked at it and understand the problem, but I can't think of a
> quickfix, and the real fix will take some work (and enough changes that
> it'll risk introducing bugs that'll need more testing than I can do
> myself).

> Could you explain the issue to us?  Maybe someone will have an idea.

Here is what I believe to be the relevant information:

1. When an HTTP server returns a redirect, url-http.el currently calls
   url-retrieve to retrieve the target of the redirect (i.e. to
   transparently follow the redirection).

2. url-retrieve takes a URL and a callback function and returns a new
   buffer into which the requested "page" will be asynchronously inserted.
   When the buffer is complete, it calls the callback function (in that
   buffer).

3. Some backends (at least url-http, maybe others) sometimes decide not to
   call the callback, presumably as a way to signal an error (the
   operation can't be completed, so the callback can't be called,
   basically).  This is a bug, but I don't know of anyone who's tried to
   tackle it yet.

4. url-retrieve-synchronously is implemented on top of url-retrieve by
   busy-looping with accept-process-output, waiting for a variable to be
   set to t by the callback function.  Now, because of number 3 above,
   url-retrieve-synchronously can't assume that the callback will
   eventually be called, so it also stops the busy-waiting if it notices
   that there's no more process running in the buffer that url-retrieve
   returned.
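Point 4 can be sketched as follows. This is a simplified illustration of the busy-wait, not the actual url.el code; the function name is made up, and error handling and timeouts are omitted:

```elisp
;; Simplified sketch of the busy-wait in `url-retrieve-synchronously'.
(require 'url)

(defun my-retrieve-synchronously (url)
  "Sketch: wait until the `url-retrieve' callback fires, or give up."
  (let* ((retrieval-done nil)
         (asynch-buffer
          (url-retrieve url (lambda (&rest _)
                              (setq retrieval-done t)))))
    (let ((proc (get-buffer-process asynch-buffer)))
      (while (and (not retrieval-done)
                  ;; Point 3: the callback may never run, so also stop
                  ;; when no process is left in the returned buffer.
                  (get-buffer-process asynch-buffer))
        (accept-process-output proc 1)))
    asynch-buffer))
```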
So when a redirect is encountered, the inner call to url-retrieve creates a
brand new buffer, different from the one returned by the original
url-retrieve call; the subsequent async process runs in that buffer, and
the callback will be called in *that* buffer as well.  So
url-retrieve-synchronously gets all confused: in the buffer in which it
expects the output, after the redirect there's no more process running
(it's running in the newly generated buffer), so it stops busy-waiting and
returns, thinking the download has completed whereas it's actually still
going on, just in another buffer.

So there are fundamentally two bugs in the code:

1. Sometimes the callback doesn't get called.

2. Sometimes the callback gets called in another buffer than the one
   returned by url-retrieve.

I think the first is due to a bug in the API, because there's no
documented way for a backend to indicate to the callback function that the
operation failed.  And the second is an internal bug, which I think is due
to a bug in the internal API (the one used between the generic part of the
URL library and each backend), where the url-<foo> function necessarily
returns a new buffer.  They (and url-retrieve) should either take an
optional destination buffer as a parameter, or they should simply not
return any buffer at all and the destination buffer should only be made
known when the callback is called.  This second option is simpler but
would cause problems for code that wants to start processing data before
it's fully downloaded.


        Stefan
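Bug 2 described above is easy to observe directly (a sketch; the Slashdot URL is the one from the original report, and any URL answering with a 301 or 302 would do):

```elisp
;; Sketch: after a redirect, the callback runs in a buffer different
;; from the one `url-retrieve' originally returned.
(require 'url)

(let ((orig-buffer
       (url-retrieve "http://slashdot.org/index.rss"
                     (lambda (&rest _)
                       ;; With the bug, this buffer name differs from
                       ;; the one printed below.
                       (message "callback ran in %s" (buffer-name))))))
  (message "url-retrieve returned %s" (buffer-name orig-buffer)))
```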
* Re: Fwd: HTTP redirects make url-retrieve-synchronously asynchronous
  2006-01-20 21:56 ` Stefan Monnier
@ 2006-01-22  3:59   ` Richard M. Stallman
  2006-01-23 16:40     ` Stefan Monnier
  0 siblings, 1 reply; 9+ messages in thread
From: Richard M. Stallman @ 2006-01-22 3:59 UTC (permalink / raw)
Cc: emacs-devel

    3. some backends (at least url-http, maybe others) sometimes decide
    not to call the callback, presumably as a way to signal an error (the
    operation can't be completed so the callback can't be called,
    basically).  This is a bug, but I don't know of anyone who's tried to
    tackle it yet.

What exactly is the bug?  It isn't clear to me.  Are you saying that they
should also call the callback function to report failure?

That seems like a good idea on general principles.  I don't know how much
of a change it would be.  Is the calling convention easy to extend for
this?

    2. sometimes the callback gets called in another buffer than the one
    returned by url-retrieve.

One solution would be to give the first buffer a local variable that
would, in this case, point to the second buffer.  Then
url-retrieve-synchronously could check the local variable, which would
tell it to check the process in the other buffer.

That solution would be safe, since it would not involve changing general
calling conventions.

    They (and url-retrieve) should either take an optional destination
    buffer as parameter or they should simply not return any buffer at all
    and the destination buffer should only be made known when the callback
    is called.

I don't much like the idea of that big a change in calling conventions.
* Re: Fwd: HTTP redirects make url-retrieve-synchronously asynchronous
  2006-01-22  3:59 ` Richard M. Stallman
@ 2006-01-23 16:40   ` Stefan Monnier
  2006-01-23 20:38     ` Stefan Monnier
  2006-01-24 16:47     ` Richard M. Stallman
  0 siblings, 2 replies; 9+ messages in thread
From: Stefan Monnier @ 2006-01-23 16:40 UTC (permalink / raw)
Cc: emacs-devel

> 3. some backends (at least url-http, maybe others) sometimes decide
> not to call the callback, presumably as a way to signal an error (the
> operation can't be completed so the callback can't be called,
> basically).  This is a bug, but I don't know of anyone who's tried to
> tackle it yet.

> What exactly is the bug?  It isn't clear to me.  Are you saying that
> they should also call the callback function to report failure?

Yes.

> That seems like a good idea on general principles.  I don't know how
> much of a change it would be.  Is the calling convention easy to
> extend for this?

Not directly.  The most natural way to do that is to pass the
success/failure information as a parameter to the callback function, but
if we don't want to break existing code, we could pass the info via a
local variable in the destination buffer.

> 2. sometimes the callback gets called in another buffer than the one
> returned by url-retrieve.

> One solution would be to give the first buffer a local variable that
> would, in this case, point to the second buffer.
> Then url-retrieve-synchronously could check the local variable, which
> would tell it to check the process in the other buffer.

Yes, sounds like a good quick-fix.

> They (and url-retrieve) should either take an optional destination
> buffer as parameter or they should simply not return any buffer at all
> and the destination buffer should only be made known when the callback
> is called.

> I don't much like the idea of that big a change in calling conventions.

Neither do I, especially not that close to a release.  OTOH we haven't
released URL yet, so it's the last opportunity to clean up the API.
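The buffer-local alternative Stefan mentions might look like this on the backend side. The variable and helper names here are hypothetical; no such convention existed in the library at the time:

```elisp
;; Hypothetical sketch: report failure to the callback via a
;; buffer-local variable instead of a new callback argument, so that
;; existing callbacks keep working unchanged.
(defvar url-retrieval-error nil
  "Hypothetical: non-nil in the destination buffer if retrieval failed.")

(defun my-finish-retrieval (buffer callback err)
  "Record ERR buffer-locally in BUFFER, then always call CALLBACK there.
ERR is nil on success, or an error description on failure."
  (with-current-buffer buffer
    (set (make-local-variable 'url-retrieval-error) err)
    (funcall callback)))
```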
        Stefan
* Re: Fwd: HTTP redirects make url-retrieve-synchronously asynchronous
  2006-01-23 16:40 ` Stefan Monnier
@ 2006-01-23 20:38   ` Stefan Monnier
  2006-02-19 20:15     ` Mark Plaksin
  2006-01-24 16:47   ` Richard M. Stallman
  1 sibling, 1 reply; 9+ messages in thread
From: Stefan Monnier @ 2006-01-23 20:38 UTC (permalink / raw)
Cc: emacs-devel

>> 2. sometimes the callback gets called in another buffer than the one
>> returned by url-retrieve.

>> One solution would be to give the first buffer a local variable that
>> would, in this case, point to the second buffer.
>> Then url-retrieve-synchronously could check the local variable, which
>> would tell it to check the process in the other buffer.

> Yes, sounds like a good quick-fix.

The patch below uses this approach.  Mark, does it work well for you?


        Stefan


Index: lisp/url/url.el
===================================================================
RCS file: /sources/emacs/emacs/lisp/url/url.el,v
retrieving revision 1.20
diff -u -r1.20 url.el
--- lisp/url/url.el	10 Jan 2006 19:31:15 -0000	1.20
+++ lisp/url/url.el	23 Jan 2006 20:37:41 -0000
@@ -114,6 +114,13 @@
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ;;; Retrieval functions
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+(defvar url-redirect-buffer nil
+  "New buffer into which the retrieval will take place.
+Sometimes while retrieving a URL, the URL library needs to use another buffer
+than the one returned initially by `url-retrieve'.  In this case, it sets this
+variable in the original buffer as a forwarding pointer.")
+
 ;;;###autoload
 (defun url-retrieve (url callback &optional cbargs)
   "Retrieve URL asynchronously and call CALLBACK with CBARGS when finished.
@@ -189,10 +189,14 @@
          (url-debug 'retrieval
                     "Spinning in url-retrieve-synchronously: %S (%S)"
                     retrieval-done asynch-buffer)
+         (if (buffer-local-value 'url-redirect-buffer asynch-buffer)
+             (setq proc (get-buffer-process
+                         (setq asynch-buffer
+                               (buffer-local-value 'url-redirect-buffer
+                                                   asynch-buffer))))
          (if (and proc (memq (process-status proc)
                              '(closed exit signal failed))
-                  ;; Make sure another process hasn't been started, as can
-                  ;; happen with http redirections.
+                  ;; Make sure another process hasn't been started.
                   (eq proc (or (get-buffer-process asynch-buffer) proc)))
              ;; FIXME: It's not clear whether url-retrieve's callback is
              ;; guaranteed to be called or not.  It seems that url-http
@@ -200,7 +204,7 @@
              ;; clear that it's a bug, but even then we need to decide how
              ;; url-http can then warn us that the download has completed.
              ;; In the mean time, we use this here workaround.
-             (setq retrieval-done t)
+             (setq retrieval-done t))
          ;; We used to use `sit-for' here, but in some cases it wouldn't
          ;; work because apparently pending keyboard input would always
          ;; interrupt it before it got a chance to handle process input.
Index: lisp/url/url-http.el
===================================================================
RCS file: /sources/emacs/emacs/lisp/url/url-http.el,v
retrieving revision 1.23
diff -u -r1.23 url-http.el
--- lisp/url/url-http.el	18 Nov 2005 16:55:54 -0000	1.23
+++ lisp/url/url-http.el	23 Jan 2006 20:37:41 -0000
@@ -1,6 +1,6 @@
 ;;; url-http.el --- HTTP retrieval routines
 
-;; Copyright (C) 1999, 2001, 2004, 2005 Free Software Foundation, Inc.
+;; Copyright (C) 1999, 2001, 2004, 2005, 2006 Free Software Foundation, Inc.
 ;; Author: Bill Perry <wmperry@gnu.org>
 ;; Keywords: comm, data, processes
@@ -35,10 +35,8 @@
 (require 'url-cookie)
 (require 'mail-parse)
 (require 'url-auth)
-(autoload 'url-retrieve-synchronously "url")
-(autoload 'url-retrieve "url")
+(require 'url)
 (autoload 'url-cache-create-filename "url-cache")
-(autoload 'url-mark-buffer-as-dead "url")
 
 (defconst url-http-default-port 80 "Default HTTP port.")
 (defconst url-http-asynchronous-p t "HTTP retrievals are asynchronous.")
@@ -509,10 +509,17 @@
            (let ((url-request-method url-http-method)
                  (url-request-data url-http-data)
                  (url-request-extra-headers url-http-extra-headers))
+             ;; Put in the current buffer a forwarding pointer to the new
+             ;; destination buffer.
+             ;; FIXME: This is a hack to fix url-retrieve-synchronously
+             ;; without changing the API.  Instead url-retrieve should
+             ;; either simply not return the "destination" buffer, or it
+             ;; should take an optional `dest-buf' argument.
+             (set (make-local-variable 'url-redirect-buffer)
              (url-retrieve redirect-uri url-callback-function
                            (cons :redirect
                                  (cons redirect-uri
-                                       url-callback-arguments)))
+                                       url-callback-arguments))))
             (url-mark-buffer-as-dead (current-buffer))))))
        (4                              ; Client error
         ;; 400 Bad Request
* Re: Fwd: HTTP redirects make url-retrieve-synchronously asynchronous
  2006-01-23 20:38 ` Stefan Monnier
@ 2006-02-19 20:15   ` Mark Plaksin
  0 siblings, 0 replies; 9+ messages in thread
From: Mark Plaksin @ 2006-02-19 20:15 UTC (permalink / raw)

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>>> 2. sometimes the callback gets called in another buffer than the one
>>> returned by url-retrieve.
>
>>> One solution would be to give the first buffer a local variable that
>>> would, in this case, point to the second buffer.
>>> Then url-retrieve-synchronously could check the local variable, which
>>> would tell it to check the process in the other buffer.
>
>> Yes, sounds like a good quick-fix.
>
> The patch below uses this approach.  Mark, does it work well for you?

It does; thanks!

Are applications supposed to be aware of the possibility of redirect
buffers and clean up both the original buffer and the redirect buffer?
You could easily say "It's a workaround, so yes!" :)
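One answer to Mark's cleanup question, sketched from the application side. This assumes the url-redirect-buffer variable introduced by the patch earlier in the thread; the function name is made up:

```elisp
;; Sketch: kill the buffer returned by `url-retrieve-synchronously'
;; and, if the library forwarded the retrieval after a redirect, kill
;; the redirect buffer as well.
(defun my-url-kill-buffers (buffer)
  "Kill BUFFER and any `url-redirect-buffer' it forwards to."
  (let ((redirect (buffer-local-value 'url-redirect-buffer buffer)))
    (kill-buffer buffer)
    (when (buffer-live-p redirect)
      (kill-buffer redirect))))
```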
* Re: Fwd: HTTP redirects make url-retrieve-synchronously asynchronous
  2006-01-23 16:40 ` Stefan Monnier
  2006-01-23 20:38   ` Stefan Monnier
@ 2006-01-24 16:47   ` Richard M. Stallman
  1 sibling, 0 replies; 9+ messages in thread
From: Richard M. Stallman @ 2006-01-24 16:47 UTC (permalink / raw)
Cc: emacs-devel

    Neither do I, especially not that close to a release.  OTOH we
    haven't released URL yet so it's the last opportunity to clean up
    the API.

In that case, I'd say it is a good idea to pass the success/failure info
as another argument to the callback function.
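For reference, later versions of the URL library adopted essentially the convention RMS suggests: the callback's first argument is a STATUS list (plist-shaped) that can carry :error and :redirect entries. A callback written against that later convention looks roughly like this (a sketch; the function name is made up, and this API did not yet exist at the time of the thread):

```elisp
;; Sketch of a callback using the STATUS argument as found in later
;; Emacs versions of `url-retrieve'.
(require 'url)

(defun my-url-callback (status &rest _args)
  "Inspect the STATUS list passed to a `url-retrieve' callback."
  (let ((err (plist-get status :error))
        (redirect (plist-get status :redirect)))
    (cond (err      (message "retrieval failed: %S" err))
          (redirect (message "redirected to %s" redirect))
          (t        (message "retrieved %d bytes" (buffer-size))))))

;; Usage: (url-retrieve "https://example.com/" #'my-url-callback)
```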