From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ian Price
Newsgroups: gmane.lisp.guile.devel
Subject: Re: [PATCH] read-response-body should return received data when any break happens
Date: Thu, 15 Mar 2012 18:31:52 +0000
Message-ID: <87bonxu353.fsf@Kagami.home>
To: Nala Ginrut
Cc: guile-devel
In-Reply-To: (Nala Ginrut's message of "Sun, 11 Mar 2012 23:35:56 +0800")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.3 (gnu/linux)
List-Id: "Developers list for Guile, the GNU extensibility library"

Nala Ginrut writes:

> I've been troubled with a weird problem in read-response-body for a long time.
> I think read-response-body never return the received data when any break happens. No matter the break caused by connection problem or user interruption.
> The only possible read-response-body returns data is connection never down and all the data have been received even if I want to download a 2G file. Or there's no
> chance to write any data to the disk. When break occurs, all the received data will evaporate.
> Considering my terrible network, I decide not to pray for good luck when I establish connection with our web module.
> So here's a patch to fix it.

I don't think read-response-body is a good choice for downloading a 2GB
file in any case. Just as you wouldn't read a 2GB sorted file into
memory to perform a binary search, preferring to search directly on the
disk itself, I think people who expect to handle large files should
write an appropriate handler themselves using the lower-level building
blocks Guile already provides.

> The new read-response-body will add the received data to the exceptional information which used by "throw", if read-response-body can't continue to work anymore, the received
> data will return with throw.

Yes, providing it at the point of error is best. The data is already
available to us and it won't affect normal returns. This last part is
particularly important to me, as I mentioned on IRC, since programmers
should not have to do anything special in the normal case.

> -(define (read-response-body r)
> +(define* (read-response-body r #:key (block 4096))
>    "Reads the response body from @var{r}, as a bytevector.  Returns
>  @code{#f} if there was no response body."
> -  (let ((nbytes (response-content-length r)))
> -    (and nbytes
> -         (let ((bv (get-bytevector-n (response-port r) nbytes)))
> -           (if (= (bytevector-length bv) nbytes)
> -               bv
> -               (bad-response "EOF while reading response body: ~a bytes of ~a"
> -                             (bytevector-length bv) nbytes))))))
> +  (let* ((nbytes (response-content-length r))
> +         (bv (and nbytes (make-bytevector nbytes)))
> +         (start 0))
> +    (catch #t
> +      (lambda ()
> +        (let lp((buf (get-bytevector-n (response-port r) block)))
> +          (if (eof-object? buf)
> +              bv
> +              (let ((len (bytevector-length buf)))
> +                (cond
> +                 ((<= len block)
> +                  (bytevector-copy! buf 0 bv start len)
> +                  (set! start (+ start len))
> +                  (lp (get-bytevector-n (response-port r) block)))
> +                 (else
> +                  (bad-response "EOF while reading response body: ~a bytes of ~a"
> +                                start nbytes)))))))

The manual buffering is superfluous. get-bytevector-n already does this
under the hood, and much more efficiently. Even if it didn't, reading in
smaller chunks is only a win if you are also processing the data in
chunks, since large blocks can make the GC interfere more often.

The only conceivable difference I could see is that you are choosing to
return an eof object rather than raising an error. I'm not convinced of
the utility of that: #f or #vu8() I could understand, eof less so.

> +      (lambda (k . e)
> +        (let ((received (call-with-port
> +                         (open-bytevector-input-port bv)
> +                         (lambda (port)
> +                           (get-bytevector-n port start)))))
> +          (throw k `(,@e (body ,@received))) ;; return the received data
> +          )))))
> +

Yuck. Consing on a body symbol like this is hideous IMO. From a
programmatic point of view it is unnecessary; all it does is require
the user to perform an extra cadr. It would be much better to throw to
a different key than to do it this way.

> +;; output the received data if there is, or do nothing
> +(define (output-received-response-body e port)
> +  (let ((received (assoc-ref (cadr e) 'body)))
> +    (if received
> +        (begin
> +          (put-bytevector port received)
> +          (force-output port)))))
> +
> +;; Exceptional information contains the received bytevector added from the
> +;; read-response-body if any exception had been caught.
> +;; If received data ware huge(it always does), it'd be a trouble during the tracing.
> +;; This helper function could get rid of the received data from exceptional info,
> +;; and re-throw it.
> +(define (throw-from-response-body-break e)
> +  (throw (car e) (list-head (cdr e) (1- (length (cdr e))))))
>
>  (define (write-response-body r bv)
>    "Write @var{bv}, a bytevector, to the port corresponding to the HTTP

I know I don't have any sort of authority, but I'd like to veto these.
Special procedures to deal with an ad-hoc protocol layered over another,
more appropriate protocol are just ugly. As I already mentioned,
exceptions have tags, and this is what they are for.

Fundamentally, I think this patch could be simplified to checking for an
eof from get-bytevector-n and changing the bad-response to an
"incomplete-response" that provides the bytevector.

What does everyone else think?

-- 
Ian Price

"Programming is like pinball. The reward for doing it well is
the opportunity to do it again" - from "The Wizardry Compiled"
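
P.S. A minimal sketch of the simplification suggested above, not a
tested patch. To stay self-contained it works on a bare port and a
byte count rather than a response record. Both the helper name
read-body/partial and the 'incomplete-response key are made up for
illustration; neither exists in (web response).

```scheme
(use-modules (rnrs bytevectors)
             (ice-9 binary-ports))

;; Hypothetical helper, not part of (web response): read NBYTES from
;; PORT in one call and throw to a dedicated key on a short read.
(define (read-body/partial port nbytes)
  (let ((bv (get-bytevector-n port nbytes)))
    (cond
     ;; EOF before any byte arrived: nothing was received.
     ((eof-object? bv)
      (throw 'incomplete-response (make-bytevector 0) nbytes))
     ;; Normal case: the caller just gets the bytevector.
     ((= (bytevector-length bv) nbytes) bv)
     ;; Short read: hand the partial data to the handler.
     (else
      (throw 'incomplete-response bv nbytes)))))

;; Usage: a 3-byte port stands in for a connection that closed early
;; while 5 bytes were expected.
(define (demo)
  (catch 'incomplete-response
    (lambda ()
      (read-body/partial
       (open-bytevector-input-port (u8-list->bytevector '(1 2 3)))
       5))
    (lambda (key received expected)
      ;; RECEIVED is the partial body, EXPECTED the promised length.
      received)))
```

The handler gets the partial bytevector as an ordinary argument, so no
extra cadr or list surgery is needed, and callers that never catch the
key see no change in the normal case. Note this only covers a clean
EOF; an exception raised inside the read itself would still need its
own handling.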