Re: [PATCH] read-response-body should return received data when any break happens

unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed

From: Ian Price <ianprice90@googlemail.com>
To: Nala Ginrut <nalaginrut@gmail.com>
Cc: guile-devel <guile-devel@gnu.org>
Subject: Re: [PATCH] read-response-body should return received data when any break happens
Date: Thu, 15 Mar 2012 18:31:52 +0000	[thread overview]
Message-ID: <87bonxu353.fsf@Kagami.home> (raw)
In-Reply-To: <CAPjoZoeAyj169HvzE92zpEVrQ4LopZCoK-EOeHZyW3GL34Oftg@mail.gmail.com> (Nala Ginrut's message of "Sun, 11 Mar 2012 23:35:56 +0800")

Nala Ginrut <nalaginrut@gmail.com> writes:

> I've been troubled with a weird problem in read-response-body for a long time.
> I think read-response-body never return the received data when any break happens. No matter the break caused by connection problem or user interruption.
> The only possible read-response-body returns data is connection never down and all the data have been received even if I want to download a 2G file. Or there's no
> chance to write any data to the disk. When break occurs, all the received data will evaporate.
> Considering my terrible network, I decide not to pray for good luck when I establish connection with our web module.
> So here's a patch to fix it.
I don't think read-response-body is a good choice for downloading a 2GB
file in any case. Just as you wouldn't read a 2GB sorted file into
memory to perform a binary search, preferring to search directly on the
disk itself, I think people with expectations of treating large files
should write an appropriate handler for themselves using the lower
building blocks guile already provides.

> The new read-response-body will add the received data to the exceptional information which used by "throw", if read-response-body can't continue to work anymore, the received
> data will return with throw.
Yes, providing it at the point of error is best. The data is already
available to us and it won't affect normal returns.

This last part is particularly important to me, as I mentioned on IRC,
since programmers should not have to do anything special in the normal case.

> -(define (read-response-body r)
> +(define* (read-response-body r #:key (block 4096))
>    "Reads the response body from @var{r}, as a bytevector.  Returns
>  @code{#f} if there was no response body."
> -  (let ((nbytes (response-content-length r)))
> -    (and nbytes
> -         (let ((bv (get-bytevector-n (response-port r) nbytes)))
> -           (if (= (bytevector-length bv) nbytes)
> -               bv
> -               (bad-response "EOF while reading response body: ~a bytes of ~a"
> -                            (bytevector-length bv) nbytes))))))
> +  (let* ((nbytes (response-content-length r))
> +         (bv (and nbytes (make-bytevector nbytes)))
> +         (start 0))
> +    (catch #t
> +           (lambda ()
> +             (let lp((buf (get-bytevector-n (response-port r) block)))
> +                 (if (eof-object? buf)
> +                     bv
> +                     (let ((len (bytevector-length buf)))
> +                       (cond
> +                        ((<= len block)
> +                         (bytevector-copy! buf 0 bv start len)
> +                         (set! start (+ start len))
> +                         (lp (get-bytevector-n (response-port r) block)))
> +                        (else
> +                         (bad-response "EOF while reading response body: ~a bytes of ~a"
> +                                       start nbytes)))))))
The manual buffering is superfluous. get-bytevector-n already does this
under the hood, and much more efficiently. Even if it didn't, reading in
smaller chunks is only a win if you are also processing it in chunks,
since large blocks can make the gc interfere more often.

The only conceivable difference I could see is that you are choosing to
return an eof object rather of erroring. I'm not convinced of the
utility of that: #f or #vu8() I could understand, eof less so.

> +             (lambda (k . e)
> +               (let ((received (call-with-port 
> +                                (open-bytevector-input-port bv) 
> +                                (lambda (port)
> +                                  (get-bytevector-n port start)))))
> +                 (throw k `(,@e (body ,@received))) ;; return the received data
> +                 )))))
> +
yuck. This consing a body symbol is hideous IMO. From a programmatic
point of view it is unnecessary, and all it's adding is requiring the
user to perform an extra cadr. It would be much better to choose a
different key to throw to, rather than doing it this way.

> +;; output the received data if there is, or do nothing
> +(define (output-received-response-body e port)
> +  (let ((received (assoc-ref (cadr e) 'body)))
> +    (if received
> +        (begin
> +          (put-bytevector port received) 
> +          (force-output port)))))
> +   
> +;; Exceptional information contains the received bytevector added from the
> +;; read-response-body if any exception had been caught.
> +;; If received data ware huge(it always does), it'd be a trouble during the tracing.
> +;; This helper function could get rid of the received data from exceptional info,
> +;; and re-throw it.
> +(define (throw-from-response-body-break e)
> +  (throw (car e) (list-head (cdr e) (1- (length (cdr e))))))
>  
>  (define (write-response-body r bv)
>    "Write @var{bv}, a bytevector, to the port corresponding to the HTTP

I know I don't have any sort of authority, but I'd like to veto
these. Special procedures to deal with an ad-hoc protocol layered over
another more appropriate protocol is just ugly. As I already mentioned,
exceptions have tags, and this is what these are for.

Fundamentally, I think this patch could be simplified to checking for an
eof from get-bytevector-n and changing the bad-response to an
"incomplete-response" that provides the bytevector.

What does everyone else think?

-- 
Ian Price

"Programming is like pinball. The reward for doing it well is
the opportunity to do it again" - from "The Wizardy Compiled"

     prev parent reply	other threads:[~2012-03-15 18:31 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-03-11 15:35 [PATCH] read-response-body should return received data when any break happens Nala Ginrut
2012-03-15 16:09 ` Daniel Hartwig
2012-03-15 18:37   ` Ian Price
2012-03-15 18:48     ` Daniel Hartwig
2012-03-16  2:23       ` Nala Ginrut
2012-03-16  3:24         ` Daniel Hartwig
2012-03-16  3:31         ` Nala Ginrut
2012-03-15 18:31 ` Ian Price [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/guile/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87bonxu353.fsf@Kagami.home \
    --to=ianprice90@googlemail.com \
    --cc=guile-devel@gnu.org \
    --cc=nalaginrut@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).