unofficial mirror of bug-guile@gnu.org 
* bug#15368: HTTP client is slow [2.0.9]
@ 2013-09-13 13:41 Ludovic Courtès
  2013-09-13 16:07 ` Mark H Weaver
  2014-05-23 20:14 ` Ludovic Courtès
  0 siblings, 2 replies; 4+ messages in thread
From: Ludovic Courtès @ 2013-09-13 13:41 UTC
  To: 15368

I just noticed that our HTTP client is very slow.  Consider this:

--8<---------------cut here---------------start------------->8---
(use-modules (web client)
             (rnrs io ports)
             (rnrs bytevectors)
             (srfi srfi-11)
             (ice-9 format))

(define %uri
  "http://ftp.gnu.org/gnu/idutils/idutils-4.6.tar.xz")

(with-fluids ((%default-port-encoding #f))
  (let*-values (((start)
                 (gettimeofday))
                ((p)
                 (let ((s (open-socket-for-uri %uri)))
                   (setvbuf s _IONBF)
                   s))
                ((r h)
                 (http-get %uri
                           #:port p
                           #:streaming? #t
                           #:decode-body? #f))
                ((d len)
                 (let ((b (get-bytevector-all h)))
                   (values b (bytevector-length b)))
                 ;; (let ((b (make-bytevector (* 5 (expt 2 20)))))
                 ;;   (values b
                 ;;           (get-bytevector-n! h b 0 (bytevector-length b))))
                 )
                ((end)
                 (gettimeofday))
                ((throughput)
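                 ;; gettimeofday returns (seconds . microseconds); use both parts.
                 (let ((duration (+ (- (car end) (car start))
                                    (/ (- (cdr end) (cdr start)) 1e6))))
                   (/ (/ len 1024.) duration))))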
    (format #t "~5,1f KiB/s (total: ~5,1f KiB)~%"
            throughput (/ len 1024.))))
--8<---------------cut here---------------end--------------->8---

Here I get a throughput of ~60 KiB/s, vs. ~400 KiB/s for wget and curl.

Looking at the strace output reveals no real difference: they all make
one syscall for each chunk of 1410 bytes.

‘time’ reports that Guile spends 0.2 s. in user and 0.8 s. in system,
both of which are an order of magnitude higher than wget/curl.

Bypassing the custom binary input ports from http.scm and response.scm
doesn’t make any big difference.  Forcing the zero-copy path in
‘scm_c_read’ doesn’t help much either.

Ideas?

Thanks,
Ludo’.






* bug#15368: HTTP client is slow [2.0.9]
  2013-09-13 13:41 bug#15368: HTTP client is slow [2.0.9] Ludovic Courtès
@ 2013-09-13 16:07 ` Mark H Weaver
  2013-09-13 21:14   ` Ludovic Courtès
  2014-05-23 20:14 ` Ludovic Courtès
  1 sibling, 1 reply; 4+ messages in thread
From: Mark H Weaver @ 2013-09-13 16:07 UTC
  To: Ludovic Courtès; +Cc: 15368

Hi Ludovic,

ludo@gnu.org (Ludovic Courtès) writes:

> I just noticed that our HTTP client is very slow.  Consider this:
>
> (use-modules (web client)
>              (rnrs io ports)
>              (rnrs bytevectors)
>              (srfi srfi-11)
>              (ice-9 format))
>
> (define %uri
>   "http://ftp.gnu.org/gnu/idutils/idutils-4.6.tar.xz")
>
> (with-fluids ((%default-port-encoding #f))
>   (let*-values (((start)
>                  (gettimeofday))
>                 ((p)
>                  (let ((s (open-socket-for-uri %uri)))
>                    (setvbuf s _IONBF)

Why are you using an unbuffered port?  On my system, changing this to
_IOFBF increases throughput from 326 KiB/s to 489.0 KiB/s.

Also, the fact that my throughput is so much higher than yours (on a
several-year-old computer) is interesting.  Obviously I have a faster
net connection (wget reports 1.19M/s), but the fact that Guile can
benefit so much from my faster connection suggests that the body is read
reasonably efficiently.  I guess the problem is added latency somewhere,
or perhaps inefficiency in the writing of the request or reading of the
response headers.

Note that using an unbuffered port means that all the reads of the
response headers will be done 1 byte at a time.

>                    s))
>                 ((r h)
>                  (http-get %uri
>                            #:port p
>                            #:streaming? #t
>                            #:decode-body? #f))
>                 ((d len)
>                  (let ((b (get-bytevector-all h)))
>                    (values b (bytevector-length b)))
>                  ;; (let ((b (make-bytevector (* 5 (expt 2 20)))))
>                  ;;   (values b
>                  ;;           (get-bytevector-n! h b 0 (bytevector-length b))))
>                  )
>                 ((end)
>                  (gettimeofday))
>                 ((throughput)
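>                  ;; gettimeofday returns (seconds . microseconds); use both parts.
>                  (let ((duration (+ (- (car end) (car start))
>                                     (/ (- (cdr end) (cdr start)) 1e6))))
>                    (/ (/ len 1024.) duration))))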
>     (format #t "~5,1f KiB/s (total: ~5,1f KiB)~%"
>             throughput (/ len 1024.))))
>
> Here I get a throughput of ~60 KiB/s, vs. ~400 KiB/s for wget and curl.
>
> Looking at the strace output reveals no real difference: they all make
> one syscall for each chunk of 1410 bytes.
>
> ‘time’ reports that Guile spends 0.2 s. in user and 0.8 s. in system,
> both of which are an order of magnitude higher than wget/curl.

If they make essentially the same syscalls, then why would the system
time be an order of magnitude higher?  Something doesn't sound right
here.

    Regards,
      Mark






* bug#15368: HTTP client is slow [2.0.9]
  2013-09-13 16:07 ` Mark H Weaver
@ 2013-09-13 21:14   ` Ludovic Courtès
  0 siblings, 0 replies; 4+ messages in thread
From: Ludovic Courtès @ 2013-09-13 21:14 UTC
  To: Mark H Weaver; +Cc: 15368

Mark H Weaver <mhw@netris.org> writes:

> ludo@gnu.org (Ludovic Courtès) writes:
>
>> I just noticed that our HTTP client is very slow.  Consider this:
>>
>> (use-modules (web client)
>>              (rnrs io ports)
>>              (rnrs bytevectors)
>>              (srfi srfi-11)
>>              (ice-9 format))
>>
>> (define %uri
>>   "http://ftp.gnu.org/gnu/idutils/idutils-4.6.tar.xz")
>>
>> (with-fluids ((%default-port-encoding #f))
>>   (let*-values (((start)
>>                  (gettimeofday))
>>                 ((p)
>>                  (let ((s (open-socket-for-uri %uri)))
>>                    (setvbuf s _IONBF)
>
> Why are you using an unbuffered port?  On my system, changing this to
> _IOFBF increases throughput from 326 KiB/s to 489.0 KiB/s.

Arf, that’s because I was also forcing the zero-copy path in ‘scm_c_read’
(which is currently never taken, and that is a bug):


diff --git a/libguile/ports.c b/libguile/ports.c
index 9068c5c..c217712 100644
--- a/libguile/ports.c
+++ b/libguile/ports.c
@@ -1657,7 +1657,8 @@ scm_c_read (SCM port, void *buffer, size_t size)
      requested number of bytes.  (Note that a single scm_i_fill_input
      call does not guarantee to fill the whole of the port's read
      buffer.) */
-  if (pt->read_buf_size <= 1 && pt->encoding == NULL)
+  if (pt->read_buf_size <= 1
+      && (pt->encoding == NULL || strcmp (pt->encoding, "ISO-8859-1") == 0))
     {
       /* The port that we are reading from is unbuffered - i.e. does
 	 not have its own persistent buffer - but we have a buffer,


So in practice it was reading several KiB at a time, doing zero-copy.

> Also, the fact that my throughput is so much higher than yours (on a
> several-year-old computer) is interesting.  Obviously I have a faster
> net connection (wget reports 1.19M/s),

So for you wget is ~2.5 times faster than Guile, right?

[...]

>> Looking at the strace output reveals no real difference: they all make
>> one syscall for each chunk of 1410 bytes.
>>
>> ‘time’ reports that Guile spends 0.2 s. in user and 0.8 s. in system,
>> both of which are an order of magnitude higher than wget/curl.
>
> If they make essentially the same syscalls, then why would the system
> time be an order of magnitude higher?  Something doesn't sound right
> here.

I concur.

I’ve tried Linux perf and OProfile but failed to get useful info.

Ludo’.


* bug#15368: HTTP client is slow [2.0.9]
  2013-09-13 13:41 bug#15368: HTTP client is slow [2.0.9] Ludovic Courtès
  2013-09-13 16:07 ` Mark H Weaver
@ 2014-05-23 20:14 ` Ludovic Courtès
  1 sibling, 0 replies; 4+ messages in thread
From: Ludovic Courtès @ 2014-05-23 20:14 UTC
  To: 15368-done

ludo@gnu.org (Ludovic Courtès) writes:

> Here I get a throughput of ~60 KiB/s, vs. ~400 KiB/s for wget and curl.

There’s one little detail I hadn’t even bothered checking:

          ;; Enlarge the receive buffer.
          (setsockopt s SOL_SOCKET SO_RCVBUF (* 12 1024))


Its effect was to *shrink* the receive buffer from 124 KiB (the default
size, per /proc/sys/net/core/rmem_default) to 12 KiB...

Fixed in 0bb3f94, which will be in 2.0.12.

Ludo’.





