From: "Ludovic Courtès" <ludo@gnu.org>
To: Christopher Baines <mail@cbaines.net>
Cc: 47288@debbugs.gnu.org
Subject: [bug#47288] [PATCH] guix: http-client: Tweak http-multiple-get error handling.
Date: Thu, 25 Mar 2021 23:20:57 +0100
Message-ID: <87r1k250za.fsf_-_@gnu.org>
In-Reply-To: <20210325110316.862-1-mail@cbaines.net> (Christopher Baines's message of "Thu, 25 Mar 2021 11:03:15 +0000")

[-- Attachment #1: Type: text/plain, Size: 4389 bytes --]

Hi!

Christopher Baines <mail@cbaines.net> skribis:

> This isn't meant to change the way errors are handled, and arguably makes the
> code harder to read, but it's an uninformed attempt to improve the
> performance (following on from a performance regression in
> 205833b72c5517915a47a50dbe28e7024dc74e57).
>
> I'm guessing something about Guile internals makes calling (loop ...) within
> the catch bit less performant than avoiding this and calling (loop ...) after
> the catch bit has finished. Since this happens lots, this seems to be
> sufficient to make guix weather a lot slower than it was before.

As Maxime wrote, the problem is that we were making non-tail calls,
thereby consuming stack space as well as accumulating exception
handlers.  As discussed earlier, ‘raise-exception’ exhibits quadratic
behavior in the number of exception handlers, which is okay in normal
situations, but not so much when there are thousands of handlers, as is
the case when asking for many substitutes.
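
To make the non-tail-call point concrete, here is a minimal sketch.  It is
not code from guix/http-client.scm: 'step', 'count-nested' and 'count-flat'
are hypothetical names chosen for illustration.  Both procedures compute the
same value; they differ only in whether the recursive call happens inside
the 'catch' thunk or after 'catch' has returned.

```scheme
;; Sketch only: all names below are hypothetical.

(define (step n)
  ;; Stand-in for "read one response"; it never throws in this sketch.
  (+ n 1))

(define (rethrow key . args)
  ;; Handler that simply re-raises, like the fallback branch in the patch.
  (apply throw key args))

;; The recursive call sits inside the 'catch' thunk, so it is not a tail
;; call of 'count-nested': each iteration keeps a stack frame and one more
;; handler alive until the innermost call returns.
(define (count-nested n limit)
  (catch #t
    (lambda ()
      (if (< n limit)
          (count-nested (step n) limit)
          n))
    rethrow))

;; Here 'catch' only protects the single step and returns a plain value.
;; The recursive call is made afterwards, in tail position, so at most one
;; handler is installed at any time.
(define (count-flat n limit)
  (if (< n limit)
      (let ((next (catch #t
                    (lambda () (step n))
                    rethrow)))
        (count-flat next limit))
      n))
```

With a few thousand iterations, 'count-nested' holds that many handlers at
once, which is the situation described above; 'count-flat' does not.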

> Anecdotal testing of guix weather suggests this change might work.

Don’t leave this last sentence in the actual commit.  :-)

Please mention <https://bugs.gnu.org/47283>.

> * guix/http-client.scm (http-multiple-get): Tweak how the second catch
> statement works.
> ---
>  guix/http-client.scm | 77 +++++++++++++++++++++++++-------------------
>  1 file changed, 43 insertions(+), 34 deletions(-)
>
> diff --git a/guix/http-client.scm b/guix/http-client.scm
> index 4b4c14ed0b..adbfbc0d6e 100644
> --- a/guix/http-client.scm
> +++ b/guix/http-client.scm
> @@ -219,42 +219,51 @@ returning."
>               (remainder
>                (connect p remainder result))))
>            ((head tail ...)

[...]

> +           (match
> +               (catch #t
> +                 (lambda ()
> +                   (let* ((resp   (read-response p))
> +                          (body   (response-body-port resp))
> +                          (result (proc head resp body result)))
> +                     ;; The server can choose to stop responding at any time,
> +                     ;; in which case we have to try again.  Check whether
> +                     ;; that is the case.  Note that even upon "Connection:
> +                     ;; close", we can read from BODY.
> +                     (match (assq 'connection (response-headers resp))
> +                       (('connection 'close)
> +                        (close-port p)
> +                        (list 'connect
> +                              #f
>                                (drop requests (+ 1 processed))
>                                result))
> -                   (apply throw key args))))))))))
> +                       (_
> +                        (list 'loop tail (+ 1 processed) result)))))
> +                 (lambda (key . args)
> +                   ;; If PORT was cached and the server closed the connection
> +                   ;; in the meantime, we get EPIPE.  In that case, open a
> +                   ;; fresh connection and retry.  We might also get
> +                   ;; 'bad-response or a similar exception from (web response)
> +                   ;; later on, once we've sent the request, or a
> +                   ;; ERROR/INVALID-SESSION from GnuTLS.
> +                   (if (or (and (eq? key 'system-error)
> +                                (= EPIPE (system-error-errno `(,key ,@args))))
> +                           (and (eq? key 'gnutls-error)
> +                                (eq? (first args) error/invalid-session))
> +                           (memq key
> +                                 '(bad-response
> +                                   bad-header
> +                                   bad-header-component)))
> +                       (begin
> +                         (close-port p)
> +                         (list 'connect
> +                               #f
> +                               (drop requests processed)
> +                               result))
> +                       (apply throw key args))))
> +             (('connect . args)
> +              (apply connect args))
> +             (('loop . args)
> +              (apply loop args)))))))))

OK to write it this way as the first commit, to ease review.

What about the approach below:


[-- Attachment #2: Type: text/x-patch, Size: 5247 bytes --]

diff --git a/guix/http-client.scm b/guix/http-client.scm
index 4b4c14ed0b..6351e2d051 100644
--- a/guix/http-client.scm
+++ b/guix/http-client.scm
@@ -1,5 +1,5 @@
 ;;; GNU Guix --- Functional package management for GNU
-;;; Copyright © 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2020 Ludovic Courtès <ludo@gnu.org>
+;;; Copyright © 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2020, 2021 Ludovic Courtès <ludo@gnu.org>
 ;;; Copyright © 2015 Mark H Weaver <mhw@netris.org>
 ;;; Copyright © 2012, 2015 Free Software Foundation, Inc.
 ;;; Copyright © 2017 Tobias Geerinckx-Rice <me@tobias.gr>
@@ -147,6 +147,27 @@ Raise an '&http-get-error' condition if downloading fails."
                                 (uri->string uri) code
                                 (response-reason-phrase resp))))))))))))
 
+(define-syntax-rule (false-if-networking-error exp)
+  "Return #f if EXP triggers a network related exception."
+  ;; FIXME: Duplicated from 'with-cached-connection'.
+  (catch #t
+    (lambda ()
+      exp)
+    (lambda (key . args)
+      ;; If PORT was cached and the server closed the connection in the
+      ;; meantime, we get EPIPE.  In that case, open a fresh connection and
+      ;; retry.  We might also get 'bad-response or a similar exception from
+      ;; (web response) later on, once we've sent the request, or a
+      ;; ERROR/INVALID-SESSION from GnuTLS.
+      (if (or (and (eq? key 'system-error)
+                   (= EPIPE (system-error-errno `(,key ,@args))))
+              (and (eq? key 'gnutls-error)
+                   (eq? (first args) error/invalid-session))
+              (memq key
+                    '(bad-response bad-header bad-header-component)))
+          #f
+          (apply throw key args)))))
+
 (define* (http-multiple-get base-uri proc seed requests
                             #:key port (verify-certificate? #t)
                             (open-connection guix:open-connection-for-uri)
@@ -219,42 +240,27 @@ returning."
              (remainder
               (connect p remainder result))))
           ((head tail ...)
-           (catch #t
-             (lambda ()
-               (let* ((resp   (read-response p))
-                      (body   (response-body-port resp))
-                      (result (proc head resp body result)))
-                 ;; The server can choose to stop responding at any time,
-                 ;; in which case we have to try again.  Check whether
-                 ;; that is the case.  Note that even upon "Connection:
-                 ;; close", we can read from BODY.
-                 (match (assq 'connection (response-headers resp))
-                   (('connection 'close)
-                    (close-port p)
-                    (connect #f                       ;try again
-                             (drop requests (+ 1 processed))
-                             result))
-                   (_
-                    (loop tail (+ 1 processed) result))))) ;keep going
-             (lambda (key . args)
-               ;; If PORT was cached and the server closed the connection
-               ;; in the meantime, we get EPIPE.  In that case, open a
-               ;; fresh connection and retry.  We might also get
-               ;; 'bad-response or a similar exception from (web response)
-               ;; later on, once we've sent the request, or a
-               ;; ERROR/INVALID-SESSION from GnuTLS.
-               (if (or (and (eq? key 'system-error)
-                            (= EPIPE (system-error-errno `(,key ,@args))))
-                       (and (eq? key 'gnutls-error)
-                            (eq? (first args) error/invalid-session))
-                       (memq key
-                             '(bad-response bad-header bad-header-component)))
-                   (begin
-                     (close-port p)
-                     (connect #f      ; try again
-                              (drop requests (+ 1 processed))
-                              result))
-                   (apply throw key args))))))))))
+           (match (false-if-networking-error (read-response p))
+             ((? response? resp)
+              (let* ((body   (response-body-port resp))
+                     (result (proc head resp body result)))
+                ;; The server can choose to stop responding at any time,
+                ;; in which case we have to try again.  Check whether
+                ;; that is the case.  Note that even upon "Connection:
+                ;; close", we can read from BODY.
+                (match (assq 'connection (response-headers resp))
+                  (('connection 'close)
+                   (close-port p)
+                   (connect #f                    ;try again
+                            (drop requests (+ 1 processed))
+                            result))
+                  (_
+                   (loop tail (+ 1 processed) result)))))
+             (#f
+              (close-port p)
+              (connect #f                         ; try again
+                       (drop requests (+ 1 processed))
+                       result)))))))))
 
 \f
 ;;;

[-- Attachment #3: Type: text/plain, Size: 187 bytes --]


I believe it’s a bit more readable because it moves ‘catch’ out of sight
and avoids the sort of “mini DSL” where we return lists of arguments.
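
As a reduced illustration of that shape (a sketch under assumptions, not the
patch itself): 'read-item', 'false-if-bad-response' and 'process' are
hypothetical, and a single 'bad-response' key stands in for the set of
networking errors.  The macro hides 'catch' and yields #f on failure, so the
caller can 'match' on the value and call the loop directly in tail position.

```scheme
(use-modules (ice-9 match))

;; Hypothetical stand-in for a read that may fail.
(define (read-item x)
  (if (negative? x)
      (throw 'bad-response "negative input" x)
      (* 2 x)))

;; Return #f if EXP throws 'bad-response; other exceptions propagate.
(define-syntax-rule (false-if-bad-response exp)
  (catch 'bad-response
    (lambda () exp)
    (lambda (key . args) #f)))

;; Process ITEMS in order, skipping those whose read fails.  Both 'loop'
;; calls are tail calls made after 'catch' has returned.
(define (process items)
  (let loop ((items items) (result '()))
    (match items
      (() (reverse result))
      ((head tail ...)
       (match (false-if-bad-response (read-item head))
         ((? number? value)
          (loop tail (cons value result)))
         (#f
          (loop tail result)))))))
```

One caveat of this style is that it only works when #f cannot be a
legitimate result of EXP; that holds in the patch as long as 'read-response'
returns a response record on success.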

WDYT?

Thanks,
Ludo’.

Thread overview: 11+ messages
2021-03-21  0:43 [bug#47288] [PATCH] guix: http-client: Tweak http-multiple-get error handling Christopher Baines
2021-03-21  0:56 ` [bug#47288] [PATCH v2] " Christopher Baines
2021-03-24 14:55   ` [bug#47288] [PATCH] " Ludovic Courtès
2021-03-25 11:09     ` Christopher Baines
2021-03-24 14:55   ` Ludovic Courtès
2021-03-21  8:36 ` Maxime Devos
2021-03-25 11:03 ` [bug#47288] [PATCH v3 1/2] " Christopher Baines
2021-03-25 11:03   ` [bug#47288] [PATCH v3 2/2] guix: http-client: Refactor http-multiple-get Christopher Baines
2021-03-25 22:20   ` Ludovic Courtès [this message]
2021-03-26  8:39     ` [bug#47288] [PATCH] guix: http-client: Tweak http-multiple-get error handling Christopher Baines
2021-03-27 17:15       ` Ludovic Courtès