unofficial mirror of bug-guix@gnu.org 
 help / color / mirror / code / Atom feed
From: Christopher Baines <mail@cbaines.net>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: 47283@debbugs.gnu.org
Subject: bug#47283: Performance regression in narinfo fetching
Date: Tue, 23 Mar 2021 20:47:12 +0000	[thread overview]
Message-ID: <87k0pxbnsf.fsf@cbaines.net> (raw)
In-Reply-To: <87czvs19tp.fsf@gnu.org>

[-- Attachment #1: Type: text/plain, Size: 5622 bytes --]


Ludovic Courtès <ludo@gnu.org> writes:

> Christopher Baines <mail@cbaines.net> skribis:
>
>> Ludovic Courtès <ludo@gnu.org> writes:
>>
>>> Indeed, there’s one place on the hot path where we install exception
>>> handlers: in ‘http-multiple-get’ (from commit
>>> 205833b72c5517915a47a50dbe28e7024dc74e57).  I don’t think it’s needed,
>>> is it?  (But if it is, let’s find another approach, this one is
>>> prohibitively expensive.)
>>
>> I think the exception handling has moved around, but I guess the
>> exceptions that could be caught in http-multiple-get could happen,
>> right? I am really just guessing here, as Guile doesn't help tell you
>> about possible exceptions, and I haven't spent enough time to read all
>> the possible code involved to find out if these are definitely possible.
>
> Yeah.
>
> Commit 205833b72c5517915a47a50dbe28e7024dc74e57 added a ‘catch’ block
> that catches the same things as ‘with-cached-connection’ did (it would
> be better to not duplicate it IMO).  That includes EPIPE, gnutls-error,
> bad-response & co.

So, my intention here was to move the error handling, to allow
separating out the connection caching code from the code I wanted to
move out to the (guix substitutes) module. I don't think there's
currently duplication in the error handling for the code path involving
http-multiple-get currently, at least for the exceptions in question
here.

> Earlier, commit be5a75ebb5988b87b2392e2113f6590f353dd6cd (“substitute:
> Reuse connections for '--query'.”) did not add such a ‘catch’ block in
> ‘http-multiple-get’.  Instead, it wrapped its call in ‘do-fetch’ in
> ‘fetch-narinfos’:
>
>    (define (do-fetch uri)
>      (case (and=> uri uri-scheme)
>        ((http https)
> -       (let ((requests (map (cut narinfo-request url <>) paths)))
> -         (match (open-connection-for-uri/maybe uri)
> -           (#f
> -            '())
> -           (port
> -            (update-progress!)
>         ;; Note: Do not check HTTPS server certificates to avoid depending
>         ;; on the X.509 PKI.  We can do it because we authenticate
>         ;; narinfos, which provides a much stronger guarantee.
> -            (let ((result (http-multiple-get uri
> +       (let* ((requests (map (cut narinfo-request url <>) paths))
> +              (result   (call-with-cached-connection uri
> +                          (lambda (port)
> +                            (if port
> +                                (begin
> +                                  (update-progress!)
> +                                  (http-multiple-get uri
>                                                       handle-narinfo-response '()
>                                                       requests
> +                                                     #:open-connection
> +                                                     open-connection-for-uri/cached
>                                                       #:verify-certificate? #f
> -                                             #:port port)))
>
> This bit is still there in current ‘master’, so I think it’s not
> necessary to catch these exceptions in ‘http-multiple-get’ itself, and I
> would just remove the ‘catch’ wrap altogether.
>
> WDYT?

I'm not sure what you're referring to as still being there on the master
branch?

Looking at the changes to this particular code path resulting from the
changes I've made recently, starting at lookup-narinfos, before:

 - lookup-narinfos calls fetch-narinfos, which calls do-fetch

 - call-with-cached-connection is used, which catches a number of
   exceptions relating to requests, and will retry PROC once upon a
   matching exception

 - open-connection-for-uri/maybe is also used, which is like
   open-connection-for-uri/cached, except it includes error handling for
   establishing connections to substitute servers

 - http-multiple-get doesn't include error handling

After:

 - lookup-narinfos calls fetch-narinfos, which calls do-fetch

 - call-with-connection-error-handling is used, which performs the same
   role as the error handling previously within
   open-connection-for-uri/maybe, catching exceptions relating to
   establishing connections to substitute servers

 - http-multiple-get now includes error handling similar to what was
   previously done by call-with-cached-connection, although it's more
   complicated since it's done with knowledge of what http-multiple-get
   is doing

I think that the error handling now in http-multiple-get isn't covered
elsewhere. Moving this error handling back in to fetch-narinfos is
possible, but then we'd be back to handling connection caching in that
code, and avoiding that led to this refactoring in the first place.

Also, apart from the implementation problems, I do think that the error
handling here is better than before. Previously, if you called
lookup-narinfos, and a connection problem occurred, processing all the
requests would start from scratch (as call-with-cached-connection calls
PROC a second time), if a second connection error was to happen, well,
call-with-cached-connection only handles one error, so that won't be
caught.

I think it's possible that http-multiple-get will be making thousands of
requests, running guix weather with no cached results for example. The
error handling in http-multiple-get is smarter than the previous
approach, doing as little as possible again. It's also not limited to
catching one exception.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 987 bytes --]

  reply	other threads:[~2021-03-23 20:49 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-20 17:38 bug#47283: Performance regression in narinfo fetching Ludovic Courtès
2021-03-20 20:32 ` Christopher Baines
2021-03-21  0:48   ` Christopher Baines
2021-03-21 21:10     ` Ludovic Courtès
2021-03-21 21:22   ` Ludovic Courtès
2021-03-23 20:47     ` Christopher Baines [this message]
2021-03-24 14:51       ` Ludovic Courtès

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://guix.gnu.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87k0pxbnsf.fsf@cbaines.net \
    --to=mail@cbaines.net \
    --cc=47283@debbugs.gnu.org \
    --cc=ludo@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).