unofficial mirror of bug-guix@gnu.org 
 help / color / mirror / code / Atom feed
From: Mathieu Othacehe <othacehe@gnu.org>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: 48468@debbugs.gnu.org
Subject: bug#48468: substitute server connection timeout
Date: Wed, 28 Dec 2022 15:23:21 +0100	[thread overview]
Message-ID: <87v8lvsj3a.fsf@gnu.org> (raw)
In-Reply-To: <87wn72fay3.fsf@gnu.org> ("Ludovic Courtès"'s message of "Thu, 08 Dec 2022 11:26:28 +0100")

[-- Attachment #1: Type: text/plain, Size: 5267 bytes --]


Hello,

So I had an other look to that one,

> (From <https://ci.guix.gnu.org/build/224849/log/raw>.)
>
> Plausible explanations that come to mind:
>
>   1. ‘guix publish’ returning 404, but not due to baking.  Instead the
>      .drv is simply not in store, hence 404.
>
>   2. Client timeout (‘guix publish’ fails to reply on time).
>
>   3. Cached 404 in nginx (though we’re not supposed to cache those I
>      think?), or timeout in nginx (again due to ‘guix publish’ being too
>      slow and ‘proxy_read_timeout’ is reached, currently 10s).

I focused on https://ci.guix.gnu.org/build/308493/details which is
similar to the one linked above. Some derivations are successfully
substituted then, one is not and it aborts.

This build stopped at 2022/12/24 00:25:38 CET. At that time the nginx error
log on Berlin looks like:

--8<---------------cut here---------------start------------->8---
2022/12/24 00:25:23 [info] 128755#0: *178076209 recv() failed (104: Connection reset by peer) while sending to client, client: 78.47.68.4, server: ci.guix.gnu.org, request: "GET /x3c2f9zzxhk0ci81nrdsxzglp4b1h3di.narinfo HTTP/1.1", upstream: "http://127.0.0.1:3000/x3c2f9zzxhk0ci81nrdsxzglp4b1h3di.narinfo", host: "ci.guix.gnu.org"
2022/12/24 00:25:24 [info] 128742#0: *178076816 client 174.59.221.51 closed keepalive connection
2022/12/24 00:25:25 [info] 128755#0: *178076830 recv() failed (104: Connection reset by peer) while sending to client, client: 78.47.68.4, server: ci.guix.gnu.org, request: "GET /vx7lgvrcw6i6r0inw513qn6fj3cb27mm.narinfo HTTP/1.1", upstream: "http://127.0.0.1:3000/vx7lgvrcw6i6r0inw513qn6fj3cb27mm.narinfo", host: "ci.guix.gnu.org"
2022/12/24 00:25:26 [info] 128742#0: *178077436 recv() failed (104: Connection reset by peer) while sending to client, client: 78.47.68.4, server: ci.guix.gnu.org, request: "GET /b1g8vyqp8c52yk6vscdvjflci5mnladq.narinfo HTTP/1.1", upstream: "http://127.0.0.1:3000/b1g8vyqp8c52yk6vscdvjflci5mnladq.narinfo", host: "ci.guix.gnu.org"
2022/12/24 00:25:28 [info] 128742#0: *178078045 recv() failed (104: Connection reset by peer) while sending to client, client: 78.47.68.4, server: ci.guix.gnu.org, request: "GET /007zgflsl5xkr377wpakbsis5c2yqh1q.narinfo HTTP/1.1", upstream: "http://127.0.0.1:3000/007zgflsl5xkr377wpakbsis5c2yqh1q.narinfo", host: "ci.guix.gnu.org"
2022/12/24 00:25:32 [info] 128742#0: *178078659 recv() failed (104: Connection reset by peer) while sending to client, client: 78.47.68.4, server: ci.guix.gnu.org, request: "GET /xdfyi8jki350i8njk1glrzg9p0va18vy.narinfo HTTP/1.1", upstream: "http://127.0.0.1:3000/xdfyi8jki350i8njk1glrzg9p0va18vy.narinfo", host: "ci.guix.gnu.org"
2022/12/24 00:25:33 [info] 128741#0: *178079271 recv() failed (104: Connection reset by peer) while sending to client, client: 78.47.68.4, server: ci.guix.gnu.org, request: "GET /rzs7jiwhyls2bg8zv2g1vmmgcfwkvaax.narinfo HTTP/1.1", upstream: "http://127.0.0.1:3000/rzs7jiwhyls2bg8zv2g1vmmgcfwkvaax.narinfo", host: "ci.guix.gnu.org"
2022/12/24 00:25:36 [info] 128741#0: *178080049 client 157.55.39.212 closed keepalive connection
2022/12/24 00:25:38 [info] 128741#0: *178080047 peer closed connection in SSL handshake while SSL handshaking, client: 127.0.0.1, server: 0.0.0.0:443
--8<---------------cut here---------------end--------------->8---

That would, in that specific case, invalidate your hypothesis number
2. When 'guix publish' timeouts, we have messages like:

--8<---------------cut here---------------start------------->8---
2022/12/24 00:45:36 [error] 128749#0: *178225211 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 86.87.253.49, server: ci.guix.gnu.org, request: "GET /m0hqyvj8di41xlraz087kmpr38zaw90r.narinfo HTTP/1.1", upstream: "http://127.0.0.1:3000/m0hqyvj8di41xlraz087kmpr38zaw90r.narinfo", host: "ci.guix.gnu.org"
--8<---------------cut here---------------end--------------->8---

However, like suggested in your hypothesis number 1, it seems instead
that we are replying 404 to the worker which resets the connection. As
we have put aside the baking thing, the question is now why are those
derivations not available?

Are they not part of Berlin's store at that time? Or is the publish
server erroneously returns 404?

It looks like the requested derivations are still absent, as of right
now:

--8<---------------cut here---------------start------------->8---
mathieu@berlin /var/log/nginx$ ls /gnu/store/vx7lgvrcw6i6r0inw513qn6fj3cb27mm*
ls: cannot access '/gnu/store/vx7lgvrcw6i6r0inw513qn6fj3cb27mm*': No such file or directory
mathieu@berlin /var/log/nginx$ ls /gnu/store/b1g8vyqp8c52yk6vscdvjflci5mnladq*
ls: cannot access '/gnu/store/b1g8vyqp8c52yk6vscdvjflci5mnladq*': No such file or directory
mathieu@berlin /var/log/nginx$ ls /gnu/store/007zgflsl5xkr377wpakbsis5c2yqh1q*
ls: cannot access '/gnu/store/007zgflsl5xkr377wpakbsis5c2yqh1q*': No such file or directory
--8<---------------cut here---------------end--------------->8---

As I don't have much clue about what are those derivations, I think we
should instrument a bit the publish server and maybe the substitute
script like proposed in the attachments.

WDYT?

Thanks,

Mathieu

[-- Attachment #2: 0001-scripts-publish-Add-a-custom-baking-header.patch --]
[-- Type: text/x-patch, Size: 3120 bytes --]

From 9f9c839937ac2edd1b5901b2262c4be0954fa20c Mon Sep 17 00:00:00 2001
From: Mathieu Othacehe <othacehe@gnu.org>
Date: Wed, 28 Dec 2022 15:12:46 +0100
Subject: [PATCH 1/2] scripts: publish: Add a custom baking header.

Log the not-found responses and their reason (baking or not) to stdout. Also
send the X-Baking custom header so that the client can be informed of the
cause of the failure.

* guix/scripts/publish.scm (not-found): Add a baking? argument to add the
X-Baking HTTP header to the response if baking is in progress.  Also, log the
404 responses to stdout, indicating if it is due to baking or not.
(render-narinfo/cached): Pass the baking? argument.
---
 guix/scripts/publish.scm | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/guix/scripts/publish.scm b/guix/scripts/publish.scm
index 3bf3bd9c7c..11fedf092e 100644
--- a/guix/scripts/publish.scm
+++ b/guix/scripts/publish.scm
@@ -4,7 +4,7 @@
 ;;; Copyright © 2015-2022 Ludovic Courtès <ludo@gnu.org>
 ;;; Copyright © 2020 Maxim Cournoyer <maxim.cournoyer@gmail.com>
 ;;; Copyright © 2021 Simon Tournier <zimon.toutoune@gmail.com>
-;;; Copyright © 2021 Mathieu Othacehe <othacehe@gnu.org>
+;;; Copyright © 2021, 2022 Mathieu Othacehe <othacehe@gnu.org>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -375,14 +375,28 @@ (define* (narinfo-string store store-path
                                            compression)))
                  compressions))))
 
+;; Custom header to indicate that baking is in progress.
+(declare-opaque-header! "X-Baking")
+
 (define* (not-found request
-                    #:key (phrase "Resource not found")
+                    #:key
+                    baking?
+                    (phrase "Resource not found")
                     ttl)
   "Render 404 response for REQUEST."
+  (format #t (G_ "↳ ~a ~a: 404~a~%")
+          (request-method request)
+          (uri-path (request-uri request))
+          (if baking? " (baking)" ""))
   (values (build-response #:code 404
-                          #:headers (if ttl
-                                        `((cache-control (max-age . ,ttl)))
-                                        '()))
+                          #:headers
+                          (append
+                           (if ttl
+                               `((cache-control (max-age . ,ttl)))
+                               '())
+                           (if baking?
+                               '((x-baking . "1"))
+                               '())))
           (string-append phrase ": "
                          (uri-path (request-uri request)))))
 
@@ -587,6 +601,7 @@ (define (delete-entry narinfo)
                                #:nar-path nar-path
                                #:compressions compressions)
                (not-found request
+                          #:baking? #t
                           #:phrase "We're baking it"
                           #:ttl 300)))          ;should be available within 5m
           (else
-- 
2.38.1


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: 0002-substitutes-Log-the-failing-queries.patch --]
[-- Type: text/x-patch, Size: 1921 bytes --]

From 25ffc57864dbf34ca58741f89c1f790dbde6702f Mon Sep 17 00:00:00 2001
From: Mathieu Othacehe <othacehe@gnu.org>
Date: Wed, 28 Dec 2022 15:19:29 +0100
Subject: [PATCH 2/2] substitutes: Log the failing queries.

* guix/substitutes.scm (%debug?): New variable.
(handle-narinfo-response): Log the failing queries if the %debug? parameter is
set.
---
 guix/substitutes.scm | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/guix/substitutes.scm b/guix/substitutes.scm
index 9014cf61ec..819eb2c73e 100644
--- a/guix/substitutes.scm
+++ b/guix/substitutes.scm
@@ -90,6 +90,12 @@ (define %narinfo-cache-directory
           (string-append %state-directory "/substitute/cache"))
       (string-append (cache-directory #:ensure? #f) "/substitute")))
 
+(define %debug?
+  ;; Enable debug mode by setting the GUIX_SUBSTITUTE_DEBUG environmnent
+  ;; variable.
+  (make-parameter
+   (getenv "GUIX_SUBSTITUTE_DEBUG")))
+
 (define (narinfo-cache-file cache-url path)
   "Return the name of the local file that contains an entry for PATH.  The
 entry is stored in a sub-directory specific to CACHE-URL."
@@ -224,6 +230,15 @@ (define (handle-narinfo-response request response port result)
           (let* ((path      (uri-path (request-uri request)))
                  (hash-part (basename
                              (string-drop-right path 8)))) ;drop ".narinfo"
+            ;; Log the failing queries and indicate if it failed because the
+            ;; narinfo is being baked.
+            (when (%debug?)
+              (let ((baking?
+                     (assoc-ref (response-headers response) 'x-baking)))
+                (display
+                 (format #f "could not fetch ~a~a ~a~a~%"
+                         url path code
+                         (if baking? " (baking)" "")))))
             (if len
                 (get-bytevector-n port len)
                 (read-to-eof port))
-- 
2.38.1


  parent reply	other threads:[~2022-12-28 14:24 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-16 17:57 bug#48468: substitute server connection timeout Mathieu Othacehe
2021-05-16 18:26 ` Christopher Baines
2021-05-17 14:49   ` Mathieu Othacehe
2021-05-18 14:35     ` Mathieu Othacehe
2021-05-21 13:30       ` Mathieu Othacehe
2021-05-29 21:44   ` Ludovic Courtès
2021-06-18 12:33 ` Ludovic Courtès
2021-06-29 16:49   ` Mathieu Othacehe
2022-12-05 13:21     ` Ludovic Courtès
2022-12-07  9:44       ` Mathieu Othacehe
2022-12-07 13:38         ` Ludovic Courtès
2022-12-07 14:31           ` Mathieu Othacehe
2022-12-08 10:26             ` Ludovic Courtès
2022-12-10 10:55               ` Ludovic Courtès
2022-12-27  9:52                 ` Mathieu Othacehe
2022-12-28 14:23               ` Mathieu Othacehe [this message]
2023-01-06 22:42                 ` Ludovic Courtès
2023-01-07 13:40                   ` Mathieu Othacehe
2023-01-09  9:39                     ` Ludovic Courtès
2023-01-10  8:10                       ` Mathieu Othacehe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://guix.gnu.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87v8lvsj3a.fsf@gnu.org \
    --to=othacehe@gnu.org \
    --cc=48468@debbugs.gnu.org \
    --cc=ludo@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).