From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: 41625@debbugs.gnu.org
Subject: bug#41625: [PATCH v2] offload: Handle a possible EOF response from read-repl-response.
Date: Thu, 27 May 2021 07:49:22 -0400
Message-ID: <877djkl7lp.fsf@gmail.com>
In-Reply-To: <87fsy9x3ev.fsf@gnu.org> ("Ludovic Courtès"'s message of "Wed, 26 May 2021 11:14:32 +0200")
Hi Ludovic,
Ludovic Courtès <ludo@gnu.org> writes:
[...]
> I see. So I’d say it’s a prerequisite (a patch that must come before)
> but not entirely the same thing. I’m nitpicking!
Eh, it's okay :-). Splitting changes into the right unit is a problem
that is akin to naming things; it's hard! I welcome your suggestion.
> We should make sure it doesn’t trigger thread-safety issues in libssh or
> anything like that (running it repeatedly on a large machines.scm should
> give us some confidence).
It seems fine so far, but I've only tested in a loop with 4 build
machines. When it nears completion I'll give it a shot on berlin.
[...]
> Yes, but note that this is just for ‘guix offload test’. The actual
> code run while offloading will still fail badly.
Ah, thanks for pointing that out; I somehow thought that this machine
status checking code was a prelude to every offloaded build.
[...]
>> I don't have a password set for my user on overdrive1, so can't attach
>> strace to sshd, but yeah, we could try to capture it and see if we can
>> understand what's going on.
>
> OK.
I'd be happy to try strace when you are available. You can ping me on
the chat. It's been more than 8 hours since I tried, so I should be
able to trigger the problem :-).
[...]
> Perhaps worth adding an ‘inferior’ and/or ‘port’ field. That would
> allow the handler to present more information as to which inferior is
> failing.
>
> Maybe ‘premature-eof’ would be more accurate than ‘connection-lost’.
Good suggestions. I'll implement them.
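To make the idea concrete, here is a minimal sketch (not the actual
patch) of what such a condition type and a retry loop around it could
look like with SRFI-34/SRFI-35. The names '&premature-eof', its 'port'
and 'machine' fields, 'call-with-retries' and the "example-host" machine
name are all assumptions made up for illustration:

```scheme
(use-modules (srfi srfi-34)   ; raise, guard
             (srfi srfi-35))  ; define-condition-type, condition

;; Hypothetical condition type; the name and fields are assumptions.
(define-condition-type &premature-eof &error
  premature-eof?
  (port premature-eof-port)
  (machine premature-eof-machine))

(define (call-with-retries thunk max-attempts)
  "Call THUNK; on a &premature-eof condition, retry until MAX-ATTEMPTS
attempts have been made.  Other conditions, or a &premature-eof on the
last attempt, are re-raised by 'guard'."
  (let loop ((attempt 1))
    (guard (c ((and (premature-eof? c) (< attempt max-attempts))
               (format (current-error-port)
                       "connection to machine '~a' lost; retrying~%"
                       (premature-eof-machine c))
               (loop (+ attempt 1))))
      (thunk))))

;; Usage: a thunk that fails twice with a premature EOF, then succeeds.
(define attempts 0)

(define (flaky)
  (set! attempts (+ attempts 1))
  (if (< attempts 3)
      (raise (condition (&premature-eof (port #f)
                                        (machine "example-host"))))
      'ok))

(call-with-retries flaky 5)
```

Carrying the port and machine in the condition lets the handler report
which connection failed without relying on variables from the enclosing
scope.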
>> + (format (current-error-port)
>> + (G_ "connection to machine '~a' lost; retrying~%")
>> + (build-machine-name machine))
>
> You can use ‘info’ instead of ‘format’.
That also. Thanks!
On another note, I was able to 'exercise' the fix. The exception is
raised, but instead of the operation being retried, something fails
with the following backtrace:
--8<---------------cut here---------------start------------->8---
guix offload: Testing 1 build machines defined in '/etc/guix/machines.scm'...
connection to machine 'overdrive1.guix.gnu.org' lost; retrying
Backtrace:
In ice-9/boot-9.scm:
1752:10 10 (with-exception-handler _ _ #:unwind? _ #:unwind-for-type _)
In unknown file:
9 (apply-smob/0 #<thunk 7f915c028f60>)
In ice-9/boot-9.scm:
724:2 8 (call-with-prompt _ _ #<procedure default-prompt-handler (k proc)>)
In ice-9/eval.scm:
619:8 7 (_ #(#(#<directory (guile-user) 7f915c022c80>)))
In guix/ui.scm:
2161:12 6 (run-guix-command _ . _)
In ice-9/boot-9.scm:
1752:10 5 (with-exception-handler _ _ #:unwind? _ #:unwind-for-type _)
1747:15 4 (with-exception-handler #<procedure 7f91576bf0c0 at ice-9/boot-9.scm:1831:7 (exn)> _ # _ # …)
In srfi/srfi-1.scm:
634:9 3 (for-each #<procedure check-machine-availability (a)> (#<<build-machine> name: "overdriv…>))
In ice-9/eval.scm:
191:35 2 (_ #(#(#(#<directory (guix scripts offload) 7f9159852780> 3 #<<build-machine> na…> …) …) …))
Exception thrown while printing backtrace:
In procedure frame-local-ref: Argument 2 out of range: 1
ice-9/boot-9.scm:1685:16: In procedure raise-exception:
Wrong type to apply: 2
--8<---------------cut here---------------end--------------->8---
I haven't been able to pinpoint the cause yet. Notice that in the code
above I've replaced par-for-each with plain for-each, suspecting it
might have something to do with it, but it appears unrelated.
Thanks,
Maxim