From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: 41625@debbugs.gnu.org
Subject: bug#41625: [PATCH v2] offload: Handle a possible EOF response from read-repl-response.
Date: Thu, 27 May 2021 07:49:22 -0400
Message-ID: <877djkl7lp.fsf@gmail.com>
In-Reply-To: <87fsy9x3ev.fsf@gnu.org> ("Ludovic Courtès"'s message of "Wed, 26 May 2021 11:14:32 +0200")

Hi Ludovic,

Ludovic Courtès <ludo@gnu.org> writes:

[...]

> I see.  So I’d say it’s a prerequisite (a patch that must come before)
> but not entirely the same thing.  I’m nitpicking!

Eh, it's okay :-).  Splitting changes into the right unit is a problem
that is akin to naming things; it's hard!  I welcome your suggestion.

> We should make sure it doesn’t trigger thread-safety issues in libssh or
> anything like that (running it repeatedly on a large machines.scm should
> give us some confidence).

It seems fine so far, but I've only tested in a loop with 4 build
machines.  When it nears completion I'll give it a shot on berlin.

[...]

> Yes, but note that this is just for ‘guix offload test’.  The actual
> code run while offloading will still fail badly.

Ah, thanks for pointing that out; I had somehow assumed that this machine
status checking code ran as a prelude to every offloaded build.

[...]

>> I don't have a password set for my user on overdrive1, so can't attach
>> strace to sshd, but yeah, we could try to capture it and see if we can
>> understand what's going on.
>
> OK.

I'd be happy to try strace when you are available.  You can ping me on
the chat.  It's been more than 8 hours since I last tried, so I should be
able to trigger the problem :-).

[...]

> Perhaps worth adding an ‘inferior’ and/or ‘port’ field.  That would
> allow the handler to present more information as to which inferior is
> failing.
>
> Maybe ‘premature-eof’ would be more accurate than ‘connection-lost’.

Good suggestions.  I'll implement them.

>> +                       (format (current-error-port)
>> +                               (G_ "connection to machine '~a' lost; retrying~%")
>> +                               (build-machine-name machine))
>
> You can use ‘info’ instead of ‘format’.

That also.  Thanks!
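
A minimal sketch of the two suggestions above, assuming SRFI-34/35 and
plain Guile only.  The names '&premature-eof', 'read-response' and
'call-with-eof-retries' are hypothetical and chosen for illustration;
they are not necessarily what the patch uses.  Plain 'format' stands in
for Guix's 'info' so the snippet stays self-contained.

```scheme
(use-modules (srfi srfi-34)    ; 'guard' and 'raise'
             (srfi srfi-35))   ; condition types

;; Hypothetical condition type carrying the fields suggested above.
(define-condition-type &premature-eof &error
  premature-eof?
  (inferior premature-eof-inferior)   ; which inferior failed
  (port     premature-eof-port))      ; the port that hit EOF

;; Stand-in for reading a response from an inferior: raise the
;; condition when the port returns EOF, otherwise return the datum.
(define (read-response port inferior)
  (let ((obj (read port)))
    (if (eof-object? obj)
        (raise (condition (&premature-eof (inferior inferior)
                                          (port port))))
        obj)))

;; Call THUNK, retrying up to MAX-RETRIES times on premature EOF.
;; Any other condition, or an EOF once retries are exhausted,
;; propagates to the caller.
(define (call-with-eof-retries thunk max-retries)
  (let loop ((retries max-retries))
    (guard (c ((and (premature-eof? c) (> retries 0))
               (format (current-error-port)
                       "premature EOF from '~a'; retrying~%"
                       (premature-eof-inferior c))
               (loop (- retries 1))))
      (thunk))))
```

Carrying the inferior in the condition lets the handler say which
machine failed without closing over it.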

On another note, I was able to 'exercise' the fix: the exception is
raised, but instead of the connection being retried, something fails
with the following backtrace:

--8<---------------cut here---------------start------------->8---
guix offload: Testing 1 build machines defined in '/etc/guix/machines.scm'...
connection to machine 'overdrive1.guix.gnu.org' lost; retrying
Backtrace:
In ice-9/boot-9.scm:
  1752:10 10 (with-exception-handler _ _ #:unwind? _ #:unwind-for-type _)
In unknown file:
           9 (apply-smob/0 #<thunk 7f915c028f60>)
In ice-9/boot-9.scm:
    724:2  8 (call-with-prompt _ _ #<procedure default-prompt-handler (k proc)>)
In ice-9/eval.scm:
    619:8  7 (_ #(#(#<directory (guile-user) 7f915c022c80>)))
In guix/ui.scm:
  2161:12  6 (run-guix-command _ . _)
In ice-9/boot-9.scm:
  1752:10  5 (with-exception-handler _ _ #:unwind? _ #:unwind-for-type _)
  1747:15  4 (with-exception-handler #<procedure 7f91576bf0c0 at ice-9/boot-9.scm:1831:7 (exn)> _ # _ # …)
In srfi/srfi-1.scm:
    634:9  3 (for-each #<procedure check-machine-availability (a)> (#<<build-machine> name: "overdriv…>))
In ice-9/eval.scm:
   191:35  2 (_ #(#(#(#<directory (guix scripts offload) 7f9159852780> 3 #<<build-machine> na…> …) …) …))
Exception thrown while printing backtrace:
In procedure frame-local-ref: Argument 2 out of range: 1

ice-9/boot-9.scm:1685:16: In procedure raise-exception:
Wrong type to apply: 2
--8<---------------cut here---------------end--------------->8---

I haven't been able to pinpoint the cause yet.  Note that for the above
run I had replaced par-for-each with plain for-each, suspecting it might
have something to do with it, but it appears unrelated.

Thanks,

Maxim



