unofficial mirror of bug-guix@gnu.org 
 help / color / mirror / code / Atom feed
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: 41625@debbugs.gnu.org
Subject: bug#41625: Sporadic guix-offload crashes due to EOF errors
Date: Mon, 24 May 2021 01:33:21 -0400	[thread overview]
Message-ID: <87mtsky9um.fsf@gmail.com> (raw)
In-Reply-To: <87a71jc9yi.fsf@gnu.org> ("Ludovic Courtès"'s message of "Thu, 04 Jun 2020 14:05:41 +0200")

Hi,

Ludovic Courtès <ludo@gnu.org> writes:

> Hi,
>
> Marius Bakke <marius@gnu.org> skribis:
>
>> Marius Bakke <marius@gnu.org> writes:
>>
>>> 'guix offload test' passes without problems.
>>
>> Not so fast, running it in a loop reveals the crash.
>>
>> There is a trace file in /root/offloadtest.trace on Berlin with such an
>> occurence.  It looks like a timeout is reached shortly before the EOF
>> error:
>>
>> 10139 poll([{fd=14, events=POLLIN|POLLOUT}], 1, 0) = 1 ([{fd=14, revents=POLLOUT}])
>> 10139 poll([{fd=14, events=POLLIN}], 1, 15000) = 0 (Timeout)
>> 10139 write(2, "Backtrace:\n", 11)      = 11
>>
>> This seems to be from a different node than the one reported previously,
>> as the preceding connect() was to this machine:
>>
>> 10139 connect(44, {sa_family=AF_INET, sin_port=htons(22),
>> sin_addr=inet_addr("141.80.167.186")}, 16) = -1 EINPROGRESS
>> (Operation now in progress)
>
> So it looks like ‘connect’ fails and eventually we get an EOF object.
> However, I don’t see where that EOF comes from because the return value
> of ‘connect!’ (the Guile-SSH procedure) is properly checked.
>
> Ludo’.

I got a slightly different backtrace that suggests making the connection
is not at fault, rather it occurs during the read-repl-response call:

--8<---------------cut here---------------start------------->8---
guix offload: testing 1 build machines defined in '/etc/guix/machines.scm'...
Backtrace:
           8 (primitive-load "/home/maxim/.config/guix/current/bin/guix")
In guix/ui.scm:
  2165:12  7 (run-guix-command _ . _)
In ice-9/boot-9.scm:
  1752:10  6 (with-exception-handler _ _ #:unwind? _ #:unwind-for-type _)
  1747:15  5 (with-exception-handler #<procedure 7f2caf885780 at ice-9/boot-9.scm:1831:7 (exn)> _ # _ # …)
In guix/scripts/offload.scm:
   704:21  4 (check-machine-availability _ _)
In srfi/srfi-1.scm:
   586:17  3 (map1 (#<session maxim@overdrive1.guix.gnu.org:52522 (connected) 7f2cae396fc0>))
In guix/inferior.scm:
    258:2  2 (port->inferior _ _)
    240:2  1 (read-repl-response _ _)
In ice-9/boot-9.scm:
  1685:16  0 (raise-exception _ #:continuable? _)

ice-9/boot-9.scm:1685:16: In procedure raise-exception:
Throw to key `match-error' with args `("match" "no matching pattern" #<eof>)'.
--8<---------------cut here---------------end--------------->8---

I seem to get this more often than not with the overdrive1 offload
machine.

Maxim




  reply	other threads:[~2021-05-24  5:34 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-31  9:51 bug#41625: Sporadic guix-offload crashes due to EOF errors Marius Bakke
2020-05-31 10:12 ` Marius Bakke
2020-05-31 11:21   ` Marius Bakke
2020-06-04 12:05     ` Ludovic Courtès
2021-05-24  5:33       ` Maxim Cournoyer [this message]
2021-05-25 15:50         ` bug#41625: [PATCH] offload: Handle a possible EOF response from read-repl-response Maxim Cournoyer
2021-05-25 20:27           ` Ludovic Courtès
2021-05-26  3:18             ` bug#41625: [PATCH v2] " Maxim Cournoyer
2021-05-26  9:14               ` Ludovic Courtès
2021-05-27 11:49                 ` Maxim Cournoyer
2021-05-27 14:57                 ` bug#41625: [PATCH v3] " Maxim Cournoyer
2021-07-05  8:57                   ` bug#41625: Sporadic guix-offload crashes due to EOF errors Ludovic Courtès
2021-09-24  4:53                     ` Maxim Cournoyer
2021-09-24  4:55                     ` Maxim Cournoyer
2021-05-27 17:20                 ` bug#41625: [PATCH v2] offload: Handle a possible EOF response from read-repl-response Maxim Cournoyer
2021-05-29 19:24                   ` Ludovic Courtès
2021-05-26 15:48               ` Marius Bakke
2021-05-27 11:51                 ` Maxim Cournoyer
2022-03-26  5:03                   ` bug#41625: Sporadic guix-offload crashes due to EOF errors Maxim Cournoyer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://guix.gnu.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87mtsky9um.fsf@gmail.com \
    --to=maxim.cournoyer@gmail.com \
    --cc=41625@debbugs.gnu.org \
    --cc=ludo@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).