unofficial mirror of bug-guix@gnu.org 
From: "Ludovic Courtès" <ludovic.courtes@inria.fr>
To: 59493@debbugs.gnu.org
Cc: Mathieu Othacehe <othacehe@gnu.org>
Subject: bug#59493: cuirass-remote-worker crash
Date: Tue, 22 Nov 2022 23:14:05 +0100
Message-ID: <87ilj6hc2a.fsf@inria.fr>

Hi,

In /var/log/cuirass-remote-worker.log on overdrive1.guix, I found this:

--8<---------------cut here---------------start------------->8---
2022-11-21 14:27:24 Backtrace:
2022-11-21 14:27:24 Backtrace:
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24   1752:10 10 (with-exception-handler _ _ #:unwind? _ # _)
2022-11-21 14:27:24 In unknown file:
2022-11-21 14:27:24            9 (apply-smob/0 #<thunk 3903a300>)
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24     724:2  8 (call-with-prompt _ _ #<procedure default-prompt-handle?>)
2022-11-21 14:27:24 In ice-9/eval.scm:
2022-11-21 14:27:24   1752:10 10 (with-exception-handler _ _ #:unwind? _ # _)
2022-11-21 14:27:24     619:8  7 (_ #(#(#<directory (guile-user) 3903dc80>)))
2022-11-21 14:27:24 In cuirass/ui.scm:
2022-11-21 14:27:24 In unknown file:
2022-11-21 14:27:24            9 (apply-smob/0 #<thunk 3903a300>)
2022-11-21 14:27:24    104:10  6 (run-cuirass-command _ . _)
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24     724:2  8 (call-with-prompt _ _ #<procedure default-prompt-handle?>)
2022-11-21 14:27:24   1752:10  5 (with-exception-handler _ _ #:unwind? _ # _)
2022-11-21 14:27:24 In ice-9/eval.scm:
2022-11-21 14:27:24 In cuirass/scripts/remote-worker.scm:
2022-11-21 14:27:24     619:8  7 (_ #(#(#<directory (guile-user) 3903dc80>)))
2022-11-21 14:27:24 In cuirass/ui.scm:
2022-11-21 14:27:24    104:10  6 (run-cuirass-command _ . _)
2022-11-21 14:27:24    435:12  4 (_)
2022-11-21 14:27:24 In srfi/srfi-1.scm:
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24   1752:10  5 (with-exception-handler _ _ #:unwind? _ # _)
2022-11-21 14:27:24     634:9  3 (for-each #<procedure 398a3510 at cuirass/scripts/remo?> ?)
2022-11-21 14:27:24 In cuirass/scripts/remote-worker.scm:
2022-11-21 14:27:24 In cuirass/scripts/remote-worker.scm:
2022-11-21 14:27:24    448:18  2 (_ _)
2022-11-21 14:27:24    435:12  4 (_)
2022-11-21 14:27:24 In srfi/srfi-1.scm:
2022-11-21 14:27:24     634:9  3 (for-each #<procedure 398a3510 at cuirass/scripts/remo?> ?)
2022-11-21 14:27:24    356:11  1 (start-worker _ _)
2022-11-21 14:27:24 In cuirass/scripts/remote-worker.scm:
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24    448:18  2 (_ _)
2022-11-21 14:27:24   1685:16  0 (raise-exception _ #:continuable? _)
2022-11-21 14:27:24
2022-11-21 14:27:24 ice-9/boot-9.scm:1685:16: In procedure raise-exception:
2022-11-21 14:27:24 Throw to key `match-error' with args `("match" "no matching pattern" (#vu8()))'.
2022-11-21 14:27:24    356:11  1 (start-worker _ _)
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24   1685:16  0 (raise-exception _ #:continuable? _)
2022-11-21 14:27:24
2022-11-21 14:27:24 ice-9/boot-9.scm:1685:16: In procedure raise-exception:
2022-11-21 14:27:24 Throw to key `match-error' with args `("match" "no matching pattern" (#vu8()))'.
--8<---------------cut here---------------end--------------->8---

(Stuttering is due to the unprotected use of ‘primitive-fork’: a
non-local exit in the child leads it to execute the same code as its
parent.  We should fix that, but should we really fork in the first
place?  :-))
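
To make the point about ‘primitive-fork’ concrete, here is a minimal
sketch of one way to protect the child.  It assumes nothing about how
remote-worker.scm actually forks: ‘fork-protected’ is a hypothetical
helper, not a Cuirass procedure.  It wraps the child's code in
‘dynamic-wind’ so that any non-local exit ends in ‘primitive-_exit’
instead of unwinding back into code meant for the parent.

```scheme
(use-modules (ice-9 match))

;; Hypothetical helper, for illustration only.
(define (fork-protected thunk)
  "Run THUNK in a child process.  Return the child's PID in the parent."
  (match (primitive-fork)
    (0
     ;; Child: whatever happens, never return into the parent's code.
     (dynamic-wind
       (const #t)
       (lambda ()
         (thunk)
         (primitive-_exit 0))
       (lambda ()
         ;; Reached only on a non-local exit, such as an uncaught
         ;; exception raised by THUNK.
         (primitive-_exit 127))))
    (pid pid)))

;; Usage: the child raises an error; the parent only sees an exit code.
;; The child may still print its own backtrace once before exiting.
(define child
  (fork-protected (lambda () (error "boom"))))

(match (waitpid child)
  ((_ . status)
   (format #t "child exit code: ~a~%" (status:exit-val status))))
```

‘primitive-_exit’ is used rather than ‘exit’ because it terminates the
process immediately, without unwinding the Scheme stack any further.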

This comes from here:

--8<---------------cut here---------------start------------->8---
  (define (read-server-info socket)
    (request-info socket)
    (match (zmq-get-msg-parts-bytevector socket '())   ;<-- here
      ((empty info)
       (match (zmq-read-message (bv->string info))
         (('server-info
           ('worker-address worker-address)
           ('log-port log-port)
           ('publish-port publish-port))
          (list worker-address log-port publish-port))))))
--8<---------------cut here---------------end--------------->8---
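
Judging by the arguments of the ‘match-error’, the value being matched
appears to be ‘(#vu8())’, a one-element list holding an empty
bytevector, whereas the ‘(empty info)’ pattern needs exactly two
elements.  The sketch below reproduces that mismatch without ZeroMQ and
shows one possible way to handle it.  ‘parse-reply’ is a hypothetical
stand-in for the outer ‘match’ above, and returning #f on an unexpected
reply is an assumption, not what Cuirass does.

```scheme
(use-modules (ice-9 match)
             (rnrs bytevectors))

;; Hypothetical stand-in for the outer 'match' in 'read-server-info'.
(define (parse-reply parts)
  "Return the payload of PARTS as a string, or #f if PARTS does not have
the expected (EMPTY INFO) shape."
  (match parts
    ((empty info)
     (utf8->string info))
    (other
     ;; Catch-all clause: report the unexpected shape instead of throwing.
     (format (current-error-port) "unexpected reply: ~s~%" other)
     #f)))

;; A well-formed two-part reply.
(parse-reply (list (make-bytevector 0) (string->utf8 "(server-info)")))

;; The shape seen in the log: a single empty part.
(parse-reply (list (make-bytevector 0)))

;; Without the catch-all clause, the same input throws 'match-error'.
(catch 'match-error
  (lambda ()
    (match (list (make-bytevector 0))
      ((empty info) info)))
  (lambda (key . args)
    (format #t "caught ~a: ~s~%" key args)))
```

Whether the worker should retry, log and exit, or do something else in
that case is a separate question; the sketch only shows where the
pattern stops matching.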

This is the version being used:

--8<---------------cut here---------------start------------->8---
ludo@overdrive1 ~$ cat /proc/24019/cmdline |xargs -0
/gnu/store/zpir9n73amaxrwz2k7x46l73v21vxk6s-guile-3.0.8/bin/guile --no-auto-compile -e main -s /gnu/store/rlqdzmfyamjpn6lz07yqk2hsabv3l7g5-cuirass-1.1.0-11.9f08035/bin/.cuirass-real remote-worker --workers=2 --server=10.0.0.1:5555 --systems=armhf-linux,aarch64-linux --publish-port=5558 --substitute-urls=http://10.0.0.1
ludo@overdrive1 ~$ guix system describe
Generation 36   Sep 27 2022 09:06:48    (current)
  file name: /var/guix/profiles/system-36-link
  canonical file name: /gnu/store/m04qw6f0lfd0wpn1skiys4b56wqfc3b8-system
  label: GNU with Linux-Libre 5.19.11
  bootloader: grub-efi
  root device: /dev/sda3
  kernel: /gnu/store/09r4wbbabskmbrnwmshpdk7vh6g87gam-linux-libre-5.19.11/Image
  channels:
    guix:
      repository URL: https://git.savannah.gnu.org/git/guix.git
      commit: f15a141cf35bd4188767f0e91c0654991d4c49e0
  configuration file: /gnu/store/myvzd1kpw2pfzfj3krl4lzpcbqsdn48x-configuration.scm
--8<---------------cut here---------------end--------------->8---

The sequence leading to this seems to be:

--8<---------------cut here---------------start------------->8---
22340 eventfd2(0, EFD_CLOEXEC <unfinished ...>
[…]
22340 <... eventfd2 resumed>)           = 15
[…]
22340 ppoll([{fd=15, events=POLLIN}], 1, NULL, NULL, 0 <unfinished ...>
[…]
22340 <... ppoll resumed>)              = 1 ([{fd=15, revents=POLLIN}])
22343 epoll_pwait(8,  <unfinished ...>
22340 read(15, "\1\0\0\0\0\0\0\0", 8)   = 8
22340 ppoll([{fd=15, events=POLLIN}], 1, {tv_sec=0, tv_nsec=0}, NULL, 0) = 0 (Timeout)
22340 write(2, "Backtrace:\n", 11)      = 11
--8<---------------cut here---------------end--------------->8---

Does that ring a bell?  Perhaps that was fixed in the meantime?

Right now the worker cannot be restarted: it always fails at startup
with the error above.  10.0.0.1 is reachable, though, so I’m not sure
what’s up.

Ludo’.





Thread overview: 5+ messages
2022-11-22 22:14 Ludovic Courtès [this message]
2022-11-23  8:08 ` bug#59493: cuirass-remote-worker crash Mathieu Othacehe
2022-11-23 15:47   ` Ludovic Courtès
2022-11-23 16:03     ` Mathieu Othacehe
2022-11-26 15:04       ` Ludovic Courtès
