unofficial mirror of bug-guix@gnu.org 
 help / color / mirror / code / Atom feed
From: "Ludovic Courtès" <ludo@gnu.org>
To: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Cc: 66692@debbugs.gnu.org
Subject: bug#66692: [cuirass] Spurious failed builds following patching u-boot sources
Date: Mon, 23 Oct 2023 22:49:56 +0200	[thread overview]
Message-ID: <87pm15dpxn.fsf@gnu.org> (raw)
In-Reply-To: <87ttqi82rh.fsf@gmail.com> (Maxim Cournoyer's message of "Sun, 22 Oct 2023 22:57:06 -0400")

Hi,

Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:

> The build failures are not captured in the logs, which appear truncated,
> e.g. from <https://ci.guix.gnu.org/build/2294052/details> (log empty --
> builder: sjd-p9, worker N38vG7OW) 

On sjd-p9, I get:

--8<---------------cut here---------------start------------->8---
root@sjd-p9:~# journalctl -u cuirass-remote-worker.service |grep -C2 s167lh0w0azfxx83ri8k0mn2gj57i5rl
Oct 23 00:06:21 sjd-p9 cuirass[531]: N38vG7OW: derivation /gnu/store/kcv6hfqz94f1vdw24viw7d9ih6v9zi1j-libkmahjongg-23.04.3.drv build succeeded.
Oct 23 00:06:21 sjd-p9 cuirass[531]: N38vG7OW: request work.
Oct 23 00:06:21 sjd-p9 cuirass[531]: N38vG7OW: building derivation `/gnu/store/s167lh0w0azfxx83ri8k0mn2gj57i5rl-u-boot-rpi-4-32b-2023.07.02.drv' (system: powerpc64le-linux)
Oct 23 00:06:32 sjd-p9 cuirass[531]: C2q41fts: ping tcp://10.0.0.1:5555.
Oct 23 00:06:32 sjd-p9 cuirass[531]: BrnR2Sju: ping tcp://10.0.0.1:5555.
--
Oct 23 00:34:16 sjd-p9 cuirass[531]: vA6cYbb5: request work.
Oct 23 00:34:16 sjd-p9 cuirass[531]: vA6cYbb5: building derivation `/gnu/store/lj4h1g1zq235l6i8mjh88s1ylvj7ksxd-u-boot-pine64-plus-2023.07.02.drv' (system: powerpc64le-linux)
Oct 23 00:34:29 sjd-p9 cuirass[531]: N38vG7OW: derivation /gnu/store/s167lh0w0azfxx83ri8k0mn2gj57i5rl-u-boot-rpi-4-32b-2023.07.02.drv build failed.
Oct 23 00:34:29 sjd-p9 cuirass[531]: N38vG7OW: request work.
Oct 23 00:34:29 sjd-p9 cuirass[531]: N38vG7OW: building derivation `/gnu/store/dy8g5pi1gv8dza1av9957ic6zzj4l88j-u-boot-tools-2023.07.02.drv' (system: powerpc64le-linux)
root@sjd-p9:~# guix build --log-file /gnu/store/s167lh0w0azfxx83ri8k0mn2gj57i5rl-u-boot-rpi-4-32b-2023.07.02.drv
https://ci.guix.gnu.org/log/0yzfwl2b017gsnf60wvr7dnn2g26wrj7-u-boot-rpi-4-32b-2023.07.02
--8<---------------cut here---------------end--------------->8---

So it looks as though the build was attempted and failed, but there’s no
local build log in /var/log/guix/drvs, meaning that, most likely, no
build process was even started.

The log of guix-daemon provide few hints:

--8<---------------cut here---------------start------------->8---
root@sjd-p9:~# journalctl -u guix-daemon |grep 'Oct 23 00:'

[...]

Oct 23 00:06:21 sjd-p9 guix-daemon[534]: accepted connection from pid 531, user root
Oct 23 00:06:21 sjd-p9 guix-daemon[534]: accepted connection from pid 531, user root
Oct 23 00:06:55 sjd-p9 guix-daemon[534]: accepted connection from pid 531, user root
Oct 23 00:07:51 sjd-p9 guix-daemon[534]: accepted connection from pid 531, user root
Oct 23 00:16:25 sjd-p9 guix-daemon[534]: accepted connection from pid 531, user root
Oct 23 00:17:15 sjd-p9 guix-daemon[534]: accepted connection from pid 531, user root
Oct 23 00:33:42 sjd-p9 guix-daemon[2097426]: SIGPOLL
Oct 23 00:33:42 sjd-p9 guix-daemon[534]: accepted connection from pid 531, user root
Oct 23 00:33:49 sjd-p9 guix-daemon[2097426]: unexpected build daemon error: interrupted by the user
Oct 23 00:34:16 sjd-p9 guix-daemon[534]: accepted connection from pid 531, user root
Oct 23 00:34:16 sjd-p9 guix-daemon[1990649]: SIGPOLL
Oct 23 00:34:23 sjd-p9 guix-daemon[1990649]: unexpected build daemon error: interrupted by the user
Oct 23 00:34:29 sjd-p9 guix-daemon[2052532]: SIGPOLL
Oct 23 00:34:29 sjd-p9 guix-daemon[534]: accepted connection from pid 531, user root
Oct 23 00:34:30 sjd-p9 guix-daemon[2052532]: unexpected build daemon error: interrupted by the user
Oct 23 00:34:38 sjd-p9 guix-daemon[1996946]: SIGPOLL
Oct 23 00:34:38 sjd-p9 guix-daemon[534]: accepted connection from pid 531, user root
Oct 23 00:34:38 sjd-p9 guix-daemon[1996946]: unexpected build daemon error: interrupted by the user
Oct 23 00:34:45 sjd-p9 guix-daemon[2248556]: SIGPOLL
Oct 23 00:34:45 sjd-p9 guix-daemon[534]: accepted connection from pid 531, user root
Oct 23 00:34:45 sjd-p9 guix-daemon[2248556]: unexpected build daemon error: interrupted by the user
Oct 23 00:35:33 sjd-p9 guix-daemon[534]: accepted connection from pid 531, user root
Oct 23 00:36:46 sjd-p9 guix-daemon[534]: accepted connection from pid 531, user root
--8<---------------cut here---------------end--------------->8---

One possibility would have been that ‘guix gc’ was running, preventing
the build process from starting (“big GC lock” taken); however, if this
were the case, we’d see a connection from a different process in the
guix-daemon log.

> or from <https://ci.guix.gnu.org/build/2294218/details> (builder:
> hydra-guix-107, worker 4CWieLpQ) which ends with:

Here we get:

--8<---------------cut here---------------start------------->8---
root@hydra-guix-107 ~# grep -C2 jy463789qf42dn4zk5i5jzry2lf82ypy /var/log/cuirass-remote-worker.log
2023-10-23 00:17:34 vDKeRcHy: no available build.
2023-10-23 00:17:46 4CWieLpQ: request work.
2023-10-23 00:17:46 4CWieLpQ: building derivation `/gnu/store/jy463789qf42dn4zk5i5jzry2lf82ypy-u-boot-cubietruck-2023.07.02.drv' (system: x86_64-linux)
2023-10-23 00:18:03 zARGXQMa: request work.
2023-10-23 00:18:03 zARGXQMa: no available build.
--
2023-10-23 00:27:35 vDKeRcHy: request work.
2023-10-23 00:27:35 vDKeRcHy: no available build.
2023-10-23 00:27:42 4CWieLpQ: derivation /gnu/store/jy463789qf42dn4zk5i5jzry2lf82ypy-u-boot-cubietruck-2023.07.02.drv build failed.
2023-10-23 00:27:42 4CWieLpQ: request work.
2023-10-23 00:27:42 4CWieLpQ: no available build.
root@hydra-guix-107 ~# ls /var/log/guix/drvs/*/*u-boot-cubietruck-2023*
ls: cannot access '/var/log/guix/drvs/*/*u-boot-cubietruck-2023*': No such file or directory
--8<---------------cut here---------------end--------------->8---

And likewise:

--8<---------------cut here---------------start------------->8---
root@hydra-guix-107 ~# grep -C2 '10-23 00:27:' /var/log/guix-daemon.log
2023-10-23 00:17:46 accepted connection from pid 64022, user root
2023-10-23 00:19:34 accepted connection from pid 64022, user root
2023-10-23 00:27:42 SIGPOLL
2023-10-23 00:27:45 unexpected build daemon error: interrupted by the user
2023-10-23 00:33:33 accepted connection from pid 18016, user hydra
2023-10-23 00:33:34 accepted connection from pid 18035, user hydra
--8<---------------cut here---------------end--------------->8---

But hey, as a matter of fact, ‘cuirass remote-server’ on berlin was
restarted at around that time, as can be seen in
/var/log/cuirass-remote-server.log:

--8<---------------cut here---------------start------------->8---
2023-10-23 00:26:06 0 items queued for eventual download
2023-10-23 00:26:55 period update: 0 resumable, 0 failed builds.
2023-10-23 00:26:58 error: failed to start worker/database proxy: Invalid argument
2023-10-23 00:26:59 increased maximum number of open files from 1024 to 2048
2023-10-23 00:26:59 listening for build logs on port 5556
2023-10-23 00:26:59 deleting old build logs from '/var/cache/cuirass/remote/'...
2023-10-23 00:28:49 selected 1115 build logs to remove
2023-10-23 00:28:49 WARNING: (cuirass scripts remote-server): imported module (fibers) overrides core binding `sleep'
2023-10-23 00:28:49 starting fetch worker with up to 8 concurrent downloads
2023-10-23 00:28:49 build failed: '/gnu/store/qlmhcyk9ac30464v99i2ys0irs7jz3lg-u-boot-am335x-evm-2023.07.02.drv'
2023-10-23 00:28:49 build failed: '/gnu/store/dl3far2xa4paq6nl6pp9iklpxrwg1wy2-ungoogled-chromium-wayland-112.0.5615.165-1.drv'
2023-10-23 00:28:50 build failed: '/gnu/store/y9fwzzgvyhvrmf0avihm6944j8rvpg6m-u-boot-mx6cuboxi-2023.07.02.drv'
2023-10-23 00:28:50 build failed: '/gnu/store/5bhl5q86j6bkqx8zybq29khndhf050jm-u-boot-rpi-2-2023.07.02.drv'
2023-10-23 00:28:50 build failed: '/gnu/store/16d5vq3dlzy1f4s7vazlfg6358bl8dlc-u-boot-qemu-arm-2023.07.02.drv'
2023-10-23 00:28:51 build failed: '/gnu/store/i88vczhcnlz462cmpls8qgx5n3ssw7rc-u-boot-qemu-arm-2023.07.02.drv'
2023-10-23 00:28:51 build failed: '/gnu/store/jy463789qf42dn4zk5i5jzry2lf82ypy-u-boot-cubietruck-2023.07.02.drv'
2023-10-23 00:28:51 build failed: '/gnu/store/i1ri2nkh1bcfzynmggbjg0rjlyi121y8-u-boot-rpi-3-32b-2023.07.02.drv'
2023-10-23 00:28:51 build failed: '/gnu/store/q08xw052r484c1c1nj31sx0fg5yf2py9-openjdk-18.0.2.drv'
2023-10-23 00:28:52 build failed: '/gnu/store/5j4pp2b5dx2ykw4na2wbmiwp84g0hi4i-u-boot-am335x-boneblack-2023.07.02.drv'
2023-10-23 00:28:52 build failed: '/gnu/store/nqp61id4bavvcsib307rw5234myl7ykr-u-boot-rpi-3-32b-efi-2023.07.02.drv'
2023-10-23 00:28:52 build failed: '/gnu/store/9h7x080cn44ik79mrzycbg2x0dvvgigc-u-boot-a20-olinuxino-lime-2023.07.02.drv'
2023-10-23 00:28:52 build failed: '/gnu/store/mkwy8ywa6mrw5zdxwr88skyw94afgaqv-u-boot-novena-2023.07.02.drv'
2023-10-23 00:28:53 build failed: '/gnu/store/lfjp146k0z0w2yqd3y6m2mv09fq01948-u-boot-sifive-unmatched-2023.07.02.drv'
2023-10-23 00:28:53 build failed: '/gnu/store/425ykg0ci4i612ryc7z79as94vipp9rb-u-boot-rpi-2-efi-2023.07.02.drv'
2023-10-23 00:28:57 build failed: '/gnu/store/7fyq24jsmvclnapjgf1li43w6dmm5n06-u-boot-mx6cuboxi-2023.07.02.drv'
[…]
2023-10-23 00:34:29 build failed: '/gnu/store/s167lh0w0azfxx83ri8k0mn2gj57i5rl-u-boot-rpi-4-32b-2023.07.02.drv'
--8<---------------cut here---------------end--------------->8---

/var/log/messages shows that ‘remote-server’ failed and was respawned
(the failure manifests as “failed to start worker/database proxy” in the
cuirass-remote-server.log excerpt above):

--8<---------------cut here---------------start------------->8---
Oct 23 00:24:33 localhost vmunix: [12534110.756389] BTRFS error (device sde2): parent transid verify failed on logical 30457856 mirror 2 wanted 61709 found 11
Oct 23 00:24:33 localhost vmunix: [12534110.768173] BTRFS error (device sde2): parent transid verify failed on logical 30457856 mirror 1 wanted 61709 found 11
Oct 23 00:24:34 localhost vmunix: [12534110.779454] BTRFS error (device sde2): parent transid verify failed on logical 30457856 mirror 2 wanted 61709 found 11
Oct 23 00:24:59 localhost ntpd[1953]: Soliciting pool server 81.169.208.5
Oct 23 00:26:04 localhost ntpd[1953]: Soliciting pool server 141.98.136.83
Oct 23 00:26:58 localhost shepherd[1]: Service cuirass-remote-server (PID 92414) exited with 1. 
Oct 23 00:26:58 localhost shepherd[1]: Respawning cuirass-remote-server. 
Oct 23 00:26:58 localhost shepherd[1]: Starting service user-homes... 
Oct 23 00:26:58 localhost shepherd[1]: Service user-homes has been started. 
Oct 23 00:26:58 localhost shepherd[1]: Service user-homes started. 
Oct 23 00:26:58 localhost shepherd[1]: Service user-homes running with value #t. 
Oct 23 00:26:58 localhost shepherd[1]: Starting service postgres-roles... 
Oct 23 00:26:58 localhost shepherd[1]: Service postgres-roles has been started. 
Oct 23 00:26:58 localhost shepherd[1]: Service postgres-roles started. 
Oct 23 00:26:58 localhost shepherd[1]: Service postgres-roles running with value #t. 
Oct 23 00:26:58 localhost shepherd[1]: Starting service cuirass-remote-server... 
Oct 23 00:26:58 localhost shepherd[1]: Service cuirass-remote-server has been started. 
Oct 23 00:26:58 localhost shepherd[1]: Service cuirass-remote-server started. 
Oct 23 00:26:58 localhost shepherd[1]: Service cuirass-remote-server running with value 63052. 
Oct 23 00:26:59 localhost vmunix: [12534255.969754] validate_extent_buffer: 58 callbacks suppressed
Oct 23 00:26:59 localhost vmunix: [12534255.969764] BTRFS error (device sde2): parent transid verify failed on logical 30457856 mirror 1 wanted 61709 found 11
--8<---------------cut here---------------end--------------->8---

To summarize:

  1. ‘cuirass remote-server’ failed for some unknown reason and was
     respawned.

  2. All the ongoing and about-to-be-started builds carried out by
     worker processes at that point were marked as failed.

We can improve on both parts.

To be continued…

Ludo’.




  reply	other threads:[~2023-10-23 20:51 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-23  2:57 bug#66692: [cuirass] Spurious failed builds following patching u-boot sources Maxim Cournoyer
2023-10-23 20:49 ` Ludovic Courtès [this message]
2023-10-23 21:14   ` Ludovic Courtès
2023-10-24  0:59     ` Maxim Cournoyer
2023-10-24 20:27       ` Ludovic Courtès
2023-10-28  9:55   ` Ludovic Courtès
2023-10-30 13:13     ` Ludovic Courtès
     [not found] ` <handler.66692.D66692.169867163715469.notifdone@debbugs.gnu.org>
2023-10-31 13:31   ` bug#66692: closed (Re: bug#66692: [cuirass] Spurious failed builds following patching u-boot sources) Maxim Cournoyer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://guix.gnu.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87pm15dpxn.fsf@gnu.org \
    --to=ludo@gnu.org \
    --cc=66692@debbugs.gnu.org \
    --cc=maxim.cournoyer@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).