From: ludo@gnu.org (Ludovic Courtès)
To: 30365@debbugs.gnu.org
Subject: bug#30365: Offloading sometimes hangs
Date: Tue, 06 Feb 2018 11:04:10 +0100
Message-ID: <877erq8med.fsf@gnu.org>

Hi,

On berlin.guixsd.org, offloading sometimes hangs in the middle of an
offloaded build: no more build log output shows up and nothing happens
(this is with guix-0.14.0-6.0dcf675).

On the build machine side, the guile process that forwards data between
the sshd and guix-daemon¹ is stuck on:

  read(0, …)

with this stack trace:

--8<---------------cut here---------------start------------->8---
(gdb) bt
#0  0x00007f09d6068aed in read () from /gnu/store/3h31zsqxjjg52da5gp3qmhkh4x8klhah-glibc-2.25/lib/libpthread.so.0
#1  0x00007f09d653fc47 in fport_read ()
   from /gnu/store/0v539yjmdqhjm1xcpvndmagkgjz5fvh2-guile-2.2.2/lib/libguile-2.2.so.1
#2  0x00007f09d656cd77 in scm_i_read_bytes ()
   from /gnu/store/0v539yjmdqhjm1xcpvndmagkgjz5fvh2-guile-2.2.2/lib/libguile-2.2.so.1
#3  0x00007f09d65705fe in scm_fill_input ()
   from /gnu/store/0v539yjmdqhjm1xcpvndmagkgjz5fvh2-guile-2.2.2/lib/libguile-2.2.so.1
#4  0x00007f09d6577897 in scm_get_bytevector_some ()
   from /gnu/store/0v539yjmdqhjm1xcpvndmagkgjz5fvh2-guile-2.2.2/lib/libguile-2.2.so.1
#5  0x00007f09d65abc4d in vm_regular_engine ()
   from /gnu/store/0v539yjmdqhjm1xcpvndmagkgjz5fvh2-guile-2.2.2/lib/libguile-2.2.so.1
#6  0x00007f09d65af2aa in scm_call_n ()
   from /gnu/store/0v539yjmdqhjm1xcpvndmagkgjz5fvh2-guile-2.2.2/lib/libguile-2.2.so.1
#7  0x00007f09d65338d7 in scm_primitive_eval ()
   from /gnu/store/0v539yjmdqhjm1xcpvndmagkgjz5fvh2-guile-2.2.2/lib/libguile-2.2.so.1
--8<---------------cut here---------------end--------------->8---

In theory this “cannot happen” because the process reads from stdin
only if ‘select’ has reported that stdin is ready.
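
For illustration, the select-then-read pattern can be sketched roughly
as below.  This is a Python sketch, not the Guile code referenced in
the footnote; ‘forward_ready’ and ‘write_all’ are hypothetical names
chosen for the example.  It only shows why a single read issued right
after ‘select’ reports readiness is expected to return immediately.

```python
import os
import select


def write_all(fd, data):
    """Write all of `data` to `fd`, looping over short writes."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]


def forward_ready(sources, timeout=None, chunk=4096):
    """Forward one chunk from every readable source descriptor.

    `sources` maps a source fd to the fd its data is copied to.
    Only descriptors that select() reported as readable are read, and
    each is read exactly once, so the read should not block.
    Returns the set of source fds that reached end-of-file.
    """
    ready, _, _ = select.select(list(sources), [], [], timeout)
    closed = set()
    for fd in ready:
        data = os.read(fd, chunk)   # one read per readiness report
        if data:
            write_all(sources[fd], data)
        else:
            closed.add(fd)          # empty read means end-of-file
    return closed
```

Under this model a blocked read(0, …) would imply either that a read
was issued without a preceding readiness report, or that more than one
read followed a single report.  Which of these (if either) applies here
is not established.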

On the server side (on berlin itself), the corresponding ‘guix offload’
process is stuck here:

--8<---------------cut here---------------start------------->8---
(gdb) bt
#0  0x00007ff49b3590bd in poll () from target:/gnu/store/3h31zsqxjjg52da5gp3qmhkh4x8klhah-glibc-2.25/lib/libc.so.6
#1  0x00007ff48f4db377 in ssh_poll_ctx_dopoll ()
   from target:/gnu/store/3phbrya78gpk7rg6flqyqzf53y3x9zv9-libssh-0.7.5/lib/libssh.so.4
#2  0x00007ff48f4dc319 in ssh_handle_packets ()
   from target:/gnu/store/3phbrya78gpk7rg6flqyqzf53y3x9zv9-libssh-0.7.5/lib/libssh.so.4
#3  0x00007ff48f4dc3ed in ssh_handle_packets_termination ()
   from target:/gnu/store/3phbrya78gpk7rg6flqyqzf53y3x9zv9-libssh-0.7.5/lib/libssh.so.4
#4  0x00007ff48f4c8eff in ssh_channel_read_timeout ()
   from target:/gnu/store/3phbrya78gpk7rg6flqyqzf53y3x9zv9-libssh-0.7.5/lib/libssh.so.4
#5  0x00007ff48f930803 in read_from_channel_port ()
   from target:/gnu/store/xfaqdvk060yz7ddc9isk3wkybqmcfj3w-guile-ssh-0.11.2/lib/libguile-ssh.so.11
#6  0x00007ff49cea7d77 in scm_i_read_bytes ()
   from target:/gnu/store/swyipr8smrd5bc72n92sdfxzx0p4cjpi-guile-2.2.2/lib/libguile-2.2.so.1
#7  0x00007ff49ceac3fc in scm_c_read_bytes ()
   from target:/gnu/store/swyipr8smrd5bc72n92sdfxzx0p4cjpi-guile-2.2.2/lib/libguile-2.2.so.1
#8  0x00007ff49ceb2838 in scm_get_bytevector_n ()
   from target:/gnu/store/swyipr8smrd5bc72n92sdfxzx0p4cjpi-guile-2.2.2/lib/libguile-2.2.so.1
#9  0x00007ff49cee6c4d in vm_regular_engine ()
   from target:/gnu/store/swyipr8smrd5bc72n92sdfxzx0p4cjpi-guile-2.2.2/lib/libguile-2.2.so.1
#10 0x00007ff49ceea2aa in scm_call_n ()
   from target:/gnu/store/swyipr8smrd5bc72n92sdfxzx0p4cjpi-guile-2.2.2/lib/libguile-2.2.so.1
#11 0x00007ff49ce6e8d7 in scm_primitive_eval ()
--8<---------------cut here---------------end--------------->8---

Presumably the ‘scm_get_bytevector_n’ call comes from (guix
serialization) or ‘process-stderr’.

In other words, we have a deadlock where both sides are waiting for
input data.
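
The shape of such a deadlock can be reproduced in miniature.  The
sketch below is Python and unrelated to the actual Guix code;
‘both_waiting’ is a hypothetical helper.  It models two connected
endpoints that each expect the other to send first, and uses a timeout
to observe that neither ever becomes readable.

```python
import select
import socket


def both_waiting(a, b, timeout=0.1):
    """Return True if neither endpoint has data to read within `timeout`."""
    ready, _, _ = select.select([a, b], [], [], timeout)
    return not ready


a, b = socket.socketpair()

# Each side assumes it is the peer's turn to send, so nobody writes:
# a blocking read on either end would never return.
stalled = both_waiting(a, b)

# As soon as one side sends, the other has input and can make progress.
a.sendall(b"x")
progress = not both_waiting(a, b)
```

With blocking reads instead of a timed ‘select’, the first state
would look like the two backtraces above: each process parked in a
read or poll call with no data in flight.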

Ludo’.

¹ https://git.savannah.gnu.org/cgit/guix.git/tree/guix/ssh.scm?id=0362e5820ab6a1eb8eaf33bc47e592857c25f765#n102
