From: Mathieu Othacehe <othacehe@gnu.org>
To: 52182@debbugs.gnu.org
Subject: bug#52182: [cuirass] remote-worker process freeze
Date: Mon, 29 Nov 2021 16:19:48 +0100 [thread overview]
Message-ID: <87ilwbj8ff.fsf@gnu.org> (raw)
Hello,
On the newly installed honeycomb machines, some Cuirass remote-worker
process freeze completely and stop communicating with the
remote-server.
This has already been observed, but is for some reason more repeatable
on those machines.
Here are the information I could collect on such a frozen process using
GDB:
--8<---------------cut here---------------start------------->8---
(gdb) attach 5660 ;frozen cuirass-remote-worker PID
(gdb) info thr
Id Target Id Frame
* 1 Thread 0xffffafd32e20 (LWP 5660) "yHg3r3fS" 0x0000ffffafb3fa80 in do_futex_wait.constprop () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libpthread.so.0
2 Thread 0xffffa6c1c1d0 (LWP 5666) "ZMQbg/Reaper" 0x0000ffffaf7ec294 in epoll_pwait () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libc.so.6
3 Thread 0xffffaf0071d0 (LWP 5667) "ZMQbg/IO/0" 0x0000ffffaf7ec294 in epoll_pwait () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libc.so.6
4 Thread 0xffffa641b1d0 (LWP 5674) "yHg3r3fS" 0x0000ffffaf7b9d04 in clock_nanosleep@@GLIBC_2.17 () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libc.so.6
(gdb) bt
#0 0x0000ffffafb3fa80 in do_futex_wait.constprop () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libpthread.so.0
#1 0x0000ffffafb3fb78 in __new_sem_wait_slow.constprop.0 () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libpthread.so.0
#2 0x0000ffffafb80318 in GC_stop_world () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#3 0x0000ffffafb6c020 in GC_stopped_mark () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#4 0x0000ffffafb6c8dc in GC_try_to_collect_inner () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#5 0x0000ffffafb6d598 in GC_collect_or_expand () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#6 0x0000ffffafb73b4c in GC_alloc_large () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#7 0x0000ffffafb74038 in GC_generic_malloc () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#8 0x0000ffffafb74298 in GC_malloc_kind_global () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#9 0x0000ffffafc11fa8 in scm_make_bytevector () from /gnu/store/7g3nbnf2kf31jk696k0nyz9ck55b11a0-guile-3.0.7/lib/libguile-3.0.so.1
#10 0x0000ffffacacc418 in ?? ()
#11 0x0000ffffacc2ef2c in ?? ()
(gdb) thr 4
[Switching to thread 4 (Thread 0xffffa641b1d0 (LWP 5674))]
#0 0x0000ffffaf7b9d04 in clock_nanosleep@@GLIBC_2.17 () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libc.so.6
(gdb) bt
#0 0x0000ffffaf7b9d04 in clock_nanosleep@@GLIBC_2.17 () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libc.so.6
#1 0x0000ffffaf7bf55c in nanosleep () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libc.so.6
#2 0x0000ffffafb7e844 in GC_lock () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#3 0x0000ffffafb7ecdc in GC_do_blocking_inner () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#4 0x0000ffffafb73998 in GC_with_callee_saves_pushed () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#5 0x0000ffffafb79654 in GC_do_blocking () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#6 0x0000ffffafc96d94 in scm_without_guile () from /gnu/store/7g3nbnf2kf31jk696k0nyz9ck55b11a0-guile-3.0.7/lib/libguile-3.0.so.1
#7 0x0000ffffafc97050 in scm_std_select () from /gnu/store/7g3nbnf2kf31jk696k0nyz9ck55b11a0-guile-3.0.7/lib/libguile-3.0.so.1
#8 0x0000ffffafc97b5c in scm_std_sleep () from /gnu/store/7g3nbnf2kf31jk696k0nyz9ck55b11a0-guile-3.0.7/lib/libguile-3.0.so.1
#9 0x0000ffffafc75918 in scm_sleep () from /gnu/store/7g3nbnf2kf31jk696k0nyz9ck55b11a0-guile-3.0.7/lib/libguile-3.0.so.1
#10 0x0000ffffa6c50d94 in ?? ()
#11 0x0000ffffacc2ee0c in ?? ()
--8<---------------cut here---------------end--------------->8---
So the threads 2 and 3 are managed internally by ZMQ. The threads 1 and
4 are respectively the thread pinging the remote-server and the thread
actually building stuff.
Looks like they are both stuck doing GC stuff.
Thanks,
Mathieu
next reply other threads:[~2021-11-29 15:34 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-11-29 15:19 Mathieu Othacehe [this message]
2022-02-07 19:39 ` bug#52182: [cuirass] remote-worker process freeze Maxim Cournoyer
2023-09-11 8:04 ` Ludovic Courtès
2023-09-11 14:11 ` Maxim Cournoyer
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://guix.gnu.org/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87ilwbj8ff.fsf@gnu.org \
--to=othacehe@gnu.org \
--cc=52182@debbugs.gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/guix.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).