* bug#52646: GC thread freeze
@ 2021-12-18 20:52 Mathieu Othacehe
2021-12-21 10:16 ` Ludovic Courtès
0 siblings, 1 reply; 3+ messages in thread
From: Mathieu Othacehe @ 2021-12-18 20:52 UTC
To: 52646
Hello,
I am seeing strange behaviour with this Guile 3.0.7 process:
https://git.savannah.gnu.org/cgit/guix/guix-cuirass.git/tree/src/cuirass/scripts/remote-worker.scm.
The process forks N processes, each of which in turn starts 4 threads.
On aarch64 machines specifically, some of those threads freeze. Here is
what GDB reports:
--8<---------------cut here---------------start------------->8---
(gdb) attach 5660 ;frozen cuirass-remote-worker PID
(gdb) info thr
Id Target Id Frame
* 1 Thread 0xffffafd32e20 (LWP 5660) "yHg3r3fS" 0x0000ffffafb3fa80 in do_futex_wait.constprop () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libpthread.so.0
2 Thread 0xffffa6c1c1d0 (LWP 5666) "ZMQbg/Reaper" 0x0000ffffaf7ec294 in epoll_pwait () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libc.so.6
3 Thread 0xffffaf0071d0 (LWP 5667) "ZMQbg/IO/0" 0x0000ffffaf7ec294 in epoll_pwait () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libc.so.6
4 Thread 0xffffa641b1d0 (LWP 5674) "yHg3r3fS" 0x0000ffffaf7b9d04 in clock_nanosleep@@GLIBC_2.17 () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libc.so.6
(gdb) bt
#0 0x0000ffffafb3fa80 in do_futex_wait.constprop () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libpthread.so.0
#1 0x0000ffffafb3fb78 in __new_sem_wait_slow.constprop.0 () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libpthread.so.0
#2 0x0000ffffafb80318 in GC_stop_world () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#3 0x0000ffffafb6c020 in GC_stopped_mark () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#4 0x0000ffffafb6c8dc in GC_try_to_collect_inner () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#5 0x0000ffffafb6d598 in GC_collect_or_expand () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#6 0x0000ffffafb73b4c in GC_alloc_large () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#7 0x0000ffffafb74038 in GC_generic_malloc () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#8 0x0000ffffafb74298 in GC_malloc_kind_global () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#9 0x0000ffffafc11fa8 in scm_make_bytevector () from /gnu/store/7g3nbnf2kf31jk696k0nyz9ck55b11a0-guile-3.0.7/lib/libguile-3.0.so.1
#10 0x0000ffffacacc418 in ?? ()
#11 0x0000ffffacc2ef2c in ?? ()
(gdb) thr 4
[Switching to thread 4 (Thread 0xffffa641b1d0 (LWP 5674))]
#0 0x0000ffffaf7b9d04 in clock_nanosleep@@GLIBC_2.17 () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libc.so.6
(gdb) bt
#0 0x0000ffffaf7b9d04 in clock_nanosleep@@GLIBC_2.17 () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libc.so.6
#1 0x0000ffffaf7bf55c in nanosleep () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libc.so.6
#2 0x0000ffffafb7e844 in GC_lock () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#3 0x0000ffffafb7ecdc in GC_do_blocking_inner () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#4 0x0000ffffafb73998 in GC_with_callee_saves_pushed () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#5 0x0000ffffafb79654 in GC_do_blocking () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#6 0x0000ffffafc96d94 in scm_without_guile () from /gnu/store/7g3nbnf2kf31jk696k0nyz9ck55b11a0-guile-3.0.7/lib/libguile-3.0.so.1
#7 0x0000ffffafc97050 in scm_std_select () from /gnu/store/7g3nbnf2kf31jk696k0nyz9ck55b11a0-guile-3.0.7/lib/libguile-3.0.so.1
#8 0x0000ffffafc97b5c in scm_std_sleep () from /gnu/store/7g3nbnf2kf31jk696k0nyz9ck55b11a0-guile-3.0.7/lib/libguile-3.0.so.1
#9 0x0000ffffafc75918 in scm_sleep () from /gnu/store/7g3nbnf2kf31jk696k0nyz9ck55b11a0-guile-3.0.7/lib/libguile-3.0.so.1
#10 0x0000ffffa6c50d94 in ?? ()
#11 0x0000ffffacc2ee0c in ?? ()
--8<---------------cut here---------------end--------------->8---
Threads 1 and 4 no longer respond: thread 1 is stuck on a futex wait in
GC_stop_world and thread 4 on a nanosleep in GC_lock, both inside libgc.
For what it's worth, I do not see this behaviour on x86 machines.
I tried to come up with a smaller reproducer without success, but I'll
keep trying.
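For reference, the process structure described above (fork first, then
start the threads inside each child) can be sketched roughly as follows.
This is only an illustration of the ordering, written in Python rather
than Guile; it is not the Cuirass code and is not expected to reproduce
the freeze. The names `fork_workers` and `worker_main` and the value
chosen for N are hypothetical.

```python
import os
import threading

N_WORKERS = 3           # hypothetical value for "N"
THREADS_PER_WORKER = 4  # the 4 threads per forked process mentioned above


def worker_main(write_fd):
    """Child body: start threads only after the fork, then report back."""
    finished = []
    lock = threading.Lock()

    def task(index):
        # Stand-in for real work; just record that this thread ran.
        with lock:
            finished.append(index)

    threads = [threading.Thread(target=task, args=(i,))
               for i in range(THREADS_PER_WORKER)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    # Tell the parent how many threads completed in this child.
    os.write(write_fd, str(len(finished)).encode())
    os.close(write_fd)


def fork_workers(n=N_WORKERS):
    """Fork n children and return the number of threads each one ran."""
    children = []
    for _ in range(n):
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: never return into the parent's code path.
            os.close(read_fd)
            try:
                worker_main(write_fd)
            finally:
                os._exit(0)
        os.close(write_fd)
        children.append((pid, read_fd))

    results = []
    for pid, read_fd in children:
        data = os.read(read_fd, 16)
        os.close(read_fd)
        os.waitpid(pid, 0)
        results.append(int(data))
    return results


if __name__ == "__main__":
    print(fork_workers())
```

The parent in this sketch starts no threads of its own before calling
`os.fork()`, which is the ordering in question here.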
Thanks,
Mathieu
* bug#52646: GC thread freeze
2021-12-18 20:52 bug#52646: GC thread freeze Mathieu Othacehe
@ 2021-12-21 10:16 ` Ludovic Courtès
2021-12-22 9:15 ` Mathieu Othacehe
0 siblings, 1 reply; 3+ messages in thread
From: Ludovic Courtès @ 2021-12-21 10:16 UTC
To: Mathieu Othacehe; +Cc: 52646
Hello!
Mathieu Othacehe <othacehe@gnu.org> skribis:
> I experiment a strange behaviour with this Guile 3.0.7 process:
> https://git.savannah.gnu.org/cgit/guix/guix-cuirass.git/tree/src/cuirass/scripts/remote-worker.scm.
>
> The process is forking N processes that in turn start 4 threads.
This is happening in this order, right?
POSIX leaves unspecified the behavior of a child process forked from a
multi-threaded process; there could be deadlocks, etc. ‘primitive-fork’
prints a warning when called from a multi-threaded Guile process.
The solution is for multi-threaded Guile processes to not fork at all,
or to fork only via ‘open-pipe*’, ‘system*’, etc., which are “known
good” (they take care of post-fork handling in the child and call ‘exec’
before anything bad could happen.)
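As an illustration of the "fork, then exec right away" pattern, here is
a minimal sketch in Python rather than Guile. It assumes a POSIX system
and relies on the standard `subprocess` module, which execs a fresh
program in the child instead of letting it keep running the parent's
code. The helper names `run_child` and `spawn_from_threaded_process` are
made up for this example; in Guile the analogous entry points would be
the ‘open-pipe*’ and ‘system*’ procedures mentioned above.

```python
import subprocess
import sys
import threading


def run_child(args):
    """Spawn a child that execs a new program and return its stdout."""
    # The child replaces itself with a fresh program image, so it does not
    # go on running code that depends on locks held by the parent's other
    # threads at the time of the fork.
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def spawn_from_threaded_process():
    """Start background threads, then spawn a child while they are alive."""
    stop = threading.Event()
    workers = [threading.Thread(target=stop.wait) for _ in range(4)]
    for worker in workers:
        worker.start()
    try:
        return run_child([sys.executable, "-c", "print('child ok')"])
    finally:
        # Release and reap the background threads.
        stop.set()
        for worker in workers:
            worker.join()


if __name__ == "__main__":
    print(spawn_from_threaded_process())
```

The point of the sketch is only that the child does no work of its own
between the fork and the exec.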
Thanks,
Ludo’.
* bug#52646: GC thread freeze
2021-12-21 10:16 ` Ludovic Courtès
@ 2021-12-22 9:15 ` Mathieu Othacehe
0 siblings, 0 replies; 3+ messages in thread
From: Mathieu Othacehe @ 2021-12-22 9:15 UTC
To: Ludovic Courtès; +Cc: 52646
Hey!
> This is happening in this order, right?
Right, plus I don't see the following warning:
--8<---------------cut here---------------start------------->8---
warning: call to primitive-fork while multiple threads are running;
further behavior unspecified. See "Processes" in the
manual, for more information
--8<---------------cut here---------------end--------------->8---
so I guess the issue is elsewhere.
> POSIX leaves unspecified the behavior of a child process forked from a
> multi-threaded process; there could be deadlocks, etc. ‘primitive-fork’
> prints a warning when called from a multi-threaded Guile process.
>
> The solution is for multi-threaded Guile processes to not fork at all,
> or to fork only via ‘open-pipe*’, ‘system*’, etc., which are “known
> good” (they take care of post-fork handling in the child and call ‘exec’
> before anything bad could happen.)
Thanks for clarifying that anyway.
Mathieu