From: "Ludovic Courtès" <ludo@gnu.org>
To: Jesse Gibbons <jgibbons2357@gmail.com>
Cc: Andy Wingo <wingo@igalia.com>, 37757@debbugs.gnu.org
Subject: bug#37757: Kernel panic upon shutdown
Date: Mon, 09 Dec 2019 14:47:59 +0100
Message-ID: <87lfrlfw4w.fsf@gnu.org>
In-Reply-To: <87d0d6k4z4.fsf@gnu.org> ("Ludovic Courtès"'s message of "Mon, 02 Dec 2019 18:33:03 +0100")
Hello,
[+Cc: Andy for a heads-up on the fix below.]
Ludovic Courtès <ludo@gnu.org> skribis:
> It turns out the previous patch didn’t work; in short, we really have to
> use async-signal-safe functions only from the signal handler, so this
> has to be done in C.
>
> The attached patch does that. I’ve tried it with ‘guix system
> container’ and it seems to dump core as expected, from what I can see.
>
> Let me know if you manage to reproduce the bug and to get a core dumped
> with this patch.
Good news! The patch does indeed allow shepherd to dump core, and I
managed to grab the backtrace below on an x86_64 machine running Guix
System (from yesterday) with GNOME:
--8<---------------cut here---------------start------------->8---
Using host libthread_db library "/gnu/store/ahqgl4h89xqj695lgqvsaf6zh2nhy4pj-glibc-2.29/lib/libthread_db.so.1".
Core was generated by `/gnu/store/1mkkv2caiqbdbbd256c4dirfi4kwsacv-guile-2.2.6/bin/guile --no-auto-com'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 handle_crash (sig=11)
at /gnu/store/dayk54wxskp14w53813384azhxmd5awz-shepherd-crash-handler.c:43
43 * (int *) 0 = 42;
[Current thread is 1 (LWP 4635)]
[…]
Thread 1 (LWP 4635):
#0 handle_crash (sig=11) at /gnu/store/dayk54wxskp14w53813384azhxmd5awz-shepherd-crash-handler.c:43
infinity = {rlim_cur = 18446744073709551615, rlim_max = 18446744073709551615}
pid = <optimized out>
msg = "Shepherd crashed!\n"
pid = <optimized out>
#1 <signal handler called>
No locals.
#2 handle_crash (sig=6) at /gnu/store/dayk54wxskp14w53813384azhxmd5awz-shepherd-crash-handler.c:43
infinity = {rlim_cur = 18446744073709551615, rlim_max = 18446744073709551615}
pid = <optimized out>
msg = "Shepherd crashed!\n"
pid = <optimized out>
#3 <signal handler called>
No locals.
#4 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
set = {__val = {0, 2314885530818445312, 0 <repeats 14 times>}}
pid = <optimized out>
tid = <optimized out>
ret = <optimized out>
#5 0x00007f03eef40891 in __GI_abort () at abort.c:79
save_stage = 1
act = {__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, sa_mask = {__val = {0 <repeats 13 times>, 139654877144192, 0, 139654877624544}}, sa_flags = -279049286, sa_restorer = 0x7f03ef57e480 <read_finalization_pipe_data>}
sigs = {__val = {32, 0 <repeats 15 times>}}
#6 0x00007f03ef57e89a in finalization_thread_proc (unused=<optimized out>) at finalizers.c:228
data = {byte = -24 '\350', n = -1, err = 4}
#7 0x00007f03ef56f35a in c_body (d=0x7f03ed152e50) at continuations.c:422
data = 0x7f03ed152e50
#8 0x00007f03ef5f079f in vm_regular_engine (thread=0x2, vp=0x7f03eb1caea0, registers=0x0, resume=-286001158) at vm-engine.c:786
ret = 2
ip = <optimized out>
sp = <optimized out>
op = 10
jump_table_ = {…}
jump_table = 0x7f03ef64d8e0 <jump_table_>
[…]
#19 scm_with_guile (func=<optimized out>, data=<optimized out>) at threads.c:710
No locals.
#20 0x00007f03ef497015 in start_thread (arg=0x7f03ed153700) at pthread_create.c:486
ret = <optimized out>
pd = 0x7f03ed153700
now = <optimized out>
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {139654839219968, -749312912628550421, 140727702524830, 140727702524831, 140727702524832, 139654839219968, 837174519050892523, 837169745183601899}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
not_first_call = <optimized out>
#21 0x00007f03eeffd91f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
No locals.
--8<---------------cut here---------------end--------------->8---
So what happens is that the read in ‘finalization_thread_proc’ in Guile
fails with EINTR (data.n == -1, data.err == 4) but then, despite EINTR,
the function goes on to check the value of ‘data.byte’ and aborts
because it’s neither 0 nor 1 (it is -24 in the backtrace above).
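To make the control flow concrete, here is a minimal standalone sketch in C of the decision the loop has to make after each read. It is not the Guile code: ‘struct pipe_read_result’, ‘enum next_action’ and ‘classify_read’ are hypothetical names made up for illustration; only the ‘byte’, ‘n’ and ‘err’ fields mirror what the backtrace shows.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for the data filled in by the pipe read.
   The field names follow frame #6 of the backtrace.  */
struct pipe_read_result
{
  char byte;    /* byte read from the pipe; meaningful only when n > 0 */
  ssize_t n;    /* return value of read(2) */
  int err;      /* errno captured right after the read */
};

/* Hypothetical set of outcomes for one loop iteration.  */
enum next_action
{
  ACTION_RETRY,           /* interrupted read: loop again */
  ACTION_ERROR,           /* real read error: report it and stop */
  ACTION_RUN_FINALIZERS,  /* byte 0 */
  ACTION_EXIT,            /* byte 1 */
  ACTION_ABORT            /* any other byte: protocol violation */
};

/* Decide what to do after a read.  'byte' is inspected only when the
   read actually returned data, so an EINTR never reaches the switch.  */
static enum next_action
classify_read (const struct pipe_read_result *r)
{
  assert (r != NULL);

  if (r->n <= 0)
    return r->err == EINTR ? ACTION_RETRY : ACTION_ERROR;

  switch (r->byte)
    {
    case 0:
      return ACTION_RUN_FINALIZERS;
    case 1:
      return ACTION_EXIT;
    default:
      return ACTION_ABORT;
    }
}
```

With the values from frame #6 (n = -1, err = EINTR), ‘classify_read’ asks for a retry without ever looking at ‘byte’, which is the behavior the patch below aims for.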
My plan is to:
1. push the patch below to the ‘stable-2.2’ branch of Guile (done:
<https://git.savannah.gnu.org/cgit/guile.git/commit/?h=stable-2.2&id=edf5aea7ac852db2356ef36cba4a119eb0c81ea9>);
2. use a patched Guile for the ‘shepherd’ package;
3. include the crash handler in the Shepherd.
Thoughts?
Thanks,
Ludo’.
[-- Attachment #2: Type: text/x-patch, Size: 1353 bytes --]
diff --git a/libguile/finalizers.c b/libguile/finalizers.c
index c5d69e8e3..94a6e6b0a 100644
--- a/libguile/finalizers.c
+++ b/libguile/finalizers.c
@@ -1,4 +1,4 @@
-/* Copyright (C) 2012, 2013, 2014 Free Software Foundation, Inc.
+/* Copyright (C) 2012, 2013, 2014, 2019 Free Software Foundation, Inc.
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public License
@@ -211,21 +211,26 @@ finalization_thread_proc (void *unused)
scm_without_guile (read_finalization_pipe_data, &data);
- if (data.n <= 0 && data.err != EINTR)
+ if (data.n <= 0)
{
- perror ("error in finalization thread");
- return NULL;
+ if (data.err != EINTR)
+ {
+ perror ("error in finalization thread");
+ return NULL;
+ }
}
-
- switch (data.byte)
+ else
{
- case 0:
- scm_run_finalizers ();
- break;
- case 1:
- return NULL;
- default:
- abort ();
+ switch (data.byte)
+ {
+ case 0:
+ scm_run_finalizers ();
+ break;
+ case 1:
+ return NULL;
+ default:
+ abort ();
+ }
}
}
}