From mboxrd@z Thu Jan  1 00:00:00 1970
From: Ludovic Courtès
Subject: bug#37757: Kernel panic upon shutdown
Date: Mon, 09 Dec 2019 14:47:59 +0100
Message-ID: <87lfrlfw4w.fsf@gnu.org>
References: <0876c9961fdffa47be54b756a05eb6320b6bdb18.camel@gmail.com>
 <874kzsfqsx.fsf@gnu.org> <87k183mnza.fsf@gnu.org>
 <87wobkw7gj.fsf@gnu.org> <87d0d6k4z4.fsf@gnu.org>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
In-Reply-To: <87d0d6k4z4.fsf@gnu.org> ("Ludovic Courtès"'s message of
 "Mon, 02 Dec 2019 18:33:03 +0100")
To: Jesse Gibbons
Cc: Andy Wingo, 37757@debbugs.gnu.org

--=-=-=
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

Hello,

[+Cc: Andy for a heads-up on the fix below.]

Ludovic Courtès skribis:

> It turns out the previous patch didn’t work; in short, we really have to
> use async-signal-safe functions only from the signal handler, so this
> has to be done in C.
>
> The attached patch does that.  I’ve tried it with ‘guix system
> container’ and it seems to dump core as expected, from what I can see.
>
> Let me know if you manage to reproduce the bug and to get a core dumped
> with this patch.

Good news!  The patch does indeed allow shepherd to dump core, and I
managed to grab the backtrace below on an x86_64 machine running Guix
System (from yesterday) with GNOME:

--8<---------------cut here---------------start------------->8---
Using host libthread_db library "/gnu/store/ahqgl4h89xqj695lgqvsaf6zh2nhy4pj-glibc-2.29/lib/libthread_db.so.1".
Core was generated by `/gnu/store/1mkkv2caiqbdbbd256c4dirfi4kwsacv-guile-2.2.6/bin/guile --no-auto-com'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  handle_crash (sig=11) at /gnu/store/dayk54wxskp14w53813384azhxmd5awz-shepherd-crash-handler.c:43
43        * (int *) 0 = 42;
[Current thread is 1 (LWP 4635)]

[…]

Thread 1 (LWP 4635):
#0  handle_crash (sig=11) at /gnu/store/dayk54wxskp14w53813384azhxmd5awz-shepherd-crash-handler.c:43
        infinity = {rlim_cur = 18446744073709551615, rlim_max = 18446744073709551615}
        pid =
        msg = "Shepherd crashed!\n"
        pid =
#1
No locals.
#2  handle_crash (sig=6) at /gnu/store/dayk54wxskp14w53813384azhxmd5awz-shepherd-crash-handler.c:43
        infinity = {rlim_cur = 18446744073709551615, rlim_max = 18446744073709551615}
        pid =
        msg = "Shepherd crashed!\n"
        pid =
#3
No locals.
#4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
        set = {__val = {0, 2314885530818445312, 0 }}
        pid =
        tid =
        ret =
#5  0x00007f03eef40891 in __GI_abort () at abort.c:79
        save_stage = 1
        act = {__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, sa_mask = {__val = {0 , 139654877144192, 0, 139654877624544}}, sa_flags = -279049286, sa_restorer = 0x7f03ef57e480}
        sigs = {__val = {32, 0 }}
#6  0x00007f03ef57e89a in finalization_thread_proc (unused=) at finalizers.c:228
        data = {byte = -24 '\350', n = -1, err = 4}
#7  0x00007f03ef56f35a in c_body (d=0x7f03ed152e50) at continuations.c:422
        data = 0x7f03ed152e50
#8  0x00007f03ef5f079f in vm_regular_engine (thread=0x2, vp=0x7f03eb1caea0, registers=0x0, resume=-286001158) at vm-engine.c:786
        ret = 2
        ip =
        sp =
        op = 10
        jump_table_ = {…}
        jump_table = 0x7f03ef64d8e0

[…]

#19 scm_with_guile (func=, data=) at threads.c:710
No locals.
#20 0x00007f03ef497015 in start_thread (arg=0x7f03ed153700) at pthread_create.c:486
        ret =
        pd = 0x7f03ed153700
        now =
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {139654839219968, -749312912628550421, 140727702524830, 140727702524831, 140727702524832, 139654839219968, 837174519050892523, 837169745183601899}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call =
#21 0x00007f03eeffd91f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
No locals.
--8<---------------cut here---------------end--------------->8---

So what happens is that ‘finalization_thread_proc’ in Guile receives
EINTR (data.err == 4) but then, despite EINTR, it goes on to check the
value of ‘data.byte’ and aborts because it’s neither 0 nor 1.

My plan is to:

  1. push the patch below to the ‘stable-2.2’ branch of Guile; done: ;

  2. use a patched Guile for the ‘shepherd’ package;

  3. include the crash handler in the Shepherd.

Thoughts?

Thanks,
Ludo’.

--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline

diff --git a/libguile/finalizers.c b/libguile/finalizers.c
index c5d69e8e3..94a6e6b0a 100644
--- a/libguile/finalizers.c
+++ b/libguile/finalizers.c
@@ -1,4 +1,4 @@
-/* Copyright (C) 2012, 2013, 2014 Free Software Foundation, Inc.
+/* Copyright (C) 2012, 2013, 2014, 2019 Free Software Foundation, Inc.
  *
  * This library is free software; you can redistribute it and/or
  * modify it under the terms of the GNU Lesser General Public License
@@ -211,21 +211,26 @@ finalization_thread_proc (void *unused)
 
       scm_without_guile (read_finalization_pipe_data, &data);
 
-      if (data.n <= 0 && data.err != EINTR)
+      if (data.n <= 0)
         {
-          perror ("error in finalization thread");
-          return NULL;
+          if (data.err != EINTR)
+            {
+              perror ("error in finalization thread");
+              return NULL;
+            }
         }
-
-      switch (data.byte)
+      else
         {
-        case 0:
-          scm_run_finalizers ();
-          break;
-        case 1:
-          return NULL;
-        default:
-          abort ();
+          switch (data.byte)
+            {
+            case 0:
+              scm_run_finalizers ();
+              break;
+            case 1:
+              return NULL;
+            default:
+              abort ();
+            }
         }
     }
 }

--=-=-=--
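
For readers who want to poke at the control flow outside of Guile, here is a
minimal standalone sketch of the decision the patched loop makes.  It is not
code from libguile: ‘struct pipe_read_result’, ‘enum next_action’, ‘decide’
and ‘check_backtrace_case’ are hypothetical names made up for illustration,
and only the field names (byte, n, err) follow the ‘data’ variable shown in
frame #6.  The point is simply that ‘byte’ is looked at only when the read
actually returned something.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>

/* Hypothetical stand-in for the result of reading the finalization pipe;
   the field names follow the 'data' variable in frame #6.  */
struct pipe_read_result
{
  char byte;    /* byte read from the pipe; meaningful only if n > 0 */
  ssize_t n;    /* return value of read(2) */
  int err;      /* errno saved right after the read */
};

/* Hypothetical labels for what the loop does next.  */
enum next_action
{
  ACTION_RETRY,            /* interrupted: go around the loop again */
  ACTION_FAIL,             /* read failed or returned nothing: stop */
  ACTION_RUN_FINALIZERS,   /* byte 0 */
  ACTION_STOP,             /* byte 1 */
  ACTION_ABORT             /* unexpected byte */
};

/* Same decision structure as the patched loop: inspect 'byte' only in
   the branch where the read returned data.  */
static enum next_action
decide (const struct pipe_read_result *data)
{
  if (data->n <= 0)
    return data->err == EINTR ? ACTION_RETRY : ACTION_FAIL;

  switch (data->byte)
    {
    case 0:
      return ACTION_RUN_FINALIZERS;
    case 1:
      return ACTION_STOP;
    default:
      return ACTION_ABORT;
    }
}

/* Values from frame #6: byte = -24, n = -1, err = 4 (EINTR on Linux).
   With the structure above this is a retry, not an abort.  */
static void
check_backtrace_case (void)
{
  struct pipe_read_result data = { -24, -1, EINTR };
  assert (decide (&data) == ACTION_RETRY);
}
```

Calling ‘check_backtrace_case’ feeds in the values from the core dump; with
the pre-patch structure the same input would fall through to the switch and
reach the ‘default’ case.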