unofficial mirror of bug-guix@gnu.org 
 help / color / mirror / code / Atom feed
From: "Ludovic Courtès" <ludo@gnu.org>
To: Jesse Gibbons <jgibbons2357@gmail.com>
Cc: Andy Wingo <wingo@igalia.com>, 37757@debbugs.gnu.org
Subject: bug#37757: Kernel panic upon shutdown
Date: Mon, 09 Dec 2019 14:47:59 +0100	[thread overview]
Message-ID: <87lfrlfw4w.fsf@gnu.org> (raw)
In-Reply-To: <87d0d6k4z4.fsf@gnu.org> ("Ludovic \=\?utf-8\?Q\?Court\=C3\=A8s\=22'\?\= \=\?utf-8\?Q\?s\?\= message of "Mon, 02 Dec 2019 18:33:03 +0100")

[-- Attachment #1: Type: text/plain, Size: 4466 bytes --]

Hello,

[+Cc: Andy for a heads-up on the fix below.]

Ludovic Courtès <ludo@gnu.org> skribis:

> It turns out the previous patch didn’t work; in short, we really have to
> use async-signal-safe functions only from the signal handler, so this
> has to be done in C.
>
> The attached patch does that.  I’ve tried it with ‘guix system
> container’ and it seems to dump core as expected, from what I can see.
>
> Let me know if you manage to reproduce the bug and to get a core dumped
> with this patch.

Good news!  The patch does indeed allow shepherd to dump core, and I
managed to grab the backtrace below on an x86_64 machine running Guix
System (from yesterday) with GNOME:

--8<---------------cut here---------------start------------->8---
Using host libthread_db library "/gnu/store/ahqgl4h89xqj695lgqvsaf6zh2nhy4pj-glibc-2.29/lib/libthread_db.so.1".
Core was generated by `/gnu/store/1mkkv2caiqbdbbd256c4dirfi4kwsacv-guile-2.2.6/bin/guile --no-auto-com'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  handle_crash (sig=11)
    at /gnu/store/dayk54wxskp14w53813384azhxmd5awz-shepherd-crash-handler.c:43
43	      * (int *) 0 = 42;
[Current thread is 1 (LWP 4635)]

[…]

Thread 1 (LWP 4635):
#0  handle_crash (sig=11) at /gnu/store/dayk54wxskp14w53813384azhxmd5awz-shepherd-crash-handler.c:43
        infinity = {rlim_cur = 18446744073709551615, rlim_max = 18446744073709551615}
        pid = <optimized out>
        msg = "Shepherd crashed!\n"
        pid = <optimized out>
#1  <signal handler called>
No locals.
#2  handle_crash (sig=6) at /gnu/store/dayk54wxskp14w53813384azhxmd5awz-shepherd-crash-handler.c:43
        infinity = {rlim_cur = 18446744073709551615, rlim_max = 18446744073709551615}
        pid = <optimized out>
        msg = "Shepherd crashed!\n"
        pid = <optimized out>
#3  <signal handler called>
No locals.
#4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
        set = {__val = {0, 2314885530818445312, 0 <repeats 14 times>}}
        pid = <optimized out>
        tid = <optimized out>
        ret = <optimized out>
#5  0x00007f03eef40891 in __GI_abort () at abort.c:79
        save_stage = 1
        act = {__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, sa_mask = {__val = {0 <repeats 13 times>, 139654877144192, 0, 139654877624544}}, sa_flags = -279049286, sa_restorer = 0x7f03ef57e480 <read_finalization_pipe_data>}
        sigs = {__val = {32, 0 <repeats 15 times>}}
#6  0x00007f03ef57e89a in finalization_thread_proc (unused=<optimized out>) at finalizers.c:228
        data = {byte = -24 '\350', n = -1, err = 4}
#7  0x00007f03ef56f35a in c_body (d=0x7f03ed152e50) at continuations.c:422
        data = 0x7f03ed152e50
#8  0x00007f03ef5f079f in vm_regular_engine (thread=0x2, vp=0x7f03eb1caea0, registers=0x0, resume=-286001158) at vm-engine.c:786
        ret = 2
        ip = <optimized out>
        sp = <optimized out>
        op = 10
        jump_table_ = {…}
        jump_table = 0x7f03ef64d8e0 <jump_table_>

[…]

#19 scm_with_guile (func=<optimized out>, data=<optimized out>) at threads.c:710
No locals.
#20 0x00007f03ef497015 in start_thread (arg=0x7f03ed153700) at pthread_create.c:486
        ret = <optimized out>
        pd = 0x7f03ed153700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {139654839219968, -749312912628550421, 140727702524830, 140727702524831, 140727702524832, 139654839219968, 837174519050892523, 837169745183601899}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#21 0x00007f03eeffd91f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
No locals.
--8<---------------cut here---------------end--------------->8---

So what happens is that ‘finalization_thread_proc’ in Guile receives
EINTR (data.err == 4) but then, despite EINTR, it goes on to check the
value of ‘data.byte’ and aborts because it’s neither 0 nor 1.

My plan is to:

  1. push the patch below to the ‘stable-2.2’ branch of Guile;
     done:
     <https://git.savannah.gnu.org/cgit/guile.git/commit/?h=stable-2.2&id=edf5aea7ac852db2356ef36cba4a119eb0c81ea9>;

  2. use a patched Guile for the ‘shepherd’ package;

  3. include the crash handler in the Shepherd.

Thoughts?

Thanks,
Ludo’.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Type: text/x-patch, Size: 1353 bytes --]

diff --git a/libguile/finalizers.c b/libguile/finalizers.c
index c5d69e8e3..94a6e6b0a 100644
--- a/libguile/finalizers.c
+++ b/libguile/finalizers.c
@@ -1,4 +1,4 @@
-/* Copyright (C) 2012, 2013, 2014 Free Software Foundation, Inc.
+/* Copyright (C) 2012, 2013, 2014, 2019 Free Software Foundation, Inc.
  *
  * This library is free software; you can redistribute it and/or
  * modify it under the terms of the GNU Lesser General Public License
@@ -211,21 +211,26 @@ finalization_thread_proc (void *unused)
 
       scm_without_guile (read_finalization_pipe_data, &data);
       
-      if (data.n <= 0 && data.err != EINTR) 
+      if (data.n <= 0)
         {
-          perror ("error in finalization thread");
-          return NULL;
+          if (data.err != EINTR)
+            {
+              perror ("error in finalization thread");
+              return NULL;
+            }
         }
-
-      switch (data.byte)
+      else
         {
-        case 0:
-          scm_run_finalizers ();
-          break;
-        case 1:
-          return NULL;
-        default:
-          abort ();
+          switch (data.byte)
+            {
+            case 0:
+              scm_run_finalizers ();
+              break;
+            case 1:
+              return NULL;
+            default:
+              abort ();
+            }
         }
     }
 }

  parent reply	other threads:[~2019-12-09 13:49 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <0876c9961fdffa47be54b756a05eb6320b6bdb18.camel@gmail.com>
2019-10-28 22:28 ` bug#37757: Kernel panic upon shutdown Ludovic Courtès
2019-11-13 22:05   ` Ludovic Courtès
2019-11-13 22:22     ` Jan
2019-11-28 11:45     ` Ludovic Courtès
2019-12-02 17:33       ` Ludovic Courtès
2019-12-03  9:43         ` Arne Babenhauserheide
2019-12-09 13:47         ` Ludovic Courtès [this message]
2019-12-09 23:13           ` Ludovic Courtès

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://guix.gnu.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87lfrlfw4w.fsf@gnu.org \
    --to=ludo@gnu.org \
    --cc=37757@debbugs.gnu.org \
    --cc=jgibbons2357@gmail.com \
    --cc=wingo@igalia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).