From: Mathieu Othacehe <othacehe@gnu.org>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: 41948@debbugs.gnu.org
Subject: bug#41948: Shepherd deadlocks
Date: Sun, 16 Aug 2020 11:56:37 +0200
Message-ID: <87k0xyhq22.fsf@gnu.org>
In-Reply-To: <87a70yc9kj.fsf@gnu.org> ("Ludovic Courtès"'s message of "Sat, 20 Jun 2020 12:31:40 +0200")

[-- Attachment #1: Type: text/plain, Size: 2301 bytes --]


Hey Ludo,

> We should be able to reproduce it with much simpler tests then, right?
> Like maybe “while : ; do herd restart guix-daemon ; done” or similar?

Well, I tried that without success. Then I had a closer look at the
strace log.

Turns out there are two concurrent "finalizer" threads:

--8<---------------cut here---------------start------------->8---
1     clone(child_stack=0x7f17981e6fb0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tid=[271], tls=0x7f17981e7700, child_tidptr=0x7f17981e79d0) = 271
--8<---------------cut here---------------end--------------->8---

and this one,

--8<---------------cut here---------------start------------->8---
217   <... clone resumed>, parent_tid=[253], tls=0x7f1799309700, child_tidptr=0x7f17993099d0) = 253
--8<---------------cut here---------------end--------------->8---

The first one is spawned from Shepherd directly. The other one is
spawned from the forked process in "marionette-shepherd-service".

Those two finalizer threads share the same pipe. When we try to
stop the finalizer thread in Shepherd, right before forking a new
process, we send a '\1' byte to the finalizer pipe.

--8<---------------cut here---------------start------------->8---
1     write(6, "\1", 1 <unfinished ...>
--8<---------------cut here---------------end--------------->8---

which is received by (line 183597): 

--8<---------------cut here---------------start------------->8---
253   <... read resumed>"\1", 1)        = 1
--8<---------------cut here---------------end--------------->8---

the marionette finalizer thread. Then we pthread_join the Shepherd
finalizer thread, which never stops because it never saw that byte!
Quite unfortunate.
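
To make the mechanism concrete, here is a minimal sketch of the same
situation at user level. It is not Guile's finalizer code: it only
assumes that two processes block reading the same inherited pipe, and
that a single "stop" byte is written to it. The names `spawn-reader`
and `count-woken-readers` are made up for this illustration.

```scheme
(use-modules (ice-9 binary-ports)   ; get-u8, put-u8
             (srfi srfi-1))         ; filter

;; Hypothetical helper: fork a child that blocks until it reads one byte
;; from IN, then exits.  Returns the child's PID to the parent.
(define (spawn-reader in)
  (let ((pid (primitive-fork)))
    (when (zero? pid)
      (get-u8 in)                   ; blocks until a byte arrives
      (primitive-exit 0))
    pid))

;; Start two readers on ONE pipe, write ONE byte, and count how many
;; readers woke up and exited.
(define (count-woken-readers)
  (let* ((p    (pipe))
         (in   (car p))
         (out  (cdr p))
         (pids (list (spawn-reader in) (spawn-reader in))))
    (put-u8 out 1)                  ; the single "stop" byte
    (force-output out)              ; pipe ports are buffered
    (sleep 1)                       ; give the readers time to react
    (let ((woken (filter (lambda (pid)
                           ;; With WNOHANG, waitpid returns PID 0 when
                           ;; the child has not exited yet.
                           (not (zero? (car (waitpid pid WNOHANG)))))
                         pids)))
      ;; Clean up whichever reader is still blocked.
      (for-each (lambda (pid)
                  (unless (memv pid woken)
                    (kill pid SIGKILL)
                    (waitpid pid)))
                pids)
      (length woken))))
```

Calling `(count-woken-readers)` should report that only one of the two
readers woke up. The other stays blocked, and anything that waits for it
to finish, as pthread_join does here, waits forever.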

Here's a small reproducer attached. Unless I'm wrong, this is a Guile
issue that can cause any program that uses at least two primitive-fork
calls to hang.

I'm quite convinced that those two bugs are directly related:

* https://issues.guix.info/31925
* https://issues.guix.gnu.org/42353

Now regarding the fix of this issue, I guess that a process forked with
"primitive-fork" in Guile should close its parent's finalizer pipe and
open a new one.
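
As a rough user-level analogy of that idea (the real change would have
to live inside Guile, whose finalizer pipe is not reachable from Scheme),
the sketch below has the child drop both inherited pipe ends and open a
private pipe. The name `wake-up-survives-fork?` is hypothetical, and an
ordinary pipe stands in for the finalizer pipe.

```scheme
(use-modules (ice-9 binary-ports))  ; get-u8, put-u8

;; Returns #t when the parent can still read back the byte it wrote to
;; its own pipe after a child was forked.
(define (wake-up-survives-fork?)
  (let* ((p   (pipe))
         (in  (car p))
         (out (cdr p))
         (pid (primitive-fork)))
    (when (zero? pid)
      ;; Child: close the ends inherited from the parent ...
      (close-port in)
      (close-port out)
      ;; ... and wait on a fresh pipe that only this process knows about.
      (let* ((private     (pipe))
             (private-in  (car private))
             (private-out (cdr private)))   ; kept open so no EOF arrives
        (get-u8 private-in)
        (close-port private-out))
      (primitive-exit 0))
    ;; Parent: the byte can no longer be consumed by the child.
    (put-u8 out 1)
    (force-output out)
    (sleep 1)
    (let ((byte (get-u8 in)))
      (kill pid SIGKILL)
      (waitpid pid)
      (eqv? byte 1))))
```

With the pipes separated like this, a byte written by the parent can
only reach a reader in the parent.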

WDYT?

Thanks,

Mathieu


[-- Attachment #2: t.scm --]
[-- Type: application/octet-stream, Size: 315 bytes --]

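;; Reproducer: two processes whose finalizer threads share one pipe.
(use-modules (shepherd service)
             (ice-9 match))

(match (primitive-fork)
  (0
   ;; Child: keep triggering GC so that a finalizer thread keeps running
   ;; here, reading from the finalizer pipe inherited from the parent.
   (while #t
     (gc)
     (usleep 200000)))
  (pid
   ;; Parent: fork repeatedly.  Each fork first stops the finalizer thread
   ;; by writing a byte to that same pipe; the loop hangs once the child's
   ;; thread consumes the byte instead.
   (let loop ((count 0))
     (format #t "Forking ~a~%" count)
     (fork+exec-command '("/bin/sh" "-c" "sleep 1"))
     (usleep (random 200000))
     (loop (1+ count)))))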

Thread overview: 12+ messages
2020-06-19  8:41 bug#41948: Shepherd deadlocks Mathieu Othacehe
2020-06-19 12:10 ` Mathieu Othacehe
2020-06-20  0:16 ` Michael Rohleder
2020-06-20 10:31 ` Ludovic Courtès
2020-08-16  9:56   ` Mathieu Othacehe [this message]
2021-05-07 21:49     ` Ludovic Courtès
2021-05-07 22:07       ` Ludovic Courtès
2021-05-08 20:52         ` Ludovic Courtès
2021-05-08  9:43     ` Ludovic Courtès
2021-05-08 13:49       ` Andrew Whatson
2021-05-08 13:49         ` bug#41948: [PATCH] Fix some finalizer thread race conditions Andrew Whatson
2021-05-08 20:50           ` bug#41948: Shepherd deadlocks Ludovic Courtès