unofficial mirror of bug-guix@gnu.org 
 help / color / mirror / code / Atom feed
From: ludo@gnu.org (Ludovic Courtès)
To: Mark H Weaver <mhw@netris.org>
Cc: Andy Wingo <wingo@igalia.com>, 31925@debbugs.gnu.org
Subject: bug#31925: 'guix substitutes' sometimes hangs on glibc 2.27
Date: Thu, 05 Jul 2018 10:34:38 +0200	[thread overview]
Message-ID: <87lgaqgjxd.fsf@gnu.org> (raw)
In-Reply-To: <87efgil5jz.fsf@netris.org> (Mark H. Weaver's message of "Wed, 04 Jul 2018 23:33:52 -0400")

[-- Attachment #1: Type: text/plain, Size: 1791 bytes --]

Hello Mark,

Thanks for chiming in!

Mark H Weaver <mhw@netris.org> skribis:

> Does libgc spawn threads that run concurrently with user threads?  If
> so, that would be news to me.  My understanding was that incremental
> marking occurs within GC allocation calls, and marking threads are only
> spawned after all user threads have been stopped, but I could be wrong.

libgc launches mark threads as soon as it is initialized, I think.

> The first idea that comes to my mind is that perhaps the finalization
> thread is holding the GC allocation lock when 'fork' is called.  The
> finalization thread grabs the GC allocation lock every time it calls
> 'GC_invoke_finalizers'.  All ports backed by POSIX file descriptors
> (including pipes) register finalizers and therefore spawn the
> finalization thread and make work for it to do.

In 2.2 there’s scm_i_finalizer_pre_fork that takes care of shutting down
the finalization thread right before fork.  So the finalization thread
cannot be blamed, AIUI.

> Another possibility: both the finalization thread and the signal
> delivery thread call 'scm_without_guile', which calls 'GC_do_blocking',
> which also temporarily grabs the GC allocation lock before calling the
> specified function.  See 'GC_do_blocking_inner' in pthread_support.c in
> libgc.  You spawn the signal delivery thread by calling 'sigaction' and
> you make work for it to do every second when the SIGALRM is delivered.

That’s definitely a possibility: the signal thread could be allocating
stuff, and thereby taking the alloc lock just at that time.

>> If that is correct, the fix would be to call fork within
>> ‘GC_call_with_alloc_lock’.
>>
>> How does that sound?
>
> Sure, sounds good to me.

Here’s a patch:


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Type: text/x-patch, Size: 1080 bytes --]

diff --git a/libguile/posix.c b/libguile/posix.c
index b0fcad5fd..088e75631 100644
--- a/libguile/posix.c
+++ b/libguile/posix.c
@@ -1209,6 +1209,13 @@ SCM_DEFINE (scm_execle, "execle", 2, 0, 1,
 #undef FUNC_NAME
 
 #ifdef HAVE_FORK
+static void *
+do_fork (void *pidp)
+{
+  * (int *) pidp = fork ();
+  return NULL;
+}
+
 SCM_DEFINE (scm_fork, "primitive-fork", 0, 0, 0,
             (),
 	    "Creates a new \"child\" process by duplicating the current \"parent\" process.\n"
@@ -1236,7 +1243,13 @@ SCM_DEFINE (scm_fork, "primitive-fork", 0, 0, 0,
         "         further behavior unspecified.  See \"Processes\" in the\n"
         "         manual, for more information.\n"),
        scm_current_warning_port ());
-  pid = fork ();
+
+  /* Take the alloc lock to make sure it is released when the child
+     process starts.  Failing to do that the child process could start
+     in a state where the alloc lock is taken and will never be
+     released.  */
+  GC_call_with_alloc_lock (do_fork, &pid);
+
   if (pid == -1)
     SCM_SYSERROR;
   return scm_from_int (pid);

[-- Attachment #3: Type: text/plain, Size: 266 bytes --]


Thoughts?

Unfortunately my ‘call-with-decompressed-port’ reproducer doesn’t seem t
to reproduce much today so I can’t tell if this helps (I let it run more
than 5 minutes with the supposedly-buggy Guile and nothing happened…).

Thanks,
Ludo’.

      parent reply	other threads:[~2018-07-05  8:36 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-21 11:45 bug#31925: 'guix substitutes' sometimes hangs on glibc 2.27 Ludovic Courtès
2018-06-21 14:10 ` Ricardo Wurmus
2018-07-04  7:03 ` Ludovic Courtès
2018-07-04 16:58   ` Ludovic Courtès
2018-07-05  3:33     ` Mark H Weaver
2018-07-05  8:00       ` Andy Wingo
2018-07-05 10:05         ` Mark H Weaver
2018-07-05 14:04           ` Andy Wingo
2018-07-05 12:27         ` Ludovic Courtès
2018-07-05 14:08           ` Andy Wingo
2018-07-06 15:35             ` Ludovic Courtès
2018-07-05  8:34       ` Ludovic Courtès [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://guix.gnu.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87lgaqgjxd.fsf@gnu.org \
    --to=ludo@gnu.org \
    --cc=31925@debbugs.gnu.org \
    --cc=mhw@netris.org \
    --cc=wingo@igalia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).