From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: 63982-done@debbugs.gnu.org
Subject: bug#63982: Shepherd can crash when a user service fails to start
Date: Tue, 18 Jul 2023 21:11:42 -0400
Message-ID: <87sf9kln75.fsf@gmail.com>
In-Reply-To: <87ttu910q7.fsf@gnu.org> ("Ludovic Courtès"'s message of "Wed, 12 Jul 2023 19:46:56 +0200")

Hey Ludo!

Ludovic Courtès <ludo@gnu.org> writes:

> Hi!
>
> Ludovic Courtès <ludo@gnu.org> skribis:
>
>> Turns out that this happens when calling the ‘daemonize’ action on
>> ‘root’.  I have a reproducer now and am investigating…
>
> Good news: this is fixed in Shepherd commit
> f4272d2f0f393d2aa3e9d76b36ab6aa5f2fc72c2!
>
> The root cause is inconsistent semantics when mixing epoll, signalfd,
> and fork, specifically this part from signalfd(2):
>
>    epoll(7) semantics
>        If a process adds (via epoll_ctl(2)) a signalfd file descriptor to an
>        epoll(7) instance, then epoll_wait(2) returns events only for signals
>        sent to that process.  In particular, if the process then uses fork(2)
>        to create a child process, then the child will be able to read(2)
>        signals that are sent to it using the signalfd file descriptor, but
>        epoll_wait(2) will not indicate that the signalfd file descriptor is
>        ready.  In this scenario, a possible workaround is that after the
>        fork(2), the child process can close the signalfd file descriptor that
>        it inherited from the parent process and then create another signalfd
>        file descriptor and add it to the epoll instance. […]
>
> The C program below illustrates this behavior:
>
> #include <stdlib.h>
> #include <stdio.h>
> #include <unistd.h>
> #include <signal.h>
> #include <sys/signalfd.h>
> #include <sys/epoll.h>
>
> int
> main ()
> {
>   int ep, sfd;
>
>   sigset_t signals;
>   sigemptyset (&signals);
>   sigaddset (&signals, SIGINT);
>   sigaddset (&signals, SIGHUP);
>
>   sigprocmask (SIG_BLOCK, &signals, NULL);
>   sfd = signalfd (-1, &signals, SFD_CLOEXEC);
>
>   ep = epoll_create1 (EPOLL_CLOEXEC);
>
>   struct epoll_event events = { .events = EPOLLIN | EPOLLONESHOT, .data = NULL };
>   epoll_ctl (ep, EPOLL_CTL_ADD, sfd, &events);
>
>   epoll_wait (ep, &events, 1, 123);
>
>   if (fork () == 0)
>     {
>       /* Quoth signalfd(2):
>
> 	 If a process adds (via epoll_ctl(2)) a signalfd file descriptor to an
> 	 epoll(7) instance, then epoll_wait(2) returns events only for signals
> 	 sent to that process.  In particular, if the process then uses fork(2)
> 	 to create a child process, then the child will be able to read(2)
> 	 signals that are sent to it using the signalfd file descriptor, but
> 	 epoll_wait(2) will not indicate that the signalfd file descriptor is
> 	 ready.  */
>
>       printf ("try this: kill -INT %i\n", getpid ());
>       while (1)
> 	{
> 	  struct signalfd_siginfo info;
> 	  if (epoll_wait (ep, &events, 1, 777) > 0)
> 	    {
> 	      read (sfd, &info, sizeof info);
> 	      printf ("got signal %i!\n", info.ssi_signo);
> 	      epoll_ctl (ep, EPOLL_CTL_MOD, sfd, &events);
> 	    }
> 	}
>     }
>
>   return 0;
> }
>
>
> Of course it took me a while to find out about this; I first looked at
> things individually and didn’t expect the mixture to behave
> inconsistently.

Tricky!  Thanks for sharing the result of your investigation; it's
always enlightening!
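
For the archives, here is a minimal sketch of the workaround that the
signalfd(2) excerpt suggests: after fork(2), the child drops the
descriptors it inherited and registers a fresh signalfd.  It is only an
illustration under my own assumptions and not necessarily what the
Shepherd commit does.  The names watch_signals and child_sees_signal are
hypothetical, and creating a new epoll instance in the child (rather than
reusing the inherited one, which is shared with the parent) is my choice,
not something the man page requires.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/epoll.h>
#include <sys/signalfd.h>
#include <sys/wait.h>

/* Hypothetical helper (not from Shepherd): create a signalfd for SIGNALS
   and register it in a fresh epoll instance.  Return the epoll file
   descriptor, or -1 on error; the signalfd is stored in *SFD.  */
static int
watch_signals (const sigset_t *signals, int *sfd)
{
  int ep = epoll_create1 (EPOLL_CLOEXEC);
  if (ep < 0)
    return -1;

  *sfd = signalfd (-1, signals, SFD_CLOEXEC);
  if (*sfd < 0)
    {
      close (ep);
      return -1;
    }

  struct epoll_event event = { .events = EPOLLIN, .data = { .fd = *sfd } };
  if (epoll_ctl (ep, EPOLL_CTL_ADD, *sfd, &event) < 0)
    {
      close (*sfd);
      close (ep);
      return -1;
    }

  return ep;
}

/* Fork a child that sends itself SIGUSR1 and waits up to TIMEOUT_MS for
   epoll_wait to report it.  When RECREATE is nonzero, the child first
   replaces the inherited descriptors with fresh ones.  Return 1 if the
   child's epoll_wait reported the signal, 0 if it timed out, and -1 on
   error.  */
int
child_sees_signal (int recreate, int timeout_ms)
{
  assert (timeout_ms >= 0);	/* a negative timeout could block forever */

  sigset_t signals;
  sigemptyset (&signals);
  sigaddset (&signals, SIGUSR1);
  if (sigprocmask (SIG_BLOCK, &signals, NULL) < 0)
    return -1;

  int sfd;
  int ep = watch_signals (&signals, &sfd);
  if (ep < 0)
    return -1;

  pid_t pid = fork ();
  if (pid == 0)
    {
      if (recreate)
	{
	  /* Drop what was inherited and register anew in this process.  */
	  close (sfd);
	  close (ep);
	  ep = watch_signals (&signals, &sfd);
	  if (ep < 0)
	    _exit (2);
	}

      kill (getpid (), SIGUSR1);

      struct epoll_event event;
      int ready = epoll_wait (ep, &event, 1, timeout_ms);
      _exit (ready > 0 ? 0 : (ready == 0 ? 1 : 2));
    }

  /* Parent (or failed fork): its own copies are no longer needed.  */
  close (sfd);
  close (ep);
  if (pid < 0)
    return -1;

  int status;
  if (waitpid (pid, &status, 0) < 0 || !WIFEXITED (status))
    return -1;

  switch (WEXITSTATUS (status))
    {
    case 0:
      return 1;
    case 1:
      return 0;
    default:
      return -1;
    }
}
```

Calling child_sees_signal (1, 1000) exercises the workaround.  Going by
the man page excerpt, child_sees_signal (0, 1000) would be expected to
time out instead, though I would not rely on that outside of Linux.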

> Maxim, let me know if it works for you!

Better than ever!  Thanks a lot for fixing the various issues reported
here.

I'm closing this one!

-- 
Thanks,
Maxim




