From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: 63982-done@debbugs.gnu.org
Subject: bug#63982: Shepherd can crash when a user service fails to start
Date: Tue, 18 Jul 2023 21:11:42 -0400
Message-ID: <87sf9kln75.fsf@gmail.com>
In-Reply-To: <87ttu910q7.fsf@gnu.org> ("Ludovic Courtès"'s message of "Wed, 12 Jul 2023 19:46:56 +0200")

Hey Ludo!

Ludovic Courtès <ludo@gnu.org> writes:

> Hi!
>
> Ludovic Courtès <ludo@gnu.org> skribis:
>
>> Turns out that this happens when calling the ‘daemonize’ action on
>> ‘root’. I have a reproducer now and am investigating…
>
> Good news: this is fixed in Shepherd commit
> f4272d2f0f393d2aa3e9d76b36ab6aa5f2fc72c2!
>
> The root cause is inconsistent semantics when mixing epoll, signalfd,
> and fork, specifically this part from signalfd(2):
>
> epoll(7) semantics
> If a process adds (via epoll_ctl(2)) a signalfd file descriptor to an
> epoll(7) instance, then epoll_wait(2) returns events only for signals
> sent to that process. In particular, if the process then uses fork(2)
> to create a child process, then the child will be able to read(2)
> signals that are sent to it using the signalfd file descriptor, but
> epoll_wait(2) will not indicate that the signalfd file descriptor is
> ready. In this scenario, a possible workaround is that after the
> fork(2), the child process can close the signalfd file descriptor that
> it inherited from the parent process and then create another signalfd
> file descriptor and add it to the epoll instance. […]
>
> The C program below illustrates this behavior:
>
> #include <stdlib.h>
> #include <stdio.h>
> #include <unistd.h>
> #include <sys/signal.h>
> #include <sys/signalfd.h>
> #include <sys/epoll.h>
>
> int
> main ()
> {
>   int ep, sfd;
>
>   sigset_t signals;
>   sigemptyset (&signals);
>   sigaddset (&signals, SIGINT);
>   sigaddset (&signals, SIGHUP);
>
>   sigprocmask (SIG_BLOCK, &signals, NULL);
>   sfd = signalfd (-1, &signals, SFD_CLOEXEC);
>
>   ep = epoll_create1 (EPOLL_CLOEXEC);
>
>   struct epoll_event events = { .events = EPOLLIN | EPOLLONESHOT, .data = NULL };
>   epoll_ctl (ep, EPOLL_CTL_ADD, sfd, &events);
>
>   epoll_wait (ep, &events, 1, 123);
>
>   if (fork () == 0)
>     {
>       /* Quoth signalfd(2):
>
>          If a process adds (via epoll_ctl(2)) a signalfd file descriptor to an
>          epoll(7) instance, then epoll_wait(2) returns events only for signals
>          sent to that process.  In particular, if the process then uses fork(2)
>          to create a child process, then the child will be able to read(2)
>          signals that are sent to it using the signalfd file descriptor, but
>          epoll_wait(2) will not indicate that the signalfd file descriptor is
>          ready.  */
>
>       printf ("try this: kill -INT %i\n", getpid ());
>       while (1)
>         {
>           struct signalfd_siginfo info;
>           if (epoll_wait (ep, &events, 1, 777) > 0)
>             {
>               read (sfd, &info, sizeof info);
>               printf ("got signal %i!\n", info.ssi_signo);
>               epoll_ctl (ep, EPOLL_CTL_MOD, sfd, &events);
>             }
>         }
>     }
>
>   return 0;
> }
>
>
> Of course it took me a while to find out about this; I first looked at
> things individually and didn’t expect the mixture to behave
> inconsistently.
Tricky! Thanks for sharing the result of your investigation, it's
always enlightening!

> Maxim, let me know if it works for you!

Better than ever!  Thanks a lot for fixing the various issues reported
here.

I'm closing this one!

--
Thanks,
Maxim