From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: Josselin Poiret <dev@jpoiret.xyz>, 57922@debbugs.gnu.org
Subject: bug#57922: Shepherd doesn't seem to correctly handle waitpid itself
Date: Sun, 25 Sep 2022 20:12:09 -0400
Message-ID: <87leq76lk6.fsf@gmail.com>
In-Reply-To: <87zgeo68hc.fsf@gnu.org> ("Ludovic Courtès"'s message of "Sat, 24 Sep 2022 18:30:07 +0200")

Hi,

Ludovic Courtès <ludo@gnu.org> writes:

> Hi,
>
> Josselin Poiret <dev@jpoiret.xyz> skribis:
>
>> Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:
>>
>>> This leads me to believe that Shepherd does not block until the process
>>> is actually dead before marking it as stopped (it just calls waitpid on
>>> the group pid with WNOHANG), which means it won't block if the child
>>> process hasn't exited yet, if I'm correct.
>
> Correct: the service is marked as stopped as soon as ‘stop’ returns.
>
>>> When we are in the stop slot, we know for sure that the process should
>>> terminate completely, hence it'd make sense to call 'waitpid' *without*
>>> WNOHANG there, to avoid 'herd restart' from starting the service while
>>> its stopped process is not done terminating.
>>>
>>> jamid can take quite some time to terminate cleanly because of the
>>> networking threads in the opendht library that need to be finalized,
>>> which is probably the reason this problem can be observed here.
>>>
>>> Thoughts?
>>
>> I agree with you, make-kill-destructor should waitpid the processes it's
>> killing. There shouldn't be any issues waitpid'ing before the
>> shepherd's signal handler, since stop actions are run with asyncs
>> disabled. The signal handler will run once but won't get anything
>> because all the processes were already waitpid'd and it uses WNOHANG.
>
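
As an aside, the difference between the two waitpid modes discussed
above can be reproduced outside of Shepherd.  This is only a rough
Python sketch of the POSIX behaviour, not Shepherd's code: the
function name 'demo_waitpid' and the half-second cleanup delay are
made up for illustration.

```python
import os
import signal
import time


def demo_waitpid(cleanup_seconds=0.5):
    """Fork a child that exits slowly on SIGTERM; compare WNOHANG with a blocking wait."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: hypothetical stand-in for a daemon with slow cleanup.
        try:
            os.close(read_fd)

            def on_term(signum, frame):
                time.sleep(cleanup_seconds)  # pretend to finalize threads
                os._exit(0)

            signal.signal(signal.SIGTERM, on_term)
            os.write(write_fd, b"x")  # tell the parent the handler is installed
            while True:
                signal.pause()
        finally:
            os._exit(1)  # only reached if something above raised

    # Parent: wait for the child's handler, then ask it to terminate.
    os.close(write_fd)
    os.read(read_fd, 1)
    os.close(read_fd)
    os.kill(pid, signal.SIGTERM)

    # Non-blocking: the child is still cleaning up, so nothing is reaped yet.
    nohang_pid, _ = os.waitpid(pid, os.WNOHANG)

    # Blocking: returns only once the child has actually exited.
    reaped_pid, status = os.waitpid(pid, 0)
    exited_cleanly = os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
    return nohang_pid, reaped_pid == pid, exited_cleanly
```

The first element of the result is the pid returned by the WNOHANG
call; a value of 0 there means the child was still alive when a
non-blocking caller would already have moved on.
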
> I think we need an extra “stopping” state for services. In general,
> we’ll want to send SIGTERM, wait for some grace period or dead process
> notification, then send SIGKILL, and finally change state to “stopped”.
>
> This is not possible in 0.9 but is something I’d like to have in 0.10¹.

This sounds good.  Let's keep this ticket open until this goodness
lands, as a reminder.
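
For the record, the sequence described above (SIGTERM, grace period
or dead-process notification, SIGKILL, then "stopped") could look
roughly like the following.  It is a Python sketch under my own
assumptions rather than a proposal for the Shepherd API: the names
'spawn_child' and 'stop_with_grace' and the list of state strings are
hypothetical.

```python
import signal
import subprocess
import sys


def spawn_child(ignore_term):
    """Start a hypothetical test daemon; optionally make it ignore SIGTERM."""
    code = "import signal, time\n"
    if ignore_term:
        code += "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
    code += "print('ready', flush=True)\ntime.sleep(60)\n"
    proc = subprocess.Popen([sys.executable, "-c", code], stdout=subprocess.PIPE)
    proc.stdout.readline()  # wait until the child has set up its signal handling
    return proc


def stop_with_grace(proc, grace_seconds):
    """Send SIGTERM, wait up to grace_seconds, then SIGKILL; return the states visited."""
    states = ["stopping"]
    proc.send_signal(signal.SIGTERM)
    try:
        # Blocks until the process is dead or the grace period expires.
        proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()   # SIGKILL cannot be caught or ignored
        proc.wait()   # reap the process before declaring it stopped
        states.append("killed")
    if proc.stdout is not None:
        proc.stdout.close()
    # Only now is it safe for a restart to spawn a new instance.
    states.append("stopped")
    return states
```

The point of the intermediate "stopping" state is that a restart
cannot start a new instance until the old process has been reaped,
whether it exited by itself or had to be killed.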

Thank you!

Maxim

Thread overview: 8+ messages
2022-09-19 4:29 bug#57922: Shepherd doesn't seem to correctly handle waitpid itself Maxim Cournoyer
2022-09-20 7:31 ` Josselin Poiret via Bug reports for GNU Guix
2022-09-23 6:33 ` Ludovic Courtès
2022-09-23 17:49 ` Maxim Cournoyer
2022-09-24 3:32 ` Maxim Cournoyer
2022-09-24 8:09 ` Josselin Poiret via Bug reports for GNU Guix
2022-09-24 16:30 ` Ludovic Courtès
2022-09-26 0:12 ` Maxim Cournoyer [this message]