From: Mathieu Othacehe <othacehe@gnu.org>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: 40981@debbugs.gnu.org
Subject: bug#40981: Graphical installer tests sometimes hang.
Date: Sun, 10 May 2020 13:19:18 +0200 [thread overview]
Message-ID: <87eersm409.fsf@gnu.org> (raw)
In-Reply-To: <87tv0o9j33.fsf@gnu.org> ("Ludovic \=\?utf-8\?Q\?Court\=C3\=A8s\=22'\?\= \=\?utf-8\?Q\?s\?\= message of "Sun, 10 May 2020 12:32:00 +0200")
[-- Attachment #1: Type: text/plain, Size: 1692 bytes --]
Hey Ludo,
> Woow, good catch!
Thanks :)
>> The work-around here is to save the installed SIGTERM handler and reset
>> it. Then, after forking, the parent can restore the SIGTERM handler. The
>> child will use the default SIGTERM handler that terminates the process.
>
> OK, makes sense. (Another occasion to see how something like
> ‘posix_spawn’ would be more appropriate than fork + exec…)
Didn't know about that function, but it seems way easier to manipulate
and less error prone indeed!
> Shouldn’t it be:
>
> (let ((pgid (getpgid pid)))
> (if (= (getpgid 0) pgid)
> (kill pid signal) ;don't kill ourself
> (kill (-p pgid) signal)))
Yes, I changed it.
> Note: Use the most specific comparison procedure, ‘=’ in this case,
> because we know we’re dealing with numbers (it enables proper type
> checking, better compiler optimizations, etc.). More importantly, ‘eq?’
> performs pointer comparison, so it shouldn’t be used with numbers (in
> practice it works with “fixnums” but not with bignums).
Ok, noted!
>> +# Try to trigger eventual race conditions, when killing a process between fork
>> +# and execv calls.
>> +for i in {1..50}
>> +do
>> + $herd restart test3
>> +done
>
> Would it reproduce the problem well enough?
On a slow machine one time out of two, and on a faster machine,
never. The 'reproducer' I used, was to add a 'sleep' between fork and
exec, it works way better!
Tell me if you think its better to drop it.
> Could you send an updated patch?
Here it is!
> Thanks for the bug hunting and for the patch!
Thanks for the fast review :)
Mathieu
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-service-Fix-make-kill-destructor-when-PGID-is-zero.patch --]
[-- Type: text/x-diff, Size: 4901 bytes --]
From 79d3603bf15b8f815136178be8c8a236734a7596 Mon Sep 17 00:00:00 2001
From: Mathieu Othacehe <m.othacehe@gmail.com>
Date: Thu, 7 May 2020 18:39:41 +0200
Subject: [PATCH] service: Fix 'make-kill-destructor' when PGID is zero.
When a process is forked, and before its GID is changed in "exec-command",
it will share the parent GID, which is 0 for Shepherd. In that case, use
the PID instead of the PGID.
Also make sure to reset the SIGTERM handler before forking a process. Failing
to do so, will result in stopping Shepherd if a SIGTERM is received between
fork and execv calls. Restore the SIGTERM handler once the process has been
forked.
* modules/shepherd/service.scm (fork+exec-command): Save the current SIGTERM
handler and reset it before forking. Then, restore it on the parent after
forking.
(make-kill-destructor): Handle the case when PGID is zero, between the process
fork and exec.
---
modules/shepherd/service.scm | 33 ++++++++++++++++++++++++++-------
tests/forking-service.sh | 18 ++++++++++++++++++
2 files changed, 44 insertions(+), 7 deletions(-)
diff --git a/modules/shepherd/service.scm b/modules/shepherd/service.scm
index 8604d2f..45fcf32 100644
--- a/modules/shepherd/service.scm
+++ b/modules/shepherd/service.scm
@@ -5,6 +5,7 @@
;; Copyright (C) 2016 Alex Kost <alezost@gmail.com>
;; Copyright (C) 2018 Carlo Zancanaro <carlo@zancanaro.id.au>
;; Copyright (C) 2019 Ricardo Wurmus <rekado@elephly.net>
+;; Copyright (C) 2020 Mathieu Othacehe <m.othacehe@gmail.com>
;;
;; This file is part of the GNU Shepherd.
;;
@@ -884,7 +885,18 @@ its PID."
(unless %sigchld-handler-installed?
(sigaction SIGCHLD handle-SIGCHLD SA_NOCLDSTOP)
(set! %sigchld-handler-installed? #t))
- (let ((pid (primitive-fork)))
+
+ ;; When forking a process, the signal handlers are inherited, until it
+ ;; forks. If SIGTERM is received by the forked process, before it calls
+ ;; execv, the installed SIGTERM handler, stopping Shepherd will be called.
+ ;; To avoid this, save the SIGTERM handler, disable it, and restore it once,
+ ;; the process has been forked. This way, the forked process will use the
+ ;; default SIGTERM handler stopping the process.
+ (let ((term-handler (match (sigaction SIGTERM)
+ ((proc . _)
+ proc)))
+ (pid (and (sigaction SIGTERM SIG_DFL)
+ (primitive-fork))))
(if (zero? pid)
(exec-command command
#:user user
@@ -893,7 +905,10 @@ its PID."
#:directory directory
#:file-creation-mask file-creation-mask
#:environment-variables environment-variables)
- pid)))
+ (begin
+ ;; Restore the initial SIGTERM handler.
+ (sigaction SIGTERM term-handler)
+ pid))))
(define* (make-forkexec-constructor command
#:key
@@ -957,11 +972,15 @@ start."
"Return a procedure that sends SIGNAL to the process group of the PID given
as argument, where SIGNAL defaults to `SIGTERM'."
(lambda (pid . args)
- ;; Kill the whole process group PID belongs to. Don't assume that PID
- ;; is a process group ID: that's not the case when using #:pid-file,
- ;; where the process group ID is the PID of the process that
- ;; "daemonized".
- (kill (- (getpgid pid)) signal)
+ ;; Kill the whole process group PID belongs to. Don't assume that PID is
+ ;; a process group ID: that's not the case when using #:pid-file, where
+ ;; the process group ID is the PID of the process that "daemonized". If
+ ;; this procedure is called, between the process fork and exec, the PGID
+ ;; will still be zero (the Shepherd PGID). In that case, use the PID.
+ (let ((pgid (getpgid pid)))
+ (if (= (getpgid 0) pgid)
+ (kill pid signal) ;don't kill ourself
+ (kill (- pgid) signal)))
#f))
;; Produce a constructor that executes a command.
diff --git a/tests/forking-service.sh b/tests/forking-service.sh
index 7126c3d..bd9aac9 100644
--- a/tests/forking-service.sh
+++ b/tests/forking-service.sh
@@ -71,6 +71,17 @@ cat > "$conf"<<EOF
#:pid-file "$PWD/$service2_pid")
#:stop (make-kill-destructor)
#:respawn? #t))
+
+(define %command3
+ '("$SHELL" "-c" "sleep 600"))
+
+(register-services
+ (make <service>
+ ;; A service that forks into a different process.
+ #:provides '(test3)
+ #:start (make-forkexec-constructor %command3)
+ #:stop (make-kill-destructor)
+ #:respawn? #t))
EOF
cat $conf
@@ -113,3 +124,10 @@ sleep 1;
$herd status test2 | grep started
test "`cat $PWD/$service2_started`" = "started
started"
+
+# Try to trigger eventual race conditions, when killing a process between fork
+# and execv calls.
+for i in `seq 1 50`
+do
+ $herd restart test3
+done
--
2.26.2
next prev parent reply other threads:[~2020-05-10 11:20 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-04-30 11:51 bug#40981: Graphical installer tests sometimes hang Mathieu Othacehe
2020-05-04 11:42 ` Mathieu Othacehe
2020-05-04 12:50 ` Mathieu Othacehe
2020-05-05 10:00 ` Ludovic Courtès
2020-05-05 10:22 ` Mathieu Othacehe
2020-05-06 10:02 ` Mathieu Othacehe
2020-05-07 16:48 ` Mathieu Othacehe
2020-05-10 10:32 ` Ludovic Courtès
2020-05-10 11:19 ` Mathieu Othacehe [this message]
2020-05-11 21:09 ` Ludovic Courtès
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87eersm409.fsf@gnu.org \
--to=othacehe@gnu.org \
--cc=40981@debbugs.gnu.org \
--cc=ludo@gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this external index
https://git.savannah.gnu.org/cgit/guix.git
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.