all messages for Guix-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: Mathieu Othacehe <othacehe@gnu.org>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: 40981@debbugs.gnu.org
Subject: bug#40981: Graphical installer tests sometimes hang.
Date: Sun, 10 May 2020 13:19:18 +0200	[thread overview]
Message-ID: <87eersm409.fsf@gnu.org> (raw)
In-Reply-To: <87tv0o9j33.fsf@gnu.org> ("Ludovic \=\?utf-8\?Q\?Court\=C3\=A8s\=22'\?\= \=\?utf-8\?Q\?s\?\= message of "Sun, 10 May 2020 12:32:00 +0200")

[-- Attachment #1: Type: text/plain, Size: 1692 bytes --]


Hey Ludo,

> Woow, good catch!

Thanks :)

>> The work-around here is to save the installed SIGTERM handler and reset
>> it. Then, after forking, the parent can restore the SIGTERM handler. The
>> child will use the default SIGTERM handler that terminates the process.
>
> OK, makes sense.  (Another occasion to see how something like
> ‘posix_spawn’ would be more appropriate than fork + exec…)

Didn't know about that function, but it seems way easier to manipulate
and less error prone indeed!

> Shouldn’t it be:
>
>   (let ((pgid (getpgid pid)))
>     (if (= (getpgid 0) pgid)
>         (kill pid signal)  ;don't kill ourself
>         (kill (-p pgid) signal)))

Yes, I changed it.

> Note: Use the most specific comparison procedure, ‘=’ in this case,
> because we know we’re dealing with numbers (it enables proper type
> checking, better compiler optimizations, etc.).  More importantly, ‘eq?’
> performs pointer comparison, so it shouldn’t be used with numbers (in
> practice it works with “fixnums” but not with bignums).

Ok, noted!

>> +# Try to trigger eventual race conditions, when killing a process between fork
>> +# and execv calls.
>> +for i in {1..50}
>> +do
>> +    $herd restart test3
>> +done
>
> Would it reproduce the problem well enough?

On a slow machine one time out of two, and on a faster machine,
never. The 'reproducer' I used, was to add a 'sleep' between fork and
exec, it works way better!

Tell me if you think its better to drop it.

> Could you send an updated patch?

Here it is!

> Thanks for the bug hunting and for the patch!

Thanks for the fast review :)

Mathieu

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-service-Fix-make-kill-destructor-when-PGID-is-zero.patch --]
[-- Type: text/x-diff, Size: 4901 bytes --]

From 79d3603bf15b8f815136178be8c8a236734a7596 Mon Sep 17 00:00:00 2001
From: Mathieu Othacehe <m.othacehe@gmail.com>
Date: Thu, 7 May 2020 18:39:41 +0200
Subject: [PATCH] service: Fix 'make-kill-destructor' when PGID is zero.

When a process is forked, and before its GID is changed in "exec-command",
it will share the parent GID, which is 0 for Shepherd. In that case, use
the PID instead of the PGID.

Also make sure to reset the SIGTERM handler before forking a process. Failing
to do so, will result in stopping Shepherd if a SIGTERM is received between
fork and execv calls. Restore the SIGTERM handler once the process has been
forked.

* modules/shepherd/service.scm (fork+exec-command): Save the current SIGTERM
handler and reset it before forking. Then, restore it on the parent after
forking.
(make-kill-destructor): Handle the case when PGID is zero, between the process
fork and exec.
---
 modules/shepherd/service.scm | 33 ++++++++++++++++++++++++++-------
 tests/forking-service.sh     | 18 ++++++++++++++++++
 2 files changed, 44 insertions(+), 7 deletions(-)

diff --git a/modules/shepherd/service.scm b/modules/shepherd/service.scm
index 8604d2f..45fcf32 100644
--- a/modules/shepherd/service.scm
+++ b/modules/shepherd/service.scm
@@ -5,6 +5,7 @@
 ;; Copyright (C) 2016 Alex Kost <alezost@gmail.com>
 ;; Copyright (C) 2018 Carlo Zancanaro <carlo@zancanaro.id.au>
 ;; Copyright (C) 2019 Ricardo Wurmus <rekado@elephly.net>
+;; Copyright (C) 2020 Mathieu Othacehe <m.othacehe@gmail.com>
 ;;
 ;; This file is part of the GNU Shepherd.
 ;;
@@ -884,7 +885,18 @@ its PID."
   (unless %sigchld-handler-installed?
     (sigaction SIGCHLD handle-SIGCHLD SA_NOCLDSTOP)
     (set! %sigchld-handler-installed? #t))
-  (let ((pid (primitive-fork)))
+
+  ;; When forking a process, the signal handlers are inherited, until it
+  ;; forks. If SIGTERM is received by the forked process, before it calls
+  ;; execv, the installed SIGTERM handler, stopping Shepherd will be called.
+  ;; To avoid this, save the SIGTERM handler, disable it, and restore it once,
+  ;; the process has been forked. This way, the forked process will use the
+  ;; default SIGTERM handler stopping the process.
+  (let ((term-handler (match (sigaction SIGTERM)
+                        ((proc . _)
+                         proc)))
+        (pid (and (sigaction SIGTERM SIG_DFL)
+                  (primitive-fork))))
     (if (zero? pid)
         (exec-command command
                       #:user user
@@ -893,7 +905,10 @@ its PID."
                       #:directory directory
                       #:file-creation-mask file-creation-mask
                       #:environment-variables environment-variables)
-        pid)))
+        (begin
+          ;; Restore the initial SIGTERM handler.
+          (sigaction SIGTERM term-handler)
+          pid))))
 
 (define* (make-forkexec-constructor command
                                     #:key
@@ -957,11 +972,15 @@ start."
   "Return a procedure that sends SIGNAL to the process group of the PID given
 as argument, where SIGNAL defaults to `SIGTERM'."
   (lambda (pid . args)
-    ;; Kill the whole process group PID belongs to.  Don't assume that PID
-    ;; is a process group ID: that's not the case when using #:pid-file,
-    ;; where the process group ID is the PID of the process that
-    ;; "daemonized".
-    (kill (- (getpgid pid)) signal)
+    ;; Kill the whole process group PID belongs to.  Don't assume that PID is
+    ;; a process group ID: that's not the case when using #:pid-file, where
+    ;; the process group ID is the PID of the process that "daemonized".  If
+    ;; this procedure is called, between the process fork and exec, the PGID
+    ;; will still be zero (the Shepherd PGID). In that case, use the PID.
+    (let ((pgid (getpgid pid)))
+      (if (= (getpgid 0) pgid)
+          (kill pid signal) ;don't kill ourself
+          (kill (- pgid) signal)))
     #f))
 
 ;; Produce a constructor that executes a command.
diff --git a/tests/forking-service.sh b/tests/forking-service.sh
index 7126c3d..bd9aac9 100644
--- a/tests/forking-service.sh
+++ b/tests/forking-service.sh
@@ -71,6 +71,17 @@ cat > "$conf"<<EOF
                                       #:pid-file "$PWD/$service2_pid")
    #:stop  (make-kill-destructor)
    #:respawn? #t))
+
+(define %command3
+  '("$SHELL" "-c" "sleep 600"))
+
+(register-services
+ (make <service>
+   ;; A service that forks into a different process.
+   #:provides '(test3)
+   #:start (make-forkexec-constructor %command3)
+   #:stop  (make-kill-destructor)
+   #:respawn? #t))
 EOF
 cat $conf
 
@@ -113,3 +124,10 @@ sleep 1;
 $herd status test2 | grep started
 test "`cat $PWD/$service2_started`" = "started
 started"
+
+# Try to trigger eventual race conditions, when killing a process between fork
+# and execv calls.
+for i in `seq 1 50`
+do
+    $herd restart test3
+done
-- 
2.26.2


  reply	other threads:[~2020-05-10 11:20 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-30 11:51 bug#40981: Graphical installer tests sometimes hang Mathieu Othacehe
2020-05-04 11:42 ` Mathieu Othacehe
2020-05-04 12:50   ` Mathieu Othacehe
2020-05-05 10:00     ` Ludovic Courtès
2020-05-05 10:22       ` Mathieu Othacehe
2020-05-06 10:02         ` Mathieu Othacehe
2020-05-07 16:48           ` Mathieu Othacehe
2020-05-10 10:32             ` Ludovic Courtès
2020-05-10 11:19               ` Mathieu Othacehe [this message]
2020-05-11 21:09                 ` Ludovic Courtès

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87eersm409.fsf@gnu.org \
    --to=othacehe@gnu.org \
    --cc=40981@debbugs.gnu.org \
    --cc=ludo@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/guix.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.