From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mp2 ([2001:41d0:2:4a6f::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by ms11 with LMTPS id qMx+Kjc8tF6oLAAA0tVLHw (envelope-from ) for ; Thu, 07 May 2020 16:49:59 +0000 Received: from aspmx1.migadu.com ([2001:41d0:2:4a6f::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by mp2 with LMTPS id UB74OUM8tF69bwAAB5/wlQ (envelope-from ) for ; Thu, 07 May 2020 16:50:11 +0000 Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:470:142::17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by aspmx1.migadu.com (Postfix) with ESMTPS id 06762940C01 for ; Thu, 7 May 2020 16:50:08 +0000 (UTC) Received: from localhost ([::1]:51748 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jWjiq-0000LZ-IZ for larch@yhetil.org; Thu, 07 May 2020 12:50:08 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48066) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jWjik-0000LR-KE for bug-guix@gnu.org; Thu, 07 May 2020 12:50:02 -0400 Received: from debbugs.gnu.org ([209.51.188.43]:60495) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jWjik-00067W-B5 for bug-guix@gnu.org; Thu, 07 May 2020 12:50:02 -0400 Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1jWjik-00053O-3a for bug-guix@gnu.org; Thu, 07 May 2020 12:50:02 -0400 X-Loop: help-debbugs@gnu.org Subject: bug#40981: Graphical installer tests sometimes hang. Resent-From: Mathieu Othacehe Original-Sender: "Debbugs-submit" Resent-CC: bug-guix@gnu.org Resent-Date: Thu, 07 May 2020 16:50:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40981 X-GNU-PR-Package: guix X-GNU-PR-Keywords: To: Ludovic =?UTF-8?Q?Court=C3=A8s?= Received: via spool by 40981-submit@debbugs.gnu.org id=B40981.158887014819323 (code B ref 40981); Thu, 07 May 2020 16:50:02 +0000 Received: (at 40981) by debbugs.gnu.org; 7 May 2020 16:49:08 +0000 Received: from localhost ([127.0.0.1]:43808 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jWjhs-00051Z-8T for submit@debbugs.gnu.org; Thu, 07 May 2020 12:49:08 -0400 Received: from mail-wm1-f52.google.com ([209.85.128.52]:34173) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jWjhp-00050x-UB for 40981@debbugs.gnu.org; Thu, 07 May 2020 12:49:06 -0400 Received: by mail-wm1-f52.google.com with SMTP id v4so7897641wme.1 for <40981@debbugs.gnu.org>; Thu, 07 May 2020 09:49:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:references:date:in-reply-to:message-id :user-agent:mime-version; bh=PshOTfyj+6gs4MA76m4xaFK4WmtzYkTbVvGnSCzzIDE=; b=UnAYtjvzvEiblRz3WqoZblsXCBxlHIqjaZotqNo58EcvGzIH3eImaRXt/Dui/HpVv9 22GtafdD7Bz9ydhzbXHBvcvG2se3mnXdB7R7Mf7BZlgXkyuuIcoW9BwE5VOSQsiMyUjQ Iyj1cQWpJsxBXwTwzRxj+vS/K3UXqdz2sKBlQJaWQs3uUInoN+1+p7bcm4z5waBT+Q6q dnTl1qbHlbgqnWJGVE0R0hBrbIRBSR08i3FuE2lTJWISjEt1klvXaemdRTBdWmTMspFw WZdWzqbjolYGPjsdgrS753bgg/t9hteJNJkk1nu6smpwvVhBg9tLvx/kEvb7IeVsPWF5 777Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:references:date:in-reply-to :message-id:user-agent:mime-version; bh=PshOTfyj+6gs4MA76m4xaFK4WmtzYkTbVvGnSCzzIDE=; b=KMrwhdWr+0ZOqHz4HS3hKOD61GXSJa3yISNBDIu220eCD9ckTIEZlVFx54Cd3K9dNC 0qpIkcG0hVVlJeW7MFppg3W/Xwa4hdCc7o2lS+VK8T2O/65wDsdRaKX8tr+2Fs1ShSRT HOC+/p7dwLcrRF1A6smHXj/w1XjW2wGCYUT/XorCmkUQpFh1OewL6ILMo0Fq7roavn9i 5bvBnjZt+R3EvPPtrk8y3FYOzVVqUJvR6ntp4lGKw+icdWCE4SG6eev8zeSn+h9Y2+X1 7SW1fu/7dgKUmt3sHFwz5Gh0e4JKqxeTCnhR5Ar9VRP1WbWv9YpxUiPUJxrsyKh1IHO0 kMJg== X-Gm-Message-State: AGi0PuZBqqlbJYRNKfLxjHxjhx9vpnCRbdeeSqKqfVEO5VuzpQzVO5eh 8mul5QpF8a6NgaNckseTULSNcGdN X-Google-Smtp-Source: APiQypLEskq+s0R/Ed4exxRHUwgEubua5Vfqz/FLnnD9CsD09XP3V4kSO8mQ/kixOnhizNzzhTdrnw== X-Received: by 2002:a05:600c:114d:: with SMTP id z13mr11404810wmz.54.1588870136653; Thu, 07 May 2020 09:48:56 -0700 (PDT) Received: from meru ([2a01:cb18:832e:5f00:a588:8943:3d1c:a34c]) by smtp.gmail.com with ESMTPSA id b23sm6540953wmb.26.2020.05.07.09.48.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 07 May 2020 09:48:55 -0700 (PDT) From: Mathieu Othacehe References: <87o8r9w5re.fsf@gmail.com> <87a72ox6ws.fsf@gmail.com> <87zhanx3rw.fsf@gmail.com> <87a72mn1l2.fsf@gnu.org> <875zdael5s.fsf@gmail.com> <87v9l9bcu0.fsf@gmail.com> Date: Thu, 07 May 2020 18:48:54 +0200 In-Reply-To: <87v9l9bcu0.fsf@gmail.com> (Mathieu Othacehe's message of "Wed, 06 May 2020 12:02:47 +0200") Message-ID: <87mu6jelmx.fsf@gmail.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" X-Spam-Score: 0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-Spam-Score: -1.0 (-) X-BeenThere: bug-guix@gnu.org List-Id: Bug reports for GNU Guix List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: 40981@debbugs.gnu.org Errors-To: bug-guix-bounces+larch=yhetil.org@gnu.org Sender: "bug-Guix" X-Scanner: scn0 X-Spam-Score: 0.09 Authentication-Results: aspmx1.migadu.com; dkim=fail (rsa verify failed) header.d=gmail.com header.s=20161025 header.b=UnAYtjvz; dmarc=fail reason="SPF not aligned (relaxed)" header.from=gmail.com (policy=none); spf=pass (aspmx1.migadu.com: domain of bug-guix-bounces@gnu.org designates 2001:470:142::17 as permitted sender) smtp.mailfrom=bug-guix-bounces@gnu.org X-Scan-Result: default: False [0.09 / 13.00]; RCVD_VIA_SMTP_AUTH(0.00)[]; GENERIC_REPUTATION(0.00)[-0.49827311596173]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ip6:2001:470:142::/48:c]; R_DKIM_REJECT(1.00)[gmail.com:s=20161025]; DWL_DNSWL_BLOCKED(0.00)[2001:470:142::17:from]; FREEMAIL_FROM(0.00)[gmail.com]; IP_REPUTATION_HAM(0.00)[asn: 22989(0.11), country: US(-0.00), ip: 2001:470:142::17(-0.50)]; DKIM_TRACE(0.00)[gmail.com:-]; RCPT_COUNT_TWO(0.00)[2]; MX_GOOD(-0.50)[cached: eggs.gnu.org]; MAILLIST(-0.20)[mailman]; FORGED_RECIPIENTS_MAILLIST(0.00)[]; RCVD_IN_DNSWL_FAIL(0.00)[2001:470:142::17:server fail]; MIME_TRACE(0.00)[0:+,1:+,2:+]; RCVD_TLS_LAST(0.00)[]; ASN(0.00)[asn:22989, ipnet:2001:470:142::/48, country:US]; TAGGED_FROM(0.00)[larch=yhetil.org]; FROM_NEQ_ENVFROM(0.00)[mothacehe@gmail.com,bug-guix-bounces@gnu.org]; ARC_NA(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; FROM_HAS_DN(0.00)[]; URIBL_BLOCKED(0.00)[elephly.net:email,zancanaro.id.au:email]; MIME_GOOD(-0.10)[multipart/mixed,text/plain,text/x-diff]; PREVIOUSLY_DELIVERED(0.00)[40981@debbugs.gnu.org]; HAS_LIST_UNSUB(-0.01)[]; RCVD_COUNT_SEVEN(0.00)[9]; FORGED_SENDER_MAILLIST(0.00)[]; DMARC_POLICY_SOFTFAIL(0.10)[gmail.com : SPF not aligned (relaxed),none] X-TUID: VSVGd/uQOUtU --=-=-= Content-Type: text/plain Hello, The previous patch was not working. The reason is that, when a process is forked and before execv is called, it shares the parent signal handler. So when sending a SIGTERM to the child process, we are stopping root-service, if the signal is received before the child has forked. The work-around here is to save the installed SIGTERM handler and reset it. Then, after forking, the parent can restore the SIGTERM handler. The child will use the default SIGTERM handler that terminates the process. WDYT? Thanks, Mathieu --=-=-= Content-Type: text/x-diff Content-Disposition: inline; filename=0001-service-Fix-make-kill-destructor-when-PGID-is-zero.patch >From aa6f67068f1fdd622673ec0223f05fd8f8a96baa Mon Sep 17 00:00:00 2001 From: Mathieu Othacehe Date: Thu, 7 May 2020 18:39:41 +0200 Subject: [PATCH] service: Fix 'make-kill-destructor' when PGID is zero. When a process is forked, and before its GID is changed in "exec-command", it will share the parent GID, which is 0 for Shepherd. In that case, use the PID instead of the PGID. Also make sure to reset the SIGTERM handler before forking a process. Failing to do so, will result in stopping Shepherd if a SIGTERM is received between fork and execv calls. Restore the SIGTERM handler once the process has been forked. * modules/shepherd/service.scm (fork+exec-command): Save the current SIGTERM handler and reset it before forking. Then, restore it on the parent after forking. (make-kill-destructor): Handle the case when PGID is zero, between the process fork and exec. --- modules/shepherd/service.scm | 36 +++++++++++++++++++++++++++++------- tests/forking-service.sh | 18 ++++++++++++++++++ 2 files changed, 47 insertions(+), 7 deletions(-) diff --git a/modules/shepherd/service.scm b/modules/shepherd/service.scm index 8604d2f..c68d7e0 100644 --- a/modules/shepherd/service.scm +++ b/modules/shepherd/service.scm @@ -5,6 +5,7 @@ ;; Copyright (C) 2016 Alex Kost ;; Copyright (C) 2018 Carlo Zancanaro ;; Copyright (C) 2019 Ricardo Wurmus +;; Copyright (C) 2020 Mathieu Othacehe ;; ;; This file is part of the GNU Shepherd. ;; @@ -884,7 +885,18 @@ its PID." (unless %sigchld-handler-installed? (sigaction SIGCHLD handle-SIGCHLD SA_NOCLDSTOP) (set! %sigchld-handler-installed? #t)) - (let ((pid (primitive-fork))) + + ;; When forking a process, the signal handlers are inherited, until it + ;; forks. If SIGTERM is received by the forked process, before it calls + ;; execv, the installed SIGTERM handler, stopping Shepherd will be called. + ;; To avoid this, save the SIGTERM handler, disable it, and restore it once, + ;; the process has been forked. This way, the forked process will use the + ;; default SIGTERM handler stopping the process. + (let ((term-handler (match (sigaction SIGTERM) + ((proc . _) + proc))) + (pid (and (sigaction SIGTERM SIG_DFL) + (primitive-fork)))) (if (zero? pid) (exec-command command #:user user @@ -893,7 +905,10 @@ its PID." #:directory directory #:file-creation-mask file-creation-mask #:environment-variables environment-variables) - pid))) + (begin + ;; Restore the initial SIGTERM handler. + (sigaction SIGTERM term-handler) + pid)))) (define* (make-forkexec-constructor command #:key @@ -957,11 +972,18 @@ start." "Return a procedure that sends SIGNAL to the process group of the PID given as argument, where SIGNAL defaults to `SIGTERM'." (lambda (pid . args) - ;; Kill the whole process group PID belongs to. Don't assume that PID - ;; is a process group ID: that's not the case when using #:pid-file, - ;; where the process group ID is the PID of the process that - ;; "daemonized". - (kill (- (getpgid pid)) signal) + ;; Kill the whole process group PID belongs to. Don't assume that PID is + ;; a process group ID: that's not the case when using #:pid-file, where + ;; the process group ID is the PID of the process that "daemonized". If + ;; this procedure is called, between the process fork and exec, the PGID + ;; will still be zero (the Shepherd PGID). In that case, use the PID. + (let ((current-pgid (getpgid 0)) + (pgid (getpgid pid))) + (if (eq? pgid current-pgid) + (begin + (kill pid signal)) + (begin + (kill (- pgid) signal)))) #f)) ;; Produce a constructor that executes a command. diff --git a/tests/forking-service.sh b/tests/forking-service.sh index 7126c3d..47b09a2 100644 --- a/tests/forking-service.sh +++ b/tests/forking-service.sh @@ -71,6 +71,17 @@ cat > "$conf"< + ;; A service that forks into a different process. + #:provides '(test3) + #:start (make-forkexec-constructor %command3) + #:stop (make-kill-destructor) + #:respawn? #t)) EOF cat $conf @@ -113,3 +124,10 @@ sleep 1; $herd status test2 | grep started test "`cat $PWD/$service2_started`" = "started started" + +# Try to trigger eventual race conditions, when killing a process between fork +# and execv calls. +for i in {1..50} +do + $herd restart test3 +done -- 2.26.0 --=-=-=--