Subject: bug#40981: Graphical installer tests sometimes hang.
From: Ludovic Courtès
To: Mathieu Othacehe
Date: Mon, 11 May 2020 23:09:05 +0200
In-Reply-To: <87eersm409.fsf@gnu.org> (Mathieu Othacehe's message of "Sun, 10 May 2020 13:19:18 +0200")
Message-ID: <87o8qudvri.fsf@gnu.org>
Cc: 40981-done@debbugs.gnu.org

Hello,

Mathieu Othacehe writes:

>>> The work-around here is to save the installed SIGTERM handler and reset
>>> it. Then, after forking, the parent can restore the SIGTERM handler. The
>>> child will use the default SIGTERM handler that terminates the process.
>>
>> OK, makes sense.
>> (Another occasion to see how something like
>> ‘posix_spawn’ would be more appropriate than fork + exec…)
>
> Didn't know about that function, but it seems way easier to manipulate
> and less error prone indeed!

Make sure to read “A fork() in the Road” on that topic:

  https://lwn.net/Articles/785430/

>>> +# Try to trigger eventual race conditions, when killing a process between fork
>>> +# and execv calls.
>>> +for i in {1..50}
>>> +do
>>> +    $herd restart test3
>>> +done
>>
>> Would it reproduce the problem well enough?
>
> On a slow machine one time out of two, and on a faster machine,
> never. The 'reproducer' I used, was to add a 'sleep' between fork and
> exec, it works way better!
>
> Tell me if you think its better to drop it.

It’s better than nothing, let’s keep it.

> From 79d3603bf15b8f815136178be8c8a236734a7596 Mon Sep 17 00:00:00 2001
> From: Mathieu Othacehe
> Date: Thu, 7 May 2020 18:39:41 +0200
> Subject: [PATCH] service: Fix 'make-kill-destructor' when PGID is zero.
>
> When a process is forked, and before its GID is changed in "exec-command",
> it will share the parent GID, which is 0 for Shepherd. In that case, use
> the PID instead of the PGID.
>
> Also make sure to reset the SIGTERM handler before forking a process. Failing
> to do so, will result in stopping Shepherd if a SIGTERM is received between
> fork and execv calls. Restore the SIGTERM handler once the process has been
> forked.
>
> * modules/shepherd/service.scm (fork+exec-command): Save the current SIGTERM
> handler and reset it before forking. Then, restore it on the parent after
> forking.
> (make-kill-destructor): Handle the case when PGID is zero, between the process
> fork and exec.

I added a “Fixes” line and pushed it.

Thanks a lot!  I can roll a 0.8.1 release soonish (I’d like to add
signalfd support while at it, we’ll see.)

Ludo’.
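
For readers following the thread, the save/reset/restore work-around
discussed above can be sketched outside of Guile.  This is only an
illustration in Python under stated assumptions, not the Shepherd code:
the function name fork_exec_with_default_sigterm is hypothetical, and
the real change lives in fork+exec-command in
modules/shepherd/service.scm.

```python
import os
import signal


def fork_exec_with_default_sigterm(argv):
    """Fork and exec ARGV without leaking the parent's SIGTERM handler.

    Hypothetical helper, for illustration only.
    """
    # Save the installed handler and switch to the default disposition,
    # so that a SIGTERM arriving between fork and exec simply terminates
    # the child instead of running the parent's handler in it.
    saved = signal.signal(signal.SIGTERM, signal.SIG_DFL)

    pid = os.fork()
    if pid == 0:
        # Child: still has the default SIGTERM disposition here.
        try:
            os.execv(argv[0], argv)
        finally:
            # Only reached when execv failed.
            os._exit(127)

    # Parent: restore the handler that was installed before the fork.
    # signal.signal returns None when the previous handler was not
    # installed from Python; there is nothing to restore in that case.
    if saved is not None:
        signal.signal(signal.SIGTERM, saved)
    return pid
```

Note that the parent also runs with the default SIGTERM disposition for
the short window between the reset and the restore, which is one more
reason a ‘posix_spawn’-style interface looks attractive here.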