From: Ludovic Courtès
To: Mathieu Othacehe
Cc: 57827-done@debbugs.gnu.org
Subject: bug#57827: Shepherd 0.9.2 possible regressions
Date: Tue, 20 Sep 2022 19:30:53 +0200
Message-ID: <87r106lzqq.fsf@gnu.org>
In-Reply-To: <871qs76f3y.fsf@gnu.org> (Mathieu Othacehe's message of "Mon, 19 Sep 2022 08:41:05 +0200")
References: <87sfksn5yg.fsf@gnu.org> <871qs76f3y.fsf@gnu.org>
List-Id: Bug reports for GNU Guix

Hi,

Mathieu Othacehe wrote:

> Regarding those four, I was able to reproduce the issue this way:
>
> $ guix repl
> (stop-service 'guix-daemon)
> (start-service 'guix-daemon (list (number->string (getpid))))

Or from the shell:

  herd stop guix-daemon
  herd start guix-daemon $$

I was able to reproduce it using a bare-bones.tmpl VM.

> The latter command hangs and Shepherd becomes unresponsive. I collected
> an (attached) strace dump of Shepherd showing that there is no response
> on the socket when the service is started.
>
> Note that, this works:
>
> $ guix repl
> (stop-service 'guix-daemon)
> (start-service 'guix-daemon)
>
> So the problem could be caused by the "container-excursion*" in the
> "fork+exec-command/container" procedure.

PID 1 gets stuck on read(16, …) forever, after reading the string
“2866” (a PID):

--8<---------------cut here---------------start------------->8---
[pid 2865] clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLDstrace: Process 2866 attached
, child_tidptr=0x7fccfbe00a10) = 2866
[pid 2866] set_robust_list(0x7fccfbe00a20, 24) = 0
[pid 2866] close(3) = 0
[pid 2865] write(39, "2866", 4 <unfinished ...>
[pid 2866] close(4 <unfinished ...>
[pid 2865] <... write resumed>) = 4
[pid 2866] <... close resumed>) = 0
[pid 2866] pipe2( <unfinished ...>
[pid 2865] close(39 <unfinished ...>
[pid 2866] <... pipe2 resumed>[3, 4], O_CLOEXEC) = 0
[pid 2865] <... close resumed>) = 0
[pid 2865] exit_group(0) = ?
[pid 2866] rt_sigaction(SIGCHLD, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fccfc304d80}, {sa_handler=0x7fccfc427d50, sa_mask=[], sa_flags=SA_RESTORER|SA_NOCLDSTOP, sa_restorer=0x7fccfc304d80}, 8) = 0
[pid 2866] rt_sigaction(SIGINT, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fccfc304d80}, {sa_handler=0x7fccfc427d50, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fccfc304d80}, 8) = 0
[pid 2866] rt_sigaction(SIGHUP, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fccfc304d80}, {sa_handler=0x7fccfc427d50, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fccfc304d80}, 8) = 0
[pid 2866] rt_sigaction(SIGTERM, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fccfc304d80}, {sa_handler=0x7fccfc427d50, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fccfc304d80}, 8) = 0
[pid 2866] rt_sigprocmask(SIG_UNBLOCK, [HUP INT TERM CHLD], [HUP INT TERM CHLD], 8) = 0
[pid 2866] mkdir("/var", 0777) = -1 EEXIST (File exists)
[pid 2866] mkdir("/var/run", 0777) = -1 EEXIST (File exists)
[pid 2865] +++ exited with 0 +++
[pid 1] <... wait4 resumed>[{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 2865
[pid 1] close(39) = 0
[pid 2866] setsid( <unfinished ...>
[pid 1] read(16, <unfinished ...>
[pid 2866] <... setsid resumed>) = 2866
[pid 1] <... read resumed>"2866", 4096) = 4
[pid 2866] chdir("/") = 0
[pid 1] read(16, <unfinished ...>
[pid 2866] prlimit64(0, RLIMIT_NOFILE, NULL, {rlim_cur=1024, rlim_max=4*1024}) = 0
[pid 2866] close(0) = 0
[pid 2866] openat(AT_FDCWD, "/dev/null", O_RDONLY) = 0
[pid 2866] dup2(0, 0) = 0
[pid 2866] close(1) = 0
[pid 2866] close(2) = 0
[pid 2866] openat(AT_FDCWD, "/var/log/guix-daemon.log", O_WRONLY|O_CREAT|O_APPEND, 0640) = 1
[pid 2866] dup2(1, 1) = 1
[pid 2866] dup2(1, 2) = 2
[pid 2866] execve("/gnu/store/bxnkqnpbf4q4z6245b61wgpm8gkr9nj1-guix-1.3.0-29.9e46320/bin/guix-daemon", ["/gnu/store/bxnkqnpbf4q4z6245b61w"..., "--build-users-group", "guixbuild", "--max-silent-time", "0", "--timeout", "0", "--log-compression", "gzip", "--discover=yes", "--substitute-urls", "https://substitutes.nonguix.org "...], 0x7fccf71fa480 /* 3 vars */) = 0
--8<---------------cut here---------------end--------------->8---

This happens because the other end of the file descriptor happens to be
inherited by 2866, which will never close it because it just execs
guix-daemon.

This is fixed by 6abdcef4a68e98f538ab69fde096adc5f5ca4ff4; the log
contains extra details.

Thanks!
Ludo’.
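
For readers who want to see the mechanism in isolation: a reader only
gets end-of-file on a pipe once every copy of the write end is closed,
so a long-lived child that inherits the write end keeps the reader
blocked. The following is a minimal Python sketch of that behaviour. It
is not Shepherd's code (which is written in Guile) and it is not the
change made in the commit above; the function name eof_arrives, the
second "release" pipe, and the timeout are all assumptions made for
illustration.

```python
import os
import select


def eof_arrives(close_in_child: bool, timeout: float = 1.0) -> bool:
    """Return True if the reader sees EOF on the pipe within `timeout` seconds.

    Hypothetical helper: the parent plays the role of PID 1, the child
    plays the role of the long-running daemon (PID 2866 in the trace).
    """
    read_fd, write_fd = os.pipe()
    # Second pipe, used only so the parent can tell the child to exit.
    release_r, release_w = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: inherited copies of both ends of both pipes.
        os.close(read_fd)
        os.close(release_w)
        os.write(write_fd, str(os.getpid()).encode())
        if close_in_child:
            # Dropping the inherited write end lets the reader see EOF.
            os.close(write_fd)
        # Stay alive, like a daemon, until the parent closes release_w.
        os.read(release_r, 1)
        os._exit(0)

    # Parent: close its own copy of the write end, as PID 1 does with fd 39.
    os.close(write_fd)
    os.close(release_r)

    # First read returns the PID string, like read(16, ...) = 4 in the trace.
    data = os.read(read_fd, 4096)
    assert data.decode() == str(pid)

    # Second read: wait a bounded time for EOF instead of blocking forever.
    ready, _, _ = select.select([read_fd], [], [], timeout)
    saw_eof = bool(ready) and os.read(read_fd, 4096) == b""

    # Let the child exit and clean up.
    os.close(release_w)
    os.waitpid(pid, 0)
    os.close(read_fd)
    return saw_eof


if __name__ == "__main__":
    print("child keeps write end open:", eof_arrives(close_in_child=False))
    print("child closes write end:   ", eof_arrives(close_in_child=True))
```

With the write end left open in the child, the second read never
completes, which matches the final read(16, …) in the trace; the sketch
only avoids hanging because it uses select() with a timeout.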