From: Ludovic Courtès
To: 63736@debbugs.gnu.org
Subject: bug#63736: [Shepherd] Loss of SIGCHLD notifications
Date: Sat, 27 May 2023 19:00:55 +0200
Message-ID: <87wn0thfp4.fsf@gnu.org>
In-Reply-To: <87353jb88e.fsf@inria.fr>
References: <87353jb88e.fsf@inria.fr>

Ludovic Courtès wrote:

> Long story short: there seems to be a problem with signal delivery.
> Most likely, the initial grace period expiration above when stopping
> nscd is a symptom of shepherd no longer receiving/processing SIGCHLD
> rather than the cause.

Another possibility is a lockup: one of the relevant fibers is either
gone or stuck in ‘put-message’ or ‘get-message’.

I did two things:

  b9a37f3 shepherd: Make signal handling fiber an essential task.
  8ae2780 service: Do not attempt to restart transient services.

Commit 8ae2780 fixes a bug whereby ‘herd restart’ could end up
attempting to restart a transient service. That would lock up the
calling fiber: the service’s controlling fiber would first receive the
'terminate message and return, so nobody would be reading any further
messages sent on its channel.

Commit b9a37f3 will allow us to ensure that the signal-handling fiber
never exits (and we’ll get a trace in the log if it tries to).

Ludo’.
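
The lockup described for commit 8ae2780 can be sketched roughly as
follows.  This is only an analogy, not Shepherd code: Python threads and
‘queue.Queue’ stand in for fibers and channels, and the names
‘controller’, ‘send’ and ‘demo’ are made up for the illustration.  In
Fibers, ‘put-message’ itself blocks until a receiver shows up; here the
caller blocks while waiting for a reply instead, and a timeout is added
purely so that the demonstration terminates.

```python
import queue
import threading


def controller(channel):
    """Hypothetical stand-in for a service's controlling fiber."""
    while True:
        request, reply = channel.get()
        if request == "terminate":
            reply.put("terminated")
            return  # the loop ends; nothing reads `channel` any more
        reply.put("handled " + request)


def send(channel, request, timeout=0.2):
    """Send REQUEST and wait for the reply; return None on timeout."""
    reply = queue.Queue()
    channel.put((request, reply))
    try:
        return reply.get(timeout=timeout)
    except queue.Empty:
        return None  # without the timeout the caller would wait forever


def demo():
    channel = queue.Queue()
    worker = threading.Thread(target=controller, args=(channel,), daemon=True)
    worker.start()
    first = send(channel, "terminate")   # the controller replies, then exits
    worker.join(timeout=1)
    second = send(channel, "restart")    # nobody is listening any more
    return first, second, worker.is_alive()


if __name__ == "__main__":
    print(demo())
```

The second request is never answered because the controller has already
returned; that corresponds to the calling fiber being stuck.  Not sending
the restart request in the first place, as the commit title suggests,
avoids the situation.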