From mboxrd@z Thu Jan 1 00:00:00 1970
From: Maxim Cournoyer
To: Ludovic Courtès
Cc: Josselin Poiret, 57922@debbugs.gnu.org
Subject: bug#57922: Shepherd doesn't seem to correctly handle waitpid itself
Date: Sun, 25 Sep 2022 20:12:09 -0400
Message-ID: <87leq76lk6.fsf@gmail.com>
In-Reply-To: <87zgeo68hc.fsf@gnu.org> (Ludovic Courtès's message of "Sat, 24 Sep 2022 18:30:07 +0200")
References: <874jx4q953.fsf@gmail.com> <87o7va33iq.fsf@jpoiret.xyz> <87bkr6fvlz.fsf@gnu.org> <878rm98n17.fsf@gmail.com> <87sfkh8a8z.fsf@jpoiret.xyz> <87zgeo68hc.fsf@gnu.org>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.1 (gnu/linux)
List-Id: Bug reports for GNU Guix
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8

Hi,

Ludovic Courtès writes:

> Hi,
>
> Josselin Poiret wrote:
>
>> Maxim Cournoyer writes:
>>
>>> This leads me to believe that Shepherd does not block until the process
>>> is actually dead to mark the process as stopped (it just waitpid on the
>>> group pid with WNOHANG), which means it won't block if the child process
>>> hasn't exited yet, if I'm correct.
>
> Correct: the service is marked as stopped as soon as ‘stop’ returns.
>
>>> When we are in the stop slot, we know for sure that the process should
>>> terminate completely, hence it'd make sense to call 'waitpid' *without*
>>> WNOHANG there, to avoid 'herd restart' from starting the service while
>>> its stopped process is not done terminating.
>>>
>>> jamid can take quite some time to terminate cleanly because of the
>>> networking threads in the opendht library that needs to be finalized,
>>> which is probably the reason this problem can be observed here.
>>>
>>> Thoughts?
>>
>> I agree with you, make-kill-destructor should waitpid the processes it's
>> killing.  There shouldn't be any issues waitpid'ing before the
>> shepherd's signal handler, since stop actions are run with asyncs
>> disabled.  The signal handler will run once but won't get anything
>> because all the processes were already waitpid'd and it uses WNOHANG.
>
> I think we need an extra “stopping” state for services.  In general,
> we’ll want to send SIGTERM, wait for some grace period or dead process
> notification, then send SIGKILL, and finally change state to “stopped”.
>
> This is not possible in 0.9 but is something I’d like to have in 0.10¹.

This sounds good.  Let's keep this ticket open until this goodness
lands, as a reminder.

Thank you!

Maxim
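
For illustration only, here is a minimal Python sketch of the stop
sequence discussed above: send SIGTERM, poll with WNOHANG during a grace
period, fall back to SIGKILL, and only then report "stopped".  This is
not Shepherd code (Shepherd is written in Guile Scheme); the function
name `stop_process`, its `grace` and `poll` parameters, and the state
strings are assumptions made up for this example.

```python
import os
import signal
import time


def stop_process(pid, grace=5.0, poll=0.05):
    """Stop the child PID and return the states it went through.

    Hypothetical helper: it only illustrates the "stopping" state idea,
    it is not how Shepherd implements 'stop'.
    """
    states = ["running", "stopping"]
    os.kill(pid, signal.SIGTERM)

    deadline = time.monotonic() + grace
    while time.monotonic() < deadline:
        # With WNOHANG, waitpid returns (0, 0) right away while the
        # child is still alive, so this loop never blocks on it.
        reaped, _status = os.waitpid(pid, os.WNOHANG)
        if reaped == pid:
            states.append("stopped")
            return states
        time.sleep(poll)

    # Grace period expired: force termination, then call waitpid
    # *without* WNOHANG so we block until the child is really gone.
    os.kill(pid, signal.SIGKILL)
    os.waitpid(pid, 0)
    states.append("stopped")
    return states
```

Called on a freshly forked child, `stop_process(pid, grace=2.0)` returns
only once the child has been reaped, so a subsequent restart could not
race with a process that is still terminating.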