From: Clément Lassieur
Subject: bug#26931: GuixSD rebooting fails when tmux is running
Date: Mon, 28 Aug 2017 10:30:37 +0200
Message-ID: <87efrwt7pu.fsf@lassieur.org>
In-reply-to: <87mv6kgkyr.fsf@gnu.org>
References: <20170514193043.GA5396@jasmine> <87lgpzqfxq.fsf@gnu.org> <20170516231839.GA22678@jasmine> <87mv6kgkyr.fsf@gnu.org>
To: Ludovic Courtès
Cc: 26931-done@debbugs.gnu.org

Ludovic Courtès writes:

> Hi,
>
> Leo Famulari skribis:
>
>> On Sun, May 14, 2017 at 11:36:17PM +0200, Ludovic Courtès wrote:
>>> What does /var/log/shepherd.log show around the time where you hit
>>> “halt”?
>>>
>>> I get something like this:
>>>
>>> --8<---------------cut here---------------start------------->8---
>>> 18:06:26 Service mcron has been stopped.
>>> 18:06:26 sending all processes the TERM signal
>>
>> For me, this is where it gets stuck:
>>
>> ------
>> 2017-05-16 19:12:53 sending all processes the TERM signal
>> 2017-05-16 19:12:58 waiting for process termination (processes left: (1 494))
>> 2017-05-16 19:13:00 waiting for process termination (processes left: (1 494))
>> 2017-05-16 19:13:02 waiting for process termination (processes left: (1 494))
>> ------
>>
>> In my experience, it will wait here forever.
>>
>> And from `ps aux`:
>>
>> leo 494 0.0 0.1 27232 3676 ? Ss 19:12 0:00 tmux
>
> The bug was 100% reproducible in a VM, and AFAICS it is fixed by
> 7f090203d5fb033eb1b64778b03afad5bb35f5f2.
>
> The problem was that the tmux server process would be left as a zombie,
> and then the loop would always see it because the parent process of the
> tmux server process is PID1 and for some reason the PID1 either didn’t
> get SIGCHLD or the handler didn’t run.
>
> The test that this commit adds does exactly the same thing: launch tmux
> and then invoke “halt”. I tried to create a synthetic test not
> involving tmux, simply creating a process that gets PID1 as its parent,
> but it wouldn’t trigger the bug. I’m unclear as to why tmux triggers it
> and not that other simple test.

FYI I have the exact same problem with guix-publish: I have to do 'herd
stop guix-publish', otherwise I can't reboot. I'll report a bug soon.

I would be interested to know if someone else reproduces it.
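
For anyone trying to reproduce this, the zombie explanation quoted above can be illustrated outside of GuixSD. The following is only a minimal Python sketch, not the Shepherd's actual code: `pid_exists` and `demo_zombie` are hypothetical names, and I am assuming the shutdown loop decides a process is "left" in a way comparable to an existence check on the PID. It shows that a child which has exited but has not been reaped by its parent still counts as existing, and disappears only once the parent calls waitpid.

```python
import os
import time


def pid_exists(pid: int) -> bool:
    """Return True while the kernel still has a process-table entry for pid.

    Hypothetical helper: signal 0 delivers nothing and only checks existence.
    A zombie still has an entry, so this returns True for it.
    """
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def demo_zombie():
    """Fork a child that exits at once, and check it before and after reaping."""
    pid = os.fork()
    if pid == 0:
        # Child: exit immediately without running any cleanup handlers.
        os._exit(0)

    # Parent: give the child time to exit. It is now a zombie, because
    # nobody has waited for it yet.
    time.sleep(0.2)
    seen_before_reap = pid_exists(pid)

    # Reaping the child is what a SIGCHLD handler in the parent would
    # normally do; it removes the process-table entry.
    os.waitpid(pid, 0)
    seen_after_reap = pid_exists(pid)
    return seen_before_reap, seen_after_reap


if __name__ == "__main__":
    print(demo_zombie())
```

On Linux I would expect the PID to be visible before the waitpid call and gone after it, which would match a loop that keeps reporting the same PID for as long as its parent never reaps it.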