From: ludo@gnu.org (Ludovic Courtès)
Subject: bug#26931: GuixSD rebooting fails when tmux is running
Date: Mon, 28 Aug 2017 10:22:52 +0200
Message-ID: <87mv6kgkyr.fsf@gnu.org>
References: <20170514193043.GA5396@jasmine> <87lgpzqfxq.fsf@gnu.org> <20170516231839.GA22678@jasmine>
In-Reply-To: <20170516231839.GA22678@jasmine> (Leo Famulari's message of "Tue, 16 May 2017 19:18:39 -0400")
List-Id: Bug reports for GNU Guix
To: Leo Famulari
Cc: 26931-done@debbugs.gnu.org

Hi,

Leo Famulari writes:

> On Sun, May 14, 2017 at 11:36:17PM +0200, Ludovic Courtès wrote:
>> What does /var/log/shepherd.log show around the time where you hit
>> “halt”?
>>
>> I get something like this:
>>
>> --8<---------------cut here---------------start------------->8---
>> 18:06:26 Service mcron has been stopped.
>> 18:06:26 sending all processes the TERM signal
>
> For me, this is where it gets stuck:
>
> ------
> 2017-05-16 19:12:53 sending all processes the TERM signal
> 2017-05-16 19:12:58 waiting for process termination (processes left: (1 494))
> 2017-05-16 19:13:00 waiting for process termination (processes left: (1 494))
> 2017-05-16 19:13:02 waiting for process termination (processes left: (1 494))
> ------
>
> In my experience, it will wait here forever.
>
> And from `ps aux`:
>
> leo 494 0.0 0.1 27232 3676 ? Ss 19:12 0:00 tmux

The bug was 100% reproducible in a VM, and AFAICS it is fixed by
7f090203d5fb033eb1b64778b03afad5bb35f5f2.

The problem was that the tmux server process would be left as a zombie,
and then the loop would always see it, because the parent process of the
tmux server process is PID 1 and for some reason PID 1 either didn’t
get SIGCHLD or the handler didn’t run.

The test that this commit adds does exactly the same thing: launch tmux
and then invoke “halt”.  I tried to create a synthetic test not
involving tmux, simply creating a process that gets PID 1 as its parent,
but it wouldn’t trigger the bug.  I’m unclear as to why tmux triggers it
and not that other simple test.

Thanks,
Ludo’.
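
For illustration, here is a minimal Python sketch of the kind of synthetic
test described above: a double fork that leaves a process orphaned so that
it gets reparented.  This is not the test added by the commit; the name
spawn_orphan and its lifetime parameter are hypothetical.  It assumes Linux,
where the orphan is adopted by PID 1 or by a nearer "subreaper" process if
one has been set up.

```python
import os
import struct
import time


def spawn_orphan(lifetime=0.2):
    """Double-fork; return (orphan_pid, parent_pid_seen_by_orphan)."""
    read_fd, write_fd = os.pipe()
    middle = os.fork()
    if middle == 0:
        # Everything in this branch runs in a child and must never return
        # into the caller's code, hence the unconditional os._exit().
        try:
            os.close(read_fd)
            middle_pid = os.getpid()
            if os.fork() == 0:
                # Future orphan: wait (up to 5 s) until the intermediate
                # process has exited and we have been reparented.
                deadline = time.monotonic() + 5.0
                while (os.getppid() == middle_pid
                       and time.monotonic() < deadline):
                    time.sleep(0.01)
                report = struct.pack("ii", os.getpid(), os.getppid())
                os.write(write_fd, report)
                # Stay alive briefly, then exit; the adopting process is
                # now responsible for reaping us.
                time.sleep(lifetime)
            # The intermediate process falls through and exits at once.
        finally:
            os._exit(0)
    # Original process: reap the intermediate child, then read the report.
    os.close(write_fd)
    os.waitpid(middle, 0)
    data = os.read(read_fd, struct.calcsize("ii"))
    os.close(read_fd)
    return struct.unpack("ii", data)


if __name__ == "__main__":
    pid, ppid = spawn_orphan()
    print(f"orphan {pid} was adopted by process {ppid}")
```

Once the orphan exits, it stays a zombie until the adopting process calls
wait() on it, which is the step that apparently did not happen for the tmux
server here.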