From: ludo@gnu.org (Ludovic Courtès)
Subject: bug#30299: [core-updates] shepherd fails tests on all systems except x86_64
Date: Sat, 17 Feb 2018 01:04:00 +0100
Message-ID: <87inawlbwv.fsf@gnu.org>
In-Reply-To: <87a7wam53e.fsf@netris.org> (Mark H. Weaver's message of "Thu, 15 Feb 2018 14:21:25 -0500")
To: Mark H Weaver
Cc: 30299@debbugs.gnu.org

Hello,

Mark H Weaver wrote:

> However, on armhf-linux, three tests failed: respawn.sh,
> respawn-throttling.sh, and pid-file.sh.
>
>   https://hydra.gnu.org/build/2499835

(Similar issue on aarch64: . Though of course it passed on the 2nd and
3rd attempts…)

I was able to reproduce a tests/respawn.sh failure on hardware (ARMv7).
The issue is that a service is not respawned, and the log shows:

--8<---------------cut here---------------start------------->8---
+ assert_killed_service_is_respawned t-service2-pid-695
++ cat t-service2-pid-695
+ old_pid=789
+ rm t-service2-pid-695
+ kill 789
+ wait_for_file t-service2-pid-695
+ i=0
+ test -f t-service2-pid-695
+ test 0 -lt 20
+ sleep 0.3
++ expr 0 + 1

[...]

2018-02-16 11:13:31 Service root has been started.
2018-02-16 11:13:32 Service test1 has been started.
2018-02-16 11:13:34 Service test2 has been started.
2018-02-16 11:13:35 Respawning test1.
2018-02-16 11:13:35 Service test1 has been started.
2018-02-16 11:13:36 Respawning test2.
2018-02-16 11:13:37 Service test2 has been started.
2018-02-16 11:13:37 Respawning test1.
2018-02-16 11:13:37 Service test1 has been started.
2018-02-16 11:13:38 Respawning test2.
2018-02-16 11:13:43 Service test2 could not be started.
--8<---------------cut here---------------end--------------->8---

So SIGCHLD was correctly delivered, but somehow restarting that service
didn’t work (its PID file didn’t show up again; the 5 seconds between
“Respawning” and “could not be started” correspond to the delay in
‘read-pid-file’ in (shepherd service)).

These test failures seem to be more frequent when the machine is loaded.

Ludo’.
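
For reference, the ‘wait_for_file’ steps visible in the trace (i=0,
test -f, test 0 -lt 20, sleep 0.3, expr 0 + 1) suggest a polling loop
along the following lines.  This is only a sketch reconstructed from the
trace, not the actual helper from tests/respawn.sh: the loop structure
and the use of "$1" for the file name are assumptions.

```shell
#!/bin/sh
# Hypothetical reconstruction of a polling helper like the
# 'wait_for_file' seen in the trace; not the code from tests/respawn.sh.
wait_for_file ()
{
    i=0
    # Poll until the file exists, at most 20 times, 0.3 s apart
    # (both values are taken from the trace).
    while ! test -f "$1" && test "$i" -lt 20
    do
        sleep 0.3
        i=$(expr "$i" + 1)
    done
    # Report success only if the file eventually showed up.
    test -f "$1"
}
```

With these values the loop sleeps for at most 20 × 0.3 s = 6 s before
giving up, slightly longer than the 5-second delay mentioned above.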