Subject: [bug#30948] [PATCH core-updates] guix: Reap finished child processes in build containers.
From: ludo@gnu.org (Ludovic Courtès)
References: <87muyvulwt.fsf@zancanaro.id.au>
Date: Thu, 29 Mar 2018 22:07:05 +0200
In-Reply-To: <87muyvulwt.fsf@zancanaro.id.au> (Carlo Zancanaro's message of "Mon, 26 Mar 2018 22:16:34 +1100")
Message-ID: <87bmf6ve6u.fsf@gnu.org>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
To: Carlo Zancanaro
Cc: 30948@debbugs.gnu.org

--=-=-=
Content-Type: text/plain; charset=utf-8

Hi Carlo,

Carlo Zancanaro wrote:

> When working on the Shepherd, I found that in the build containers
> processes don't get reaped by pid 1. See
> https://debbugs.gnu.org/cgi/bugreport.cgi?bug=30637#29. This caused
> (and will cause) the Shepherd's tests to fail on some systems.
>
> Our guile-builder script should handle SIGCHLD and then use waitpid to
> reap the child processes. Here's my attempt at a patch to do that.

I would rather install the handler as a phase in gnu-build-system: this
leaves ‘build-expression->derivation’ generic, and also gives us more
flexibility (e.g., we can disable that phase without doing a full
rebuild if needed). See the patch below. WDYT?

My first attempt, with:

  ./pre-inst-env guix build -e '(@@ (gnu packages commencement) findutils-boot0)'

quickly failed:

--8<---------------cut here---------------start------------->8---
checking for vfork.h... no
checking for fork... yes
checking for vfork... yes
checking for working fork... Backtrace:
In ice-9/boot-9.scm:
yes
checking for working vfork... (cached) yes
checking for strcasecmp...  157: 13 [catch #t # ...]
In unknown file:
   ?: 12 [apply-smob/1 #]
In ice-9/boot-9.scm:
  63: 11 [call-with-prompt prompt0 ...]
In ice-9/eval.scm:
 432: 10 [eval # #]
In ice-9/boot-9.scm:
2320: 9 [save-module-excursion #]
3966: 8 [#]
1645: 7 [%start-stack load-stack #]
1650: 6 [#]
In unknown file:
   ?: 5 [primitive-load "/gnu/store/pz3jy89ax5jg0j6fnp5n42x4vznga8s3-make-boot0-4.2.1-guile-builder"]
In ice-9/eval.scm:
 387: 4 [eval # ()]
In srfi/srfi-1.scm:
 619: 3 [for-each # ...]
In /gnu/store/hf8xflikhgsd4hfy9h8s0cjzfqm8f3yb-module-import/guix/build/gnu-build-system.scm:
 819: 2 [# #]
In /gnu/store/hf8xflikhgsd4hfy9h8s0cjzfqm8f3yb-module-import/guix/build/utils.scm:
 614: 1 [invoke "/gnu/store/g34swjqyw205d15pyra39j56qvyxq9w9-bootstrap-binaries-0/bin/bash" ...]
In unknown file:
   ?: 0 [system* "/gnu/store/g34swjqyw205d15pyra39j56qvyxq9w9-bootstrap-binaries-0/bin/bash" ...]

ERROR: In procedure system*:
ERROR: In procedure system*: Interrupted system call
builder for `/gnu/store/hc96d5dcshbdgavpp0j01qnsjf0yf9z5-make-boot0-4.2.1.drv' failed with exit code 1
--8<---------------cut here---------------end--------------->8---

This is why ‘install-SIGCHLD-handler’ in the patch does nothing on
Guile <= 2.0.9.

Now, we’d need to test it for real with Guile 2.2.
I suppose one way to test without rebuilding it all would be to add this
phase explicitly in a package and try building it with --rounds=10 or
something.

Would you like to try that?

Note that we have only a couple of days left before the ‘core-updates’
freeze.

Thanks,
Ludo’.

--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline

diff --git a/guix/build/gnu-build-system.scm b/guix/build/gnu-build-system.scm
index be5ad78b9..2c6cb4ad2 100644
--- a/guix/build/gnu-build-system.scm
+++ b/guix/build/gnu-build-system.scm
@@ -51,6 +51,28 @@
    (define time-monotonic time-tai))
   (else #t))
 
+(define* (install-SIGCHLD-handler #:rest _)
+  "Handle SIGCHLD signals.  Since this code is usually running as PID 1 in the
+build daemon, it has to reap dead processes, hence this procedure."
+  ;; In Guile <= 2.0.9, syscalls could throw EINTR.  With these versions,
+  ;; installing a SIGCHLD handler is not safe because we could have uncaught
+  ;; 'system-error' exceptions at any time.
+  (when (or (not (string=? (effective-version) "2.0"))
+            (> (string->number (micro-version)) 9))
+    (format #t "installing SIGCHLD handler in PID ~a\n" (getpid))
+    (sigaction SIGCHLD
+      (lambda _
+        (let loop ()
+          (match (catch 'system-error
+                   (lambda ()
+                     (waitpid WAIT_ANY WNOHANG))
+                   (lambda args
+                     '(0 . -)))
+            ((0 . _) #f)
+            ((pid . _) (loop)))))
+      SA_NOCLDSTOP))
+  #t)
+
 (define* (set-SOURCE-DATE-EPOCH #:rest _)
   "Set the 'SOURCE_DATE_EPOCH' environment variable.  This is used by tools
 that incorporate timestamps as a way to tell them to use a fixed timestamp.
@@ -758,7 +780,8 @@ which cannot be found~%"
   ;; Standard build phases, as a list of symbol/procedure pairs.
   (let-syntax ((phases (syntax-rules ()
                          ((_ p ...) `((p . ,p) ...)))))
-    (phases set-SOURCE-DATE-EPOCH set-paths install-locale unpack
+    (phases install-SIGCHLD-handler
+            set-SOURCE-DATE-EPOCH set-paths install-locale unpack
             bootstrap
             patch-usr-bin-file
             patch-source-shebangs configure patch-generated-file-shebangs

--=-=-=--
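
A minimal sketch of the test suggested above (adding the phase explicitly
to one package rather than to the standard phases), assuming GNU Hello as
the guinea pig. The file name and the package name are made up for
illustration. The handler is a simplified copy of the one in the patch: it
omits the Guile version check, on the assumption that ‘hello’ is not built
with the bootstrap Guile, and it uses a plain ‘let’ instead of ‘match’ so
that no extra builder-side module is needed.

```scheme
;; sigchld-test.scm (hypothetical file name)
;; Sketch only: add the SIGCHLD-handling phase to a single package instead
;; of changing %standard-phases, which would trigger a full rebuild.
(use-modules (guix packages)
             (gnu packages base))       ;provides 'hello'

(package
  (inherit hello)
  (name "hello-sigchld-test")           ;made-up name for this experiment
  (arguments
   '(#:phases
     (modify-phases %standard-phases
       (add-before 'set-paths 'install-SIGCHLD-handler
         (lambda _
           (sigaction SIGCHLD
             (lambda _
               ;; Reap every child that has already terminated.
               (let loop ()
                 (let ((pid (catch 'system-error
                              (lambda ()
                                (car (waitpid WAIT_ANY WNOHANG)))
                              ;; Any 'system-error' (typically ECHILD,
                              ;; i.e. no children left): stop looping.
                              (lambda args 0))))
                   (unless (zero? pid)
                     (loop)))))
             SA_NOCLDSTOP)
           #t))))))
```

It could then be built several times in a row with something like
‘./pre-inst-env guix build -f sigchld-test.scm --rounds=10’. Whether the
handler gets along with ‘system*’ and ‘invoke’ is precisely what such a
run would have to show.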