all messages for Guix-related lists mirrored at yhetil.org
* [bug#30948] [PATCH core-updates] guix: Reap finished child processes in build containers.
@ 2018-03-26 11:16 Carlo Zancanaro
  2018-03-26 23:39 ` Carlo Zancanaro
  2018-03-29 20:07 ` Ludovic Courtès
  0 siblings, 2 replies; 14+ messages in thread
From: Carlo Zancanaro @ 2018-03-26 11:16 UTC
  To: 30948



When working on the Shepherd, I found that in the build containers 
processes don't get reaped by pid 1. See 
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=30637#29. This 
caused (and will cause) the Shepherd's tests to fail on some 
systems.

Our guile-builder script should handle SIGCHLD and then use 
waitpid to reap the child processes. Here's my attempt at a patch 
to do that.

I haven't been able to build anything with it because the computer 
I'm currently on is laughably slow. If someone else can check that 
you can still build with it I'd really appreciate it.


[-- Attachment #1.2: 0001-guix-Reap-finished-child-processes-in-build-containe.patch --]
[-- Type: text/x-patch, Size: 1457 bytes --]

From 7c66818570a139fc4e7b11de34d07c76ebdc6bac Mon Sep 17 00:00:00 2001
From: Carlo Zancanaro <carlo@zancanaro.id.au>
Date: Mon, 26 Mar 2018 22:08:26 +1100
Subject: [PATCH] guix: Reap finished child processes in build containers.

* guix/derivations.scm (build-expression->derivation)[prologue]: Handle SIGCHLD
  and reap child processes when they finish.
---
 guix/derivations.scm | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/guix/derivations.scm b/guix/derivations.scm
index da686e89e..80787e99e 100644
--- a/guix/derivations.scm
+++ b/guix/derivations.scm
@@ -1180,6 +1180,17 @@ ALLOWED-REFERENCES, DISALLOWED-REFERENCES, LOCAL-BUILD?, and SUBSTITUTABLE?."
                            (filter module-form? exp))
                           (_ `(,exp)))
 
+                      ;; The root process in the build container should reap
+                      ;; processes that die, so handle SIGCHLD.
+                      (sigaction SIGCHLD
+                        (lambda ()
+                          (let loop ()
+                            (match (waitpid WAIT_ANY WNOHANG)
+                              ((0 . _) #f)
+                              ((pid . _) (loop))
+                              (_ #f))))
+                        SA_NOCLDSTOP)
+
                       (define %output (getenv "out"))
                       (define %outputs
                         (map (lambda (o)
-- 
2.16.2




* [bug#30948] [PATCH core-updates] guix: Reap finished child processes in build containers.
  2018-03-26 11:16 [bug#30948] [PATCH core-updates] guix: Reap finished child processes in build containers Carlo Zancanaro
@ 2018-03-26 23:39 ` Carlo Zancanaro
  2018-03-29 20:07 ` Ludovic Courtès
  1 sibling, 0 replies; 14+ messages in thread
From: Carlo Zancanaro @ 2018-03-26 23:39 UTC
  To: 30948



Okay, it turns out my previous patch was very wrong. I tried to 
start a build and it broke pretty significantly.

I've attached a new patch that at least starts building. My 
computer takes too long to actually build anything, but I'm 
slightly more confident that my change won't break everything.


[-- Attachment #1.2: 0001-guix-Reap-finished-child-processes-in-build-containe.patch --]
[-- Type: text/x-patch, Size: 1693 bytes --]

From c57b2fe19865afc21fd1fd9a7cad3286b05a9b22 Mon Sep 17 00:00:00 2001
From: Carlo Zancanaro <carlo@zancanaro.id.au>
Date: Mon, 26 Mar 2018 22:08:26 +1100
Subject: [PATCH] guix: Reap finished child processes in build containers.

* guix/derivations.scm (build-expression->derivation)[prologue]: Handle SIGCHLD
  and reap child processes when they finish.
---
 guix/derivations.scm | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/guix/derivations.scm b/guix/derivations.scm
index da686e89e..27ab3e420 100644
--- a/guix/derivations.scm
+++ b/guix/derivations.scm
@@ -1201,6 +1201,21 @@ ALLOWED-REFERENCES, DISALLOWED-REFERENCES, LOCAL-BUILD?, and SUBSTITUTABLE?."
                                           (else drv))))))
                                inputs))
 
+                      ;; The root process in the build container should reap
+                      ;; processes that die, so handle SIGCHLD.
+                      (use-modules (ice-9 match))
+                      (sigaction SIGCHLD
+                        (lambda _
+                          (let loop ()
+                            (match (catch 'system-error
+                                     (lambda ()
+                                       (waitpid WAIT_ANY WNOHANG))
+                                     (lambda args
+                                       '(0 . -)))
+                              ((0 . _) #f)
+                              ((pid . _) (loop)))))
+                        SA_NOCLDSTOP)
+
                       ,@(if (null? modules)
                             '()
                             ;; Remove our own settings.
-- 
2.16.2




* [bug#30948] [PATCH core-updates] guix: Reap finished child processes in build containers.
  2018-03-26 11:16 [bug#30948] [PATCH core-updates] guix: Reap finished child processes in build containers Carlo Zancanaro
  2018-03-26 23:39 ` Carlo Zancanaro
@ 2018-03-29 20:07 ` Ludovic Courtès
  2018-03-29 21:15   ` Carlo Zancanaro
  1 sibling, 1 reply; 14+ messages in thread
From: Ludovic Courtès @ 2018-03-29 20:07 UTC
  To: Carlo Zancanaro; +Cc: 30948


Hi Carlo,

Carlo Zancanaro <carlo@zancanaro.id.au> skribis:

> When working on the Shepherd, I found that in the build containers
> processes don't get reaped by pid 1. See
> https://debbugs.gnu.org/cgi/bugreport.cgi?bug=30637#29. This caused
> (and will cause) the Shepherd's tests to fail on some systems.
>
> Our guile-builder script should handle SIGCHLD and then use waitpid to
> reap the child processes. Here's my attempt at a patch to do that.

I would rather install the handler as a phase in gnu-build-system: this
leaves ‘build-expression->derivation’ generic, and also gives us more
flexibility (e.g., we can disable that phase without doing a full
rebuild if needed.)  See the patch below.

WDYT?

My first attempt, with:

  ./pre-inst-env guix build -e '(@@ (gnu packages commencement) findutils-boot0)'

quickly failed:

--8<---------------cut here---------------start------------->8---
checking for vfork.h... no
checking for fork... yes
checking for vfork... yes
checking for working fork... Backtrace:
In ice-9/boot-9.scm:
yes
checking for working vfork... (cached) yes
checking for strcasecmp...  157: 13 [catch #t #<catch-closure c900a0> ...]
In unknown file:
   ?: 12 [apply-smob/1 #<catch-closure c900a0>]
In ice-9/boot-9.scm:
  63: 11 [call-with-prompt prompt0 ...]
In ice-9/eval.scm:
 432: 10 [eval # #]
In ice-9/boot-9.scm:
2320: 9 [save-module-excursion #<procedure cc1b80 at ice-9/boot-9.scm:3961:3 ()>]
3966: 8 [#<procedure cc1b80 at ice-9/boot-9.scm:3961:3 ()>]
1645: 7 [%start-stack load-stack #<procedure cbd2c0 at ice-9/boot-9.scm:3957:10 ()>]
1650: 6 [#<procedure cc3060 ()>]
In unknown file:
   ?: 5 [primitive-load "/gnu/store/pz3jy89ax5jg0j6fnp5n42x4vznga8s3-make-boot0-4.2.1-guile-builder"]
In ice-9/eval.scm:
 387: 4 [eval # ()]
In srfi/srfi-1.scm:
 619: 3 [for-each #<procedure 1217560 at /gnu/store/hf8xflikhgsd4hfy9h8s0cjzfqm8f3yb-module-import/guix/build/gnu-build-system.scm:815:12 (expr)> ...]
In /gnu/store/hf8xflikhgsd4hfy9h8s0cjzfqm8f3yb-module-import/guix/build/gnu-build-system.scm:
 819: 2 [#<procedure 1217560 at /gnu/store/hf8xflikhgsd4hfy9h8s0cjzfqm8f3yb-module-import/guix/build/gnu-build-system.scm:815:12 (expr)> #]
In /gnu/store/hf8xflikhgsd4hfy9h8s0cjzfqm8f3yb-module-import/guix/build/utils.scm:
 614: 1 [invoke "/gnu/store/g34swjqyw205d15pyra39j56qvyxq9w9-bootstrap-binaries-0/bin/bash" ...]
In unknown file:
   ?: 0 [system* "/gnu/store/g34swjqyw205d15pyra39j56qvyxq9w9-bootstrap-binaries-0/bin/bash" ...]

ERROR: In procedure system*:
ERROR: In procedure system*: Interrupted system call
builder for `/gnu/store/hc96d5dcshbdgavpp0j01qnsjf0yf9z5-make-boot0-4.2.1.drv' failed with exit code 1
--8<---------------cut here---------------end--------------->8---

This is why ‘install-SIGCHLD-handler’ in the patch does nothing on Guile
<= 2.0.9.

Now, we’d need to test it for real with Guile 2.2.  I suppose one way to
test without rebuilding it all would be to add this phase explicitly in
a package and try building it with --rounds=10 or something.  Would you
like to try that?
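
A minimal sketch of such a test, under stated assumptions: the file
below evaluates to a variant of the ‘hello’ package in which the handler
is added as an explicit first phase.  The file name (hello-reaper.scm),
the package name, and the choice of ‘hello’ are hypothetical, and the
sketch assumes a tree where the phase has not already been added to
%standard-phases.  The handler avoids ‘match’ so that no extra
build-side module is needed.

```scheme
;; hello-reaper.scm (hypothetical file name): evaluates to a package.
(use-modules (guix packages)
             (gnu packages base))

(package
  (inherit hello)
  (name "hello-reaper")                 ;hypothetical name
  (arguments
   '(#:phases
     (modify-phases %standard-phases
       ;; Run before the first standard phase so the handler is in
       ;; place for the whole build.
       (add-before 'set-SOURCE-DATE-EPOCH 'install-SIGCHLD-handler
         (lambda _
           (sigaction SIGCHLD
             (lambda _
               ;; Reap every child that has already exited; stop when
               ;; none is left (PID 0) or waitpid throws (e.g. ECHILD).
               (let loop ()
                 (let ((pid (catch 'system-error
                              (lambda ()
                                (car (waitpid WAIT_ANY WNOHANG)))
                              (lambda args 0))))
                   (unless (zero? pid)
                     (loop)))))
             SA_NOCLDSTOP)
           #t))))))
```

It could then be built several times in a row with something like
‘guix build -f hello-reaper.scm --rounds=10’.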

Note that we have only a couple of days left before the ‘core-updates’
freeze.

Thanks,
Ludo’.


[-- Attachment #2: Type: text/x-patch, Size: 1878 bytes --]

diff --git a/guix/build/gnu-build-system.scm b/guix/build/gnu-build-system.scm
index be5ad78b9..2c6cb4ad2 100644
--- a/guix/build/gnu-build-system.scm
+++ b/guix/build/gnu-build-system.scm
@@ -51,6 +51,28 @@
    (define time-monotonic time-tai))
   (else #t))
 
+(define* (install-SIGCHLD-handler #:rest _)
+  "Handle SIGCHLD signals.  Since this code is usually running as PID 1 in the
+build container, it has to reap dead processes, hence this procedure."
+  ;; In Guile <= 2.0.9, syscalls could throw EINTR.  With these versions,
+  ;; installing a SIGCHLD handler is not safe because we could have uncaught
+  ;; 'system-error' exceptions at any time.
+  (when (or (not (string=? (effective-version) "2.0"))
+            (> (string->number (micro-version)) 9))
+    (format #t "installing SIGCHLD handler in PID ~a\n" (getpid))
+    (sigaction SIGCHLD
+      (lambda _
+        (let loop ()
+          (match (catch 'system-error
+                   (lambda ()
+                     (waitpid WAIT_ANY WNOHANG))
+                   (lambda args
+                     '(0 . -)))
+            ((0 . _) #f)
+            ((pid . _) (loop)))))
+      SA_NOCLDSTOP))
+  #t)
+
 (define* (set-SOURCE-DATE-EPOCH #:rest _)
   "Set the 'SOURCE_DATE_EPOCH' environment variable.  This is used by tools
 that incorporate timestamps as a way to tell them to use a fixed timestamp.
@@ -758,7 +780,8 @@ which cannot be found~%"
   ;; Standard build phases, as a list of symbol/procedure pairs.
   (let-syntax ((phases (syntax-rules ()
                          ((_ p ...) `((p . ,p) ...)))))
-    (phases set-SOURCE-DATE-EPOCH set-paths install-locale unpack
+    (phases install-SIGCHLD-handler
+            set-SOURCE-DATE-EPOCH set-paths install-locale unpack
             bootstrap
             patch-usr-bin-file
             patch-source-shebangs configure patch-generated-file-shebangs


* [bug#30948] [PATCH core-updates] guix: Reap finished child processes in build containers.
  2018-03-29 20:07 ` Ludovic Courtès
@ 2018-03-29 21:15   ` Carlo Zancanaro
  2018-03-30  8:16     ` Ludovic Courtès
  0 siblings, 1 reply; 14+ messages in thread
From: Carlo Zancanaro @ 2018-03-29 21:15 UTC
  To: Ludovic Courtès; +Cc: 30948


Hey Ludo,

On Thu, Mar 29 2018, Ludovic Courtès wrote:
> I would rather install the handler as a phase in 
> gnu-build-system: this leaves ‘build-expression->derivation’ 
> generic, and also gives us more flexibility (e.g., we can 
> disable that phase without doing a full rebuild if needed.)  See 
> the patch below.
>
> WDYT?

What do you mean by "generic"? From what I can understand it's one 
of pid 1's responsibilities to reap child processes, so I would
expect this to be set up for every builder, before the builder is 
run. Given it's not specific to the gnu-build-system, I don't 
think it really fits there.

> My first attempt, with:
>
>   ./pre-inst-env guix build -e '(@@ (gnu packages commencement) 
>   findutils-boot0)'
>
> quickly failed:
>
> ...
>
> This is why ‘install-SIGCHLD-handler’ in the patch does nothing 
> on Guile <= 2.0.9.

From what I understand, Guix depends on Guile 2.0.13 or later, so 
I didn't think it needed to work with 2.0.9. From my quick check, 
though, our bootstrap binaries are Guile 2.0.9? I can see how that 
might cause a problem. In what sense does Guix require 2.0.13 (as 
the manual claims) rather than 2.0.9?

> Now, we’d need to test it for real with Guile 2.2.  I suppose 
> one way to
> test without rebuilding it all would be to add this phase 
> explicitly in
> a package and try building it with --rounds=10 or something. 
> Would you
> like to try that?

Yeah, I'll give it a go.

Carlo



* [bug#30948] [PATCH core-updates] guix: Reap finished child processes in build containers.
  2018-03-29 21:15   ` Carlo Zancanaro
@ 2018-03-30  8:16     ` Ludovic Courtès
  2018-03-30 11:17       ` Carlo Zancanaro
  0 siblings, 1 reply; 14+ messages in thread
From: Ludovic Courtès @ 2018-03-30  8:16 UTC
  To: Carlo Zancanaro; +Cc: 30948

Heya,

Carlo Zancanaro <carlo@zancanaro.id.au> skribis:

> On Thu, Mar 29 2018, Ludovic Courtès wrote:
>> I would rather install the handler as a phase in gnu-build-system:
>> this leaves ‘build-expression->derivation’ generic, and also gives
>> us more flexibility (e.g., we can disable that phase without doing a
>> full rebuild if needed.)  See the patch below.
>>
>> WDYT?
>
> What do you mean by "generic"?

I want as little magic as possible around the expression that’s passed
to ‘build-expression->derivation’.

> From what I can understand it's one of pid 1's responsibilities to reap
> child processes, so I would expect this to be set up for every
> builder, before the builder is run.

True, but for derivations it’s also “optional” because eventually
guix-daemon terminates all its child processes.

> Given it's not specific to the gnu-build-system, I don't think it
> really fits there.

Yes, but note that it would be inherited by all the build systems.

>> My first attempt, with:
>>
>>   ./pre-inst-env guix build -e '(@@ (gnu packages commencement)
>> findutils-boot0)'
>>
>> quickly failed:
>>
>> ...
>>
>> This is why ‘install-SIGCHLD-handler’ in the patch does nothing on
>> Guile <= 2.0.9.
>
> From what I understand, Guix depends on Guile 2.0.13 or later, so I
> didn't think it needed to work with 2.0.9. From my quick check,
> though, our bootstrap binaries are Guile 2.0.9?

Exactly.

> I can see how that might cause a problem. In what sense does Guix
> require 2.0.13 (as the manual claims) rather than 2.0.9?

There’s the “host side” (the ‘guix’ commands and related modules), and
there’s the “build side” (code used in the build environment when
building derivations.)

The “build side” is fully specified: ‘guix graph’ shows exactly what
Guile is used where, and you can see with, say:

  guix graph -t derivation \
    -e '(@@ (gnu packages commencement) findutils-boot0)'

that the early derivations run on Guile 2.0.9.

For “host side” code, users can use any Guile >= 2.0.13.

See also
<https://gnu.org/software/guix/manual/html_node/G_002dExpressions.html>.

I hope this clarifies a bit!

Ludo’.


* [bug#30948] [PATCH core-updates] guix: Reap finished child processes in build containers.
  2018-03-30  8:16     ` Ludovic Courtès
@ 2018-03-30 11:17       ` Carlo Zancanaro
  2018-03-30 15:17         ` Ludovic Courtès
  0 siblings, 1 reply; 14+ messages in thread
From: Carlo Zancanaro @ 2018-03-30 11:17 UTC
  To: Ludovic Courtès; +Cc: 30948


Hey,

On Fri, Mar 30 2018, Ludovic Courtès wrote:
>> From what I can understand it's one of pid 1's responsibilities
>> to reap child processes, so I would expect this to be set up 
>> for every builder, before the builder is run.
>
> True, but for derivations it’s also “optional” because 
> eventually guix-daemon terminates all its child processes.

As long as the build process doesn't rely on behaviour that, 
strictly speaking, it should be allowed to rely on. It's not an 
issue of resource leaking, it's an issue of correctness.

>> Given it's not specific to the gnu-build-system, I don't think 
>> it really fits there.
>
> Yes, but note that it would be inherited by all the build 
> systems.

Except for trivial-build-system, which is probably fine. I still 
don't think it fits in a specific build system, given it's a 
behaviour that transcends the specific action happening within the 
container.

Putting it in gnu-build-system will solve the problem in all 
realistic cases, so that's probably fine. It's still subtly 
incorrect, but will only be a problem if something using the 
trivial build system relies on pid 1 to reap a process, or if we 
make a new build system not deriving from gnu-build-system (which 
seems unlikely, but not impossible).

> The “build side” is fully specified: ‘guix graph’ shows exactly 
> what Guile is used where, and you can see with, say:
>
>   guix graph -t derivation \
>     -e '(@@ (gnu packages commencement) findutils-boot0)'
>
> that the early derivations run on Guile 2.0.9.
>
> For “host side” code, users can use any Guile >= 2.0.13.

Yeah, okay. That makes sense. I guess I just expected 2.0.13 to be 
the minimum version throughout.

Carlo



* [bug#30948] [PATCH core-updates] guix: Reap finished child processes in build containers.
  2018-03-30 11:17       ` Carlo Zancanaro
@ 2018-03-30 15:17         ` Ludovic Courtès
  2022-11-24 16:40           ` Maxim Cournoyer
  2022-11-24 16:44           ` bug#30948: " Maxim Cournoyer
  0 siblings, 2 replies; 14+ messages in thread
From: Ludovic Courtès @ 2018-03-30 15:17 UTC
  To: Carlo Zancanaro; +Cc: 30948

Hello,

Carlo Zancanaro <carlo@zancanaro.id.au> skribis:

> On Fri, Mar 30 2018, Ludovic Courtès wrote:
>>> From what I can understand it's one of pid 1's responsibilities to
>>> reap child processes, so I would expect this to be set up for every
>>> builder, before the builder is run.
>>
>> True, but for derivations it’s also “optional” because eventually
>> guix-daemon terminates all its child processes.
>
> As long as the build process doesn't rely on behaviour that, strictly
> speaking, it should be allowed to rely on. It's not an issue of
> resource leaking, it's an issue of correctness.

Right.

>>> Given it's not specific to the gnu-build-system, I don't think it
>>> really fits there.
>>
>> Yes, but note that it would be inherited by all the build systems.
>
> Except for trivial-build-system, which is probably fine. I still don't
> think it fits in a specific build system, given it's a behaviour that
> transcends the specific action happening within the container.
>
> Putting it in gnu-build-system will solve the problem in all realistic
> cases, so that's probably fine. It's still subtly incorrect, but will
> only be a problem if something using the trivial build system relies
> on pid 1 to reap a process, or if we make a new build system not
> deriving from gnu-build-system (which seems unlikely, but not
> impossible).

I agree, every Guile process running as PID 1 should reap processes.

My view is just that this mechanism belongs in “user code”, not in the
low-level mechanisms such as ‘build-expression->derivation’ and
‘gexp->derivation’.  It’s a matter of separation of concerns.

Of course we don’t want to duplicate that code every time, but the way
we should factorize it, IMO, is by putting it in a “normal” module that
people will use.

Putting it in gnu-build-system is an admittedly hacky but easy way to
have it widely shared.

Thanks,
Ludo’.


* [bug#30948] [PATCH core-updates] guix: Reap finished child processes in build containers.
  2018-03-30 15:17         ` Ludovic Courtès
@ 2022-11-24 16:40           ` Maxim Cournoyer
  2022-11-24 16:44           ` bug#30948: " Maxim Cournoyer
  1 sibling, 0 replies; 14+ messages in thread
From: Maxim Cournoyer @ 2022-11-24 16:40 UTC
  To: 30948; +Cc: Carlo Zancanaro, GNU Debbugs, Ludovic Courtès

reassign 30948 guix
thanks
--
Hi,

I'm moving this from 'guix-patches' to 'guix', so that it's more
discoverable as a *bug*.  It still bites us every now and then (grep the
Guix source code for usages of tini to find some occurrences).

Thanks,

Maxim





* bug#30948: [PATCH core-updates] guix: Reap finished child processes in build containers.
  2018-03-30 15:17         ` Ludovic Courtès
  2022-11-24 16:40           ` Maxim Cournoyer
@ 2022-11-24 16:44           ` Maxim Cournoyer
  2022-11-26 15:11             ` Ludovic Courtès
  1 sibling, 1 reply; 14+ messages in thread
From: Maxim Cournoyer @ 2022-11-24 16:44 UTC
  To: Ludovic Courtès; +Cc: 30948, Carlo Zancanaro

Hi,

ludo@gnu.org (Ludovic Courtès) writes:

> Hello,
>
> Carlo Zancanaro <carlo@zancanaro.id.au> skribis:
>
>> On Fri, Mar 30 2018, Ludovic Courtès wrote:
>>>> From what I can understand it's one of pid 1's responsibilities to
>>>> reap child processes, so I would expect this to be set up for every
>>>> builder, before the builder is run.
>>>
>>> True, but for derivations it’s also “optional” because eventually
>>> guix-daemon terminates all its child processes.
>>
>> As long as the build process doesn't rely on behaviour that, strictly
>> speaking, it should be allowed to rely on. It's not an issue of
>> resource leaking, it's an issue of correctness.
>
> Right.
>
>>>> Given it's not specific to the gnu-build-system, I don't think it
>>>> really fits there.

For what it's worth, I agree.  The evaluation container should have the
correct signal handling configured for *any* code about to be evaluated,
not just on demand, if we want to fix this fully in a way that
won't come back to haunt us in some edge case.

>>> Yes, but note that it would be inherited by all the build systems.
>>
>> Except for trivial-build-system, which is probably fine. I still don't
>> think it fits in a specific build system, given it's a behaviour that
>> transcends the specific action happening within the container.
>>
>> Putting it in gnu-build-system will solve the problem in all realistic
>> cases, so that's probably fine. It's still subtly incorrect, but will
>> only be a problem if something using the trivial build system relies
>> on pid 1 to reap a process, or if we make a new build system not
>> deriving from gnu-build-system (which seems unlikely, but not
>> impossible).
>
> I agree, every Guile process running as PID 1 should reap processes.

Agreed too.

> My view is just that this mechanism belongs in “user code”, not in the
> low-level mechanisms such as ‘build-expression->derivation’ and
> ‘gexp->derivation’.  It’s a matter of separation of concerns.

Why?  On my Guix System, such signal handling is handled by Shepherd, if
I'm not mistaken.  As a user, I can trust the foundation to be sane,
rather than having to provide the bits to make it so myself.

> Of course we don’t want to duplicate that code every time, but the way
> we should factorize it, IMO, is by putting it in a “normal” module that
> people will use.
>
> Putting it in gnu-build-system is an admittedly hacky but easy way to
> have it widely shared.

I think we can do better than hacky here :-)

-- 
Thanks,
Maxim





* bug#30948: [PATCH core-updates] guix: Reap finished child processes in build containers.
  2022-11-24 16:44           ` bug#30948: " Maxim Cournoyer
@ 2022-11-26 15:11             ` Ludovic Courtès
  2022-11-27  3:00               ` Maxim Cournoyer
  0 siblings, 1 reply; 14+ messages in thread
From: Ludovic Courtès @ 2022-11-26 15:11 UTC
  To: Maxim Cournoyer; +Cc: 30948, Carlo Zancanaro

Hi,

Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:

>> My view is just that this mechanism belongs in “user code”, not in the
>> low-level mechanisms such as ‘build-expression->derivation’ and
>> ‘gexp->derivation’.  It’s a matter of separation of concerns.
>
> Why?  On my Guix System, such signal handling is handled by Shepherd, if
> I'm not mistaken.  As a user, I can trust the foundation to be sane,
> rather than having to provide the bits to make it so myself.
>
>> Of course we don’t want to duplicate that code every time, but the way
>> we should factorize it, IMO, is by putting it in a “normal” module that
>> people will use.
>>
>> Putting it in gnu-build-system is an admittedly hacky but easy way to
>> have it widely shared.
>
> I think we can do better than hacky here :-)

I think the real issue here is semantic clarity when it comes to
derivation inputs.

If I write:

  (gexp->derivation "foo" #~(mkdir #$output))

I can be sure that my derivation depends on nothing but (default-guile).
This is important for tests, but also to make sure we can use this
primitive everywhere—if it pulled in the Shepherd, I wouldn’t be able to
use it to build glibc, because there’d be a cycle.
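
To check that claim on a given checkout, one could list the inputs of
the resulting derivation.  This is only a sketch: it assumes a running
guix-daemon and a Guix recent enough to export
‘derivation-input-derivation’ from (guix derivations).

```scheme
(use-modules (guix store)
             (guix derivations)
             (guix gexp))

(with-store store
  (let ((drv (run-with-store store
               (gexp->derivation "foo" #~(mkdir #$output)))))
    ;; Return the file names of the .drv files this derivation
    ;; depends on.
    (map (compose derivation-file-name derivation-input-derivation)
         (derivation-inputs drv))))
```

Any prologue that ‘gexp->derivation’ injected from a separate module
would show up as an additional entry in that list.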

In that sense, having child-reaping code in gnu-build-system.scm, just
like in (guix least-authority), doesn’t seem unreasonable to me.

That said, I’m open to other proposals so please unleash your
creativity!  :-)

We’re touching core components though so this will require discussion.

Ludo’.





* bug#30948: [PATCH core-updates] guix: Reap finished child processes in build containers.
  2022-11-26 15:11             ` Ludovic Courtès
@ 2022-11-27  3:00               ` Maxim Cournoyer
  2022-11-28 15:04                 ` Ludovic Courtès
  0 siblings, 1 reply; 14+ messages in thread
From: Maxim Cournoyer @ 2022-11-27  3:00 UTC
  To: Ludovic Courtès; +Cc: 30948, Carlo Zancanaro

Hi,

Ludovic Courtès <ludo@gnu.org> writes:

> Hi,
>
> Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:
>
>>> My view is just that this mechanism belongs in “user code”, not in the
>>> low-level mechanisms such as ‘build-expression->derivation’ and
>>> ‘gexp->derivation’.  It’s a matter of separation of concerns.
>>
>> Why?  On my Guix System, such signal handling is handled by Shepherd, if
>> I'm not mistaken.  As a user, I can trust the foundation to be sane,
>> rather than having to provide the bits to make it so myself.
>>
>>> Of course we don’t want to duplicate that code every time, but the way
>>> we should factorize it, IMO, is by putting it in a “normal” module that
>>> people will use.
>>>
>>> Putting it in gnu-build-system is an admittedly hacky but easy way to
>>> have it widely shared.
>>
>> I think we can do better than hacky here :-)
>
> I think the real issue here is semantic clarity when it comes to
> derivation inputs.
>
> If I write:
>
>   (gexp->derivation "foo" #~(mkdir #$output))
>
> I can be sure that my derivation depends on nothing but (default-guile).
> This is important for tests, but also to make sure we can use this
> primitive everywhere—if it pulled in the Shepherd, I wouldn’t be able to
> use it to build glibc, because there’d be a cycle.

I was not suggesting to pull in extra dependencies such as Shepherd, but
to weave the to-be-added signal handling logic at a much lower level.
One idea could be to arrange so that the correct signal handlers always
get installed for any Guile code running in the build side (I'm not sure
how, but perhaps by adjusting the gexp "compiler"?).

The handlers could be defined in (guix build signal-handling) or
similar.  Users wouldn't need to explicitly import the module and
install its signal handlers, that'd be taken care of automatically, all
the time.
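
As a rough sketch of what such a module might contain: the module name
is the one floated above, while ‘install-child-reaper’ and
‘reap-children’ are hypothetical names, and nothing here exists in Guix
as written.  How such a handler interacts with ‘system*’ on the build
side would still need testing.

```scheme
(define-module (guix build signal-handling) ;hypothetical module
  #:export (install-child-reaper))          ;hypothetical procedure

(define (reap-children . _)
  "Collect every child that has already exited, without blocking."
  (let loop ()
    (let ((pid (catch 'system-error
                 (lambda ()
                   (car (waitpid WAIT_ANY WNOHANG)))
                 ;; waitpid throws (e.g. ECHILD) when there is
                 ;; nothing left to wait for; treat that as "done".
                 (lambda args 0))))
      (unless (zero? pid)
        (loop)))))

(define (install-child-reaper)
  "Install a SIGCHLD handler that reaps terminated child processes."
  (sigaction SIGCHLD reap-children SA_NOCLDSTOP))
```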

Does that sound feasible?

-- 
Thanks,
Maxim





* bug#30948: [PATCH core-updates] guix: Reap finished child processes in build containers.
  2022-11-27  3:00               ` Maxim Cournoyer
@ 2022-11-28 15:04                 ` Ludovic Courtès
  2022-11-28 20:10                   ` Maxim Cournoyer
  2022-11-29  2:07                   ` Maxim Cournoyer
  0 siblings, 2 replies; 14+ messages in thread
From: Ludovic Courtès @ 2022-11-28 15:04 UTC
  To: Maxim Cournoyer; +Cc: 30948, Carlo Zancanaro

Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:

> Ludovic Courtès <ludo@gnu.org> writes:

[...]

>> If I write:
>>
>>   (gexp->derivation "foo" #~(mkdir #$output))
>>
>> I can be sure that my derivation depends on nothing but (default-guile).
>> This is important for tests, but also to make sure we can use this
>> primitive everywhere—if it pulled in the Shepherd, I wouldn’t be able to
>> use it to build glibc, because there’d be a cycle.
>
> I was not suggesting to pull in extra dependencies such as Shepherd, but
> to weave the to-be-added signal handling logic at a much lower level.
> One idea could be to arrange so that the correct signal handlers always
> get installed for any Guile code running in the build side (I'm not sure
> how, but perhaps by adjusting the gexp "compiler"?).
>
> The handlers could be defined in (guix build signal-handling) or
> similar.  Users wouldn't need to explicitly import the module and
> install its signal handlers, that'd be taken care of automatically, all
> the time.
>
> Does that sound feasible?

Not like this: the imported-modules derivation for (guix build
signal-handling) would itself be a dependency.
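
To illustrate with a sketch (the module and its ‘install-child-reaper’
procedure are hypothetical and would have to exist on the load path):
making a build-side module available to a gexp goes through
‘with-imported-modules’, and lowering such a gexp adds a module-import
derivation to the inputs of the result.

```scheme
(use-modules (guix gexp))

(define build
  ;; Importing the (hypothetical) module is what introduces the
  ;; extra input when this gexp is lowered.
  (with-imported-modules '((guix build signal-handling))
    #~(begin
        (use-modules (guix build signal-handling))
        (install-child-reaper)          ;hypothetical procedure
        (mkdir #$output))))

;; Lowered with: (gexp->derivation "foo" build)
```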

Ludo’.





* bug#30948: [PATCH core-updates] guix: Reap finished child processes in build containers.
  2022-11-28 15:04                 ` Ludovic Courtès
@ 2022-11-28 20:10                   ` Maxim Cournoyer
  2022-11-29  2:07                   ` Maxim Cournoyer
  1 sibling, 0 replies; 14+ messages in thread
From: Maxim Cournoyer @ 2022-11-28 20:10 UTC
  To: Ludovic Courtès; +Cc: 30948, Carlo Zancanaro

Hi,

Ludovic Courtès <ludo@gnu.org> writes:

> Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:
>
>> Ludovic Courtès <ludo@gnu.org> writes:
>
> [...]
>
>>> If I write:
>>>
>>>   (gexp->derivation "foo" #~(mkdir #$output))
>>>
>>> I can be sure that my derivation depends on nothing but (default-guile).
>>> This is important for tests, but also to make sure we can use this
>>> primitive everywhere—if it pulled in the Shepherd, I wouldn’t be able to
>>> use it to build glibc, because there’d be a cycle.
>>
>> I was not suggesting to pull in extra dependencies such as Shepherd, but
>> to weave the to-be-added signal handling logic at a much lower level.
>> One idea could be to arrange so that the correct signal handlers always
>> get installed for any Guile code running in the build side (I'm not sure
>> how, but perhaps by adjusting the gexp "compiler"?).
>>
>> The handlers could be defined in (guix build signal-handling) or
>> similar.  Users wouldn't need to explicitly import the module and
>> install its signal handlers, that'd be taken care of automatically, all
>> the time.
>>
>> Does that sound feasible?
>
> Not like this: the imported-modules derivation for (guix build
> signal-handling) would itself be a dependency.

Can we make it an implicit dependency, since we want it to *always* be
used?

It'd be useless/annoying boilerplate otherwise.

-- 
Thanks,
Maxim





* bug#30948: [PATCH core-updates] guix: Reap finished child processes in build containers.
  2022-11-28 15:04                 ` Ludovic Courtès
  2022-11-28 20:10                   ` Maxim Cournoyer
@ 2022-11-29  2:07                   ` Maxim Cournoyer
  1 sibling, 0 replies; 14+ messages in thread
From: Maxim Cournoyer @ 2022-11-29  2:07 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 30948, Carlo Zancanaro

Hi,

Ludovic Courtès <ludo@gnu.org> writes:

> Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:
>
>> Ludovic Courtès <ludo@gnu.org> writes:
>
> [...]
>
>>> If I write:
>>>
>>>   (gexp->derivation "foo" #~(mkdir #$output))
>>>
>>> I can be sure that my derivation depends on nothing but (default-guile).
>>> This is important for tests, but also to make sure we can use this
>>> primitive everywhere—if it pulled in the Shepherd, I wouldn’t be able to
>>> use it to build glibc, because there’d be a cycle.
>>
>> I was not suggesting to pull in extra dependencies such as Shepherd, but
>> to weave the to-be-added signal handling logic at a much lower level.
>> One idea could be to arrange so that the correct signal handlers always
>> get installed for any Guile code running on the build side (I'm not sure
>> how, but perhaps by adjusting the gexp "compiler"?).
>>
>> The handlers could be defined in (guix build signal-handling) or
>> similar.  Users wouldn't need to explicitly import the module and
>> install its signal handlers, that'd be taken care of automatically, all
>> the time.
>>
>> Does that sound feasible?
>
> Not like this: the imported-modules derivation for (guix build
> signal-handling) would be a dependency in itself.

I see a couple of options for the lowest place to inject the minimal
signal handling expected of PID 1.

1. In Guile itself.  We could make it detect when it's running as PID 1
and then set up the required signal handling.  This is apparently what
Bash does, a peculiarity exploited by NixOS (they launch their builder
scripts via Bash, which runs as PID 1 and takes care of reaping the dead
processes).

2. In a Guile wrapper.  Instead of running Guile directly in the
container, guix-daemon would run it through a wrapper that acts as PID 1.
This would make it a tool comparable to dumb-init [0] or tini [1],
except written in Scheme.

[0] https://github.com/Yelp/dumb-init/
[1] https://github.com/krallin/tini
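
To make option 2 more concrete, here is a minimal sketch of such a
wrapper in Guile.  It is only an illustration under stated assumptions:
`run-as-init' is a hypothetical name, not an existing Guix or Guile
procedure, and the sketch leaves out the signal forwarding that
dumb-init and tini also perform.  It forks, executes the real program in
the child, and reaps every process that terminates until that child
itself exits.

```scheme
(use-modules (ice-9 match))

;; Hypothetical helper (not part of Guix or Guile): run PROGRAM with ARGS in
;; a child process and reap every child that terminates, including orphans
;; re-parented to us when we run as PID 1, until PROGRAM itself exits.
;; Return an exit code for PROGRAM.
(define (run-as-init program . args)
  (match (primitive-fork)
    (0
     ;; Child: replace ourselves with PROGRAM.  If exec fails, exit with
     ;; code 127, as shells do, without running the parent's exit hooks.
     (catch #t
       (lambda () (apply execlp program program args))
       (lambda _ (primitive-exit 127))))
    (child
     (let loop ()
       ;; Block until any child terminates; collecting its status is what
       ;; prevents it from lingering as a zombie.
       (match (waitpid WAIT_ANY)
         ((pid . status)
          (if (= pid child)
              ;; Main program is done: use its exit value, or 128 + signal
              ;; number if it was killed by a signal.
              (or (status:exit-val status)
                  (+ 128 (or (status:term-sig status) 0)))
              (loop))))))))
```

A wrapper script could then end with something like
`(exit (apply run-as-init (cdr (command-line))))', so that the builder's
exit status is propagated to the process that started the container.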

If we implement 1, Guile could itself serve as a wrapper to launch
scripts in a containerized environment (the same as tini), and it would
spare us any integration overhead, so I find it attractive.

What do you think?

For further reading, see [2], which I found interesting.

[2]  https://medium.com/hackernoon/my-process-became-pid-1-and-now-signals-behave-strangely-b05c52cc551c

-- 
Thanks,
Maxim





end of thread, other threads:[~2022-11-29  2:08 UTC | newest]

Thread overview: 14+ messages
2018-03-26 11:16 [bug#30948] [PATCH core-updates] guix: Reap finished child processes in build containers Carlo Zancanaro
2018-03-26 23:39 ` Carlo Zancanaro
2018-03-29 20:07 ` Ludovic Courtès
2018-03-29 21:15   ` Carlo Zancanaro
2018-03-30  8:16     ` Ludovic Courtès
2018-03-30 11:17       ` Carlo Zancanaro
2018-03-30 15:17         ` Ludovic Courtès
2022-11-24 16:40           ` Maxim Cournoyer
2022-11-24 16:44           ` bug#30948: " Maxim Cournoyer
2022-11-26 15:11             ` Ludovic Courtès
2022-11-27  3:00               ` Maxim Cournoyer
2022-11-28 15:04                 ` Ludovic Courtès
2022-11-28 20:10                   ` Maxim Cournoyer
2022-11-29  2:07                   ` Maxim Cournoyer
