* Update on GuixSD containers
From: Thompson, David @ 2015-06-08 15:20 UTC (permalink / raw)
To: guix-devel
Hey folks,
I'd like to give a quick update on the state of the wip-container branch.
As of this morning, one can run the commands below and have a somewhat
functional GuixSD container:
# Hardcoded /tmp/container as the container root directory until I
# add a command line switch.
mkdir /tmp/container
guix system container container-config.scm
Where 'container-config.scm' is:
(use-modules (gnu))

;; Minimal GuixSD configuration suitable for a Linux container.
(operating-system
  (host-name "container-test")
  (timezone "America/New_York")
  (locale "en_US.UTF-8")
  ;; Unused
  (bootloader (grub-configuration (device "/dev/sdX")))
  ;; Dummy FS
  (file-systems (cons (file-system
                        (mount-point "/")
                        (device "dummy")
                        (type "dummy"))
                      %base-file-systems))
  (users (cons (user-account
                 (name "alice")
                 (comment "Bob's sister")
                 (group "users")
                 (supplementary-groups '("wheel" "audio" "video"))
                 (home-directory "/home/alice"))
               %base-user-accounts)))
The activation and boot scripts for the system have been tweaked to
DTRT for a container, and DMD is able to start successfully and start
all of the base services, sans the console-font-tty services for some
reason.
So, this is cool, but much work remains to be done. Our containers
operate in 5 of 6 possible Linux namespaces: mount, PID, UTS, IPC, and
network. The remaining namespace to get working is the user
namespace, which is especially tricky. I don't think even Docker can
use user namespaces properly yet, but I might be wrong. Additionally,
our containers have a loopback device, but have no way of accessing an
outside network such as your LAN or a virtual network on the host
system. There's also no support for cgroups, which would allow us to
limit the resource usage of containers like you can with a VM
hypervisor.
For the long term, we'll need a container daemon to keep track of all
containers on the system to allow for easily starting and stopping
them (right now you have to 'sudo kill -9 <dmd pid>'), spawning new
processes within them (for example, launching bash for an interactive
environment), and whatever else we might want.
In closing, things aren't exactly usable, but I encourage
brave/curious people to take 'guix system container' for a spin and
hack on it to make Guix the best container management tool yet! Also,
I think the code is very easy to follow (unlike Docker's
libcontainer), so if you want to understand what containers *really*
are beyond a buzzword, have a look at gnu/build/linux-container.scm
and gnu/system/linux-container.scm.
Happy hacking,
- Dave
* Re: Update on GuixSD containers
From: Ludovic Courtès @ 2015-06-09 21:28 UTC (permalink / raw)
To: Thompson, David; +Cc: guix-devel
"Thompson, David" <dthompson2@worcester.edu> skribis:
> I'd like to give a quick update on the state of wip-container branch.
> As of this morning, one can run the below commands and have a somewhat
> functional GuixSD container:
>
> # Hardcoded /tmp/container as the container root directory until I
> # add a command line switch.
> mkdir /tmp/container
> guix system container container-config.scm
Wonderful! I’ve given it a try, and it works as advertised. ;-)
I was a bit afraid the first time I ran the ‘run-container’ script as
root, but everything went like a charm.
I tried adding this dummy service:
(define (bash-service)
  (with-monad %store-monad
    (return (service
             (documentation "Run Bash from PID 1.")
             (provision '(shell))
             (start #~(make-forkexec-constructor
                       (string-append #$bash "/bin/bash")))
             (stop #~(make-kill-destructor))
             (respawn? #t)))))
... but it dies for some reason. So no shell prompt.
> So, this is cool, but much work remains to be done. Our containers
> operate in 5 of 6 possible Linux namespaces: mount, PID, UTS, IPC, and
> network. The remaining namespace to get working is the user
> namespace, which is especially tricky. I don't think even Docker can
> use user namespaces properly yet, but I might be wrong. Additionally,
> our containers have a loopback device, but have no way of accessing an
> outside network such as your LAN or a virtual network on the host
> system. There's also no support for cgroups, which would allow us to
> limit the resource usage of containers like you can with a VM
> hypervisor.
OK.
> For the long term, we'll need a container daemon to keep track of all
> containers on the system to allow for easily starting and stopping
> them (right now you have to 'sudo kill -9 <dmd pid>'), spawning new
> processes within them (for example, launching bash for an interactive
> environment), and whatever else we might want.
Having launched a bunch of containers and then hacked to kill all the
dmds, I can see why keeping track of containers matters. :-)
Until there’s a daemon to keep track of containers, “guix system
container” could return the PID of the container’s PID1, to make it
easier to kill it later?
> In closing, things aren't exactly usable, but I encourage
> brave/curious people to take 'guix system container' for a spin and
> hack on it to make Guix the best container management tool yet! Also,
> I think the code is very easy to follow (unlike Docker's
> libcontainer), so if you want to understand what containers *really*
> are beyond a buzzword, have a look at gnu/build/linux-container.scm
> and gnu/system/linux-container.scm.
Indeed I find the new code easy to read and well integrated; I like it!
It’s a shame that only CLONE_NEWUSER is available to non-root users. I
wonder what the rationale was. AIUI, Docker’s daemon performs clone(2)
on behalf of clients, right?
Thanks for the great work!
Ludo’.
* Re: Update on GuixSD containers
From: Thompson, David @ 2015-06-11 14:51 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guix-devel
On Tue, Jun 9, 2015 at 5:28 PM, Ludovic Courtès <ludo@gnu.org> wrote:
> "Thompson, David" <dthompson2@worcester.edu> skribis:
>
>> I'd like to give a quick update on the state of wip-container branch.
>> As of this morning, one can run the below commands and have a somewhat
>> functional GuixSD container:
>>
>> # Hardcoded /tmp/container as the container root directory until I
>> # add a command line switch.
>> mkdir /tmp/container
>> guix system container container-config.scm
>
> Wonderful! I’ve given it a try, and it works as advertised. ;-)
> I was a bit afraid the first time I ran the ‘run-container’ script as
> root, but everything went like a charm.
Yeah, running as root is a bit scary. With working user namespaces it
should become less scary. I just don't know how to reasonably start a
system with users of its own that are allowed to write to the file
system. Everything I've tried thus far has failed. I thought that
mapping the uid/gid 0 in the namespace to uid/gid 0 outside of the
namespace would be enough to boot the system, but it didn't work.
> I tried adding this dummy service:
>
> (define (bash-service)
>   (with-monad %store-monad
>     (return (service
>              (documentation "Run Bash from PID 1.")
>              (provision '(shell))
>              (start #~(make-forkexec-constructor
>                        (string-append #$bash "/bin/bash")))
>              (stop #~(make-kill-destructor))
>              (respawn? #t)))))
>
> ... but it dies for some reason. So no shell prompt.
I wouldn't expect that to work because bash isn't actually run in your
tty. To create an interactive environment within the container (or
run any arbitrary program), we need a tool that calls setns() with
open file descriptors for all of the container's namespaces and then
exec()s the desired command. I threw together a tool to do this
quickly, but for some reason joining the mount namespace fails with
EINVAL. I have no idea why. Joining the IPC, UTS, PID, and network
namespaces isn't a problem. Enlightenment needed!
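
For reference, the setns() approach described above can be sketched as
follows. This is a minimal illustration in Python rather than Guile, and
it is not the tool mentioned in this message: `NAMESPACES`,
`namespace_ids`, and `enter_and_exec` are hypothetical names, `os.setns`
requires Python 3.12 or later, and joining another process's namespaces
needs root.

```python
import os

# The five namespaces the containers in this thread use, as named under /proc/PID/ns.
NAMESPACES = ("ipc", "uts", "net", "pid", "mnt")


def namespace_ids(pid):
    """Map each namespace name to its identifier, e.g. 'uts:[4026531838]'."""
    return {ns: os.readlink(f"/proc/{pid}/ns/{ns}") for ns in NAMESPACES}


def enter_and_exec(pid, argv):
    """Join the namespaces of process PID, run argv there, return its exit code.

    Hypothetical helper; needs root and Python 3.12+ for os.setns.
    """
    # Open every descriptor first: once the mount namespace has been joined,
    # the /proc we see may no longer list the target process.
    fds = [os.open(f"/proc/{pid}/ns/{ns}", os.O_RDONLY) for ns in NAMESPACES]
    for fd in fds:
        os.setns(fd)  # nstype defaults to 0, i.e. accept any namespace type
        os.close(fd)
    # Joining a PID namespace only affects children created afterwards,
    # so fork before exec.
    child = os.fork()
    if child == 0:
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed
    _, status = os.waitpid(child, 0)
    return os.waitstatus_to_exitcode(status)
```

Comparing `namespace_ids(os.getpid())` with `namespace_ids(container_pid)`
shows which namespaces two processes share. On the EINVAL: setns(2)
documents that a process sharing filesystem attributes (CLONE_FS) with
another process, as the threads of one process do, cannot join a new
mount namespace. Whether that is the cause here is not established in
this thread.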
>> For the long term, we'll need a container daemon to keep track of all
>> containers on the system to allow for easily starting and stopping
>> them (right now you have to 'sudo kill -9 <dmd pid>'), spawning new
>> processes within them (for example, launching bash for an interactive
>> environment), and whatever else we might want.
>
> Having launched a bunch of containers and then hacked to kill all the
> dmds, I can see why keeping track of containers matters. :-)
>
> Until there’s a daemon to keep track of containers, “guix system
> container” could return the PID of the container’s PID1, to make it
> easier to kill it later?
I'm actually unsure how to acquire the PID of the container's init
process since I clone and exec. Any ideas?
> It’s a shame that only CLONE_NEWUSER is available to non-root users. I
> wonder what the rationale was. AIUI, Docker’s daemon performs clone(2)
> on behalf of clients, right?
Yeah, our daemon would do the same thing. We could maybe even have a
little Guile library that allows one to evaluate arbitrary Scheme code
from within the container. :)
- Dave
* Re: Update on GuixSD containers
From: Ludovic Courtès @ 2015-06-12 15:08 UTC (permalink / raw)
To: Thompson, David; +Cc: guix-devel
"Thompson, David" <dthompson2@worcester.edu> skribis:
> On Tue, Jun 9, 2015 at 5:28 PM, Ludovic Courtès <ludo@gnu.org> wrote:
[...]
>> I tried adding this dummy service:
>>
>> (define (bash-service)
>>   (with-monad %store-monad
>>     (return (service
>>              (documentation "Run Bash from PID 1.")
>>              (provision '(shell))
>>              (start #~(make-forkexec-constructor
>>                        (string-append #$bash "/bin/bash")))
>>              (stop #~(make-kill-destructor))
>>              (respawn? #t)))))
>>
>> ... but it dies for some reason. So no shell prompt.
>
> I wouldn't expect that to work because bash isn't actually run in your
> tty. To create an interactive environment within the container (or
> run any arbitrary program), we need a tool that calls setns() with
> open file descriptors for all of the container's namespaces and then
> exec() the desired command. I threw together a tool to do this
> quickly, but for some reason joining the mount namespace fails with
> EINVAL. I have no idea why. Joining the IPC, UTS, PID, and network
> namespaces isn't a problem. Enlightenment needed!
Oh, I see. setns(2) specifies 6 reasons for EINVAL...
>> Until there’s a daemon to keep track of containers, “guix system
>> container” could return the PID of the container’s PID1, to make it
>> easier to kill it later?
>
> I'm actually unsure how to acquire the PID of the container's init
> process since I clone and exec. Any ideas?
Isn’t it the return value of ‘clone’?
>> It’s a shame that only CLONE_NEWUSER is available to non-root users. I
>> wonder what the rationale was. AIUI, Docker’s daemon performs clone(2)
>> on behalf of clients, right?
>
> Yeah, our daemon would do the same thing. We could maybe even have a
> little Guile library that allows one to evaluate arbitrary scheme code
> from within the container. :)
Definitely. Another application I’ve always wanted is a least-authority
shell, like Plash [0].
(Speaking of which, I just found Shill [1], which seems similar to Plash
and even has a to-do item regarding package management [2] and is
written in Racket; unfortunately it runs on FreeBSD, for Capsicum.)
Thanks,
Ludo’.
[0] http://plash.beasts.org/contents.html
[1] http://shill.seas.harvard.edu/
[2] http://shill.seas.harvard.edu/projects.html
* Re: Update on GuixSD containers
From: Thompson, David @ 2015-06-13 3:41 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guix-devel
On Fri, Jun 12, 2015 at 11:08 AM, Ludovic Courtès <ludo@gnu.org> wrote:
> "Thompson, David" <dthompson2@worcester.edu> skribis:
>
>> On Tue, Jun 9, 2015 at 5:28 PM, Ludovic Courtès <ludo@gnu.org> wrote:
>>
>>> Until there’s a daemon to keep track of containers, “guix system
>>> container” could return the PID of the container’s PID1, to make it
>>> easier to kill it later?
>>
>> I'm actually unsure how to acquire the PID of the container's init
>> process since I clone and exec. Any ideas?
>
> Isn’t it the return value of ‘clone’?
Oh, you're right. I forgot that the exec() *replaces* the process,
rather than spawning a new one. The script now outputs the PID.
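
The point about exec() can be checked with a small experiment. The
sketch below is Python, not the run-container script, and it uses plain
fork() as a stand-in for clone(2) without any namespace flags;
`spawn_and_report` is a hypothetical name. It shows that the PID handed
back to the parent is the PID of the program that ends up running after
exec.

```python
import os


def spawn_and_report(argv):
    """Fork, exec argv in the child, and return (pid_from_fork, pid_seen_by_child).

    The command in argv is expected to print its own PID on stdout.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()  # like clone(2), returns the child's PID to the parent
    if pid == 0:
        # Child: send stdout into the pipe, then replace the process image.
        os.close(read_fd)
        os.dup2(write_fd, 1)
        os.close(write_fd)
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed
    os.close(write_fd)
    with os.fdopen(read_fd) as pipe:
        reported = pipe.read().strip()
    os.waitpid(pid, 0)
    return pid, int(reported)


if __name__ == "__main__":
    # The shell prints $$, its own PID, which matches what fork() returned.
    forked, seen = spawn_and_report(["sh", "-c", "echo $$"])
    print(forked, seen)
```

With CLONE_NEWPID the two numbers would differ: the child sees itself as
PID 1 inside the new namespace, while clone() still returns the PID as
seen from the parent's namespace, which is the one needed to kill the
container from outside.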
>>> It’s a shame that only CLONE_NEWUSER is available to non-root users. I
>>> wonder what the rationale was. AIUI, Docker’s daemon performs clone(2)
>>> on behalf of clients, right?
>>
>> Yeah, our daemon would do the same thing. We could maybe even have a
>> little Guile library that allows one to evaluate arbitrary scheme code
>> from within the container. :)
>
> Definitely. Another application I’ve always wanted is a least-authority
> shell, like Plash [0].
>
> (Speaking of which, I just found Shill [1], which seems similar to Plash
> and even has a to-do item regarding package management [2] and is
> written in Racket; unfortunately it runs on FreeBSD, for Capsicum.)
That's really cool. Using a container + user-specified shared
directories we can achieve something like this, I think.
- Dave
* Re: Update on GuixSD containers
From: Christopher Lemmer Webber @ 2018-07-24 22:22 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guix-devel
Ludovic Courtès writes:
> Definitely. Another application I’ve always wanted is a least-authority
> shell, like Plash [0].
>
> (Speaking of which, I just found Shill [1], which seems similar to Plash
> and even has a to-do item regarding package management [2] and is
> written in Racket; unfortunately it runs on FreeBSD, for Capsicum.)
As a side note, yesterday I learned about Capsicum for Linux:
https://github.com/google/capsicum-linux
Unfortunately it has not seen commits this last year. A shame; it would
really be nice to get such ocap support in GNU/Linux.
* Re: Update on GuixSD containers
From: Ludovic Courtès @ 2015-06-12 15:12 UTC (permalink / raw)
To: Thompson, David; +Cc: guix-devel
"Thompson, David" <dthompson2@worcester.edu> skribis:
> Yeah, our daemon would do the same thing. We could maybe even have a
> little Guile library that allows one to evaluate arbitrary scheme code
> from within the container. :)
Actually, something quite easily feasible would be this:
(eval-in-container #~(system* #$evil-program
                              #$(local-file "important-data.txt"))
                   #:networking? #f)
... where the container’s store would be populated with just
EVIL-PROGRAM and the local file.
Food for thought...
Ludo’.
* Re: Update on GuixSD containers
From: Thompson, David @ 2015-06-13 1:41 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guix-devel
On Fri, Jun 12, 2015 at 11:12 AM, Ludovic Courtès <ludo@gnu.org> wrote:
> "Thompson, David" <dthompson2@worcester.edu> skribis:
>
>> Yeah, our daemon would do the same thing. We could maybe even have a
>> little Guile library that allows one to evaluate arbitrary scheme code
>> from within the container. :)
>
> Actually, something quite easily feasible would be this:
>
> (eval-in-container #~(system* #$evil-program
>                               #$(local-file "important-data.txt"))
>                    #:networking? #f)
>
> ... where the container’s store would be populated with just
> EVIL-PROGRAM and the local file.
>
> Food for thought...
Ooooh yeah! That would be cool. Though I think we should still spawn
a dmd process as PID 1 to deal with reaping zombie processes. We
could generate a single service that runs the gexp script. How does
that sound?
Thanks for this good idea!
- Dave
* Re: Update on GuixSD containers
From: Ludovic Courtès @ 2015-06-13 13:06 UTC (permalink / raw)
To: Thompson, David; +Cc: guix-devel
"Thompson, David" <dthompson2@worcester.edu> skribis:
> On Fri, Jun 12, 2015 at 11:12 AM, Ludovic Courtès <ludo@gnu.org> wrote:
>> "Thompson, David" <dthompson2@worcester.edu> skribis:
>>
>>> Yeah, our daemon would do the same thing. We could maybe even have a
>>> little Guile library that allows one to evaluate arbitrary scheme code
>>> from within the container. :)
>>
>> Actually, something quite easily feasible would be this:
>>
>> (eval-in-container #~(system* #$evil-program
>>                               #$(local-file "important-data.txt"))
>>                    #:networking? #f)
>>
>> ... where the container’s store would be populated with just
>> EVIL-PROGRAM and the local file.
>>
>> Food for thought...
>
> Ooooh yeah! That would be cool. Though I think we should still spawn
> a dmd process as PID 1 to deal with reaping zombie processes. We
> could generate a single service that runs the gexp script. How does
> that sound?
Wouldn’t it be enough to have the Guile process that evaluates the
expression be PID 1 in the container, as is the case in guix-daemon
containers?
Ludo’.
* Re: Update on GuixSD containers
From: Thompson, David @ 2015-06-13 13:14 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guix-devel
On Sat, Jun 13, 2015 at 9:06 AM, Ludovic Courtès <ludo@gnu.org> wrote:
> "Thompson, David" <dthompson2@worcester.edu> skribis:
>
>> On Fri, Jun 12, 2015 at 11:12 AM, Ludovic Courtès <ludo@gnu.org> wrote:
>>> "Thompson, David" <dthompson2@worcester.edu> skribis:
>>>
>>>> Yeah, our daemon would do the same thing. We could maybe even have a
>>>> little Guile library that allows one to evaluate arbitrary scheme code
>>>> from within the container. :)
>>>
>>> Actually, something quite easily feasible would be this:
>>>
>>> (eval-in-container #~(system* #$evil-program
>>>                               #$(local-file "important-data.txt"))
>>>                    #:networking? #f)
>>>
>>> ... where the container’s store would be populated with just
>>> EVIL-PROGRAM and the local file.
>>>
>>> Food for thought...
>>
>> Ooooh yeah! That would be cool. Though I think we should still spawn
>> a dmd process as PID 1 to deal with reaping zombie processes. We
>> could generate a single service that runs the gexp script. How does
>> that sound?
>
> Wouldn’t it be enough to have the Guile process that evaluates the
> expression be PID 1 in the container, as is the case in guix-daemon
> containers?
Sure, it would work, but my concern is that a long-running process on
a user's machine could create and orphan tons of child processes and
nothing would be able to clean them up until the PID namespace is
garbage collected.
- Dave
* Re: Update on GuixSD containers
From: Ludovic Courtès @ 2015-06-13 20:19 UTC (permalink / raw)
To: Thompson, David; +Cc: guix-devel
"Thompson, David" <dthompson2@worcester.edu> skribis:
> On Sat, Jun 13, 2015 at 9:06 AM, Ludovic Courtès <ludo@gnu.org> wrote:
>> "Thompson, David" <dthompson2@worcester.edu> skribis:
>>
>>> On Fri, Jun 12, 2015 at 11:12 AM, Ludovic Courtès <ludo@gnu.org> wrote:
>>>> "Thompson, David" <dthompson2@worcester.edu> skribis:
>>>>
>>>>> Yeah, our daemon would do the same thing. We could maybe even have a
>>>>> little Guile library that allows one to evaluate arbitrary scheme code
>>>>> from within the container. :)
>>>>
>>>> Actually, something quite easily feasible would be this:
>>>>
>>>> (eval-in-container #~(system* #$evil-program
>>>>                               #$(local-file "important-data.txt"))
>>>>                    #:networking? #f)
>>>>
>>>> ... where the container’s store would be populated with just
>>>> EVIL-PROGRAM and the local file.
>>>>
>>>> Food for thought...
>>>
>>> Ooooh yeah! That would be cool. Though I think we should still spawn
>>> a dmd process as PID 1 to deal with reaping zombie processes. We
>>> could generate a single service that runs the gexp script. How does
>>> that sound?
>>
>> Wouldn’t it be enough to have the Guile process that evaluates the
>> expression be PID 1 in the container, as is the case in guix-daemon
>> containers?
>
> Sure, it would work, but my concern is that a long-running process on
> a user's machine could create and orphan tons of child processes and
> nothing would be able to clean them up until the PID namespace is
> garbage collected.
My understanding was that killing a container’s PID 1 (from the outside)
effectively killed all the processes of that PID name space. Isn’t it
the case?
(The daemon works around that by running processes under a separate UID
and doing kill(-1, SIGKILL) under that UID.)
Ludo’.
* Re: Update on GuixSD containers
From: Thompson, David @ 2015-06-16 16:39 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guix-devel
On Sat, Jun 13, 2015 at 4:19 PM, Ludovic Courtès <ludo@gnu.org> wrote:
> "Thompson, David" <dthompson2@worcester.edu> skribis:
>
>> On Sat, Jun 13, 2015 at 9:06 AM, Ludovic Courtès <ludo@gnu.org> wrote:
>>> "Thompson, David" <dthompson2@worcester.edu> skribis:
>>>
>>>> On Fri, Jun 12, 2015 at 11:12 AM, Ludovic Courtès <ludo@gnu.org> wrote:
>>>>> "Thompson, David" <dthompson2@worcester.edu> skribis:
>>>>>
>>>>>> Yeah, our daemon would do the same thing. We could maybe even have a
>>>>>> little Guile library that allows one to evaluate arbitrary scheme code
>>>>>> from within the container. :)
>>>>>
>>>>> Actually, something quite easily feasible would be this:
>>>>>
>>>>> (eval-in-container #~(system* #$evil-program
>>>>>                               #$(local-file "important-data.txt"))
>>>>>                    #:networking? #f)
>>>>>
>>>>> ... where the container’s store would be populated with just
>>>>> EVIL-PROGRAM and the local file.
>>>>>
>>>>> Food for thought...
>>>>
>>>> Ooooh yeah! That would be cool. Though I think we should still spawn
>>>> a dmd process as PID 1 to deal with reaping zombie processes. We
>>>> could generate a single service that runs the gexp script. How does
>>>> that sound?
>>>
>>> Wouldn’t it be enough to have the Guile process that evaluates the
>>> expression be PID 1 in the container, as is the case in guix-daemon
>>> containers?
>>
>> Sure, it would work, but my concern is that a long-running process on
>> a user's machine could create and orphan tons of child processes and
>> nothing would be able to clean them up until the PID namespace is
>> garbage collected.
>
> My understanding was that killing a container’s PID 1 (from the outside)
> effectively killed all the processes of that PID name space. Isn’t it
> the case?
Yes, that is the case. That triggers the "garbage collection" of that
namespace, if you will. My point is that, without a proper PID 1 that
can DTRT with orphaned processes, a long-running process in a
container could potentially create a ton of orphaned child processes
with no way for them to be reaped without killing PID 1. I wouldn't
be very happy if a program that I was running in a sandbox was
polluting the process list. I don't think this is a concern for the
build daemon because the build process is a (relatively) short-lived
process, but running something like a web browser could go on for
days, weeks, etc.
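
To make the reaping concern concrete, here is a minimal sketch of the
loop a "proper PID 1" runs. It is Python rather than dmd's actual code,
and `reap_children` and `spawn_and_reap` are hypothetical names. A real
init would typically run the loop from a SIGCHLD handler and would also
collect orphans that the kernel reparents to it; in this sketch the
children are direct children of the caller.

```python
import os
import time


def reap_children():
    """Collect every child that has already exited, without blocking.

    Returns a list of (pid, exit_code) pairs. Requires Python 3.9+.
    """
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # no children left at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append((pid, os.waitstatus_to_exitcode(status)))
    return reaped


def spawn_and_reap(exit_codes, timeout=5.0):
    """Start one short-lived child per exit code, then reap until all are collected."""
    pids = []
    for code in exit_codes:
        pid = os.fork()
        if pid == 0:
            os._exit(code)  # child: exit at once and wait to be reaped
        pids.append(pid)
    collected = {}
    deadline = time.monotonic() + timeout
    while len(collected) < len(pids) and time.monotonic() < deadline:
        collected.update(reap_children())
        time.sleep(0.01)
    return [collected.get(pid) for pid in pids]
```

Until `reap_children` runs, each exited child stays in the process table
as a zombie. That is the "pollution" described above when PID 1 never
calls waitpid for processes it did not start itself.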
> (The daemon works around that by running processes under a separate UID
> and doing kill(-1, SIGKILL) under that UID.)
So, PID 1 in the build container forks and changes the UID or
something? Sorry, I'm a bit lost right now.
Thanks for trying to explain.
- Dave
* Re: Update on GuixSD containers
From: Ludovic Courtès @ 2015-06-19 12:08 UTC (permalink / raw)
To: Thompson, David; +Cc: guix-devel
"Thompson, David" <dthompson2@worcester.edu> skribis:
> On Sat, Jun 13, 2015 at 4:19 PM, Ludovic Courtès <ludo@gnu.org> wrote:
>> "Thompson, David" <dthompson2@worcester.edu> skribis:
>>
>>> On Sat, Jun 13, 2015 at 9:06 AM, Ludovic Courtès <ludo@gnu.org> wrote:
>>>> "Thompson, David" <dthompson2@worcester.edu> skribis:
>>>>
>>>>> On Fri, Jun 12, 2015 at 11:12 AM, Ludovic Courtès <ludo@gnu.org> wrote:
>>>>>> "Thompson, David" <dthompson2@worcester.edu> skribis:
>>>>>>
>>>>>>> Yeah, our daemon would do the same thing. We could maybe even have a
>>>>>>> little Guile library that allows one to evaluate arbitrary scheme code
>>>>>>> from within the container. :)
>>>>>>
>>>>>> Actually, something quite easily feasible would be this:
>>>>>>
>>>>>> (eval-in-container #~(system* #$evil-program
>>>>>>                               #$(local-file "important-data.txt"))
>>>>>>                    #:networking? #f)
>>>>>>
>>>>>> ... where the container’s store would be populated with just
>>>>>> EVIL-PROGRAM and the local file.
>>>>>>
>>>>>> Food for thought...
>>>>>
>>>>> Ooooh yeah! That would be cool. Though I think we should still spawn
>>>>> a dmd process as PID 1 to deal with reaping zombie processes. We
>>>>> could generate a single service that runs the gexp script. How does
>>>>> that sound?
>>>>
>>>> Wouldn’t it be enough to have the Guile process that evaluates the
>>>> expression be PID 1 in the container, as is the case in guix-daemon
>>>> containers?
>>>
>>> Sure, it would work, but my concern is that a long-running process on
>>> a user's machine could create and orphan tons of child processes and
>>> nothing would be able to clean them up until the PID namespace is
>>> garbage collected.
>>
>> My understanding was that killing a container’s PID 1 (from the outside)
>> effectively killed all the processes of that PID name space. Isn’t it
>> the case?
>
> Yes, that is the case. That triggers the "garbage collection" of that
> namespace, if you will. My point is that, without a proper PID 1 that
> can DTRT with orphaned processes, a long running process in a
> container could potentially create a ton of orphaned child processes
> with no way for them to be reaped without killing PID 1. I wouldn't
> be very happy if a program that I was running in a sandbox was
> polluting the process list. I don't think this is a concern for the
> build daemon because the build process is a (relatively) short-lived
> process, but running something like a web browser could go on for
> days, weeks, etc.
Yes, I understand. This is definitely an important concern for full
GuixSD containers.
However, ‘eval-in-container’ would be much simpler, synchronous, and
typically for short-lived processes. So I guess the process that runs
‘eval-in-container’ would clone(2) (via ‘call-with-container’) and
simply waitpid(2) the child process (which is PID 1 in its container).
When the parent process gets a SIGINT or SIGHUP, it could send SIGKILL
to the child, thereby terminating the container.
Does that make sense?
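
The parent side of that scheme can be sketched as below. This is an
illustration in Python under stated assumptions, not ‘eval-in-container’
or ‘call-with-container’ themselves: fork() stands in for clone(2) with
namespace flags, and `run_and_wait` is a hypothetical name.

```python
import os
import signal


def run_and_wait(argv):
    """Run argv in a child, wait for it synchronously, and return its exit code.

    SIGINT or SIGHUP received by the parent is turned into SIGKILL for the
    child. A child killed by signal N yields -N. Requires Python 3.9+.
    """
    pid = os.fork()  # stand-in for clone(2) with CLONE_NEW* flags
    if pid == 0:
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed

    def kill_child(signum, frame):
        os.kill(pid, signal.SIGKILL)

    # Install the handlers, remembering the old ones so they can be restored.
    previous = {sig: signal.signal(sig, kill_child)
                for sig in (signal.SIGINT, signal.SIGHUP)}
    try:
        # Python retries waitpid after a handled signal (PEP 475), so this
        # returns once the child has exited or has been killed.
        _, status = os.waitpid(pid, 0)
    finally:
        for sig, handler in previous.items():
            signal.signal(sig, handler)
    return os.waitstatus_to_exitcode(status)
```

When the child is PID 1 of its own PID namespace, the single SIGKILL
takes the rest of the namespace down with it, as discussed earlier in
the thread. With plain fork(), as here, only the direct child is killed.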
>> (The daemon works around that by running processes under a separate UID
>> and doing kill(-1, SIGKILL) under that UID.)
>
> So, PID 1 in the build container forks and changes the UID or
> something?
Yes, with setuid (see build.cc:2180.)
Thanks,
Ludo’.
* Re: Update on GuixSD containers
From: Thompson, David @ 2015-06-19 12:29 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guix-devel
On Fri, Jun 19, 2015 at 8:08 AM, Ludovic Courtès <ludo@gnu.org> wrote:
> "Thompson, David" <dthompson2@worcester.edu> skribis:
>
>> On Sat, Jun 13, 2015 at 4:19 PM, Ludovic Courtès <ludo@gnu.org> wrote:
>>> "Thompson, David" <dthompson2@worcester.edu> skribis:
>>>
>>>> On Sat, Jun 13, 2015 at 9:06 AM, Ludovic Courtès <ludo@gnu.org> wrote:
>>>>> "Thompson, David" <dthompson2@worcester.edu> skribis:
>>>>>
>>>>>> On Fri, Jun 12, 2015 at 11:12 AM, Ludovic Courtès <ludo@gnu.org> wrote:
>>>>>>> "Thompson, David" <dthompson2@worcester.edu> skribis:
>>>>>>>
>>>>>>>> Yeah, our daemon would do the same thing. We could maybe even have a
>>>>>>>> little Guile library that allows one to evaluate arbitrary scheme code
>>>>>>>> from within the container. :)
>>>>>>>
>>>>>>> Actually, something quite easily feasible would be this:
>>>>>>>
>>>>>>> (eval-in-container #~(system* #$evil-program
>>>>>>>                               #$(local-file "important-data.txt"))
>>>>>>>                    #:networking? #f)
>>>>>>>
>>>>>>> ... where the container’s store would be populated with just
>>>>>>> EVIL-PROGRAM and the local file.
>>>>>>>
>>>>>>> Food for thought...
>>>>>>
>>>>>> Ooooh yeah! That would be cool. Though I think we should still spawn
>>>>>> a dmd process as PID 1 to deal with reaping zombie processes. We
>>>>>> could generate a single service that runs the gexp script. How does
>>>>>> that sound?
>>>>>
>>>>> Wouldn’t it be enough to have the Guile process that evaluates the
>>>>> expression be PID 1 in the container, as is the case in guix-daemon
>>>>> containers?
>>>>
>>>> Sure, it would work, but my concern is that a long-running process on
>>>> a user's machine could create and orphan tons of child processes and
>>>> nothing would be able to clean them up until the PID namespace is
>>>> garbage collected.
>>>
>>> My understanding was that killing a container’s PID 1 (from the outside)
>>> effectively killed all the processes of that PID name space. Isn’t it
>>> the case?
>>
>> Yes, that is the case. That triggers the "garbage collection" of that
>> namespace, if you will. My point is that, without a proper PID 1 that
>> can DTRT with orphaned processes, a long running process in a
>> container could potentially create a ton of orphaned child processes
>> with no way for them to be reaped without killing PID 1. I wouldn't
>> be very happy if a program that I was running in a sandbox was
>> polluting the process list. I don't think this is a concern for the
>> build daemon because the build process is a (relatively) short-lived
>> process, but running something like a web browser could go on for
>> days, weeks, etc.
>
> Yes, I understand. This is definitely an important concern for full
> GuixSD containers.
>
> However, ‘eval-in-container’ would be much simpler, synchronous, and
> typically for short-lived processes. So I guess the process that runs
> ‘eval-in-container’ would clone(2) (via ‘call-with-container’) and
> simply waitpid(2) the child process (which is PID 1 in its container).
>
> When the parent process gets a SIGINT or SIGHUP, it could send SIGKILL
> to the child, thereby terminating the container.
>
> Does that make sense?
Yes, crystal clear now. Thanks for bearing with me.
>>> (The daemon works around that by running processes under a separate UID
>>> and doing kill(-1, SIGKILL) under that UID.)
>>
>> So, PID 1 in the build container forks and changes the UID or
>> something?
>
> Yes, with setuid (see build.cc:2180.)
Awesome, thank you.
My current container work is figuring out how to spawn interactive
processes in a container, such as bash or a Guile REPL. Seems I need
to learn how to make a pty and maybe do some dup/dup2 calls to pipe
stdin in the parent process to the child container process. Any
wisdom you have (or anyone else reading this) would be most welcome.
:)
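
As a starting point for the pty question, here is a minimal sketch in
Python (not Guile, and with no container involved). The hypothetical
`run_in_pty` relies on `pty.fork` from the standard library, which
allocates a pseudo-terminal, forks, and makes the slave end the child's
controlling terminal and its stdin, stdout, and stderr.

```python
import os
import pty


def run_in_pty(argv):
    """Run argv attached to a fresh pseudo-terminal and return its output as bytes."""
    pid, master_fd = pty.fork()
    if pid == 0:
        # Child: file descriptors 0-2 already point at the pty's slave end.
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed
    chunks = []
    while True:
        try:
            data = os.read(master_fd, 1024)
        except OSError:
            break  # Linux reports EIO once the slave end has been closed
        if not data:
            break  # other systems signal the same condition with EOF
        chunks.append(data)
    os.close(master_fd)
    os.waitpid(pid, 0)
    return b"".join(chunks)


if __name__ == "__main__":
    # "test -t 0" succeeds only when stdin is a terminal.
    print(run_in_pty(["sh", "-c", "test -t 0 && echo stdin is a tty"]))
```

An interactive session would additionally copy bytes from the parent's
own stdin to `master_fd` and back, typically with a select() loop and
the parent's terminal switched to raw mode. Python's `pty.spawn` does
that copying for a simple case.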
- Dave