all messages for Guix-related lists mirrored at yhetil.org
* shepherd service works on host but fails inside system container
@ 2023-03-22 10:20 Vladilen Kozin
  2023-03-22 12:14 ` Vladilen Kozin
  2023-03-27  8:39 ` Attila Lendvai
  0 siblings, 2 replies; 4+ messages in thread
From: Vladilen Kozin @ 2023-03-22 10:20 UTC (permalink / raw)
  To: guix-devel

Hello guix.

I put together a tailscale system service that's meant to start a tailscale
daemon managed by the system shepherd, that is to say that my
`tailscaled-service-type` specifies `(service-extension
shepherd-root-service-type tailscaled-shepherd-service)`, where
`tailscaled-shepherd-service` creates a `shepherd-service` with (provision
'(tailscaled)) and (requirement '(networking)).

I tested it by lowering to store via `shepherd-service-file` and then
loading the generated script via `sudo herd load root ...`. This works fine
and the daemon starts without a problem.
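
As a rough sketch of that manual loop, the shell function below loads an already-lowered service file into the running shepherd, then starts and inspects the service. The function name `reload_service` and the `HERD` override variable are made up for illustration; only `herd load root`, `herd start` and `herd status` are real herd invocations, and the file argument is whatever store path the lowering step produced.

```shell
#!/bin/sh
# Hypothetical helper: load a service file into shepherd, then start it.
# HERD is an assumed override so the command can be swapped out (e.g. for a
# dry run); by default herd is run through sudo, as in the message above.
reload_service() {
    file=$1 name=$2
    ${HERD:-sudo herd} load root "$file" &&
    ${HERD:-sudo herd} start "$name" &&
    ${HERD:-sudo herd} status "$name"
}
```

Calling `HERD=echo reload_service FILE tailscaled` prints the three herd commands instead of running them, which is handy for checking the arguments first.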

Next, I try to spawn tailscaled as part of my OS definition:
(services (cons* (service tailscaled-service-type
(tailscaled-configuration)) %base-services))
;; tried %desktop-services too

To test, we create a container:
sudo guix system -K -L /home/vlad/Code/fullmeta-guix/channel container
os.scm --network --expose=/dev/net=/dev

Earlier runs had it complaining that /dev/net/tun was missing, so I exposed
that. Dunno if that's how I'm supposed to handle this. Now
/var/log/messages shows:

Mar 22 09:38:48 twgter shepherd[1]: [tailscaled] 2023/03/22 09:38:48 Linux
kernel version: 5.18.10
Mar 22 09:38:48 twgter shepherd[1]: [tailscaled] 2023/03/22 09:38:48 is
CONFIG_TUN enabled in your kernel? `modprobe tun` failed with:
Mar 22 09:38:48 twgter shepherd[1]: [tailscaled] 2023/03/22 09:38:48
wgengine.NewUserspaceEngine(tun "tailscale0") error:
tstun.New("tailscale0"): operation not permitted
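
A quick pre-flight check can at least separate "the device node is missing" from "the node exists but opening it is refused". The sketch below only looks at the node itself; the function name and the three status words are illustrative. A result of `present` says nothing about whether the process is allowed to open the device, which is the "operation not permitted" case in the log above.

```shell
#!/bin/sh
# Classify a path the way a pre-flight check for /dev/net/tun might.
# tun_status and its status words are hypothetical, not part of any tool.
tun_status() {
    path=${1:-/dev/net/tun}
    if [ ! -e "$path" ]; then
        echo missing
    elif [ ! -c "$path" ]; then
        echo not-a-character-device
    else
        echo present
    fi
}

# Check the default location; run this inside the container shell as well.
tun_status /dev/net/tun
```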

I feel like maybe I'm missing some kernel modules, but I would've expected
host and container to share the kernel, so I dunno. In fact, when I
randomly tried adding (kernel-arguments (cons* "CONFIG_TUN=m"
%default-kernel-arguments)) to my os definition, the resulting script hash came
out the same, which tells me containers don't even look at these kernel
params when generating a script.

Any guesses as to why this works under host but not inside container?

Relatedly, does anyone have a nicer workflow they use to define and test
shepherd services? Such containerization was the next step in testing the
service and would've been ok were it not for the above failure, but the
initial indirection with lowering to store, then `sudo herd load root ...`
is a bit too involved and "indirect" for my liking as well. Does anyone have
an improved way of developing shepherd services?

Thanks!
-- 
Best regards
Vlad Kozin


* Re: shepherd service works on host but fails inside system container
  2023-03-22 10:20 shepherd service works on host but fails inside system container Vladilen Kozin
@ 2023-03-22 12:14 ` Vladilen Kozin
  2023-03-22 13:11   ` Vladilen Kozin
  2023-03-27  8:39 ` Attila Lendvai
  1 sibling, 1 reply; 4+ messages in thread
From: Vladilen Kozin @ 2023-03-22 12:14 UTC (permalink / raw)
  To: guix-devel

I now have a hypothesis as to what's happening. Could someone confirm or
disprove it, and maybe suggest a solution or point at existing workarounds?

Host and container share the exact same kernel, the one that's
already running, so the above has nothing to do with kernel modules or
settings. Fwiw I figured out a way to find where kernel modules reside by doing:

$ sudo dmesg | grep -i "kernel command line"
which shows where the current system is inside the store; the relevant
/lib/modules will be under it. We could then
--expose=/gnu/store/hash-system/lib/modules=/lib/modules if we wanted to.
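
The same information is available without sudo from /proc/cmdline. A minimal sketch, assuming the boot entry passes the system path in a parameter spelled `gnu.system=` (adjust the prefix to whatever your own command line shows); the function name is hypothetical:

```shell
#!/bin/sh
# Print the value of the gnu.system= parameter found in a kernel command
# line. The parameter spelling is an assumption; check /proc/cmdline.
system_from_cmdline() {
    for word in $1; do
        case $word in
            gnu.system=*) printf '%s\n' "${word#gnu.system=}" ;;
        esac
    done
}

# Show the booted system's store path, if the parameter is present.
if [ -r /proc/cmdline ]; then
    system_from_cmdline "$(cat /proc/cmdline)"
fi
```

Listing the printed directory shows where the kernel and its modules actually live before anything is passed to --expose.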

The real problem, IIUC, is with capabilities. The notion of a "container" can be
misleading and evokes thoughts of a "vm" when in practice it's just a process
with some isolation applied to it. So, presently I'm guessing the container's
Shepherd may be PID 1 inside its isolated environment, but from the host's pov
it is just a process, and one that unlike our host's shepherd may lack
certain capabilities and privileges to e.g. create new devices or load
kernel modules on request, etc. In the sense of
https://man7.org/linux/man-pages/man7/capabilities.7.html maybe?
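
One way to probe this hypothesis is to compare the effective capability mask of a shell on the host with one inside the container. The sketch below decodes the `CapEff` field of /proc/PID/status; the helper names are made up, and the bit numbers come from <linux/capability.h>. One caveat: the mask is relative to the process's own user namespace, so a full mask inside a container does not by itself grant privileges over resources owned by the host, such as a shared network namespace.

```shell
#!/bin/sh
# Succeed if capability bit $2 is set in the hexadecimal mask $1.
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# Print the effective capability mask (CapEff) of the current shell.
current_capeff() {
    awk '/^CapEff:/ { print $2 }' "/proc/$$/status"
}

# Report a few capabilities relevant to TUN devices and module loading.
# Bit numbers are taken from <linux/capability.h>.
report_caps() {
    mask=$1
    for entry in CAP_NET_ADMIN:12 CAP_SYS_MODULE:16 CAP_MKNOD:27; do
        name=${entry%%:*} bit=${entry##*:}
        if has_cap "$mask" "$bit"; then
            echo "$name yes"
        else
            echo "$name no"
        fi
    done
}

if [ -r "/proc/$$/status" ]; then
    report_caps "$(current_capeff)"
fi
```

Running it once as root on the host and once in a shell inside the container gives two reports to compare.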

Am I on the right track? But then, how does one test services like this
that may require the ability to modify devices etc.? Have we "outgrown"
the container, and ought we to use `guix system vm` for such services? Or is
there a way to bless the container's shepherd with the necessary capabilities?
If not from the `guix system container` command line, then perhaps by dropping
down to the underlying programmatic interface, i.e. whatever `guix system
container` ends up calling to containerize a system?

Thanks


-- 
Best regards
Vlad Kozin


* Re: shepherd service works on host but fails inside system container
  2023-03-22 12:14 ` Vladilen Kozin
@ 2023-03-22 13:11   ` Vladilen Kozin
  0 siblings, 0 replies; 4+ messages in thread
From: Vladilen Kozin @ 2023-03-22 13:11 UTC (permalink / raw)
  To: guix-devel

I guess capabilities aren't handled by container creation code yet:
https://github.com/guix-mirror/guix/blob/master/gnu/build/linux-container.scm#L262


IIUC this would also affect the idea of isolating system services as
described in
https://guix.gnu.org/en/blog/2017/running-system-services-in-containers/.

It may require a deeper understanding of Linux caps, namespaces and how it is
all handled by Guix code, I suppose. Not something I'll be able to pull off
without hand-holding from core devs, sigh.

-- 
Best regards
Vlad Kozin


* Re: shepherd service works on host but fails inside system container
  2023-03-22 10:20 shepherd service works on host but fails inside system container Vladilen Kozin
  2023-03-22 12:14 ` Vladilen Kozin
@ 2023-03-27  8:39 ` Attila Lendvai
  1 sibling, 0 replies; 4+ messages in thread
From: Attila Lendvai @ 2023-03-27  8:39 UTC (permalink / raw)
  To: Vladilen Kozin; +Cc: guix-devel

> Relatedly, does anyone have a nicer workflow they use to define and
> test shepherd services?

i'm not sure it's a nicer workflow, but i'm mimicking the Guix tests:

https://github.com/attila-lendvai/guix-crypto/blob/main/tests/swarm-tests.scm#L19

it is based on `guix system vm` and the testing is by manually looking at stuff in the VM (you get a VM console in your terminal). the startup time is in the ballpark of 20-30 secs on my laptop.
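
a minimal sketch of that loop, assuming the OS definition lives in a file such as os.scm: `guix system vm` builds the VM and prints the path of a launcher script, which is then executed. the wrapper name `run_vm` and the `GUIX` override variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper around the `guix system vm` loop described above.
# GUIX is an assumed override for the guix executable.
run_vm() {
    # `guix system vm` builds the VM and prints the launcher script's path.
    script=$("${GUIX:-guix}" system vm "$@") || return 1
    "$script"
}
```

usage would be along the lines of `run_vm os.scm`; any extra arguments are passed through to `guix system vm`.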

adding automated tests to this is i think possible.

let us know if you have a faster cycle.

-- 
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“Since no individual acting separately can lawfully use force to destroy the rights of others, does it not logically follow that the same principle also applies to the common force that is nothing more than the organized combination of the individual forces?”
	— Frédéric Bastiat (1801–1850), 'The Law' (1850)


