unofficial mirror of bug-guix@gnu.org 
* bug#41575: Container with openssh-service requires sshd user on the host
@ 2020-05-28  9:20 Edouard Klein
  2020-08-25  3:15 ` conjaroy
  0 siblings, 1 reply; 5+ messages in thread
From: Edouard Klein @ 2020-05-28  9:20 UTC
  To: 41575

Dear guix,

This is a funny one.

Consider this minimal operating system definition:
-----------
(use-modules (gnu))
(use-service-modules ssh)

(operating-system
  (host-name "MinimalSSH")
  (timezone "Europe/Paris")
  (bootloader (bootloader-configuration
               (bootloader grub-bootloader)))
  (file-systems %base-file-systems)
  (services (append (list 
                     (service openssh-service-type
                              (openssh-configuration
                               (port-number 2222))))
                    %base-services)))
-----------

If I try to create a container (with network of course):

guix system container ~/src/gendscraper/minimal_openssh.scm --network

And run the container

sudo /gnu/store/6dvy8acvzkzfba8hjf4nfc3ps2rwns5j-run-container

I get the error I pasted at the end of this email.

If, however, I create an sshd user on the host, it runs without a hitch
and I can talk to the ssh server on localhost:2222.

Funny things:
- It will run if I remove the --network (but then I can't connect to the
ssh server, of course)
- It will run if I userdel sshd, until I reboot

The nscd daemon is running on the host.

My goal with guix containers is to avoid having to configure anything
on the foreign host (apart from installing guix). Is it normal that the
sshd user has to be present on the host for the container to run the
ssh daemon?

If it is, how can I know in advance which service requires which
configuration on the host?

Thanks in advance for any help, please do not hesitate to ask for more
information about my config (Arch) if need be.

Cheers,

Edouard.

---------------
sudo /gnu/store/6dvy8acvzkzfba8hjf4nfc3ps2rwns5j-run-container
guile: warning: failed to install locale
system container is running as PID 3934
Run 'sudo guix container exec 3934 /run/current-system/profile/bin/bash --login'
or run 'sudo nsenter -a -t 3934' to get a shell into it.

making '/gnu/store/ml63vj43bv4lrmwdvpm6jqyya24z6zkr-system' the current system...
setting up setuid programs in '/run/setuid-programs'...
populating /etc from /gnu/store/a4d90ypz1xylh97ff2b4ysj33hwnmfva-etc...
Backtrace:
          12 (primitive-load "/gnu/store/6dvy8acvzkzfba8hjf4nfc3ps2r…")
In gnu/build/linux-container.scm:
    297:8 11 (call-with-temporary-directory #<procedure 7f36d0d122d0…>)
   325:16 10 (_ _)
     62:6  9 (call-with-clean-exit _)
In unknown file:
           8 (primitive-load "/gnu/store/ml63vj43bv4lrmwdvpm6jqyya24…")
In ice-9/eval.scm:
    619:8  7 (_ #f)
In unknown file:
           6 (primitive-load "/gnu/store/zdqjch5xknlhp6dvnl6vdrlfnbm…")
In srfi/srfi-1.scm:
    640:9  5 (for-each #<procedure primitive-load (_)> _)
In unknown file:
           4 (primitive-load "/gnu/store/y19c6kipzqigz15v4hvy53x2vaz…")
In gnu/build/activation.scm:
    145:2  3 (activate-users+groups _ _)
In srfi/srfi-1.scm:
    640:9  2 (for-each #<procedure make-home-directory (user)> _)
In gnu/build/activation.scm:
   115:16  1 (make-home-directory #<<user-account> name: "sshd" pass…>)
In unknown file:
           0 (getpw "sshd")

ERROR: In procedure getpw:
In procedure getpw: entry not found
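
The failing call at the bottom of the backtrace is a plain name lookup in
the passwd database. As a rough way to check from the host whether a given
name resolves, here is a minimal Python sketch. It assumes that Python's
pwd.getpwnam and Guile's getpw both end up in the C library's getpwnam, so
both go through the same name service lookup; the helper name
user_resolves is made up for this illustration.

```python
import pwd


def user_resolves(name):
    """Return True if `name` resolves through the C library's passwd lookup."""
    try:
        pwd.getpwnam(name)
    except KeyError:
        # Python reports a missing entry as KeyError; this is the situation
        # that Guile reports as "entry not found".
        return False
    return True


if __name__ == "__main__":
    # "sshd" is the account from the backtrace above.
    for name in ("root", "sshd"):
        status = "found" if user_resolves(name) else "entry not found"
        print(name, status)
```

Running it on the host before and after creating the sshd user should show
whether the host's view of the account changes.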




* bug#41575: Container with openssh-service requires sshd user on the host
  2020-05-28  9:20 bug#41575: Container with openssh-service requires sshd user on the host Edouard Klein
@ 2020-08-25  3:15 ` conjaroy
  2020-09-09  0:31   ` conjaroy
  0 siblings, 1 reply; 5+ messages in thread
From: conjaroy @ 2020-08-25  3:15 UTC
  To: 41575

I've observed this error under similar circumstances: launching a guix
system container script with network sharing enabled, on a foreign distro
(Debian 10) with nscd running.

Using `strace -f /gnu/store/...-run-container`, we can observe the
container's lookup of user accounts via the foreign distro's nscd socket:

[pid 16582] socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 11
[pid 16582] connect(11, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = 0
[pid 16582] sendto(11, "\2\0\0\0\0\0\0\0\t\0\0\0postgres\0", 21, MSG_NOSIGNAL, NULL, 0) = 21
[pid 16582] poll([{fd=11, events=POLLIN|POLLERR|POLLHUP}], 1, 5000) = 1 ([{fd=11, revents=POLLIN}])
[pid 16582] read(11, "\2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\377\377\377\377\377\377\377\377\0\0\0\0\0\0\0\0"..., 36) = 36
[pid 16582] close(11)                   = 0

Since the user ("postgres") is indeed missing in the foreign distro, the
lookup fails. In this case, disabling nscd on the foreign distro allowed
the container script to run without error.
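
To make the trace easier to read, the request and reply bytes can be
decoded with a short Python sketch. The field layout used here (three
32-bit integers for version, request type and key length, then a reply
header starting with version, found flag, two string lengths, uid and gid)
and the meaning of request type 0 as a lookup by user name are assumptions
taken from my reading of glibc's nscd client header, not something the
trace itself states. Little-endian byte order is also assumed.

```python
import struct

# Bytes copied from the sendto() call in the strace output.
REQUEST = b"\2\0\0\0\0\0\0\0\t\0\0\0postgres\0"

# strace shows only the first 32 of the 36 reply bytes.
REPLY_PREFIX = (
    b"\2\0\0\0" b"\0\0\0\0" b"\0\0\0\0" b"\0\0\0\0"
    b"\377\377\377\377" b"\377\377\377\377" b"\0\0\0\0" b"\0\0\0\0"
)


def decode_request(data):
    """Split a request into (version, request type, key string)."""
    version, req_type, key_len = struct.unpack_from("<iii", data)
    key = data[12:12 + key_len].rstrip(b"\0").decode()
    return version, req_type, key


def decode_reply(data):
    """Decode the first six 32-bit fields of a passwd reply header."""
    names = ("version", "found", "name_len", "passwd_len", "uid", "gid")
    # uid and gid are read as signed values, so "no entry" shows up as -1.
    return dict(zip(names, struct.unpack_from("<6i", data)))


if __name__ == "__main__":
    print(decode_request(REQUEST))    # (2, 0, 'postgres')
    print(decode_reply(REPLY_PREFIX)["found"])    # 0
```

Under those assumptions the reply reads as "not found" (found is 0, uid
and gid are -1), which matches the lookup failure described above.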

Based on comments in https://issues.guix.info/issue/28128, I see that it
was a deliberate choice to bind-mount the foreign distro's nscd socket
inside the container (instead of starting a separate containerized nscd
instance). But I'm having trouble seeing why it's acceptable to leak state
from the foreign distro's user space into the container. Is there something
I'm missing?

Cheers,

Jason

* bug#41575: Container with openssh-service requires sshd user on the host
  2020-08-25  3:15 ` conjaroy
@ 2020-09-09  0:31   ` conjaroy
  2020-09-13 10:39     ` edk
  0 siblings, 1 reply; 5+ messages in thread
From: conjaroy @ 2020-09-09  0:31 UTC
  To: 41575; +Cc: edk

In an earlier bug comment [1] I corroborated that nscd was leaking
/etc/passwd information from the host OS into the Guix container, and I
wondered aloud why the container would use the host OS's nscd if there was
a risk of this happening.

I've looked into how Guix configures its own nscd [2], and it turns out
that by default it enables lookups only for `hosts` and `services` - not for
`passwd`, `group`, or `netgroup`. Presumably, then, this configuration is
sufficient for nscd to prevent the glibc compatibility issues described in
the manual [3].

After adding the following 3 lines in nscd.conf on my foreign distro
(Debian 10) and restarting nscd, my Guix system containers were able to
boot successfully while talking to the daemon:

        enable-cache            passwd          no
        enable-cache            group           no
        enable-cache            netgroup        no
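
A small Python sketch can check whether an nscd.conf already has these
three databases switched off. It only looks at `enable-cache` lines in the
format shown above. The helper names and the sample text are made up for
this illustration, and databases without an `enable-cache` line are simply
not reported, since the daemon's default is not covered in this thread.

```python
SAMPLE_CONF = """\
# hypothetical excerpt in the style of nscd.conf
        enable-cache            hosts           yes
        enable-cache            passwd          no
        enable-cache            group           no
        enable-cache            netgroup        no
"""


def cache_settings(conf_text):
    """Map each database named on an enable-cache line to True or False."""
    settings = {}
    for line in conf_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, then tokenize
        if len(fields) == 3 and fields[0] == "enable-cache":
            settings[fields[1]] = fields[2] == "yes"
    return settings


def still_enabled(conf_text, databases=("passwd", "group", "netgroup")):
    """List the account-related databases explicitly set to 'yes'."""
    settings = cache_settings(conf_text)
    return [db for db in databases if settings.get(db, False)]


if __name__ == "__main__":
    print(still_enabled(SAMPLE_CONF))  # []
```

To check a real host, pass the contents of its nscd.conf (commonly
/etc/nscd.conf, though the location is an assumption here) in place of the
sample text.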

So I think the bug here is that the Guix manual page advising the use of
nscd on a foreign distro [3] doesn't elaborate on which types of service
lookups are safe to enable in the daemon. If Guix is used only to build and
run binaries then perhaps it could use nscd for all lookups, but this is
evidently not the case for Guix system containers.


Cheers,

Jason


[1] https://www.mail-archive.com/bug-guix@gnu.org/msg19915.html
[2] https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/services/base.scm?h=version-1.1.0#n1238
[3] https://guix.gnu.org/manual/en/html_node/Application-Setup.html

* bug#41575: Container with openssh-service requires sshd user on the host
  2020-09-09  0:31   ` conjaroy
@ 2020-09-13 10:39     ` edk
  2020-09-13 15:08       ` conjaroy
  0 siblings, 1 reply; 5+ messages in thread
From: edk @ 2020-09-13 10:39 UTC
  To: conjaroy; +Cc: 41575

Thank you for this thorough investigation and for finding the
workaround!

I just submitted a patch to the doc based on your email.

Cheers,

Edouard.

* bug#41575: Container with openssh-service requires sshd user on the host
  2020-09-13 10:39     ` edk
@ 2020-09-13 15:08       ` conjaroy
  0 siblings, 0 replies; 5+ messages in thread
From: conjaroy @ 2020-09-13 15:08 UTC
  To: edk; +Cc: 41575

My pleasure, Edouard. Thanks for the doc update!

Jason
