On Sun, Sep 13, 2020 at 6:39 AM <edk@beaver-labs.com> wrote:

Thank you for this thorough investigation and for finding the
workaround!

I just submitted a patch to the doc based on your email.

Cheers,

Edouard.

conjaroy writes:

> In an earlier bug comment [1] I corroborated that nscd was leaking
> /etc/passwd information from the host OS into the Guix container, and I
> wondered aloud why the container would use the host OS's nscd if there was
> a risk of this happening.
>
> I've looked into how Guix configures its own nscd [2], and it turns out that by
> default it enables lookups only for `hosts` and `services` - not for
> `passwd`, `group`, or `netgroup`. Presumably, then, this configuration is
> sufficient for nscd to prevent the glibc compatibility issues described in
> the manual [3].
>
> After adding the following 3 lines to nscd.conf on my foreign distro
> (Debian 10) and restarting nscd, my Guix system containers were able to
> boot successfully while talking to the daemon:
>
> enable-cache passwd no
> enable-cache group no
> enable-cache netgroup no
>
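
For anyone who wants to check a foreign distro's configuration before booting a container, here is a minimal sketch in Python that reports which of the three databases named above still have caching explicitly enabled. The function names are hypothetical, and the parsing assumes the usual `enable-cache <service> <yes|no>` line format with `#` comments; it only looks at explicit lines and makes no claim about what nscd does when a database is not mentioned at all.

```python
# The three databases that the workaround above switches off.
RISKY_SERVICES = ("passwd", "group", "netgroup")


def enabled_caches(conf_text):
    """Return the services that have an explicit 'enable-cache <service> yes'."""
    enabled = set()
    for raw in conf_text.splitlines():
        # Drop trailing comments, then split on whitespace.
        fields = raw.split("#", 1)[0].split()
        if len(fields) == 3 and fields[0] == "enable-cache":
            service, value = fields[1], fields[2]
            if value == "yes":
                enabled.add(service)
            else:
                # A later 'no' overrides an earlier 'yes' in this sketch.
                enabled.discard(service)
    return enabled


def risky_caches(conf_text):
    """Return the sorted subset of RISKY_SERVICES that is still enabled."""
    return sorted(enabled_caches(conf_text) & set(RISKY_SERVICES))


if __name__ == "__main__":
    sample = """
    # hypothetical excerpt, not a real distro default
    enable-cache passwd yes
    enable-cache group no
    enable-cache hosts yes
    """
    print(risky_caches(sample))
```

On a real system one would pass in the contents of the distro's nscd.conf instead of the sample string; an empty list means none of the three lookups is explicitly cached.
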
> So I think the bug here is that the Guix manual page advising the use of
> nscd on a foreign distro [3] doesn't elaborate on which types of service
> lookups are safe to enable in the daemon. If Guix is used only to build and
> run binaries then perhaps it could use nscd for all lookups, but this is
> evidently not the case for Guix system containers.
>
>
> Cheers,
>
> Jason
>
>
> [1] https://www.mail-archive.com/bug-guix@gnu.org/msg19915.html
> [2]
> https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/services/base.scm?h=version-1.1.0#n1238
> [3] https://guix.gnu.org/manual/en/html_node/Application-Setup.html
>
> On Mon, Aug 24, 2020 at 11:15 PM conjaroy <conjaroy@gmail.com> wrote:
>
>> I've observed this error under similar circumstances: launching a guix
>> system container script with network sharing enabled, on a foreign distro
>> (Debian 10) with nscd running.
>>
>> Using `strace -f /gnu/store/...-run-container`, we can observe the
>> container's lookup of user accounts via the foreign distro's nscd socket:
>>
>> [pid 16582] socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 11
>> [pid 16582] connect(11, {sa_family=AF_UNIX,
>> sun_path="/var/run/nscd/socket"}, 110) = 0
>> [pid 16582] sendto(11, "\2\0\0\0\0\0\0\0\t\0\0\0postgres\0", 21,
>> MSG_NOSIGNAL, NULL, 0) = 21
>> [pid 16582] poll([{fd=11, events=POLLIN|POLLERR|POLLHUP}], 1, 5000) = 1
>> ([{fd=11, revents=POLLIN}])
>> [pid 16582] read(11,
>> "\2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\377\377\377\377\377\377\377\377\0\0\0\0\0\0\0\0"...,
>> 36) = 36
>> [pid 16582] close(11) = 0
>>
>> Since the user ("postgres") is indeed missing in the foreign distro, the
>> lookup fails. In this case, disabling nscd on the foreign distro allowed
>> the container script to run without error.
>>
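
To make the trace easier to read, the bytes in the sendto() and read() calls can be decoded with a short Python sketch. The field layout is an assumption based on my reading of glibc's nscd client protocol (a request header of three 32-bit integers: version, request type, key length; a passwd reply starting with version and a "found" flag), and little-endian byte order is inferred from the trace itself. The function name and the request-type table are illustrative only.

```python
import struct

# Assumed request header: version, request type, key length (32-bit each).
REQUEST_HEADER = struct.Struct("<iii")

# Assumed numbering of the first few request types.
REQUEST_TYPES = {0: "GETPWBYNAME", 1: "GETPWBYUID",
                 2: "GETGRBYNAME", 3: "GETGRBYGID"}


def decode_request(payload):
    """Decode an nscd request as it appears in the strace output."""
    version, req_type, key_len = REQUEST_HEADER.unpack_from(payload)
    start = REQUEST_HEADER.size
    key = payload[start:start + key_len]
    return {
        "version": version,
        "type": REQUEST_TYPES.get(req_type, req_type),
        "key": key.rstrip(b"\0").decode(),
    }


# Bytes copied from the sendto() line in the trace.
request = b"\2\0\0\0\0\0\0\0\t\0\0\0postgres\0"

# First 8 bytes of the read() reply; the rest is left out here because
# strace truncated it.
reply_prefix = b"\2\0\0\0\0\0\0\0"
reply_version, reply_found = struct.unpack("<ii", reply_prefix)

if __name__ == "__main__":
    print(decode_request(request))
    print("reply version:", reply_version, "found:", reply_found)
```

Under these assumptions the request is a by-name passwd lookup for "postgres", and the reply's second field is 0, which is consistent with the host's nscd reporting that the user does not exist.
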
>> Based on comments in https://issues.guix.info/issue/28128, I see that it
>> was a deliberate choice to bind-mount the foreign distro's nscd socket
>> inside the container (instead of starting a separate containerized nscd
>> instance). But I'm having trouble seeing why it's acceptable to leak state
>> from the foreign distro's user space into the container. Is there something
>> I'm missing?
>>
>> Cheers,
>>
>> Jason
>>