unofficial mirror of help-guix@gnu.org 
* Nss libraries not found when using guix pack
@ 2022-03-02  9:46 Jean-Christophe HAESSIG
  2022-03-02 12:20 ` Jean-Christophe HAESSIG
  0 siblings, 1 reply; 7+ messages in thread
From: Jean-Christophe HAESSIG @ 2022-03-02  9:46 UTC (permalink / raw)
  To: help-guix@gnu.org

Hi,

I tried to deploy Slurm using guix pack:
guix pack -R -S /sbin=sbin -S /bin=bin slurm@19.05.8 nss-pam-ldapd sssd

User and authentication data come from LDAP (via sssd). The libraries are
present in the pack, but the Slurm binary does not find them and fails with an
invalid user error.

Excerpt of strace:
[pid 22647] openat(AT_FDCWD, 
"/gnu/store/rkimbfkypl6gp6sjdjrv1lm1cn3q9xfa-slurm-19.05.8/etc/ld.so.cache", 
O_RDONLY|O_CLOEXEC) = 3
[pid 22647] newfstatat(3, "", {st_mode=S_IFREG|0444, st_size=8796, ...}, 
AT_EMPTY_PATH) = 0
[pid 22647] mmap(NULL, 8796, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fadd528e000
[pid 22647] close(3)                    = 0
[pid 22647] openat(AT_FDCWD, 
"/gnu/store/5h2w4qi9hk1qzzgi1w83220ydslinr4s-glibc-2.33/lib/glibc-hwcaps/x86-64-v2/libnss_ldap.so.2", 
O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid 22647] newfstatat(AT_FDCWD, 
"/gnu/store/5h2w4qi9hk1qzzgi1w83220ydslinr4s-glibc-2.33/lib/glibc-hwcaps/x86-64-v2", 
0x7ffd828d0b60, 0) = -1 ENOENT (No such file or directory)
[pid 22647] openat(AT_FDCWD, 
"/gnu/store/5h2w4qi9hk1qzzgi1w83220ydslinr4s-glibc-2.33/lib/tls/x86_64/x86_64/libnss_ldap.so.2", 
O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid 22647] newfstatat(AT_FDCWD, 
"/gnu/store/5h2w4qi9hk1qzzgi1w83220ydslinr4s-glibc-2.33/lib/tls/x86_64/x86_64", 
0x7ffd828d0b60, 0) = -1 ENOENT (No such file or directory)
[pid 22647] openat(AT_FDCWD, 
"/gnu/store/5h2w4qi9hk1qzzgi1w83220ydslinr4s-glibc-2.33/lib/tls/x86_64/libnss_ldap.so.2", 
O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid 22647] newfstatat(AT_FDCWD, 
"/gnu/store/5h2w4qi9hk1qzzgi1w83220ydslinr4s-glibc-2.33/lib/tls/x86_64", 
0x7ffd828d0b60, 0) = -1 ENOENT (No such file or directory)
[pid 22647] openat(AT_FDCWD, 
"/gnu/store/5h2w4qi9hk1qzzgi1w83220ydslinr4s-glibc-2.33/lib/tls/x86_64/libnss_ldap.so.2", 
O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid 22647] newfstatat(AT_FDCWD, 
"/gnu/store/5h2w4qi9hk1qzzgi1w83220ydslinr4s-glibc-2.33/lib/tls/x86_64", 
0x7ffd828d0b60, 0) = -1 ENOENT (No such file or directory)
[pid 22647] openat(AT_FDCWD, 
"/gnu/store/5h2w4qi9hk1qzzgi1w83220ydslinr4s-glibc-2.33/lib/tls/libnss_ldap.so.2", 
O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid 22647] newfstatat(AT_FDCWD, 
"/gnu/store/5h2w4qi9hk1qzzgi1w83220ydslinr4s-glibc-2.33/lib/tls", 
0x7ffd828d0b60, 0) = -1 ENOENT (No such file or directory)
[pid 22647] openat(AT_FDCWD, 
"/gnu/store/5h2w4qi9hk1qzzgi1w83220ydslinr4s-glibc-2.33/lib/x86_64/x86_64/libnss_ldap.so.2", 
O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid 22647] newfstatat(AT_FDCWD, 
"/gnu/store/5h2w4qi9hk1qzzgi1w83220ydslinr4s-glibc-2.33/lib/x86_64/x86_64", 
0x7ffd828d0b60, 0) = -1 ENOENT (No such file or directory)
[pid 22647] openat(AT_FDCWD, 
"/gnu/store/5h2w4qi9hk1qzzgi1w83220ydslinr4s-glibc-2.33/lib/x86_64/libnss_ldap.so.2", 
O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid 22647] newfstatat(AT_FDCWD, 
"/gnu/store/5h2w4qi9hk1qzzgi1w83220ydslinr4s-glibc-2.33/lib/x86_64", 
0x7ffd828d0b60, 0) = -1 ENOENT (No such file or directory)
[pid 22647] openat(AT_FDCWD, 
"/gnu/store/5h2w4qi9hk1qzzgi1w83220ydslinr4s-glibc-2.33/lib/x86_64/libnss_ldap.so.2", 
O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid 22647] newfstatat(AT_FDCWD, 
"/gnu/store/5h2w4qi9hk1qzzgi1w83220ydslinr4s-glibc-2.33/lib/x86_64", 
0x7ffd828d0b60, 0) = -1 ENOENT (No such file or directory)
[pid 22647] openat(AT_FDCWD, 
"/gnu/store/5h2w4qi9hk1qzzgi1w83220ydslinr4s-glibc-2.33/lib/libnss_ldap.so.2", 
O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid 22647] newfstatat(AT_FDCWD, 
"/gnu/store/5h2w4qi9hk1qzzgi1w83220ydslinr4s-glibc-2.33/lib", 
{st_mode=S_IFDIR|0555, st_size=4096, ...}, 0) = 0
[pid 22647] munmap(0x7fadd528e000, 8796) = 0
[pid 22647] openat(AT_FDCWD, 
"/gnu/store/rkimbfkypl6gp6sjdjrv1lm1cn3q9xfa-slurm-19.05.8/etc/ld.so.cache", 
O_RDONLY|O_CLOEXEC) = 3
[pid 22647] newfstatat(3, "", {st_mode=S_IFREG|0444, st_size=8796, ...}, 
AT_EMPTY_PATH) = 0
[pid 22647] mmap(NULL, 8796, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fadd528e000
[pid 22647] close(3)                    = 0
[pid 22647] openat(AT_FDCWD, 
"/gnu/store/5h2w4qi9hk1qzzgi1w83220ydslinr4s-glibc-2.33/lib/libnss_sss.so.2", 
O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid 22647] munmap(0x7fadd528e000, 8796) = 0
[pid 22647] poll([{fd=2, events=POLLOUT}], 1, 5000) = 1 ([{fd=2, 
revents=POLLOUT}])
[pid 22647] newfstatat(2, "", {st_mode=S_IFCHR|0620, 
st_rdev=makedev(136, 0), ...}, AT_EMPTY_PATH) = 0
[pid 22647] write(2, "slurmdbd: fatal: Invalid user fo"..., 62slurmdbd:
fatal: Invalid user for SlurmUser XXXXXX, ignored

-------------------

The library exists:
$ find /opt/ -iname 'libnss_sss.so*'
/opt/slurm/gnu/store/g9zf74246si96vhp47cyhkby89gj38py-sssd-1.16.5/lib/libnss_sss.so.2
/opt/slurm/gnu/store/nz0apgr4bxfmd77iijp8jn264c65z4s4-profile/lib/libnss_sss.so.2

------------------

but it is not listed in
/gnu/store/rkimbfkypl6gp6sjdjrv1lm1cn3q9xfa-slurm-19.05.8/etc/ld.so.cache

How can this be fixed properly?

Thank you,
J.C.H
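
For readers following the trace above: glibc turns each service name into a
plugin file name (ldap gives libnss_ldap.so.2, sss gives libnss_sss.so.2) and
then looks for that file along the loader's search path, which in the trace
only reaches the glibc store directory. Below is a minimal Python sketch of
that lookup, useful for checking which directories would satisfy it. The
function names are illustrative and are not part of glibc or Guix; the
directory is the profile path from the find output above.

```python
import os

def nss_plugin_name(service):
    """File name glibc looks for, given a service name such as 'sss'."""
    return f"libnss_{service}.so.2"

def find_nss_plugin(service, search_dirs):
    """Return the first path in search_dirs that holds the plugin, else None."""
    name = nss_plugin_name(service)
    for directory in search_dirs:
        candidate = os.path.join(directory, name)
        if os.path.exists(candidate):
            return candidate
    return None

# Profile directory taken from the find output above; adjust to your pack.
dirs = ["/opt/slurm/gnu/store/nz0apgr4bxfmd77iijp8jn264c65z4s4-profile/lib"]
for service in ("ldap", "sss"):
    print(service, "->", find_nss_plugin(service, dirs))
```

A result of None for a directory that the loader actually searches
corresponds to the ENOENT lines in the trace.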

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Nss libraries not found when using guix pack
  2022-03-02  9:46 Nss libraries not found when using guix pack Jean-Christophe HAESSIG
@ 2022-03-02 12:20 ` Jean-Christophe HAESSIG
  2022-03-08 10:40   ` Ludovic Courtès
  0 siblings, 1 reply; 7+ messages in thread
From: Jean-Christophe HAESSIG @ 2022-03-02 12:20 UTC (permalink / raw)
  To: help-guix@gnu.org

On 02/03/2022 10:46, Jean-Christophe HAESSIG wrote:

> I tried to deploy Slurm using guix pack :
> guix pack -R -S /sbin=sbin -S /bin=bin slurm@19.05.8 nss-pam-ldapd sssd
> 
> User and authentication data comes from ldap (sssd). The libraries are
> present but the Slurm binary does not find them, and fails with an
> invalid user error.

I jumped that hurdle with LD_PRELOAD, but this is of course not an
acceptable fix. However, it only led me to realize that Slurm in Guix is
compiled without MySQL support, so I'll need to change the package,
which I have never done.

I wanted to use Slurm from Guix because Debian does not provide every 
possible Slurm version. This can be a problem when a Slurm cluster must 
be upgraded without shutting it down completely. I hoped to gain some 
independence from my host distribution but it appears that won't be so 
simple...

JCH


* Re: Nss libraries not found when using guix pack
  2022-03-02 12:20 ` Jean-Christophe HAESSIG
@ 2022-03-08 10:40   ` Ludovic Courtès
  2022-03-15 10:16     ` Packaging Slurm (Was: Nss libraries not found when using guix pack) Jean-Christophe HAESSIG
  0 siblings, 1 reply; 7+ messages in thread
From: Ludovic Courtès @ 2022-03-08 10:40 UTC (permalink / raw)
  To: Jean-Christophe HAESSIG; +Cc: help-guix@gnu.org

Salut Jean-Christophe,  :-)

Jean-Christophe HAESSIG <haessigj@igbmc.fr> writes:

> On 02/03/2022 10:46, Jean-Christophe HAESSIG wrote:
>
>> I tried to deploy Slurm using guix pack :
>> guix pack -R -S /sbin=sbin -S /bin=bin slurm@19.05.8 nss-pam-ldapd sssd
>> 
>> User and authentication data comes from ldap (sssd). The libraries are
>> present but the Slurm binary does not find them, and fails with an
>> invalid user error.
>
> I jumped that hurdle with LD_PRELOAD, but this is not an acceptable fix 
> of course.

Yeah, I did something similar in the past:

  https://lists.gnu.org/archive/html/guix-devel/2020-08/msg00168.html

Maybe we could have a package transformation option, say
‘--with-nss-plugins=…’, that would wrap binaries to have LD_LIBRARY_PATH
pointing to the chosen NSS plugins.

Not pretty, but I’m afraid this is hardly avoidable.

Thoughts?
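
To make the wrapper idea concrete, here is a minimal Python sketch of what
such a wrapper could do: prepend the directory holding the NSS plugins to
LD_LIBRARY_PATH, then exec the real binary. This is only an illustration
under assumptions: no ‘--with-nss-plugins’ option exists, and the function
names and the example paths in the comment are hypothetical.

```python
import os

def env_with_nss_plugins(env, plugin_dir):
    """Return a copy of env with plugin_dir prepended to LD_LIBRARY_PATH."""
    new_env = dict(env)
    current = new_env.get("LD_LIBRARY_PATH", "")
    new_env["LD_LIBRARY_PATH"] = f"{plugin_dir}:{current}" if current else plugin_dir
    return new_env

def exec_wrapped(program, args, plugin_dir):
    """Replace the current process with program, plugin directory added."""
    os.execve(program, [program, *args],
              env_with_nss_plugins(os.environ, plugin_dir))

# Hypothetical call (does not return on success):
#   exec_wrapped("/opt/slurm/sbin/slurmdbd", ["-D"], "/path/to/profile/lib")
```

One known cost of this approach is that the variable is inherited by every
child process the wrapped program starts.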

> However, I did that only to realize that Slurm in guix is compiled
> without mysql support, so I'll need to change the package, which I
> have never done.

This would be a welcome change, though it would have a noticeable impact
on the closure size:

--8<---------------cut here---------------start------------->8---
$ guix size slurm |tail -1
total: 134.7 MiB
$ guix size slurm mariadb |tail -1
total: 421.4 MiB
--8<---------------cut here---------------end--------------->8---
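
As a rough reading of those two figures, assuming the combined closure of
slurm and mariadb approximates what a MySQL-enabled slurm would pull in, the
extra cost works out as follows (variable names are illustrative):

```python
# Closure sizes reported by `guix size` above, in MiB.
slurm_mib = 134.7
slurm_plus_mariadb_mib = 421.4

extra_mib = slurm_plus_mariadb_mib - slurm_mib  # additional closure size
ratio = slurm_plus_mariadb_mib / slurm_mib      # growth factor

print(f"extra: {extra_mib:.1f} MiB, factor: {ratio:.2f}x")
```

That is close to 287 MiB more, or roughly a threefold increase.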

> I wanted to use Slurm from Guix because Debian does not provide every 
> possible Slurm version. This can be a problem when a Slurm cluster must 
> be upgraded without shutting it down completely. I hoped to gain some 
> independence from my host distribution but it appears that won't be so 
> simple...

Interesting.  From our earlier discussion, this sounds like quite an
endeavor, but I’d be curious to know what the stumbling blocks are and
how we can overcome them!

Thanks,
Ludo’.



* Packaging Slurm (Was: Nss libraries not found when using guix pack)
  2022-03-08 10:40   ` Ludovic Courtès
@ 2022-03-15 10:16     ` Jean-Christophe HAESSIG
  2022-03-17 18:25       ` Packaging Slurm Ludovic Courtès
  0 siblings, 1 reply; 7+ messages in thread
From: Jean-Christophe HAESSIG @ 2022-03-15 10:16 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: help-guix@gnu.org

On 08/03/2022 11:40, Ludovic Courtès wrote:
> Salut Jean-Christophe,  :-)
Salut,

>> I jumped that hurdle with LD_PRELOAD, but this is not an acceptable fix
>> of course.
> 
> Yeah, I did something similar in the past:
> 
>    https://lists.gnu.org/archive/html/guix-devel/2020-08/msg00168.html
> 
> Maybe we could have a package transformation option, say
> ‘--with-nss-plugins=…’, that would wrap binaries to have LD_LIBRARY_PATH
> pointing to the chosen NSS plugins.
> 
> Not pretty, but I’m afraid this is hardly avoidable.
> 
> Thoughts?

I don't really know what the implications of this would be. I continued
exploring packaging Slurm with Guix and deploying it on Debian.
I feel that what I'm trying to do is slightly outside Guix's intended scope:
I used guix pack with the -R and -RR options, but these are meant to
let regular users run software from Guix packages. When the
software is intended to be run by root, things seem to go awry. I got
errors because the program tries to switch user and groups.

--------------
mount("none", "/tmp/guix-exec-C6ZnPc", "tmpfs", 0, NULL) = 0
clone(child_stack=NULL, flags=CLONE_NEWNS|CLONE_NEWUSER|SIGCHLD) = 4061
openat(AT_FDCWD, "/proc/4061/setgroups", O_WRONLY) = 3
write(3, "deny\0", 5)                   = 5
close(3)                                = 0
getuid()                                = 0
--------------

and later :

--------------
[pid  4061] newfstatat(5, "", {st_mode=S_IFREG|0644, st_size=10406312, 
...}, AT_EMPTY_PATH) = 0
[pid  4061] setgroups(2, [3000, 51692]) = -1 EPERM (Operation not permitted)
[pid  4061] poll([{fd=2, events=POLLOUT}], 1, 5000) = 1 ([{fd=2, 
revents=POLLOUT}])
[pid  4061] newfstatat(2, "", {st_mode=S_IFIFO|0600, st_size=0, ...}, 
AT_EMPTY_PATH) = 0
[pid  4061] write(2, "slurmdbd: fatal: Failed to set s"..., 89slurmdbd: 
fatal: Failed to set supplementary groups, initgroups: Operation not 
permitted
--------------

When the program is directly run with its final system user account, it 
starts correctly, still complains about not being able to fiddle with 
groups but doesn't crash:

slurmdbd: Not running as root. Can't drop supplementary groups

I only got this to work with -RR. With -R I got other permission errors
about not being able to set up subuid/subgid. The system is Debian 10.9 with
kernel 4.19. I expected container support to be readily available there, but
I don't know whether the errors come from what the program tries to do as
root, so I haven't checked thoroughly yet.

>> However, I did that only to realize that Slurm in guix is compiled
>> without mysql support, so I'll need to change the package, which I
>> have never done.

I managed to compile with MySQL support thanks to input from others; thanks to them.

> This would be a welcome change, though it would have a noticeable impact
> on the closure size:
> 
> --8<---------------cut here---------------start------------->8---
> $ guix size slurm |tail -1
> total: 134.7 MiB
> $ guix size slurm mariadb |tail -1
> total: 421.4 MiB
> --8<---------------cut here---------------end--------------->8---

I don't know if this could change anything but AFAIK mariadb is a 
dependency of slurmdbd only. Debian has separate packages for the 
accounting daemon, the controller daemon (slurmctld) and the client 
(slurmd) but there still is one source package.

Since only one host runs the dbd, not having to bundle mariadb libs on 
all the clients would reduce the bill - if it is possible to cherry-pick 
binaries like that in Guix.

>> I wanted to use Slurm from Guix because Debian does not provide every 
>> possible Slurm version. This can be a problem when a Slurm cluster must 
>> be upgraded without shutting it down completely. I hoped to gain some 
>> independence from my host distribution but it appears that won't be so 
>> simple...

> Interesting.  From our earlier discussion, this sounds like quite an
> endeavor, but I’d be curious to know what the stumbling blocks are and
> how we can overcome them!

For the time being, I'm still confident it can be done somehow, at least
temporarily to enable a smooth upgrade. There are some minor hurdles,
e.g. Debian decided to change the paths in etc, var and the like to
slurm-llnl. I managed to build several versions from git, but I'm still
blocked with 18.08, which doesn't compile because of "multiple definition
of 'opt'". The only explanation I can think of is that something in the
build environment is too recent for that Slurm version.

I guess running Guix system would remove many problems but I'm not ready 
for that and since I'm interested in the shared software use case for a 
cluster, there would still remain the "battle for /gnu/store" issue.

Thanks,
JC


* Re: Packaging Slurm
  2022-03-15 10:16     ` Packaging Slurm (Was: Nss libraries not found when using guix pack) Jean-Christophe HAESSIG
@ 2022-03-17 18:25       ` Ludovic Courtès
  2022-03-18 16:05         ` Jean-Christophe HAESSIG
  0 siblings, 1 reply; 7+ messages in thread
From: Ludovic Courtès @ 2022-03-17 18:25 UTC (permalink / raw)
  To: Jean-Christophe HAESSIG; +Cc: help-guix@gnu.org

Hello,

Jean-Christophe HAESSIG <haessigj@igbmc.fr> writes:

> I don't really know what the implications of this would be. I continued 
> exploring packaging Slurm with Guix and deploying it on Debian.
> I feel what i'm trying to do is slightly out of scope of Guix's intent : 
> I used guix pack with various options -R, -RR but these are made to 
> enable regular users to run software from guix packages. When the 
> software is intended to be run by root, things seem to go awry. I had 
> errors because the program tries to switch user and groups.
>
> --------------
> mount("none", "/tmp/guix-exec-C6ZnPc", "tmpfs", 0, NULL) = 0
> clone(child_stack=NULL, flags=CLONE_NEWNS|CLONE_NEWUSER|SIGCHLD) = 4061
> openat(AT_FDCWD, "/proc/4061/setgroups", O_WRONLY) = 3
> write(3, "deny\0", 5)                   = 5
> close(3)                                = 0
> getuid()                                = 0
> --------------
>
> and later :
>
> --------------
> [pid  4061] newfstatat(5, "", {st_mode=S_IFREG|0644, st_size=10406312, 
> ...}, AT_EMPTY_PATH) = 0
> [pid  4061] setgroups(2, [3000, 51692]) = -1 EPERM (Operation not permitted)
> [pid  4061] poll([{fd=2, events=POLLOUT}], 1, 5000) = 1 ([{fd=2, 
> revents=POLLOUT}])
> [pid  4061] newfstatat(2, "", {st_mode=S_IFIFO|0600, st_size=0, ...}, 
> AT_EMPTY_PATH) = 0
> [pid  4061] write(2, "slurmdbd: fatal: Failed to set s"..., 89slurmdbd: 
> fatal: Failed to set supplementary groups, initgroups: Operation not 
> permitted
> --------------

Can you try with:

  GUIX_EXECUTION_ENGINE=fakechroot ./bin/slurmdbd …

assuming you’re using a -RR pack?

> When the program is directly run with its final system user account, it 
> starts correctly, still complains about not being able to fiddle with 
> groups but doesn't crash:
>
> slurmdbd: Not running as root. Can't drop supplementary groups
>
> I only got this to work with -RR. -R got me other permission errors 
> about not being able to setup subuid/subgid. System is Debian 10.9 with 
> kernel 4.19. I expected containers to be well available and didn't know 
> if the errors could come from what the program tries to do as root so I 
> didn't check thoroughly yet.

Yeah, presumably things running in an unprivileged user namespace (this
is the case with -R and also with GUIX_EXECUTION_ENGINE=userns) can’t
call setgroups(2).
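
The earlier trace is consistent with this: the runtime writes "deny" to
/proc/4061/setgroups for the child in the new user namespace, and once that
policy is set, setgroups(2) fails with EPERM inside it (see
user_namespaces(7)). Here is a small Python sketch for checking that policy
from inside a process; the function names are made up for illustration.

```python
def setgroups_allowed(policy_text):
    """Interpret the contents of /proc/PID/setgroups ('allow' or 'deny')."""
    return policy_text.strip() == "allow"

def read_setgroups_policy(pid="self"):
    """Return the setgroups policy for pid, or None if it cannot be read."""
    try:
        with open(f"/proc/{pid}/setgroups") as f:
            return f.read().strip()
    except OSError:
        return None  # not Linux, no such process, or file not provided

policy = read_setgroups_policy()
if policy is None:
    print("setgroups policy: unknown")
else:
    # "allow" only means the namespace policy does not forbid the call;
    # the process still needs CAP_SETGID for setgroups(2) to succeed.
    print("setgroups policy:", policy,
          "- permitted by policy:", setgroups_allowed(policy))
```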

>> This would be a welcome change, though it would have a noticeable impact
>> on the closure size:
>> 
>> --8<---------------cut here---------------start------------->8---
>> $ guix size slurm |tail -1
>> total: 134.7 MiB
>> $ guix size slurm mariadb |tail -1
>> total: 421.4 MiB
>> --8<---------------cut here---------------end--------------->8---
>
> I don't know if this could change anything but AFAIK mariadb is a 
> dependency of slurmdbd only. Debian has separate packages for the 
> accounting daemon, the controller daemon (slurmctld) and the client 
> (slurmd) but there still is one source package.

Here we could have a separate output maybe:

  https://guix.gnu.org/manual/devel/en/html_node/Packages-with-Multiple-Outputs.html

[...]

> For the time being, I'm still confident it can be done somehow, at least 
> temporarily to enable a smooth upgrade. There are some minor hurdles 
> e.g. Debian decided to change the paths in etc, var and the like to 
> slurm-llnl. I managed to build several versions from git, I'm still 
> blocked with 18.08 which doesn't compile because of "multiple definition 
> of 'opt'". Only thing I can think of is something is too recent wrt 
> slurm version.

FWIW I recently fixed that build error in Guix:

  https://git.savannah.gnu.org/cgit/guix.git/commit/?id=dd98dc42fe8d898bbdf8b3f988120a81bb145f77

> I guess running Guix system would remove many problems but I'm not ready 
> for that and since I'm interested in the shared software use case for a 
> cluster, there would still remain the "battle for /gnu/store" issue.

Where “battle for /gnu/store” is the chicken-and-egg problem when booting,
right?  (That is, if /gnu/store is on NFS, then how do you boot?)

HTH,
Ludo’.



* Re: Packaging Slurm
  2022-03-17 18:25       ` Packaging Slurm Ludovic Courtès
@ 2022-03-18 16:05         ` Jean-Christophe HAESSIG
  2022-03-18 17:16           ` Ludovic Courtès
  0 siblings, 1 reply; 7+ messages in thread
From: Jean-Christophe HAESSIG @ 2022-03-18 16:05 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: help-guix@gnu.org

On 17/03/2022 19:25, Ludovic Courtès wrote:
> Hello,
Hi,

> Can you try with:
> 
>    GUIX_EXECUTION_ENGINE=fakechroot ./bin/slurmdbd …
> 
> assuming you’re using a -RR pack?

Yes, here is the relevant output:

# strace -f -E GUIX_EXECUTION_ENGINE=fakechroot -E 
LD_LIBRARY_PATH=/opt/slurm/gnu/store/j417whqiy5gz2rbmlnknla3wl43jgk1z-profile/lib/ 
/opt/slurm/sbin/slurmdbd -D

newfstatat(AT_FDCWD, 
"/gnu/store/ygljcnlacasf5vc164pm4dp9ysc5ddbq-slurm-mysql-19.05-19.05.8/sbin//slurmdbd", 
0x7ffdaf6e4f60, AT_SYMLINK_NOFOLLOW) = -1 ENOENT (No such file or directory)
mkdir("/tmp/guix-exec-5lyvIe", 0700)    = 0

[...]

[pid 10534] connect(6, {sa_family=AF_UNIX, 
sun_path="/tmp/guix-exec-5lyvIe/run/mysqld/mysqld.sock"}, 46) = 0
[pid 10534] fcntl(6, F_SETFL, O_RDONLY) = 0
[pid 10534] setsockopt(6, SOL_SOCKET, SO_RCVTIMEO, 
"\36\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0
[pid 10534] setsockopt(6, SOL_SOCKET, SO_SNDTIMEO, 
"\36\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0
[pid 10534] setsockopt(6, SOL_IP, IP_TOS, [8], 4) = -1 EOPNOTSUPP 
(Operation not supported)
[pid 10534] setsockopt(6, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
[pid 10534] recvfrom(6, "b\0\0\0\n5.5.5-10.1.37-MariaDB-0+deb"..., 
16384, MSG_DONTWAIT, NULL, NULL) = 102

[...]

[pid 10630] setgroups(2, [3000, 51692]) = 0
[pid 10630] getegid()                   = 0
[pid 10630] setgid(3000)                = 0
[pid 10630] getuid()                    = 0
[pid 10630] setuid(100020)              = 0
[pid 10630] prctl(PR_SET_DUMPABLE, 1)   = 0

[...]

[pid 10534] connect(7, {sa_family=AF_UNIX, 
sun_path="/tmp/guix-exec-5lyvIe/run/mysqld/mysqld.sock"}, 46) = -1 
EACCES (Permission denied)
[pid 10534] close(7)                    = 0

[...]

[pid 10534] newfstatat(2, "", {st_mode=S_IFIFO|0600, st_size=0, ...}, 
AT_EMPTY_PATH) = 0
[pid 10534] write(2, "slurmdbd: error: mysql_real_conn"..., 131slurmdbd: 
error: mysql_real_connect failed: 2002 Can't connect to local MySQL 
server through socket '/run/mysqld/mysqld.sock' (13)
) = 131
[pid 10534] poll([{fd=2, events=POLLOUT}], 1, 5000) = 1 ([{fd=2, 
revents=POLLOUT}])
[pid 10534] newfstatat(2, "", {st_mode=S_IFIFO|0600, st_size=0, ...}, 
AT_EMPTY_PATH) = 0
[pid 10534] write(2, "slurmdbd: error: Problem getting"..., 47slurmdbd: 
error: Problem getting cache of data
) = 47

[...]

[pid 10534] newfstatat(2, "", {st_mode=S_IFIFO|0600, st_size=0, ...}, 
AT_EMPTY_PATH) = 0
[pid 10534] write(2, "slurmdbd: error: unable to re-co"..., 59slurmdbd: 
error: unable to re-connect to as_mysql database
) = 59
[pid 10534] futex(0x7fa53885f910, 
FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 10535, NULL, 0xffffffff 
<unfinished ...>
[pid 10536] <... clock_nanosleep resumed> 0x7fa53875dd90) = 0
[pid 10536] clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=5, tv_nsec=0}, 
0x7fa53875dd90) = 0
[pid 10536] clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=5, tv_nsec=0}, 
0x7fa53875dd90) = 0
[pid 10536] clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=5, tv_nsec=0}, 
0x7fa53875dd90) = 0
[pid 10536] clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=5, tv_nsec=0}, 
slurmdbd: error: We need a connection to run this

The program hangs from there and is inoperative. The socket to the
database, which has mode ugo+rwx, is successfully used once; then the
program changes its groups, gid and uid, and after that it can't open the
socket anymore. I see that the /tmp/guix-exec-xxxx directory is created with
mode 0700, which means it can no longer be traversed once the uid of
the program has changed.
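
To illustrate the permission logic, here is a simplified Python model of the
classic owner/group/other check for directory search permission. It ignores
ACLs and treats uid 0 as always allowed. The function name is illustrative,
and the assumption that the directory is owned by root (uid 0, gid 0) is
inferred from the trace rather than verified.

```python
import stat

def can_traverse(mode, owner_uid, owner_gid, uid, gids):
    """Simplified check of search (x) permission on a directory."""
    if uid == 0:
        return True  # root bypasses the permission bits
    if uid == owner_uid:
        return bool(mode & stat.S_IXUSR)
    if owner_gid in gids:
        return bool(mode & stat.S_IXGRP)
    return bool(mode & stat.S_IXOTH)

# Values from the trace: the directory is created with mode 0700 while the
# process is root, then the daemon switches to uid 100020 with groups
# 3000 and 51692.
print(can_traverse(0o700, 0, 0, 0, {0}))                 # True
print(can_traverse(0o700, 0, 0, 100020, {3000, 51692}))  # False
```

The mode of the socket itself does not help here, since every directory on
the path to it must be searchable by the calling user.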


> Here we could have a separate output maybe:
> 
>    https://guix.gnu.org/manual/devel/en/html_node/Packages-with-Multiple-Outputs.html

I'll give it a try

> 
> Where “battle from /gnu/store” is the chicken-and-egg when booting,
> right?  (That is, if /gnu/store is on NFS, then how do you boot.)

Sure, it is even a problem with guix on a foreign distro, if you need 
some of the software to be managed locally and the rest by shared nfs.
My best bet would be to have guix recompiled to use an alternate 
location as the store and the statedir, have it manage the local system 
and let regular users take advantage of substitutes. I don't feel this 
would be straightforward however.

Thanks,
JCH


* Re: Packaging Slurm
  2022-03-18 16:05         ` Jean-Christophe HAESSIG
@ 2022-03-18 17:16           ` Ludovic Courtès
  0 siblings, 0 replies; 7+ messages in thread
From: Ludovic Courtès @ 2022-03-18 17:16 UTC (permalink / raw)
  To: Jean-Christophe HAESSIG; +Cc: help-guix@gnu.org

Hi!

Jean-Christophe HAESSIG <haessigj@igbmc.fr> writes:

> # strace -f -E GUIX_EXECUTION_ENGINE=fakechroot -E 
> LD_LIBRARY_PATH=/opt/slurm/gnu/store/j417whqiy5gz2rbmlnknla3wl43jgk1z-profile/lib/ 
> /opt/slurm/sbin/slurmdbd -D

[...]

> The program hangs from there and is inoperative. The socket to the 
> database, which is in mode ugo+rwx is successfully used once, then the 
> program fiddles with its groups, gid, uid and then can't open it 
> anymore. I see that the /tmp/guix-exec-xxxx directory is created with 
> 0700 rights, which means it cannot be traversed anymore when the uid of 
> the program has changed.

Wow, interesting.  Clearly relocatable packs were primarily intended
for “regular” programs (not daemons), which is why you’re making these
interesting discoveries.  :-)

Off the top of my head, I’m not sure what the solution could be.  If you
feel so inclined, you can take a look at run-in-namespace.c in Guix,
which implements those execution engines.

>> Where “battle from /gnu/store” is the chicken-and-egg when booting,
>> right?  (That is, if /gnu/store is on NFS, then how do you boot.)
>
> Sure, it is even a problem with guix on a foreign distro, if you need 
> some of the software to be managed locally and the rest by shared nfs.
> My best bet would be to have guix recompiled to use an alternate 
> location as the store and the statedir, have it manage the local system 
> and let regular users take advantage of substitutes. I don't feel this 
> would be straightforward however.

Indeed; in general, using a non-standard store directory is a bad idea,
for many reasons (no substitutes, risks of failing builds just because
of the different store directory length, etc.).

I wonder if there might be a solution where you’d overlay the local
/gnu/store on top of the NFS-mounted store, something like that.  But I
don’t have the full picture of the setup you have.

Thanks,
Ludo’.


