* issues with offloading
From: Ricardo Wurmus @ 2015-02-05 10:54 UTC
To: guix-devel
Hi Guix,
I'm trying to set up offloading such that I can have a powerful build
host compile stuff for me when my workstation isn't strong enough. I
bumped into a couple of issues while doing this, prompting me to write
this email to discuss what might be changed to improve this.
* lsh required
The manual does not appear to mention that for offloading lsh is
expected to be installed on the submitting host. Since I only had
OpenSSH installed (on the local workstation and the remote server) I
decided to redefine %lsh-command and %lshg-command:
(define %lsh-command "ssh")
(define %lshg-command "ssh")
When the command in these variables does not exist there is no error
message at all. I only discovered the issue because machine-load
returned +inf.0 for every machine in the list (defined in
/etc/guix/machines.scm), so the search for a suitable machine looped
indefinitely.
Here are some recommendations:
- make %lsh-command and %lshg-command configurable or mention in the
  documentation that lsh must be available in the PATH.
- print an error message when "remote-pipe" fails due to not finding
  the command specified in %lsh-command / %lshg-command
- only run once over the machines given in /etc/guix/machines.scm
  instead of looping indefinitely, or alternatively print the reason
  for skipping a machine (e.g. by stating that machine-load is +inf.0)
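
To make the first and last points concrete, here is a rough sketch in
Guile. It is only an illustration under assumptions: find-in-path and
choose-machine are hypothetical names, not existing Guix procedures, and
the load argument merely stands in for machine-load.

```scheme
;; Hypothetical helper: return the full file name of COMMAND if a readable
;; file of that name is found in $PATH, or #f otherwise.  Note that
;; 'search-path' checks readability, not the executable bit.
(define (find-in-path command)
  (search-path (parse-path (getenv "PATH")) command))

;; Hypothetical single pass over MACHINES.  LOAD is a procedure that returns
;; the load of a machine, with +inf.0 meaning "unusable".  Return the first
;; usable machine, or #f once the list is exhausted, instead of looping.
(define (choose-machine machines load)
  (let loop ((machines machines))
    (cond ((null? machines)
           #f)
          ((inf? (load (car machines)))
           ;; State why the machine is skipped.
           (format (current-error-port)
                   "skipping ~a: machine load is +inf.0~%"
                   (car machines))
           (loop (cdr machines)))
          (else
           (car machines)))))
```

With something like this, (find-in-path "lsh") returning #f could be
turned into an error message before any machine is contacted, and a #f
result from choose-machine could be reported as "no usable machine".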
* does not work with unprivileged user
I assumed that all I needed was an SSH key for an unprivileged user on
the remote machine in order to log on to the remote build host and
talk to the local guix-daemon there. However, we actually run Guile
scripts on the remote instead of letting the privileged daemon perform
known-to-be-safe commands.
This is a problem with register-gc-root, for example. It creates a
directory in %state-directory where an unprivileged user likely has no
write permissions. This mkdir fails silently because register-gc-root
does not bother checking the result of
(false-if-exception (mkdir root-directory))
When the root-directory (e.g. /var/guix/gcroots/tmp) cannot be created
by the remote user running the Guile script, the subsequent (symlink
...) call fails.
Recommendations:
- instead of sending a script to be executed by a remote Guile process
  running as the unprivileged SSH user it may make sense to bake this
  feature into the daemon. The daemon has permissions on
  %state-directory anyway, while a regular user probably shouldn't.
- check the return value of (false-if-exception (mkdir
  root-directory)), or drop false-if-exception altogether so that the
  failure occurs right where the directory should be created, rather
  than later when the symlink into a non-existent directory cannot be
  created. This would arguably result in a clearer error message.
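
As an illustration of the second point, a minimal sketch in Guile. It is
not the actual register-gc-root code: ensure-directory is a hypothetical
name. It tolerates a directory that already exists and re-throws every
other error (a permission error, for example), so the failure surfaces
at the mkdir rather than at the symlink.

```scheme
;; Hypothetical replacement for (false-if-exception (mkdir root-directory)).
(define (ensure-directory directory)
  "Create DIRECTORY unless it already exists; re-throw any other error."
  (catch 'system-error
    (lambda ()
      (mkdir directory))
    (lambda args
      ;; ARGS is the full exception (key and arguments), which is what
      ;; 'system-error-errno' expects.
      (unless (= EEXIST (system-error-errno args))
        (apply throw args)))))
```

Called as (ensure-directory root-directory), this would raise a
system-error naming mkdir when the user lacks write permission.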
This is as far as I got. What do you think?
~~ Ricardo
* Re: issues with offloading
From: Ludovic Courtès @ 2015-02-05 22:39 UTC
To: Ricardo Wurmus; +Cc: guix-devel
Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:
> * lsh required
>
> The manual does not appear to mention that for offloading lsh is
> expected to be installed on the submitting host. Since I only had
> OpenSSH installed (on the local workstation and the remote server) I
> decided to redefine %lsh-command and %lshg-command:
>
> (define %lsh-command "ssh")
> (define %lshg-command "ssh")
That won’t work because the command-line options that are passed are
lsh-specific.
> When the command in these variables does not exist there is no error
> message at all. I only discovered the issue because machine-load
> returned +inf.0 for every machine in the list (defined in
> /etc/guix/machines.scm), so the search for a suitable machine looped
> indefinitely.
>
> Here are some recommendations:
>
> - make %lsh-command and %lshg-command configurable or mention in the
>   documentation that lsh must be available in the PATH.
Yes.
> - print an error message when "remote-pipe" fails due to not finding
>   the command specified in %lsh-command / %lshg-command
Done.
However, there’s a wip-guile-ssh branch, which ideally is the future: it
uses the Guile-SSH library instead of invoking lsh. This should improve
integration and error handling.
There were issues with old versions of Guile-SSH that have been
addressed since, so we should rebase it and see how well it works.
> - only run once over the machines given in /etc/guix/machines.scm
>   instead of looping indefinitely, or alternatively print the reason
>   for skipping a machine (e.g. by stating that machine-load is +inf.0)
Yes.
> * does not work with unprivileged user
[...]
> This is a problem with register-gc-root, for example. It creates a
> directory in %state-directory where an unprivileged user likely has no
> write permissions. This mkdir fails silently because register-gc-root
> does not bother checking the result of
>
> (false-if-exception (mkdir root-directory))
>
> When the root-directory (e.g. /var/guix/gcroots/tmp) cannot be created
> by the remote user running the Guile script, the subsequent (symlink
> ...) call fails.
The idea was that /var/guix/gcroots/tmp would be created by the
administrator and made world-writable (similarly,
/var/guix/gcroots/profiles/per-user/$USER is writable by $USER.)
However, this is not documented and does not happen automatically.
I think this could be worked around by doing everything in a single
process on the remote side: we would run a single program there that
would take care of reporting missing store items, importing them,
performing the build, and writing the result. That way, we would no
longer need the special directory for GC roots.
Needs some more thought.
> Recommendations:
>
> - instead of sending a script to be executed by a remote Guile process
>   running as the unprivileged SSH user it may make sense to bake this
>   feature into the daemon. The daemon has permissions on
>   %state-directory anyway, while a regular user probably shouldn't.
I don’t think this is a good idea.
> - check the return value of (false-if-exception (mkdir
>   root-directory)), or drop false-if-exception altogether so that the
>   failure occurs right where the directory should be created, rather
>   than later when the symlink into a non-existent directory cannot be
>   created. This would arguably result in a clearer error message.
I’ve improved that.
I realize there are several ways all this could be improved, most
notably: a) one remote process, b) Guile-SSH. Let’s see what we can
do.
Thanks for your feedback!
Ludo’.