unofficial mirror of guix-devel@gnu.org 
* issues with offloading
@ 2015-02-05 10:54 Ricardo Wurmus
  2015-02-05 22:39 ` Ludovic Courtès
  0 siblings, 1 reply; 2+ messages in thread
From: Ricardo Wurmus @ 2015-02-05 10:54 UTC
  To: guix-devel

Hi Guix,

I'm trying to set up offloading such that I can have a powerful build
host compile stuff for me when my workstation isn't strong enough.  I
bumped into a couple of issues while doing this, prompting me to write
this email to discuss what might be changed to improve this.

* lsh required

  The manual does not appear to mention that for offloading lsh is
  expected to be installed on the submitting host.  Since I only had
  OpenSSH installed (on the local workstation and the remote server) I
  decided to redefine %lsh-command and %lshg-command:

    (define %lsh-command "ssh")
    (define %lshg-command "ssh")

  When the command in these variables does not exist there is no error
  message at all.  I only discovered the issue because machine-load
  returned +inf.0 for every machine in the list (defined in
  /etc/guix/machines.scm), so the search for a suitable machine looped
  indefinitely.

  Here are some recommendations:

  - make %lsh-command and %lshg-command configurable or mention in the
    documentation that lsh must be available in the PATH.

  - print an error message when "remote-pipe" fails due to not finding
    the command specified in %lsh-command / %lshg-command

  - only run once over the machines given in /etc/guix/machines.scm
    instead of looping indefinitely, or alternatively print the reason
    for skipping a machine (e.g. by stating that machine-load is +inf.0)
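
  A minimal sketch of the PATH check behind the first two
  recommendations, assuming plain Guile.  The names find-command and
  require-command are hypothetical and not part of Guix; only
  search-path, parse-path and getenv are core Guile procedures.

```scheme
;; Hypothetical helpers, not taken from the Guix source.

(define (find-command command)
  "Return the full file name of COMMAND as found in $PATH, or #f."
  ;; parse-path returns the empty list when PATH is unset (#f).
  (search-path (parse-path (getenv "PATH")) command))

(define (require-command command)
  "Return the full file name of COMMAND, or print an error and return #f."
  (or (find-command command)
      (begin
        (format (current-error-port)
                "error: command '~a' not found in PATH~%" command)
        #f)))
```

  Calling (require-command "lsh") before opening the remote pipe would
  turn the silent failure into a visible message.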

* does not work with unprivileged user

  I assumed that all I needed was an SSH key for an unprivileged user on
  the remote machine in order to log on to the remote build host and
  talk to the local guix-daemon there.  However, we actually run Guile
  scripts on the remote instead of letting the privileged daemon perform
  known-to-be-safe commands.

  This is a problem with register-gc-root, for example.  It creates a
  directory in %state-directory where an unprivileged user likely has no
  write permissions.  This mkdir fails silently because register-gc-root
  does not bother checking the result of

    (false-if-exception (mkdir root-directory))

  When the root-directory (e.g. /var/guix/gcroots/tmp) cannot be created
  by the remote user running the guile script, the following (symlink
  ...) fails.

  Recommendations:

  - instead of sending a script to be executed by a remote Guile process
    running as the unprivileged SSH user it may make sense to bake this
    feature into the daemon.  The daemon has permissions on
    %state-directory anyway, while a regular user probably shouldn't.

  - check the return value of (false-if-exception (mkdir
    root-directory)), or do not use false-if-exception at all to fail
    right there when the directory should be created rather than failing
    when the symlink to a non-existing directory cannot be created.
    This would arguably result in a clearer error message.
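
  A minimal sketch of the second recommendation, assuming plain Guile.
  The name ensure-directory is hypothetical and this is not the actual
  register-gc-root code.  It tolerates only the "already exists" case
  and lets every other error (such as a permission failure) propagate
  at the mkdir call.

```scheme
;; Hypothetical replacement for (false-if-exception (mkdir ...)).

(define (ensure-directory directory)
  "Create DIRECTORY unless it already exists; re-raise any other error."
  (catch 'system-error
    (lambda ()
      (mkdir directory))
    (lambda args
      ;; ARGS is the full exception argument list, including the key.
      (unless (= EEXIST (system-error-errno args))
        (apply throw args)))))
```

  With this, an unprivileged user would see the failure reported for
  the directory itself rather than for the later symlink.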

This is as far as I got.  What do you think?

~~ Ricardo


* Re: issues with offloading
  2015-02-05 10:54 issues with offloading Ricardo Wurmus
@ 2015-02-05 22:39 ` Ludovic Courtès
  0 siblings, 0 replies; 2+ messages in thread
From: Ludovic Courtès @ 2015-02-05 22:39 UTC
  To: Ricardo Wurmus; +Cc: guix-devel

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> skribis:

> * lsh required
>
>   The manual does not appear to mention that for offloading lsh is
>   expected to be installed on the submitting host.  Since I only had
>   OpenSSH installed (on the local workstation and the remote server) I
>   decided to redefine %lsh-command and %lshg-command:
>
>     (define %lsh-command "ssh")
>     (define %lshg-command "ssh")

That won’t work because the command-line options that are passed are
lsh-specific.

>   When the command in these variables does not exist there is no error
>   message at all.  I only discovered the issue because machine-load
>   returned +inf.0 for every machine in the list (defined in
>   /etc/guix/machines.scm), so the search for a suitable machine looped
>   indefinitely.
>
>   Here are some recommendations:
>
>   - make %lsh-command and %lshg-command configurable or mention in the
>     documentation that lsh must be available in the PATH.

Yes.

>   - print an error message when "remote-pipe" fails due to not finding
>     the command specified in %lsh-command / %lshg-command

Done.

However, there’s a wip-guile-ssh branch, which ideally is the future: it
uses the Guile-SSH library instead of invoking lsh.  This should improve
integration and error handling.

There were issues with old versions of Guile-SSH that have been
addressed since, so we should rebase it and see how well it works.

>   - only run once over the machines given in /etc/guix/machines.scm
>     instead of looping indefinitely, or alternatively print the reason
>     for skipping a machine (e.g. by stating that machine-load is +inf.0)

Yes.
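
A single pass that reports why a machine is skipped could look roughly
like the sketch below.  It assumes plain Guile; pick-machine is a
hypothetical name, and the machine-load argument stands in for whatever
procedure returns a machine's load (with +inf.0 meaning "unreachable").
It is not the actual offload code.

```scheme
;; Hypothetical sketch, not the Guix implementation.

(define (pick-machine machines machine-load)
  "Go over MACHINES once and return the least-loaded usable one, or #f."
  (let loop ((machines machines)
             (best #f)
             (best-load +inf.0))
    (if (null? machines)
        best
        (let* ((machine (car machines))
               (load (machine-load machine)))
          (cond ((inf? load)
                 ;; State the reason instead of retrying silently.
                 (format (current-error-port)
                         "skipping ~a: machine load is ~a~%" machine load)
                 (loop (cdr machines) best best-load))
                ((< load best-load)
                 (loop (cdr machines) machine load))
                (else
                 (loop (cdr machines) best best-load)))))))
```

When every machine reports +inf.0 the procedure returns #f, and the
caller can decide whether to build locally or retry later.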

> * does not work with unprivileged user

[...]

>   This is a problem with register-gc-root, for example.  It creates a
>   directory in %state-directory where an unprivileged user likely has no
>   write permissions.  This mkdir fails silently because register-gc-root
>   does not bother checking the result of
>
>     (false-if-exception (mkdir root-directory))
>
>   When the root-directory (e.g. /var/guix/gcroots/tmp) cannot be created
>   by the remote user running the guile script, the following (symlink
>   ...) fails.

The idea was that /var/guix/gcroots/tmp would be created by the
administrator and made world-writable (similarly,
/var/guix/gcroots/profiles/per-user/$USER is writable by $USER).

However, this is not documented and does not happen automatically.
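
As a rough sketch of that manual step, assuming plain Guile run by the
administrator: make-shared-directory is a hypothetical name, and the
sticky bit (mode 1777, as on /tmp) is an assumption added here, since
only "world-writable" is stated above.

```scheme
;; Hypothetical administrator-side setup, not part of Guix.

(define (make-shared-directory directory)
  "Create DIRECTORY if needed and make it world-writable."
  (unless (file-exists? directory)
    (mkdir directory))
  ;; chmod sets the mode explicitly, independently of the umask.
  ;; Assumption: the sticky bit keeps users from deleting others' roots.
  (chmod directory #o1777))
```

On the build host this would be called once, for example as
(make-shared-directory "/var/guix/gcroots/tmp").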

I think this could be worked around by doing everything in a single
process on the remote side: we would run a single program there that
would take care of reporting missing store items, importing them,
performing the build, and writing the result.  That way, we would no
longer need the special directory for GC roots.

Needs some more thought.

>   Recommendations:
>
>   - instead of sending a script to be executed by a remote Guile process
>     running as the unprivileged SSH user it may make sense to bake this
>     feature into the daemon.  The daemon has permissions on
>     %state-directory anyway, while a regular user probably shouldn't.

I don’t think this is a good idea.

>   - check the return value of (false-if-exception (mkdir
>     root-directory)), or do not use false-if-exception at all to fail
>     right there when the directory should be created rather than failing
>     when the symlink to a non-existing directory cannot be created.
>     This would arguably result in a clearer error message.

I’ve improved that.

I realize there are several ways all this could be improved, most
notably: a) one remote process, b) Guile-SSH.  Let’s see what we can
do.

Thanks for your feedback!

Ludo’.

