unofficial mirror of guix-devel@gnu.org 
* Offloading acquired build slot loop
@ 2015-04-22  0:55 Les Harris
  2015-04-22 21:18 ` Ludovic Courtès
  0 siblings, 1 reply; 4+ messages in thread
From: Les Harris @ 2015-04-22  0:55 UTC
  To: guix-devel

I have the 0.8.1 release of Guix installed on my local machine and am
trying to get offloading configured so builds will occur on a build
server on which I also have 0.8.1 installed.  Both the local machine
and the build server are x86_64-linux.

I have lsh installed on the local machine and can successfully `lsh
build.server` into the build server.  On the build server, I have set
/gcroots/tmp to be world-writable.

In addition I believe I have done all the documented steps like
generating and authorizing the keys for both the local machine and build
server and have added the build server to the local machine's
machines.scm.

When Guix recognizes a build to be performed it appears to start the
offload. But that process then seems to end up in an endless loop where
it keeps saying it has acquired a build slot:

,----
| process 20852 acquired build slot
| '/usr/local/var/guix/offload/build.server.com/0'
| waiting for locks or build slots...
| process 20852 acquired build slot
| '/usr/local/var/guix/offload/build.server.com/0'
| process 20852 acquired build slot
| '/usr/local/var/guix/offload/build.server.com/0'
| process 20852 acquired build slot
| '/usr/local/var/guix/offload/build.server.com/0'
| process 20852 acquired build slot
| '/usr/local/var/guix/offload/build.server.com/0'
`----

I can go to /usr/local/var/guix/offload on the local machine and see the
following contents:

  build.server.com.slots.lock
  machine-choice.lock
  build.server.com  (directory which contains a file named 0)

To my limited knowledge, this all looks to be correct.  Can anyone offer
any insight on any configuration steps I might have missed or perhaps
the location of a log that documents the offload that I could search for
an error?

-- 
Do they only stand
By ignorance, is that their happy state,
The proof of their obedience and their faith?


* Re: Offloading acquired build slot loop
  2015-04-22  0:55 Offloading acquired build slot loop Les Harris
@ 2015-04-22 21:18 ` Ludovic Courtès
  2015-05-03  8:00   ` Les Harris
  0 siblings, 1 reply; 4+ messages in thread
From: Ludovic Courtès @ 2015-04-22 21:18 UTC
  To: Les Harris; +Cc: guix-devel

Les Harris <les@lesharris.com> skribis:

> In addition I believe I have done all the documented steps like
> generating and authorizing the keys for both the local machine and build
> server and have added the build server to the local machine's
> machines.scm.

Sounds good.  This build server is the only machine listed in
machines.scm, and it really points to a different machine, right?
(Sorry for the dumb L1-support-style question, just to be sure.  ;-))

> When Guix recognizes a build to be performed it appears to start the
> offload. But that process then seems to end up in an endless loop where
> it keeps saying it has acquired a build slot:
>
> ,----
> | process 20852 acquired build slot
> | '/usr/local/var/guix/offload/build.server.com/0'
> | waiting for locks or build slots...
> | process 20852 acquired build slot
> | '/usr/local/var/guix/offload/build.server.com/0'
> | process 20852 acquired build slot
> | '/usr/local/var/guix/offload/build.server.com/0'
> | process 20852 acquired build slot
> | '/usr/local/var/guix/offload/build.server.com/0'
> | process 20852 acquired build slot
> | '/usr/local/var/guix/offload/build.server.com/0'
> `----

The offload hook can reply in 3 different ways when asked for a build
machine: reject (for instance when building for MIPS but machines.scm
doesn’t list any MIPS machine), accept, or postpone (for instance when
there are one or more matching machines, but they are currently
unreachable or overloaded).
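
The three replies above can be sketched as a small shell function.  This
is only an illustration of the decision as described here, not the code
in offload.scm; the function name `offload_reply` and its two arguments
are hypothetical.

```shell
# Hypothetical sketch of the three replies described above; this is not
# the code in offload.scm.
#   $1: number of machines in machines.scm matching the requested system
#   $2: number of those that are reachable and not overloaded
offload_reply() {
    matching=$1
    usable=$2
    if [ "$matching" -eq 0 ]; then
        echo reject      # no listed machine can build for this system
    elif [ "$usable" -gt 0 ]; then
        echo accept      # a matching machine is ready to take the build
    else
        echo postpone    # machines match, but none is usable right now
    fi
}
```

For example, `offload_reply 1 0` corresponds to one matching machine that
cannot currently be used.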

The loop here seems to indicate that you’re in the “postpone” situation:
build.server.com matches, but maybe it does not respond when the offload
hook runs “lsh build.server.com cat /proc/loadavg” to estimate its
current load (see ‘machine-load’ in offload.scm.)

Could you double-check whether this command succeeds as root and
non-interactively (no passphrase prompt):

  lsh -l USER build.server.com cat /proc/loadavg

?
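
One possible way to script this check is sketched below.  The helper
name `check_noninteractive` and the 10-second limit are assumptions made
for illustration; the idea is only to run the command with no terminal
input, so that a hidden prompt shows up as a failure or a timeout rather
than a hang.

```shell
# Hypothetical helper: run a command with stdin closed off and a time
# limit, and report whether it succeeded.
check_noninteractive() {
    # "timeout" is from GNU coreutils; the 10-second limit is an
    # arbitrary choice for this sketch.
    if timeout 10 "$@" </dev/null >/dev/null 2>&1; then
        echo "ok"
    else
        echo "failed"
        return 1
    fi
}

# Usage, as root on the local machine (USER is a placeholder):
#   check_noninteractive lsh -l USER build.server.com cat /proc/loadavg
```

A "failed" result does not say why the command failed; running the same
lsh command by hand as root should show the actual error.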

TIA!

Ludo’.


* Re: Offloading acquired build slot loop
  2015-04-22 21:18 ` Ludovic Courtès
@ 2015-05-03  8:00   ` Les Harris
  2015-05-06 20:38     ` Ludovic Courtès
  0 siblings, 1 reply; 4+ messages in thread
From: Les Harris @ 2015-05-03  8:00 UTC
  To: guix-devel

ludo@gnu.org (Ludovic Courtès) writes:

> Could you double-check whether this command succeeds as root and
> non-interactively (no passphrase prompt):
>
>   lsh -l USER build.server.com cat /proc/loadavg

As a followup to this, this set me on the right track. Thank you!  It is
very obvious in hindsight, but only my user could lsh into the build
server; root did not have the needed key.  Once that was fixed I could
proceed.

I ran into two further issues that I have resolved.

Issue 1)  The offload build failed saying there was no code for module
(guix config), because the build server's guile could not find guix in
its load path: I had installed guix to /usr/local.  I symlinked
/usr/local/share/guile/site/2.0 into the right place in the /usr/share
tree and this was resolved.
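
A rough way to diagnose this kind of problem is to look for the module's
source file in each directory guile searches.  The sketch below assumes
the module lives in a file named guix/config.scm under one of those
directories; the function name `find_guix_config` is hypothetical.

```shell
# Hypothetical helper: print the first directory in a colon-separated
# list that contains guix/config.scm; return 1 if none does.
find_guix_config() {
    old_ifs=$IFS
    IFS=:
    for dir in $1; do
        if [ -f "$dir/guix/config.scm" ]; then
            IFS=$old_ifs
            echo "$dir"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Usage on the build server; the directories guile actually searches
# can be listed with:  guile -c '(display %load-path)'
#   find_guix_config "/usr/local/share/guile/site/2.0"
```

If nothing is printed for the directories in guile's load path, the
module is not visible to guile there, which matches the error above.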

Issue 2) The offload build failed with a permission denied error when
trying to access the guix daemon's socket.  The user that the local
guix was using to lsh into the build server had read but not write
permission on the socket file.  Giving that user write permission fixed
this issue.
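
A quick check for this, run on the build server as the user that lsh
logs in as, could look like the sketch below.  The function name
`socket_writable` is hypothetical, and the socket path in the usage
comment is an assumption based on a /usr/local install; adjust it to
wherever the daemon's socket actually lives.

```shell
# Hypothetical helper: report whether the current user has write
# permission on the given path.
socket_writable() {
    if [ -w "$1" ]; then
        echo "writable"
    else
        echo "not writable"
    fi
}

# Usage (assumed socket path for a /usr/local install):
#   socket_writable /usr/local/var/guix/daemon-socket/socket
```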

So combine those three additional things with my initial setup and now I
have functioning offloading.

There are many hidden assumptions in setting up the offloading that I
feel should be documented.  Are documentation patches accepted?

-- 
Do they only stand
By ignorance, is that their happy state,
The proof of their obedience and their faith?


* Re: Offloading acquired build slot loop
  2015-05-03  8:00   ` Les Harris
@ 2015-05-06 20:38     ` Ludovic Courtès
  0 siblings, 0 replies; 4+ messages in thread
From: Ludovic Courtès @ 2015-05-06 20:38 UTC
  To: Les Harris; +Cc: guix-devel

Hi,

Les Harris <les@lesharris.com> skribis:

> ludo@gnu.org (Ludovic Courtès) writes:
>
>> Could you double-check whether this command succeeds as root and
>> non-interactively (no passphrase prompt):
>>
>>   lsh -l USER build.server.com cat /proc/loadavg
>
> As a followup to this, this set me on the right track. Thank you!  It is
> very obvious in hindsight, but only my user could lsh into the build
> server; root did not have the needed key.  Once that was fixed I could
> proceed.

Good to know.  :-)

> I ran into two further issues that I have resolved.
>
> Issue 1)  The offload build failed saying there was no code for module
> (guix config), because the build server's guile could not find guix in
> its load path: I had installed guix to /usr/local.  I symlinked
> /usr/local/share/guile/site/2.0 into the right place in the /usr/share
> tree and this was resolved.
>
> Issue 2) The offload build failed with a permission denied error when
> trying to access the guix daemon's socket.  The user that the local
> guix was using to lsh into the build server had read but not write
> permission on the socket file.  Giving that user write permission fixed
> this issue.
>
> So combine those three additional things with my initial setup and now I
> have functioning offloading.

Great.  Well, I reckon this is a terrible user experience.  :-/

> There are many hidden assumptions in setting up the offloading that I
> feel should be documented.  Are documentation patches accepted?

Definitely, yes!

Thank you,
Ludo’.
