* Problems with "guix offload test" after re-install
@ 2017-10-11 19:52 Brantley, Michael
  2017-10-12  8:05 ` Ludovic Courtès
  0 siblings, 1 reply; 4+ messages in thread
From: Brantley, Michael @ 2017-10-11 19:52 UTC (permalink / raw)
  To: help-guix@gnu.org

Hello,

As this is my first post, I want to start by thanking the Guix community for all your hard work!  :-)

I've been working on a project to evaluate Guix using a process that involves frequent rebuilds and re-installs. After rebuilding my testbed hosts on Monday, I found that "guix offload test" no longer works. It now fails as shown below, and notably the first attempt fails with a different error message than subsequent attempts:

$ guix offload test /etc/guix/machines.scm guixbuild03.foo.bar.com
guix offload: testing 1 build machines defined in '/etc/guix/machines.scm'...
guix offload: 'guixbuild03.foo.bar.com' is running guile (GNU Guile) 2.2.2
guix offload: error: failed to use Guix module on 'guixbuild03.foo.bar.com' (test returned #<unspecified>)
$
$ guix offload test /etc/guix/machines.scm guixbuild03.foo.bar.com
guix offload: testing 1 build machines defined in '/etc/guix/machines.scm'...
guix offload: 'guixbuild03.foo.bar.com' is running guile (GNU Guile) 2.2.2
Backtrace:
           7 (primitive-load "/gnu/store/pwny051w3gd1rry5bs1vyw5r7bj...")
In guix/ui.scm:
  1384:12  6 (run-guix-command _ . _)
In ice-9/boot-9.scm:
    837:9  5 (catch srfi-34 #<procedure 26a71c0 at guix/ui.scm:460:...> ...)
    837:9  4 (catch system-error #<procedure 26a71e0 at guix/script...> ...)
In guix/scripts/offload.scm:
    615:6  3 (check-machine-availability _ _)
In srfi/srfi-1.scm:
   656:11  2 (for-each #<procedure assert-node-has-guix (node name)> ...)
In guix/scripts/offload.scm:
    547:2  1 (assert-node-has-guix #<node guixuser@guixbuild03.nyc...> ...)
In ssh/dist/node.scm:
    397:8  0 (node-eval #<node guixuser@guixbuild03.foo.bar.com...> ...)

ssh/dist/node.scm:397:8: In procedure node-eval:
ssh/dist/node.scm:397:8: Throw to key `node-repl-error' with args `("Evaluation failed" "scheme@(guile-user)> ERROR: In procedure display:\nERROR: In procedure display: Wrong type argument in position 2: #<closed: file ad5e00>" ())'.
$

Digging further with strace, I can see the "guile --listen=37146" process being started on the build machine via ssh. Guile successfully processes commands received on the first incoming TCP connection to port 37146, but commands received on subsequent TCP connections fail with the "#<closed: file xxxxxx>" error. I was able to reproduce this by hand on the build machine as follows:

In one shell:

$ sudo -u guixuser guix environment
[guixuser@guixbuild03 guix]$ bash -c "nohup guile --listen=37146 0<&- &>/dev/null"

... and then from another shell on this same host:

$ telnet localhost 37146
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GNU Guile 2.2.2
Copyright (C) 1995-2017 Free Software Foundation, Inc.

Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.

Enter `,help' for help.
scheme@(guile-user)> (begin
                       (use-modules (guix))
                       (with-store store
                         (add-text-to-store store "test"
                                            "Hello, build machine!")))
acquiring global GC lock `/var/guix/gc.lock'
acquiring read lock on `/var/guix/temproots/17043'
acquiring write lock on `/var/guix/temproots/17043'
downgrading to read lock on `/var/guix/temproots/17043'
$1 = "/gnu/store/883yjkl46dxw9mzykykmbs0yzwyxm17z-test"
scheme@(guile-user)> ,q
Connection closed by foreign host.
$
$ telnet localhost 37146
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GNU Guile 2.2.2
Copyright (C) 1995-2017 Free Software Foundation, Inc.

Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.

Enter `,help' for help.
scheme@(guile-user)> (begin
                       (use-modules (guix))
                       (with-store store
                         (add-text-to-store store "test"
                                            "Hello, build machine!")))
ERROR: In procedure display:
ERROR: In procedure display: Wrong type argument in position 2: #<closed: file c98f50>

Entering a new prompt.  Type `,bt' for a backtrace or `,q' to continue.
scheme@(guile-user) [1]>
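
For anyone who wants to repeat the two telnet sessions above without typing them by hand, here is a minimal sketch of a script that sends the same expression over two successive connections. It assumes bash (for its /dev/tcp redirection), the coreutils "timeout" command, and a REPL server already started with "guile --listen=37146" as shown earlier. The names classify_reply and try_once are hypothetical helpers made up for this illustration and are not part of Guix or Guile.

```shell
#!/usr/bin/env bash
# Sketch only: send the same expression over two successive TCP connections
# to a Guile REPL server and report whether each reply contains an error.

# The expression typed by hand in the telnet sessions above.
expr='(begin (use-modules (guix)) (with-store store (add-text-to-store store "test" "Hello, build machine!")))'

# classify_reply TEXT: print "error" if the REPL output contains an
# "ERROR:" line, "ok" otherwise. (Hypothetical helper.)
classify_reply() {
  case "$1" in
    *"ERROR:"*) echo error ;;
    *) echo ok ;;
  esac
}

# try_once HOST PORT: open a connection, send the expression followed by
# ",q", collect the reply, and classify it. (Hypothetical helper.)
try_once() {
  local host=$1 port=$2 reply
  exec 3<>"/dev/tcp/$host/$port" || return 1
  printf '%s\n,q\n' "$expr" >&3
  # After an error the REPL may stay at a nested prompt instead of closing
  # the connection, so give up reading after 10 seconds.
  reply=$(timeout 10 cat <&3)
  exec 3<&- 3>&-
  classify_reply "$reply"
}

# Only contact a server when a host and port are given on the command line.
if [ "$#" -eq 2 ]; then
  for attempt in 1 2; do
    printf 'attempt %s: %s\n' "$attempt" "$(try_once "$1" "$2")"
  done
fi
```

Saved under a name such as repro.sh (the filename is arbitrary), it could be run as "bash repro.sh localhost 37146". If the behaviour matches the transcript above, the first attempt would be expected to report "ok" and the second "error".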

Has anyone seen this problem? I've pored over the mailing lists and documentation. Most similar problems involved incorrect paths or environment variables, but I'm fairly certain I have that part right, especially as this was working for me last week. Regardless, here are what I believe to be the relevant variables:

[guixuser@guixbuild03 guix]$ echo $PATH
/var/guix/profiles/per-user/root/guix-profile/bin:/var/guix/profiles/per-user/root/guix-profile/sbin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin
[guixuser@guixbuild03 guix]$ which guix
/var/guix/profiles/per-user/root/guix-profile/bin/guix
[guixuser@guixbuild03 guix]$ which guile
/var/guix/profiles/per-user/root/guix-profile/bin/guile
[guixuser@guixbuild03 guix]$ env | grep GUILE
GUILE_LOAD_COMPILED_PATH=/gnu/store/bcmf06k2n1pfwqkzpclvvc3w9jdfi71a-guile-json-0.6.0/lib/guile/2.2/site-ccache:/gnu/store/66dp9mxn43jkhhrpcy6v5cdqibbl4bjc-gnutls-3.5.13/lib/guile/2.2/site-ccache:/gnu/store/hxc46s4d98c9s5v8jpg7fhbsn011ch74-guile-git-0.0-3.e156a10/lib/guile/2.2/site-ccache:/gnu/store/0lv9jjhalrz65m358ynyk2366wshcd1n-guile-ssh-0.11.2/lib/guile/2.2/site-ccache:/var/guix/profiles/per-user/root/guix-profile/lib/guile/2.2/site-ccache:/var/guix/profiles/per-user/root/guix-profile/share/guile/site/2.2:/var/guix/profiles/per-user/root/guix-profile/lib/guile/2.2/site-ccache
GUILE_LOAD_PATH=/gnu/store/bcmf06k2n1pfwqkzpclvvc3w9jdfi71a-guile-json-0.6.0/share/guile/site/2.2:/gnu/store/66dp9mxn43jkhhrpcy6v5cdqibbl4bjc-gnutls-3.5.13/share/guile/site/2.2:/gnu/store/hxc46s4d98c9s5v8jpg7fhbsn011ch74-guile-git-0.0-3.e156a10/share/guile/site/2.2:/gnu/store/0lv9jjhalrz65m358ynyk2366wshcd1n-guile-ssh-0.11.2/share/guile/site/2.2:/var/guix/profiles/per-user/root/guix-profile/share/guile/site/2.2:/var/guix/profiles/per-user/root/guix-profile/share/guile/site/2.2
[guixuser@guixbuild03 guix]$ env | grep GUIX
GUIX_LOCPATH=/var/guix/profiles/per-user/root/guix-profile/lib/locale
GUIX_ENVIRONMENT=/gnu/store/49gp32rmsly4fs548b69iqvmn1gjck6k-profile
[guixuser@guixbuild03 guix]$
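
In the output above, the root profile's share/guile/site/2.2 directory appears twice in GUILE_LOAD_PATH, and its site-ccache directory appears twice in GUILE_LOAD_COMPILED_PATH. Repeated entries in a search path are probably harmless, but long values like these are hard to check by eye. As a small sketch, the following shell function prints any entry that occurs more than once in a colon-separated path; list_dups is a hypothetical helper name, not an existing Guix or Guile command.

```shell
# list_dups VALUE: print each entry of a colon-separated search path that
# occurs more than once. (Hypothetical helper for illustration.)
list_dups() {
  printf '%s\n' "$1" | tr ':' '\n' | sort | uniq -d
}

# Check the two Guile search paths of the current environment; an unset
# variable is treated as empty and produces no output.
list_dups "${GUILE_LOAD_PATH:-}"
list_dups "${GUILE_LOAD_COMPILED_PATH:-}"
```

Running this in the same "guix environment" shell lists the repeated directories one per line, which makes it easier to compare the environment of a working host with that of a failing one.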

You'll note that "guixuser" uses the "root" user's default profile. strace doesn't show any failed attempts by "guixuser" to write to the root profile, so I don't think this is the problem, but I wanted to mention it in case it is relevant.

Many thanks in advance for your help!
--
- Michael


Thread overview: 4+ messages
2017-10-11 19:52 Problems with "guix offload test" after re-install Brantley, Michael
2017-10-12  8:05 ` Ludovic Courtès
2017-10-12 10:02   ` Brantley, Michael
2017-10-13 15:41     ` Ludovic Courtès
