From: "Brantley, Michael"
To: "help-guix@gnu.org"
Subject: Problems with "guix offload test" after re-install
Date: Wed, 11 Oct 2017 19:52:55 +0000

Hello,

As this is my debut posting I just wanted to start by thanking the Guix community for all your hard work! :-)

I've been working on a project to evaluate Guix using a process that involves frequent rebuilds/re-installs, and following a rebuild of my testbed hosts on Monday I found that "guix offload test" no longer worked for me. It is now failing as shown below, and notably it fails with a different error message on the first attempt than on subsequent ones:

$ guix offload test /etc/guix/machines.scm guixbuild03.foo.bar.com
guix offload: testing 1 build machines defined in '/etc/guix/machines.scm'...
guix offload: 'guixbuild03.foo.bar.com' is running guile (GNU Guile) 2.2.2
guix offload: error: failed to use Guix module on 'guixbuild03.foo.bar.com' (test returned #<unspecified>)
$
$ guix offload test /etc/guix/machines.scm guixbuild03.foo.bar.com
guix offload: testing 1 build machines defined in '/etc/guix/machines.scm'...
guix offload: 'guixbuild03.foo.bar.com' is running guile (GNU Guile) 2.2.2
Backtrace:
           7 (primitive-load "/gnu/store/pwny051w3gd1rry5bs1vyw5r7bj...")
In guix/ui.scm:
  1384:12  6 (run-guix-command _ . _)
In ice-9/boot-9.scm:
    837:9  5 (catch srfi-34 #<procedure 26a71c0 at guix/ui.scm:460:...> ...)
    837:9  4 (catch system-error #<procedure 26a71e0 at guix/script...> ...)
In guix/scripts/offload.scm:
    615:6  3 (check-machine-availability _ _)
In srfi/srfi-1.scm:
   656:11  2 (for-each #<procedure assert-node-has-guix (node name)> ...)
In guix/scripts/offload.scm:
    547:2  1 (assert-node-has-guix #<node guixuser@guixbuild03.nyc...> ...)
In ssh/dist/node.scm:
    397:8  0 (node-eval #<node guixuser@guixbuild03.foo.bar.com...> ...)

ssh/dist/node.scm:397:8: In procedure node-eval:
ssh/dist/node.scm:397:8: Throw to key `node-repl-error' with args `("Evaluation failed" "scheme@(guile-user)> ERROR: In procedure display:\nERROR: In procedure display: Wrong type argument in position 2: #<closed: file ad5e00>" ())'.
$

Digging into this further with the help of strace, I can see the "guile --listen=37146" process being started on the build machine by way of ssh. Guile successfully processes commands received on the first incoming TCP connection to port 37146, but commands received on subsequent TCP connections fail with the "#<closed: file xxxxxx>" error. I was able to simulate this on the CLI by running the following on the "build-machine" host:

In one shell:

$ sudo -u guixuser guix environment
[guixuser@guixbuild03 guix]$ bash -c "nohup guile --listen=37146 0<&- &>/dev/null"

... and then from another shell on this same host:

$ telnet localhost 37146
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GNU Guile 2.2.2
Copyright (C) 1995-2017 Free Software Foundation, Inc.

Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.

Enter `,help' for help.
scheme@(guile-user)> (begin
                       (use-modules (guix))
                       (with-store store
                         (add-text-to-store store "test"
                                            "Hello, build machine!")))
acquiring global GC lock `/var/guix/gc.lock'
acquiring read lock on `/var/guix/temproots/17043'
acquiring write lock on `/var/guix/temproots/17043'
downgrading to read lock on `/var/guix/temproots/17043'
$1 = "/gnu/store/883yjkl46dxw9mzykykmbs0yzwyxm17z-test"
scheme@(guile-user)> ,q
Connection closed by foreign host.
$
$ telnet localhost 37146
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GNU Guile 2.2.2
Copyright (C) 1995-2017 Free Software Foundation, Inc.

Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.

Enter `,help' for help.
scheme@(guile-user)> (begin
                       (use-modules (guix))
                       (with-store store
                         (add-text-to-store store "test"
                                            "Hello, build machine!")))
ERROR: In procedure display:
ERROR: In procedure display: Wrong type argument in position 2: #<closed: file c98f50>

Entering a new prompt.  Type `,bt' for a backtrace or `,q' to continue.
scheme@(guile-user) [1]>

Has anyone seen this problem? I've pored over the mailing lists and documentation, and most of the similar problems involved incorrect paths and environment variables. In this case I'm pretty certain I've got that part right, especially as this was working for me last week. Regardless, here are what I hope are the relevant variables:

[guixuser@guixbuild03 guix]$ echo $PATH
/var/guix/profiles/per-user/root/guix-profile/bin:/var/guix/profiles/per-user/root/guix-profile/sbin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin
[guixuser@guixbuild03 guix]$ which guix
/var/guix/profiles/per-user/root/guix-profile/bin/guix
[guixuser@guixbuild03 guix]$ which guile
/var/guix/profiles/per-user/root/guix-profile/bin/guile
[guixuser@guixbuild03 guix]$ env | grep GUILE
GUILE_LOAD_COMPILED_PATH=/gnu/store/bcmf06k2n1pfwqkzpclvvc3w9jdfi71a-guile-json-0.6.0/lib/guile/2.2/site-ccache:/gnu/store/66dp9mxn43jkhhrpcy6v5cdqibbl4bjc-gnutls-3.5.13/lib/guile/2.2/site-ccache:/gnu/store/hxc46s4d98c9s5v8jpg7fhbsn011ch74-guile-git-0.0-3.e156a10/lib/guile/2.2/site-ccache:/gnu/store/0lv9jjhalrz65m358ynyk2366wshcd1n-guile-ssh-0.11.2/lib/guile/2.2/site-ccache:/var/guix/profiles/per-user/root/guix-profile/lib/guile/2.2/site-ccache:/var/guix/profiles/per-user/root/guix-profile/share/guile/site/2.2:/var/guix/profiles/per-user/root/guix-profile/lib/guile/2.2/site-ccache
GUILE_LOAD_PATH=/gnu/store/bcmf06k2n1pfwqkzpclvvc3w9jdfi71a-guile-json-0.6.0/share/guile/site/2.2:/gnu/store/66dp9mxn43jkhhrpcy6v5cdqibbl4bjc-gnutls-3.5.13/share/guile/site/2.2:/gnu/store/hxc46s4d98c9s5v8jpg7fhbsn011ch74-guile-git-0.0-3.e156a10/share/guile/site/2.2:/gnu/store/0lv9jjhalrz65m358ynyk2366wshcd1n-guile-ssh-0.11.2/share/guile/site/2.2:/var/guix/profiles/per-user/root/guix-profile/share/guile/site/2.2:/var/guix/profiles/per-user/root/guix-profile/share/guile/site/2.2
[guixuser@guixbuild03 guix]$ env | grep GUIX
GUIX_LOCPATH=/var/guix/profiles/per-user/root/guix-profile/lib/locale
GUIX_ENVIRONMENT=/gnu/store/49gp32rmsly4fs548b69iqvmn1gjck6k-profile
[guixuser@guixbuild03 guix]$

You'll note that I'm having "guixuser" use the default "root" user's profile. strace doesn't reveal any failed attempts by "guixuser" to write to the root profile, so I don't think this is a problem, but I wanted to highlight it in case it is relevant.

Many thanks in advance for your help!

--
- Michael
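
P.S. In case it helps with reproducing this, the two telnet sessions above can be approximated with a small script. This is only a rough sketch: it assumes a bash built with /dev/tcp support and GNU `timeout`, that a `guile --listen` REPL server is already running on the given host and port, and the function names (`classify_output`, `repl_once`, `main`) are made up for illustration. It sends the same expression over two separate connections and reports whether each reply contains an ERROR line.

```shell
# Sketch only: probe a `guile --listen` REPL server twice over TCP.
# HOST, PORT and all function names are illustrative assumptions.
HOST=${HOST:-localhost}
PORT=${PORT:-37146}
EXPR='(begin (use-modules (guix)) (with-store store (add-text-to-store store "test" "Hello, build machine!")))'

# Classify captured REPL output: empty reply, reply containing an
# "ERROR:" marker, or anything else.
classify_output() {
  case "$1" in
    '')        echo no-response ;;
    *ERROR:*)  echo error ;;
    *)         echo ok ;;
  esac
}

# Open one connection, send EXPR followed by ",q", print the reply.
# The inner bash provides /dev/tcp; `timeout` stops the read if the
# server keeps the connection open (e.g. after entering a new prompt).
repl_once() {
  timeout 15 bash -c '
    exec 3<>"/dev/tcp/$0/$1" || exit 1
    printf "%s\n,q\n" "$2" >&3
    cat <&3
  ' "$HOST" "$PORT" "$EXPR"
}

# Two separate connections, mirroring the two telnet sessions.
main() {
  for attempt in 1 2; do
    out=$(repl_once)
    echo "attempt $attempt: $(classify_output "$out")"
  done
}
```

Run it on the build machine by sourcing the file and calling `main`, optionally with HOST and PORT set in the environment. If the behaviour matches what I saw with telnet, the first attempt should be classified differently from the second, though I would treat the output as a hint rather than proof.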
