From: Ludovic Courtès
Subject: bug#37762: ‘guix offload’ sets too short a timeout
Date: Tue, 15 Oct 2019 12:22:04 +0200
Message-ID: <87lftmwboz.fsf@inria.fr>
To: bug-Guix@gnu.org

Hello Guix,

In (guix scripts offload)
the SSH session is created like this:

  (make-session #:user (build-machine-user machine)
                #:host (build-machine-name machine)
                #:port (build-machine-port machine)
                #:timeout 10                       ;seconds
                ;; …
                )

What this means is that any connect(2), read(2), or write(2) call on the
underlying file descriptors that takes more than 10 seconds is
interpreted as EOF (at least on the Scheme side when reading from a
channel port; on the C side we might be able to distinguish.)

This was fine with libssh < 0.9.0 because that timeout was not honored
when reading from a channel due to a bug they fixed in libssh commit
e4e51ccc1340e313c203842d0180a1c4e33c95cc.

libssh 0.9.0, added in Guix commit
44941fd7dbc77a7bf84a9be63a309eca3ffdc1c2, contains this bug fix, meaning
that the 10s session timeout is actually honored now.

So in practice, if you offload a build process and that process remains
silent for 10s (which is not that much!), then ‘guix offload’ thinks
it’s done and (confusingly) goes on to fetch the result from the build
machine, which is of course unavailable. The end result is an equally
confusing error message like this (the last two lines):

--8<---------------cut here---------------start------------->8---
starting phase `bootstrap'
running './autogen.sh'
patch-shebang: ./autogen.sh: changing `/bin/sh' to `/gnu/store/iql3p5zvz0nwcsckdpywdkqxccx95ygx-bash-minimal-5.0.7/bin/sh'
autoreconf: Entering directory `.'
autoreconf: configure.ac: not using Gettext
autoreconf: running: aclocal -I config/m4
/gnu/store/iql3p5zvz0nwcsckdpywdkqxccx95ygx-bash-minimal-5.0.7/bin/sh: git: command not found
guix offload: error: corrupt input while restoring archive from #
guix build: error: build of `/gnu/store/dpz058x83sc7y1krpkdn84b45vl5p9cz-ucx-1.6.1.drv' failed
--8<---------------cut here---------------end--------------->8---

Working on a bug fix…

Ludo’.
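
To make the timeout discussion above concrete, here is a minimal sketch of one possible direction: keep a short timeout for the initial connect(2), then raise it once the session is up so that a silent build is not mistaken for EOF. This is only an illustration, not necessarily the fix that will land. It assumes Guile-SSH's (ssh session) module, and that the 'timeout option can be changed with session-set! after connecting. The procedure name open-session/two-timeouts, its keyword arguments, and the default values are hypothetical.

```scheme
;; Sketch only; assumes Guile-SSH is installed.  The procedure name and
;; the default values below are hypothetical, chosen for illustration.
(use-modules (ssh session))

(define* (open-session/two-timeouts host
                                    #:key
                                    (user "guix")
                                    (port 22)
                                    (connect-timeout 10)     ;seconds
                                    (max-silent-time 3600))  ;seconds
  "Connect to HOST with a short CONNECT-TIMEOUT, then switch the session
to the longer MAX-SILENT-TIME for subsequent channel reads and writes."
  (let ((session (make-session #:user user
                               #:host host
                               #:port port
                               #:timeout connect-timeout)))
    (case (connect! session)
      ((ok)
       ;; Server and user authentication would go here; omitted.
       ;; From now on, tolerate up to MAX-SILENT-TIME seconds of silence.
       (session-set! session 'timeout max-silent-time)
       session)
      (else
       (error "SSH connection failed" host)))))
```

The point of splitting the two values is that an unreachable build machine still fails fast, while a build that prints nothing for more than 10 seconds is no longer cut short.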