From mboxrd@z Thu Jan 1 00:00:00 1970
From: Maxim Cournoyer
Subject: bug#34033: Offloading sometimes hangs
Date: Fri, 21 Feb 2020 23:37:06 -0500
Message-ID: <87wo8fqlu5.fsf@apteryx.i-did-not-set--mail-host-address--so-tickle-me>
References: <87o98obikk.fsf@gnu.org> <87fttuq2mz.fsf@gnu.org>
In-Reply-To: <87fttuq2mz.fsf@gnu.org> ("Ludovic Courtès"'s message of "Mon, 14 Jan 2019 23:45:56 +0100")
To: Ludovic Courtès
Cc: 34033@debbugs.gnu.org

Hello Ludovic,

Ludovic Courtès writes:

> Hello,
>
> Ludovic Courtès skribis:
>
>> A simple thing would be to somehow get libssh to pass POLLIN | POLLRDHUP
>> instead of just POLLIN.
>
> Reported here:
>
>   https://www.libssh.org/archive/libssh/2019-01/0000000.html
>
> A fix has been proposed by upstream and should be committed shortly.
>
>> Additionally, we could change Guile-SSH so that we can specify a timeout
>> when reading from a channel.
>
> Turns out we can set a per-session timeout, which we already do (see
> #:timeout in ‘open-ssh-session’ in (guix scripts offload)) but
> ‘ssh_channel_read’ would ignore it and instead pass an infinite timeout
> to poll(2):
>
>   https://www.libssh.org/archive/libssh/2019-01/0000001.html
>
> This issue happens to be fixed in libssh 0.8.x, so I upgraded our libssh
> package in commit a8b0556ea1e439c89dc1ba33c8864e8b9b811f08.
>
> (That still doesn’t tell us why our ‘guix offload’ processes would
> occasionally be stuck but at least it ensures the build farm keeps
> making progress even when that happens.)
>
> Ludo’.

Seems the patch in the response at the URL you linked is awaiting some
feedback/review.

Is this the reason 'guix substitute' hangs for so long when the
substitute server is down? (like 1 minute or so).

Maxim
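
For illustration, the two points raised in the quoted message (polling for
POLLIN | POLLRDHUP rather than POLLIN alone, and passing a finite timeout
to poll(2) instead of an infinite one) can be sketched as below. This is a
minimal Python sketch, not the libssh or Guile-SSH code; the function name
wait_readable and its return values are hypothetical, and POLLRDHUP is
assumed to be available (it is Linux-specific, hence the fallback to 0).

```python
import select
import socket

# POLLRDHUP is Linux-specific; fall back to 0 elsewhere (assumption).
POLLRDHUP = getattr(select, "POLLRDHUP", 0)


def wait_readable(sock, timeout_ms):
    """Poll SOCK for input or a peer hang-up, giving up after TIMEOUT_MS.

    Hypothetical helper: returns "data", "hangup", or "timeout".
    """
    poller = select.poll()
    # Ask for hang-up notifications in addition to readable data.
    poller.register(sock, select.POLLIN | POLLRDHUP)
    # A finite timeout (in milliseconds) means the call cannot block forever.
    events = poller.poll(timeout_ms)
    if not events:
        return "timeout"
    _, revents = events[0]
    # Check hang-up first: at end-of-file POLLIN is usually set as well.
    if revents & (POLLRDHUP | select.POLLHUP | select.POLLERR):
        return "hangup"
    return "data"
```

With a connected socket pair, a call such as wait_readable(a, 50) returns
once 50 ms have elapsed without activity, whereas a negative or None
timeout passed to poll() would wait indefinitely.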