From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: bug#41625: Sporadic guix-offload crashes due to EOF errors
From: Maxim Cournoyer
To: Marius Bakke
Cc: 41625-done@debbugs.gnu.org
Date: Sat, 26 Mar 2022 01:03:37 -0400
Message-ID: <87v8w1wbiu.fsf_-_@gmail.com>
In-Reply-To: <8735u8l7iy.fsf@gmail.com> (Maxim Cournoyer's message of "Thu, 27 May 2021 07:51:01 -0400")
References: <87mtsky9um.fsf@gmail.com> <20210525155003.27590-1-maxim.cournoyer@gmail.com> <875yz61rvt.fsf@gnu.org> <87mtsikwsm.fsf_-_@gmail.com> <87h7ipa433.fsf@gnu.org> <8735u8l7iy.fsf@gmail.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
List-Id: Bug reports for GNU Guix

Hello,

Maxim Cournoyer writes:

> Hi Marius,
>
> Marius Bakke writes:
>
>> Maxim Cournoyer writes:
>>
>>>> Is running ‘guix offload test /etc/guix/machines.scm overdrive1’ on
>>>> berlin enough to reproduce the issue? If so, we could monitor/strace
>>>> sshd on overdrive1 to get a better understanding of what’s going on.
>>>
>>> It's actually difficult to trigger it; it seems to happen mostly on the
>>> first try after a long time without connecting to the machine; on the
>>> 2nd and later tries, everything is smooth. Waiting a few minutes is not
>>> enough to re-trigger the problem.
>>>
>>> I've managed to see the problem a few lucky times with:
>>>
>>> --8<---------------cut here---------------start------------->8---
>>> while true; do guix offload test /etc/guix/machines.scm overdrive1; done
>>> --8<---------------cut here---------------end--------------->8---
>>
>> I used to be able to reproduce it by inducing a high load on the target
>> machine and just let Guix keep trying to connect. But now I did that,
>> and set overload threshold to 0.0 for good measure, and Guix has been
>> waiting patiently for two hours without failure.
>>
>> So AFAICT this bug has been fixed. Perhaps Berlin or the Overdrive
>> simply needs to be updated?
>
> Ah! Do you have root access to overdrive1? It'd be interesting to
> reconfigure it to update the guix-daemon and see if the problem
> vanishes.

Good news: this seems resolved with the newer Guile-SSH 0.15.1, where
long delays in returning output no longer trigger an EOF response
(the client now keeps waiting instead). I believe it was fixed by this
commit [0].

Many thanks to Artyom Poptsov for fixing it!

Closing.

Maxim

[0] https://github.com/artyom-poptsov/guile-ssh/commit/fefaab9e925d015b01abc7c76ea4017c373ad895