From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mp11.migadu.com ([2001:41d0:2:4a6f::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by ms5.migadu.com with LMTPS id +GXBNVXObGP5DgAAbAwnHQ (envelope-from ) for ; Thu, 10 Nov 2022 11:11:33 +0100 Received: from aspmx1.migadu.com ([2001:41d0:2:4a6f::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by mp11.migadu.com with LMTPS id KFzrNVXObGNLNwEA9RJhRA (envelope-from ) for ; Thu, 10 Nov 2022 11:11:33 +0100 Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by aspmx1.migadu.com (Postfix) with ESMTPS id 89CB6BC81 for ; Thu, 10 Nov 2022 11:11:29 +0100 (CET) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ot4St-0003F1-PM; Thu, 10 Nov 2022 05:07:19 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ot4Se-0002xa-It for bug-guix@gnu.org; Thu, 10 Nov 2022 05:07:09 -0500 Received: from debbugs.gnu.org ([209.51.188.43]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1ot4Sc-0006Ws-BV for bug-guix@gnu.org; Thu, 10 Nov 2022 05:07:03 -0500 Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1ot4Sb-0007JG-TB for bug-guix@gnu.org; Thu, 10 Nov 2022 05:07:01 -0500 X-Loop: help-debbugs@gnu.org Subject: bug#58290: guile ssh error on guix deploy Resent-From: Andrew Tropin Original-Sender: "Debbugs-submit" Resent-CC: bug-guix@gnu.org Resent-Date: Thu, 10 Nov 2022 10:07:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 58290 X-GNU-PR-Package: guix X-GNU-PR-Keywords: To: Marius Bakke , Ludovic =?UTF-8?Q?Court=C3=A8s?= Cc: poptsov.artyom@gmail.com, 58290@debbugs.gnu.org Received: via spool by 58290-submit@debbugs.gnu.org id=B58290.166807481128082 (code B ref 58290); Thu, 10 Nov 2022 10:07:01 +0000 Received: (at 58290) by debbugs.gnu.org; 10 Nov 2022 10:06:51 +0000 Received: from localhost ([127.0.0.1]:41951 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ot4SQ-0007Ir-9I for submit@debbugs.gnu.org; Thu, 10 Nov 2022 05:06:50 -0500 Received: from relay4-d.mail.gandi.net ([217.70.183.196]:53349) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ot4SN-0007Id-Gf for 58290@debbugs.gnu.org; Thu, 10 Nov 2022 05:06:48 -0500 Received: (Authenticated sender: andrew@trop.in) by mail.gandi.net (Postfix) with ESMTPSA id 84F5CE000F; Thu, 10 Nov 2022 10:06:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=trop.in; s=gm1; t=1668074801; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=zmZbfLD4i93CGK3gha5lI+KdcCa4HqSZlxwN1iDFmQg=; b=G1CP2nRJ0VFICM8ExccA0dy9eMSTWBZWlfEiyFYHPg1QYb0Hvbfl/5zuPzpvB9V4DkL8kX 057Rko/6wRPjGuLyHMvhdfux/t0AvHKXKRuAWEZ2PcLbpjwYnBhfdBZj/GBu5L+P3f8wWI XYeMBqTuKdxmybfLRRwlb7Zf5NKjCQtDpOIU/wGiu5fU60JLzohZEJVLnnURuY7q2tQnbV HL9wYAnoyqFFxEA3+yB+u74jtgvO0QbgiVTsvp30A2WhZ+oFDGQjJAESCEMIstzuJ4bUGl Dvs7pp4yz02prxXCR862hb5wrPfdwri2NsrA/HextVaQahCYNfeVmP0fwEdwXw== From: Andrew Tropin In-Reply-To: <875yfv38jt.fsf@gnu.org> References: <87sfk34zhj.fsf@trop.in> <87bkqm8p5z.fsf@gnu.org> <87o7uec39j.fsf@trop.in> <8735bmsee9.fsf@gnu.org> <87h702uzrb.fsf@gnu.org> <87y1tdkv1d.fsf@gnu.org> <875yfv38jt.fsf@gnu.org> Date: Thu, 10 Nov 2022 14:06:33 +0400 Message-ID: <87h6z75djq.fsf@trop.in> MIME-Version: 1.0 Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha512; protocol="application/pgp-signature" X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-BeenThere: bug-guix@gnu.org List-Id: Bug reports for GNU Guix List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-guix-bounces+larch=yhetil.org@gnu.org Sender: bug-guix-bounces+larch=yhetil.org@gnu.org X-Migadu-Flow: FLOW_IN X-Migadu-Country: US ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=yhetil.org; s=key1; t=1668075089; h=from:from:sender:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:resent-cc:resent-from:resent-sender: resent-message-id:in-reply-to:in-reply-to:references:references: list-id:list-help:list-unsubscribe:list-subscribe:list-post: dkim-signature; bh=zmZbfLD4i93CGK3gha5lI+KdcCa4HqSZlxwN1iDFmQg=; b=r56iC2aNQA19z21OzVYObcCSPlBQF6a/POk27V0cajMEr9ChH+dMJEt8KDWAQpd9UCYpOg dsmFT1Co3piCoNRkcSVeK+TSd4dLrIUz16kiXEpAAs/I9nGXC9bdcfaOJu1XypDrOehNb+ Qc+SX0QuoPmY3rp7XHDSRQIMwqnnq3pRTww5BRTlZxHHnGyFKYj6WMJAZ09SDWgrsFVmvP QAVYAOCFtNwFAWQ9iAi7knN/n+J53i3rE+vgvvOKtLBCdBvbCQLp0r9AltuW+BQyEONryg 0hqJRQ297TFoyeY/2i70sgxnQ7j2GobpOOHxwAhCZbNC/KRGAvpDH5lx/Tv99Q== ARC-Seal: i=1; s=key1; d=yhetil.org; t=1668075089; a=rsa-sha256; cv=none; b=TLNSVRYQrClnsjA4YFx4+/uK2TyTqIEbbyZBVkSqpA8t5KcYYSqtKqyqvy+13ToNU3CkPl EWXTcoCR7OMw3M12MQ1wr/UmqzVS51eYTH7rcLUz+Gn7CuLS8i/qUmYCUAb+BTG05oSOjz RCONKkVyWcdlKuOmoo5DtP3yZFwR36oikDaWZH+4JGiVo9dvhwed69mpJ+2bIzR/8Z6rqE 5MzklJ+pWlLhVV8WURNKMWH4AVBRDxLtsbLX1envNB4p7HdGzhnuGIGJBDZjCXPIhX+YWU Q6RxI7jSW4CFl1tVuOWkD1mfVgFrVxv1RrLXMRuIQOV2RGDvzlfOZc1UvyivDg== ARC-Authentication-Results: i=1; aspmx1.migadu.com; dkim=fail ("headers rsa verify failed") header.d=trop.in header.s=gm1 header.b=G1CP2nRJ; dmarc=none; spf=pass (aspmx1.migadu.com: domain of "bug-guix-bounces+larch=yhetil.org@gnu.org" designates 209.51.188.17 as permitted sender) smtp.mailfrom="bug-guix-bounces+larch=yhetil.org@gnu.org" X-Migadu-Spam-Score: 0.28 Authentication-Results: aspmx1.migadu.com; dkim=fail ("headers rsa verify failed") header.d=trop.in header.s=gm1 header.b=G1CP2nRJ; dmarc=none; spf=pass (aspmx1.migadu.com: domain of "bug-guix-bounces+larch=yhetil.org@gnu.org" designates 209.51.188.17 as permitted sender) smtp.mailfrom="bug-guix-bounces+larch=yhetil.org@gnu.org" X-Migadu-Queue-Id: 89CB6BC81 X-Spam-Score: 0.28 X-Migadu-Scanner: scn1.migadu.com X-TUID: xeWE9INxeeHU --=-=-= Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable On 2022-11-04 00:47, Marius Bakke wrote: > Ludovic Court=C3=A8s skriver: > >> Hi, >> >> Marius Bakke skribis: >> >>> I have the same problem. Log messages around failure look something >>> like this (extracted from above message): >>> >>> Oct 4 11:51:49 localhost shepherd[1]: Service sshd-2 has been starte= d.=20 >>> Oct 4 11:51:50 localhost sshd[250]: Accepted publickey for bob from = 185.70.53.164 port 1915 ssh2: RSA SHA256:/X3jyhyMizAOKOjCfXK5cLN3Pa0vmi7dbQ= G+fxK3d3Y >>> Oct 4 11:52:28 localhost sshd[252]: error: no more sessions >>> Oct 4 11:52:28 localhost shepherd[1]: 1 connection still in use afte= r sshd-2 termination.=20 >>> Oct 4 11:52:28 localhost shepherd[1]: Service sshd-2 has been disabl= ed.=20 >>> >>> Then deploy crashes with 'Channel opening failure'. >> >> =E2=80=9Cno more sessions=E2=80=9D comes from this: >> >> --8<---------------cut here---------------start------------->8--- >> int >> session_open(Authctxt *authctxt, int chanid) >> { >> Session *s =3D session_new(); >> debug("session_open: channel %d", chanid); >> if (s =3D=3D NULL) { >> error("no more sessions"); >> return 0; >> } >> >> [...] >> >> } >> --8<---------------cut here---------------end--------------->8--- >> >> Would be interesting to run sshd in verbose/debug mode and see why we >> hit that; it could be because the maximum number of sessions has been >> reached or something. > > I enabled (log-level 'debug) and this is what happens server side: > > Nov 3 21:48:25 capella sshd[26345]: debug1: permanently_set_uid: 0/0 > Nov 3 21:50:27 capella sshd[26115]: debug1: Received SIGCHLD. > Nov 3 21:50:27 capella sshd[26115]: debug1: session_by_pid: pid 26345 > Nov 3 21:50:27 capella sshd[26115]: debug1: session_exit_message: sessio= n 8 channel 8 pid 26345 > Nov 3 21:50:27 capella sshd[26115]: debug1: session_exit_message: releas= e channel 8 > Nov 3 21:50:28 capella sshd[26115]: debug1: server_input_channel_open: c= type session rchan 65 win 64000 max 32768 > Nov 3 21:50:28 capella sshd[26115]: debug1: input_session_request > Nov 3 21:50:28 capella sshd[26115]: debug1: channel 0: new [server-sessi= on] > Nov 3 21:50:28 capella sshd[26115]: debug1: session_new: session 0 > Nov 3 21:50:28 capella sshd[26115]: debug1: session_open: channel 0 > Nov 3 21:50:28 capella sshd[26115]: debug1: session_open: session 0: lin= k with channel 0 > Nov 3 21:50:28 capella sshd[26115]: debug1: server_input_channel_open: c= onfirm session > Nov 3 21:50:28 capella sshd[26115]: debug1: server_input_channel_req: ch= annel 0 request exec reply 1 > Nov 3 21:50:28 capella sshd[26115]: debug1: session_by_channel: session = 0 channel 0 > Nov 3 21:50:28 capella sshd[26115]: debug1: session_input_channel_req: s= ession 0 req exec > Nov 3 21:50:28 capella sshd[26535]: debug1: PAM: reinitializing credenti= als > Nov 3 21:50:28 capella sshd[26535]: debug1: permanently_set_uid: 0/0 > Nov 3 21:50:28 capella sshd[26115]: debug1: server_input_channel_open: c= type session rchan 66 win 64000 max 32768 > Nov 3 21:50:28 capella sshd[26115]: debug1: input_session_request > Nov 3 21:50:28 capella sshd[26115]: debug1: channel 10: new [server-sess= ion] > Nov 3 21:50:28 capella sshd[26115]: debug1: session_open: channel 10 > ["no more sessions" error occurs here, in a different log file] > Nov 3 21:50:28 capella sshd[26115]: debug1: session open failed, free ch= annel 10 > Nov 3 21:50:28 capella sshd[26115]: debug1: channel 10: free: server-ses= sion, nchannels 11 > Nov 3 21:50:28 capella sshd[26115]: debug1: server_input_channel_open: f= ailure session > Nov 3 21:50:28 capella sshd[26115]: debug1: channel 0: free: server-sess= ion, nchannels 10 > Nov 3 21:50:28 capella sshd[26115]: debug1: channel 1: free: server-sess= ion, nchannels 9 > Nov 3 21:50:28 capella sshd[26115]: debug1: channel 2: free: server-sess= ion, nchannels 8 > Nov 3 21:50:28 capella sshd[26115]: debug1: channel 3: free: server-sess= ion, nchannels 7 > Nov 3 21:50:28 capella sshd[26115]: debug1: channel 4: free: server-sess= ion, nchannels 6 > Nov 3 21:50:28 capella sshd[26115]: debug1: channel 5: free: server-sess= ion, nchannels 5 > Nov 3 21:50:28 capella sshd[26115]: debug1: channel 6: free: server-sess= ion, nchannels 4 > Nov 3 21:50:28 capella sshd[26115]: debug1: channel 7: free: server-sess= ion, nchannels 3 > Nov 3 21:50:28 capella sshd[26115]: debug1: channel 8: free: server-sess= ion, nchannels 2 > Nov 3 21:50:28 capella sshd[26115]: debug1: channel 9: free: server-sess= ion, nchannels 1 > Nov 3 21:50:28 capella sshd[26115]: debug1: do_cleanup > > I compared this with a successful deploy: > > Nov 3 21:44:22 capella sshd[25842]: debug1: permanently_set_uid: 0/0 > Nov 3 21:46:25 capella sshd[25612]: debug1: Received SIGCHLD. > Nov 3 21:46:25 capella sshd[25612]: debug1: session_by_pid: pid 25842 > Nov 3 21:46:25 capella sshd[25612]: debug1: session_exit_message: sessio= n 6 channel 6 pid 25842 > Nov 3 21:46:25 capella sshd[25612]: debug1: session_exit_message: releas= e channel 6 > Nov 3 21:46:26 capella sshd[25612]: debug1: server_input_channel_open: c= type session rchan 65 win 64000 max 32768 > Nov 3 21:46:26 capella sshd[25612]: debug1: input_session_request > Nov 3 21:46:26 capella sshd[25612]: debug1: channel 0: new [server-sessi= on] > Nov 3 21:46:26 capella sshd[25612]: debug1: session_new: session 0 > Nov 3 21:46:26 capella sshd[25612]: debug1: session_open: channel 0 > Nov 3 21:46:26 capella sshd[25612]: debug1: session_open: session 0: lin= k with channel 0 > Nov 3 21:46:26 capella sshd[25612]: debug1: server_input_channel_open: c= onfirm session > Nov 3 21:46:26 capella sshd[25612]: debug1: server_input_channel_req: ch= annel 0 request exec reply 1 > Nov 3 21:46:26 capella sshd[25612]: debug1: session_by_channel: session = 0 channel 0 > Nov 3 21:46:26 capella sshd[25612]: debug1: session_input_channel_req: s= ession 0 req exec > Nov 3 21:46:26 capella sshd[26032]: debug1: PAM: reinitializing credenti= als > Nov 3 21:46:26 capella sshd[26032]: debug1: permanently_set_uid: 0/0 > Nov 3 21:46:26 capella sshd[25612]: debug1: server_input_channel_open: c= type session rchan 66 win 64000 max 32768 > Nov 3 21:46:26 capella sshd[25612]: debug1: input_session_request > Nov 3 21:46:26 capella sshd[25612]: debug1: channel 8: new [server-sessi= on] > Nov 3 21:46:26 capella sshd[25612]: debug1: session_new: session 8 > Nov 3 21:46:26 capella sshd[25612]: debug1: session_open: channel 8 > Nov 3 21:46:26 capella sshd[25612]: debug1: session_open: session 8: lin= k with channel 8 > Nov 3 21:46:26 capella sshd[25612]: debug1: server_input_channel_open: c= onfirm session > Nov 3 21:46:26 capella sshd[25612]: debug1: server_input_channel_req: ch= annel 8 request exec reply 1 > Nov 3 21:46:26 capella sshd[25612]: debug1: session_by_channel: session = 8 channel 8 > Nov 3 21:46:26 capella sshd[25612]: debug1: session_input_channel_req: s= ession 8 req exec > Nov 3 21:46:26 capella sshd[26043]: debug1: PAM: reinitializing credenti= als > Nov 3 21:46:26 capella sshd[26043]: debug1: permanently_set_uid: 0/0 > Nov 3 21:46:28 capella sshd[25612]: debug1: chan_shutdown_extended_read:= channel 8: (i0 o3 sock -1 wfd -1 efd 13 [read]) > Nov 3 21:46:28 capella sshd[25612]: debug1: server_input_channel_open: c= type session rchan 67 win 64000 max 32768 > Nov 3 21:46:28 capella sshd[25612]: debug1: input_session_request > Nov 3 21:46:28 capella sshd[25612]: debug1: channel 9: new [server-sessi= on] > Nov 3 21:46:28 capella sshd[25612]: debug1: session_new: session 9 > [... more channels and stuffs before graceful shutdown ...] > > I believe the two-minute window before SIGCHLD is waiting for some > Shepherd services to time out (long story..!). > > In my current tests the failure always occur at the > very end of deploy: > > building /gnu/store/19yajyzw4jhnkkz9w0l9gm4az0jxihkc-install-bootloader.s= cm.drv... > building /gnu/store/m7ngq5gszyswm6sficinz7ghpra30dl4-remote-exp.scm.drv... > ;;; [2022/11/03 21:50:28.892606, 0] [GSSH ERROR] Channel opening failure:= channel 66 error (2) open failed: # > > What's interesting is that all of the failed deploys fail to open > "channel 10", whereas successful deploys never reach it. All of these > are deploying the exact same configuration and commit, so the SSH > traffic should nearly identical. > > I have attached the transcripts of each session (by grepping the PIDs > from /var/log/debug). Grepping for session_open yields some > inconstencies, but haven't studied them in detail yet. > >>> Due to the number of SSH connections made by deploy and frequent >>> occurence of this problem, I was not able to successfully deploy through >>> many attempts. >> >> Ouch. Normally, thanks to memoization, =E2=80=98guix deploy=E2=80=99 op= ens only one >> session per machine. (I think it works well with the 25 local build >> machines behind berlin; it=E2=80=99s also fine for a small set of machin= es I >> take care of at work.) > > Have they been rebooted since the switch to inetd-style sshd? It fails > more often than not for me. > >>> I found that removing the memoizing open-machine-ssh-session* in >>> (gnu machine ssh) works around it and can happily deploy again. >>> >>> (that is, just use 'open-machine-ssh-session' instead) >> >> Interesting. That gives us a hint. Could you add a =E2=80=98pk=E2=80= =99 in >> =E2=80=98open-machine-ssh-session*=E2=80=99 to see how many connections = it opens? > > Indeed it only opens a single connection. The problem seems to be > with multiple "channels" and "sessions" over a single connection. > Very detailed report, thank you for digging in! CCed Artyom Poptsov, author of Guile-SSH. =2D-=20 Best regards, Andrew Tropin --=-=-= Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAEBCgAdFiEEKEGaxlA4dEDH6S/6IgjSCVjB3rAFAmNszSoACgkQIgjSCVjB 3rC+bBAAgcU8svGJMtm7hzefZHQfH0cs2XP/6VFd5ae5AqpQ4Kw2ZENvED+eSIdv BuzmsdXhLwLGITEFJdhyROrDlKtz85IwCsl9NXrWrOa0cNVW9bD9ibqjKnNcCMP8 jMjdoGxo2Khj6DAMdma9l1G7BfJrutTHvxid3MrmnSk3ZEQLK0haEKkkifK7vogq l1s1sLicEyx+9ux4BweZoGbIwuXVl7/dwiRNyS/02xE+8S8YR4dTuOFMoAH5P7Hy 2GlQyif9jyZji58msLhbfwu4WmZf0yiVRIDzjoNA5YtT8ZCJ/0auOCihtQi1ri2o Nwaf44LywmadtO7RpXnp4rVyxvXV3SYcgXajDFyMAZxvKEHqIc8pUkvpcdS/RfkR KLkCOY03Xlf52F8Ien3+9g6zLibDqaXXgUF9GNVfg7FqaxVH2LlTc0EUcBIL7oiQ /TvddMtklO+nMXfVx9OBIiI5RgwT0eW93nxSOO0Ouh2nh9jJDLuD9oQoyeudd/w4 rD0LkWlBSOSqi+8bJ4QkpmbMLELoEIi9IiNa/HpH0W3q8H5DFWLtkZ70XRlMxcky IunQA1sQDXfSBMPWzH798ESk9RfIxwTSdIKdXG3Y02kd4PFIjjvUoSjqUjZ3jDLB 6n6lwWUUZCCxvm3U/WRSOqGCtvJAxZaoTJnuq/aIFw7dr4/LQmI= =/h8b -----END PGP SIGNATURE----- --=-=-=--