From mboxrd@z Thu Jan 1 00:00:00 1970
From: Mathieu Othacehe
To: 48468@debbugs.gnu.org
Subject: bug#48468: substitute server connection timeout
Date: Sun, 16 May 2021 19:57:49 +0200
Message-ID: <87lf8e4l42.fsf@gnu.org>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
List-Id: Bug reports for GNU Guix

Hello,

We have recently been seeing a lot of these errors on Cuirass:

--8<---------------cut here---------------start------------->8---
guix substitute: warning: while fetching http://141.80.167.131:5557/nar/g7ka09613k5v1vlznh87yg35905ggw51-python2-scipy-1.2.2-guile-builder: server is somewhat slow
guix substitute: warning: try
`--no-substitutes' if the problem persists
guix substitute: error: connect*: Connection timed out
--8<---------------cut here---------------end--------------->8---

which means that the workers are failing to connect to the Cuirass
remote-server publish server on berlin at 141.80.167.131:5557.

Running strace on this publish server shows that connection reuse seems
to be broken: the same client (141.80.167.185) keeps opening new
connections, each from a new source port:

--8<---------------cut here---------------start------------->8---
accept4(9, {sa_family=AF_INET, sin_port=htons(41742), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
accept4(9, {sa_family=AF_INET, sin_port=htons(41744), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
accept4(9, {sa_family=AF_INET, sin_port=htons(41746), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 25
accept4(9, {sa_family=AF_INET, sin_port=htons(41748), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 24
accept4(9, {sa_family=AF_INET, sin_port=htons(41750), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
accept4(9, {sa_family=AF_INET, sin_port=htons(41752), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
accept4(9, {sa_family=AF_INET, sin_port=htons(41754), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 25
accept4(9, {sa_family=AF_INET, sin_port=htons(41756), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
accept4(9, {sa_family=AF_INET, sin_port=htons(41758), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 26
accept4(9, {sa_family=AF_INET, sin_port=htons(41760), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 24
accept4(9, {sa_family=AF_INET, sin_port=htons(41762), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
accept4(9, {sa_family=AF_INET, sin_port=htons(41764), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
accept4(9, {sa_family=AF_INET, sin_port=htons(41766), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
accept4(9, {sa_family=AF_INET, sin_port=htons(41768), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 22
accept4(9, {sa_family=AF_INET, sin_port=htons(41770), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
accept4(9, {sa_family=AF_INET, sin_port=htons(41772), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
accept4(9, {sa_family=AF_INET, sin_port=htons(41774), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
accept4(9, {sa_family=AF_INET, sin_port=htons(41776), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
accept4(9, {sa_family=AF_INET, sin_port=htons(41778), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
accept4(9, {sa_family=AF_INET, sin_port=htons(41780), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
accept4(9, {sa_family=AF_INET, sin_port=htons(41782), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
--8<---------------cut here---------------end--------------->8---

Investigating this, I found that the connection is closed and reopened
multiple times in the call-with-cached-connection procedure of the
(guix scripts substitute) module. It looks like this happens because a
'bad-header exception is raised when trying to parse an eof object:

--8<---------------cut here---------------start------------->8---
;;; (error bad-header (read-header-line #<eof>))
--8<---------------cut here---------------end--------------->8---

I'm not sure where this eof comes from. There is this comment in the
http-multiple-get procedure in (guix http-client):

--8<---------------cut here---------------start------------->8---
  ;; Swallow networking errors that could occur due to connection reuse
  ;; and the like; they will be handled down the road when trying to
  ;; read responses.
  (false-if-networking-error
   (begin
     (for-each (cut write-request <> buffer) batch)
     (put-bytevector p (get))
     (force-output p))))
--8<---------------cut here---------------end--------------->8---

which would suggest that connection reuse could cause networking errors?
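
To illustrate how reusing a cached connection can end in an eof, here is
a minimal, self-contained Python sketch. It is not Guix code and does not
claim to reproduce the actual cause of this bug. It assumes a hypothetical
server that closes the socket after every response, and a hypothetical
client (CachedClient, BadHeader and fetch_three_times are names made up
for this example) that caches its connection and reconnects once when the
cached one turns out to be stale:

```python
import socket
import threading


class BadHeader(Exception):
    """EOF where an HTTP header line was expected."""


def serve_once_per_connection(listener, connections):
    # Hypothetical server: answer a single request, then close the socket,
    # regardless of whether the client wanted to keep the connection open.
    for _ in range(connections):
        conn, _peer = listener.accept()
        with conn:
            conn.recv(4096)  # the demo request is tiny; one recv is enough
            conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")


class CachedClient:
    """Keeps one connection around and reopens it when it proves stale."""

    def __init__(self, address):
        self.address = address
        self.sock = None
        self.reader = None
        self.connects = 0  # number of connections opened so far

    def _connect(self):
        self.sock = socket.create_connection(self.address, timeout=5)
        self.reader = self.sock.makefile("rb")
        self.connects += 1

    def _close(self):
        self.reader.close()
        self.sock.close()
        self.sock = self.reader = None

    def _exchange(self):
        self.sock.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
        status = self.reader.readline()
        if not status:  # b"" is EOF: the peer closed the connection
            raise BadHeader("EOF while reading the status line")
        length = 0
        while True:
            line = self.reader.readline()
            if line in (b"\r\n", b""):  # end of headers (or EOF)
                break
            name, _, value = line.partition(b":")
            if name.lower() == b"content-length":
                length = int(value.strip())
        return self.reader.read(length)

    def get(self):
        if self.sock is None:
            self._connect()
        try:
            return self._exchange()
        except (BadHeader, ConnectionError):
            # Stale cached connection: drop it and retry once on a fresh one.
            self._close()
            self._connect()
            return self._exchange()


def fetch_three_times():
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))  # any free local port
    listener.listen()
    threading.Thread(target=serve_once_per_connection,
                     args=(listener, 3), daemon=True).start()
    client = CachedClient(listener.getsockname())
    bodies = [client.get() for _ in range(3)]
    return bodies, client.connects


if __name__ == "__main__":
    print(fetch_three_times())
```

In this sketch every request after the first one hits a connection that
the server has already closed, so the client sees either an eof or a
connection error and has to reconnect. On the server side that shows up
as one accept per request, which is similar in shape to the strace output
above.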

What also puzzles me is that the main guix publish server on berlin does
not seem to have this issue. That would indicate that this error is
caused by how the Cuirass remote-server publish server is started or
configured.

Ludo, Chris, any idea? I will keep searching anyway :)

Thanks,

Mathieu