From: Maxim Cournoyer
To: Ludovic Courtès
Cc: guix-devel@gnu.org
Subject: Re: Postmortem of service downtime
In-Reply-To: <877cfk77vk.fsf@gnu.org> ("Ludovic Courtès"'s message of "Thu, 23 May 2024 19:31:11 +0200")
References: <877cfk77vk.fsf@gnu.org>
Date: Fri, 24 May 2024 21:19:44 -0400
Message-ID: <87ikz2it73.fsf@gmail.com>
List-Id: "Development of GNU Guix and the GNU System distribution."

Hi Ludovic,

Ludovic Courtès writes:

> From Sunday May 19th to Tuesday may 21st, for about 36h,
> bayfront.guix.gnu.org, the machine behind many services went down:
>
>   https://lists.gnu.org/archive/html/info-guix/2024-05/msg00000.html
>
> Affected web sites and services included:
>
>   guix.gnu.org
>   bordeaux.guix.gnu.org
>   logs.guix.gnu.org
>   hpc.guix.info
>   foundation.guix.info
>   packages.guix.gnu.org
>   qa.guix.gnu.org

[...]

>     A large part of the slowness was due to ‘guix substitute’ reading
>     all the 300K+ entries from /var/guix/substitute/cache and deleting
>     them, one by one (this took several minutes).  Chris had mentioned
>     that performance issue in the past; it’s not much of a problem on
>     one’s laptop with an SSD, but it’s clearly a problem here where
>     there are more entries than usual.  We should at least drastically
>     reduce the TTL of cache entries.

Interesting!

>   • qa-frontpage failed to build when we first reconfigured the machine,
>     so we commented it out.  This is now fixed:
>
>       https://git.savannah.gnu.org/cgit/guix/maintenance.git/commit/?id=3fecb1e8fdea65a7440fec403c1c52da197b5dfe
>
>   • guix-packages-website (the server behind packages.guix.gnu.org)
>     still refuses to start with an Artanis error:
>
>       https://issues.guix.gnu.org/71138
>
> Ludo’, on behalf on the emergency rescue^W^W sysadmin team.

Phew!  Thanks for the detailed write-up and for the fixes/thankless work
of bringing the machine back up and running.

-- 
Maxim