From: Ludovic Courtès
To: zimoun
Cc: guix-devel@gnu.org
Subject: Re: Substitute retention
Date: Fri, 15 Oct 2021 11:27:17 +0200
Message-ID: <87v91yk58q.fsf@inria.fr>
In-Reply-To: <861r4qkti9.fsf@gmail.com> (zimoun's message of "Tue, 12 Oct 2021 20:06:22 +0200")

Hi!

zimoun wrote:

>>> missed by both build farms using 2 different strategies to collect the
>>> thing to build (fetch every 5 minutes or fetch from guix-commits). It
>>> is a quick back-of-the-envelope estimate, so take it with a grain of
>>> salt. :-)
>>
>> OK.
>
> To make #1 explicit, I was talking about the “modular” Guix, i.e.,
> running “guix pull” or “guix time-machine” leads to building the
> derivations module-import.drv, guix-.drv, guix-command.drv,
> guix-module-union.drv, guix--modules.drv,
> guix-packages-modules.drv, guix-system-tests-modules.drv,
> guix-packages-base-modules.drv, etc. On slow machines, it can be
> unpleasant, not to say impractical. Even for recent commits.

Ah I see. Yeah, this can be kinda annoying, and it is amplified by the
fact that CI only builds at each push, not at each commit.

That said, this is mitigated by the fact that one typically travels to a
previously-fetched commit, which is a commit that has been built by CI
rather than a commit in between two pushes.

> Basically, commit 59d10c3112 is from March 14, 2020 and it takes ~29min
> on my slow laptop. And to compare apples to apples, let's take another
> commit one year later, from March 14, 2021, e.g., commit 7327295462. It
> takes ~5min on the same machine.

Yeah, OK.

> To be on the same wavelength,
>
> $ git log --format="%h %cd" --after=2021-03-14 --reverse | head -n16
> [...]
> 2babf7d831 Sun Mar 14 19:16:55 2021 +0100
> b15720182e Sun Mar 14 13:24:21 2021 -0500
> 207aa62e6b Sun Mar 14 13:24:21 2021 -0500
> 30f5381487 Sun Mar 14 13:24:21 2021 -0500
> af25357b7d Sun Mar 14 13:24:21 2021 -0500
> 7164d2105a Sun Mar 14 13:24:21 2021 -0500
> 078f3288e2 Sun Mar 14 13:24:21 2021 -0500
> 5a31eb7d35 Sun Mar 14 13:24:21 2021 -0500
> 620206b680 Sun Mar 14 13:24:22 2021 -0500
> b76762a9b7 Sun Mar 14 13:24:22 2021 -0500
> cbfcbb79df Sun Mar 14 19:43:35 2021 +0100
>
> and Cuirass builds only one of b15720182e, 207aa62e6b, 30f5381487,
> af25357b7d, 7164d2105a, 078f3288e2, 5a31eb7d35, 620206b680 or
> b76762a9b7.
>
> Considering the Build Coordinator, it uses guix-commits and from my
> understanding it reads:
>
>
>
> therefore, b15720182e would be missed but not b76762a9b7, which would be
> missed by Cuirass.
>
> Neither Cuirass nor the Build Coordinator can build both commits
> b15720182e and b76762a9b7.
>
> Cuirass checks every 5 minutes and the Build Coordinator reads “state”
> from guix-commits. In other words, neither of them builds all these
> “modular” derivations for all the commits, even for recent commits.
>
> The rough estimate is that half of the commits are missed by both build
> farms. Therefore, using “guix time-machine” with a random commit, one
> has a 1/2 probability of having to build something just to get the
> inferior, TTL policy aside.

Right. Not every derivation produced by (guix self) needs to be rebuilt
in between two commits, but anything that depends on *package-modules*
typically has to be rebuilt.

We can reduce the amount of rebuilding like I did in commit
abd38dcee16f0ac71191527c38dcd3659111e2ba, but you’ll always have the big
(gnu packages …) derivation.

>> So what can we do to address this issue? I *think* we could use a
>> higher TTL on berlin, and we can try that right away (9 months to begin
>> with?).
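
The coverage argument above can be checked against the quoted listing. The following Python sketch is only an illustration, not how Cuirass or the Build Coordinator actually select commits. It assumes (hypothetically) that consecutive commits whose committer timestamps are within a few seconds of each other arrived in the same push, and that a farm builds exactly one commit per push. The helper names `parse` and `group_pushes` and the 5-second window are made up for this example.

```python
from datetime import datetime, timedelta

# The eleven commits visible in the listing above ("[...]" elides the rest).
LISTING = """\
2babf7d831 Sun Mar 14 19:16:55 2021 +0100
b15720182e Sun Mar 14 13:24:21 2021 -0500
207aa62e6b Sun Mar 14 13:24:21 2021 -0500
30f5381487 Sun Mar 14 13:24:21 2021 -0500
af25357b7d Sun Mar 14 13:24:21 2021 -0500
7164d2105a Sun Mar 14 13:24:21 2021 -0500
078f3288e2 Sun Mar 14 13:24:21 2021 -0500
5a31eb7d35 Sun Mar 14 13:24:21 2021 -0500
620206b680 Sun Mar 14 13:24:22 2021 -0500
b76762a9b7 Sun Mar 14 13:24:22 2021 -0500
cbfcbb79df Sun Mar 14 19:43:35 2021 +0100
"""


def parse(listing):
    """Return (hash, timezone-aware datetime) pairs from "%h %cd" lines."""
    commits = []
    for line in listing.splitlines():
        commit, date = line.split(" ", 1)
        when = datetime.strptime(date, "%a %b %d %H:%M:%S %Y %z")
        commits.append((commit, when))
    return commits


def group_pushes(commits, window=timedelta(seconds=5)):
    """Heuristic (an assumption): consecutive commits within `window` share a push."""
    pushes = []
    for commit, when in commits:
        if pushes and abs(when - pushes[-1][-1][1]) <= window:
            pushes[-1].append((commit, when))
        else:
            pushes.append([(commit, when)])
    return pushes


commits = parse(LISTING)
pushes = group_pushes(commits)
built = len(pushes)  # one commit built per push, by assumption
print(f"{len(commits)} commits in {len(pushes)} pushes; "
      f"a single farm would build {built / len(commits):.0%} of them")
```

The result depends heavily on the sample and on the batching heuristic, so it says nothing precise about the one-half estimate; it only shows how a batch of nine commits pushed together leaves most of them unbuilt.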
>
> I *think* the issue is not TTL for question #1. :-) The issue is that
> neither build farm builds these “modular” derivations for all the
> commits. Here, I am focused on x86_64-linux, which is the case of
> interest for this topic (scientific context), IMHO.
>
> Building every commit for all architectures is not affordable.
>
> I agree that increasing the TTL will help for question #2 about
> long-term support of substitutes.

Understood!

>> However, there is an upper bound anyway. To make informed decisions on
>> the retention policy, we should monitor storage space on berlin/bayfront
>> to better estimate what can be done. We have Zabbix but it’s not
>> accessible from the outside; maybe we could graph storage space
>> somewhere so people can grab the data and work on those estimates?
>
> Based on the size of these derivations for one commit, we could
> make a back-of-the-envelope extrapolation. Well, question #1 seems
> doable storage-wise.
>
> The issue with #1 is building these derivations for all the commits.
> IMHO.
>
> About #2, yeah, if some data are available, I can try to make some
> estimates.
>
>
> Well, #1 seems actionable. However, #2 raises…
>
>> What if we decide that we need to provide substitutes for 2y old
>> commits? In that case, we need a plan to scale up. That could be
>> renting storage space somewhere. That’s largely non-technical work that
>> needs attention.
>
> …a strong question. :-) What do “we” do for what “we” build?
>
> Indeed, numbers are missing to make informed decisions on long-term
> storage of substitutes. What is Nix doing?

Nix, AFAIK, is doing what everyone else does: pouring money into Amazon.
Last I heard they’d retain substitutes basically indefinitely on Amazon S3
(incidentally, one motivation for them to work with Software Heritage,
AIUI, is that it would allow them to store less data on the storage they
pay for :-)).
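
One way to frame the retention question is as steady-state arithmetic: if substitutes arrive at a roughly constant rate and are deleted once they are older than the TTL, the space in use settles at about rate × TTL. The Python sketch below encodes just that relation. It is a simplification (it ignores, for instance, any extension of an item's lifetime when it is fetched), the function names are hypothetical, and the numbers in the usage lines are made-up placeholders, since the real figures are exactly what is missing.

```python
def steady_state_tib(ingest_tib_per_day, ttl_days):
    """Space in use once deletions balance arrivals (constant-rate assumption)."""
    return ingest_tib_per_day * ttl_days


def max_ttl_days(capacity_tib, ingest_tib_per_day):
    """Longest TTL that still fits in `capacity_tib` under the same assumption."""
    if ingest_tib_per_day <= 0:
        raise ValueError("ingest rate must be positive")
    return capacity_tib / ingest_tib_per_day


# Placeholder numbers, NOT measurements of any build farm.
rate = 0.05      # TiB of new substitutes per day (hypothetical)
capacity = 30.0  # TiB usable for substitutes (hypothetical)

print(f"9-month TTL needs about {steady_state_tib(rate, 270):.1f} TiB")
print(f"2-year TTL needs about {steady_state_tib(rate, 730):.1f} TiB")
print(f"longest TTL that fits: about {max_ttl_days(capacity, rate):.0f} days")
```

With measured ingest rates in place of the placeholders, the same two functions would show whether a 9-month or 2-year policy fits on a given disk.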

For the record, berlin (aka ci.guix.gnu.org; it was donated by the Max
Delbrück Center, MDC, and is generously hosted by them) has a 37 TiB
disk for /gnu/store and “baked” substitutes. That’s a lot.

Technically though, a lot of it is used by less important substitutes
such as disk images or intermediate ‘core-updates’ substitutes. In the
end we seem to be filling it more quickly than you’d think!

Perhaps we need a better strategy with a low TTL for, say, intermediate
‘core-updates’ substitutes (no need to keep them more than a few weeks
if we know we’re doing a world rebuild right after). It cannot be done
as things are though because ‘guix publish’ doesn’t distinguish between
store items.

Or we could restart the Amazon front-end that Chris Marusich had set up
right before 1.0 was released.

Or we could build our own front-end for substitute delivery as a proxy
to berlin, thereby distributing the burden.

Thoughts?

> I think that having 2 build farms building in parallel is a strength.
> So let exploit it. :-) What one could have in mind is to challenge the
> outputs; if they are identical, let keep only one version “somewhere”
> and remove the other from the “elsewhere”.
>
> For instance, we (I? with help) could resume this discussion:
>
>

I hadn’t seen this message, interesting!

Note however that bordeaux.guix has a tenth of the storage space of
berlin (3.6 TiB), so right now we probably can’t count on it for
long-term substitute storage.

> Or maybe, for the identical outputs, one could imagine (dream? for) a
> cooking service for missing outputs. Well, I do not know how this is
> actionable. :-)

Well, if we keep .drv around, we could arrange so that ‘guix publish’
rebuilds on-demand, after all. I’m not sure how practical that would be,
though.

Ludo’.
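
The idea of a lower TTL for some classes of substitutes, raised earlier in this message, could look roughly like the following Python sketch. It is purely illustrative: as the message itself notes, ‘guix publish’ does not distinguish between store items, so the category labels, the `TTL_DAYS` table, and the `expired` helper are all hypothetical and correspond to no existing Guix interface. The TTL values are placeholders, not a proposed policy.

```python
from datetime import datetime, timedelta

# Hypothetical per-category TTLs in days (placeholder values).
TTL_DAYS = {
    "core-updates-intermediate": 21,  # "a few weeks"
    "disk-image": 30,
    "default": 270,                   # roughly nine months
}


def expired(items, now):
    """Return the names of items not accessed within their category's TTL.

    `items` is an iterable of (name, category, last_access) tuples;
    unknown categories fall back to the "default" TTL.
    """
    result = []
    for name, category, last_access in items:
        ttl = timedelta(days=TTL_DAYS.get(category, TTL_DAYS["default"]))
        if now - last_access > ttl:
            result.append(name)
    return result


now = datetime(2021, 10, 15)
items = [
    ("gcc-on-core-updates", "core-updates-intermediate", now - timedelta(days=40)),
    ("guix-system-image", "disk-image", now - timedelta(days=10)),
    ("hello", "default", now - timedelta(days=100)),
]
print(expired(items, now))
```

Only the intermediate ‘core-updates’ item is past its TTL here; the other two are kept because their category allows a longer lifetime. The hard part, as noted, is obtaining the category in the first place.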