From: Ludovic Courtès
To: Christopher Baines
Cc: guix-devel@gnu.org
Subject: Re: Thoughts on building things for substitutes and the Guix Build Coordinator
Date: Tue, 17 Nov 2020 23:10:36 +0100
Message-ID: <87blfvocrn.fsf@gnu.org>
In-Reply-To: <87tutnlnjy.fsf@cbaines.net> (Christopher Baines's message of "Tue, 17 Nov 2020 20:45:53 +0000")

Hi!

Christopher Baines wrote:

> The way the Guix Build Coordinator generates compressed nars where the
> agent runs, then sends them over the network to the coordinator has a
> few benefits. The (sometimes expensive) work of generating the nars
> takes place where the agents are, so if you've got a bunch of machines
> running agents, that work is distributed.
>
> Also, when the nars are received by the coordinator, you have exactly
> what you need for serving substitutes. You just generate narinfo files,
> and then place the nars + narinfos where they can be fetched. The Guix
> Build Coordinator contains code to help with this.
>
> Because you aren't copying the store items back in to a single store, or
> serving substitutes from the store, you don't need to scale the store to
> serve more substitutes. You've still got a bunch of nars + narinfos to
> store, but I think that is an easier problem to tackle.

Yes, this is good for the use case of providing substitutes and it would
certainly help on a big build farm like berlin.

I see a lot could be shared with (guix scripts publish) and (guix
scripts substitute). We should extract the relevant bits and move them
to new modules explicitly meant for more general consumption. I think
it’s important to reduce duplication.

> The Guix Build Coordinator supports prioritisation of builds. You can
> assign a priority to builds, and it'll try to order builds in such a way
> that the higher priority builds get processed first. If the aim is to
> serve substitutes, doing some prioritisation might help building the
> most fetched things first.

Neat!

> Another feature supported by the Guix Build Coordinator is retries. If a
> build fails, the Guix Build Coordinator can automatically retry it. In a
> perfect world, everything would succeed first time, but because the
> world isn't perfect, there still can be intermittent build
> failures. Retrying failed builds even once can help reduce the chance
> that a failure leads to no substitutes for that builds as well as any
> builds that depend on that output.

That’s nice too; it’s one of the practical issues we have with Cuirass
and that’s tempting to ignore because “hey it’s all functional!”, but
then reality gets in the way.

> Now the not so good things:
>
> The Guix Build Coordinator just builds things, if you want to build all
> Guix packages, you need to work out the derivations, then submit builds
> for all of them. There's a script I wrote that does this with the help
> of a Guix Data Service instance, but that might not be ideal for all
> deployments. Even though it can handle the building of things, and most
> of the serving substitutes part (just not the serving bit), some other
> component(s) are needed.

That’s OK; it’s good that these two things (computing derivations and
building them) are separate.

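To make the prioritisation and retry behaviour discussed above concrete,
here is a minimal sketch in Python. It is not how the Guix Build
Coordinator implements either feature; the class name `BuildQueue`, the
`max_attempts` parameter and the `build` callback are all hypothetical,
and the queue lives purely in memory. Higher-priority builds are popped
first, and a failed build is put back on the queue until it has used up
its attempts.

```python
import heapq
import itertools


class BuildQueue:
    """Hypothetical in-memory queue: highest priority first, FIFO within a priority."""

    def __init__(self, max_attempts=2):
        self.max_attempts = max_attempts
        self._heap = []
        # Monotonic counter used as a tie-breaker so that builds with the
        # same priority keep their submission order.
        self._counter = itertools.count()

    def submit(self, derivation, priority=0, attempts=0):
        # heapq is a min-heap, so the priority is negated.
        entry = (-priority, next(self._counter), derivation, attempts)
        heapq.heappush(self._heap, entry)

    def run(self, build):
        """Process every queued build; build(derivation) returns True on success."""
        succeeded, failed = [], []
        while self._heap:
            neg_priority, _, derivation, attempts = heapq.heappop(self._heap)
            attempts += 1
            if build(derivation):
                succeeded.append(derivation)
            elif attempts < self.max_attempts:
                # Possibly an intermittent failure: retry at the same priority.
                self.submit(derivation, -neg_priority, attempts)
            else:
                failed.append(derivation)
        return succeeded, failed
```

With `max_attempts=2`, a build that fails once and then succeeds still
ends up in the list of successes, which is the "retry even once" case
mentioned in the quoted text.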
> Because the build results don't end up in a store (they could, but as
> set out above, not being in the store is a feature I think), you can't
> use `guix gc` to get rid of old store entries/substitutes. I have some
> ideas about what to implement to provide some kind of GC approach over a
> bunch of nars + narinfos, but I haven't implemented anything yet.

‘guix publish’ has support for that via (guix cache), so if we could
share code, that’d be great.

One option would be to populate /var/cache/guix/publish and to let ‘guix
publish’ serve it from there.

> There could be issues with the implementation… I'd like to think it's
> relatively simple, but that doesn't mean there aren't issues. For some
> reason or another, getting backtraces for exceptions rarely works. Most
> of the time the coordinator tries to print a backtrace, the part of
> Guile doing that raises an exception. I've managed to cause it to
> segfault, through using SQLite incorrectly, which hasn't been obvious to
> fix at least for me. Additionally, there are some places where I'm
> fighting against bits of Guix, things like checking for substitutes
> without caching, or substituting a derivation without starting to build
> it.

I haven’t yet watched your talk but I’ve watched Mathieu’s, where he
admits to being concerned about the reliability of code involving Fibers
and/or SQLite (which I can understand given his/our experience, although
I’m maybe less pessimistic). What’s your experience, how do you feel
about it?

> Finally, the instrumentation is somewhat reliant on Prometheus, and if
> you want a pretty dashboard, then you might need Grafana too. Both of
> these things aren't packaged for Guix, Prometheus might be feasible to
> package within the next few months, I doubt the same is true for Grafana
> (due to the use of NPM).

Heh. :-)

Thanks for this update!

Ludo’.