From mboxrd@z Thu Jan 1 00:00:00 1970
User-agent: mu4e 1.4.13; emacs 27.1
From: Christopher Baines
To: guix-devel@gnu.org
Subject: Thoughts on building things for substitutes and the Guix Build Coordinator
Date: Tue, 17 Nov 2020 20:45:53 +0000
Message-ID: <87tutnlnjy.fsf@cbaines.net>
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="=-=-=";
	micalg=pgp-sha512; protocol="application/pgp-signature"
List-Id: "Development of GNU Guix and the GNU System distribution."

--=-=-=
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable

Hey,

In summary, this email lists the good things and bad things that you
might experience when using the Guix Build Coordinator for providing
substitutes for Guix.

So, over the last ~7 months I've been working on the Guix Build
Coordinator [1]. I think the first email I sent about it is [2], and
I'm not sure if I've sent another one. I did prepare a talk on it,
though, which goes through some of the workings [3].

1: https://git.cbaines.net/guix/build-coordinator/tree/README.org
2: https://lists.gnu.org/archive/html/guix-devel/2020-04/msg00323.html
3: https://xana.lepiller.eu/guix-days-2020/guix-days-2020-christopher-baines-guix-build-coordinator.webm

Over the last few weeks I've fixed up and tested the Guix services for
the Guix Build Coordinator, as well as fixing some major issues like it
segfaulting frequently.
I've been using the Guix Build Coordinator to build substitutes for
guix.cbaines.net, which is my testing ground for providing substitutes.
I think it's working reasonably well.

I wanted to write this email though to set out more about actually
using the Guix Build Coordinator to build things for substitutes, to
help inform any conversations that happen about that.

First, the good things:

The way the Guix Build Coordinator generates compressed nars where the
agent runs, then sends them over the network to the coordinator, has a
few benefits. The (sometimes expensive) work of generating the nars
takes place where the agents are, so if you've got a bunch of machines
running agents, that work is distributed.

Also, when the nars are received by the coordinator, you have exactly
what you need for serving substitutes. You just generate narinfo files,
and then place the nars + narinfos where they can be fetched. The Guix
Build Coordinator contains code to help with this.

Because you aren't copying the store items back into a single store, or
serving substitutes from the store, you don't need to scale the store
to serve more substitutes. You've still got a bunch of nars + narinfos
to store, but I think that is an easier problem to tackle.

This isn't strictly a benefit of the Guix Build Coordinator, but in
contrast to Cuirass when run on a store which is subject to periodic
garbage collection, assuming you're pairing the Guix Build Coordinator
with the Guix Data Service to provide substitutes for the derivations,
you don't run the risk of garbage collecting the derivations prior to
building them. As I say, this isn't really a benefit of the Guix Build
Coordinator: you'd potentially have the same issue if you ran the Guix
Build Coordinator with guix publish (on a machine which GCs) to provide
derivations, but I thought I'd mention it anyway.

The Guix Build Coordinator supports prioritisation of builds.
You can assign a priority to builds, and it'll try to order builds in
such a way that the higher priority builds get processed first. If the
aim is to serve substitutes, doing some prioritisation might help with
building the most fetched things first.

Another feature supported by the Guix Build Coordinator is retries. If
a build fails, the Guix Build Coordinator can automatically retry it.
In a perfect world, everything would succeed first time, but because
the world isn't perfect, there can still be intermittent build
failures. Retrying failed builds even once can help reduce the chance
that a failure leads to no substitutes for that build, as well as for
any builds that depend on its output.

Now the not so good things:

The Guix Build Coordinator just builds things. If you want to build all
Guix packages, you need to work out the derivations, then submit builds
for all of them. There's a script I wrote that does this with the help
of a Guix Data Service instance, but that might not be ideal for all
deployments. Even though it can handle the building of things, and most
of the serving substitutes part (just not the serving bit), some other
component(s) are needed.

Because the build results don't end up in a store (they could, but as
set out above, not being in the store is a feature I think), you can't
use `guix gc` to get rid of old store entries/substitutes. I have some
ideas about what to implement to provide some kind of GC approach over
a bunch of nars + narinfos, but I haven't implemented anything yet.

There could be issues with the implementation… I'd like to think it's
relatively simple, but that doesn't mean there aren't issues. For some
reason or another, getting backtraces for exceptions rarely works. Most
of the time when the coordinator tries to print a backtrace, the part
of Guile doing that raises an exception. I've managed to cause it to
segfault by using SQLite incorrectly, which hasn't been obvious to fix,
at least for me.
Additionally, there are some places where I'm fighting against bits of
Guix, things like checking for substitutes without caching, or
substituting a derivation without starting to build it.

Finally, the instrumentation is somewhat reliant on Prometheus, and if
you want a pretty dashboard, then you might need Grafana too. Neither
of these is packaged for Guix. Prometheus might be feasible to package
within the next few months, but I doubt the same is true for Grafana
(due to the use of NPM).

I think that's a somewhat objective look at what using the Guix Build
Coordinator might be like at the moment. Just let me know if you have
any thoughts or questions.

Thanks,

Chris

--=-=-=
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iQKlBAEBCgCPFiEEPonu50WOcg2XVOCyXiijOwuE9XcFAl+0NoFfFIAAAAAALgAo
aXNzdWVyLWZwckBub3RhdGlvbnMub3BlbnBncC5maWZ0aGhvcnNlbWFuLm5ldDNF
ODlFRUU3NDU4RTcyMEQ5NzU0RTBCMjVFMjhBMzNCMEI4NEY1NzcRHG1haWxAY2Jh
aW5lcy5uZXQACgkQXiijOwuE9XeshhAAi/SnXJqiSjHT9EB/eEDg0o8jP7M6O8rO
C/gbzwpxvsUKduinh1bE/A5iWDaKT4KGauTN9zzwldmltlz7pBwuuikc/1bh8vYK
AlmDjofiwzfJL5rUKCYB9nnDHtOZ7R/GnjmTxs/IrsUohIASTdsElDjN6Jij2NOc
ik6fF2/aorS4GmvFU0sJUEJA56hrRm32adFKUBG27KYHJL4DNuPptOf2mW3JjD9+
dY018YycDOJ0OEnQaPkejfJHwaeEzoRtA6Dd3Bzl7AB2NvTsGJeLcu3eO2LOquNq
7xDIx34DMRioccv9u2w595uH0zzpvHWXj2uygij1fKcYN/u1EogS8yK19FsyQ+5I
HhlgOfh6UALJL2Wuh1dgYaY2hOT2Dz+Bfs6VUUJgXm8VIaKxq79fDRqAGz6YtTV2
lZAwrnnU0rY0EInp6aE4C1+g/yMm+eCqBwnh/OFGUuJx3dVJF3tB28wM6Zl2gtQ4
SEd7P/fOrWMjJ0+SuvJcUT4ehXXWcjFBdTsoB6t9A3nHKhcaBPUaWvn/gdk0+b9Y
tYzid6bxqjCnBfx0B8UbwBjvpgSZ9fI5ESocROQD3SykqxkV9KhJn9hC5iO8a197
KDZvNTISW3CDY/C9PBJO/m2ciqyWwb/ePt4vdDkR8nR/CAQfk/8ccM8r4UISWhei
CTRcOh/jwig=
=WKEd
-----END PGP SIGNATURE-----
--=-=-=--