Hey,

In summary, this email lists the good and bad things that you might experience when using the Guix Build Coordinator to provide substitutes for Guix.

Over the last ~7 months I've been working on the Guix Build Coordinator [1]. I think the first email I sent about it is [2], and I'm not sure if I've sent another one. I did prepare a talk on it though, which goes through some of the workings [3].

1: https://git.cbaines.net/guix/build-coordinator/tree/README.org
2: https://lists.gnu.org/archive/html/guix-devel/2020-04/msg00323.html
3: https://xana.lepiller.eu/guix-days-2020/guix-days-2020-christopher-baines-guix-build-coordinator.webm

Over the last few weeks I've fixed up and tested the Guix services for the Guix Build Coordinator, as well as fixing some major issues like it segfaulting frequently.

I've been using the Guix Build Coordinator to build substitutes for guix.cbaines.net, which is my testing ground for providing substitutes. I think it's working reasonably well.

I wanted to write this email though to set out more about actually using the Guix Build Coordinator to build things for substitutes, to help inform any conversations that happen about that.

First, the good things:

The way the Guix Build Coordinator generates compressed nars where the agent runs, then sends them over the network to the coordinator, has a few benefits. The (sometimes expensive) work of generating the nars takes place where the agents are, so if you've got a bunch of machines running agents, that work is distributed. Also, when the nars are received by the coordinator, you have exactly what you need for serving substitutes. You just generate narinfo files, and then place the nars + narinfos where they can be fetched. The Guix Build Coordinator contains code to help with this.

Because you aren't copying the store items back in to a single store, or serving substitutes from the store, you don't need to scale the store to serve more substitutes.
You've still got a bunch of nars + narinfos to store, but I think that is an easier problem to tackle.

This isn't strictly a benefit of the Guix Build Coordinator, but in contrast to Cuirass when run on a store which is subject to periodic garbage collection, assuming you're pairing the Guix Build Coordinator with the Guix Data Service to provide substitutes for the derivations, you don't run the risk of garbage collecting the derivations prior to building them. As I say, this isn't really a benefit of the Guix Build Coordinator; you'd potentially have the same issue if you ran the Guix Build Coordinator with guix publish (on a machine which GC's) to provide derivations, but I thought I'd mention it anyway.

The Guix Build Coordinator supports prioritisation of builds. You can assign a priority to builds, and it'll try to order builds in such a way that the higher priority builds get processed first. If the aim is to serve substitutes, doing some prioritisation might help with building the most fetched things first.

Another feature supported by the Guix Build Coordinator is retries. If a build fails, the Guix Build Coordinator can automatically retry it. In a perfect world, everything would succeed first time, but because the world isn't perfect, there can still be intermittent build failures. Retrying failed builds even once can help reduce the chance that a failure leads to no substitutes for that build, as well as for any builds that depend on its output.

Now the not so good things:

The Guix Build Coordinator just builds things. If you want to build all Guix packages, you need to work out the derivations, then submit builds for all of them. There's a script I wrote that does this with the help of a Guix Data Service instance, but that might not be ideal for all deployments. Even though it can handle the building of things, and most of the serving substitutes part (just not the serving bit), some other component(s) are needed.
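
To make the "work out the derivations, then submit builds" step a bit more concrete, here is a minimal Python sketch of what such an extra component might do. It is only an illustration under assumptions, not the script mentioned above and not the coordinator's API: the `fetch_counts` input, the idea of using fetch counts as priorities, and the `submit_build` callback are all hypothetical stand-ins for whatever a real deployment would use.

```python
from typing import Callable, Dict, List, Tuple


def plan_submissions(fetch_counts: Dict[str, int]) -> List[Tuple[str, int]]:
    """Return (derivation, priority) pairs, most-fetched first.

    Hypothetical scheme: the priority is simply the fetch count, and ties
    are broken by derivation name so the order is deterministic.
    """
    return sorted(fetch_counts.items(), key=lambda item: (-item[1], item[0]))


def submit_all(
    plan: List[Tuple[str, int]],
    submit_build: Callable[[str, int], None],
) -> int:
    """Call the (hypothetical) submit_build callback for each planned build.

    Returns the number of builds submitted.
    """
    for derivation, priority in plan:
        submit_build(derivation, priority)
    return len(plan)
```

For example, `plan_submissions({"a.drv": 1, "b.drv": 5})` puts `b.drv` first, and `submit_all` then hands each pair to whatever actually talks to the coordinator.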
Because the build results don't end up in a store (they could, but as set out above, not being in the store is a feature I think), you can't use `guix gc` to get rid of old store entries/substitutes. I have some ideas about what to implement to provide some kind of GC approach over a bunch of nars + narinfos, but I haven't implemented anything yet.

There could be issues with the implementation… I'd like to think it's relatively simple, but that doesn't mean there aren't issues. For some reason or another, getting backtraces for exceptions rarely works. Most of the time when the coordinator tries to print a backtrace, the part of Guile doing that raises an exception. I've managed to cause it to segfault through using SQLite incorrectly, which hasn't been obvious to fix, at least for me.

Additionally, there are some places where I'm fighting against bits of Guix, things like checking for substitutes without caching, or substituting a derivation without starting to build it.

Finally, the instrumentation is somewhat reliant on Prometheus, and if you want a pretty dashboard, then you might need Grafana too. Neither of these is packaged for Guix. Prometheus might be feasible to package within the next few months, but I doubt the same is true for Grafana (due to its use of NPM).

I think that's a somewhat objective look at what using the Guix Build Coordinator might be like at the moment. Just let me know if you have any thoughts or questions.

Thanks,

Chris