* plz is there a roadmap for a more resilient substitutes infrastructure?
From: Giovanni Biscuolo @ 2018-11-02 12:16 UTC
To: guix-devel

Ciao,

recently users and developers have been facing hard-to-manage problems
due to the maintenance of hydra.gnu.org and its proxy
mirror.hydra.gnu.org [1], ongoing since 23 Oct 2018

unfortunately many recent reports from users on the help-guix and
guix-devel mailing lists clearly show that berlin.guixsd.org is still
not a solution, since several missing substitutes are forcing users to
"build the world" [2]

please, is there a roadmap in the GNU and/or Guix devel team to address
these problems?

GuixSD is now well known in the free software community, please
acknowledge that this kind of problem is detrimental to the project's
reputation

given the prolonged issue, please also consider writing an *official*
blog post explaining the current situation and the steps adopted to
prevent similar issues in the future

Many others and I would be very happy to help build a more resilient
substitutes infrastructure: just tell us how

for example:

1. is there a method to "replicate the whole store of an official server
(e.g. hydra.gnu.org once healed)" so we can just "guix publish" [3] a
*complete* mirror? In this case a ready-to-use official
mirror-config.scm could be useful

2. is there an official mirrors directory users can look at when needed?

3. is there a plan to build a service similar to
http://httpredir.debian.org/? (I looked on the web but did not find any
reference to such a plan)

ciao
Giovanni

[1] https://debbugs.gnu.org/cgi/bugreport.cgi?bug=33151

[2] personally I'm trying to install a bare-bones.scm machine in a VM
and guix is compiling many many packages, including texlive... :-S
(using berlin.guixsd.org as substitute URL)

[3] https://www.gnu.org/software/guix/manual/en/html_node/Invoking-guix-publish.html

-- 
Giovanni Biscuolo

Xelera IT Infrastructures
* Re: plz is there a roadmap for a more resilient substitutes infrastructure?
From: Pjotr Prins @ 2018-11-02 21:04 UTC
To: Giovanni Biscuolo; +Cc: guix-devel

On Fri, Nov 02, 2018 at 01:16:03PM +0100, Giovanni Biscuolo wrote:
> please is there a roadmap in GNU and/or Guix devel team to address this
> problems?

I think it would be a good idea to create a more distributed approach
for creating and finding substitutes. A simple name service would
help. We could even use IPFS or something to fetch nar files - IPFS
comes with a name service. That way anyone building a substitute could
push it to IPFS and expose it to the rest of the world. Since IPFS is
content-addressable we can prevent injections. Any change to the file
would change its location. So the address + NAR hash is safe. And no
key setting required.

Does away with the dependency on just a few machines. Maintaining
machines is a pain. Why not distribute the effort? I am happy to build
some stuff and put it out there - in fact I already run my own
substitute server, but it has only the substitutes I need. If we all
do that we can bundle resources together. Guix can easily support
that.

If someone wants to think this through and can write a prototype it
would make a great talk at FOSDEM. We can also discuss it at Guix
days.

Pj.
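The content-addressing argument in the message above ("any change to the file would change its location") can be illustrated with a minimal sketch. It is not how IPFS computes its identifiers: a plain SHA-256 hex digest stands in for the content address here, and the function names and sample bytes are made up for illustration.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Return a SHA-256 hex digest, used here as a stand-in content address.

    Assumption: real IPFS identifiers use a different encoding; only the
    principle (the address is derived from the bytes) is shown.
    """
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected_address: str) -> bool:
    """Check that downloaded bytes match the address they were requested by."""
    return content_address(data) == expected_address


# Hypothetical payloads standing in for a nar file and a tampered copy.
original = b"example nar contents"
tampered = b"example nar contents plus injected bytes"

address = content_address(original)
print(verify(original, address))   # the untouched file matches its address
print(verify(tampered, address))   # the tampered file does not
```

Because the address is a function of the bytes, a peer serving modified data cannot make it appear under the original address; the client only needs to learn the right address from a source it trusts.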
* Re: plz is there a roadmap for a more resilient substitutes infrastructure?
From: Julien Lepiller @ 2018-11-02 22:51 UTC
To: guix-devel, Pjotr Prins, Giovanni Biscuolo

We could easily distribute nar files over distributed networks (IPFS,
bittorrent, …) but we still need a "canonical source" that builds these
packages, otherwise how do you know what you are looking for? Don't we
always need some sort of central authority?

On 2 November 2018 22:04:51 GMT+01:00, Pjotr Prins <pjotr.public12@thebird.nl> wrote:
>On Fri, Nov 02, 2018 at 01:16:03PM +0100, Giovanni Biscuolo wrote:
>> please is there a roadmap in GNU and/or Guix devel team to address this
>> problems?
>
>I think it would be a good idea to create a more distributed approach
>for creating and finding substitutes. A simple name service would
>help. We could even use IPFS or something to fetch nar files - IPFS
>comes with a name service. That way anyone building a substitute could
>push it to IPFS and expose it to the rest of the world. Since IPFS is
>content-addressable we can prevent injections. Any change to the file
>would change its location. So the address + NAR hash is safe. And no
>key setting required.
>
>Does away with the dependency on just a few machines. Maintaining
>machines is a pain. Why not distribute the effort? I am happy to build
>some stuff and put it out there - in fact I already run my own
>substitute server, but it has only the substitutes I need. If we all
>do that we can bundle resources together. Guix can easily support
>that.
>
>If someone wants to think this through and can write a prototype it
>would make a great talk at FOSDEM. We can also discuss it at Guix
>days.
>
>Pj.
* Re: plz is there a roadmap for a more resilient substitutes infrastructure?
From: Pjotr Prins @ 2018-11-03 6:10 UTC
To: Julien Lepiller; +Cc: guix-devel

On Fri, Nov 02, 2018 at 11:51:20PM +0100, Julien Lepiller wrote:
> We could easily distribute nar files over distributed networks (IPFS, bittorrent, …) but we still need a "canonical source" that builds these packages, otherwise how do you know what you are looking for? Don't we always need some sort of central authority?

Yes. A name service which is fed from accredited build servers. It is
not hard to keep a few build servers in the 'air' which can be
replaced on demand - even run in the cloud or in VMs. What is hard is
to create a 100% uptime service that serves many generations of
nars. That is a lot of data, and the data load can be high. This is
what we ought to consider fanning out.

Guix can support both systems, existing and new. Just add a
substitute-url which resolves to an IPFS-based naming scheme.

It could even be integrated with guix-publish. Anyone who runs a
guix-publish server could choose to expose an IPFS node for sharing.

But I think it can be lighter weight. If we have a name service we
could indeed just make use of any protocol that serves files, as long
as the download hash is known. So, guix-named provides pointers to nar
entities with their download hash, and guix-download is capable of
querying guix-named and supports more protocols.

The IPFS protocol is well defined and there exist implementations in
multiple languages.

Anyway, this all requires more thought and a proof-of-concept. The
point really is to design a distributed system based on existing
components.

Pj.
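The division of labour proposed above (a name service that hands out pointers plus a download hash, and a downloader that accepts any transport as long as the hash matches) could look roughly like the sketch below. Everything in it is hypothetical: "guix-named" and "guix-download" exist only as names in this thread, the registry layout is invented, SHA-256 is an assumed hash, and the "network" is an in-memory dictionary so that the example runs on its own.

```python
import hashlib
from typing import Callable, Dict, List, NamedTuple, Optional


class NarRecord(NamedTuple):
    """What a hypothetical name service would return for one store item."""
    sha256: str          # download hash published by an accredited build server
    sources: List[str]   # any number of locations, over any protocol


def fetch_verified(
    item: str,
    registry: Dict[str, NarRecord],
    fetch: Callable[[str], Optional[bytes]],
) -> bytes:
    """Try each source in turn and return the first copy whose hash matches."""
    record = registry[item]
    for url in record.sources:
        data = fetch(url)  # None means the source is unreachable
        if data is not None and hashlib.sha256(data).hexdigest() == record.sha256:
            return data
    raise LookupError(f"no source served a valid copy of {item}")


# In-memory stand-ins for untrusted mirrors (all URLs are made up).
good = b"nar bytes for hello"
network = {
    "ipfs://hypothetical-address": b"tampered bytes",
    "https://mirror.example.org/hello.nar": good,
}

registry = {
    "hello": NarRecord(
        sha256=hashlib.sha256(good).hexdigest(),
        sources=[
            "https://offline.example.org/hello.nar",  # unreachable
            "ipfs://hypothetical-address",            # serves a bad copy
            "https://mirror.example.org/hello.nar",   # serves the real thing
        ],
    )
}

result = fetch_verified("hello", registry, network.get)
print(result == good)
```

In this arrangement only the registry has to be trusted and highly available; the bulk data can come from any volunteer, since a wrong or modified copy is simply skipped.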
* Re: plz is there a roadmap for a more resilient substitutes infrastructure?
From: Devan Carpenter @ 2018-11-02 21:13 UTC
To: Giovanni Biscuolo; +Cc: guix-devel

Giovanni Biscuolo transcribed 2.7K bytes:
> Ciao,
>
> recently users and developers are facing hard to manage problems due to
> the maintenance of hydra.gnu.org and its proxy mirror.hydra.gnu.org [1]
> since 23 Oct 2018
>
> unfortunately many recent reports from users in help-guix and guix-devel
> mailing list clearly show that berlin.guixsd.org is still not a
> solution, since several missing substitutes are forcing users to "build
> the world" [2]
>
> please is there a roadmap in GNU and/or Guix devel team to address these
> problems?
>
> GuixSD is now well known in the free software community, please
> acknowledge that this kind of problems are detrimental to project
> reputation
>
> given the prolonged issue, please also consider writing an *official*
> blog post explaining the current situation and steps adopted to prevent
> similar issues in the future

+1

> Me and many others would be very happy to help building a more resilient
> substitutes infrastructure: just tell us how to do
>
> for example:
>
> 1. is there a method to "replicate the whole store of an official server
> (e.g. hydra.gnu.org once healed)" so we can just "guix publish" a
> *complete* mirror? In this case a ready to use official
> mirror-config.scm could be useful
>

I am almost at the point where I am ready to set up a server for this
purpose, so I am also interested in whether there is a practical way to
"bootstrap" it from another build server like this.

cheers
* Re: plz is there a roadmap for a more resilient substitutes infrastructure?
From: Ludovic Courtès @ 2018-11-06 11:23 UTC
To: Giovanni Biscuolo; +Cc: guix-devel

Ciao Giovanni,

Giovanni Biscuolo <g@xelera.eu> writes:

> recently users and developers are facing hard to manage problems due to
> the maintenance of hydra.gnu.org and its proxy mirror.hydra.gnu.org [1]
> since 23 Oct 2018

We Guix developers don’t have control over the physical hardware behind
hydra.gnu.org; for this machine, we rely on the work of the FSF
sysadmins for all things hardware/networking.

Unfortunately in this case, this maintenance period was rather
unprepared: it wasn’t supposed to last a whole week, rather a few hours
or a day at most. Most of the time it took was about copying data to a
new disk (!). Had this been prepared, we could have arranged to keep
hydra.gnu.org up until the replacement was ready. We Guix developers
didn’t have much visibility over what was going on though, and we just
didn’t anticipate this.

It is clear that this prolonged downtime was harmful to many users and
to the project’s reputation.

What to do from here?

Our main focus is on making berlin.guixsd.org the primary build farm of
the project. It has the advantage that one Guix dev has physical access
to it (Ricardo); it’s also much more powerful than hydra.gnu.org and the
associated build machines.

Yet, there’s more work to do: berlin has just 1T of disk space. Ricardo
started looking into growing it but was stuck on software issues IIRC. I
think fixing this should be a priority, so I think we should help
Ricardo fix the software issues as much as we can.

That alone doesn’t fix the resilience issue: berlin.guixsd.org could go
down at some point for some time.

To address that, a possibility that was discussed recently on
guix-sysadmin is use bayfront.guixsd.org has a separate build farm
and/or mirror of berlin.

On top of that we could have a service like httpredir.debian.org, or
maybe even a CDN where we’d replicate substitutes, or torrents (looking
at you, Julien ;-)).

At this point, all these options are still on the table, and anyone with
expertise in these areas is very much welcome!

> given the prolonged issue, please also consider writing an *official*
> blog post explaining the current situation and steps adopted to prevent
> similar issues in the future

We set up the info-guix mailing list with that in mind (but too late for
this incident). Posting blog posts is also a good idea; we should have
done that, with instructions on how to switch to berlin.guixsd.org.

> 1. is there a method to "replicate the whole store of an official server
> (e.g. hydra.gnu.org once healed)" so we can just "guix publish" a
> *complete* mirror? In this case a ready to use official
> mirror-config.scm could be useful

mirror.hydra.gnu.org is a simple nginx proxy to hydra.gnu.org. You can
find its config here:

  https://git.savannah.gnu.org/cgit/guix/maintenance.git/tree/hydra/nginx/mirror.conf

In the past a few people set up their own mirrors using a similar
configuration.

> 2. is there an official mirrors directory users can look at when needed?

No.

> 3. is there a plan to build a service similar to
> http://httpredir.debian.org/? (I looked on the web but did not find any
> reference to such plan)

Like I wrote, there’s no concrete plan at this point, which means it’s
an opportunity for you and anyone else to chime in and give a hand!

Thanks,
Ludo’.
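As a rough illustration of the httpredir-style option mentioned in the message above, the sketch below picks the first reachable mirror for a requested path and returns the URL a redirector would send the client to. It is not an existing Guix or Debian component: the function name, the health table and the example file name are assumptions, and a real service would probe mirrors over the network instead of reading a dictionary. The two host names are the ones discussed in this thread.

```python
from typing import Callable, Iterable, Optional


def pick_mirror(
    request_path: str,
    mirrors: Iterable[str],
    is_up: Callable[[str], bool],
) -> Optional[str]:
    """Return the redirect target on the first healthy mirror, or None."""
    for base in mirrors:
        if is_up(base):
            return base.rstrip("/") + "/" + request_path.lstrip("/")
    return None


# Hypothetical health table; a real redirector would run periodic checks.
status = {
    "https://berlin.guixsd.org": False,   # pretend the primary is down
    "https://bayfront.guixsd.org": True,
}
mirrors = list(status)

target = pick_mirror("/example.narinfo", mirrors, lambda m: status[m])
print(target)
```

Clients would keep a single substitute URL pointing at the redirector, so taking one build farm offline for maintenance would not require any change on the user side.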
* Re: plz is there a roadmap for a more resilient substitutes infrastructure?
From: Pierre Neidhardt @ 2018-11-06 11:31 UTC
To: Ludovic Courtès; +Cc: guix-devel

> On top of that we could have a service like httpredir.debian.org, or
> maybe even a CDN where we’d replicate substitutes, or torrents (looking
> at you, Julien ;-)).

Also IPFS would be an option. I'm working on the package these days and
any help getting the "gx" downloader to work would be very appreciated! :D

-- 
Pierre Neidhardt
https://ambrevar.xyz/
* Re: plz is there a roadmap for a more resilient substitutes infrastructure?
From: Giovanni Biscuolo @ 2018-11-11 18:56 UTC
To: Ludovic Courtès; +Cc: guix-devel

Hi!

sorry for my late reply

I confess I still haven't read the whole Guix/GuixSD Reference Manual,
so my apologies if I'm asking something already documented :-S

ludo@gnu.org (Ludovic Courtès) writes:

[...]

> We Guix developers don’t have control over the physical hardware behind
> hydra.gnu.org; for this machine, we rely on the work of the FSF
> sysadmins for all things hardware/networking.

OK, thanks for this info

> Unfortunately in this case, this maintenance period was rather
> unprepared: it wasn’t supposed to last a whole week, rather a few hours
> or a day at most. Most of the time it took was about copying data to a
> new disk (!).

are the minimum hardware and disk requirements for a complete GuixSD
distribution build server published somewhere?

> Had this been prepared, we could have arranged to keep
> hydra.gnu.org up until the replacement was ready. We Guix developers
> didn’t have much visibility over what was going on though, and we just
> didn’t anticipate this.

sorry about that, I'm a sysadmin and I know how much my work impacts
others :-)

> It is clear that this prolonged downtime was harmful to many users and
> to the project’s reputation.

GuixSD does not deserve this kind of harm :-(

> What to do from here?

I once noticed the existence of
https://git.savannah.gnu.org/cgit/guix/maintenance.git [1], which you
pointed me to (below), but did not read the entire tree

now I see we have
https://git.savannah.gnu.org/cgit/guix/maintenance.git/tree/doc/1.0.org

should we add a new "super" task named "resilience of substitutes
network"?

looking at
https://git.savannah.gnu.org/cgit/guix/maintenance.git/tree/hydra/machines.scm
it seems that some degree of resilience for hydra.gnu.org is already in
place

but this does not seem to work as a distributed source of substitute
servers, "just" as a way to offload build jobs to the defined list of
build servers

could the servers in "machines.scm" also be used as substitute servers?

> Our main focus is on making berlin.guixsd.org the primary build farm of
> the project. It has the advantage that one Guix dev has physical access
> to it (Ricardo); it’s also much more powerful than hydra.gnu.org and the
> associated build machines.

OK, I see it
https://git.savannah.gnu.org/cgit/guix/maintenance.git/tree/doc/1.0.org#n30

more details could help fix related issues

IMHO a public guixsd.org Sysadmins Manual should be in the roadmap (as
MAYBE): that could help the core team's job, show the community how the
job is done *and* help others build on our best practices

Guix/GuixSD is *the* perfect tool for IaC (infrastructure as code), it
could be *very* interesting to develop a "Literate GuixSD IaC package"
as a meta-project :-)

maybe we could (slowly) build a reproducible IaC literate devops
document, based on org-mode babel, so we'd have both tangled code and
exported documentation

> Yet, there’s more work to do: berlin has just 1T of disk space. Ricardo
> started looking into growing it but was stuck on software issues IIRC. I
> think fixing this should be a priority, so I think we should help
> Ricardo fix the software issues as much as we can.

I realize I'm pretty new in this community and you can't trust me since
we do not even know each other... but I could help if needed, just tell
me (in private if more appropriate) what's the hardware issue

> That alone doesn’t fix the resilience issue: berlin.guixsd.org could go
> down at some point for some time.
>
> To address that, a possibility that was discussed recently on
> guix-sysadmin is use bayfront.guixsd.org has a separate build farm

guess you meant "use bayfront.guixsd.org *as* a separate build farm"

> and/or mirror of berlin.

[...]

>> given the prolonged issue, please also consider writing an *official*
>> blog post explaining the current situation and steps adopted to prevent
>> similar issues in the future
>
> We set up the info-guix mailing list with that in mind (but too late for
> this incident). Posting blog posts is also a good idea; we should have
> done that, with instructions on how to switch to berlin.guixsd.org.

given the impact on project reputation, please consider a "post-mortem"
blog post on what happened: something in line with Ludo's reply to me

not all interested users and observers read the archives of this (and
other) mailing lists

>> 1. is there a method to "replicate the whole store of an official server
>> (e.g. hydra.gnu.org once healed)" so we can just "guix publish" a
>> *complete* mirror? In this case a ready to use official
>> mirror-config.scm could be useful
>
> mirror.hydra.gnu.org is a simple nginx proxy to hydra.gnu.org. You can
> find its config here:
>
> https://git.savannah.gnu.org/cgit/guix/maintenance.git/tree/hydra/nginx/mirror.conf

OK, so it's a caching proxy

I'll see if and how I can build a similar one

sorry but I still don't understand why mirror.hydra.gnu.org failed to
serve substitutes during a 0.15 installation started from the install
CD: was it a cache size problem?

> In the past a few people set up their own mirrors using a similar
> configuration.

we should build a network of organizations and individuals for this

>> 2. is there an official mirrors directory users can look at when needed?
>
> No.

I volunteer to keep such a list and coordinate the "volunteers network",
if you want

>> 3. is there a plan to build a service similar to
>> http://httpredir.debian.org/? (I looked on the web but did not find any
>> reference to such plan)
>
> Like I wrote, there’s no concrete plan at this point, which means it’s
> an opportunity for you and anyone else to chime in and give a hand!

I have no experience in building such a service but it definitely fits
in my professional enhancement plan, so I'm not yet able to lead such a
project but I can help

ciao
Giovanni

-- 
Giovanni Biscuolo

Xelera IT Infrastructures