* "guix pack -f docker" does too much work
From: Ricardo Wurmus @ 2024-05-29 12:58 UTC
To: guix-devel

Hi Guix,

a few months ago "guix pack -f docker" was modified to produce layers.
This is great! Unfortunately, "guix pack" itself still produces one big
tarball containing all these layers. There is no sharing of previously
built layers, because they are all hidden inside the pack.

I think it would be great if "guix pack -f docker" could avoid building
all these identical layers again and again. Perhaps it would be
possible to have a single derivation for each layer? This way we
wouldn't have to recreate the same layer archives every time.

What do you think?

--
Ricardo
* Re: "guix pack -f docker" does too much work
From: Michal Atlas @ 2024-05-30 13:10 UTC
To: guix-devel

Hello Ricardo,

I strongly agree, it would be an awesome QOL improvement.

Just want to mention that it might be nice to take inspiration from
the Nix dockerTools, since they have already put quite a lot of effort
into this.

This includes, for example, an option called `streamLayeredImage` [1],
which doesn't generate a tarball at all, but rather a script that
outputs the layers without assembling them, in a format which Docker or
Podman can import without the huge intermediary file.

i.e. $(guix pack ...) | docker load

[1]:
https://ryantm.github.io/nixpkgs/builders/images/dockertools/#ssec-pkgs-dockerTools-streamLayeredImage

So that'd allow Guix to skip generating the final tarball altogether,
which makes packing very swift.

It also seems that Nix's way quickly imports only the changed layers,
whereas Guix always imports the whole thing? At least I think so.

Reading through how they do it, it seems that they pass the raw store
paths to this Python script [2] and it does the rest? Save for figuring
out some merging of paths, since there's a limit to the number of
layers, I don't think this would be too difficult to port (after we
find out what license the script is under at least, or replicate the
behaviour in Guile).

[2]:
https://github.com/NixOS/nixpkgs/blob/90509d6d66eb1524e2798a2a8627f44ae413f174/pkgs/build-support/docker/stream_layered_image.py

What do you think?

---
Atlas
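To make the streaming idea concrete, here is a minimal Python sketch of
writing an archive straight to a pipe without an intermediate file. It
is not the Nix script and not anything Guix does today: the function
name `stream_image` and the layer names are hypothetical, and the
output is a plain tar stream, not a complete Docker image (it has no
manifest or configuration).

```python
import io
import tarfile


def stream_image(layers, out):
    """Write each (name, payload) pair as a member of one tar stream.

    `out` only needs a write() method, so it can be a pipe. The "w|"
    mode tells tarfile to write sequentially and never seek, which is
    what makes it usable for streaming.
    """
    with tarfile.open(fileobj=out, mode="w|") as archive:
        for name, payload in layers:
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))


# Hypothetical layer names and contents, for illustration only.
DEMO_LAYERS = [
    ("layer-0/layer.tar", b"base system"),
    ("layer-1/layer.tar", b"application"),
]
```

Calling `stream_image(DEMO_LAYERS, sys.stdout.buffer)` (after
`import sys`) would emit the archive on standard output, so the
consumer can start reading before the producer has finished.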
* Re: "guix pack -f docker" does too much work
From: Ludovic Courtès @ 2024-06-17 11:21 UTC
To: Michal Atlas
Cc: guix-devel

Hi,

Michal Atlas <michal_atlas+gnu@posteo.net> skribis:

> I greatly agree, it would be an awesome QOL improvement.

If there’s consensus, let’s see how we can get that done.

The advantage of having (guix docker) & co. all in Scheme is that moving
it from a derivation to code running straight from ‘guix pack’ is
definitely feasible (a bit of work though because ‘guix pack’ has quite
a few backends).

> Just want to mention that it might be nice to take inspiration from
> the Nix dockerTools, since they already have quite a lot of effort put
> into this.
>
> Including for example an option called `streamLayeredImage` [1] which
> doesn't generate a tarball at all, but rather a script that outputs
> the layers without assembling them, in a format which Docker or Podman
> can import without the huge intermediary file.
>
> i.e. $(guix pack ...) | docker load
>
> [1]:
> https://ryantm.github.io/nixpkgs/builders/images/dockertools/#ssec-pkgs-dockerTools-streamLayeredImage

Nice! Sounds very much in line with what Ricardo was proposing.

> Also seems that Nix's way only quickly imports the changed layers? And
> Guix's always imports the whole thing, at least I think?

What do you mean by “imports the whole thing”?

Thanks,
Ludo’.
* Re: "guix pack -f docker" does too much work
From: Michal Atlas @ 2024-06-17 11:57 UTC
To: Ludovic Courtès, Michal Atlas
Cc: guix-devel

Hi,

>> Also seems that Nix's way only quickly imports the changed layers? And
>> Guix's always imports the whole thing, at least I think?
> What do you mean by “imports the whole thing”?

I'm not sure what exactly happens, so correct me if I'm wrong. From
timing the different approaches, I think that because Guix creates a
single-layered image, the entire image gets re-imported into Docker
whenever anything changes.

With the layered approach, if only one or two paths change, only those
get imported. Even though compression still takes up some baseline
amount of time, having Docker import just the changed paths is a very
noticeable speedup.

On that note, I know that guix pack goes through %compressors in
order. However, zstd is an enormous improvement over gzip when working
with containers. Would it perhaps be possible to default to it, or
would that break far too many workflows, or is there another reason not
to? Perhaps a change to how guix pack works would be a good time to
make both breaking changes at once?

Thanks,
Michal.
* Re: "guix pack -f docker" does too much work
From: Ludovic Courtès @ 2024-06-17 21:24 UTC
To: Michal Atlas
Cc: Michal Atlas, guix-devel, Oleg Pykhalov

Hi,

Michal Atlas <michal_atlas@posteo.net> skribis:

>>> Also seems that Nix's way only quickly imports the changed layers? And
>>> Guix's always imports the whole thing, at least I think?
>> What do you mean by “imports the whole thing”?
>
> I'm not sure what exactly happens, so correct me if I'm wrong, however
> if I time the different approaches, I think that how Guix creates a
> single-layered image, then if anything changes the entire image gets
> re-imported into docker.

Oh, there’s the quite recent ‘--max-layers’ option:

  https://guix.gnu.org/manual/devel/en/html_node/Invoking-guix-pack.html

However the default is to create a single layer. Maybe worth changing
to 32 or so? Oleg, WDYT?

(We should also document the default value of ‘--max-layers’ in the
manual: I had to check the code…)

> On that note, I know that guix pack goes through %compressors in
> order, however zstd is an insane improvement over gzip when working
> with containers, would it perhaps be possible to default to it, or
> would that break far too many workflows, or is there another reason?
> Perhaps during changing how guix pack works would be a good time to
> make both breaking changes at once?

If Docker itself always understands zstd, then we could change the
default, indeed.

For other backends, such as plain tarballs, we could make that change
but it’s going to be potentially more of a breaking change.

Thoughts?

Ludo’.
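As a rough illustration of what a layer limit implies, the following
Python sketch spreads store items over at most `max_layers` layers and
merges the overflow into the last one. This is a deliberately naive
strategy chosen for the example, not a description of how Guix
implements ‘--max-layers’ or how the Nix script picks layers; the
function name and the item names are hypothetical.

```python
def group_into_layers(store_items, max_layers):
    """Split store items into at most `max_layers` groups.

    Assumed strategy: every item gets its own layer until only one
    layer is left, and that last layer receives all remaining items.
    """
    if max_layers < 1:
        raise ValueError("max_layers must be at least 1")
    items = list(store_items)
    if len(items) <= max_layers:
        return [[item] for item in items]
    head = [[item] for item in items[:max_layers - 1]]
    tail = items[max_layers - 1:]
    return head + [tail]
```

For example, `group_into_layers(["a", "b", "c", "d"], 3)` returns
`[["a"], ["b"], ["c", "d"]]`, while a limit of 1 puts everything into a
single layer, which matches the single-layer default mentioned above.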
* Re: "guix pack -f docker" does too much work
From: Ludovic Courtès @ 2024-06-01 13:58 UTC
To: Ricardo Wurmus
Cc: guix-devel

Hi,

Ricardo Wurmus <rekado@elephly.net> skribis:

> a few months ago "guix pack -f docker" was modified to produce layers.
> This is great! Unfortunately, "guix pack" itself still produces one big
> tarball containing all these layers. There is no sharing of previously
> built layers, because they are all hidden inside the pack.

Right.

> I think it would be great if "guix pack -f docker" could avoid building
> all these identical layers again and again. Perhaps it would be
> possible to have a single derivation for each layer? This way we
> wouldn't have to recreate the same layer archives every time.

That sounds nice in terms of saving CPU time. It’s less nice in terms
of disk usage: a single ‘guix pack -f docker’ run would populate the
store with roughly twice the size of the closure.

I think each solution (single derivation vs. one derivation per layer)
makes a different tradeoff. I don’t have a strong feeling about which
one is better.

WDYT?

Ludo’.
* Re: "guix pack -f docker" does too much work
From: Ricardo Wurmus @ 2024-06-01 19:07 UTC
To: Ludovic Courtès
Cc: guix-devel

Ludovic Courtès <ludo@gnu.org> writes:

>> I think it would be great if "guix pack -f docker" could avoid building
>> all these identical layers again and again. Perhaps it would be
>> possible to have a single derivation for each layer? This way we
>> wouldn't have to recreate the same layer archives every time.
>
> That sounds nice in terms of saving CPU time. It’s less nice in terms
> of disk usage: a single ‘guix pack -f docker’ run would populate the
> store with roughly twice the size of the closure.

Arguably we don't actually care all that much for the Docker image that
ends up in the store. It's really a temporary thing that we want to
load into Docker or upload somewhere else. I've often wanted to stream
the eventual output of "guix pack" to a pipe, precisely because I don't
want to store the same thing twice: once in the store and once in the
Docker storage backend.

It's actually worse than that: I often end up having dozens of packs in
the store whose layers are almost all identical.

> I think each solution (single derivation vs. one derivation per layer)
> makes a different tradeoff. I don’t have a strong feeling about which
> one is better.

Can we have both? I realize that adding the option to stream build
output to a pipe is not a trivial change, but it would solve the
unnecessary storage requirement for packs.

"docker load" reads from standard input, but other packs would also
benefit from a streaming output; an example is Docker-free deployment to
a remote server: just pipe "guix pack" to a remote tar process and
you're all set.

--
Ricardo
* Re: "guix pack -f docker" does too much work
From: Andy Wingo @ 2024-06-03 7:09 UTC
To: Ludovic Courtès
Cc: Ricardo Wurmus, guix-devel

On Sat 01 Jun 2024 15:58, Ludovic Courtès <ludo@gnu.org> writes:

>> I think it would be great if "guix pack -f docker" could avoid building
>> all these identical layers again and again. Perhaps it would be
>> possible to have a single derivation for each layer? This way we
>> wouldn't have to recreate the same layer archives every time.
>
> That sounds nice in terms of saving CPU time. It’s less nice in terms
> of disk usage: a single ‘guix pack -f docker’ run would populate the
> store with roughly twice the size of the closure.

If the concern is CPU time, I would make sure you have switched to zstd
or some other faster codec, via `guix pack -f docker -C zstd`. You
probably already knew but if you haven't tried, it's quite surprising :)

Andy
* Re: "guix pack -f docker" does too much work
From: Simon Tournier @ 2024-06-04 18:14 UTC
To: Ludovic Courtès, Ricardo Wurmus
Cc: guix-devel

Hi,

On Sat, 01 Jun 2024 at 15:58, Ludovic Courtès <ludo@gnu.org> wrote:

>> I think it would be great if "guix pack -f docker" could avoid building
>> all these identical layers again and again. Perhaps it would be
>> possible to have a single derivation for each layer? This way we
>> wouldn't have to recreate the same layer archives every time.
>
> That sounds nice in terms of saving CPU time. It’s less nice in terms
> of disk usage: a single ‘guix pack -f docker’ run would populate the
> store with roughly twice the size of the closure.
>
> I think each solution (single derivation vs. one derivation per layer)
> makes a different tradeoff. I don’t have a strong feeling about which
> one is better.

I share Ricardo's wish. From my perspective, I do not care much about
polluting my local Guix store when building Docker images, because all
of that will be removed at the next GC, once the work has been loaded
elsewhere.

However, it is frustrating to rebuild complete large images again and
again when the difference is sometimes just a couple of packages.

I would be in favor of sharing more derivations between images. :-)

Cheers,
simon
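The sharing discussed here can be pictured as a cache keyed by layer
contents. The Python sketch below is only an analogy for "one
derivation per layer": `layer_key` and `LayerCache` are hypothetical
names, the package names are made up, and a real implementation would
rely on the store rather than an in-memory dictionary.

```python
import hashlib


def layer_key(store_items):
    """Return a digest identifying a layer by the items it contains.

    Items are sorted first so that the same set of items always maps
    to the same key, whatever order it was listed in.
    """
    digest = hashlib.sha256()
    for item in sorted(store_items):
        digest.update(item.encode("utf-8"))
        digest.update(b"\0")  # separator, so "ab"+"c" differs from "a"+"bc"
    return digest.hexdigest()


class LayerCache:
    """Build each distinct layer once and reuse it across images."""

    def __init__(self):
        self._archives = {}
        self.builds = 0  # number of layers actually built

    def get(self, store_items, build):
        key = layer_key(store_items)
        if key not in self._archives:
            self._archives[key] = build(store_items)
            self.builds += 1
        return self._archives[key]
```

With two images that differ in a single layer, only the differing layer
is built a second time; the price, as noted earlier in the thread, is
that every cached layer occupies space in addition to the final image.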
* Re: "guix pack -f docker" does too much work
From: Maxim Cournoyer @ 2024-09-14 14:55 UTC
To: Ludovic Courtès
Cc: Ricardo Wurmus, guix-devel

Hi,

Ludovic Courtès <ludo@gnu.org> writes:

> Hi,
>
> Ricardo Wurmus <rekado@elephly.net> skribis:
>
>> a few months ago "guix pack -f docker" was modified to produce layers.
>> This is great! Unfortunately, "guix pack" itself still produces one big
>> tarball containing all these layers. There is no sharing of previously
>> built layers, because they are all hidden inside the pack.
>
> Right.
>
>> I think it would be great if "guix pack -f docker" could avoid building
>> all these identical layers again and again. Perhaps it would be
>> possible to have a single derivation for each layer? This way we
>> wouldn't have to recreate the same layer archives every time.
>
> That sounds nice in terms of saving CPU time. It’s less nice in terms
> of disk usage: a single ‘guix pack -f docker’ run would populate the
> store with roughly twice the size of the closure.
>
> I think each solution (single derivation vs. one derivation per layer)
> makes a different tradeoff. I don’t have a strong feeling about which
> one is better.

In past discussions (such as the implementation of the 'RPM' pack
format) we had concluded that a single derivation was preferable. Large
chunks to be sent to offload machines over the network are not very
practical, and as Ludovic said, they also require more store space.

--
Thanks,
Maxim
* Re: "guix pack -f docker" does too much work
From: Ricardo Wurmus @ 2024-09-14 18:36 UTC
To: Maxim Cournoyer
Cc: Ludovic Courtès, guix-devel

Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

>>> I think it would be great if "guix pack -f docker" could avoid building
>>> all these identical layers again and again. Perhaps it would be
>>> possible to have a single derivation for each layer? This way we
>>> wouldn't have to recreate the same layer archives every time.
>>
>> That sounds nice in terms of saving CPU time. It’s less nice in terms
>> of disk usage: a single ‘guix pack -f docker’ run would populate the
>> store with roughly twice the size of the closure.
>>
>> I think each solution (single derivation vs. one derivation per layer)
>> makes a different tradeoff. I don’t have a strong feeling about which
>> one is better.
>
> In past discussions (such as the implementation of the 'RPM' pack
> format) we had concluded that a single derivation was preferable. Large
> chunks to be sent to offload machines over the network are not very
> practical, and as Ludovic said, they also require more store space.

Depending on the situation, I can see one approach being preferable to
the other, and in other situations this could very well be reversed.

Can we expose this choice to the command line interface of "guix pack"?

--
Ricardo
* Re: "guix pack -f docker" does too much work
From: Suhail Singh @ 2024-09-15 0:42 UTC
To: Ricardo Wurmus
Cc: Maxim Cournoyer, Ludovic Courtès, guix-devel

Ricardo Wurmus <rekado@elephly.net> writes:

>>>> I think it would be great if "guix pack -f docker" could avoid building
>>>> all these identical layers again and again. Perhaps it would be
>>>> possible to have a single derivation for each layer? This way we
>>>> wouldn't have to recreate the same layer archives every time.
>>>
>>> That sounds nice in terms of saving CPU time. It’s less nice in terms
>>> of disk usage: a single ‘guix pack -f docker’ run would populate the
>>> store with roughly twice the size of the closure.
>>>
>>> I think each solution (single derivation vs. one derivation per layer)
>>> makes a different tradeoff. I don’t have a strong feeling about which
>>> one is better.
>>
>> In past discussions (such as the implementation of the 'RPM' pack
>> format) we had concluded that a single derivation was preferable. Large
>> chunks to be sent to offload machines over the network are not very
>> practical, and as Ludovic said, they also require more store space.
>
> Dependent on the situation I can see one approach to be preferable to
> the other, and in other situations this could very well be reversed.

I agree.

> Can we expose this choice to the command line interface of "guix pack"?

That would be quite helpful, indeed. I'm happy to help with this if
someone can point me in the right direction, provided the effort of
"pointing me in the right direction" isn't so great as to be
impractical.

--
Suhail
* Re: "guix pack -f docker" does too much work
From: Simon Tournier @ 2024-09-26 16:18 UTC
To: Ricardo Wurmus, Maxim Cournoyer
Cc: Ludovic Courtès, guix-devel

Hi,

On Sat, 14 Sep 2024 at 20:36, Ricardo Wurmus <rekado@elephly.net> wrote:

>> In past discussions (such as the implementation of the 'RPM' pack
>> format) we had concluded that a single derivation was preferable. Large
>> chunks to be sent to offload machines over the network are not very
>> practical, and as Ludovic said, they also require more store space.

Well, the argument “require more store space” strikes me as a
“half-joke” when you know how much space Guix requires for day-to-day
usage. :-) I don’t buy it. ;-)

About offload, indeed. And this can be a “bad surprise”.

> Dependent on the situation I can see one approach to be preferable to
> the other, and in other situations this could very well be reversed.

I agree.

> Can we expose this choice to the command line interface of "guix pack"?

Maybe the switch could be tied to the existing ‘--no-offload’ option
instead of adding yet another one.

Cheers,
simon
* Re: "guix pack -f docker" does too much work
From: Maxim Cournoyer @ 2024-10-05 14:01 UTC
To: Simon Tournier
Cc: Ricardo Wurmus, Ludovic Courtès, guix-devel

Hi,

Simon Tournier <zimon.toutoune@gmail.com> writes:

[...]

>> Dependent on the situation I can see one approach to be preferable to
>> the other, and in other situations this could very well be reversed.
>
> I agree.

Why not, if someone's itch is strong enough to implement it!

>> Can we expose this choice to the command line interface of "guix pack"?
>
> Maybe the switch could be with the option ’--no-offload’ instead of
> adding yet another one.

I think conflating this behavior with that switch would bring more
confusion than good; I'd favor separate and explicitly named options.

--
Thanks,
Maxim