Stephen Scheck writes:

> IF any of the store files resulting from `guix pull` are ephemeral
> (i.e. intermediate build results not anchored to a profile) AND guix
> GC worked inside the container, my approach might still work - yes
> there would be image and layers growth but it might be small enough
> not to care between periodic image rebases. But I'm starting to doubt
> that, or at least it is difficult to quantify with the GC issues.

I think you're right about it being difficult to quantify the GC
issues. Basically, when you run "guix pull", the current Guix will
"build" (i.e., maybe download via substitutes, maybe build from source)
the new Guix, put it into the store, and update the profile symlinks to
make it current. In the process of doing this, some intermediate builds
might be performed if substitutes are not available.

Although the new Guix will remain live in the store after the profile
symlinks are updated to make it current, (1) intermediate results might
be left dead after "guix pull" is finished, and (2) if the old Guix is
sufficiently different from the new Guix, it will also become dead after
the symlinks that were keeping it live are removed.

So, the amount of garbage that will be left over depends on a few
factors, like whether substitutes were available, and how different the
new Guix is from the old one. It can also depend on how the guix-daemon
has been started (see "--gc-keep-outputs" and "--gc-keep-derivations" in
the "Invoking guix-daemon" section of the manual).

In the case of your Docker images, most (all?) of the garbage is coming
from case (2) above: as Guix changes, the old Guix will be made dead and
GC'd (hypothetically, let's suppose GC is working), but it will still
exist on prior layers, since it came from a prior layer.

As for case (1), the intermediate results, I think they are not
contributing to your image size for two reasons: substitutes are
probably available, and even if they weren't available, the
intermediates would probably appear during "guix pull", which means
they'd be on the top layer and would be GC'd, so they wouldn't be
included in any layer of the next image.

The fact that the biggest dead paths in your latest image consist
entirely of store paths that look suspiciously like they came from prior
Guix installations is further evidence in support of this theory.

--8<---------------cut here---------------start------------->8---
root@guix /# du -Phc $(guix gc --list-dead) 2>/dev/null | sort -hk 1,1 | tail
finding garbage collector roots...
determining live/dead paths...
187M    /gnu/store/0vwg9aqzs5xrk10vcs4dl105s3f42ilf-guix-b1affd477-modules/lib/guile/3.0/site-ccache
187M    /gnu/store/47aack48aczpzm635axsy4jf2pvmwrv0-guix-ef1d475b0-modules/lib
187M    /gnu/store/47aack48aczpzm635axsy4jf2pvmwrv0-guix-ef1d475b0-modules/lib/guile
187M    /gnu/store/47aack48aczpzm635axsy4jf2pvmwrv0-guix-ef1d475b0-modules/lib/guile/3.0
187M    /gnu/store/47aack48aczpzm635axsy4jf2pvmwrv0-guix-ef1d475b0-modules/lib/guile/3.0/site-ccache
194M    /gnu/store/hz2rn2l0jixg91q4rsdcwc489y71ll29-guix-05e1edf22-modules
198M    /gnu/store/5mhn1ynxvy7jihsknsnv3yspkkvc0r5s-guix-2e59ae238-modules
210M    /gnu/store/0vwg9aqzs5xrk10vcs4dl105s3f42ilf-guix-b1affd477-modules
210M    /gnu/store/47aack48aczpzm635axsy4jf2pvmwrv0-guix-ef1d475b0-modules
3.0G    total
root@guix /#
--8<---------------cut here---------------end--------------->8---

These "guix-HASH-modules" directories, for example, are used as part of
each Guix installation:

--8<---------------cut here---------------start------------->8---
root@guix /# realpath ~/.config/guix/current/share/guile
/gnu/store/mj6pf6nf0kf03nhh7bmpc6m43v6knq6m-guix-a5374cde9-modules/share/guile
root@guix /#
--8<---------------cut here---------------end--------------->8---

Each of them has a total
closure size of almost 500 MB, although since they might share some
references, each one individually is adding "only" about 200 MB.

--8<---------------cut here---------------start------------->8---
root@guix /# guix size /gnu/store/mj6pf6nf0kf03nhh7bmpc6m43v6knq6m-guix-a5374cde9-modules
store item                                                              total   self
/gnu/store/mj6pf6nf0kf03nhh7bmpc6m43v6knq6m-guix-a5374cde9-modules      485.9  206.9  42.6%
/gnu/store/hkmsljl2sf4nk96b35f0bmfkr2lqanfq-guix-packages-base          105.7  105.7  21.8%
/gnu/store/s7izb7j0s5rzcq297nd7ba9sfiqh5zmz-guix-system                  43.2   43.2   8.9%
/gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31                   38.4   36.7   7.6%
/gnu/store/01b4w3m6mp55y531kyi1g8shh722kwqm-gcc-7.5.0-lib                71.0   32.6   6.7%
/gnu/store/wcv5mscivggkygnz68nn2671fr3kapjc-guix-packages-base-source    19.4   19.4   4.0%
/gnu/store/6zygksmvzcq92xf65cna91dbf7a4zblh-guix-extra                   19.4   19.4   4.0%
/gnu/store/a7wiy24mmcilbqp39pl0jdlw10vbvavb-guix-cli                      8.0    7.3   1.5%
/gnu/store/f6k9b4grrfpip4h5lrmpnsnn2gqziihr-guix-system-tests             4.6    4.6   1.0%
/gnu/store/gbrd1laxsncb9zd218pyglisxyxymmbd-guix-system-source            1.9    1.9   0.4%
/gnu/store/mmhimfwmmidf09jw1plw3aw1g1zn2nkh-bash-static-5.0.16            1.6    1.6   0.3%
/gnu/store/5lr8miawrk380zw8yjy0crcl6vcs10s3-guix-extra-source             1.5    1.5   0.3%
/gnu/store/pwcp239kjf7lnj5i4lkdzcfcxwcfyk72-bash-minimal-5.0.16          39.4    1.0   0.2%
/gnu/store/r7k859hmcnkazf492fasqvk25jflnfk6-xz-5.2.4                     73.0    0.9   0.2%
/gnu/store/bhs4rj58v8j1narb2454raan2ps38xd8-grep-3.4                     72.9    0.8   0.2%
/gnu/store/z0572147hprpbjrcjqkgrv3f80ip2klx-guix-cli-source               0.7    0.7   0.1%
/gnu/store/a9f7wmc75hbpg520phw9z4l9asm3qvsw-bzip2-1.0.8                  72.5    0.4   0.1%
/gnu/store/7y0nin2d0j46j26a1n46bl5zl3px0zvz-guix-system-tests-source      0.3    0.3   0.1%
/gnu/store/rykm237xkmq7rl1p0nwass01p090p88x-zlib-1.2.11                  71.2    0.2   0.0%
/gnu/store/jqr5bz89gfwhxcndnhq333dyclvkq7ws-lzlib-1.11                   71.2    0.2   0.0%
/gnu/store/378zjf2kgajcfd7mfr98jn5xyc5wa3qv-gzip-1.10                    73.1    0.2   0.0%
/gnu/store/kfj1lc84v50imn3raijgih4salilmf1a-guix-packages-base-modules  125.2    0.0   0.0%
/gnu/store/lvszhqs57scb2ax18l2nrn9dwiyf6iza-guix-system-tests-modules     4.9    0.0   0.0%
/gnu/store/lr65f259z1730p7bvplsj9k6yvbkyh39-guix-system-modules          45.1    0.0   0.0%
/gnu/store/nk1x6cdif8pd9vi04nzxfqinh0ag06am-guix-extra-modules           20.9    0.0   0.0%
/gnu/store/s6vlfscnfvnrlv3yfag6qsy5j6c9pxqb-guix-cli-modules              8.0    0.0   0.0%
total: 485.9 MiB
root@guix /#
--8<---------------cut here---------------end--------------->8---

And there are still other components adding space each time you run
"guix pull", like the "guix-system" component, for example:

--8<---------------cut here---------------start------------->8---
root@guix /# du -Phc $(guix gc --list-dead | grep guix-system) 2>/dev/null | sort -hk 1,1 | tail
finding garbage collector roots...
determining live/dead paths...
44M     /gnu/store/qhbk7g8z97m37iak1s1yn2my82gv0lj5-guix-system/gnu
44M     /gnu/store/slwkzcmg6r1lr9a16x3krd2ax384p8wr-guix-system
44M     /gnu/store/slwkzcmg6r1lr9a16x3krd2ax384p8wr-guix-system/gnu
44M     /gnu/store/vwzk618h1wxy6z9i06xnhnxj4gvhkiss-guix-system
44M     /gnu/store/vwzk618h1wxy6z9i06xnhnxj4gvhkiss-guix-system/gnu
44M     /gnu/store/w47fgv8p2hvaqdwywymwvm0qlh4gw0ih-guix-system
44M     /gnu/store/w47fgv8p2hvaqdwywymwvm0qlh4gw0ih-guix-system/gnu
44M     /gnu/store/zf67wb6c0s97vwmywjq09hy9jq0w5mmi-guix-system
44M     /gnu/store/zf67wb6c0s97vwmywjq09hy9jq0w5mmi-guix-system/gnu
523M    total
root@guix /#
--8<---------------cut here---------------end--------------->8---

Anyway, the point is: you begin with a previous image. The previous
image already has these store paths from the previous installation of
Guix. Therefore, they exist on the previous layer. Because they exist
on the previous layer, they cannot be removed from the Docker image, and
they are carried forward in that previous layer, to all new images.
Regardless of any changes to guix-daemon we might make, the way in which
you build your images will cause them to grow by hundreds of megabytes
every day.
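
To put rough numbers on that growth, here is a minimal sketch of the
arithmetic. The base size (1500 MB) and the per-pull growth (250 MB) are
made-up round figures chosen to be in the same ballpark as the output
above, not measurements of your images, and "image_size_mb" is just a
hypothetical helper name. It assumes one "guix pull" and one commit per
day, and that nothing from earlier layers is ever reclaimed.

```shell
#!/bin/sh
# Rough model of an image that is re-committed after every "guix pull".
# All numbers are illustrative assumptions, not measurements.

# image_size_mb BASE_MB PER_PULL_MB PULLS
# Each commit adds a layer holding the new Guix.  The old Guix stays in
# the earlier layers even if it is garbage-collected inside the running
# container, so the total only ever goes up.
image_size_mb() {
    echo $(( $1 + $2 * $3 ))
}

for pulls in 0 7 30; do
    printf '%2d pulls: about %d MB\n' "$pulls" "$(image_size_mb 1500 250 "$pulls")"
done
```

Under those assumptions the image more than doubles within a week, which
is why GC inside the container cannot help unless the image is
periodically rebuilt from a fresh base.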

> Actually, there might be another way around this, still avoiding the
> need for a custom Runner, for example mounting /var/guix and
> /gnu/store into the container instead of belonging to it. If done that
> way, layer accumulation wouldn't be an issue, and maybe GC between
> layers neither.

This sounds like a great idea, actually! "The right way" to do Docker
containers is to have a single process per container, and to not store
state in the Docker container. We're violating that principle on both
counts when we run an entire GNU/Linux distribution inside a Docker
container, especially since the guix-daemon is all about managing the
"state" of /var/guix and /gnu/store. If you can somehow move that
"state" into a Docker volume instead of the container itself, that would
definitely be an improvement.

It may be tricky, though, since if guix-daemon sees stuff in /gnu/store
that is inconsistent with its database in /var/guix, bad things can
happen. So you'll have to ensure they remain consistent with one
another.

>> Besides store items, I noticed two other things about your images:
>>
>> - The contents of /var is growing slowly without bound, but it isn't
>>   nearly as bad as the contents of /gnu/store. This is probably due to
>>   log files; consider pruning them.
>>
>
> These are presumably OK to delete, without any special handling for Guix?

I think the answer is "probably", but I would stop guix-daemon first.
Other processes may be using /var, too, so I would stop them, also.

>> - Your script runs "docker commit" while guix-daemon (and other
>>   programs) are still running. To ensure the guix-daemon's database (or
>>   other things) does not become corrupt, consider terminating all
>>   processes before committing the new image.
>>
>
> `docker commit` pauses the container (unless you tell it not to) ...
> although I guess that could still cause problems if Guix store writes
> aren't implemented in an atomic way.

I'm not sure what "pause" means in the Docker documentation, but since I
can run "docker commit" while running a shell in the container, and the
shell doesn't get terminated, it clearly doesn't terminate the
processes. It might be safe to just pause the container when committing,
but it's definitely safe if you gracefully shut down all processes
first. That ensures that things like databases are left in known good
states when committing the image.

What I'm saying is that, yeah sure, you can probably get away with not
gracefully shutting down the processes. Similarly, you can often get
away with pulling the power cord out of your computer because a lot of
software and storage is pretty robust by default nowadays. However, it
increases the risk of encountering a problem like data corruption, so
it's better to shut things down gracefully if you can.

--
Chris