* Proposition to streamline our NAR collection to just zstd-compressed ones
@ 2024-01-10  2:32 Maxim Cournoyer
  2024-01-10 11:36 ` Ludovic Courtès
  2024-01-17 16:32 ` Simon Tournier
  0 siblings, 2 replies; 5+ messages in thread
From: Maxim Cournoyer @ 2024-01-10  2:32 UTC
  To: guix-devel, guix-sysadmin

Hello Guix, and Happy New Year!

It's been on my head for quite a bit of time (about 2 years, according
to [0]), to streamline our offering of cached nars.  Letting go of gzip
2 years ago, along a more aggressive garbage collection policy allowed
us to reduce our storage needs by at least 6.5 TiB.  I'm proposing to do
the same with our lzip compressed nars, to let go of an additional 3.9
TiB:

--8<---------------cut here---------------start------------->8---
$ du -sh /var/cache/guix/publish/{lzip,zstd}
3.9T    /var/cache/guix/publish/lzip
4.1T    /var/cache/guix/publish/zstd
$ find /var/cache/guix/publish/lzip -name '*.nar' | wc -l
4484645
$ find /var/cache/guix/publish/zstd -name '*.nar' | wc -l
4461195
--8<---------------cut here---------------end--------------->8---

The above suggests that zstd compressed nars are about 5% larger than
the lzip ones, which is not big enough to justify carrying both, in my
opinion.  In exchange for a little bit more bandwidth, users would have
the nars decompressed much faster with less CPU overhead locally.

Having our complete nars collection fit in around 4 TiB would also open
the door for simple rsync-based mirroring, which I have started working
on.

What do you think?  Should we go ahead and effect the following simple
change for the Berlin build farm?

--8<---------------cut here---------------start------------->8---
modified   hydra/modules/sysadmin/services.scm
@@ -683,7 +683,7 @@ to a selected directory.")
                     ;; <https://lists.gnu.org/archive/html/guix-devel/2021-01/msg00097.html>
                     ;; for the compression ratio/decompression speed
                     ;; tradeoffs.
-                    (compression '(("lzip" 9) ("zstd" 19)))
+                    (compression '(("zstd" 19)))
                     (cache-bypass-threshold cache-bypass-threshold)
                     (workers publish-workers)))
--8<---------------cut here---------------end--------------->8---

-- 
Thanks,
Maxim
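For reference, the "about 5%" figure can be reproduced from the `du` and `find` output quoted in the message above. The following is a minimal Python sketch, assuming the rounded `du -sh` totals are precise enough and that `T` means TiB; the variable and function names are illustrative and do not come from Guix.

```python
# Sketch: estimate how much larger the zstd nar collection is than the
# lzip one, using the rounded totals reported by `du -sh` and the file
# counts reported by `find ... | wc -l` in the message above.

TIB = 2 ** 40  # assumption: du's "T" suffix is read as TiB

lzip_bytes, lzip_count = 3.9 * TIB, 4_484_645
zstd_bytes, zstd_count = 4.1 * TIB, 4_461_195


def percent_larger(a: float, b: float) -> float:
    """Return by how many percent `a` exceeds `b`."""
    return (a / b - 1) * 100


# Overhead of the whole zstd directory relative to the lzip directory.
total_overhead = percent_larger(zstd_bytes, lzip_bytes)

# The two directories do not hold the same number of nars, so also
# compare the average size per nar.
per_nar_overhead = percent_larger(zstd_bytes / zstd_count,
                                  lzip_bytes / lzip_count)

if __name__ == "__main__":
    print(f"total:   zstd is {total_overhead:.1f}% larger")
    print(f"per nar: zstd is {per_nar_overhead:.1f}% larger")
```

Because `du -sh` rounds to one decimal place, both percentages are rough estimates rather than exact measurements.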
* Re: Proposition to streamline our NAR collection to just zstd-compressed ones
  2024-01-10  2:32 Proposition to streamline our NAR collection to just zstd-compressed ones Maxim Cournoyer
@ 2024-01-10 11:36 ` Ludovic Courtès
  2024-01-15  8:31   ` Efraim Flashner
  2024-01-18 10:13   ` Giovanni Biscuolo
  2024-01-17 16:32 ` Simon Tournier
  1 sibling, 2 replies; 5+ messages in thread
From: Ludovic Courtès @ 2024-01-10 11:36 UTC
  To: Maxim Cournoyer; +Cc: guix-devel, guix-sysadmin

Hello,

Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:

> It's been on my head for quite a bit of time (about 2 years, according
> to [0]), to streamline our offering of cached nars.  Letting go of gzip
> 2 years ago, along a more aggressive garbage collection policy allowed
> us to reduce our storage needs by at least 6.5 TiB.  I'm proposing to do
> the same with our lzip compressed nars, to let go of an additional 3.9
> TiB:

Those space savings would be welcome.

> The above suggests that zstd compressed nars are about 5% larger than
> the lzip ones, which is not big enough to justify carrying both, in my
> opinion.  In exchange for a little bit more bandwidth, users would have
> the nars decompressed much faster with less CPU overhead locally.

The difference is slightly higher, with lzip being 8% smaller, for a big
package like ungoogled-chromium or icecat:

--8<---------------cut here---------------start------------->8---
$ wget -qO- https://ci.guix.gnu.org/7n95j1zlnwzc44azjs7nj8givnzdfs87.narinfo|grep -B1 ^FileSize
Compression: lzip
FileSize: 85783483
--
Compression: zstd
FileSize: 92796393
$ wget -qO- https://ci.guix.gnu.org/prpjnnnhay0alanmkgjh66vfwjlb98kq.narinfo|grep -B1 ^FileSize
Compression: lzip
FileSize: 295991
--
Compression: zstd
FileSize: 323456
--8<---------------cut here---------------end--------------->8---

But yeah, even though adaptive compression selection on the client is a
minor improvement, whether it warrants the extra space is debatable.

> What do you think?  Should we go ahead and effect the following simple
> change for the Berlin build farm?
>
> modified   hydra/modules/sysadmin/services.scm
> @@ -683,7 +683,7 @@ to a selected directory.")
>                      ;; <https://lists.gnu.org/archive/html/guix-devel/2021-01/msg00097.html>
>                      ;; for the compression ratio/decompression speed
>                      ;; tradeoffs.
> -                    (compression '(("lzip" 9) ("zstd" 19)))
> +                    (compression '(("zstd" 19)))

No objection from me, but…

… an important consideration: zstd support was added in 1.3.0, released
in May 2021.

From experience we know that users on foreign distros rarely, if ever,
upgrade the daemon (on top of that, upgrading the daemon is non-trivial
to someone who initially installed the Debian package, from what I’ve
seen, because one needs to fiddle with the .service file to adjust file
names and the likes), and we can be sure that many are still running an
old daemon.  We spent a lot of time on user support after gzip
substitutes had been removed (‘guix substitute’ would just crash) and we
must avoid that.

(guix store) emits a warning when connecting to an “old” daemon, but
only for daemons older than 2018.  We could emit a warning based on
whether or not “builtin:git-download” is available, but maybe that’s too
early?

In addition to the warning, we should communicate in advance and make
sure our instructions on how to upgrade the daemon are accurate and
clear.

Thoughts?

Ludo’.
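The per-package comparison in the message above can be checked mechanically. The sketch below, in Python, parses the `Compression:`/`FileSize:` pairs as they appear in the quoted `grep` output and reports how much smaller the lzip nar is. It is only an illustration: the function names are made up here, and the sample strings contain just the fields shown in the message, not complete narinfo files.

```python
# Sketch: compare lzip and zstd nar sizes from narinfo-style text.
# Only "Compression:" and "FileSize:" lines are interpreted; anything
# else (including grep's "--" separators) is ignored.

def file_sizes(narinfo_text: str) -> dict[str, int]:
    """Map each compression method to the FileSize that follows it."""
    sizes: dict[str, int] = {}
    compression = None
    for line in narinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        key, value = key.strip(), value.strip()
        if key == "Compression":
            compression = value
        elif key == "FileSize" and compression is not None:
            sizes[compression] = int(value)
    return sizes


def lzip_saving_percent(sizes: dict[str, int]) -> float:
    """Return by how many percent the lzip nar is smaller than the zstd one."""
    return (1 - sizes["lzip"] / sizes["zstd"]) * 100


# The two excerpts quoted in the message above.
BIG = """Compression: lzip
FileSize: 85783483
--
Compression: zstd
FileSize: 92796393
"""

SMALL = """Compression: lzip
FileSize: 295991
--
Compression: zstd
FileSize: 323456
"""

if __name__ == "__main__":
    for name, text in (("first", BIG), ("second", SMALL)):
        print(f"{name}: lzip is {lzip_saving_percent(file_sizes(text)):.1f}% smaller")
```

Both samples land between 7.5% and 8.5%, consistent with the "8% smaller" estimate given in the message.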
* Re: Proposition to streamline our NAR collection to just zstd-compressed ones
  2024-01-10 11:36 ` Ludovic Courtès
@ 2024-01-15  8:31   ` Efraim Flashner
  2024-01-18 10:13   ` Giovanni Biscuolo
  1 sibling, 0 replies; 5+ messages in thread
From: Efraim Flashner @ 2024-01-15  8:31 UTC
  To: Ludovic Courtès; +Cc: Maxim Cournoyer, guix-devel, guix-sysadmin

On Wed, Jan 10, 2024 at 12:36:51PM +0100, Ludovic Courtès wrote:
> Hello,
>
> Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:
>
> > It's been on my head for quite a bit of time (about 2 years, according
> > to [0]), to streamline our offering of cached nars.  Letting go of gzip
> > 2 years ago, along a more aggressive garbage collection policy allowed
> > us to reduce our storage needs by at least 6.5 TiB.  I'm proposing to do
> > the same with our lzip compressed nars, to let go of an additional 3.9
> > TiB:
>
> Those space savings would be welcome.
>
> > The above suggests that zstd compressed nars are about 5% larger than
> > the lzip ones, which is not big enough to justify carrying both, in my
> > opinion.  In exchange for a little bit more bandwidth, users would have
> > the nars decompressed much faster with less CPU overhead locally.
>
> The difference is slightly higher, with lzip being 8% smaller, for a big
> package like ungoogled-chromium or icecat:
>
> --8<---------------cut here---------------start------------->8---
> $ wget -qO- https://ci.guix.gnu.org/7n95j1zlnwzc44azjs7nj8givnzdfs87.narinfo|grep -B1 ^FileSize
> Compression: lzip
> FileSize: 85783483
> --
> Compression: zstd
> FileSize: 92796393
> $ wget -qO- https://ci.guix.gnu.org/prpjnnnhay0alanmkgjh66vfwjlb98kq.narinfo|grep -B1 ^FileSize
> Compression: lzip
> FileSize: 295991
> --
> Compression: zstd
> FileSize: 323456
> --8<---------------cut here---------------end--------------->8---
>
> But yeah, even though adaptive compression selection on the client is a
> minor improvement, whether it warrants the extra space is debatable.

There's another zstd flag that we should probably add: --rsyncable.

    --rsyncable: zstd will periodically synchronize the compression
    state to make the compressed file more rsync-friendly.  There is a
    negligible impact to compression ratio, and a potential impact to
    compression speed, perceptible at higher speeds, for example when
    combining --rsyncable with many parallel worker threads.  This
    feature does not work with --single-thread.  You probably don't want
    to use it with long range mode, since it will decrease the
    effectiveness of the synchronization points, but your mileage may
    vary.

> > What do you think?  Should we go ahead and effect the following simple
> > change for the Berlin build farm?
> >
> > modified   hydra/modules/sysadmin/services.scm
> > @@ -683,7 +683,7 @@ to a selected directory.")
> >                      ;; <https://lists.gnu.org/archive/html/guix-devel/2021-01/msg00097.html>
> >                      ;; for the compression ratio/decompression speed
> >                      ;; tradeoffs.
> > -                    (compression '(("lzip" 9) ("zstd" 19)))
> > +                    (compression '(("zstd" 19)))
>
> No objection from me, but…
>
> … an important consideration: zstd support was added in 1.3.0, released
> in May 2021.
>
> From experience we know that users on foreign distros rarely, if ever,
> upgrade the daemon (on top of that, upgrading the daemon is non-trivial
> to someone who initially installed the Debian package, from what I’ve
> seen, because one needs to fiddle with the .service file to adjust file
> names and the likes), and we can be sure that many are still running an
> old daemon.  We spent a lot of time on user support after gzip
> substitutes had been removed (‘guix substitute’ would just crash) and we
> must avoid that.
>
> (guix store) emits a warning when connecting to an “old” daemon, but
> only for daemons older than 2018.  We could emit a warning based on
> whether or not “builtin:git-download” is available, but maybe that’s too
> early?

builtin:git-download sometimes bites me on my machines since I don't
upgrade my aarch64/riscv64 installs that often.  Also, 2018 is now about
5 years ago.  It might be a good idea to just have a rolling YEAR-3
warning that the daemon is getting old and they might be missing out on
features present in newer daemon versions.

> In addition to the warning, we should communicate in advance and make
> sure our instructions on how to upgrade the daemon are accurate and
> clear.
>
> Thoughts?
>
> Ludo’.
>

-- 
Efraim Flashner   <efraim@flashner.co.il>   רנשלפ םירפא
GPG key = A28B F40C 3E55 1372 662D  14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted
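The "rolling YEAR-3 warning" suggested in the message above amounts to a simple age check. Here is a minimal Python sketch of that logic, assuming the client can somehow learn the release year of the daemon it talks to, which the thread does not establish. The function name, its parameters, and the warning text are hypothetical and are not part of Guix.

```python
# Sketch of a rolling "YEAR-3" staleness check.  It assumes the
# daemon's release year is known to the client; how that would be
# obtained is left open.

from datetime import date
from typing import Optional


def daemon_is_stale(daemon_release_year: int,
                    current_year: Optional[int] = None,
                    max_age_years: int = 3) -> bool:
    """Return True when the daemon predates `current_year - max_age_years`."""
    if current_year is None:
        current_year = date.today().year
    return daemon_release_year < current_year - max_age_years


def staleness_warning(daemon_release_year: int,
                      current_year: int) -> Optional[str]:
    """Return a warning message for a stale daemon, or None otherwise."""
    if daemon_is_stale(daemon_release_year, current_year):
        return (f"warning: the build daemon dates from {daemon_release_year}; "
                "newer daemon versions may offer features it lacks")
    return None


if __name__ == "__main__":
    for year in (2018, 2020, 2021, 2023):
        print(year, staleness_warning(year, current_year=2024))
```

Unlike a fixed cutoff such as 2018, the threshold moves forward every year, so the check never needs to be updated by hand.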
* Re: Proposition to streamline our NAR collection to just zstd-compressed ones
  2024-01-10 11:36 ` Ludovic Courtès
  2024-01-15  8:31   ` Efraim Flashner
@ 2024-01-18 10:13   ` Giovanni Biscuolo
  1 sibling, 0 replies; 5+ messages in thread
From: Giovanni Biscuolo @ 2024-01-18 10:13 UTC
  To: Ludovic Courtès, Maxim Cournoyer; +Cc: guix-devel, guix-sysadmin

Hello,

Ludovic Courtès <ludo@gnu.org> writes:

[...]

> From experience we know that users on foreign distros rarely, if ever,
> upgrade the daemon (on top of that, upgrading the daemon is non-trivial
> to someone who initially installed the Debian package, from what I’ve
> seen, because one needs to fiddle with the .service file to adjust file
> names and the likes),

The upgrade instructions are in (info "(guix) Upgrading Guix").

I run the daemon on Debian but installed it with the install script,
not with the Debian package: I'm going to test the installation on a VM
and I'll see/document what a user should do to upgrade a daemon
installed that way.

My /etc/systemd/system/guix-daemon.service is:

--8<---------------cut here---------------start------------->8---
# This is a "service unit file" for the systemd init system to launch
# 'guix-daemon'.  Drop it in /etc/systemd/system or similar to have
# 'guix-daemon' automatically started.

[Unit]
Description=Build daemon for GNU Guix

[Service]
ExecStart=/var/guix/profiles/per-user/root/current-guix/bin/guix-daemon --build-users-group=guixbuild --substitute-urls='https://ci.guix.gnu.org https://bordeaux.guix.gnu.org'
Environment=GUIX_LOCPATH=/var/guix/profiles/per-user/root/guix-profile/lib/locale LC_ALL=en_US.utf8
Environment=TMPDIR=/home/guix-builder
RemainAfterExit=yes
StandardOutput=syslog
StandardError=syslog

# See <https://lists.gnu.org/archive/html/guix-devel/2016-04/msg00608.html>.
# Some package builds (for example, go@1.8.1) may require even more than
# 1024 tasks.
TasksMax=8192

[Install]
WantedBy=multi-user.target
--8<---------------cut here---------------end--------------->8---

I tweaked it a little bit to add "--substitute-urls" to ExecStart and
"LC_ALL" to Environment, but the Guix provided one should work.

AFAIU following the official daemon upgrade instructions should do the
job: right?

If this is not the case with the Debian package IMO it's a Debian
package (.service file) bug: we should add a footnote to (info "(guix)
Upgrading Guix") and file a bug upstream if needed, no?

[...]

> In addition to the warning, we should communicate in advance and make
> sure our instructions on how to upgrade the daemon are accurate and
> clear.

IMO the instructions on (info "(guix) Upgrading Guix") are clear; they
are just for a systemd based distro but should be easily "transposed"
to a different init system by the users... or not?

Thanks! Gio'

-- 
Giovanni Biscuolo

Xelera IT Infrastructures
* Re: Proposition to streamline our NAR collection to just zstd-compressed ones
  2024-01-10  2:32 Proposition to streamline our NAR collection to just zstd-compressed ones Maxim Cournoyer
  2024-01-10 11:36 ` Ludovic Courtès
@ 2024-01-17 16:32 ` Simon Tournier
  1 sibling, 0 replies; 5+ messages in thread
From: Simon Tournier @ 2024-01-17 16:32 UTC
  To: Maxim Cournoyer, guix-devel, guix-sysadmin

Hi,

On Tue, 09 Jan 2024 at 21:32, Maxim Cournoyer <maxim.cournoyer@gmail.com> wrote:

> What do you think?  Should we go ahead and effect the following simple
> change for the Berlin build farm?
>
> --8<---------------cut here---------------start------------->8---
> modified   hydra/modules/sysadmin/services.scm
> @@ -683,7 +683,7 @@ to a selected directory.")
>                      ;; <https://lists.gnu.org/archive/html/guix-devel/2021-01/msg00097.html>
>                      ;; for the compression ratio/decompression speed
>                      ;; tradeoffs.
> -                    (compression '(("lzip" 9) ("zstd" 19)))
> +                    (compression '(("zstd" 19)))
>                      (cache-bypass-threshold cache-bypass-threshold)
>                      (workers publish-workers)))
> --8<---------------cut here---------------end--------------->8---

I think it is a good idea, but the change is more than just one
line. ;-)  I agree with Ludo: the change requires communication.
Something like:

 1. Blog post.  Something like [1], extended a bit with a Migration
    section.

 2. A news entry (guix pull --news) announcing the sunset date, and
    probably pointing to the blog post (or elsewhere) to help with the
    migration.

 3. Optionally emit a warning when the daemon is “too” old.

I agree that the extra space can be annoying.  At the same time, user
experience matters more, IMHO.

Cheers,
simon

1: https://guix.gnu.org/en/blog/2022/sunsetting-gzip-substitutes-availability/