unofficial mirror of guix-devel@gnu.org 
* When substitute download + decompression is CPU-bound
@ 2020-12-14 22:20 Ludovic Courtès
  2020-12-14 22:29 ` Julien Lepiller
                   ` (2 more replies)
  0 siblings, 3 replies; 43+ messages in thread
From: Ludovic Courtès @ 2020-12-14 22:20 UTC (permalink / raw)
  To: guix-devel

Hi Guix!

Consider these two files:

  https://ci.guix.gnu.org/nar/gzip/kfcrrl6p6f6v51jg5rirmq3q067zxih6-ungoogled-chromium-87.0.4280.88-0.b78cb92
  https://ci.guix.gnu.org/nar/lzip/kfcrrl6p6f6v51jg5rirmq3q067zxih6-ungoogled-chromium-87.0.4280.88-0.b78cb92

Quick decompression bench:

--8<---------------cut here---------------start------------->8---
$ du -h /tmp/uc.nar.[gl]z
103M	/tmp/uc.nar.gz
71M	/tmp/uc.nar.lz
$ gunzip -c < /tmp/uc.nar.gz| wc -c
350491552
$ time lzip -d </tmp/uc.nar.lz >/dev/null

real	0m6.040s
user	0m5.950s
sys	0m0.036s
$ time gunzip -c < /tmp/uc.nar.gz >/dev/null

real	0m2.009s
user	0m1.977s
sys	0m0.032s
--8<---------------cut here---------------end--------------->8---

The decompression throughput (uncompressed bytes written in the first
column, compressed bytes read in the second column) is:

          output  |  input
  gzip: 167 MiB/s | 52 MB/s
  lzip:  56 MiB/s | 11 MB/s

Indeed, if you run this from a computer on your LAN:

  wget -O - … | gunzip > /dev/null

you’ll find that wget caps at 50 MB/s with gunzip, whereas with lunzip it
caps at 11 MB/s.
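
The throughput figures can be re-derived from the benchmark output
above.  The sketch below is only an illustration: it takes the sizes
from `du -h' (treated as MiB) and the wall-clock times from `time', and
the names BENCH and throughput are made up for this example.  Under that
reading, the ~167 and ~56 MiB/s figures are uncompressed bytes written,
while the ~52 and ~11 figures are compressed bytes read.

```python
# Re-derive decompression throughput from the benchmark figures quoted above.
MIB = 1024 * 1024

UNCOMPRESSED_BYTES = 350_491_552  # from `gunzip -c | wc -c`

# name -> (compressed size in MiB per `du -h`, wall-clock seconds per `time`)
BENCH = {
    "gzip": (103, 2.009),
    "lzip": (71, 6.040),
}


def throughput(compressed_mib, seconds):
    """Return (compressed MiB/s read, uncompressed MiB/s written)."""
    read_rate = compressed_mib / seconds
    write_rate = UNCOMPRESSED_BYTES / MIB / seconds
    return read_rate, write_rate


RATES = {name: throughput(*row) for name, row in BENCH.items()}

for name, (read_rate, write_rate) in RATES.items():
    print(f"{name}: reads {read_rate:.0f} MiB/s compressed, "
          f"writes {write_rate:.0f} MiB/s uncompressed")
```

The small differences from the table (166 vs. 167, 55 vs. 56) come from
rounding and from using wall-clock rather than user time.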

From my place I get a peak download bandwidth of 30+ MB/s from
ci.guix.gnu.org, thus substitute downloads are CPU-bound (I can’t go
beyond 11 MB/s due to decompression).  I must say it never occurred to me
that this could be the case when we introduced lzip substitutes.

I’d get faster substitute downloads with gzip (I would download more but
the time-to-disk would be smaller.)  Specifically, download +
decompression of ungoogled-chromium from the LAN completes in 2.4s for
gzip vs. 7.1s for lzip.  On a low-end ARMv7 device, also on the LAN, I
get 32s (gzip) vs. 53s (lzip).
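
To see why the smaller lzip archive can still arrive later, here is a
rough model, not a measurement.  It assumes download and decompression
overlap perfectly, so the slower stage sets the total time, and it uses
the 30 MB/s peak bandwidth and the decompression times quoted above.
The function name time_to_disk is hypothetical.

```python
def time_to_disk(compressed_mb, decompress_seconds, bandwidth_mb_s):
    """Estimate seconds until a streamed substitute is fully unpacked.

    Assumes download and decompression overlap perfectly, so the slower
    of the two stages determines the total.
    """
    download_seconds = compressed_mb / bandwidth_mb_s
    total = max(download_seconds, decompress_seconds)
    bottleneck = "cpu" if decompress_seconds > download_seconds else "network"
    return total, bottleneck


BANDWIDTH = 30.0  # MB/s, the peak download rate mentioned in the message

GZIP = time_to_disk(103, 2.009, BANDWIDTH)   # 103 MB archive
LZIP = time_to_disk(71, 6.040, BANDWIDTH)    # 71 MB archive

print("gzip:", GZIP)
print("lzip:", LZIP)
```

With these inputs the gzip download is limited by the network and the
lzip download by the CPU, which matches the observation above.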

Where to go from here?  Several options:

  0. Lzip decompression speed increases with compression ratio, but
     we’re already using ‘--best’ on ci.  The only way we could gain is
     by using “multi-member archives” and then parallel decompression as
     done in plzip, but that’s probably not supported in lzlib.  So
     we’re probably stuck here.

  1. Since ci.guix.gnu.org still provides both gzip and lzip archives,
     ‘guix substitute’ could automatically pick one or the other
     depending on the CPU and bandwidth.  Perhaps a simple trick would
     be to check the user/wall-clock time ratio and switch to gzip for
     subsequent downloads if that ratio is close to one.  How well would
     that work?

  2. Use Zstd like all the cool kids since it seems to have a much
     higher decompression speed: <https://facebook.github.io/zstd/>.
     630 MB/s on ungoogled-chromium on my laptop.  Woow.

  3. Allow for parallel downloads (really: parallel decompression) as
     Julien did in <https://issues.guix.gnu.org/39728>.
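
Option 1 could look roughly like the sketch below.  It is only an
illustration of the user/wall-clock idea, not how ‘guix substitute’
works: measure, next_compression and the 0.9 threshold are all
hypothetical, and zlib stands in for the real decompressor.

```python
import time
import zlib


def measure(decompress, data):
    """Run decompress(data) and return (cpu_seconds, wall_seconds)."""
    cpu_start, wall_start = time.process_time(), time.perf_counter()
    decompress(data)
    return (time.process_time() - cpu_start,
            time.perf_counter() - wall_start)


def next_compression(cpu_seconds, wall_seconds, current="lzip", threshold=0.9):
    """Pick the compression for the next download.

    A CPU/wall-clock ratio close to one means the previous download was
    CPU-bound, so switch to gzip (hypothetical policy and threshold).
    """
    if wall_seconds <= 0:
        return current
    ratio = cpu_seconds / wall_seconds
    return "gzip" if ratio >= threshold else current


# The lzip run from the benchmark: user 5.950 s, real 6.040 s.
FROM_BENCH = next_compression(5.950, 6.040)

# A download that mostly waited on the network keeps the current choice.
FROM_SLOW_NETWORK = next_compression(1.0, 10.0)

# Timing an in-memory decompression, just to show how measure() is used.
cpu, wall = measure(zlib.decompress, zlib.compress(b"guix" * 100_000))
print(FROM_BENCH, FROM_SLOW_NETWORK, next_compression(cpu, wall))
```

A single sample decides here, so a temporary slowdown would skew the
choice.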

My preference would be #2, #1, and #3, in this order.  #2 is great but
it’s quite a bit of work, whereas #1 could be deployed quickly.  I’m not
fond of #3 because it just papers over the underlying issue and could be
counterproductive if the number of jobs is wrong.

Thoughts?

Ludo’.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: When substitute download + decompression is CPU-bound
  2020-12-14 22:20 When substitute download + decompression is CPU-bound Ludovic Courtès
@ 2020-12-14 22:29 ` Julien Lepiller
  2020-12-14 22:59 ` Nicolò Balzarotti
  2020-12-15 10:40 ` Jonathan Brielmaier
  2 siblings, 0 replies; 43+ messages in thread
From: Julien Lepiller @ 2020-12-14 22:29 UTC (permalink / raw)
  To: guix-devel, Ludovic Courtès

[-- Attachment #1: Type: text/plain, Size: 3502 bytes --]

My proposed changes to allow for parallel downloads assume downloads are network-bound, so they can be separate from other jobs. If downloads are actually CPU-bound, then they indeed have no merit at all :)


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: When substitute download + decompression is CPU-bound
  2020-12-14 22:20 When substitute download + decompression is CPU-bound Ludovic Courtès
  2020-12-14 22:29 ` Julien Lepiller
@ 2020-12-14 22:59 ` Nicolò Balzarotti
  2020-12-15  7:52   ` Pierre Neidhardt
  2020-12-15 11:36   ` Ludovic Courtès
  2020-12-15 10:40 ` Jonathan Brielmaier
  2 siblings, 2 replies; 43+ messages in thread
From: Nicolò Balzarotti @ 2020-12-14 22:59 UTC (permalink / raw)
  To: Ludovic Courtès, guix-devel

Ludovic Courtès <ludo@gnu.org> writes:

> Hi Guix!
>
Hi Ludo

> Quick decompression bench:

I guess this benchmark follows the distri talk, doesn't it? :)

File size with zstd vs zstd -9 vs current lzip:
- 71M uc.nar.lz
- 87M uc.nar.zst-9
- 97M uc.nar.zst-default

> Where to go from here?  Several options:

>   1. Since ci.guix.gnu.org still provides both gzip and lzip archives,
>      ‘guix substitute’ could automatically pick one or the other
>      depending on the CPU and bandwidth.  Perhaps a simple trick would
>      be to check the user/wall-clock time ratio and switch to gzip for
>      subsequent downloads if that ratio is close to one.  How well would
>      that work?

I'm not sure using heuristics (i.e., guessing what should work better,
like in 1.) is the way to go, as temporary network/CPU slowdowns during
the first download would affect the decision.

>   2. Use Zstd like all the cool kids since it seems to have a much
>      higher decompression speed: <https://facebook.github.io/zstd/>.
>      630 MB/s on ungoogled-chromium on my laptop.  Woow.

I know this means more work to do, but it seems to be the best
alternative.  However, if we go that way, will we keep lzip substitutes?
The 20% difference in size between lzip/zstd would mean a lot with slow
(mobile) network connections.
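
To put numbers on the size difference, the sketch below uses the file
sizes listed above.  The 1 MB/s link speed is an assumption chosen for
illustration, and the helper names are made up.

```python
LZIP_MB = 71
ZSTD_MB = {"zstd -9": 87, "zstd default": 97}


def overhead_percent(size_mb, baseline_mb=LZIP_MB):
    """Extra archive size relative to the lzip archive, in percent."""
    return 100.0 * (size_mb - baseline_mb) / baseline_mb


def extra_seconds(size_mb, bandwidth_mb_s, baseline_mb=LZIP_MB):
    """Extra download time caused by the larger archive."""
    return (size_mb - baseline_mb) / bandwidth_mb_s


SLOW_LINK = 1.0  # MB/s, an assumed slow mobile connection

for label, size in ZSTD_MB.items():
    print(f"{label}: +{overhead_percent(size):.1f}% size, "
          f"+{extra_seconds(size, SLOW_LINK):.0f} s at {SLOW_LINK} MB/s")
```

With these sizes the overhead is about 23% for zstd -9 and about 37%
for the default level.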

Nicolò


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: When substitute download + decompression is CPU-bound
  2020-12-14 22:59 ` Nicolò Balzarotti
@ 2020-12-15  7:52   ` Pierre Neidhardt
  2020-12-15  9:45     ` Nicolò Balzarotti
  2020-12-15 11:42     ` Ludovic Courtès
  2020-12-15 11:36   ` Ludovic Courtès
  1 sibling, 2 replies; 43+ messages in thread
From: Pierre Neidhardt @ 2020-12-15  7:52 UTC (permalink / raw)
  To: Nicolò Balzarotti, Ludovic Courtès, guix-devel

[-- Attachment #1: Type: text/plain, Size: 680 bytes --]

Another option is plzip (parallel Lzip, an official part of Lzip).

> decompression of ungoogled-chromium from the LAN completes in 2.4s for
> gzip vs. 7.1s for lzip.  On a low-end ARMv7 device, also on the LAN, I
> get 32s (gzip) vs. 53s (lzip).

With four cores, plzip would beat gzip in the first case.
With only 2 cores, plzip would beat gzip in the second case.
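
The arithmetic behind these two claims, as a sketch: it assumes ideal
linear scaling and divides the whole measured time by the core count,
even though that time also includes the download, so it is an
optimistic bound.  The function names are hypothetical.

```python
import math


def ideal_parallel_time(serial_seconds, cores):
    """Best-case time if the work splits evenly across cores."""
    return serial_seconds / cores


def cores_to_beat(lzip_seconds, gzip_seconds):
    """Smallest core count whose ideal parallel time is below gzip's."""
    return math.floor(lzip_seconds / gzip_seconds) + 1


# (lzip seconds, gzip seconds) from the original message
CASES = {"laptop": (7.1, 2.4), "armv7": (53.0, 32.0)}

for name, (lzip_s, gzip_s) in CASES.items():
    n = cores_to_beat(lzip_s, gzip_s)
    print(f"{name}: {n} cores give {ideal_parallel_time(lzip_s, n):.2f} s "
          f"vs. {gzip_s} s for gzip")
```

Under this ideal model three cores would already suffice in the first
case; four cores leave some margin for imperfect scaling.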

What's left to do to implement plzip support?  That's the good news:
almost nothing!

- On the Lzip binding side, we need to add support for multi-member
  archives.  It's a bit of work but not that much.
- On the Guix side, there is nothing to do.

Cheers!

-- 
Pierre Neidhardt
https://ambrevar.xyz/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 511 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: When substitute download + decompression is CPU-bound
  2020-12-15  7:52   ` Pierre Neidhardt
@ 2020-12-15  9:45     ` Nicolò Balzarotti
  2020-12-15  9:54       ` Pierre Neidhardt
  2020-12-15 11:42     ` Ludovic Courtès
  1 sibling, 1 reply; 43+ messages in thread
From: Nicolò Balzarotti @ 2020-12-15  9:45 UTC (permalink / raw)
  To: Pierre Neidhardt, Ludovic Courtès, guix-devel

Pierre Neidhardt <mail@ambrevar.xyz> writes:

> Another option is plzip (parallel Lzip, an official part of Lzip).

Wouldn't that become a problem once we have parallel downloads (since
several decompressions will then sometimes run at the same time)?


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: When substitute download + decompression is CPU-bound
  2020-12-15  9:45     ` Nicolò Balzarotti
@ 2020-12-15  9:54       ` Pierre Neidhardt
  2020-12-15 10:03         ` Nicolò Balzarotti
  0 siblings, 1 reply; 43+ messages in thread
From: Pierre Neidhardt @ 2020-12-15  9:54 UTC (permalink / raw)
  To: Nicolò Balzarotti, Ludovic Courtès, guix-devel

[-- Attachment #1: Type: text/plain, Size: 615 bytes --]

Nicolò Balzarotti <anothersms@gmail.com> writes:

> Pierre Neidhardt <mail@ambrevar.xyz> writes:
>
>> Another option is plzip (parallel Lzip, an official part of Lzip).
>
> Wouldn't that become a problem once we have parallel downloads (since
> several decompressions will then sometimes run at the same time)?

What do you mean?

Parallel decompression is unrelated to downloads as far as I
understand.  Once the archive (or just archive chunks?) is available,
plzip can decompress multiple segments at the same time if enough cores
are available.

-- 
Pierre Neidhardt
https://ambrevar.xyz/

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: When substitute download + decompression is CPU-bound
  2020-12-15  9:54       ` Pierre Neidhardt
@ 2020-12-15 10:03         ` Nicolò Balzarotti
  2020-12-15 10:13           ` Pierre Neidhardt
  0 siblings, 1 reply; 43+ messages in thread
From: Nicolò Balzarotti @ 2020-12-15 10:03 UTC (permalink / raw)
  To: Pierre Neidhardt, Ludovic Courtès, guix-devel

Pierre Neidhardt <mail@ambrevar.xyz> writes:

>
> What do you mean?
>
If you download multiple files at a time, you might end up decompressing
them simultaneously.  Plzip won't help then on a dual-core machine,
where you might end up CPU-bound again.  Is this right?

If it is, reducing the overall cpu usage seems to be a better approach
in the long term.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: When substitute download + decompression is CPU-bound
  2020-12-15 10:03         ` Nicolò Balzarotti
@ 2020-12-15 10:13           ` Pierre Neidhardt
  2020-12-15 10:14             ` Pierre Neidhardt
  0 siblings, 1 reply; 43+ messages in thread
From: Pierre Neidhardt @ 2020-12-15 10:13 UTC (permalink / raw)
  To: Nicolò Balzarotti, Ludovic Courtès, guix-devel

[-- Attachment #1: Type: text/plain, Size: 728 bytes --]

Nicolò Balzarotti <anothersms@gmail.com> writes:

> If you download multiple files at a time, you might end up decompressing
> them simultaneously.  Plzip won't help then on a dual-core machine,
> where you might end up CPU-bound again.  Is this right?
>
> If it is, reducing the overall cpu usage seems to be a better approach
> in the long term.

An answer to this may be in pipelining the process.

The parallel downloads would feed the archives to the pipeline and the
parallel decompressor would pop the archives out of the pipeline one by
one.

If I'm not mistaken, this should yield optimal results regardless of the
network or CPU performance.
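
A minimal sketch of such a pipeline follows, under several assumptions:
threads stand in for the parallel downloads, a bounded queue plays the
role of the pipeline, and gzip from the Python standard library stands
in for the decompressor (lzip is not in the standard library).  All
names are made up for the example.

```python
import gzip
import queue
import threading


def downloader(name, payload, channel):
    """Stand-in for a network fetch: hand the compressed bytes over."""
    channel.put((name, payload))


def decompressor(channel, results):
    """Pop archives one at a time until the sentinel None arrives."""
    while True:
        item = channel.get()
        if item is None:
            break
        name, payload = item
        results[name] = gzip.decompress(payload)


def run_pipeline(archives):
    # Bounded queue: downloads block when decompression falls behind.
    channel = queue.Queue(maxsize=4)
    results = {}
    consumer = threading.Thread(target=decompressor, args=(channel, results))
    consumer.start()
    producers = [
        threading.Thread(target=downloader, args=(name, payload, channel))
        for name, payload in archives.items()
    ]
    for thread in producers:
        thread.start()
    for thread in producers:
        thread.join()
    channel.put(None)  # tell the decompressor that nothing more will come
    consumer.join()
    return results


ARCHIVES = {
    f"pkg-{i}": gzip.compress(f"contents of package {i}".encode() * 100)
    for i in range(8)
}
RESULTS = run_pipeline(ARCHIVES)
print(sorted(RESULTS))
```

Only one archive is decompressed at a time here, so a parallel
decompressor could use all cores for that one archive without competing
with other decompression jobs.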

-- 
Pierre Neidhardt
https://ambrevar.xyz/

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: When substitute download + decompression is CPU-bound
  2020-12-15 10:13           ` Pierre Neidhardt
@ 2020-12-15 10:14             ` Pierre Neidhardt
  0 siblings, 0 replies; 43+ messages in thread
From: Pierre Neidhardt @ 2020-12-15 10:14 UTC (permalink / raw)
  To: Nicolò Balzarotti, Ludovic Courtès, guix-devel

[-- Attachment #1: Type: text/plain, Size: 130 bytes --]

Here the "pipeline" could be a CSP channel.
Not sure what the term is in Guile.

-- 
Pierre Neidhardt
https://ambrevar.xyz/

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: When substitute download + decompression is CPU-bound
  2020-12-14 22:20 When substitute download + decompression is CPU-bound Ludovic Courtès
  2020-12-14 22:29 ` Julien Lepiller
  2020-12-14 22:59 ` Nicolò Balzarotti
@ 2020-12-15 10:40 ` Jonathan Brielmaier
  2020-12-15 19:43   ` Joshua Branson
  2 siblings, 1 reply; 43+ messages in thread
From: Jonathan Brielmaier @ 2020-12-15 10:40 UTC (permalink / raw)
  To: guix-devel

Super interesting findings!

On 14.12.20 23:20, Ludovic Courtès wrote:
>    2. Use Zstd like all the cool kids since it seems to have a much
>       higher decompression speed: <https://facebook.github.io/zstd/>.
>       630 MB/s on ungoogled-chromium on my laptop.  Woow.

Not only is decompression fast, compression is as well:

size  file              time for compression (lower is better)
335M  uc.nar
104M  uc.nar.gz           8
71M   uc.nar.lz.level9  120
74M   uc.nar.lz.level6   80
82M   uc.nar.lz.level3   30
89M   uc.nar.lz.level1   16
97M   uc.nar.zst          1

So I am sold on zstd, as a user and as a substitute server caretaker :)

For mobile users and users without flat-rate Internet access, the
increased nar size is a problem.
Although I think the problem here is not the choice between gzip, lzip
and zstd.  It's the fact that we download the new package in full even
if the change is just some 100 lines of diffoscope diff[0].  And most
of those are due to the changed /gnu/store name...

[0] diffoscope --max-diff-block-lines 0
/gnu/store/zvcn2r352wxnmq7jayz5myg23gh9s17q-icedove-78.5.1
/gnu/store/dzjym6y7b9z4apgvvydj9lf0kbaa8qbv-icedove-78.5.1
lines: 783
size: 64k


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: When substitute download + decompression is CPU-bound
  2020-12-14 22:59 ` Nicolò Balzarotti
  2020-12-15  7:52   ` Pierre Neidhardt
@ 2020-12-15 11:36   ` Ludovic Courtès
  2020-12-15 11:45     ` Nicolò Balzarotti
  1 sibling, 1 reply; 43+ messages in thread
From: Ludovic Courtès @ 2020-12-15 11:36 UTC (permalink / raw)
  To: Nicolò Balzarotti; +Cc: guix-devel

Hi,

Nicolò Balzarotti <anothersms@gmail.com> skribis:

> I guess this benchmark follows the distri talk, doesn't it? :)

Yes, that and my own quest for optimization opportunities.  :-)

> File size with zstd vs zstd -9 vs current lzip:
> - 71M uc.nar.lz
> - 87M uc.nar.zst-9
> - 97M uc.nar.zst-default
>
>> Where to go from here?  Several options:
>
>>   1. Since ci.guix.gnu.org still provides both gzip and lzip archives,
>>      ‘guix substitute’ could automatically pick one or the other
>>      depending on the CPU and bandwidth.  Perhaps a simple trick would
>>      be to check the user/wall-clock time ratio and switch to gzip for
>>      subsequent downloads if that ratio is close to one.  How well would
>>      that work?
>
> I'm not sure using heuristics (i.e., guessing what should work better,
> like in 1.) is the way to go, as temporary network/CPU slowdowns during
> the first download would affect the decision.

I suppose we could time each substitute download and adjust the choice
continually.

It might be better to provide a command-line flag to choose between
optimizing for bandwidth usage (users with limited Internet access may
prefer that) or for speed.
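
One way the continual adjustment and the speed/bandwidth preference
could fit together is sketched below.  Everything in it is hypothetical
(the CompressionChooser class, the smoothing factor, the `prefer'
argument); it only shows a moving average of observed throughput per
compression method, fed with the LAN timings from earlier in the thread.

```python
class CompressionChooser:
    """Track a moving average of observed throughput for each method."""

    def __init__(self, methods=("lzip", "gzip"), smoothing=0.3):
        self.smoothing = smoothing
        # Uncompressed bytes per second; None until a method is measured.
        self.rate = {method: None for method in methods}

    def record(self, method, uncompressed_bytes, seconds):
        """Fold one timed substitute download into the average."""
        sample = uncompressed_bytes / seconds
        old = self.rate[method]
        if old is None:
            self.rate[method] = sample
        else:
            self.rate[method] = (self.smoothing * sample
                                 + (1 - self.smoothing) * old)

    def choose(self, prefer="speed"):
        """Return the method to request for the next download."""
        if prefer == "bandwidth":
            return "lzip"  # smallest archives in the figures above
        untried = [m for m, rate in self.rate.items() if rate is None]
        if untried:
            return untried[0]
        return max(self.rate, key=self.rate.get)


chooser = CompressionChooser()
FIRST = chooser.choose()                  # nothing measured yet
chooser.record("lzip", 350_491_552, 7.1)  # LAN timing quoted earlier
SECOND = chooser.choose()                 # gzip has not been tried
chooser.record("gzip", 350_491_552, 2.4)
BY_SPEED = chooser.choose()
BY_BANDWIDTH = chooser.choose(prefer="bandwidth")
print(FIRST, SECOND, BY_SPEED, BY_BANDWIDTH)
```

Once one method wins, this sketch never measures the other again; a
real policy would need occasional re-sampling.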

>>   2. Use Zstd like all the cool kids since it seems to have a much
>>      higher decompression speed: <https://facebook.github.io/zstd/>.
>>      630 MB/s on ungoogled-chromium on my laptop.  Woow.
>
> I know this means more work to do, but it seems to be the best
> alternative.  However, if we go that way, will we keep lzip substitutes?
> The 20% difference in size between lzip/zstd would mean a lot with slow
> (mobile) network connections.

A lot in what sense?  In terms of bandwidth usage, right?

In terms of speed, zstd would probably reduce the time-to-disk as soon
as you have ~15 MB/s peak bandwidth or more.
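
The ~15 MB/s figure can be reproduced with a simple model.  As
assumptions: download and decompression overlap, lzip decompression
takes the 6.04 s measured above, and zstd decompresses at the 630 MB/s
quoted earlier regardless of level.  The names below are made up.

```python
UNCOMPRESSED_MB = 350.5  # about 350491552 bytes

LZIP = {"size_mb": 71, "decompress_s": 6.040}
ZSTD_9 = {"size_mb": 87, "decompress_s": UNCOMPRESSED_MB / 630}
ZSTD_DEFAULT = {"size_mb": 97, "decompress_s": UNCOMPRESSED_MB / 630}


def streamed_time(archive, bandwidth_mb_s):
    """Time-to-disk when the slower of the two stages dominates."""
    return max(archive["size_mb"] / bandwidth_mb_s, archive["decompress_s"])


def break_even(fast, slow):
    """Bandwidth (MB/s) above which `fast` reaches the disk first.

    Valid when `slow` is CPU-bound and `fast` is network-bound around
    that bandwidth, which holds for the figures used here.
    """
    return fast["size_mb"] / slow["decompress_s"]


print(f"zstd -9:      {break_even(ZSTD_9, LZIP):.1f} MB/s")
print(f"zstd default: {break_even(ZSTD_DEFAULT, LZIP):.1f} MB/s")
```

Below the break-even bandwidth the smaller lzip archive still wins,
since both downloads are then limited by the network or by lzip's
decompression time.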

Anyway, we’re not there yet, but I suppose if we get zstd support, we
could configure berlin to keep lzip and zstd (rather than lzip and gzip
as is currently the case).

Ludo’.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: When substitute download + decompression is CPU-bound
  2020-12-15  7:52   ` Pierre Neidhardt
  2020-12-15  9:45     ` Nicolò Balzarotti
@ 2020-12-15 11:42     ` Ludovic Courtès
  2020-12-15 12:31       ` Pierre Neidhardt
  1 sibling, 1 reply; 43+ messages in thread
From: Ludovic Courtès @ 2020-12-15 11:42 UTC (permalink / raw)
  To: Pierre Neidhardt; +Cc: guix-devel, Nicolò Balzarotti

Hi,

Pierre Neidhardt <mail@ambrevar.xyz> skribis:

> Another option is plzip (parallel Lzip, an official part of Lzip).
>
>> decompression of ungoogled-chromium from the LAN completes in 2.4s for
>> gzip vs. 7.1s for lzip.  On a low-end ARMv7 device, also on the LAN, I
>> get 32s (gzip) vs. 53s (lzip).
>
> With four cores, plzip would beat gzip in the first case.
> With only 2 cores, plzip would beat gzip in the second case.
>
> What's left to do to implement plzip support?  That's the good news:
> almost nothing!
>
> - On the Lzip binding side, we need to add support for multi-member
>   archives.  It's a bit of work but not that much.
> - On the Guix side, there is nothing to do.

Well, ‘guix publish’ would first need to create multi-member archives,
right?

Also, lzlib (which is what we use) does not implement parallel
decompression, AIUI.

Even if it did, would we be able to take advantage of it?  Currently
‘restore-file’ expects to read an archive stream sequentially.

Even if I’m wrong :-), decompression speed would at best be doubled on
multi-core machines (wouldn’t help much on low-end ARM devices), and
that’s very little compared to the decompression speed achieved by zstd.

Ludo’.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: When substitute download + decompression is CPU-bound
  2020-12-15 11:36   ` Ludovic Courtès
@ 2020-12-15 11:45     ` Nicolò Balzarotti
  0 siblings, 0 replies; 43+ messages in thread
From: Nicolò Balzarotti @ 2020-12-15 11:45 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel

Ludovic Courtès <ludo@gnu.org> writes:

> A lot in what sense?  In terms of bandwidth usage, right?

Yep, I think most mobile data plans are still limited.  Even if here
in Italy it's easy to get 50 GB+ per month, I think it's not the same
worldwide.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: When substitute download + decompression is CPU-bound
  2020-12-15 11:42     ` Ludovic Courtès
@ 2020-12-15 12:31       ` Pierre Neidhardt
  2020-12-18 14:59         ` Ludovic Courtès
  0 siblings, 1 reply; 43+ messages in thread
From: Pierre Neidhardt @ 2020-12-15 12:31 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel, Nicolò Balzarotti

[-- Attachment #1: Type: text/plain, Size: 1716 bytes --]

Hi Ludo,

Ludovic Courtès <ludo@gnu.org> writes:

> Well, ‘guix publish’ would first need to create multi-member archives,
> right?

Correct, but it's trivial once the bindings have been implemented.

> Also, lzlib (which is what we use) does not implement parallel
> decompression, AIUI.

Yes it does: multi-member archives are a non-optional part of the Lzip
specs, and lzlib implements all the specs.

> Even if it did, would we be able to take advantage of it?  Currently
> ‘restore-file’ expects to read an archive stream sequentially.

Yes it works, I just tried this:

--8<---------------cut here---------------start------------->8---
cat big-file.lz | plzip -d -o big-file -
--8<---------------cut here---------------end--------------->8---

Decompression happens in parallel.

> Even if I’m wrong :-), decompression speed would at best be doubled on
> multi-core machines (wouldn’t help much on low-end ARM devices), and
> that’s very little compared to the decompression speed achieved by zstd.

Why doubled?  If the archive has more than CORE-NUMBER segments, then
the decompression duration can be divided by CORE-NUMBER.

All that said, I think we should have both:

- Parallel lzip support is the easiest to add at this point.
  It's the best option for people with low bandwidth.  This can benefit
  most of the planet I suppose.

- zstd is best for users with high bandwidth (or with slow hardware).
  We need to write the necessary bindings though, so it will take a bit
  more time.

Then the users can choose which compression they prefer, mostly
depending on their hardware and bandwidth.

-- 
Pierre Neidhardt
https://ambrevar.xyz/

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: When substitute download + decompression is CPU-bound
  2020-12-15 10:40 ` Jonathan Brielmaier
@ 2020-12-15 19:43   ` Joshua Branson
  2021-01-07 10:45     ` Guillaume Le Vaillant
  0 siblings, 1 reply; 43+ messages in thread
From: Joshua Branson @ 2020-12-15 19:43 UTC (permalink / raw)
  To: Jonathan Brielmaier; +Cc: guix-devel


Looking on the Zstandard website (https://facebook.github.io/zstd/), it
mentions google's snappy compression library
(https://github.com/google/snappy).  Snappy has some fairly good
benchmarks too:

Compressor 	Ratio 	Compression 	Decompress.
zstd    	2.884   500 MB/s 	1660 MB/s
snappy  	2.073   560 MB/s 	1790 MB/s

Would snappy be easier to use than Zstandard?

--
Joshua Branson
Sent from Emacs and Gnus
  https://gnucode.me
  https://video.hardlimit.com/accounts/joshua_branson/video-channels
  https://propernaming.org
  "You can have whatever you want, as long as you help
  enough other people get what they want." - Zig Ziglar


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: When substitute download + decompression is CPU-bound
  2020-12-15 12:31       ` Pierre Neidhardt
@ 2020-12-18 14:59         ` Ludovic Courtès
  2020-12-18 15:33           ` Pierre Neidhardt
  0 siblings, 1 reply; 43+ messages in thread
From: Ludovic Courtès @ 2020-12-18 14:59 UTC (permalink / raw)
  To: Pierre Neidhardt; +Cc: guix-devel, Nicolò Balzarotti

Hi Pierre,

Pierre Neidhardt <mail@ambrevar.xyz> skribis:

> Ludovic Courtès <ludo@gnu.org> writes:
>
>> Well, ‘guix publish’ would first need to create multi-member archives,
>> right?
>
> Correct, but it's trivial once the bindings have been implemented.

OK.

>> Also, lzlib (which is what we use) does not implement parallel
>> decompression, AIUI.
>
> Yes it does: multi-member archives are a non-optional part of the Lzip
> specs, and lzlib implements all the specs.

Nice.

>> Even if it did, would we be able to take advantage of it?  Currently
>> ‘restore-file’ expects to read an archive stream sequentially.
>
> Yes it works, I just tried this:
>
> cat big-file.lz | plzip -d -o big-file -
>
> Decompression happens in parallel.
>
>> Even if I’m wrong :-), decompression speed would at best be doubled on
>> multi-core machines (wouldn’t help much on low-end ARM devices), and
>> that’s very little compared to the decompression speed achieved by zstd.
>
> Why doubled?  If the archive has more than CORE-NUMBER segments, then
> the decompression duration can be divided by CORE-NUMBER.

My laptop has 4 cores, so at best I’d get a 4x speedup, compared to the
10x speedup with zstd that also comes with much lower resource usage,
etc.

> All that said, I think we should have both:
>
> - Parallel lzip support is the easiest to add at this point.
>   It's the best option for people with low bandwidth.  This can benefit
>   most of the planet I suppose.
>
> - zstd is best for users with high bandwidth (or with slow hardware).
>   We need to write the necessary bindings though, so it will take a bit
>   more time.
>
> Then the users can choose which compression they prefer, mostly
> depending on their hardware and bandwidth.

Would you like to give parallel lzip a try?

Thanks!

Ludo’.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: When substitute download + decompression is CPU-bound
  2020-12-18 14:59         ` Ludovic Courtès
@ 2020-12-18 15:33           ` Pierre Neidhardt
  0 siblings, 0 replies; 43+ messages in thread
From: Pierre Neidhardt @ 2020-12-18 15:33 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel, Nicolò Balzarotti

[-- Attachment #1: Type: text/plain, Size: 704 bytes --]

Ludovic Courtès <ludo@gnu.org> writes:

> My laptop has 4 cores, so at best I’d get a 4x speedup, compared to the
> 10x speedup with zstd that also comes with much lower resource usage,
> etc.

Of course, it's a trade off between high compression and high speed :)

Since there is no universal best option, I think it's best to support both.

> Would you like to give parallel lzip a try?

It shouldn't be too hard for me considering I already have experience
with Lzip, but I can only reasonably do this after FOSDEM, so 1.5
months from now.

If I forget, please ping me ;)

If there is any taker before that, please go ahead! :)

-- 
Pierre Neidhardt
https://ambrevar.xyz/

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: When substitute download + decompression is CPU-bound
  2020-12-15 19:43   ` Joshua Branson
@ 2021-01-07 10:45     ` Guillaume Le Vaillant
  2021-01-07 11:00       ` Pierre Neidhardt
  2021-01-14 21:51       ` Ludovic Courtès
  0 siblings, 2 replies; 43+ messages in thread
From: Guillaume Le Vaillant @ 2021-01-07 10:45 UTC (permalink / raw)
  To: Joshua Branson; +Cc: guix-devel


[-- Attachment #1.1: Type: text/plain, Size: 353 bytes --]


I compared gzip, lzip and zstd when compressing a 580 MB pack (therefore
containing "substitutes" for several packages) with different compression
levels. Maybe the results can be of some use to someone.

Note that the plots show only the single-threaded results at standard
compression levels, and that the speed axis uses a logarithmic scale.


[-- Attachment #1.2: compression-benchmark.org --]
[-- Type: text/x-org, Size: 9159 bytes --]

Machine used for the tests:
 - CPU: Intel i7-3630QM
 - RAM: 16 GiB

Programs:
 - gzip 1.10
 - pigz 2.4
 - lzip 1.21
 - plzip 1.8
 - zstd 1.4.4
 - pzstd 1.4.4

Uncompressed file:
 - name: monero-0.17.1.5-pack.tar
 - size: 582707200 bytes

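Units: times are in seconds, sizes in bytes, and speeds in bytes per
second.  Compression speed is the uncompressed size divided by the
compression time; decompression speed is the compressed size divided by
the decompression time.

#+PLOT: script:"compression-benchmark.plot"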
| Comp. command  | Comp. time | Comp. size | Comp. speed | Comp. ratio | Decomp. time | Decomp. speed |
|----------------+------------+------------+-------------+-------------+--------------+---------------|
| gzip -1        |      7.999 |  166904534 |    72847506 |       3.491 |        3.292 |      50700041 |
| gzip -2        |      8.469 |  161859128 |    68804723 |       3.600 |        3.214 |      50360650 |
| gzip -3        |     10.239 |  157839772 |    56910558 |       3.692 |        3.144 |      50203490 |
| gzip -4        |     11.035 |  151039457 |    52805365 |       3.858 |        3.104 |      48659619 |
| gzip -5        |     13.767 |  146693142 |    42326375 |       3.972 |        3.143 |      46672969 |
| gzip -6        |     19.707 |  144364588 |    29568539 |       4.036 |        3.001 |      48105494 |
| gzip -7        |     24.014 |  143727357 |    24265312 |       4.054 |        2.993 |      48021168 |
| gzip -8        |     43.219 |  143062985 |    13482663 |       4.073 |        2.969 |      48185579 |
| gzip -9        |     70.930 |  142803637 |     8215243 |       4.080 |        2.964 |      48179365 |
| pigz -1 -p 4   |      2.247 |  165745308 |   259326747 |       3.516 |        1.919 |      86370666 |
| pigz -2 -p 4   |      2.394 |  160661935 |   243403175 |       3.627 |        1.862 |      86284605 |
| pigz -3 -p 4   |      2.776 |  156696382 |   209908934 |       3.719 |        1.817 |      86239065 |
| pigz -4 -p 4   |      3.045 |  150539955 |   191365255 |       3.871 |        1.787 |      84241721 |
| pigz -5 -p 4   |      3.855 |  146289903 |   151156213 |       3.983 |        1.732 |      84462992 |
| pigz -6 -p 4   |      5.378 |  143967093 |   108350167 |       4.048 |        1.721 |      83653163 |
| pigz -7 -p 4   |      6.579 |  143350506 |    88570786 |       4.065 |        1.702 |      84224739 |
| pigz -8 -p 4   |     11.760 |  142738270 |    49549932 |       4.082 |        1.720 |      82987366 |
| pigz -9 -p 4   |     19.878 |  142479078 |    29314176 |       4.090 |        1.691 |      84257290 |
| lzip -0        |     16.686 |  130302649 |    34921923 |       4.472 |        9.981 |      13055070 |
| lzip -1        |     42.011 |  118070414 |    13870348 |       4.935 |        8.669 |      13619842 |
| lzip -2        |     51.395 |  112769303 |    11337819 |       5.167 |        8.368 |      13476255 |
| lzip -3        |     69.344 |  106182860 |     8403138 |       5.488 |        8.162 |      13009417 |
| lzip -4        |     89.781 |  100072461 |     6490318 |       5.823 |        7.837 |      12769231 |
| lzip -5        |    119.626 |   95033235 |     4871075 |       6.132 |        7.586 |      12527450 |
| lzip -6        |    155.740 |   83063613 |     3741538 |       7.015 |        6.856 |      12115463 |
| lzip -7        |    197.485 |   78596381 |     2950640 |       7.414 |        6.586 |      11933857 |
| lzip -8        |    238.076 |   72885403 |     2447568 |       7.995 |        6.227 |      11704738 |
| lzip -9        |    306.368 |   72279340 |     1901985 |       8.062 |        6.203 |      11652320 |
| plzip -0 -n 4  |      4.821 |  131211238 |   120868533 |       4.441 |        2.829 |      46380784 |
| plzip -1 -n 4  |     13.453 |  120565830 |    43314294 |       4.833 |        2.604 |      46300242 |
| plzip -2 -n 4  |     15.695 |  114874773 |    37126932 |       5.073 |        2.398 |      47904409 |
| plzip -3 -n 4  |     20.563 |  108896468 |    28337655 |       5.351 |        2.486 |      43803889 |
| plzip -4 -n 4  |     26.871 |  102285879 |    21685356 |       5.697 |        2.375 |      43067739 |
| plzip -5 -n 4  |     35.220 |   97402840 |    16544781 |       5.982 |        2.448 |      39788742 |
| plzip -6 -n 4  |     45.812 |   89260273 |    12719532 |       6.528 |        2.145 |      41613181 |
| plzip -7 -n 4  |     62.723 |   82944080 |     9290168 |       7.025 |        2.080 |      39876962 |
| plzip -8 -n 4  |     71.928 |   78477272 |     8101257 |       7.425 |        2.120 |      37017581 |
| plzip -9 -n 4  |    103.744 |   75648923 |     5616780 |       7.703 |        2.578 |      29344035 |
| zstd -1        |      2.057 |  145784609 |   283280117 |       3.997 |        0.639 |     228144928 |
| zstd -2        |      2.316 |  136049621 |   251600691 |       4.283 |        0.657 |     207077049 |
| zstd -3        |      2.733 |  127702753 |   213211562 |       4.563 |        0.650 |     196465774 |
| zstd -4        |      3.269 |  126224007 |   178252432 |       4.616 |        0.658 |     191829798 |
| zstd -5        |      5.136 |  122024478 |   113455452 |       4.775 |        0.680 |     179447762 |
| zstd -6        |      6.394 |  120035201 |    91133438 |       4.854 |        0.652 |     184103069 |
| zstd -7        |      8.510 |  116048780 |    68473231 |       5.021 |        0.612 |     189622190 |
| zstd -8        |      9.875 |  114821611 |    59008324 |       5.075 |        0.593 |     193628349 |
| zstd -9        |     12.478 |  113868149 |    46698766 |       5.117 |        0.588 |     193653315 |
| zstd -10       |     14.982 |  111113753 |    38893819 |       5.244 |        0.578 |     192238327 |
| zstd -11       |     16.391 |  110674252 |    35550436 |       5.265 |        0.583 |     189835767 |
| zstd -12       |     21.008 |  110031164 |    27737395 |       5.296 |        0.570 |     193037130 |
| zstd -13       |     51.259 |  109262475 |    11367900 |       5.333 |        0.561 |     194763770 |
| zstd -14       |     58.897 |  108632734 |     9893665 |       5.364 |        0.562 |     193296680 |
| zstd -15       |     82.514 |  107956132 |     7061919 |       5.398 |        0.557 |     193817113 |
| zstd -16       |     78.935 |  105533404 |     7382114 |       5.522 |        0.576 |     183217715 |
| zstd -17       |     89.832 |   94165409 |     6486633 |       6.188 |        0.565 |     166664441 |
| zstd -18       |    115.663 |   91124039 |     5037974 |       6.395 |        0.614 |     148410487 |
| zstd -19       |    157.008 |   90229137 |     3711322 |       6.458 |        0.614 |     146952992 |
| zstd -20       |    162.499 |   80742922 |     3585913 |       7.217 |        0.605 |     133459375 |
| zstd -21       |    207.122 |   79619348 |     2813353 |       7.319 |        0.611 |     130309899 |
| zstd -22       |    277.177 |   78652901 |     2102293 |       7.409 |        0.634 |     124058203 |
| pzstd -1 -p 4  |      0.621 |  146665510 |   938336876 |       3.973 |        0.196 |     748293418 |
| pzstd -2 -p 4  |      0.720 |  137416958 |   809315556 |       4.240 |        0.227 |     605361048 |
| pzstd -3 -p 4  |      1.180 |  128748806 |   493819661 |       4.526 |        0.231 |     557354139 |
| pzstd -4 -p 4  |      1.786 |  127373154 |   326263830 |       4.575 |        0.240 |     530721475 |
| pzstd -5 -p 4  |      2.635 |  123216422 |   221141252 |       4.729 |        0.240 |     513401758 |
| pzstd -6 -p 4  |      3.774 |  121257316 |   154400424 |       4.806 |        0.251 |     483096876 |
| pzstd -7 -p 4  |      3.988 |  117361187 |   146115145 |       4.965 |        0.263 |     446240255 |
| pzstd -8 -p 4  |      4.540 |  116172098 |   128349604 |       5.016 |        0.240 |     484050408 |
| pzstd -9 -p 4  |      5.083 |  115237287 |   114638442 |       5.057 |        0.268 |     429989877 |
| pzstd -10 -p 4 |      5.630 |  112359994 |   103500391 |       5.186 |        0.226 |     497168115 |
| pzstd -11 -p 4 |      5.991 |  111969711 |    97263762 |       5.204 |        0.246 |     455161427 |
| pzstd -12 -p 4 |      8.001 |  111326376 |    72829296 |       5.234 |        0.227 |     490424564 |
| pzstd -13 -p 4 |     16.035 |  110525395 |    36339707 |       5.272 |        0.259 |     426738977 |
| pzstd -14 -p 4 |     18.145 |  109957500 |    32113927 |       5.299 |        0.253 |     434614625 |
| pzstd -15 -p 4 |     24.791 |  109358520 |    23504788 |       5.328 |        0.224 |     488207679 |
| pzstd -16 -p 4 |     23.940 |  106888588 |    24340317 |       5.452 |        0.234 |     456788838 |
| pzstd -17 -p 4 |     29.099 |   97393935 |    20024991 |       5.983 |        0.266 |     366142613 |
| pzstd -18 -p 4 |     37.124 |   94273955 |    15696240 |       6.181 |        0.284 |     331950546 |
| pzstd -19 -p 4 |     48.798 |   93531545 |    11941211 |       6.230 |        0.262 |     356990630 |
| pzstd -20 -p 4 |     54.860 |   82067608 |    10621713 |       7.100 |        0.302 |     271747046 |
| pzstd -21 -p 4 |     64.179 |   79735488 |     9079406 |       7.308 |        0.389 |     204975548 |
| pzstd -22 -p 4 |    256.242 |   78688788 |     2274050 |       7.405 |        0.585 |     134510749 |
#+TBLFM: $4='(format "%d" (round (/ 582707200.0 $2)));N :: $5='(format "%.3f" (/ 582707200.0 $3));N :: $7='(format "%d" (round (/ $3 $6)));N
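
The #+TBLFM line derives the speed and ratio columns from the measured ones. As a rough cross-check, here is a small Python sketch of the same arithmetic. The 582707200-byte input size comes from the formula itself; the function and key names are made up for illustration.

```python
# Size in bytes of the uncompressed input, taken from the #+TBLFM formula.
UNCOMPRESSED_SIZE = 582707200.0


def derived_columns(compress_time, compressed_size, decompress_time):
    """Recompute columns 4, 5 and 7 of the table from columns 2, 3 and 6."""
    return {
        # $4: uncompressed bytes consumed per second while compressing.
        "compression_speed": round(UNCOMPRESSED_SIZE / compress_time),
        # $5: uncompressed size divided by compressed size, 3 decimals.
        "compression_ratio": float(format(UNCOMPRESSED_SIZE / compressed_size, ".3f")),
        # $7: compressed bytes consumed per second while decompressing.
        "decompression_speed": round(compressed_size / decompress_time),
    }


# The "zstd -19" row: 157.008 s, 90229137 bytes, 0.614 s.
row = derived_columns(157.008, 90229137, 0.614)
print(row)
```

Note that, per the formula, the decompression speed is expressed in compressed bytes read per second, whereas the compression speed is in uncompressed bytes read per second.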

[-- Attachment #1.3: compression-benchmark.plot --]
[-- Type: text/plain, Size: 1631 bytes --]

set terminal png size 1920, 1080
set style data linespoints
set logscale y
set xlabel "Compression ratio"
set ylabel "Compression speed (MB/s)"
set output "compression.png"
plot '$datafile' every ::0::8 using 5:($4 / 1000000) linecolor "dark-violet" title "gzip", \
     '$datafile' every ::0::8 using 5:($4 / 1000000):(substr(stringcolumn(1), 7, 8)) with labels textcolor "dark-violet" offset -1, -1 notitle, \
     '$datafile' every ::18::27 using 5:($4 / 1000000) linecolor "navy" title "lzip", \
     '$datafile' every ::18::27 using 5:($4 / 1000000):(substr(stringcolumn(1), 7, 8)) with labels textcolor "navy" offset 0, -1 notitle, \
     '$datafile' every ::38::56 using 5:($4 / 1000000) linecolor "olive" title "zstd", \
     '$datafile' every ::38::56 using 5:($4 / 1000000):(substr(stringcolumn(1), 7, 9)) with labels textcolor "olive" offset 1, 1 notitle

set ylabel "Decompression speed (MB/s)"
set output "decompression.png"
plot '$datafile' every ::0::8 using 5:($7 / 1000000) linecolor "dark-violet" title "gzip", \
     '$datafile' every ::0::8 using 5:($7 / 1000000):(substr(stringcolumn(1), 7, 8)) with labels textcolor "dark-violet" offset 0, -1 notitle, \
     '$datafile' every ::18::27 using 5:($7 / 1000000) linecolor "navy" title "lzip", \
     '$datafile' every ::18::27 using 5:($7 / 1000000):(substr(stringcolumn(1), 7, 8)) with labels textcolor "navy" offset 0, -1 notitle, \
     '$datafile' every ::38::56 using 5:($7 / 1000000) linecolor "olive" title "zstd", \
     '$datafile' every ::38::56 using 5:($7 / 1000000):(substr(stringcolumn(1), 7, 8)) with labels textcolor "olive" offset 0, -1 notitle

[-- Attachment #1.4: compression.png --]
[-- Type: image/png, Size: 16056 bytes --]

[-- Attachment #1.5: decompression.png --]
[-- Type: image/png, Size: 12804 bytes --]

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 247 bytes --]

* Re: When substitute download + decompression is CPU-bound
  2021-01-07 10:45     ` Guillaume Le Vaillant
@ 2021-01-07 11:00       ` Pierre Neidhardt
  2021-01-07 11:33         ` Guillaume Le Vaillant
  2021-01-14 21:51       ` Ludovic Courtès
  1 sibling, 1 reply; 43+ messages in thread
From: Pierre Neidhardt @ 2021-01-07 11:00 UTC (permalink / raw)
  To: Guillaume Le Vaillant, Joshua Branson; +Cc: guix-devel

[-- Attachment #1: Type: text/plain, Size: 392 bytes --]

Wow, impressive! :)

Guillaume Le Vaillant <glv@posteo.net> writes:

> Note that the plots only show the results using only 1 thread and

Doesn't 1 thread defeat the purpose of parallel compression / decompression?

> Machine used for the tests:
>  - CPU: Intel i7-3630QM
>  - RAM: 16 MiB

I suppose you meant 16 GiB ;)

Cheers!

-- 
Pierre Neidhardt
https://ambrevar.xyz/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 511 bytes --]

* Re: When substitute download + decompression is CPU-bound
  2021-01-07 11:00       ` Pierre Neidhardt
@ 2021-01-07 11:33         ` Guillaume Le Vaillant
  0 siblings, 0 replies; 43+ messages in thread
From: Guillaume Le Vaillant @ 2021-01-07 11:33 UTC (permalink / raw)
  To: Pierre Neidhardt; +Cc: guix-devel

[-- Attachment #1: Type: text/plain, Size: 643 bytes --]


Pierre Neidhardt <mail@ambrevar.xyz> writes:

> Wow, impressive! :)
>
> Guillaume Le Vaillant <glv@posteo.net> writes:
>
>> Note that the plots only show the results using only 1 thread and
>
> Doesn't 1 thread defeat the purpose of parallel compression / decompression?
>

It was just to get a better idea of the relative compression and
decompression speeds of the algorithms. When using n threads, if the
file is big enough, the speeds are almost multiplied by n and the
compression ratio is a little lower.
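
For a rough feel of that scaling on the 4-thread rows of the table, the sketch below divides the single-thread compression time by the parallel one. The timings are copied from the table; the dictionary layout and names are only illustrative.

```python
# Compression times in seconds, copied from the benchmark table:
# (single-threaded tool, 4-thread counterpart).
TIMES = {
    "gzip -6 vs pigz -6 -p 4": (19.707, 5.378),
    "lzip -6 vs plzip -6 -n 4": (155.740, 45.812),
    "zstd -19 vs pzstd -19 -p 4": (157.008, 48.798),
}


def speedup(single, parallel):
    """Return how many times faster the parallel run was."""
    return single / parallel


for label, (single, parallel) in TIMES.items():
    print(f"{label}: {speedup(single, parallel):.2f}x")
```

On these three rows the speed-up with 4 threads lands between 3x and 4x.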

>> Machine used for the tests:
>>  - CPU: Intel i7-3630QM
>>  - RAM: 16 MiB
>
> I suppose you meant 16 GiB ;)

Yes, of course :)

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 247 bytes --]

* Re: When substitute download + decompression is CPU-bound
  2021-01-07 10:45     ` Guillaume Le Vaillant
  2021-01-07 11:00       ` Pierre Neidhardt
@ 2021-01-14 21:51       ` Ludovic Courtès
  2021-01-14 22:08         ` Nicolò Balzarotti
  2021-01-15  8:10         ` When substitute download + decompression is CPU-bound Pierre Neidhardt
  1 sibling, 2 replies; 43+ messages in thread
From: Ludovic Courtès @ 2021-01-14 21:51 UTC (permalink / raw)
  To: Guillaume Le Vaillant; +Cc: guix-devel

Hi Guillaume,

Guillaume Le Vaillant <glv@posteo.net> writes:

> I compared gzip, lzip and zstd when compressing a 580 MB pack (therefore
> containing "substitutes" for several packages) with different compression
> levels. Maybe the results can be of some use to someone.

It’s insightful, thanks a lot!

One takeaway for me is that zstd decompression remains an order of
magnitude faster than the others, regardless of the compression level.

Another one is that at level 10 and higher zstd achieves compression
ratios that are more in the ballpark of lzip.

If we are to change the compression methods used at ci.guix.gnu.org, we
could use zstd >= 10.

We could also drop gzip, but there are probably pre-1.1 daemons out
there that understand nothing but gzip¹, so perhaps that’ll have to
wait.  Now, compressing substitutes three times may be somewhat
unreasonable.

Thoughts?

Ludo’.

¹ https://guix.gnu.org/en/blog/2020/gnu-guix-1.1.0-released/


* Re: When substitute download + decompression is CPU-bound
  2021-01-14 21:51       ` Ludovic Courtès
@ 2021-01-14 22:08         ` Nicolò Balzarotti
  2021-01-28 17:53           ` Are gzip-compressed substitutes still used? Ludovic Courtès
  2021-01-15  8:10         ` When substitute download + decompression is CPU-bound Pierre Neidhardt
  1 sibling, 1 reply; 43+ messages in thread
From: Nicolò Balzarotti @ 2021-01-14 22:08 UTC (permalink / raw)
  To: Ludovic Courtès, Guillaume Le Vaillant; +Cc: guix-devel

Hi Ludo,

Ludovic Courtès <ludo@gnu.org> writes:

> We could also drop gzip, but there are probably pre-1.1 daemons out
> there that understand nothing but gzip¹, so perhaps that’ll have to
> wait.  Now, compressing substitutes three times may be somewhat
> unreasonable.
>
> Thoughts?
>
Is there a request log where we can check whether this is true?


* Re: When substitute download + decompression is CPU-bound
  2021-01-14 21:51       ` Ludovic Courtès
  2021-01-14 22:08         ` Nicolò Balzarotti
@ 2021-01-15  8:10         ` Pierre Neidhardt
  2021-01-28 17:58           ` Ludovic Courtès
  1 sibling, 1 reply; 43+ messages in thread
From: Pierre Neidhardt @ 2021-01-15  8:10 UTC (permalink / raw)
  To: Ludovic Courtès, Guillaume Le Vaillant; +Cc: guix-devel

[-- Attachment #1: Type: text/plain, Size: 1443 bytes --]

Ludovic Courtès <ludo@gnu.org> writes:

> One takeaway for me is that zstd decompression remains an order of
> magnitude faster than the others, regardless of the compression level.
>
> Another one is that at level 10 and higher zstd achieves compression
> ratios that are more in the ballpark of lzip.

Hmmm, this is roughly true for lzip < level 6, but as soon as lzip hits level 6
(the default!) it compresses up to twice as much!

> If we are to change the compression methods used at ci.guix.gnu.org, we
> could use zstd >= 10.

On Guillaume's graph, the compression speed at the default level 3 is
about 110 MB/s, while at level 10 it's about 40 MB/s, which is
approximately the gzip speed.

If server compression time does not matter, then I agree, level >= 10
would be a good option.

What about zstd level 19 then?  It's as slow as lzip to compress, but
decompresses still blazingly fast, which is what we are trying to
achieve here, _while_ offering a compression ratio in the ballpark of
lzip level 6 (but still not that of lzip level 9).

> We could also drop gzip, but there are probably pre-1.1 daemons out
> there that understand nothing but gzip¹, so perhaps that’ll have to
> wait.  Now, compressing substitutes three times may be somewhat
> unreasonable.

Agreed, maybe release an announcement and give it a few months / 1 year?

Cheers!

-- 
Pierre Neidhardt
https://ambrevar.xyz/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 511 bytes --]

* Are gzip-compressed substitutes still used?
  2021-01-14 22:08         ` Nicolò Balzarotti
@ 2021-01-28 17:53           ` Ludovic Courtès
  2021-03-17 17:12             ` Ludovic Courtès
  0 siblings, 1 reply; 43+ messages in thread
From: Ludovic Courtès @ 2021-01-28 17:53 UTC (permalink / raw)
  To: Nicolò Balzarotti; +Cc: guix-devel

Hi Nicolò,

Nicolò Balzarotti <anothersms@gmail.com> writes:

> Ludovic Courtès <ludo@gnu.org> writes:
>
>> We could also drop gzip, but there are probably pre-1.1 daemons out
>> there that understand nothing but gzip¹, so perhaps that’ll have to
>> wait.  Now, compressing substitutes three times may be somewhat
>> unreasonable.
>>
>> Thoughts?
>>
> Is there a request log where we can check whether this is true?

I finally got around to checking this.

I picked a relatively popular substitute for which the lzip-compressed
variant is smaller than the gzip-compressed variant, and thus modern
‘guix substitute’ chooses lzip over gzip:

--8<---------------cut here---------------start------------->8---
$ wget -q -O - https://ci.guix.gnu.org/7rpj4dmn9g64zqp8vkc0byx93glix2pm.narinfo | head -7
StorePath: /gnu/store/7rpj4dmn9g64zqp8vkc0byx93glix2pm-gtk+-3.24.23
URL: nar/gzip/7rpj4dmn9g64zqp8vkc0byx93glix2pm-gtk%2B-3.24.23
Compression: gzip
FileSize: 13982949
URL: nar/lzip/7rpj4dmn9g64zqp8vkc0byx93glix2pm-gtk%2B-3.24.23
Compression: lzip
FileSize: 7223862
--8<---------------cut here---------------end--------------->8---
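
The selection rule described above (take the smallest variant that the client can decompress) can be sketched in Python. This is only an illustration of the idea, not the code of ‘guix substitute’, which is written in Scheme; the function names are invented for the example.

```python
def parse_nar_variants(narinfo_text):
    """Group the URL/Compression/FileSize triples of a narinfo."""
    variants = []
    for line in narinfo_text.splitlines():
        key, sep, value = line.partition(": ")
        if not sep:
            continue
        if key == "URL":
            # Each "URL" field starts the description of a new variant.
            variants.append({"URL": value})
        elif key in ("Compression", "FileSize") and variants:
            variants[-1][key] = int(value) if key == "FileSize" else value
    return variants


def smallest_variant(variants, supported):
    """Pick the smallest variant among the supported compression methods."""
    candidates = [v for v in variants if v.get("Compression") in supported]
    return min(candidates, key=lambda v: v["FileSize"], default=None)


NARINFO = """\
StorePath: /gnu/store/7rpj4dmn9g64zqp8vkc0byx93glix2pm-gtk+-3.24.23
URL: nar/gzip/7rpj4dmn9g64zqp8vkc0byx93glix2pm-gtk%2B-3.24.23
Compression: gzip
FileSize: 13982949
URL: nar/lzip/7rpj4dmn9g64zqp8vkc0byx93glix2pm-gtk%2B-3.24.23
Compression: lzip
FileSize: 7223862
"""

variants = parse_nar_variants(NARINFO)
# A client that understands both methods vs. one that only knows gzip.
modern_choice = smallest_variant(variants, {"gzip", "lzip"})
old_choice = smallest_variant(variants, {"gzip"})
print(modern_choice["Compression"], old_choice["Compression"])
```

Under this rule, a request for the gzip URL suggests a client that cannot use lzip, which is what the log counts rely on.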

On berlin, I looked at the HTTPS nginx logs and did this:

--8<---------------cut here---------------start------------->8---
ludo@berlin /var/log/nginx$ tail -10000000 < https.access.log > /tmp/sample.log
ludo@berlin /var/log/nginx$ date
Thu 28 Jan 2021 06:18:01 PM CET
ludo@berlin /var/log/nginx$ grep /7rpj4dmn9g64zqp8vkc0byx93glix2pm-gtk < /tmp/sample.log |wc -l
1304
ludo@berlin /var/log/nginx$ grep /gzip/7rpj4dmn9g64zqp8vkc0byx93glix2pm-gtk < /tmp/sample.log |wc -l
17
ludo@berlin /var/log/nginx$ grep /lzip/7rpj4dmn9g64zqp8vkc0byx93glix2pm-gtk < /tmp/sample.log |wc -l
1287
--8<---------------cut here---------------end--------------->8---

The 10M-request sample covers requests from Jan. 10th to now.  Over that
period, 99% of the GTK+ downloads were made as lzip.  We see similar
results with less popular packages and with core packages:

--8<---------------cut here---------------start------------->8---
ludo@berlin /var/log/nginx$ grep /01xi3sig314wgwa1j9sxk37vl816mj74-r-minimal < /tmp/sample.log | wc -l
85
ludo@berlin /var/log/nginx$ grep /gzip/01xi3sig314wgwa1j9sxk37vl816mj74-r-minimal < /tmp/sample.log | wc -l
1
ludo@berlin /var/log/nginx$ grep /lzip/01xi3sig314wgwa1j9sxk37vl816mj74-r-minimal < /tmp/sample.log | wc -l
84
ludo@berlin /var/log/nginx$ grep /0m0vd873jp61lcm4xa3ljdgx381qa782-guile-3.0.2 < /tmp/sample.log |wc -l
1601
ludo@berlin /var/log/nginx$ grep /gzip/0m0vd873jp61lcm4xa3ljdgx381qa782-guile-3.0.2 < /tmp/sample.log |wc -l
8
ludo@berlin /var/log/nginx$ grep /lzip/0m0vd873jp61lcm4xa3ljdgx381qa782-guile-3.0.2 < /tmp/sample.log |wc -l
1593
--8<---------------cut here---------------end--------------->8---

From that, we could deduce that about 1% of our users who take
substitutes from ci.guix are still using a pre-1.1.0 daemon without
support for lzip compression.
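
For reference, the gzip share for each of the three packages can be computed from the counts above with a few lines of Python (the names are only illustrative):

```python
# Request counts from the nginx log sample: (gzip requests, lzip requests).
COUNTS = {
    "gtk+-3.24.23": (17, 1287),
    "r-minimal": (1, 84),
    "guile-3.0.2": (8, 1593),
}


def gzip_share(gzip_requests, lzip_requests):
    """Return the percentage of downloads that used the gzip variant."""
    return 100.0 * gzip_requests / (gzip_requests + lzip_requests)


for package, (gz, lz) in COUNTS.items():
    print(f"{package}: {gzip_share(gz, lz):.1f}% gzip")
```

The three shares fall between roughly 0.5% and 1.3%, hence the "about 1%" estimate.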

I find it surprisingly low: 1.1.0 was released “only” 9 months ago,
which is not a lot for someone used to the long release cycles of
“stable” distros.

It might be underestimated: users running an old daemon probably update
less often and may thus be underrepresented in the substitute logs.


As for whether it’s OK to drop gzip substitutes altogether: I’m not
confident about knowingly breaking 1% or more of the deployed Guixes,
but it’s all about tradeoffs.

Ludo’.


* Re: When substitute download + decompression is CPU-bound
  2021-01-15  8:10         ` When substitute download + decompression is CPU-bound Pierre Neidhardt
@ 2021-01-28 17:58           ` Ludovic Courtès
  2021-01-29  9:45             ` Pierre Neidhardt
  2021-01-29 13:33             ` zimoun
  0 siblings, 2 replies; 43+ messages in thread
From: Ludovic Courtès @ 2021-01-28 17:58 UTC (permalink / raw)
  To: Pierre Neidhardt; +Cc: guix-devel

Pierre Neidhardt <mail@ambrevar.xyz> writes:

> On Guillaume's graph, the compression speed at the default level 3 is
> about 110 MB/s, while at level 10 it's about 40 MB/s, which is
> approximately the gzip speed.
>
> If server compression time does not matter, then I agree, level >= 10
> would be a good option.
>
> What about zstd level 19 then?  It's as slow as lzip to compress, but
> decompresses still blazingly fast, which is what we are trying to
> achieve here, _while_ offering a compression ratio in the ballpark of
> lzip level 6 (but still not that of lzip level 9).

We could do that.  I suppose a possible agenda would be:

  1. Start providing zstd substitutes anytime.  However, most clients
     will keep choosing lzip because it usually compresses better.

  2. After the next release, stop providing lzip substitutes and provide
     only gzip + zstd-19.

This option has the advantage that it wouldn’t break any installation.
It’s not as nice as the ability to choose a download strategy, as we
discussed earlier, but implementing that download strategy sounds
tricky.

Thoughts?

Ludo’.


* Re: When substitute download + decompression is CPU-bound
  2021-01-28 17:58           ` Ludovic Courtès
@ 2021-01-29  9:45             ` Pierre Neidhardt
  2021-01-29 11:23               ` Guillaume Le Vaillant
  2021-01-29 13:33             ` zimoun
  1 sibling, 1 reply; 43+ messages in thread
From: Pierre Neidhardt @ 2021-01-29  9:45 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel

[-- Attachment #1: Type: text/plain, Size: 1837 bytes --]

Hi Ludo!

Ludovic Courtès <ludo@gnu.org> writes:

>> On Guillaume's graph, the compression speed at the default level 3 is
>> about 110 MB/s, while at level 10 it's about 40 MB/s, which is
>> approximately the gzip speed.
>>
>> If server compression time does not matter, then I agree, level >= 10
>> would be a good option.
>>
>> What about zstd level 19 then?  It's as slow as lzip to compress, but
>> decompresses still blazingly fast, which is what we are trying to
>> achieve here, _while_ offering a compression ratio in the ballpark of
>> lzip level 6 (but still not that of lzip level 9).
>
> We could do that.  I suppose a possible agenda would be:
>
>   1. Start providing zstd substitutes anytime.  However, most clients
>      will keep choosing lzip because it usually compresses better.
>
>   2. After the next release, stop providing lzip substitutes and provide
>      only gzip + zstd-19.
>
> This option has the advantage that it wouldn’t break any installation.

But why would we keep gzip since it offers no benefits compared to zstd?
It feels like continuing to carry a (huge) burden forever...

Besides, dropping Lzip seems like a step backward in my opinion.  Users
with lower bandwidth (or simply further away from Berlin) will be
impacted a lot.

I would opt for dropping gzip instead, only to keep zstd-19 and lzip-9
(possibly plzip-9 if we update the bindings).

> It’s not as nice as the ability to choose a download strategy, as we
> discussed earlier, but implementing that download strategy sounds
> tricky.

If the user can choose their favourite substitute compression, I believe
it's usually enough since they are the best judge of their bandwidth /
hardware requirements.

Wouldn't this be simple enough?

-- 
Pierre Neidhardt
https://ambrevar.xyz/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 511 bytes --]

* Re: When substitute download + decompression is CPU-bound
  2021-01-29  9:45             ` Pierre Neidhardt
@ 2021-01-29 11:23               ` Guillaume Le Vaillant
  2021-01-29 11:55                 ` Nicolò Balzarotti
  2021-02-01 22:18                 ` Ludovic Courtès
  0 siblings, 2 replies; 43+ messages in thread
From: Guillaume Le Vaillant @ 2021-01-29 11:23 UTC (permalink / raw)
  To: Pierre Neidhardt; +Cc: guix-devel

[-- Attachment #1: Type: text/plain, Size: 3571 bytes --]

Pierre Neidhardt <mail@ambrevar.xyz> writes:

> Hi Ludo!
>
> Ludovic Courtès <ludo@gnu.org> writes:
>
>> I suppose a possible agenda would be:
>>
>>   1. Start providing zstd substitutes anytime.  However, most clients
>>      will keep choosing lzip because it usually compresses better.
>>
>>   2. After the next release, stop providing lzip substitutes and provide
>>      only gzip + zstd-19.
>>
>> This option has the advantage that it wouldn’t break any installation.
>
> But why would we keep gzip since it offers no benefits compared to zstd?
> It feels like continuing to carry a (huge) burden forever...
>
> Besides, dropping Lzip seems like a step backward in my opinion.  Users
> with lower bandwidth (or simply further away from Berlin) will be
> impacted a lot.
>
> I would opt for dropping gzip instead, only to keep zstd-19 and lzip-9
> (possibly plzip-9 if we update the bindings).
>
>> It’s not as nice as the ability to choose a download strategy, as we
>> discussed earlier, but implementing that download strategy sounds
>> tricky.
>
> If the user can choose their favourite substitute compression, I believe
> it's usually enough since they are the best judge of their bandwidth /
> hardware requirements.
>
> Wouldn't this be simple enough?

Here are a few numbers for the installation time in seconds (download
time + decompression time) when fetching 580 MB of substitutes for
download speeds between 0.5 MB/s and 20 MB/s.

| Download speed | gzip -9 | lzip -9 | zstd -19 |
|----------------+---------+---------+----------|
|            0.5 |     287 |     151 |      181 |
|            1.0 |     144 |      78 |       91 |
|            1.5 |      97 |      54 |       61 |
|            2.0 |      73 |      42 |       46 |
|            2.5 |      59 |      35 |       37 |
|            3.0 |      49 |      30 |       31 |
|            3.5 |      42 |      27 |       26 |
|            4.0 |      37 |      24 |       23 |
|            4.5 |      33 |      22 |       21 |
|            5.0 |      30 |      21 |       19 |
|            5.5 |      28 |      19 |       17 |
|            6.0 |      25 |      18 |       16 |
|            6.5 |      24 |      17 |       14 |
|            7.0 |      22 |      17 |       14 |
|            7.5 |      21 |      16 |       13 |
|            8.0 |      20 |      15 |       12 |
|            8.5 |      18 |      15 |       11 |
|            9.0 |      18 |      14 |       11 |
|            9.5 |      17 |      14 |       10 |
|           10.0 |      16 |      13 |       10 |
|           11.0 |      15 |      13 |        9 |
|           12.0 |      14 |      12 |        8 |
|           13.0 |      13 |      12 |        8 |
|           14.0 |      12 |      11 |        7 |
|           15.0 |      11 |      11 |        7 |
|           16.0 |      11 |      11 |        6 |
|           17.0 |      10 |      10 |        6 |
|           18.0 |      10 |      10 |        6 |
|           19.0 |       9 |      10 |        5 |
|           20.0 |       9 |      10 |        5 |

When the download speed is lower than 3.5 MB/s, Lzip is better, and
above that speed Zstd is better.
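
The table can be approximated with a simple model: installation time = compressed size / download speed + decompression time. Below is a Python sketch of that model. It assumes 1 MB = 10^6 bytes, that download and decompression happen one after the other, and it takes sizes and single-thread decompression times from the earlier benchmark table; the exact inputs used for the table above are not stated, so results may differ from it by a second or two. All names are illustrative.

```python
# Compressed sizes (bytes) and single-thread decompression times (seconds)
# copied from the benchmark table earlier in the thread.
METHODS = {
    "gzip -9": (142803637, 2.964),
    "lzip -9": (72279340, 6.203),
    "zstd -19": (90229137, 0.614),
}


def install_time(method, download_speed_mb):
    """Estimate download + decompression time at the given speed in MB/s."""
    size, decompress_time = METHODS[method]
    return size / (download_speed_mb * 1e6) + decompress_time


def best_method(download_speed_mb):
    """Return the method with the lowest estimated installation time."""
    return min(METHODS, key=lambda m: install_time(m, download_speed_mb))


for speed in (0.5, 3.0, 3.5, 20.0):
    times = {m: round(install_time(m, speed)) for m in METHODS}
    print(speed, times, "->", best_method(speed))
```

With these inputs the model picks lzip at 3.0 MB/s and zstd at 3.5 MB/s, in line with the crossover seen in the table.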

As Gzip is never the best choice, it would make sense to drop it, even
if we have to wait a little until everyone has updated their Guix daemon
to a version with at least Lzip support.

I think there are many people (like me) with a download speed slower
than 3 MB/s, so like Pierre I would prefer keeping "lzip -9" and
"zstd -19".

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 247 bytes --]

* Re: When substitute download + decompression is CPU-bound
  2021-01-29 11:23               ` Guillaume Le Vaillant
@ 2021-01-29 11:55                 ` Nicolò Balzarotti
  2021-01-29 12:13                   ` Pierre Neidhardt
  2021-02-01 22:18                 ` Ludovic Courtès
  1 sibling, 1 reply; 43+ messages in thread
From: Nicolò Balzarotti @ 2021-01-29 11:55 UTC (permalink / raw)
  To: Guillaume Le Vaillant, Pierre Neidhardt; +Cc: guix-devel

Guillaume Le Vaillant <glv@posteo.net> writes:

> Here are a few numbers for the installation time in seconds (download
> time + decompression time) when fetching 580 MB of substitutes for
> download speeds between 0.5 MB/s and 20 MB/s.

Which hardware did you use?  Since you are fixing the download speed,
those results really depend on cpu speed.

> As Gzip is never the best choice, it would make sense to drop it, even
> if we have to wait a little until everyone has updated their Guix daemon

My hypothesis is that this won't be the case on something slow like the
raspberry pi 1.


* Re: When substitute download + decompression is CPU-bound
  2021-01-29 11:55                 ` Nicolò Balzarotti
@ 2021-01-29 12:13                   ` Pierre Neidhardt
  2021-01-29 13:06                     ` Guillaume Le Vaillant
  2021-01-29 14:55                     ` Nicolò Balzarotti
  0 siblings, 2 replies; 43+ messages in thread
From: Pierre Neidhardt @ 2021-01-29 12:13 UTC (permalink / raw)
  To: Nicolò Balzarotti, Guillaume Le Vaillant; +Cc: guix-devel

[-- Attachment #1: Type: text/plain, Size: 489 bytes --]

Nicolò Balzarotti <anothersms@gmail.com> writes:

>> As Gzip is never the best choice, it would make sense to drop it, even
>> if we have to wait a little until everyone has updated their Guix daemon
>
> My hypothesis is that this won't be the case on something slow like the
> raspberry pi 1.

What wouldn't be the case?  If you mean that "gzip is never the best
choice", wouldn't Zstd outperform gzip on the Raspberry Pi 1 too?

-- 
Pierre Neidhardt
https://ambrevar.xyz/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 511 bytes --]

* Re: When substitute download + decompression is CPU-bound
  2021-01-29 12:13                   ` Pierre Neidhardt
@ 2021-01-29 13:06                     ` Guillaume Le Vaillant
  2021-01-29 14:55                     ` Nicolò Balzarotti
  1 sibling, 0 replies; 43+ messages in thread
From: Guillaume Le Vaillant @ 2021-01-29 13:06 UTC (permalink / raw)
  To: Pierre Neidhardt; +Cc: guix-devel, Nicolò Balzarotti

[-- Attachment #1: Type: text/plain, Size: 1323 bytes --]

Nicolò Balzarotti <anothersms@gmail.com> writes:

> Which hardware did you use?  Since you are fixing the download speed,
> those results really depend on cpu speed.

I ran these tests on a laptop from 2012 with an Intel i7-3630QM CPU.

When the CPU speed increases, the download speed limit below which Lzip
is the best choice also increases.
For example, in my test Lzip is the best choice if the download speed
is below 3.5 MB/s. With a CPU running twice as fast, Lzip is the best
choice when the download speed is below 6.5 MB/s.
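
The break-even download speed follows from equating the two totals: Zstd downloads more bytes but saves decompression time, so the speeds match when (extra bytes) / speed = (seconds saved). A small Python sketch, using the "lzip -9" and "zstd -19" rows of the benchmark table and a hypothetical cpu_factor parameter that simply divides the decompression times:

```python
# Compressed sizes (bytes) and decompression times (seconds) from the
# benchmark table: "lzip -9" and "zstd -19".
LZIP_SIZE, LZIP_TIME = 72279340, 6.203
ZSTD_SIZE, ZSTD_TIME = 90229137, 0.614


def break_even_speed(cpu_factor=1.0):
    """Download speed (MB/s) at which lzip and zstd take the same total time.

    cpu_factor is a hypothetical knob: decompression times are divided by it.
    """
    extra_bytes = ZSTD_SIZE - LZIP_SIZE                   # zstd downloads more
    saved_seconds = (LZIP_TIME - ZSTD_TIME) / cpu_factor  # zstd decompresses faster
    return extra_bytes / saved_seconds / 1e6


print(f"{break_even_speed():.1f} MB/s")
print(f"{break_even_speed(2.0):.1f} MB/s")
```

In this model the threshold is proportional to the CPU speed, so doubling the CPU speed doubles it.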


Pierre Neidhardt <mail@ambrevar.xyz> writes:

> Nicolò Balzarotti <anothersms@gmail.com> writes:
>
>>> As Gzip is never the best choice, it would make sense to drop it, even
>>> if we have to wait a little until everyone has updated their Guix daemon
>>
>> My hypothesis is that this won't be the case on something slow like the
>> raspberry pi 1.
>
> What wouldn't be the case?  If you mean that "gzip is never the best
> choice", wouldn't Zstd outperform gzip on the Raspberry Pi 1 too?

I saw a compression benchmark somewhere on the internet (I can't
remember where right now) where Gzip decompression on a Raspberry Pi 2
was around 40 MB/s, and Zstd decompression was around 50 MB/s. I guess
Zstd will also be faster than Gzip on a Raspberry Pi 1.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 247 bytes --]

* Re: When substitute download + decompression is CPU-bound
  2021-01-28 17:58           ` Ludovic Courtès
  2021-01-29  9:45             ` Pierre Neidhardt
@ 2021-01-29 13:33             ` zimoun
  1 sibling, 0 replies; 43+ messages in thread
From: zimoun @ 2021-01-29 13:33 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: Guix Devel

Hi,

On Thu, 28 Jan 2021 at 19:07, Ludovic Courtès <ludo@gnu.org> wrote:

> We could do that.  I suppose a possible agenda would be:
>
>   1. Start providing zstd substitutes anytime.  However, most clients
>      will keep choosing lzip because it usually compresses better.
>
>   2. After the next release, stop providing lzip substitutes and provide
>      only gzip + zstd-19.
>
> This option has the advantage that it wouldn’t break any installation.
> It’s not as nice as the ability to choose a download strategy, as we
> discussed earlier, but implementing that download strategy sounds
> tricky.

I propose announcing at the next release (v1.3) that strategy X will
be dropped in the release after that (v1.4), and explaining the daemon
upgrade and/or pointing to the documentation.

From my understanding (thanks Guillaume for the plots!), X means gzip.
And we should keep lzip-9 (users with a weak network) and zstd-19, as
Pierre and Guillaume are proposing.

So gzip would stay until v1.4, i.e., more or less 1 year (or 1.5 years) more.


All the best,
simon


* Re: When substitute download + decompression is CPU-bound
  2021-01-29 12:13                   ` Pierre Neidhardt
  2021-01-29 13:06                     ` Guillaume Le Vaillant
@ 2021-01-29 14:55                     ` Nicolò Balzarotti
  1 sibling, 0 replies; 43+ messages in thread
From: Nicolò Balzarotti @ 2021-01-29 14:55 UTC (permalink / raw)
  To: Pierre Neidhardt, Guillaume Le Vaillant; +Cc: guix-devel

[-- Attachment #1: Type: text/plain, Size: 571 bytes --]

Pierre Neidhardt <mail@ambrevar.xyz> writes:

> Nicolò Balzarotti <anothersms@gmail.com> writes:
>
> What wouldn't be the case?  If you mean that "gzip is never the best
> choice", wouldn't Zstd outperform gzip on the Raspberry Pi 1 too?

My bad, you are right.  Also, memory usage shouldn't be a problem.  gzip
uses way less (tested on ungoogled-chromium, I get ~16 KB peak heap size
for gzip, 8 MB for zstd and 32 MB for lzip), but I'd expect Guix to be
running on systems with more than 8 MB of memory.  Just for reference,
here's the memory profiling script:


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: profiling --]
[-- Type: text/x-org, Size: 10135 bytes --]

#+PROPERTY: header-args:bash :session valgrind

* NAR Decompression memory benchmark

#+begin_src bash :results none
guix environment --ad-hoc valgrind lzip gzip zstd wget
#+end_src

#+begin_src bash :cache yes
valgrind --version | sed 's/-/ /'
lzip --version | head -1
gunzip --version | head -1 | sed 's/\s(/(/'
zstd --version | sed -e 's/command.*v//' -e 's/,.*//' -e 's/**//'
#+end_src

#+RESULTS[e07cecfd5cc770b7a898408b80678f2e8ea7772e]:
| valgrind     | 3.16.1 |
| lzip         |   1.21 |
| gunzip(gzip) |   1.10 |
| zstd         |  1.4.4 |

Just noticed that there should be a new zstd release ([[https://github.com/facebook/zstd/releases/][zstd 1.4.8]]), and
a new lzip release ([[https://download.savannah.gnu.org/releases/lzip/][lzip 1.22]]).

** Prepare required data 

   #+begin_src bash :cache yes
   wget https://ci.guix.gnu.org/nar/gzip/kfcrrl6p6f6v51jg5rirmq3q067zxih6-ungoogled-chromium-87.0.4280.88-0.b78cb92 -O uc.nar.gz
   wget https://ci.guix.gnu.org/nar/lzip/kfcrrl6p6f6v51jg5rirmq3q067zxih6-ungoogled-chromium-87.0.4280.88-0.b78cb92 -O uc.nar.lz
   #+end_src

   #+RESULTS[ea17e5a54da1ca54a9c82f264912675d9ca981a0]:

   Create zstd compressed file

   #+begin_src bash :results none
   gunzip -c < uc.nar.gz > uc.nar
   zstd -19 uc.nar -o uc.nar.zstd
   #+end_src
   
   Check file sizes

   #+begin_src bash
   ls -lh --sort=size | head -5
   #+end_src

   #+RESULTS:
   | total      | 585M |      |       |      |     |    |       |             |
   | -rw-r--r-- |    1 | nixo | users | 335M | Jan | 29 | 15:14 | uc.nar      |
   | -rw-r--r-- |    1 | nixo | users | 103M | Jan | 29 | 15:13 | uc.nar.gz   |
   | -rw-r--r-- |    1 | nixo | users | 78M  | Jan | 29 | 15:14 | uc.nar.zstd |
   | -rw-r--r-- |    1 | nixo | users | 71M  | Jan | 29 | 15:13 | uc.nar.lz   |


** Decompress

 #+name: massif
 #+begin_src bash :session valgrind :var command="ls" input="." output="/dev/null" name="ls"
 time valgrind --tool=massif --log-file=/dev/null --time-unit=B --trace-children=yes --massif-out-file=$name.massif $command < $input >$output
 #+end_src

 #+call: massif(command="gunzip -c", input="uc.nar.gz", output="/dev/null", name="gzip")

 #+RESULTS:
 | real | 0m8.291s |
 | user | 0m7.910s |
 | sys  | 0m0.201s |

 #+call: massif(command="lzip -d", input="uc.nar.lz", output="/dev/null", name="lzip")

 #+RESULTS:
 | real | 0m22.378s |
 | user | 0m20.959s |
 | sys  | 0m0.345s  |

 #+call: massif(command="zstd -d", input="uc.nar.zstd", output="/dev/null", name="zstd")

 #+RESULTS:
 | real | 0m4.607s |
 | user | 0m4.157s |
 | sys  | 0m0.135s |

** Check massif output

#+begin_src bash :results raw drawer
  for ext in gzip lzip zstd; do
      ms_print $ext.massif > $ext.graph
  done
#+end_src

#+RESULTS:
:results:
:end:

--------------------------------------------------------------------------------
Command:            /gnu/store/378zjf2kgajcfd7mfr98jn5xyc5wa3qv-gzip-1.10/bin/gzip -d -c
Massif arguments:   --time-unit=B --massif-out-file=gzip.massif
ms_print arguments: gzip.massif
--------------------------------------------------------------------------------


    KB
15.59^      #                                                                 
     |      #                                                                 
     |      #                                                                 
     |      #                                                                 
     |      #                                                                 
     |      #                                                                :
     |      #                                                                :
     |      #                                                                :
     |  :   #                                ::         :                    :
     |  :   #                        ::      :          :                    :
     |  :  :#:  ::::: ::  : ::       :   :   :  :       :             @@     :
     |  :  :#:  :: :  ::  : :        :   :   :  :       :             @      :
     |  :  :#:  :: :  ::  : :        :   :   :  :       :             @      :
     |  :  :#:  :: :  ::  : :        :   :   :  :       :           : @      :
     |  :  :#:  :: :  ::  : :        :   :   :  :       :           : @      :
     |  :  :#:  :: : :::  :::     ::::   :   :  :       ::          : @ :    :
     |  :  :#:  :: : :::  ::: ::  :: :   :   : ::::     ::::        : @ :    :
     |  :  :#::::: : :::  ::: :   :: :   ::: : ::: :    :::   :::@@ : @ :::  :
     |  :  :#:: :: : :::::::: :   :: :   :: :: ::: :    :::   :: @ :::@ ::   :
     |  ::::#:: :: : :::: ::: :   :: :   :: :: ::: :    :::   :: @ :::@ :: : :
   0 +----------------------------------------------------------------------->MB
     0                                                                   114.8

Number of snapshots: 53
 Detailed snapshots: [5 (peak), 21, 39, 42, 46, 50]

--------------------------------------------------------------------------------
Command:            lzip -d
Massif arguments:   --time-unit=B --massif-out-file=lzip.massif
ms_print arguments: lzip.massif
--------------------------------------------------------------------------------


    MB
32.09^                                    ################################### 
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
   0 +----------------------------------------------------------------------->MB
     0                                                                   64.18

Number of snapshots: 12
 Detailed snapshots: [6 (peak)]
--------------------------------------------------------------------------------
Command:            zstd -d
Massif arguments:   --time-unit=B --massif-out-file=zstd.massif
ms_print arguments: zstd.massif
--------------------------------------------------------------------------------


    MB
8.665^                                    #                                   
     |                                   :#:::::::::::::::::::::::::::::::::  
     |                                   :#                                   
     |                                   :#                                   
     |                                   :#                                   
     |                                   :#                                   
     |                                   :#                                   
     |                                   :#                                   
     |                                   :#                                   
     |                                   :#                                   
     |                                   :#                                   
     |                                   :#                                   
     |                                   :#                                   
     |                                   :#                                   
     |                                   :#                                   
     |                                   :#                                   
     |                                   :#                                   
     |                                   :#                                   
     |                                   :#                                   
     |                                   :#                                   
   0 +----------------------------------------------------------------------->MB
     0                                                                   17.33

Number of snapshots: 18
 Detailed snapshots: [9 (peak)]
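
The peaks can also be read straight from the massif output files rather
than from the ms_print graphs.  A minimal sketch, assuming the usual
massif output layout in which every snapshot lists mem_heap_B,
mem_heap_extra_B and mem_stacks_B in that order (peak_bytes is a made-up
helper name):

```shell
#!/bin/sh
# peak_bytes FILE: largest heap + heap-extra + stacks total over all
# snapshots recorded in a massif output file.
peak_bytes() {
    awk -F= '
        $1 == "mem_heap_B"       { heap = $2 }
        $1 == "mem_heap_extra_B" { extra = $2 }
        $1 == "mem_stacks_B"     { total = heap + extra + $2
                                   if (total > max) max = total }
        END { print max + 0 }
    ' "$1"
}

for name in gzip lzip zstd; do
    # Only report files that the benchmark actually produced.
    if [ -f "$name.massif" ]; then
        echo "$name: $(peak_bytes "$name.massif") bytes"
    fi
done
```

This only looks at the snapshots massif chose to record, so it should
agree with the peak reported by ms_print but not necessarily with the
true instantaneous maximum.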

* Re: When substitute download + decompression is CPU-bound
  2021-01-29 11:23               ` Guillaume Le Vaillant
  2021-01-29 11:55                 ` Nicolò Balzarotti
@ 2021-02-01 22:18                 ` Ludovic Courtès
  1 sibling, 0 replies; 43+ messages in thread
From: Ludovic Courtès @ 2021-02-01 22:18 UTC (permalink / raw)
  To: Guillaume Le Vaillant; +Cc: guix-devel

Hi,

Guillaume Le Vaillant <glv@posteo.net> skribis:

> Pierre Neidhardt <mail@ambrevar.xyz> skribis:

[...]

>>> It’s not as nice as the ability to choose a download strategy, as we
>>> discussed earlier, but implementing that download strategy sounds
>>> tricky.
>>
>> If the user can choose their favourite substitute compression, I believe
>> it's usually enough since they are the best judge of their bandwidth /
>> hardware requirements.

As should be clear from what Guillaume and Nico posted, it’s pretty hard
to determine whether you need one compression algorithm or the other,
and it changes as you move your laptop around (different networking,
different CPU frequency scaling strategy, etc.).

> Here are a few numbers for the installation time in seconds (download
> time + decompression time) when fetching 580 MB of substitutes for
> download speeds between 0.5 MB/s and 20 MB/s.
>
> | Download speed | gzip -9 | lzip -9 | zstd -19 |
> |----------------+---------+---------+----------|
> |            0.5 |     287 |     151 |      181 |
> |            1.0 |     144 |      78 |       91 |
> |            1.5 |      97 |      54 |       61 |
> |            2.0 |      73 |      42 |       46 |
> |            2.5 |      59 |      35 |       37 |
> |            3.0 |      49 |      30 |       31 |
> |            3.5 |      42 |      27 |       26 |
> |            4.0 |      37 |      24 |       23 |
> |            4.5 |      33 |      22 |       21 |
> |            5.0 |      30 |      21 |       19 |
> |            5.5 |      28 |      19 |       17 |
> |            6.0 |      25 |      18 |       16 |
> |            6.5 |      24 |      17 |       14 |
> |            7.0 |      22 |      17 |       14 |
> |            7.5 |      21 |      16 |       13 |
> |            8.0 |      20 |      15 |       12 |
> |            8.5 |      18 |      15 |       11 |
> |            9.0 |      18 |      14 |       11 |
> |            9.5 |      17 |      14 |       10 |
> |           10.0 |      16 |      13 |       10 |
> |           11.0 |      15 |      13 |        9 |
> |           12.0 |      14 |      12 |        8 |
> |           13.0 |      13 |      12 |        8 |
> |           14.0 |      12 |      11 |        7 |
> |           15.0 |      11 |      11 |        7 |
> |           16.0 |      11 |      11 |        6 |
> |           17.0 |      10 |      10 |        6 |
> |           18.0 |      10 |      10 |        6 |
> |           19.0 |       9 |      10 |        5 |
> |           20.0 |       9 |      10 |        5 |
>
> When the download speed is lower than 3.5 MB/s, Lzip is better, and
> above that speed Zstd is better.
>
> As Gzip is never the best choice, it would make sense to drop it, even
> if we have to wait a little until everyone has updated their Guix daemon
> to a version with at least Lzip support.

Right.  We can drop it eventually, maybe soon since only 1% of our
downloads pick gzip.

> I think there are many people (like me) with a download speed slower
> than 3 MB/s, so like Pierre I would prefer keeping "lzip -9" and
> "zstd -19".

Understood.  To me, that means we need to implement something smart.
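
One possible shape for such a client-side choice, as a sketch only:
estimate download plus decompression time for each available compression
and keep the smallest.  Nothing like pick_compression exists in Guix; the
name, the input format and the zstd timing below are assumptions made for
illustration.

```shell
#!/bin/sh
# pick_compression BANDWIDTH  (bandwidth in MB/s)
# Reads "NAME SIZE_MB DECOMPRESSION_SECONDS" lines on stdin and prints the
# name with the lowest estimated download + decompression time, followed
# by that estimate in seconds.
pick_compression() {
    awk -v bw="$1" '
        NF == 3 {
            t = $2 / bw + $3
            if (best == "" || t < best_t) { best = $1; best_t = t }
        }
        END { if (best != "") printf "%s %.1f\n", best, best_t }
    '
}

# Sizes are those of the ungoogled-chromium nars discussed in this thread;
# the gzip and lzip timings are rounded from the first message, the zstd
# timing is a placeholder.
candidates='gzip 103 2.0
zstd 78 1.0
lzip 71 6.0'

echo "$candidates" | pick_compression 1     # slow link: lzip 77.0
echo "$candidates" | pick_compression 10    # fast link: zstd 8.8
```

The hard part in practice is obtaining trustworthy inputs: bandwidth and
decompression speed both vary over time on the same machine.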

Ludo’.


* Re: Are gzip-compressed substitutes still used?
  2021-01-28 17:53           ` Are gzip-compressed substitutes still used? Ludovic Courtès
@ 2021-03-17 17:12             ` Ludovic Courtès
  2021-03-17 17:33               ` Léo Le Bouter
                                 ` (3 more replies)
  0 siblings, 4 replies; 43+ messages in thread
From: Ludovic Courtès @ 2021-03-17 17:12 UTC (permalink / raw)
  To: guix-devel

[-- Attachment #1: Type: text/plain, Size: 2581 bytes --]

Hi,

Ludovic Courtès <ludo@gnu.org> skribis:

> From that, we could deduce that about 1% of our users who take
> substitutes from ci.guix are still using a pre-1.1.0 daemon without
> support for lzip compression.
>
> I find it surprisingly low: 1.1.0 was released “only” 9 months ago,
> which is not a lot for someone used to the long release cycles of
> “stable” distros.

(See
<https://lists.gnu.org/archive/html/guix-devel/2021-01/msg00378.html>
for the initial message.)

Here’s an update, 1.5 months later.  This time I’m looking at nginx logs
covering Feb 8th to Mar 17th, using a laxer regexp than in the
message above.  Here is the gzip/lzip download ratio for several
packages:

--8<---------------cut here---------------start------------->8---
ludo@berlin ~$ ./nar-download-stats.sh /tmp/sample3.log
gtk%2B-3: gzip/lzip ratio: 37/3255 1%
glib-2: gzip/lzip ratio: 97/8629 1%
coreutils-8: gzip/lzip ratio: 81/2306 3%
python-3: gzip/lzip ratio: 120/7177 1%
r-minimal-[34]: gzip/lzip ratio: 8/302 2%
openmpi-4: gzip/lzip ratio: 19/236 8%
hwloc-2: gzip/lzip ratio: 10/43 23%
gfortran-7: gzip/lzip ratio: 6/225 2%
--8<---------------cut here---------------end--------------->8---

(Script attached.)

The hwloc/openmpi outlier is intriguing.  Is it one HPC site running
an old daemon, or several of them?  Looking more closely, it’s 22
distinct IP addresses on 8 different networks (looking at the first
three octets of the IP address):

--8<---------------cut here---------------start------------->8---
ludo@berlin ~$ grep -E '/gzip/[[:alnum:]]{32}-(hwloc-2|openmpi-4)\.[[:digit:]]+\.[[:digit:]]+ ' < /tmp/sample3.log | cut -f1 -d- | sort -u | wc -l
22
ludo@berlin ~$ grep -E '/gzip/[[:alnum:]]{32}-(hwloc-2|openmpi-4)\.[[:digit:]]+\.[[:digit:]]+ ' < /tmp/sample3.log | cut -f1 -d- | cut -f 1-3 -d. | sort -u | wc -l
8
--8<---------------cut here---------------end--------------->8---
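
Both counts can also be obtained in a single pass over the log.  A
sketch, assuming that each log line starts with an IPv4 client address,
as the cut invocations above imply (count_clients is a made-up name):

```shell
#!/bin/sh
# count_clients PACKAGE-REGEXP LOG-FILE
# Counts distinct client addresses, and distinct networks (first three
# octets), among gzip nar downloads of the matching packages.
count_clients() {
    grep -E "/gzip/[[:alnum:]]{32}-($1)\\.[[:digit:]]+\\.[[:digit:]]+ " < "$2" |
        awk '{
            split($1, octet, ".")
            hosts[$1] = 1
            nets[octet[1] "." octet[2] "." octet[3]] = 1
        }
        END {
            h = 0; for (k in hosts) h++
            n = 0; for (k in nets) n++
            printf "%d hosts on %d networks\n", h, n
        }'
}

# Example: count_clients 'hwloc-2|openmpi-4' /tmp/sample3.log
```

IPv6 clients would need a different notion of "network" and are not
handled here.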

Conclusion?  It still sounds like we can’t reasonably remove gzip
support just yet.

I’d still like to start providing zstd-compressed substitutes though.
So I think what we can do is:

  • start providing zstd substitutes on berlin right now so that when
    1.2.1 comes out, at least some substitutes are available as zstd;

  • when 1.2.1 is announced, announce that gzip substitutes may be
    removed in the future and invite users to upgrade;

  • revisit this issue with an eye on dropping gzip within 6–18 months.

Thoughts?

Ludo’.


[-- Attachment #2: the script --]
[-- Type: text/plain, Size: 605 bytes --]

#!/bin/sh

if [ "$#" -ne 1 ]
then
    # $0 is the script name; $1 would be the (missing) argument.
    echo "Usage: $0 NGINX-LOG-FILE" >&2
    exit 1
fi

set -e

sample="$1"
items="gtk%2B-3 glib-2 coreutils-8 python-3 r-minimal-[34] openmpi-4 hwloc-2 gfortran-7"

for i in $items
do
    # Tweak the regexp so we don't catch ".drv" substitutes as these
    # usually compress better with gzip.
    lzip="$(grep -E "/lzip/[[:alnum:]]{32}-$i\\.[[:digit:]]+(\\.[[:digit:]]+)? " < "$sample" | wc -l)"
    gzip="$(grep -E "/gzip/[[:alnum:]]{32}-$i\\.[[:digit:]]+(\\.[[:digit:]]+)? " < "$sample" | wc -l)"
    echo "$i: gzip/lzip ratio: $gzip/$lzip $(($gzip * 100 / $lzip))%"
done
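
Two details of the final echo are worth keeping in mind: the shell
arithmetic truncates (37/3255 is printed as 1%), and it fails with a
division by zero when a package has no lzip downloads at all, which
aborts the script.  A possible variant, as a sketch (ratio is a made-up
helper, not part of the script above):

```shell
#!/bin/sh
# ratio GZIP-COUNT LZIP-COUNT
# Prints the gzip/lzip ratio as a percentage with one decimal, or "n/a"
# when there are no lzip downloads to compare against.
ratio() {
    awk -v g="$1" -v l="$2" 'BEGIN {
        if (l == 0) { print "n/a"; exit }
        printf "%.1f%%\n", g * 100 / l
    }'
}

ratio 37 3255    # 1.1%
ratio 10 43      # 23.3%
ratio 5 0        # n/a
```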

* Re: Are gzip-compressed substitutes still used?
  2021-03-17 17:12             ` Ludovic Courtès
@ 2021-03-17 17:33               ` Léo Le Bouter
  2021-03-17 18:08                 ` Vagrant Cascadian
  2021-03-17 18:06               ` zimoun
                                 ` (2 subsequent siblings)
  3 siblings, 1 reply; 43+ messages in thread
From: Léo Le Bouter @ 2021-03-17 17:33 UTC (permalink / raw)
  To: Ludovic Courtès, guix-devel

[-- Attachment #1: Type: text/plain, Size: 220 bytes --]

Just as a reminder siding with vagrantc here:

We must ensure the Debian 'guix' package can still work and upgrade
from its installed version, so ensure that removing gzip doesn't break
the initial 'guix pull' with it.

* Re: Are gzip-compressed substitutes still used?
  2021-03-17 17:12             ` Ludovic Courtès
  2021-03-17 17:33               ` Léo Le Bouter
@ 2021-03-17 18:06               ` zimoun
  2021-03-17 18:20               ` Jonathan Brielmaier
  2021-03-18 17:25               ` Pierre Neidhardt
  3 siblings, 0 replies; 43+ messages in thread
From: zimoun @ 2021-03-17 18:06 UTC (permalink / raw)
  To: Ludovic Courtès, guix-devel

Hi,

On Wed, 17 Mar 2021 at 18:12, Ludovic Courtès <ludo@gnu.org> wrote:

> I’d still like to start providing zstd-compressed substitutes though.
> So I think what we can do is:
>
>   • start providing zstd substitutes on berlin right now so that when
>     1.2.1 comes out, at least some substitutes are available as zstd;
>
>   • when 1.2.1 is announced, announce that gzip substitutes may be
>     removed in the future and invite users to upgrade;
>
>   • revisit this issue with an eye on dropping gzip within 6–18 months.

Sounds reasonable.  The full removal could be announced for the 1.4
release, even if we do not know when it will happen. ;-)  That way people
know what to expect.

Cheers,
simon


* Re: Are gzip-compressed substitutes still used?
  2021-03-17 17:33               ` Léo Le Bouter
@ 2021-03-17 18:08                 ` Vagrant Cascadian
  2021-03-18  0:03                   ` zimoun
  2021-03-20 11:23                   ` Ludovic Courtès
  0 siblings, 2 replies; 43+ messages in thread
From: Vagrant Cascadian @ 2021-03-17 18:08 UTC (permalink / raw)
  To: Léo Le Bouter, Ludovic Courtès, guix-devel

[-- Attachment #1: Type: text/plain, Size: 756 bytes --]

On 2021-03-17, Léo Le Bouter wrote:
> Just as a reminder siding with vagrantc here:
>
> We must ensure the Debian 'guix' package can still work and upgrade
> from its installed version, so ensure that removing gzip doesn't break
> initial 'guix pull' with it.

... and I would expect this version to ship in Debian for another ~3-5
years, unless it gets removed from Debian bullseye before the upcoming
(real soon now) release!

But if lzip substitutes are still supported, I *think* guix 1.2.0 as
packaged in Debian still supports that, at least.

Dropping both gzip and lzip would be unfortunate; I don't think it would
be trivial to backport the zstd patches to guix 1.2.0, as it also
depends on guile-zstd?


live well,
  vagrant

* Re: Are gzip-compressed substitutes still used?
  2021-03-17 17:12             ` Ludovic Courtès
  2021-03-17 17:33               ` Léo Le Bouter
  2021-03-17 18:06               ` zimoun
@ 2021-03-17 18:20               ` Jonathan Brielmaier
  2021-03-18 17:25               ` Pierre Neidhardt
  3 siblings, 0 replies; 43+ messages in thread
From: Jonathan Brielmaier @ 2021-03-17 18:20 UTC (permalink / raw)
  To: guix-devel

On 17.03.21 18:12, Ludovic Courtès wrote:
> (See
> <https://lists.gnu.org/archive/html/guix-devel/2021-01/msg00378.html>
> for the initial message.)
>
> Here’s an update, 1.5 months later.  This time I’m looking at nginx logs
> covering Feb 8th to Mar 17th, using a laxer regexp than in the
> message above.  Here is the gzip/lzip download ratio for several
> packages:
>
> --8<---------------cut here---------------start------------->8---
> ludo@berlin ~$ ./nar-download-stats.sh /tmp/sample3.log
> gtk%2B-3: gzip/lzip ratio: 37/3255 1%
> glib-2: gzip/lzip ratio: 97/8629 1%
> coreutils-8: gzip/lzip ratio: 81/2306 3%
> python-3: gzip/lzip ratio: 120/7177 1%
> r-minimal-[34]: gzip/lzip ratio: 8/302 2%
> openmpi-4: gzip/lzip ratio: 19/236 8%
> hwloc-2: gzip/lzip ratio: 10/43 23%
> gfortran-7: gzip/lzip ratio: 6/225 2%
> --8<---------------cut here---------------end--------------->8---

Interesting findings...

> Conclusion?  It still sounds like we can’t reasonably remove gzip
> support just yet.
>
> I’d still like to start providing zstd-compressed substitutes though.
> So I think what we can do is:
>
>    • start providing zstd substitutes on berlin right now so that when
>      1.2.1 comes out, at least some substitutes are available as zstd;
>
>    • when 1.2.1 is announced, announce that gzip substitutes may be
>      removed in the future and invite users to upgrade;

My personal substitute server runs with lzip + zstd, so no gzip. It
works fine, but I didn't have any "legacy" users. That said, zstd is only
0.6% of lzip in terms of total downloads over the last months...


* Re: Are gzip-compressed substitutes still used?
  2021-03-17 18:08                 ` Vagrant Cascadian
@ 2021-03-18  0:03                   ` zimoun
  2021-03-18 16:00                     ` Vagrant Cascadian
  2021-03-20 11:23                   ` Ludovic Courtès
  1 sibling, 1 reply; 43+ messages in thread
From: zimoun @ 2021-03-18  0:03 UTC (permalink / raw)
  To: Vagrant Cascadian, Léo Le Bouter, Ludovic Courtès,
	guix-devel

Hi Vagrant,

On Wed, 17 Mar 2021 at 11:08, Vagrant Cascadian <vagrant@debian.org> wrote:

> ... and I would expect this version to ship in Debian for another ~3-5
> years, unless it gets removed from Debian bullseye before the upcoming
> (real soon now) release!

I might be missing a point.  In 3-5 years, some people will still be
running Debian stable (or maybe oldstable, or maybe this stable will be
LTS), so they will “apt install guix” at 1.2.0, right?  But then there is
no guarantee that Berlin will still serve these 5-year-old binary
substitutes.  But “guix install” falls back to compiling what is missing,
right?  Then the question will be: are the upstream sources still
available?  Assuming that SWH is still alive in that future, all the
git-fetch packages will have their source, whatever the upstream status.
For all the other methods, there is no guarantee.

On the other hand, in that 3-5 years future, after “apt install guix”,
people will not do “guix install”; instead they should do “guix
pull”.  Therefore, the compression of substitutes does not matter that
much, right?

The only strong backward-compatibility requirement seems to be on “guix
pull” rather than on all the substitutes themselves, doesn’t it?  In other
words, at least keep everything necessary for “guix pull” at 1.2.0 to be
able to complete.


Thanks for this opportunity to think on such a time scale. :-)


Cheers,
simon


* Re: Are gzip-compressed substitutes still used?
  2021-03-18  0:03                   ` zimoun
@ 2021-03-18 16:00                     ` Vagrant Cascadian
  2021-03-18 18:53                       ` Leo Famulari
  0 siblings, 1 reply; 43+ messages in thread
From: Vagrant Cascadian @ 2021-03-18 16:00 UTC (permalink / raw)
  To: zimoun, Léo Le Bouter, Ludovic Courtès, guix-devel

[-- Attachment #1: Type: text/plain, Size: 2668 bytes --]

On 2021-03-18, zimoun wrote:
> On Wed, 17 Mar 2021 at 11:08, Vagrant Cascadian <vagrant@debian.org> wrote:
>
>> ... and I would expect this version to ship in Debian for another ~3-5
>> years, unless it gets removed from Debian bullseye before the upcoming
>> (real soon now) release!
>
> I might be missing a point.  In 3-5 years, some people will still be
> running Debian stable (or maybe oldstable, or maybe this stable will be
> LTS), so they will “apt install guix” at 1.2.0, right?  But then there is
> no guarantee that Berlin will still serve these 5-year-old binary
> substitutes.  But “guix install” falls back to compiling what is missing,
> right?

Sure.


> Then the question will be: are the upstream sources still available?
> Assuming that SWH is still alive in that future, all the git-fetch
> packages will have their source, whatever the upstream status.  For
> all the other methods, there is no guarantee.

There is never a guarantee of source availability from third parties;
that is one of the downsides of the Guix approach to source management
vs. Debian's (e.g. all released sources are mirrored on Debian-controlled
infrastructure)... which brings up an interesting aside: could the Debian,
openSUSE, Fedora, etc. archives be treated as a fallback mirror
for upstream tarballs?


> On the other hand, in that 3-5 years future, after “apt install guix”,
> people will not do “guix install”; instead they should do “guix
> pull”.  Therefore, the compression of substitutes does not matter that
> much, right?

Except that issues like the openssl bug, which causes build failures due
to certificate expiry in the test suite, would basically break guix pull
in those cases... maybe that is a deal breaker for the Debian-packaged
guix...


> The only strong backward-compatibility requirement seems to be on “guix
> pull” rather than on all the substitutes themselves, doesn’t it?  In other
> words, at least keep everything necessary for “guix pull” at 1.2.0 to be
> able to complete.

The guix-daemon is still run from the packaged version installed as
/usr/bin/guix-daemon, so it would need to be patched to get updates for new
features and ... in light of https://issues.guix.gnu.org/47229
... security updates!

It is of course possible to configure the system to use an updated
guix-daemon from a user's profile (e.g. as recommended with the
guix-binary installation on a foreign distro), but out of the box it uses
the guix-daemon shipped in the package, which, at least with my Debian hat
on, is how it should be.


> Thanks for this opportunity to think on such a time scale. :-)

Heh. :)


live well,
  vagrant

* Re: Are gzip-compressed substitutes still used?
  2021-03-17 17:12             ` Ludovic Courtès
                                 ` (2 preceding siblings ...)
  2021-03-17 18:20               ` Jonathan Brielmaier
@ 2021-03-18 17:25               ` Pierre Neidhardt
  3 siblings, 0 replies; 43+ messages in thread
From: Pierre Neidhardt @ 2021-03-18 17:25 UTC (permalink / raw)
  To: Ludovic Courtès, guix-devel

[-- Attachment #1: Type: text/plain, Size: 727 bytes --]

Hi Ludo!

On a side note, the following shell incantations

> --8<---------------cut here---------------start------------->8---
> ludo@berlin ~$ grep -E '/gzip/[[:alnum:]]{32}-(hwloc-2|openmpi-4)\.[[:digit:]]+\.[[:digit:]]+ ' < /tmp/sample3.log | cut -f1 -d- | sort -u | wc -l
> 22
> ludo@berlin ~$ grep -E '/gzip/[[:alnum:]]{32}-(hwloc-2|openmpi-4)\.[[:digit:]]+\.[[:digit:]]+ ' < /tmp/sample3.log | cut -f1 -d- | cut -f 1-3 -d. | sort -u | wc -l
> 8
> --8<---------------cut here---------------end--------------->8---

are perfect examples for why it's high time we moved to a better shell
language :D

https://ambrevar.xyz/lisp-repl-shell/index.html

Cheers!

-- 
Pierre Neidhardt
https://ambrevar.xyz/

* Re: Are gzip-compressed substitutes still used?
  2021-03-18 16:00                     ` Vagrant Cascadian
@ 2021-03-18 18:53                       ` Leo Famulari
  0 siblings, 0 replies; 43+ messages in thread
From: Leo Famulari @ 2021-03-18 18:53 UTC (permalink / raw)
  To: Vagrant Cascadian; +Cc: guix-devel

[-- Attachment #1: Type: text/plain, Size: 376 bytes --]

On Thu, Mar 18, 2021 at 09:00:20AM -0700, Vagrant Cascadian wrote:
> Except that issues like the openssl bug, which causes build failures due
> to certificate expiry in the test suite, would basically break guix pull
> in those cases... maybe that is a deal breaker for the Debian-packaged
> guix...

To clarify, this bug was in GnuTLS, not OpenSSL:

<https://bugs.gnu.org/44559>

* Re: Are gzip-compressed substitutes still used?
  2021-03-17 18:08                 ` Vagrant Cascadian
  2021-03-18  0:03                   ` zimoun
@ 2021-03-20 11:23                   ` Ludovic Courtès
  1 sibling, 0 replies; 43+ messages in thread
From: Ludovic Courtès @ 2021-03-20 11:23 UTC (permalink / raw)
  To: Vagrant Cascadian; +Cc: guix-devel

Vagrant Cascadian <vagrant@debian.org> skribis:

> On 2021-03-17, Léo Le Bouter wrote:
>> Just as a reminder siding with vagrantc here:
>>
>> We must ensure the Debian 'guix' package can still work and upgrade
>> from its installed version, so ensure that removing gzip doesn't break
>> initial 'guix pull' with it.
>
> ... and I would expect this version to ship in Debian for another ~3-5
> years, unless it gets removed from Debian bullseye before the upcoming
> (real soon now) release!
>
> But if lzip substitutes are still supported, I *think* guix 1.2.0 as
> packaged in Debian still supports that, at least.
>
> Dropping both gzip and lzip would be unfortunate; I don't think it would
> be trivial to backport the zstd patches to guix 1.2.0, as it also
> depends on guile-zstd?

Indeed.  But don’t worry: we wouldn’t drop both gzip and lzip at once!

Ludo’.


end of thread, other threads:[~2021-03-20 11:24 UTC | newest]

Thread overview: 43+ messages
2020-12-14 22:20 When substitute download + decompression is CPU-bound Ludovic Courtès
2020-12-14 22:29 ` Julien Lepiller
2020-12-14 22:59 ` Nicolò Balzarotti
2020-12-15  7:52   ` Pierre Neidhardt
2020-12-15  9:45     ` Nicolò Balzarotti
2020-12-15  9:54       ` Pierre Neidhardt
2020-12-15 10:03         ` Nicolò Balzarotti
2020-12-15 10:13           ` Pierre Neidhardt
2020-12-15 10:14             ` Pierre Neidhardt
2020-12-15 11:42     ` Ludovic Courtès
2020-12-15 12:31       ` Pierre Neidhardt
2020-12-18 14:59         ` Ludovic Courtès
2020-12-18 15:33           ` Pierre Neidhardt
2020-12-15 11:36   ` Ludovic Courtès
2020-12-15 11:45     ` Nicolò Balzarotti
2020-12-15 10:40 ` Jonathan Brielmaier
2020-12-15 19:43   ` Joshua Branson
2021-01-07 10:45     ` Guillaume Le Vaillant
2021-01-07 11:00       ` Pierre Neidhardt
2021-01-07 11:33         ` Guillaume Le Vaillant
2021-01-14 21:51       ` Ludovic Courtès
2021-01-14 22:08         ` Nicolò Balzarotti
2021-01-28 17:53           ` Are gzip-compressed substitutes still used? Ludovic Courtès
2021-03-17 17:12             ` Ludovic Courtès
2021-03-17 17:33               ` Léo Le Bouter
2021-03-17 18:08                 ` Vagrant Cascadian
2021-03-18  0:03                   ` zimoun
2021-03-18 16:00                     ` Vagrant Cascadian
2021-03-18 18:53                       ` Leo Famulari
2021-03-20 11:23                   ` Ludovic Courtès
2021-03-17 18:06               ` zimoun
2021-03-17 18:20               ` Jonathan Brielmaier
2021-03-18 17:25               ` Pierre Neidhardt
2021-01-15  8:10         ` When substitute download + decompression is CPU-bound Pierre Neidhardt
2021-01-28 17:58           ` Ludovic Courtès
2021-01-29  9:45             ` Pierre Neidhardt
2021-01-29 11:23               ` Guillaume Le Vaillant
2021-01-29 11:55                 ` Nicolò Balzarotti
2021-01-29 12:13                   ` Pierre Neidhardt
2021-01-29 13:06                     ` Guillaume Le Vaillant
2021-01-29 14:55                     ` Nicolò Balzarotti
2021-02-01 22:18                 ` Ludovic Courtès
2021-01-29 13:33             ` zimoun

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git