* Parallel (de-)compression
@ 2015-12-02 18:45 Andreas Enge
2015-12-02 18:52 ` Andreas Enge
2015-12-04 14:44 ` Ludovic Courtès
0 siblings, 2 replies; 6+ messages in thread
From: Andreas Enge @ 2015-12-02 18:45 UTC (permalink / raw)
To: guix-devel
Hello,
on my relatively slow ARM build machine with relatively fast storage (SSD),
I notice that often there is an xz process taking 100% of CPU, while there
is never more than 20MB/s written to disk. For instance, texlive-texmf
takes a very long time to build and install into the store.
Would it make sense to switch to a parallel (de-)compression tool to leverage
higher numbers of cores? We have pbzip2 already in Guix, which is compatible
with bzip2.
On the downside, bzip2 compresses less well than xz, so we would increase
the size of our packages and the bandwidth needed to transfer them. So maybe
this is not worth it, since we could also build more packages in parallel.
Or are there parallel implementations of xz?
Andreas
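
[Editor's note] On the closing question: as far as I know, xz 5.2 and later accept a -T/--threads option, and the usual way to parallelize xz is to split the input into blocks that are compressed independently. The sketch below illustrates that idea with Python's standard lzma module; it is not how Guix or any particular tool does it. CHUNK_SIZE, compress_chunk and parallel_xz are hypothetical names, and the 1 MiB chunk size is an arbitrary illustrative value. It relies on CPython releasing the GIL inside lzma calls, so threads can use several cores.

```python
import lzma
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 1 << 20  # 1 MiB per independent stream (illustrative value)


def compress_chunk(chunk: bytes) -> bytes:
    """Compress one chunk as a complete, standalone .xz stream."""
    return lzma.compress(chunk, format=lzma.FORMAT_XZ, preset=6)


def parallel_xz(data: bytes, workers: int = 4) -> bytes:
    """Compress fixed-size chunks concurrently and concatenate the streams.

    Concatenated .xz streams are still valid input for a decompressor.
    """
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    if not chunks:
        # Empty input: emit a single empty stream.
        return lzma.compress(b"", format=lzma.FORMAT_XZ)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so the output order is correct.
        return b"".join(pool.map(compress_chunk, chunks))


if __name__ == "__main__":
    payload = b"texlive-texmf sample line\n" * 200_000
    single = lzma.compress(payload, preset=6)
    chunked = parallel_xz(payload)
    assert lzma.decompress(chunked) == payload
    print(len(single), len(chunked))
```

Because each chunk starts with an empty dictionary, the chunked output is usually somewhat larger than a single stream; that is the price paid for using several cores.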
* Re: Parallel (de-)compression
2015-12-02 18:45 Parallel (de-)compression Andreas Enge
@ 2015-12-02 18:52 ` Andreas Enge
2015-12-02 19:18 ` Efraim Flashner
2015-12-04 14:44 ` Ludovic Courtès
1 sibling, 1 reply; 6+ messages in thread
From: Andreas Enge @ 2015-12-02 18:52 UTC (permalink / raw)
To: guix-devel
How about this:
http://anthon.home.xs4all.nl/rants/2013/parallel_xz/
?
Andreas
* Re: Parallel (de-)compression
2015-12-02 18:45 Parallel (de-)compression Andreas Enge
2015-12-02 18:52 ` Andreas Enge
@ 2015-12-04 14:44 ` Ludovic Courtès
2015-12-06 15:31 ` Andreas Enge
1 sibling, 1 reply; 6+ messages in thread
From: Ludovic Courtès @ 2015-12-04 14:44 UTC (permalink / raw)
To: Andreas Enge; +Cc: guix-devel
Andreas Enge <andreas@enge.fr> skribis:
> on my relatively slow ARM build machine with relatively fast storage (SSD),
> I notice that often there is an xz process taking 100% of CPU, while there
> is never more than 20MB/s written to disk. For instance, texlive-texmf
> takes a very long time to build and install into the store.
Are you saying that xz-compressing TeX Live to resend it to
hydra.gnu.org is too CPU-intensive?
> Would it make sense to switch to a parallel (de-)compression tool to leverage
> higher numbers of cores? We have pbzip2 already in Guix, which is compatible
> with bzip2.
Bzip2 provides a CPU/compression ratio tradeoff that is not as good as
xz, so I would avoid it.
Another option would be to trade compression ratio for reduced CPU usage
by using, say, ‘xz -2’ or ‘gzip’.
We did something similar in 5ef9d7d to reduce CPU consumption on the
front-end. Usually it’s much less important to reduce CPU consumption
on the build machines, but your experience seems to suggest otherwise.
Thoughts?
Ludo’.
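
[Editor's note] The trade-off described above (compression ratio versus CPU time for 'xz -2', 'gzip' and bzip2) can be measured rather than guessed. The following is a minimal sketch using Python's standard library codecs, on the assumption that the lzma presets behave like the corresponding xz command-line presets. The names measure and compare are hypothetical, and the sample data is a stand-in, not a real store item.

```python
import bz2
import gzip
import lzma
import time


def measure(name, compress, data):
    """Return (name, CPU seconds used, compressed size / original size)."""
    start = time.process_time()
    packed = compress(data)
    cpu = time.process_time() - start
    return name, cpu, len(packed) / len(data)


def compare(data: bytes):
    """Run each candidate compressor once over the same input."""
    candidates = [
        ("xz -6", lambda d: lzma.compress(d, preset=6)),
        ("xz -2", lambda d: lzma.compress(d, preset=2)),
        ("gzip -6", lambda d: gzip.compress(d, compresslevel=6)),
        ("bzip2 -9", lambda d: bz2.compress(d, compresslevel=9)),
    ]
    return [measure(name, fn, data) for name, fn in candidates]


if __name__ == "__main__":
    sample = b"/gnu/store/sample-file-contents\n" * 100_000
    for name, cpu, ratio in compare(sample):
        print(f"{name:10s} cpu={cpu:.3f}s ratio={ratio:.4f}")
```

Results depend heavily on the input, so the comparison is only meaningful when run on data resembling the real build outputs, and on the machine in question.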
* Re: Parallel (de-)compression
2015-12-04 14:44 ` Ludovic Courtès
@ 2015-12-06 15:31 ` Andreas Enge
2015-12-06 22:21 ` Ludovic Courtès
0 siblings, 1 reply; 6+ messages in thread
From: Andreas Enge @ 2015-12-06 15:31 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guix-devel
On Fri, Dec 04, 2015 at 03:44:38PM +0100, Ludovic Courtès wrote:
> Are you saying that xz-compressing TeX Live to resend it to
> hydra.gnu.org is too CPU-intensive?
That depends on your definition of "too". In any case, on the Novena board
with an SSD attached, CPU is the limiting factor during this phase,
pushing the CPU load on one core to 100% (while the other cores are idle).
> Another option would be to trade compression ratio for reduced CPU usage
> by using, say, ‘xz -2’ or ‘gzip’.
> We did something similar in 5ef9d7d to reduce CPU consumption on the
> front-end. Usually it’s much less important to reduce CPU consumption
> on the build machines, but your experience seems to suggest otherwise.
If possible, it would be more interesting to leverage the several cores
and not make sacrifices on the compression quality. Note also that I have
not measured the individual timings. Compression is, I think, done separately
from sending the compressed file, so it is entirely possible that the
data transfer takes longer than the compression.
Andreas
* Re: Parallel (de-)compression
2015-12-06 15:31 ` Andreas Enge
@ 2015-12-06 22:21 ` Ludovic Courtès
0 siblings, 0 replies; 6+ messages in thread
From: Ludovic Courtès @ 2015-12-06 22:21 UTC (permalink / raw)
To: Andreas Enge; +Cc: guix-devel
Andreas Enge <andreas@enge.fr> skribis:
> On Fri, Dec 04, 2015 at 03:44:38PM +0100, Ludovic Courtès wrote:
>> Are you saying that xz-compressing TeX Live to resend it to
>> hydra.gnu.org is too CPU-intensive?
>
> That depends on your definition of "too". In any case, on the Novena board
> with an SSD attached, CPU is the limiting factor during this phase,
> pushing the CPU load on one core to 100% (while the other cores are idle).
OK.
>> Another option would be to trade compression ratio for reduced CPU usage
>> by using, say, ‘xz -2’ or ‘gzip’.
>> We did something similar in 5ef9d7d to reduce CPU consumption on the
>> front-end. Usually it’s much less important to reduce CPU consumption
>> on the build machines, but your experience seems to suggest otherwise.
>
> If possible, it would be more interesting to leverage the several cores
> and not make sacrifices on the compression quality.
Using several cores is not necessarily the best way to increase throughput:
the build machine may be busy building other things, and thus unable to
dedicate all its cores to compression.
Dunno.
Ludo’.
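
[Editor's note] One way to reconcile the two positions would be to size the compression thread pool from the machine's current load, so that compression only takes cores that builds are not using. This is a hypothetical heuristic, not something proposed in the thread. The names pick_workers and current_workers are made up for illustration, and os.getloadavg() is only available on Unix-like systems.

```python
import math
import os


def pick_workers(cpu_count: int, load_1min: float) -> int:
    """Use the cores not already busy, but always at least one."""
    idle = cpu_count - math.ceil(load_1min)
    return max(1, idle)


def current_workers() -> int:
    """Apply the heuristic to this machine's 1-minute load average."""
    return pick_workers(os.cpu_count() or 1, os.getloadavg()[0])


if __name__ == "__main__":
    print(current_workers())
```

On a four-core board with one build running (a load of about 1), this would leave three threads for compression; under full load it falls back to a single thread, which matches today's behaviour.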