unofficial mirror of guix-patches@gnu.org 
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
To: 43516@debbugs.gnu.org
Cc: Leo Famulari <leo@famulari.name>
Subject: [bug#43516]
Date: Mon, 21 Sep 2020 22:00:02 -0400
Message-ID: <20200922020003.6954-1-maxim.cournoyer@gmail.com>
In-Reply-To: <20200919193805.GA31344@jasmine.lan>


Hi Leo!

> On Sat, Sep 19, 2020 at 01:03:57PM -0400, Maxim Cournoyer wrote:
> > The xz compression is slow; using multiple threads/cores yields a linear
> > performance improvement.
> >
> > * guix/packages.scm (patch-and-repack): Ensure xz is invoked with --threads=N
> > by setting the XZ_DEFAULTS environment variable.

> We tried this previously but reverted it because the archives were not
> bit-reproducible:

> https://git.savannah.gnu.org/cgit/guix.git/commit/?id=3e95125e9bd0676d4a9add9105217ad3eaef3ff0

Thanks for bringing this to my attention!  I've studied what others have done
about it and found, on the OpenEmbedded mailing list, a solution that seems to
work well [0].  Debian uses something similar in dpkg [1].

The important point is that xz produces reproducible results as long as it
always operates in the same mode, either single-threaded or multi-threaded
(the two modes produce different output, so we can't switch from one to the
other reproducibly).  The following v2 patch therefore always uses at least
--threads=2, which forces xz onto its multi-threaded code path.  The
--memlimit=50% argument caps the RAM used by xz at half of the available
memory; xz lowers the number of threads when needed to stay under that limit.
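To make the idea concrete, here is a minimal shell sketch of how such an
XZ_DEFAULTS value could be computed.  It is not the code from the patch (which
lives in guix/packages.scm); the helper name xz_defaults is hypothetical and
the core count is passed in as an argument for illustration.

```shell
# Hypothetical helper (not from the patch): print an XZ_DEFAULTS value
# that never requests fewer than two threads, so that xz always stays
# on its multi-threaded code path.
xz_defaults() {
    cores=$1
    if [ "$cores" -lt 2 ]; then
        threads=2          # force multi-threaded mode on single-core hosts
    else
        threads=$cores
    fi
    printf '%s\n' "--memlimit=50% --threads=$threads"
}

# Possible use, assuming nproc and tar are available:
#   XZ_DEFAULTS=$(xz_defaults "$(nproc)") tar -cJf source.tar.xz source/
```

xz reads XZ_DEFAULTS from the environment, so the options also apply when xz
is started indirectly, for instance by tar.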

I've rebuilt the world on core-updates to test this and got impressive
results, for example when building the linux-libre source with 24 cores
instead of 1:

$ time guix build --source linux-libre --check

With this change, on a 24-core/32 GiB system: 24 cores used, 2.9 GiB max
memory used, 36.76 s.
On master (same machine): 1 core used, 95 MiB max memory used, 4 m 10 s.

[0]  https://patchwork.openembedded.org/patch/170475/
[1]  https://sources.debian.org/src/dpkg/1.19.7/lib/dpkg/compress.c/#L566-L574

> It's really a shame... it would be nice to reduce the time used for XZ
> compression.

Seems we can have our cake and eat it, too!

Maxim





Thread overview: 6+ messages
2020-09-19 17:03 [bug#43516] [PATCH core-updates] packages: Enable multi-threaded xz compression when repacking source Maxim Cournoyer
2020-09-19 19:38 ` Leo Famulari
2020-09-22  2:00   ` Maxim Cournoyer [this message]
2020-09-22  2:00     ` [bug#43516] [PATCH core-updates v2] " Maxim Cournoyer
2020-09-22 15:19     ` [bug#43516] your mail Leo Famulari
2020-10-09  2:17       ` bug#43516: " Maxim Cournoyer
