unofficial mirror of guix-patches@gnu.org 
* [bug#33643] [PATCH] gnu-build-system: Enable xz to decompress in parallel.
@ 2018-12-06  7:56 Christopher Baines
  2018-12-06  8:08 ` Christopher Baines
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Christopher Baines @ 2018-12-06  7:56 UTC (permalink / raw)
  To: 33643

It can take a little while to decompress some packages with large xz
compressed source tar files. xz includes support for parallelism, so enable
this using the parallel job count for the overall derivation.

* guix/build/gnu-build-system.scm (unpack): Set XZ_OPT to pass the -T option
to xz to enable it to work in parallel if appropriate.
---
 guix/build/gnu-build-system.scm | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/guix/build/gnu-build-system.scm b/guix/build/gnu-build-system.scm
index e5f3197b0..9d11e5b1e 100644
--- a/guix/build/gnu-build-system.scm
+++ b/guix/build/gnu-build-system.scm
@@ -147,7 +147,7 @@ chance to be set."
               locale (strerror (system-error-errno args)))
       #t)))
 
-(define* (unpack #:key source #:allow-other-keys)
+(define* (unpack #:key source parallel-build? #:allow-other-keys)
   "Unpack SOURCE in the working directory, and change directory within the
 source.  When SOURCE is a directory, copy it in a sub-directory of the current
 working directory."
@@ -161,6 +161,10 @@ working directory."
         (copy-recursively source "."
                           #:keep-mtime? #t))
       (begin
+        (when parallel-build?
+          (setenv "XZ_OPT"
+                  (format #f "-T~d" (parallel-job-count))))
+
         (if (string-suffix? ".zip" source)
             (invoke "unzip" source)
             (invoke "tar" "xvf" source))
-- 
2.19.2
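
For illustration, the effect of the added lines can be sketched outside of
the build system. This is a minimal sketch, not the patch itself:
`xz-thread-option` and `set-xz-opt!` are hypothetical names, and the `jobs`
argument stands in for the value that `(parallel-job-count)` returns in
guix/build/gnu-build-system.scm. xz reads extra command-line options from
the XZ_OPT environment variable, so a value set this way also reaches the
xz process that tar starts.

```scheme
(use-modules (ice-9 format))   ; needed for the ~d directive

;; Hypothetical helper, not part of the patch: build the string that the
;; patch stores in XZ_OPT.  JOBS stands in for (parallel-job-count).
(define (xz-thread-option jobs)
  (format #f "-T~d" jobs))

;; Hypothetical helper: export XZ_OPT only when PARALLEL-BUILD? is true,
;; mirroring the 'when' form added to the 'unpack' phase.  Return the
;; value that was set, or #f when nothing was set.
(define* (set-xz-opt! #:key (parallel-build? #t) (jobs 1))
  (and parallel-build?
       (let ((value (xz-thread-option jobs)))
         (setenv "XZ_OPT" value)
         value)))
```

Calling `(set-xz-opt! #:jobs 4)` sets XZ_OPT to "-T4" for the current
process and any child process it starts afterwards.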


* [bug#33643] [PATCH] gnu-build-system: Enable xz to decompress in parallel.
  2018-12-06  7:56 [bug#33643] [PATCH] gnu-build-system: Enable xz to decompress in parallel Christopher Baines
@ 2018-12-06  8:08 ` Christopher Baines
  2018-12-06  8:13 ` Leo Famulari
  2020-05-13 18:20 ` Christopher Baines
  2 siblings, 0 replies; 11+ messages in thread
From: Christopher Baines @ 2018-12-06  8:08 UTC (permalink / raw)
  To: 33643

Christopher Baines <mail@cbaines.net> writes:

> It can take a little while to decompress some packages with large xz
> compressed source tar files. xz includes support for parallelism, so enable
> this using the parallel job count for the overall derivation.

I'm guessing this is only suitable for core-updates, as it'll cause a
lot of rebuilds. I'm also not sure if it's worth it, but it does seem to
make building some packages at least start faster.


* [bug#33643] [PATCH] gnu-build-system: Enable xz to decompress in parallel.
  2018-12-06  7:56 [bug#33643] [PATCH] gnu-build-system: Enable xz to decompress in parallel Christopher Baines
  2018-12-06  8:08 ` Christopher Baines
@ 2018-12-06  8:13 ` Leo Famulari
  2018-12-06 19:38   ` Christopher Baines
  2020-05-13 18:20 ` Christopher Baines
  2 siblings, 1 reply; 11+ messages in thread
From: Leo Famulari @ 2018-12-06  8:13 UTC (permalink / raw)
  To: Christopher Baines; +Cc: 33643

On Thu, Dec 06, 2018 at 07:56:15AM +0000, Christopher Baines wrote:
> It can take a little while to decompress some packages with large xz
> compressed source tar files. xz includes support for parallelism, so enable
> this using the parallel job count for the overall derivation.

The xz man page says that multi-threaded decompression isn't implemented
yet, unfortunately.


* [bug#33643] [PATCH] gnu-build-system: Enable xz to decompress in parallel.
  2018-12-06  8:13 ` Leo Famulari
@ 2018-12-06 19:38   ` Christopher Baines
  2018-12-06 21:06     ` Leo Famulari
  0 siblings, 1 reply; 11+ messages in thread
From: Christopher Baines @ 2018-12-06 19:38 UTC (permalink / raw)
  To: Leo Famulari; +Cc: 33643

Leo Famulari <leo@famulari.name> writes:

> On Thu, Dec 06, 2018 at 07:56:15AM +0000, Christopher Baines wrote:
>> It can take a little while to decompress some packages with large xz
>> compressed source tar files. xz includes support for parallelism, so enable
>> this using the parallel job count for the overall derivation.
>
> The xz man page says that multi-threaded decompression isn't implemented
> yet, unfortunately.

Ah, interesting. Having read it myself now, it also says that it:

  "will work on files that contain multiple blocks with size information
   in block headers.  All files compressed in multi-threaded mode meet
   this condition, but files compressed in single-threaded mode don't
   even if --block-size=size is used."

So, if -T was used to compress the data, then it sounds like it'll work
to decompress it. I guess this adds a little more uncertainty to the
benefit of this change, as the impact is dependent on the way the source
data is compressed.
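
One way to see whether a given archive could benefit at all is to count its
blocks. The sketch below assumes the tab-separated output of
`xz --robot --list`, where the line starting with `file` carries the stream
count in its second column and the total block count in its third. The
procedure names are hypothetical, and a block count above one is only a
hint: the man page text quoted above also requires size information in the
block headers, which this check does not inspect.

```scheme
(use-modules (ice-9 popen)
             (ice-9 rdelim))

;; Hypothetical helper: return the block count from one "file" line of
;; `xz --robot --list` output, or #f for any other kind of line.
(define (robot-line-block-count line)
  (let ((fields (string-split line #\tab)))
    (and (>= (length fields) 3)
         (string=? (car fields) "file")
         (string->number (list-ref fields 2)))))

;; Hypothetical wrapper: run xz on ARCHIVE and return its total block
;; count, or #f when no "file" line was printed.  Assumes an 'xz'
;; executable is available in PATH.
(define (xz-block-count archive)
  (let* ((port (open-pipe* OPEN_READ "xz" "--robot" "--list" archive))
         (count (let loop ((line (read-line port))
                           (count #f))
                  (if (eof-object? line)
                      count
                      (loop (read-line port)
                            (or count (robot-line-block-count line)))))))
    (close-pipe port)
    count))
```

An archive for which `xz-block-count` returns 1 was written as a single
block, so passing -T when decompressing it would not be expected to help.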


* [bug#33643] [PATCH] gnu-build-system: Enable xz to decompress in parallel.
  2018-12-06 19:38   ` Christopher Baines
@ 2018-12-06 21:06     ` Leo Famulari
  2018-12-09 14:32       ` Efraim Flashner
  0 siblings, 1 reply; 11+ messages in thread
From: Leo Famulari @ 2018-12-06 21:06 UTC (permalink / raw)
  To: Christopher Baines; +Cc: 33643

On Thu, Dec 06, 2018 at 07:38:21PM +0000, Christopher Baines wrote:
> So, if -T was used to compress the data, then it sounds like it'll work
> to decompress it. I guess this adds a little more uncertainty to the
> benefit of this change, as the impact is dependent on the way the source
> data is compressed.

Right. When parallel decompression is implemented, I think we should
enable it in order to get some benefit from upstream tarballs that may
have been created with multi-threaded compression. 

However, we probably won't be able to use the parallel compression
within Guix because it is apparently not deterministic:

<https://bugs.gnu.org/31015>


* [bug#33643] [PATCH] gnu-build-system: Enable xz to decompress in parallel.
  2018-12-06 21:06     ` Leo Famulari
@ 2018-12-09 14:32       ` Efraim Flashner
  2018-12-10 16:24         ` Leo Famulari
  0 siblings, 1 reply; 11+ messages in thread
From: Efraim Flashner @ 2018-12-09 14:32 UTC (permalink / raw)
  To: Leo Famulari; +Cc: 33643

On Thu, Dec 06, 2018 at 04:06:53PM -0500, Leo Famulari wrote:
> On Thu, Dec 06, 2018 at 07:38:21PM +0000, Christopher Baines wrote:
> > So, if -T was used to compress the data, then it sounds like it'll work
> > to decompress it. I guess this adds a little more uncertainty to the
> > benefit of this change, as the impact is dependent on the way the source
> > data is compressed.
> 
> Right. When parallel decompression is implemented, I think we should
> enable it in order to get some benefit from upstream tarballs that may
> have been created with multi-threaded compression. 
> 
> However, we probably won't be able to use the parallel compression
> within Guix because it is apparently not deterministic:
> 
> <https://bugs.gnu.org/31015>

If the tarball is compressed in parallel then it can be decompressed in
parallel.

As for compressing in parallel, it *might work* to pass it through our
non-bootstrap tar for 'tar --sort=name' and then pass it through xz
-T(pick-a-num).
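
A sketch of that idea, under stated assumptions: the procedure names are
hypothetical, GNU tar 1.28 or later is assumed for `--sort=name`, and the
`--mtime`, `--owner`, `--group` and `--numeric-owner` options are additions
that were not mentioned in this thread. The thread count reaches xz through
XZ_OPT, as in the patch. Whether the resulting file is bit-for-bit identical
across runs is exactly the open question from <https://bugs.gnu.org/31015>,
so this would need to be verified rather than assumed.

```scheme
;; Hypothetical helper: the tar command line for an archive whose member
;; order and metadata do not depend on the file system or the user.
(define (deterministic-tar-arguments output directory)
  (list "tar"
        "--sort=name"              ; stable member order
        "--mtime=@0"               ; assumption: fixed timestamps
        "--owner=0" "--group=0"    ; assumption: fixed ownership
        "--numeric-owner"
        "-cJf" output directory))  ; -J makes tar run xz

;; Hypothetical helper: create OUTPUT from DIRECTORY, letting xz use
;; THREADS threads via XZ_OPT.  Return #t when tar exits with status 0.
(define (create-xz-tarball output directory threads)
  (setenv "XZ_OPT" (string-append "-T" (number->string threads)))
  (eqv? 0 (status:exit-val
           (apply system* (deterministic-tar-arguments output directory)))))
```

Running `create-xz-tarball` twice on the same directory and comparing the
two outputs byte for byte would be a simple first check of determinism.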


-- 
Efraim Flashner   <efraim@flashner.co.il>   אפרים פלשנר
GPG key = A28B F40C 3E55 1372 662D  14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted


* [bug#33643] [PATCH] gnu-build-system: Enable xz to decompress in parallel.
  2018-12-09 14:32       ` Efraim Flashner
@ 2018-12-10 16:24         ` Leo Famulari
  2018-12-10 18:48           ` Efraim Flashner
  0 siblings, 1 reply; 11+ messages in thread
From: Leo Famulari @ 2018-12-10 16:24 UTC (permalink / raw)
  To: Efraim Flashner; +Cc: 33643

On Sun, Dec 09, 2018 at 04:32:01PM +0200, Efraim Flashner wrote:
> If the tarball is compressed in parallel then it can be decompressed in
> parallel.

The xz documentation says that parallel decompression is not
implemented? Is that no longer the case?

> As for compressing in parallel, it *might work* to pass it through our
> non-bootstrap tar for 'tar --sort=name' and then pass it through xz
> -T(pick-a-num).

That could be helpful!


* [bug#33643] [PATCH] gnu-build-system: Enable xz to decompress in parallel.
  2018-12-10 16:24         ` Leo Famulari
@ 2018-12-10 18:48           ` Efraim Flashner
  0 siblings, 0 replies; 11+ messages in thread
From: Efraim Flashner @ 2018-12-10 18:48 UTC (permalink / raw)
  To: Leo Famulari; +Cc: 33643

On Mon, Dec 10, 2018 at 11:24:29AM -0500, Leo Famulari wrote:
> On Sun, Dec 09, 2018 at 04:32:01PM +0200, Efraim Flashner wrote:
> > If the tarball is compressed in parallel then it can be decompressed in
> > parallel.
> 
> The xz documentation says that parallel decompression is not
> implemented? Is that no longer the case?

Looks like I got caught up with the original release notes.
https://git.tukaani.org/?p=xz.git;a=blob;f=NEWS;hb=HEAD#l94
Looks like it's specifically only compression.

> 
> > As for compressing in parallel, it *might work* to pass it through our
> > non-bootstrap tar for 'tar --sort=name' and then pass it through xz
> > -T(pick-a-num).
> 
> That could be helpful!



-- 
Efraim Flashner   <efraim@flashner.co.il>   אפרים פלשנר
GPG key = A28B F40C 3E55 1372 662D  14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted


* [bug#33643] [PATCH] gnu-build-system: Enable xz to decompress in parallel.
  2018-12-06  7:56 [bug#33643] [PATCH] gnu-build-system: Enable xz to decompress in parallel Christopher Baines
  2018-12-06  8:08 ` Christopher Baines
  2018-12-06  8:13 ` Leo Famulari
@ 2020-05-13 18:20 ` Christopher Baines
  2020-05-13 19:07   ` Efraim Flashner
  2 siblings, 1 reply; 11+ messages in thread
From: Christopher Baines @ 2020-05-13 18:20 UTC (permalink / raw)
  To: 33643; +Cc: Efraim Flashner, Leo Famulari

Christopher Baines <mail@cbaines.net> writes:

> It can take a little while to decompress some packages with large xz
> compressed source tar files. xz includes support for parallelism, so enable
> this using the parallel job count for the overall derivation.

It's been a long long while, but now that core-updates has recently been
merged, I'd like to try and take a look at this again.

I think the consensus was that this will only help for xz compressed
files where they have been compressed in parallel. I think it's still
worth doing though, as some of the big xz files that need decompressing
have been compressed in parallel, and this will speed up the builds when
multiple cores are available.

Thanks,

Chris


* [bug#33643] [PATCH] gnu-build-system: Enable xz to decompress in parallel.
  2020-05-13 18:20 ` Christopher Baines
@ 2020-05-13 19:07   ` Efraim Flashner
  2020-05-14  7:37     ` bug#33643: " Christopher Baines
  0 siblings, 1 reply; 11+ messages in thread
From: Efraim Flashner @ 2020-05-13 19:07 UTC (permalink / raw)
  To: Christopher Baines; +Cc: 33643, Leo Famulari

On Wed, May 13, 2020 at 07:20:08PM +0100, Christopher Baines wrote:
> It's been a long long while, but now that core-updates has recently been
> merged, I'd like to try and take a look at this again.
> 
> I think the consensus was that this will only help for xz compressed
> files where they have been compressed in parallel. I think it's still
> worth doing though, as some of the big xz files that need decompressing
> have been compressed in parallel, and this will speed up the builds when
> multiple cores are available.
> 
> Thanks,
> 
> Chris

I thought the last time we looked into this we figured out that there
was a mistake in release notes or something and that parallel
decompression isn't actually supported.

-- 
Efraim Flashner   <efraim@flashner.co.il>   אפרים פלשנר
GPG key = A28B F40C 3E55 1372 662D  14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted


* bug#33643: [PATCH] gnu-build-system: Enable xz to decompress in parallel.
  2020-05-13 19:07   ` Efraim Flashner
@ 2020-05-14  7:37     ` Christopher Baines
  0 siblings, 0 replies; 11+ messages in thread
From: Christopher Baines @ 2020-05-14  7:37 UTC (permalink / raw)
  To: Efraim Flashner; +Cc: 33643-done

Efraim Flashner <efraim@flashner.co.il> writes:

> On Wed, May 13, 2020 at 07:20:08PM +0100, Christopher Baines wrote:
>> It's been a long long while, but now that core-updates has recently been
>> merged, I'd like to try and take a look at this again.
>>
>> I think the consensus was that this will only help for xz compressed
>> files where they have been compressed in parallel. I think it's still
>> worth doing though, as some of the big xz files that need decompressing
>> have been compressed in parallel, and this will speed up the builds when
>> multiple cores are available.
>>
>> Thanks,
>>
>> Chris
>
> I thought the last time we looked into this we figured out that there
> was a mistake in release notes or something and that parallel
> decompression isn't actually supported.

Hmm, I had a look to see if I could find some examples of where this
would apply, but I couldn't find any xz archives that we use in Guix
that have been compressed in a way that allows multithreaded
decompression...

I'm pretty sure I had some examples before, but maybe something's changed
in the intervening year.

Anyway, if I discover this again, I'll actually make a note of where
it's applicable.
