From: "Ludovic Courtès" <ludo@gnu.org>
To: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Cc: guix-devel <guix-devel@gnu.org>
Subject: Re: Profiling of man-db database generation with zlib vs zstd
Date: Tue, 29 Mar 2022 12:30:14 +0200
Message-ID: <87czi5126h.fsf@gnu.org>
In-Reply-To: <87o81qviqg.fsf@gmail.com> (Maxim Cournoyer's message of "Sun, 27 Mar 2022 23:49:59 -0400")
Hi!
Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:
> You'll need to generate the tar.zst and tar.gz yourself, but the script
> that was used is:
>
> ;; decompress-zstd.scm
> (use-modules (ice-9 binary-ports)
>              (ice-9 match)
>              (statprof)
>              (zstd))
>
> (define MiB (expt 2 20))
> (define input-file "/tmp/chromium-98.0.4758.102.tar.zst")
> (define output-file "/dev/null")
>
> (define (decompression-test)
>   (call-with-input-file input-file
>     (lambda (port)
>       (call-with-zstd-input-port port
>         (lambda (input)
>           (call-with-output-file output-file
>             (lambda (output)
>               (let loop ((bv (get-bytevector-n input (* 4 MiB))))
>                 (match bv
>                   ((? eof-object?)
>                    #t)
>                   (bv
>                    (put-bytevector output bv)
>                    (loop (get-bytevector-n input (* 4 MiB)))))))))))))
To isolate the problem, you could allocate the 4 MiB buffer outside of
the loop and use ‘get-bytevector-n!’, and also remove code that writes
to ‘output’.
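Something along these lines might do as a starting point. It is only a
sketch: ‘drain-port’ and ‘drain-compressed-file’ are names made up here,
and WRAP is assumed to have the same calling convention as
‘call-with-zstd-input-port’ (a port and a one-argument procedure). The
buffer is allocated once and nothing is written anywhere:

```scheme
(use-modules (ice-9 binary-ports)
             (rnrs bytevectors))

(define MiB (expt 2 20))

;; Read PORT until end-of-file into BUFFER, reusing BUFFER for every
;; read and discarding the data.  Return the total number of bytes read.
(define (drain-port port buffer)
  (let loop ((total 0))
    (let ((n (get-bytevector-n! port buffer 0 (bytevector-length buffer))))
      (if (eof-object? n)
          total
          (loop (+ total n))))))

;; Open FILE, wrap its port with WRAP (hypothetical parameter: any
;; procedure taking a port and a one-argument procedure), and drain the
;; resulting port through a single 4 MiB buffer allocated up front.
(define (drain-compressed-file file wrap)
  (let ((buffer (make-bytevector (* 4 MiB))))
    (call-with-input-file file
      (lambda (port)
        (wrap port
              (lambda (input)
                (drain-port input buffer)))))))
```

With (zstd) loaded, this would be called as (drain-compressed-file
"/tmp/chromium-98.0.4758.102.tar.zst" call-with-zstd-input-port).
Passing the wrapper as an argument should let the same loop be reused
for the gzip case, assuming guile-zlib's wrapper follows the same
convention.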
> This confirms that guile-zstd is not noticeably faster than guile-zlib,
> which is unexpected.
Uh, surprising.
Note that ‘statprof’ incurs overhead, so in general if you want timings,
get them without ‘statprof’.
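For instance, a small wall-clock helper could look like the sketch
below. ‘call-with-timing’ is a made-up name, and it assumes the thunk
returns a single value; it relies only on ‘get-internal-real-time’ and
‘internal-time-units-per-second’ from core Guile:

```scheme
;; Call THUNK once and return two values: its result and the elapsed
;; wall-clock time in seconds.  THUNK is assumed to return one value.
(define (call-with-timing thunk)
  (let* ((start (get-internal-real-time))
         (result (thunk))
         (end (get-internal-real-time)))
    (values result
            (exact->inexact (/ (- end start)
                               internal-time-units-per-second)))))

;; Example: time a trivial computation and print the elapsed seconds.
(call-with-values
    (lambda () (call-with-timing (lambda () (expt 2 100))))
  (lambda (result seconds)
    (format #t "result: ~a, wall: ~a s~%" result seconds)))
```

The zero-argument ‘decompression-test’ from your script could be passed
directly as the thunk.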
> Compare to the command line tools:
>
> $ time+ zstd -cdk /tmp/chromium-98.0.4758.102.tar.zst > /dev/null
> cpu: 99%, mem: 10548 KiB, wall: 0:09.37, sys: 0.30, usr: 9.05
>
> $ time+ gunzip -ck /tmp/chromium-98.0.4758.102.tar.gz > /dev/null
> cpu: 99%, mem: 2908 KiB, wall: 0:22.29, sys: 0.31, usr: 21.98
>
> where zstd is about 2.3x faster.
>
> It's unfortunate that the bulk of the time is spent in "anon" (anonymous
> proc?), which doesn't say much.
It’s likely one of the lambdas.
> Perhaps I should open an issue with the guile-zstd project.
Yes, or we can continue here. :-)
From there I think we should first fully isolate the thing we’re
measuring, as discussed above, to gain confidence.
If the code using guile-zstd is slower than the CLI, then it could be
that guile-zstd doesn’t initialize the library properly, or that it gets
buffering wrong or something.
I’ll see if I can give it a try too.
Thanks for investigating!
Ludo’.