From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: guix-devel <guix-devel@gnu.org>
Subject: Re: Profiling of man-db database generation with zlib vs zstd
Date: Sat, 26 Mar 2022 23:44:12 -0400 [thread overview]
Message-ID: <87czi8vz3n.fsf@gmail.com> (raw)
In-Reply-To: <87zgldwg87.fsf@gmail.com> (Maxim Cournoyer's message of "Fri, 25 Mar 2022 23:22:00 -0400")
Hello,
Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:
> Hi Ludovic!
>
> Ludovic Courtès <ludo@gnu.org> writes:
>
>> Hi,
>>
>> Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:
>>
>>> scheme@(guix man-db)> ,profile (define a (mandb-entries "/gnu/store/jgc63dxvpd8zq0p8al71x07a02qj8i1b-man-pages-5.13/share/man"))
>>> % cumulative self
>>> time seconds seconds procedure
>>> 20.95 1.98 1.75 gdbm.scm:122:11
>>> 20.95 1.75 1.75 string-tokenize
>>> 19.37 3.61 1.62 set-procedure-property!
>>> 6.72 0.56 0.56 ice-9/eval.scm:604:6
>>> 4.35 0.36 0.36 %read-line
>>> 4.35 0.36 0.36 anon #xa8e0b0
>>> 2.77 0.23 0.23 apply-smob/1
>>> 1.58 11.51 0.13 ice-9/eval.scm:292:11
>>
>> Could it be that man-db.scm is being interpreted, hence eval.scm and
>> ‘set-procedure-property!’ here? (Running ‘make’ doesn’t compile it.)
OK, I've now compiled guix/man-db.scm with -O3, generated a profile with
~5300 manuals (guix shell --no-grafts --pure man-db perl@5.14 man-pages
libx11:doc gnutls-dane:doc my-tcl) on core-updates with my local changes
to compress man pages with zstd and profiled:
--8<---------------cut here---------------start------------->8---
scheme@(guix man-db)> ,profile (define a (mandb-entries "/gnu/store/jp1kjkz5m116r960gvjk1sj4b0fkb0ip-profile/share/man"))
% cumulative self
time seconds seconds procedure
69.37 16.19 16.15 %read-line
14.14 3.31 3.29 string-tokenize
5.76 1.34 1.34 gdbm.scm:122:11
2.44 22.75 0.57 guix/man-db.scm:170:4
1.92 1.71 0.45 make-bytevector
1.57 0.37 0.37 anon #x209d110
0.52 0.14 0.12 open-file
0.35 16.96 0.08 ice-9/rdelim.scm:193:0:read-line
0.35 0.08 0.08 string<?
0.35 0.08 0.08 bytevector->pointer
0.26 0.06 0.06 assv-ref
0.26 0.06 0.06 readdir
0.26 0.06 0.06 equal?
0.26 0.06 0.06 get-bytevector-n!
0.17 0.67 0.04 zstd.scm:234:2:read!
0.17 0.04 0.04 string-append
0.17 0.04 0.04 close-port
0.17 0.04 0.04 bytevector-u64-native-set!
0.17 0.04 0.04 stat
0.17 0.04 0.04 guix/man-db.scm:139:19
0.09 1.83 0.02 zstd.scm:216:0:make-zstd-input-port
0.09 0.33 0.02 ice-9/ftw.scm:445:2:loop
0.09 0.12 0.02 system/foreign.scm:150:0:write-c-struct
0.09 0.04 0.02 system/foreign.scm:167:0:read-c-struct
0.09 0.02 0.02 pointer->bytevector
0.09 0.02 0.02 regexp-exec
0.09 0.02 0.02 string
0.09 0.02 0.02 sizeof
0.09 0.02 0.02 basename
0.09 0.02 0.02 string-contains
0.09 0.02 0.02 lstat
0.09 0.02 0.02 force
0.09 0.02 0.02 anon #x209d0d8
0.00 34249.92 0.00 ice-9/boot-9.scm:220:5:map1
0.00 23.28 0.00 <current input>:3:9
0.00 17.17 0.00 ice-9/ports.scm:429:0:call-with-port
0.00 7.46 0.00 zstd.scm:273:0:call-with-zstd-input-port
0.00 1.34 0.00 anon #x208c5f0
0.00 1.34 0.00 %after-gc-thunk
0.00 0.73 0.00 guix/man-db.scm:129:0:read-synopsis
0.00 0.67 0.00 %read-line
0.00 0.28 0.00 guix/man-db.scm:205:0:mandb-entries
0.00 0.16 0.00 srfi/srfi-1.scm:452:2:fold
0.00 0.16 0.00 anon #x208c57c
0.00 0.16 0.00 guix/build/utils.scm:487:0:find-files
0.00 0.14 0.00 guix/man-db.scm:155:0:man-page->entry
0.00 0.12 0.00 system/foreign.scm:182:0:make-c-struct
0.00 0.08 0.00 sort
0.00 0.04 0.00 filter
0.00 0.04 0.00 close-port
0.00 0.04 0.00 ice-9/boot-9.scm:1971:6
0.00 0.04 0.00 zstd.scm:208:4
0.00 0.04 0.00 guix/build/utils.scm:503:28
0.00 0.04 0.00 string-map
0.00 0.02 0.00 ice-9/boot-9.scm:1689:4:with-exception-handler
0.00 0.02 0.00 system/foreign.scm:187:0:parse-c-struct
0.00 0.02 0.00 guix/build/utils.scm:484:4
---
Sample count: 1146
Total time: 23.282459186 seconds (7.318256931 seconds in GC)
--8<---------------cut here---------------end--------------->8---
It still shows that parsing the files is much more expensive than
decompressing them. This is also true of zlib-compressed manuals;
here's the same experiment on master:
--8<---------------cut here---------------start------------->8---
$ guix shell --no-grafts --pure man-db perl@5.14 man-pages libx11:doc gnutls-dane:doc tcl
[env]$ echo $GUIX_ENVIRONMENT && exit
/gnu/store/qqd7d22wf9d220prkm682yypybpr7df4-profile
$ guix shell -D guix guile-gdbm-ffi guile-zstd
[env]$ guild compile -O3 guix/man-db.scm
`/home/maxim/.cache/guile/ccache/3.0-LE-8-4.6/home/maxim/src/guix-master/guix/man-db.scm.go'
[env]$ ./pre-inst-env guix repl
scheme@(guix-user)> ,m (guix man-db)
scheme@(guix man-db)> ,profile (define a (mandb-entries "/gnu/store/qqd7d22wf9d220prkm682yypybpr7df4-profile/share/man"))
% cumulative self
time seconds seconds procedure
49.15 2.62 2.56 string-tokenize
15.93 0.87 0.83 %read-line
10.85 0.57 0.57 anon #xd67e30
4.75 5.91 0.25 guix/man-db.scm:161:4
3.05 0.16 0.16 open-file
2.03 0.11 0.11 gdbm.scm:122:11
1.36 0.09 0.07 stat
1.36 0.07 0.07 string<?
1.36 0.07 0.07 anon #xd67df8
1.02 1.52 0.05 ice-9/rdelim.scm:193:0:read-line
1.02 0.05 0.05 regexp-exec
1.02 0.05 0.05 lstat
1.02 0.05 0.05 close-port
0.68 0.19 0.04 zlib.scm:158:0:make-gzip-input-port
0.68 0.05 0.04 string->pointer
0.68 0.04 0.04 guix/man-db.scm:134:19
0.68 0.04 0.04 bytevector->pointer
0.68 0.04 0.04 make-custom-binary-input-port
0.68 0.04 0.04 equal?
0.34 0.18 0.02 guix/man-db.scm:150:0:man-page->entry
0.34 0.02 0.02 basename
0.34 0.02 0.02 anon #xd67f10
0.34 0.02 0.02 ice-9/popen.scm:183:0:reap-pipes
0.34 0.02 0.02 zlib.scm:190:33
0.34 0.02 0.02 zlib.scm:66:0
0.00 10838.31 0.00 ice-9/boot-9.scm:220:5:map1
0.00 6.21 0.00 zlib.scm:217:0:call-with-gzip-input-port
0.00 5.22 0.00 <current input>:2:9
0.00 0.60 0.00 %read-line
0.00 0.60 0.00 zlib.scm:99:4
0.00 0.25 0.00 ice-9/ftw.scm:445:2:loop
0.00 0.25 0.00 guix/man-db.scm:196:0:mandb-entries
0.00 0.12 0.00 srfi/srfi-1.scm:452:2:fold
0.00 0.12 0.00 guix/build/utils.scm:476:0:find-files
0.00 0.12 0.00 zlib.scm:87:4
0.00 0.12 0.00 %after-gc-thunk
0.00 0.12 0.00 anon #xd5745c
0.00 0.12 0.00 anon #xd574d0
0.00 0.09 0.00 ice-9/boot-9.scm:1971:6
0.00 0.07 0.00 sort
0.00 0.07 0.00 guix/build/utils.scm:492:28
0.00 0.05 0.00 filter
0.00 0.05 0.00 ice-9/boot-9.scm:1689:4:with-exception-handler
0.00 0.04 0.00 string-map
0.00 0.04 0.00 guix/man-db.scm:200:26
0.00 0.04 0.00 close-port
0.00 0.02 0.00 guix/build/utils.scm:473:4
0.00 0.02 0.00 zlib.scm:138:4
---
Sample count: 295
Total time: 5.217801086 seconds (1.528583927 seconds in GC)
--8<---------------cut here---------------end--------------->8---
Hum, OK, so if I understand guile-zstd causes an almost 5x
slowdown... that doesn't make much sense unless the guile-zstd library
is much slower itself (zstd-decompression should be about 3.5x faster
than that of zlib).
Supposing guile-zstd was a 1:1 equivalent to guile-zlib, then we should
be able to see the above ~5 s time shrink down to around 1.5 s (best
case), which would be nice.
I'm impressed the man-db hook is that fast already though; I seem to
recall it would take like 30 s (or more?) for such a large amount of man
pages on this machine.
Thanks,
Maxim
next prev parent reply other threads:[~2022-03-27 3:44 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-03-22 19:09 Profiling of man-db database generation with zlib vs zstd Maxim Cournoyer
2022-03-24 21:37 ` Ludovic Courtès
2022-03-26 3:22 ` Maxim Cournoyer
2022-03-27 3:44 ` Maxim Cournoyer [this message]
2022-03-29 10:22 ` Ludovic Courtès
2022-03-28 3:49 ` Maxim Cournoyer
2022-03-29 10:30 ` Ludovic Courtès
2022-03-30 14:49 ` Maxim Cournoyer
2022-03-30 16:16 ` Jonathan McHugh
2022-03-31 17:13 ` Ludovic Courtès
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87czi8vz3n.fsf@gmail.com \
--to=maxim.cournoyer@gmail.com \
--cc=guix-devel@gnu.org \
--cc=ludo@gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this external index
https://git.savannah.gnu.org/cgit/guix.git
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.