all messages for Guix-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: guix-devel <guix-devel@gnu.org>
Subject: Re: Profiling of man-db database generation with zlib vs zstd
Date: Sat, 26 Mar 2022 23:44:12 -0400	[thread overview]
Message-ID: <87czi8vz3n.fsf@gmail.com> (raw)
In-Reply-To: <87zgldwg87.fsf@gmail.com> (Maxim Cournoyer's message of "Fri, 25 Mar 2022 23:22:00 -0400")

Hello,

Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

> Hi Ludovic!
>
> Ludovic Courtès <ludo@gnu.org> writes:
>
>> Hi,
>>
>> Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:
>>
>>> scheme@(guix man-db)> ,profile (define a (mandb-entries "/gnu/store/jgc63dxvpd8zq0p8al71x07a02qj8i1b-man-pages-5.13/share/man"))
>>> %     cumulative   self
>>> time   seconds     seconds  procedure
>>>  20.95      1.98      1.75  gdbm.scm:122:11
>>>  20.95      1.75      1.75  string-tokenize
>>>  19.37      3.61      1.62  set-procedure-property!
>>>   6.72      0.56      0.56  ice-9/eval.scm:604:6
>>>   4.35      0.36      0.36  %read-line
>>>   4.35      0.36      0.36  anon #xa8e0b0
>>>   2.77      0.23      0.23  apply-smob/1
>>>   1.58     11.51      0.13  ice-9/eval.scm:292:11
>>
>> Could it be that man-db.scm is being interpreted, hence eval.scm and
>> ‘set-procedure-property!’ here?  (Running ‘make’ doesn’t compile it.)

OK, I've now compiled guix/man-db.scm with -O3, generated a profile with
~5300 manuals (guix shell --no-grafts --pure man-db perl@5.14 man-pages
libx11:doc gnutls-dane:doc my-tcl) on core-updates with my local changes
to compress man pages with zstd and profiled:

--8<---------------cut here---------------start------------->8---
scheme@(guix man-db)> ,profile (define a (mandb-entries "/gnu/store/jp1kjkz5m116r960gvjk1sj4b0fkb0ip-profile/share/man"))
%     cumulative   self
time   seconds     seconds  procedure
 69.37     16.19     16.15  %read-line
 14.14      3.31      3.29  string-tokenize
  5.76      1.34      1.34  gdbm.scm:122:11
  2.44     22.75      0.57  guix/man-db.scm:170:4
  1.92      1.71      0.45  make-bytevector
  1.57      0.37      0.37  anon #x209d110
  0.52      0.14      0.12  open-file
  0.35     16.96      0.08  ice-9/rdelim.scm:193:0:read-line
  0.35      0.08      0.08  string<?
  0.35      0.08      0.08  bytevector->pointer
  0.26      0.06      0.06  assv-ref
  0.26      0.06      0.06  readdir
  0.26      0.06      0.06  equal?
  0.26      0.06      0.06  get-bytevector-n!
  0.17      0.67      0.04  zstd.scm:234:2:read!
  0.17      0.04      0.04  string-append
  0.17      0.04      0.04  close-port
  0.17      0.04      0.04  bytevector-u64-native-set!
  0.17      0.04      0.04  stat
  0.17      0.04      0.04  guix/man-db.scm:139:19
  0.09      1.83      0.02  zstd.scm:216:0:make-zstd-input-port
  0.09      0.33      0.02  ice-9/ftw.scm:445:2:loop
  0.09      0.12      0.02  system/foreign.scm:150:0:write-c-struct
  0.09      0.04      0.02  system/foreign.scm:167:0:read-c-struct
  0.09      0.02      0.02  pointer->bytevector
  0.09      0.02      0.02  regexp-exec
  0.09      0.02      0.02  string
  0.09      0.02      0.02  sizeof
  0.09      0.02      0.02  basename
  0.09      0.02      0.02  string-contains
  0.09      0.02      0.02  lstat
  0.09      0.02      0.02  force
  0.09      0.02      0.02  anon #x209d0d8
  0.00  34249.92      0.00  ice-9/boot-9.scm:220:5:map1
  0.00     23.28      0.00  <current input>:3:9
  0.00     17.17      0.00  ice-9/ports.scm:429:0:call-with-port
  0.00      7.46      0.00  zstd.scm:273:0:call-with-zstd-input-port
  0.00      1.34      0.00  anon #x208c5f0
  0.00      1.34      0.00  %after-gc-thunk
  0.00      0.73      0.00  guix/man-db.scm:129:0:read-synopsis
  0.00      0.67      0.00  %read-line
  0.00      0.28      0.00  guix/man-db.scm:205:0:mandb-entries
  0.00      0.16      0.00  srfi/srfi-1.scm:452:2:fold
  0.00      0.16      0.00  anon #x208c57c
  0.00      0.16      0.00  guix/build/utils.scm:487:0:find-files
  0.00      0.14      0.00  guix/man-db.scm:155:0:man-page->entry
  0.00      0.12      0.00  system/foreign.scm:182:0:make-c-struct
  0.00      0.08      0.00  sort
  0.00      0.04      0.00  filter
  0.00      0.04      0.00  close-port
  0.00      0.04      0.00  ice-9/boot-9.scm:1971:6
  0.00      0.04      0.00  zstd.scm:208:4
  0.00      0.04      0.00  guix/build/utils.scm:503:28
  0.00      0.04      0.00  string-map
  0.00      0.02      0.00  ice-9/boot-9.scm:1689:4:with-exception-handler
  0.00      0.02      0.00  system/foreign.scm:187:0:parse-c-struct
  0.00      0.02      0.00  guix/build/utils.scm:484:4
---
Sample count: 1146
Total time: 23.282459186 seconds (7.318256931 seconds in GC)
--8<---------------cut here---------------end--------------->8---

It still shows that parsing the files is much more expensive than
decompressing them.  This is also true of zlib-compressed manuals;
here's the same experiment on master:

--8<---------------cut here---------------start------------->8---
$ guix shell --no-grafts --pure man-db perl@5.14 man-pages libx11:doc gnutls-dane:doc tcl
[env]$ echo $GUIX_ENVIRONMENT && exit
/gnu/store/qqd7d22wf9d220prkm682yypybpr7df4-profile
$ guix shell -D guix guile-gdbm-ffi guile-zstd
[env]$ guild compile -O3 guix/man-db.scm
`/home/maxim/.cache/guile/ccache/3.0-LE-8-4.6/home/maxim/src/guix-master/guix/man-db.scm.go'
[env]$ ./pre-inst-env guix repl

scheme@(guix-user)> ,m (guix man-db)
scheme@(guix man-db)> ,profile (define a (mandb-entries "/gnu/store/qqd7d22wf9d220prkm682yypybpr7df4-profile/share/man"))
%     cumulative   self
time   seconds     seconds  procedure
 49.15      2.62      2.56  string-tokenize
 15.93      0.87      0.83  %read-line
 10.85      0.57      0.57  anon #xd67e30
  4.75      5.91      0.25  guix/man-db.scm:161:4
  3.05      0.16      0.16  open-file
  2.03      0.11      0.11  gdbm.scm:122:11
  1.36      0.09      0.07  stat
  1.36      0.07      0.07  string<?
  1.36      0.07      0.07  anon #xd67df8
  1.02      1.52      0.05  ice-9/rdelim.scm:193:0:read-line
  1.02      0.05      0.05  regexp-exec
  1.02      0.05      0.05  lstat
  1.02      0.05      0.05  close-port
  0.68      0.19      0.04  zlib.scm:158:0:make-gzip-input-port
  0.68      0.05      0.04  string->pointer
  0.68      0.04      0.04  guix/man-db.scm:134:19
  0.68      0.04      0.04  bytevector->pointer
  0.68      0.04      0.04  make-custom-binary-input-port
  0.68      0.04      0.04  equal?
  0.34      0.18      0.02  guix/man-db.scm:150:0:man-page->entry
  0.34      0.02      0.02  basename
  0.34      0.02      0.02  anon #xd67f10
  0.34      0.02      0.02  ice-9/popen.scm:183:0:reap-pipes
  0.34      0.02      0.02  zlib.scm:190:33
  0.34      0.02      0.02  zlib.scm:66:0
  0.00  10838.31      0.00  ice-9/boot-9.scm:220:5:map1
  0.00      6.21      0.00  zlib.scm:217:0:call-with-gzip-input-port
  0.00      5.22      0.00  <current input>:2:9
  0.00      0.60      0.00  %read-line
  0.00      0.60      0.00  zlib.scm:99:4
  0.00      0.25      0.00  ice-9/ftw.scm:445:2:loop
  0.00      0.25      0.00  guix/man-db.scm:196:0:mandb-entries
  0.00      0.12      0.00  srfi/srfi-1.scm:452:2:fold
  0.00      0.12      0.00  guix/build/utils.scm:476:0:find-files
  0.00      0.12      0.00  zlib.scm:87:4
  0.00      0.12      0.00  %after-gc-thunk
  0.00      0.12      0.00  anon #xd5745c
  0.00      0.12      0.00  anon #xd574d0
  0.00      0.09      0.00  ice-9/boot-9.scm:1971:6
  0.00      0.07      0.00  sort
  0.00      0.07      0.00  guix/build/utils.scm:492:28
  0.00      0.05      0.00  filter
  0.00      0.05      0.00  ice-9/boot-9.scm:1689:4:with-exception-handler
  0.00      0.04      0.00  string-map
  0.00      0.04      0.00  guix/man-db.scm:200:26
  0.00      0.04      0.00  close-port
  0.00      0.02      0.00  guix/build/utils.scm:473:4
  0.00      0.02      0.00  zlib.scm:138:4
---
Sample count: 295
Total time: 5.217801086 seconds (1.528583927 seconds in GC)
--8<---------------cut here---------------end--------------->8---

Hum, OK, so if I understand guile-zstd causes an almost 5x
slowdown... that doesn't make much sense unless the guile-zstd library
is much slower itself (zstd-decompression should be about 3.5x faster
than that of zlib).

Supposing guile-zstd was a 1:1 equivalent to guile-zlib, then we should
be able to see the above ~5 s time shrink down to around 1.5 s (best
case), which would be nice.

I'm impressed the man-db hook is that fast already though; I seem to
recall it would take like 30 s (or more?) for such a large amount of man
pages on this machine.

Thanks,

Maxim


  reply	other threads:[~2022-03-27  3:44 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-22 19:09 Profiling of man-db database generation with zlib vs zstd Maxim Cournoyer
2022-03-24 21:37 ` Ludovic Courtès
2022-03-26  3:22   ` Maxim Cournoyer
2022-03-27  3:44     ` Maxim Cournoyer [this message]
2022-03-29 10:22       ` Ludovic Courtès
2022-03-28  3:49   ` Maxim Cournoyer
2022-03-29 10:30     ` Ludovic Courtès
2022-03-30 14:49       ` Maxim Cournoyer
2022-03-30 16:16       ` Jonathan McHugh
2022-03-31 17:13     ` Ludovic Courtès

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87czi8vz3n.fsf@gmail.com \
    --to=maxim.cournoyer@gmail.com \
    --cc=guix-devel@gnu.org \
    --cc=ludo@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/guix.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.