unofficial mirror of guix-devel@gnu.org 
 help / color / mirror / code / Atom feed
From: Roel Janssen <roel@gnu.org>
To: "Ludovic Courtès" <ludovic.courtes@inria.fr>
Cc: guix-devel@gnu.org, 27097@debbugs.gnu.org
Subject: Re: Performance on NFS
Date: Mon, 12 Jun 2017 10:45:27 +0200	[thread overview]
Message-ID: <rbu60g1k314.fsf@gnu.org> (raw)
In-Reply-To: <87a85hnvqm.fsf@gnu.org>


Ludovic Courtès writes:

> Hi Roel,
>
> Roel Janssen <roel@gnu.org> skribis:
>
>> You should know that we have 'submit' nodes that use the guixr wrapper
>> script to connect to the guix-daemon that runs on the 'hpcguix' node.
>>
>> Both have a /gnu mounted by a storage subsystem.
>>
>> I couldn't run the second command on a 'submit' node.  But I could run
>> it in the 'hpcguix' node.
>
> OK.
>
> Side note: I think you can replace your ‘guixr’ wrapper by just doing:
>
>   export GUIX_DAEMON_SOCKET=guix://hpcguix:1234
>
> See <https://www.gnu.org/software/guix/manual/html_node/The-Store.html>.
>
>> The first command:
>> ------------------
>>
>> [roel@hpc-submit1 ~]$ time guixr environment --ad-hoc coreutils --pure -- true
>>
>> real    0m38.415s
>> user    0m6.075s
>> sys     0m0.611s
>>
>> [roel@hpcguix ~]$ time guix environment --ad-hoc coreutils --pure -- true
>>
>> real    0m27.054s
>> user    0m4.254s
>> sys     0m0.383s
>>
>>
>> The second command:
>> -------------------
>>
>> [roel@hpcguix ~]$ time guix environment --ad-hoc -e '(@ (gnu packages base) coreutils)' --pure --no-substitutes --no-grafts -- true                                                                                                      
>> The following derivations will be built:
>>    /gnu/store/9wczighnyz1bz43j4wawf09z180g3ywv-profile.drv
>>    /gnu/store/ffsyhajbdcp1lcq6x65czghya1iydly8-info-dir.drv
>>    /gnu/store/5gyl3l23ps6f8dgay4awybwq7n9j9pzk-fonts-dir.drv
>>    /gnu/store/l2mwj2q4vnq2v5raxz64ra7jyphd2jyd-manual-database.drv
>> Creating manual page database for 1 packages... done in 5.524 s
>>
>> real    1m6.812s
>> user    0m2.969s
>> sys     0m0.325s
>> [roel@hpcguix ~]$ time guix environment --ad-hoc -e '(@ (gnu packages base) coreutils)' --pure --no-substitutes --no-grafts -- true
>>
>> real    0m23.357s
>> user    0m2.802s
>> sys     0m0.340s
>>
>>
>> I suspect that the difference between the two commands is that one only
>> looks for one module, while the other looks in all modules.  Looking at
>> the second run, I suppose the difference is quite small.
>
> Yeah, -e doesn’t seem to be much faster (there are still a lot of
> modules to load anyway.)
>
> At any rate, let’s see what we can do; 23 seconds is not okay.
>
> I did a quick experiment:
>
> --8<---------------cut here---------------start------------->8---
> $ strace -o ,,s -s 123 guix environment --ad-hoc -e '(@ (gnu packages base) coreutils)' --pure --no-substitutes --no-grafts -- true
> $ grep ^open ,,s |wc -l
> 1095
> $ grep '^open.*ENOENT' ,,s |wc -l
> 136
> $ grep -E '^(open|stat|lstat).*patches/gcc-arm-bug' ,,s |wc -l
> 27
> $ grep -E '^(open|stat|lstat).*guix/build/utils' ,,s |wc -l
> 2190
> --8<---------------cut here---------------end--------------->8---
>
> After the patch below, I get:
>
> --8<---------------cut here---------------start------------->8---
> $ grep -E '^(open|stat|lstat).*guix/build/utils' ,,s2 |wc -l
> 14
> $ grep -E '^(open|stat|lstat).*patches/gcc-arm-bug' ,,s2 |wc -l
> 4
> --8<---------------cut here---------------end--------------->8---
>
> Here’s the big picture before and after:
>
> --8<---------------cut here---------------start------------->8---
> $ strace -c guix environment --ad-hoc -e '(@ (gnu packages base) coreutils)' --pure --no-substitutes --no-grafts -- true
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  32.55    0.009781           1     10463      9158 stat
>  15.55    0.004673           1      8780           write
>  11.26    0.003385        3385         1           wait4
>   7.94    0.002387          20       122        12 futex
>   6.38    0.001917           0      5052         4 lstat
>   5.70    0.001713           2      1095       136 open
>   5.54    0.001664           1      2919           read
>   3.02    0.000909           2       525           mmap
>   2.96    0.000889         148         6           clone
>   2.50    0.000751           2       481           mprotect
>   2.00    0.000600           1       959           close
>   1.56    0.000469           1       898         3 lseek
>   1.10    0.000330           3       100           sendfile
>   0.88    0.000264           0       541           fstat
>   0.42    0.000127           1       175           brk
>   0.15    0.000044           2        22           rt_sigaction
>   0.09    0.000026           5         5           munmap
>   0.06    0.000019           1        18           rt_sigprocmask
>   0.06    0.000019           2        10         3 ioctl
>   0.03    0.000010           3         3         3 access
>   0.03    0.000010           2         6         1 fcntl
>   0.03    0.000009           0        23           clock_gettime
>   0.03    0.000008           2         5           prlimit64
>   0.02    0.000006           6         1           statfs
>   0.02    0.000005           2         3           pipe2
>   0.01    0.000004           2         2           getpid
>   0.01    0.000004           4         1           sysinfo
>   0.01    0.000003           3         1           dup2
>   0.01    0.000003           2         2           arch_prctl
>   0.01    0.000003           3         1           set_tid_address
>   0.01    0.000002           1         2           execve
>   0.01    0.000002           1         2           uname
>   0.01    0.000002           2         1           getuid
>   0.01    0.000002           2         1           getgid
>   0.01    0.000002           2         1           geteuid
>   0.01    0.000002           2         1           getegid
>   0.01    0.000002           2         1           getppid
>   0.01    0.000002           2         1           getpgrp
>   0.01    0.000002           2         1           set_robust_list
>   0.00    0.000001           0        16           readlink
>   0.00    0.000000           0         3           madvise
>   0.00    0.000000           0         1           socket
>   0.00    0.000000           0         1           connect
>   0.00    0.000000           0         1           getcwd
>   0.00    0.000000           0         2           sched_getaffinity
>   0.00    0.000000           0         1           getrandom
> ------ ----------- ----------- --------- --------- ----------------
> 100.00    0.030051                 32256      9320 total
> $ ./pre-inst-env strace -c guix environment --ad-hoc -e '(@ (gnu packages base) coreutils)' --pure --no-substitutes --no-grafts -- true
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  35.19    0.004311           0      9423      8117 stat
>  25.25    0.003094           0      8780           write
>   9.99    0.001224           0      2895           read
>   7.88    0.000965           1      1062       122 open
>   6.39    0.000783           7       117        10 futex
>   3.80    0.000466           1       502           mmap
>   3.34    0.000409           1       474           mprotect
>   2.91    0.000357           0       940           close
>   2.14    0.000262           0       900         3 lseek
>   1.90    0.000233           0       518           fstat
>   0.59    0.000072           1       100           sendfile
>   0.26    0.000032           0       123           lstat
>   0.20    0.000025           0        78           brk
>   0.04    0.000005           0        23           clock_gettime
>   0.02    0.000003           1         5           munmap
>   0.02    0.000003           1         6           clone
>   0.02    0.000002           0         5           rt_sigaction
>   0.02    0.000002           0        15           rt_sigprocmask
>   0.01    0.000001           1         1           wait4
>   0.01    0.000001           1         1           arch_prctl
>   0.01    0.000001           1         2           sched_getaffinity
>   0.01    0.000001           0         3           prlimit64
>   0.00    0.000000           0         7         1 ioctl
>   0.00    0.000000           0         2         2 access
>   0.00    0.000000           0         1           madvise
>   0.00    0.000000           0         1           socket
>   0.00    0.000000           0         1           connect
>   0.00    0.000000           0         1           execve
>   0.00    0.000000           0         1           uname
>   0.00    0.000000           0         3           fcntl
>   0.00    0.000000           0         1           getcwd
>   0.00    0.000000           0         1           statfs
>   0.00    0.000000           0         1           set_tid_address
>   0.00    0.000000           0         1           set_robust_list
>   0.00    0.000000           0         3           pipe2
>   0.00    0.000000           0         1           getrandom
> ------ ----------- ----------- --------- --------- ----------------
> 100.00    0.012252                 25998      8255 total
> --8<---------------cut here---------------end--------------->8---
>
> So that’s shaving ~20% of the syscalls (notice the number of ‘lstat’
> calls), which may make a difference on NFS.
>
> Could you give it a try and report back?
>
> This patch optimizes the ‘add-to-store’ on the client side.  Until now
> the caching strategy in ‘add-to-store’ was very conservative: at each
> call it would ‘lstat’ the given file and use the stat as a key in
> caching.  Thus, if the file changed, we would notice and reopen it.
>
> The obvious downside is that we’d keep doing ‘lstat’ for a case that’s
> unlikely.  With this patch, we simply use the file name as the key in
> the ‘add-to-store’ cache.
>
> Though as you wrote before, ‘open’ is much more expensive that ‘stat’ on
> NFS, so that’s where we should focus.
>
> Thanks,
> Ludo’.
>
> diff --git a/guix/store.scm b/guix/store.scm
> index c94dfea95..8acde18d0 100644
> --- a/guix/store.scm
> +++ b/guix/store.scm
> @@ -830,10 +830,9 @@ bits are kept.  HASH-ALGO must be a string such as \"sha256\".
>  When RECURSIVE? is true, call (SELECT?  FILE STAT) for each directory entry,
>  where FILE is the entry's absolute file name and STAT is the result of
>  'lstat'; exclude entries for which SELECT? does not return true."
> -      (let* ((st    (false-if-exception (lstat file-name)))
> -             (args  `(,st ,basename ,recursive? ,hash-algo ,select?))
> +      (let* ((args  `(,file-name ,basename ,recursive? ,hash-algo ,select?))
>               (cache (nix-server-add-to-store-cache server)))
> -        (or (and st (hash-ref cache args))
> +        (or (hash-ref cache args)
>              (let ((path (add-to-store server basename recursive?
>                                        hash-algo file-name
>                                        #:select? select?)))

Here are the timings.  The patched version is used with ./pre-inst-env,
and the unpatched version is not invoked using ./pre-inst-env.

With patch:
-----------------------------------------------------------------------------------------------------

[roel@hpc-submit1 guix]$ time ./pre-inst-env guix environment --pure coreutils --no-substitutes --no-grafts -- true
The following derivations will be built:
   /gnu/store/zfyjc6zmhjj19vx4h6z22kjbz75izm2d-profile.drv
   /gnu/store/p9nxlcyhvkiyfxllbb81m8jjnv1hfkhq-ca-certificate-bundle.drv
   /gnu/store/any2b7k23b16xq4pbym21dx0sg6ybmyi-info-dir.drv
   /gnu/store/55mmjkg47n58gql1yys053gl3vqfxlml-fonts-dir.drv
   /gnu/store/s5vinzd4rc9yffigamb5k76r4ipqim6z-manual-database.drv
Creating manual page database for 19 packages... done in 26.220 s
warning: collision encountered: /gnu/store/ri56wnmzkgzrajdyl5ydc55lrwy1164k-ld-wrapper-0/bin/ld /gnu/store/zq65kpvwwxgc3qqbf9apic8gyss2l0zq-binutils-2.27/bin/ld 
warning: arbitrarily choosing /gnu/store/ri56wnmzkgzrajdyl5ydc55lrwy1164k-ld-wrapper-0/bin/ld

real    6m14.571s
user    0m3.702s
sys     0m0.361s
[roel@hpc-submit1 guix]$ time ./pre-inst-env guix environment --pure coreutils --no-substitutes --no-grafts -- true

real    0m43.446s
user    0m3.393s
sys     0m0.452s
[roel@hpc-submit1 guix]$ time ./pre-inst-env guix environment --pure coreutils --no-substitutes --no-grafts -- true

real    0m22.405s
user    0m3.532s
sys     0m0.336s
[roel@hpc-submit1 guix]$ time ./pre-inst-env guix environment --pure coreutils --no-substitutes --no-grafts -- true

real    0m22.361s
user    0m3.356s
sys     0m0.326s
[roel@hpc-submit1 guix]$ time ./pre-inst-env guix environment --pure coreutils --no-substitutes --no-grafts -- true

real    0m17.993s
user    0m3.422s
sys     0m0.394s

Without patch:
-----------------------------------------------------------------------------------------------------

[roel@hpc-submit1 guix]$ time  guix environment --pure coreutils --no-substitutes --no-grafts -- true

real    0m41.746s
user    0m3.471s
sys     0m0.552s
[roel@hpc-submit1 guix]$ time  guix environment --pure coreutils --no-substitutes --no-grafts -- true

real    0m26.406s
user    0m3.451s
sys     0m0.523s
[roel@hpc-submit1 guix]$ time  guix environment --pure coreutils --no-substitutes --no-grafts -- true

real    0m37.703s
user    0m3.445s
sys     0m0.579s
[roel@hpc-submit1 guix]$ time  guix environment --pure coreutils --no-substitutes --no-grafts -- true

real    0m27.084s
user    0m3.538s
sys     0m0.428s

On hpcguix with patch:
-----------------------------------------------------------------------------------------------------

[roel@hpcguix guix]$ time ./pre-inst-env guix environment --pure coreutils --no-substitutes --no-grafts -- true

real    0m42.971s
user    0m3.335s
sys     0m0.386s
[roel@hpcguix guix]$ time ./pre-inst-env guix environment --pure coreutils --no-substitutes --no-grafts -- true

real    0m19.428s
user    0m3.239s
sys     0m0.367s
[roel@hpcguix guix]$ time ./pre-inst-env guix environment --pure coreutils --no-substitutes --no-grafts -- true

real    0m27.086s
user    0m3.073s
sys     0m0.348s
[roel@hpcguix guix]$ time ./pre-inst-env guix environment --pure coreutils --no-substitutes --no-grafts -- true

real    0m33.214s
user    0m3.068s
sys     0m0.296s
[roel@hpcguix guix]$ time ./pre-inst-env guix environment --pure coreutils --no-substitutes --no-grafts -- true

real    0m22.686s
user    0m3.010s
sys     0m0.305s

On hpcguix without patch:
-----------------------------------------------------------------------------------------------------

[roel@hpcguix guix]$ time guix environment --pure coreutils --no-substitutes --no-grafts -- true

real    0m28.502s
user    0m3.201s
sys     0m0.447s
[roel@hpcguix guix]$ time guix environment --pure coreutils --no-substitutes --no-grafts -- true

real    0m41.693s
user    0m3.061s
sys     0m0.301s
[roel@hpcguix guix]$ time guix environment --pure coreutils --no-substitutes --no-grafts -- true

real    0m26.178s
user    0m3.034s
sys     0m0.358s



From these timings, I don't think it has a big impact.  This makes me
wonder, can't we replace the disk-intensive stuff with a database?
If we only have to read the files on disk once, after which we extracted
the information (the hashes?) needed to compute which links have to be
created to make an environment, then actually creating the environment
can be as fast as only creating those links.

Maybe this is too vague, because I don't know what Guix needs to read
exactly here.

Kind regards,
Roel Janssen

  reply	other threads:[~2017-06-12  8:45 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20170527105641.9426-1-mail@cbaines.net>
2017-05-27 11:31 ` Combining Guix, direnv and Emacs for environment customisation Christopher Baines
2017-05-30 15:03   ` Ludovic Courtès
2017-06-01 13:17     ` Roel Janssen
2017-06-03 13:59       ` Ludovic Courtès
2017-06-03 21:08         ` Roel Janssen
2017-06-04 21:15           ` Ludovic Courtès
2017-06-05 22:37             ` Roel Janssen
2017-06-07 10:35               ` Performance on NFS Ludovic Courtès
2017-06-07 13:06                 ` Roel Janssen
2017-06-09 13:46                   ` Ludovic Courtès
2017-06-12  8:45                     ` Roel Janssen [this message]
2017-06-12 15:58                       ` Ludovic Courtès
2017-06-16 15:23                         ` Ludovic Courtès
2017-06-17  7:36                           ` Roel Janssen
2017-06-17 22:40                             ` Ludovic Courtès
2017-06-17 23:15                               ` Roel Janssen
2017-06-18  8:43                               ` Ricardo Wurmus
2017-06-19  8:01                                 ` RPC performance Ludovic Courtès
2017-06-19  8:15                                   ` Ludovic Courtès
2017-06-19 14:25                                     ` Ricardo Wurmus
2017-06-22 14:03                                     ` Andy Wingo
2017-06-22 16:05                                       ` Ludovic Courtès
2017-06-23  9:09                                         ` Andy Wingo
2017-06-23  9:24                                           ` Ludovic Courtès
2017-06-23  9:46                                             ` Andy Wingo
2017-06-26 11:54                                               ` Ludovic Courtès
2017-06-26 14:19                                                 ` Andy Wingo
2017-06-19 21:25                                   ` Ludovic Courtès
2017-06-22  8:04                                     ` Ricardo Wurmus
2017-06-07 11:01               ` Combining Guix, direnv and Emacs for environment customisation Ricardo Wurmus
2017-06-07 12:25                 ` Performance on NFS Ludovic Courtès
2017-06-07 12:59                   ` Ricardo Wurmus
2017-08-29 19:02   ` Combining Guix, direnv and Emacs for environment customisation Thompson, David
2017-08-29 22:30     ` Maxim Cournoyer
2017-08-30  5:52     ` Carlo Zancanaro
2017-08-30 10:09       ` Ludovic Courtès
2017-08-30 10:16     ` Ludovic Courtès

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://guix.gnu.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=rbu60g1k314.fsf@gnu.org \
    --to=roel@gnu.org \
    --cc=27097@debbugs.gnu.org \
    --cc=guix-devel@gnu.org \
    --cc=ludovic.courtes@inria.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).