unofficial mirror of guix-patches@gnu.org 
From: Christopher Baines <mail@cbaines.net>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: Josselin Poiret <dev@jpoiret.xyz>,
	Tobias Geerinckx-Rice <me@tobias.gr>,
	Simon Tournier <zimon.toutoune@gmail.com>,
	Mathieu Othacehe <othacehe@gnu.org>,
	68266@debbugs.gnu.org, Ricardo Wurmus <rekado@elephly.net>,
	Christopher Baines <guix@cbaines.net>
Subject: [bug#68266] [PATCH v2] guix: store: Add report-object-cache-duplication.
Date: Fri, 12 Jan 2024 18:26:24 +0000	[thread overview]
Message-ID: <87v87yfkkr.fsf@cbaines.net> (raw)
In-Reply-To: <871qamk4xc.fsf@gnu.org>


Ludovic Courtès <ludo@gnu.org> writes:

> Christopher Baines <mail@cbaines.net> skribis:
>
>> This is intended to help with spotting duplication in the object cache, that
>> is, cases where many keys (for example, package records) map to the same
>> derivation. Reducing this duplication is an opportunity for improved
>> performance, since it makes better use of the cache entries that are
>> already present.
>
> Another way to detect this is by looking at ‘add-data-to-store-cache’
> stats:
>
> $ GUIX_PROFILING="add-data-to-store-cache object-cache" guix build --no-grafts greetd wlgreet du-dust circtools --target=aarch64-linux-gnu -d 
> /gnu/store/mivzv83wryv9gp5bjncg5m1831dx2xwr-circtools-1.0.0.drv
> /gnu/store/4xf7kh9mi0vpvs8m1ak4x8w1rpsdpv6z-du-dust-0.8.6.drv
> /gnu/store/ayk54gvlbc1qam6irzf9kaig56dhzni0-wlgreet-0.4.1.drv
> /gnu/store/r601i40cii9ic5w1k4hy5c2yngfayh64-greetd-0.9.0.drv
> Object Cache:
>   fresh caches:    22
>   lookups:      40435
>   hits:         36821 (91.1%)
>   cache size:    3613 entries
>
> 'add-data-to-store' cache:
>   lookups:       4090
>   hits:           958 (23.4%)
>   .drv files:    3062 (74.9%)
>   Scheme files:   916 (22.4%)
> $ GUIX_PROFILING="add-data-to-store-cache object-cache" ./pre-inst-env guix build --no-grafts greetd wlgreet du-dust circtools --target=aarch64-linux-gnu -d 
> /gnu/store/1wsldmvigjb8w2gk418npbnfznlb0ck1-circtools-1.0.0.drv
> /gnu/store/b5c73fawjdvkgy431qxz9l6l9y9a9lhz-du-dust-0.8.6.drv
> /gnu/store/zwc7qzsbzf62dgbbzy74lki4hsr406bw-wlgreet-0.4.1.drv
> /gnu/store/vjdd23hc82701afb132z1ajcqa7hfd74-greetd-0.9.0.drv
> Object Cache:
>   fresh caches:    22
>   lookups:      37942
>   hits:         34523 (91.0%)
>   cache size:    3418 entries
>
> 'add-data-to-store' cache:
>   lookups:       3895
>   hits:           763 (19.6%)
>   .drv files:    2957 (75.9%)
>   Scheme files:   826 (21.2%)
>
> Ideally, the hit rate there would be 0% and we could remove it.
>
> If there’s a positive hit rate, it means we keep adding the same .drv
> and/or *-builder files to the store, meaning that the object cache was
> ineffective.
>
>> +(define* (report-object-cache-duplication store #:key (threshold 10)
>> +                                          (port (current-error-port)))
>
> Do you have an example output of this?
>
> How helpful does it look to you in practice?

Yep, so here's some output I get from computing all the cross
derivations to i586-pc-gnu:

  value #<derivation
    /gnu/store/lqxlksh00cshc816xqfq47r3jjdfj2p9-subversion-1.14.2.drv =>
    /gnu/store/92avphfdcrcaxx8m5a6ihmw558bj3np8-subversion-1.14.2 7fcfbe5f9280>
  cached 4174 times
  example keys:
    - #<package subversion@1.14.2 gnu/packages/version-control.scm:2316 7fcfc1c924d0>
    - #<package subversion@1.14.2 gnu/packages/version-control.scm:2316 7fcfc1c924d0>
    - #<file-append #<package subversion@1.14.2 gnu/packages/version-control.scm:2316 7fcfc1c924d0> "/bin/svn">
    - #<file-append #<package subversion@1.14.2 gnu/packages/version-control.scm:2316 7fcfc1c924d0> "/bin/svn">

It's not immediately obvious what the issue is, but searching for
"/bin/svn" turns up the file-append calls in svn-fetch: each svn-fetch
adds a couple of new file-append records to the cache.
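
To illustrate why fresh records end up as separate cache entries, here
is a minimal sketch. It assumes the cache compares keys by identity
(eq?). The record type, the cache and every name below are toy
stand-ins made up for the example, not the real (guix gexp) or
(guix store) code.

```scheme
(use-modules (srfi srfi-9))

;; Hypothetical stand-in for a file-append record; not the real type.
(define-record-type <toy-file-append>
  (toy-file-append base suffix)
  toy-file-append?
  (base   toy-file-append-base)
  (suffix toy-file-append-suffix))

;; Toy object cache keyed by identity (eq?); that is the assumption here.
(define toy-cache (make-hash-table))

(define (lower-cached obj)
  "Return the lowered form of OBJ, consulting TOY-CACHE first."
  (or (hashq-ref toy-cache obj)
      (let ((result (string-append (toy-file-append-base obj)
                                   (toy-file-append-suffix obj))))
        (hashq-set! toy-cache obj result)
        result)))

;; Builds a fresh record on every call, the way a fetch procedure might.
(define (svn-command)
  (toy-file-append "subversion" "/bin/svn"))

(define first-result  (lower-cached (svn-command)))
(define second-result (lower-cached (svn-command)))

;; Number of keys now in the cache.
(define toy-cache-size (hash-count (const #t) toy-cache))

(format #t "~a entries, same value? ~a~%"
        toy-cache-size (string=? first-result second-result))
```

The two records have the same contents but are distinct objects, so the
second call misses the cache and adds a second key for the same value.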


  value /gnu/store/qi1km3qlv5hdsql6h3ibwvz6v4z8lqbz-ld-wrapper.in cached 755 times
  example keys:
    - #<<local-file> file: "/home/chris/Projects/Guix/guix/gnu/packages/ld-wrapper.in" absolute: #<promise "/home/chris/Projects/Guix/guix/gnu/packages/ld-wrapper.in"> name: "ld-wrapper.in" recursive?: #t select?: #<procedure true (file stat)>>
    - #<<local-file> file: "/home/chris/Projects/Guix/guix/gnu/packages/ld-wrapper.in" absolute: #<promise "/home/chris/Projects/Guix/guix/gnu/packages/ld-wrapper.in"> name: "ld-wrapper.in" recursive?: #t select?: #<procedure true (file stat)>>
    - #<<local-file> file: "/home/chris/Projects/Guix/guix/gnu/packages/ld-wrapper.in" absolute: #<promise "/home/chris/Projects/Guix/guix/gnu/packages/ld-wrapper.in"> name: "ld-wrapper.in" recursive?: #t select?: #<procedure true (file stat)>>
    - #<<local-file> file: "/home/chris/Projects/Guix/guix/gnu/packages/ld-wrapper.in" absolute: #<promise "/home/chris/Projects/Guix/guix/gnu/packages/ld-wrapper.in"> name: "ld-wrapper.in" recursive?: #t select?: #<procedure true (file stat)>>


As with file-append, it's not immediately obvious where all these
local-file records are coming from, but searching for ld-wrapper.in
points to make-ld-wrapper.

My thinking here is that maybe it's worth not caching everything in the
object cache. For file-append in particular, I'm not sure what the cache
is actually achieving. As for local-file, given there's lower-level
caching, maybe it's worth leaning on that and not caching local-file
either. I tried this out: it didn't seem to make performance worse,
and it did remove the duplication from the cache.
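
Roughly, the idea of skipping the cache for some kinds of object could
look like the sketch below. This is not the change I tried, just an
illustration on a toy cache: the record type, `worth-caching?' and
`lower' are hypothetical names, symbols stand in for packages, and it
assumes for the sake of the example that lowering a file-append is
cheap.

```scheme
(use-modules (srfi srfi-9))

;; Hypothetical stand-in for a file-append record; not the real type.
(define-record-type <toy-file-append>
  (toy-file-append base suffix)
  toy-file-append?
  (base   toy-file-append-base)
  (suffix toy-file-append-suffix))

;; Toy object cache keyed by identity (eq?).
(define toy-cache (make-hash-table))

(define (lower obj)
  "Lower OBJ: toy file-append records become a string; symbols stand in
for packages and become a made-up derivation name."
  (if (toy-file-append? obj)
      (string-append (toy-file-append-base obj)
                     (toy-file-append-suffix obj))
      (string-append (symbol->string obj) ".drv")))

(define (worth-caching? obj)
  ;; Assumption for this sketch: file-append lowering is cheap, so
  ;; caching it only adds entries.
  (not (toy-file-append? obj)))

(define (lower-cached obj)
  "Like LOWER, but only go through TOY-CACHE for objects worth caching."
  (if (worth-caching? obj)
      (or (hashq-ref toy-cache obj)
          (let ((result (lower obj)))
            (hashq-set! toy-cache obj result)
            result))
      (lower obj)))

(lower-cached 'subversion)
(lower-cached (toy-file-append "subversion" "/bin/svn"))
(lower-cached (toy-file-append "subversion" "/bin/svn"))

;; Only the symbol key was cached; the two records bypassed the cache.
(define toy-cache-size (hash-count (const #t) toy-cache))
```

Whether this is a win in practice depends on how expensive the skipped
lowering really is, which the toy example says nothing about.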

Other things that show up include:

  value #<derivation /gnu/store/z28jfsm43wg50cvdfq0548pbf5jfhk8r-binutils-cross-i586-pc-gnu-2.38.drv => /gnu/store/lkmwq399jllig2a5r323v6y9nflpn6gn-binutils-cross-i586-pc-gnu-2.38 7fcfbd358000> cached 1228 times
  example keys:
    - #<package binutils-cross-i586-pc-gnu@2.38 gnu/packages/cross-base.scm:79 7fcfbf3bb8f0>
    - #<package binutils-cross-i586-pc-gnu@2.38 gnu/packages/cross-base.scm:79 7fcfbf3b6630>
    - #<package binutils-cross-i586-pc-gnu@2.38 gnu/packages/cross-base.scm:79 7fcfbf3b6420>
    - #<package binutils-cross-i586-pc-gnu@2.38 gnu/packages/cross-base.scm:79 7fcfbf3b62c0>
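
For context, a report like the ones above can be produced by inverting
the cache and counting keys per value. The sketch below is not the code
from the patch: it takes a plain hash table rather than a store
connection, `report-duplication-sketch' is a made-up name, and treating
the threshold as "at least this many keys" is an assumption.

```scheme
(use-modules (srfi srfi-1))

(define* (report-duplication-sketch cache #:key (threshold 10)
                                    (port (current-error-port)))
  "Print to PORT each value in CACHE that at least THRESHOLD keys map to.
Return the list of (VALUE . KEYS) pairs, most duplicated first."
  (let ((keys-by-value (make-hash-table)))
    ;; Invert the cache: value -> list of keys mapping to it.
    (hash-for-each
     (lambda (key value)
       (hash-set! keys-by-value value
                  (cons key (hash-ref keys-by-value value '()))))
     cache)
    (let ((duplicates
           (sort (filter (lambda (entry)
                           (>= (length (cdr entry)) threshold))
                         (hash-map->list cons keys-by-value))
                 (lambda (a b)
                   (> (length (cdr a)) (length (cdr b)))))))
      (for-each
       (lambda (entry)
         (let ((keys (cdr entry)))
           (format port "value ~s cached ~a times~%"
                   (car entry) (length keys))
           (format port "example keys:~%")
           ;; Show at most four keys per value.
           (for-each (lambda (key) (format port "  - ~s~%" key))
                     (take keys (min 4 (length keys))))
           (newline port)))
       duplicates)
      duplicates)))

;; Toy cache: three distinct (eq?) keys map to one value, one key to another.
(define toy-cache (make-hash-table))
(hashq-set! toy-cache (list 'binutils 1) "binutils.drv")
(hashq-set! toy-cache (list 'binutils 2) "binutils.drv")
(hashq-set! toy-cache (list 'binutils 3) "binutils.drv")
(hashq-set! toy-cache (list 'hello 1) "hello.drv")

(define toy-duplicates
  (report-duplication-sketch toy-cache #:threshold 2))
```

With a threshold of 2, only the value with three keys is reported; the
value with a single key is left out.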
