From: Christopher Baines <mail@cbaines.net>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: Josselin Poiret <dev@jpoiret.xyz>,
Tobias Geerinckx-Rice <me@tobias.gr>,
Simon Tournier <zimon.toutoune@gmail.com>,
Mathieu Othacehe <othacehe@gnu.org>,
68266@debbugs.gnu.org, Ricardo Wurmus <rekado@elephly.net>,
Christopher Baines <guix@cbaines.net>
Subject: [bug#68266] [PATCH v2] guix: store: Add report-object-cache-duplication.
Date: Fri, 12 Jan 2024 18:26:24 +0000
Message-ID: <87v87yfkkr.fsf@cbaines.net>
In-Reply-To: <871qamk4xc.fsf@gnu.org>
Ludovic Courtès <ludo@gnu.org> writes:
> Christopher Baines <mail@cbaines.net> wrote:
>
>> This is intended to help with spotting duplication in the object cache, so
>> where many keys, for example package records map to the same derivation. This
>> represents an opportunity for improved performance if you can reduce this
>> duplication in the cache, and better take advantage of the already present
>> cache entries.
>
> Another way to detect this is by looking at ‘add-data-to-store-cache’
> stats:
>
> $ GUIX_PROFILING="add-data-to-store-cache object-cache" guix build --no-grafts greetd wlgreet du-dust circtools --target=aarch64-linux-gnu -d
> /gnu/store/mivzv83wryv9gp5bjncg5m1831dx2xwr-circtools-1.0.0.drv
> /gnu/store/4xf7kh9mi0vpvs8m1ak4x8w1rpsdpv6z-du-dust-0.8.6.drv
> /gnu/store/ayk54gvlbc1qam6irzf9kaig56dhzni0-wlgreet-0.4.1.drv
> /gnu/store/r601i40cii9ic5w1k4hy5c2yngfayh64-greetd-0.9.0.drv
> Object Cache:
>   fresh caches:     22
>   lookups:       40435
>   hits:          36821 (91.1%)
>   cache size:     3613 entries
>
> 'add-data-to-store' cache:
>   lookups:       4090
>   hits:           958 (23.4%)
>   .drv files:    3062 (74.9%)
>   Scheme files:   916 (22.4%)
> $ GUIX_PROFILING="add-data-to-store-cache object-cache" ./pre-inst-env guix build --no-grafts greetd wlgreet du-dust circtools --target=aarch64-linux-gnu -d
> /gnu/store/1wsldmvigjb8w2gk418npbnfznlb0ck1-circtools-1.0.0.drv
> /gnu/store/b5c73fawjdvkgy431qxz9l6l9y9a9lhz-du-dust-0.8.6.drv
> /gnu/store/zwc7qzsbzf62dgbbzy74lki4hsr406bw-wlgreet-0.4.1.drv
> /gnu/store/vjdd23hc82701afb132z1ajcqa7hfd74-greetd-0.9.0.drv
> Object Cache:
>   fresh caches:     22
>   lookups:       37942
>   hits:          34523 (91.0%)
>   cache size:     3418 entries
>
> 'add-data-to-store' cache:
>   lookups:       3895
>   hits:           763 (19.6%)
>   .drv files:    2957 (75.9%)
>   Scheme files:   826 (21.2%)
>
> Ideally, the hit rate there would be 0% and we could remove it.
>
> If there’s a positive hit rate, it means we keep adding the same .drv
> and/or *-builder files to the store, meaning that the object cache was
> ineffective.
>
>> +(define* (report-object-cache-duplication store #:key (threshold 10)
>> + (port (current-error-port)))
>
> Do you have an example output of this?
>
> How helpful does it look to you in practice?

Yep, here's some output I get from computing all the cross
derivations for i586-pc-gnu:
value #<derivation /gnu/store/lqxlksh00cshc816xqfq47r3jjdfj2p9-subversion-1.14.2.drv => /gnu/store/92avphfdcrcaxx8m5a6ihmw558bj3np8-subversion-1.14.2 7fcfbe5f9280> cached 4174 times
example keys:
- #<package subversion@1.14.2 gnu/packages/version-control.scm:2316 7fcfc1c924d0>
- #<package subversion@1.14.2 gnu/packages/version-control.scm:2316 7fcfc1c924d0>
- #<file-append #<package subversion@1.14.2 gnu/packages/version-control.scm:2316 7fcfc1c924d0> "/bin/svn">
- #<file-append #<package subversion@1.14.2 gnu/packages/version-control.scm:2316 7fcfc1c924d0> "/bin/svn">
It's not immediately obvious what the issue is, but if you search for
"/bin/svn" you find the file-append calls in svn-fetch, so a couple of
new file-append records get added to the cache for each svn-fetch.
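
To illustrate the mechanism, here's a toy model in Python (not Guix code).
It assumes the cache is keyed by object identity, and every name in it
(FileAppend, cached_lower, svn_fetch_like, report_duplication) is made up
for the sketch. The report groups keys by the value they map to and lists
values reached from at least `threshold` keys; whether the real procedure
treats the threshold as inclusive is an assumption here.

```python
from collections import defaultdict

class FileAppend:
    """Hypothetical stand-in for a file-append record."""
    def __init__(self, base, suffix):
        self.base = base
        self.suffix = suffix

def lower(obj):
    """Pretend lowering: equal content always gives an equal value."""
    return f"{obj.base}{obj.suffix}"

def cached_lower(cache, obj):
    """Identity-keyed cache lookup (assumption: mimics an eq?-style cache)."""
    key = id(obj)
    if key not in cache:
        # Store obj too, so it stays alive and its id() remains unique.
        cache[key] = (obj, lower(obj))
    return cache[key][1]

def svn_fetch_like(cache, package):
    """Allocates a fresh FileAppend on every call, so the cache never hits."""
    return cached_lower(cache, FileAppend(package, "/bin/svn"))

def report_duplication(cache, threshold=10):
    """Return (value, key count) pairs for values with >= threshold keys."""
    by_value = defaultdict(list)
    for obj, value in cache.values():
        by_value[value].append(obj)
    rows = [(v, len(keys)) for v, keys in by_value.items()
            if len(keys) >= threshold]
    return sorted(rows, key=lambda row: row[1], reverse=True)

cache = {}
for _ in range(50):
    svn_fetch_like(cache, "subversion-1.14.2")
print(report_duplication(cache))
```

Every call stores a new key for the same value, which is the pattern the
report above is showing for the "/bin/svn" file-append records.
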
value /gnu/store/qi1km3qlv5hdsql6h3ibwvz6v4z8lqbz-ld-wrapper.in cached 755 times
example keys:
- #<<local-file> file: "/home/chris/Projects/Guix/guix/gnu/packages/ld-wrapper.in" absolute: #<promise "/home/chris/Projects/Guix/guix/gnu/packages/ld-wrapper.in"> name: "ld-wrapper.in" recursive?: #t select?: #<procedure true (file stat)>>
- #<<local-file> file: "/home/chris/Projects/Guix/guix/gnu/packages/ld-wrapper.in" absolute: #<promise "/home/chris/Projects/Guix/guix/gnu/packages/ld-wrapper.in"> name: "ld-wrapper.in" recursive?: #t select?: #<procedure true (file stat)>>
- #<<local-file> file: "/home/chris/Projects/Guix/guix/gnu/packages/ld-wrapper.in" absolute: #<promise "/home/chris/Projects/Guix/guix/gnu/packages/ld-wrapper.in"> name: "ld-wrapper.in" recursive?: #t select?: #<procedure true (file stat)>>
- #<<local-file> file: "/home/chris/Projects/Guix/guix/gnu/packages/ld-wrapper.in" absolute: #<promise "/home/chris/Projects/Guix/guix/gnu/packages/ld-wrapper.in"> name: "ld-wrapper.in" recursive?: #t select?: #<procedure true (file stat)>>
As with file-append, it's not immediately obvious where all these
local-file records are coming from, but searching for ld-wrapper.in
points to make-ld-wrapper.
My thinking here is that maybe it's worth not caching everything in the
object cache. For file-append in particular, I'm not sure what the cache
actually achieves. As for local-file, given there's lower-level caching,
maybe it's worth leaning on that and not caching local-file either. I
tried this out: it didn't seem to make performance any worse, and it did
remove the duplication from the cache.
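
Roughly what I mean, again as a toy Python model rather than the actual
change: the object cache skips certain record types and relies on a
lower-level, content-keyed cache for them. All the names here (LocalFile,
Package, add_to_store, cached_lower, uncached_types) are hypothetical.

```python
class LocalFile:
    """Hypothetical stand-in for a local-file record."""
    def __init__(self, path):
        self.path = path

class Package:
    """Hypothetical stand-in for a package record."""
    def __init__(self, name):
        self.name = name

# Lower-level cache, keyed by content rather than by object identity.
store_cache = {}

def add_to_store(content):
    """Pretend store insertion, memoized on the content itself."""
    if content not in store_cache:
        store_cache[content] = f"store-item-for:{content}"
    return store_cache[content]

def lower(obj):
    """Pretend lowering that ends in a store insertion."""
    content = obj.path if isinstance(obj, LocalFile) else obj.name
    return add_to_store(content)

def cached_lower(object_cache, obj, uncached_types=(LocalFile,)):
    """Identity-keyed object cache that bypasses the listed types."""
    if isinstance(obj, uncached_types):
        return lower(obj)  # rely on the content-keyed cache only
    key = id(obj)
    if key not in object_cache:
        object_cache[key] = (obj, lower(obj))
    return object_cache[key][1]

object_cache = {}
gcc = Package("gcc")
for _ in range(755):
    cached_lower(object_cache, LocalFile("ld-wrapper.in"))
cached_lower(object_cache, gcc)
cached_lower(object_cache, gcc)
print(len(object_cache), len(store_cache))  # 1 2
```

In this model the 755 fresh LocalFile objects leave no entries in the
object cache, and the repeated work is still absorbed by the
content-keyed cache underneath.
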
Other things that show up include:
value #<derivation /gnu/store/z28jfsm43wg50cvdfq0548pbf5jfhk8r-binutils-cross-i586-pc-gnu-2.38.drv => /gnu/store/lkmwq399jllig2a5r323v6y9nflpn6gn-binutils-cross-i586-pc-gnu-2.38 7fcfbd358000> cached 1228 times
example keys:
- #<package binutils-cross-i586-pc-gnu@2.38 gnu/packages/cross-base.scm:79 7fcfbf3bb8f0>
- #<package binutils-cross-i586-pc-gnu@2.38 gnu/packages/cross-base.scm:79 7fcfbf3b6630>
- #<package binutils-cross-i586-pc-gnu@2.38 gnu/packages/cross-base.scm:79 7fcfbf3b6420>
- #<package binutils-cross-i586-pc-gnu@2.38 gnu/packages/cross-base.scm:79 7fcfbf3b62c0>