* Speed up grafts by storing reference offset in index
@ 2024-12-13 12:50 Ricardo Wurmus
2024-12-15 0:10 ` Ludovic Courtès
From: Ricardo Wurmus @ 2024-12-13 12:50 UTC
To: guix-devel
Hello Guix,
Grafts can be a little slow for a number of reasons:
- they are not substituted, because the assumption is that it is
preferable to rewrite references locally instead of downloading a big
archive with the modified file. Local computations on x86_64 are
often acceptable, but on aarch64 systems they can be very slow indeed.
- when a lot of grafts need to be applied, many files need to be
rewritten
- big files take longer to read and thus to rewrite
Since it is December and I'm in a silly mood here is a silly idea: would
it make sense to shift parts of the grafting work to an offloadable
build? Here's what I imagine:
- on the build farms build an additional derivation for a references
file. The references file is an S-expression containing a list of
tuples of the form (FILE-NAME OFFSET). Each of these tuples
identifies the location of a single reference at the recorded byte
OFFSET in FILE-NAME.
- when computing grafts, don't search the local files sequentially for
references but look them up in the references file. Instead of
computing the references file locally, substitute it from a build server.
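The two steps above could be sketched roughly as follows. This is only an illustration in Python (Guix itself is written in Guile Scheme), and every name in it (build_index, to_sexp, graft_with_index, HASH_LEN) is hypothetical. It assumes that a reference is identified by a fixed-length hash string and that the replacement has the same length, so the bytes can be overwritten at the recorded offset.

```python
import os

HASH_LEN = 32  # assumed length of the hash part of a store file name


def scan_offsets(path, old_hash):
    """Return every byte offset at which old_hash occurs in the file."""
    with open(path, "rb") as f:
        data = f.read()
    offsets, start = [], data.find(old_hash)
    while start != -1:
        offsets.append(start)
        start = data.find(old_hash, start + 1)
    return offsets


def build_index(root, old_hash):
    """Build-farm side: collect (FILE-NAME, OFFSET) tuples under root."""
    index = []
    for directory, _, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(directory, name)
            rel = os.path.relpath(path, root)
            index.extend((rel, off) for off in scan_offsets(path, old_hash))
    return sorted(index)


def to_sexp(index):
    """Render the index as an S-expression of (FILE-NAME OFFSET) tuples."""
    return "(" + " ".join(f'("{name}" {off})' for name, off in index) + ")"


def graft_with_index(root, index, old_hash, new_hash):
    """Client side: overwrite each recorded reference without scanning."""
    if not len(old_hash) == len(new_hash) == HASH_LEN:
        raise ValueError("hashes must have the same fixed length")
    for rel, offset in index:
        with open(os.path.join(root, rel), "r+b") as f:
            f.seek(offset)
            # Guard against a stale index before touching the file.
            if f.read(HASH_LEN) != old_hash:
                raise ValueError(f"stale index entry: {rel}@{offset}")
            f.seek(offset)
            f.write(new_hash)
```

The sketch rewrites files in place, so root is assumed to be a fresh copy of the item being grafted; the scan in build_index is the part that would run on the build farm, and only graft_with_index would run locally.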
Alternatively, change the format for substitutes and record reference
locations there, so that the local store database can also store
reference locations. It already stores references (for things like
"guix gc -R"), so maybe we could store just a little extra information
to allow us to seek to the offset directly.
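To make the database variant concrete, here is a rough sketch in Python with SQLite. The layout is a deliberately simplified assumption and not the actual store schema: the ValidPaths and Refs tables only stand in for whatever the database already records, and the RefOffsets table, its columns, and the helper functions are all hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE ValidPaths (id INTEGER PRIMARY KEY, path TEXT UNIQUE NOT NULL);
CREATE TABLE Refs (
    referrer  INTEGER NOT NULL REFERENCES ValidPaths(id),
    reference INTEGER NOT NULL REFERENCES ValidPaths(id),
    PRIMARY KEY (referrer, reference)
);
-- Hypothetical extra table: where inside the referrer each reference sits.
CREATE TABLE RefOffsets (
    referrer    INTEGER NOT NULL,
    reference   INTEGER NOT NULL,
    file        TEXT    NOT NULL,  -- file name relative to the referrer
    byte_offset INTEGER NOT NULL,  -- position of the reference in that file
    FOREIGN KEY (referrer, reference) REFERENCES Refs(referrer, reference)
);
"""


def register(db, path):
    """Return the row id for a store item, inserting it if needed."""
    db.execute("INSERT OR IGNORE INTO ValidPaths (path) VALUES (?)", (path,))
    row = db.execute("SELECT id FROM ValidPaths WHERE path = ?", (path,))
    return row.fetchone()[0]


def record_reference(db, referrer, reference, locations):
    """Store a reference together with its (file, byte_offset) locations."""
    r, t = register(db, referrer), register(db, reference)
    db.execute("INSERT OR IGNORE INTO Refs VALUES (?, ?)", (r, t))
    db.executemany(
        "INSERT INTO RefOffsets VALUES (?, ?, ?, ?)",
        [(r, t, name, off) for name, off in locations],
    )


def reference_locations(db, referrer, reference):
    """Look up where `reference` occurs inside `referrer`, without scanning."""
    rows = db.execute(
        """SELECT o.file, o.byte_offset
             FROM RefOffsets o
             JOIN ValidPaths a ON a.id = o.referrer
             JOIN ValidPaths b ON b.id = o.reference
            WHERE a.path = ? AND b.path = ?
            ORDER BY o.file, o.byte_offset""",
        (referrer, reference),
    )
    return rows.fetchall()
```

With such a table, the grafting code would ask for the locations of one reference and seek straight to them; the offsets themselves would have to arrive with the substitute, which is the format change mentioned above.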
As you can tell, I haven't thought this through, and there are a number
of different places where a feature like this could live.
What do you think?
--
Ricardo
* Re: Speed up grafts by storing reference offset in index
From: Ludovic Courtès @ 2024-12-15 0:10 UTC
To: Ricardo Wurmus; +Cc: guix-devel
Hi!
Ricardo Wurmus <rekado@elephly.net> skribis:
> Since it is December and I'm in a silly mood here is a silly idea: would
> it make sense to shift parts of the grafting work to an offloadable
> build? Here's what I imagine:
>
> - on the build farms build an additional derivation for a references
> file. The references file is an S-expression containing a list of
> tuples of the form (FILE-NAME OFFSET). Each of these tuples
> identifies the location of a single reference at the recorded byte
> OFFSET in FILE-NAME.
>
> - when computing grafts, don't search the local files sequentially for
> references but look them up in the references file. Instead of
> computing the reference file substitute it from a build server.
This sounds quite ambitious, and it’s unclear whether it would be
beneficial: it would help only *if* scanning for references is
substantially more expensive than just copying the part of the file that
would be scanned, and it’s far from obvious that this holds.
I have another, more down-to-earth proposal: ungraft more often! That’s
the spirit of the auto-ungraft manifest and jobset:
<https://issues.guix.gnu.org/74654>… but it doesn’t quite work as
expected because of ‘rust-ring’ shenanigans:
<https://lists.gnu.org/archive/html/guix-devel/2024-12/msg00113.html>.
(Making grafting faster would still be welcome, but I’d rather look for
a “local” optimization in the code itself.)
Ludo’.