all messages for Guix-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: "Ludovic Courtès" <ludo@gnu.org>
To: Maxime Devos <maximedevos@telenet.be>
Cc: 50384@debbugs.gnu.org
Subject: [bug#50384] [PATCH] Optimise search-patch (reducing I/O)
Date: Thu, 09 Sep 2021 16:51:07 +0200	[thread overview]
Message-ID: <87ee9xerac.fsf_-_@gnu.org> (raw)
In-Reply-To: <0ec7f0270fcccec730808f9210f074cd5339961f.camel@telenet.be> (Maxime Devos's message of "Sun, 05 Sep 2021 21:48:22 +0200")

Hello,

Maxime Devos <maximedevos@telenet.be> skribis:

>> To address this, ‘local-file’ could store the inode/mtime + computed
>> store file name (rather than the SHA256).  ‘local-file-compiler’ would
>> check whether the actual file has matching inode/mtime before returning
>> the computed store file name.  Problem is that inode/mtime are
>> guaranteed to differ once you’ve run “make install”.  :-/
>
> An additional problem is that 'local-file-compiler' would have to 'stat'
> the file even if it is already in the store, undoing the (fairly limited?)
> performance gains of this patch series.
>
> The dependency tracking avoids this.

OK.

>> Intuitively, I’d have imagined a cache populated at run time; it would
>> map, say, file name/inode/mtime to a store file name.  ‘add-to-store’
>> (or some wrapper above it) would check the cache and return the store
>> file name directly, unless ‘valid-path?’ says it no longer exists.
>> Downside is that this would be a per-user cache and you’d still pay the
>> cost until it’s warm.  Advantage is that you could easily tell whether
>> it’s stale.
>> 
>> Thoughts?
>
> Intuitively, I'd have imagined doing as much as possible at compilation time.

Of course, but it’s important for the caching model to match “reality”,
which is that patch files live independently of the source files that
refer to them.

I’d all be fine if ‘local-file’ were to inline file contents at
macro-expansion time, because then we could be sure the hash and
contents match (but I’m not saying we should do this…).

What we could do is have a boolean saying whether the cached value is
authoritative, similar to what’s in (gnu packages).  That way, when
using ./pre-inst-env or passing a -L flag or setting GUIX_PACKAGE_PATH,
the cached value would not be authoritative; we’d be safe, without
needing ad hoc dependency tracking.

Thoughts?

[...]

> Because fixed-output-path is now called more often, I've added a patch
> optimising (guix base32).

[...]

> From e5dc46800597023dfc1c9d53cc6e0db2f3999022 Mon Sep 17 00:00:00 2001
> From: Maxime Devos <maximedevos@telenet.be>
> Date: Sat, 4 Sep 2021 15:35:51 +0200
> Subject: [PATCH v2 3/9] gexp: Allow computing the hash of the local file in
>  advance.
>
> The new field is currently unused.  The following patches will
> populate and use the field to reduce the time-to-derivation
> when the file is already interned in the store.
>
> * guix/gexp.scm
>   (<local-file>): Add sha256 field.
>   (%local-file): Add sha256 argument for populating the field.
>   (local-file-compiler): Adjust 'match' expression.

[...]

> +;; repeated 'stat' calls.  Allow computing the hash of the file in advance,
> +;; to avoid having to send the file to the daemon when it is already interned
> +;; in the store.
>  (define-record-type <local-file>
> -  (%%local-file file absolute name recursive? select?)
> +  (%%local-file file absolute name sha256 recursive? select?)
>    local-file?
>    (file       local-file-file)                    ;string
>    (absolute   %local-file-absolute-file-name)     ;promise string
>    (name       local-file-name)                    ;string
> +  (sha256     local-file-sha256)                  ;sha256 bytevector | #f

Could we store the result of ‘fixed-output-path’ rather than the SHA256,
while we’re at it?

Again, care must be taken because it’s possible to set NIX_STORE_DIR at
run time, which may invalidate the pre-computed store file name.

Can we make hash/file name computation a feature of ‘local-file’ rather
than one of ‘search-patch’ as in these patches?  I’d rather not provide
a way to override this new field.

There are cases where we cannot know the value of ‘recursive?’ at
expansion time, for instance if the user wrote:

  (local-file "foo.txt" #:recursive? r)

In that case, we cannot compute the hash or file name.

Thanks,
Ludo’.




  parent reply	other threads:[~2021-09-09 14:53 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-04 21:17 [bug#50384] [PATCH] Optimise search-patch (reducing I/O) Maxime Devos
2021-09-04 21:47 ` Ludovic Courtès
2021-09-04 22:04 ` Ludovic Courtès
2021-09-05 19:48   ` [bug#50384] [PATCH v2] " Maxime Devos
2021-09-05 22:40     ` Maxime Devos
2021-09-06  8:39     ` zimoun
2021-09-06 10:06       ` Maxime Devos
2021-09-09 14:51     ` Ludovic Courtès [this message]
2021-09-21 16:55       ` [bug#50384] [PATCH v4] " Ludovic Courtès
2021-09-23 17:26         ` Maxime Devos
2021-09-27 16:17           ` Ludovic Courtès
2021-10-04 16:46             ` [bug#50384] [PATCH] " zimoun
2021-10-08  7:41               ` Ludovic Courtès
2021-10-11  8:09                 ` [bug#39258] bug#50384: " zimoun
2021-09-09 20:25 ` [bug#50384] [PATCH v3] " Maxime Devos
2021-09-10  9:54   ` bug#50384: " Maxime Devos

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87ee9xerac.fsf_-_@gnu.org \
    --to=ludo@gnu.org \
    --cc=50384@debbugs.gnu.org \
    --cc=maximedevos@telenet.be \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/guix.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.