all messages for Guix-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: Maxime Devos <maximedevos@telenet.be>
To: 50384@debbugs.gnu.org
Subject: [bug#50384] [PATCH] Optimise search-patch (reducing I/O)
Date: Sat, 04 Sep 2021 23:17:10 +0200	[thread overview]
Message-ID: <8900fa8c8eef7f72fc97adc2408be26c88de7803.camel@telenet.be> (raw)


[-- Attachment #1.1: Type: text/plain, Size: 4791 bytes --]

Hi guix!

The attached patch series optimises the G-exp compiler for
local-file, by avoiding interning the file if it's already
in the store, but only if the hash is known in advance.
To take advantage of this, 'search-patch' has been modified
to compute the hash at expansion time.

The cost of this optimisation is a little additional
complexity, and computing derivations theoretically becomes
a little more expensive when the patch isn't already in
the store (1 call to 'stat' per non-yet-interned patch).

If you want to test this patch series for performance,
do _not_ run ./pre-inst-env guix, instead use
"guix pull --url=$PWD --branch=...", because the guix
from the checkout performs more 'lstat' calls than
an ‘user’ guix from "guix pull".

I'll show the patch series decreases the number of syscalls
below, using the 'strace' command.  Use 'strace -c' to gather
some statistics:

# Run it twice for a hot cache.  Ignore the output of the first run.
$ strace -c ./the-optimised-guix/bin/guix build -d pigx --no-grafts
$ strace -c ./the-optimised-guix/bin/guix build -d pigx --no-grafts
#
$ strace -c guix build -d pigx --no-grafts
$ strace -c guix build -d pigx --no-grafts

I've selected some syscalls from the output that seemed relevant
and formatted the call count in a table

              optimised         unoptimised     result of optimisation:
         stat  3865             3712            + 4.1%
         lstat 119              321             -62.9%
         fstat 59               59              unchanged
         read  17303            17688           - 2.2%
        write  6741             6767            - 1.9%
       openat  885              1076            -17.8%
       readlink 14              16              -12.5%
       -------
       total    28886           32539           -11.2%

Almost all syscalls are now called less (-11.2% in total),
which is good.  The exception is 'stat'.

Because 'search-path' is now being called less often
(only when the patch isn't in the store), the number
of 'stat' calls decreases.  However, 'local-file-compiler'
now calls 'stat' more (one or two times per patch).  I think
it's worth it though, because:

   (1) the second 'stat' is on the same file as the first 'stat',
       so presumably the kernel has cached the result, so no need
       to wait for I/O to complete the second time (there's a context
       switch though).  So ignoring the context switch cost,
       there are only ‘effectively’ +2.1% extra calls to 'stat'.

   (2) the total decrease of -11.2% syscalls

Now, what about the actual "time to derivation"?
First, let's time "guix build -d pigx --no-grafts" to get some raw numbers
on guix before the optimisation:

     time guix build -d pigx --no-grafts
     # repeated four times, first output is discarded
     # to eliminate hot/cold cache differences
     /gnu/store/03vmq94ckxfx6c4rc9zh745yy63n5i5m-pigx-0.0.3.drv
     real       0m13,470s
     user       0m13,526s
     sys        0m0,573s
     /gnu/store/03vmq94ckxfx6c4rc9zh745yy63n5i5m-pigx-0.0.3.drv
     real       0m13,582s
     user       0m13,639s
     sys        0m0,568s
     /gnu/store/03vmq94ckxfx6c4rc9zh745yy63n5i5m-pigx-0.0.3.drv
     real       0m13,834s
     user       0m13,901s
     sys        0m0,556s

Average numbers:
     real      0m13,629s
     user      0m13,689s
     sys       0m0,566s

After the optimisation:
      time ./the-optimised-guix/bin/guix build -d pigx --no-grafts
      /gnu/store/fq6x8d2vcm6sbjkimg7g8kcgb4c5xv1b-pigx-0.0.3.drv
      real      0m14,150s
      user      0m13,979s
      sys       0m0,685s
      /gnu/store/fq6x8d2vcm6sbjkimg7g8kcgb4c5xv1b-pigx-0.0.3.drv
      real      0m13,781s
      user      0m13,697s
      sys       0m0,580s
      /gnu/store/fq6x8d2vcm6sbjkimg7g8kcgb4c5xv1b-pigx-0.0.3.drv
      real      0m14,247s
      user      0m14,160s
      sys       0m0,548s

The numbers are higher somehow after the optimisations?
Even the 'sys' time is higher, even though there are less syscalls?
I re-ran the time commands, and got a decrease in 'real' time this time.

  /gnu/store/fq6x8d2vcm6sbjkimg7g8kcgb4c5xv1b-pigx-0.0.3.drv
  real  0m13,304s
  user  0m13,146s
  sys   0m0,609s
  /gnu/store/fq6x8d2vcm6sbjkimg7g8kcgb4c5xv1b-pigx-0.0.3.drv
  real  0m12,132s
  user  0m11,940s
  sys   0m0,589s
  /gnu/store/fq6x8d2vcm6sbjkimg7g8kcgb4c5xv1b-pigx-0.0.3.drv
  real  0m13,716s
  user  0m13,723s
  sys   0m0,529s

The output of "time ..." seems inconclusive
(can possibly be attributed to things like CPU frequency changing?),
but the decrease in syscall counts seems quite nice to me.

Feel free to run your own tests!

Greetings,
Maxime.

[-- Attachment #1.2: 0001-build-self-Implement-basic-hash-algorithm.patch --]
[-- Type: text/x-patch, Size: 2299 bytes --]

From a8e24a5258aa05689bcafa70af071da5296f63a4 Mon Sep 17 00:00:00 2001
From: Maxime Devos <maximedevos@telenet.be>
Date: Sat, 4 Sep 2021 20:09:03 +0200
Subject: [PATCH 1/6] build-self: Implement basic 'hash-algorithm'.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The module (guix hash) used from 'search-patch' in a future
patch needs it to be properly defined when (guix hash) is being
compiled.  'search-patch' is used when the derivation of Guix is
being computed, so it is important to avoid the ‘wrong type to
apply: #<syntax-transformer hash-algorithm>’ error.

* build-aux/build-self.scm
  (build-program)[fake-gcrypt-hash]: Define hash-algorithm for sha1
  and sha256.
---
 build-aux/build-self.scm | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/build-aux/build-self.scm b/build-aux/build-self.scm
index 3a2d13cc09..2c13d9d530 100644
--- a/build-aux/build-self.scm
+++ b/build-aux/build-self.scm
@@ -259,8 +259,17 @@ interface (FFI) of Guile.")
   (define fake-gcrypt-hash
     ;; Fake (gcrypt hash) module; see below.
     (scheme-file "hash.scm"
-                 #~(define-module (gcrypt hash)
-                     #:export (sha1 sha256))))
+                 #~(begin
+                     (define-module (gcrypt hash)
+                       #:export (sha1 sha256 hash-algorithm))
+                     ;; Avoid ‘Wrong type to apply:
+                     ;; #<syntax-transformer hash-algorithm>’ errors.
+                     (define sha1)
+                     (define sha256)
+                     (define-syntax hash-algorithm
+                       (syntax-rules (sha1 sha256)
+                         ((_ sha1) 2)
+                         ((_ sha256) 8))))))
 
   (define fake-git
     (scheme-file "git.scm" #~(define-module (git))))

base-commit: b4d132f98e03fae559db832e88897f1e166c4d47
prerequisite-patch-id: 91a26ba19372112a11a0eea2b066d2f63641deb1
prerequisite-patch-id: a535c1ae2a1fbf75d7ac9a3118ed23bd4fa03ecc
prerequisite-patch-id: 29eba0cede1c1e7153a7c7b9a58b33b67f693a13
prerequisite-patch-id: 8dd2234fa0f867081c6cf614c7a22b00022702b4
prerequisite-patch-id: 2fe0e5c67a37ef3f0e22813c9808eaeec83bb552
prerequisite-patch-id: 91514568f1ef4870ad7ed7b3f685f04703f9c090
-- 
2.33.0


[-- Attachment #1.3: 0002-guix-hash-Extract-file-hashing-procedures.patch --]
[-- Type: text/x-patch, Size: 5153 bytes --]

From 919a0375781ff0fab9e74dbafc9b1f8989808a3b Mon Sep 17 00:00:00 2001
From: Sarah Morgensen <iskarian@mgsn.dev>
Date: Sun, 15 Aug 2021 16:25:24 -0700
Subject: [PATCH 2/6] guix hash: Extract file hashing procedures.

* guix/scripts/hash.scm (guix-hash)[vcs-file?, file-hash]: Extract logic
to...
* guix/hash.scm: ...here. New file.
---
 guix/hash.scm         | 51 +++++++++++++++++++++++++++++++++++++++++++
 guix/scripts/hash.scm | 29 ++++++------------------
 2 files changed, 58 insertions(+), 22 deletions(-)
 create mode 100644 guix/hash.scm

diff --git a/guix/hash.scm b/guix/hash.scm
new file mode 100644
index 0000000000..8c2ab8187f
--- /dev/null
+++ b/guix/hash.scm
@@ -0,0 +1,51 @@
+;;; GNU Guix --- Functional package management for GNU
+;;; Copyright © 2021 Sarah Morgensen <iskarian@mgsn.dev>
+;;;
+;;; This file is part of GNU Guix.
+;;;
+;;; GNU Guix is free software; you can redistribute it and/or modify it
+;;; under the terms of the GNU General Public License as published by
+;;; the Free Software Foundation; either version 3 of the License, or (at
+;;; your option) any later version.
+;;;
+;;; GNU Guix is distributed in the hope that it will be useful, but
+;;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;;; GNU General Public License for more details.
+;;;
+;;; You should have received a copy of the GNU General Public License
+;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.
+
+(define-module (guix hash)
+  #:use-module (gcrypt hash)
+  #:use-module (guix serialization)
+  #:use-module (srfi srfi-1)
+  #:use-module (srfi srfi-11)
+  #:export (vcs-file?
+            file-hash*))
+
+(define (vcs-file? file stat)
+  "Returns true if FILE is a version control system file."
+  (case (stat:type stat)
+    ((directory)
+     (member (basename file) '(".bzr" ".git" ".hg" ".svn" "CVS")))
+    ((regular)
+     ;; Git sub-modules have a '.git' file that is a regular text file.
+     (string=? (basename file) ".git"))
+    (else
+     #f)))
+
+(define* (file-hash* file #:key
+                     (algorithm (hash-algorithm sha256))
+                     (recursive? #t)
+                     (select? (negate vcs-file?)))
+  "Compute the hash of FILE with ALGORITHM.  If RECURSIVE? is true, recurse
+into subdirectories of FILE, computing the combined hash of all files for
+which (SELECT?  FILE STAT) returns true."
+  (if recursive?
+      (let-values (((port get-hash)
+                    (open-hash-port algorithm)))
+        (write-file file port #:select? select?)
+        (force-output port)
+        (get-hash))
+      (file-hash algorithm file)))
diff --git a/guix/scripts/hash.scm b/guix/scripts/hash.scm
index b8622373cc..353ca30c2c 100644
--- a/guix/scripts/hash.scm
+++ b/guix/scripts/hash.scm
@@ -3,6 +3,7 @@
 ;;; Copyright © 2013 Nikita Karetnikov <nikita@karetnikov.org>
 ;;; Copyright © 2016 Jan Nieuwenhuizen <janneke@gnu.org>
 ;;; Copyright © 2018 Tim Gesthuizen <tim.gesthuizen@yahoo.de>
+;;; Copyright © 2021 Sarah Morgensen <iskarian@mgsn.dev>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -23,6 +24,7 @@
   #:use-module (gcrypt hash)
   #:use-module (guix serialization)
   #:use-module (guix ui)
+  #:use-module (guix hash)
   #:use-module (guix scripts)
   #:use-module (guix base16)
   #:use-module (guix base32)
@@ -125,16 +127,6 @@ and 'base16' ('hex' and 'hexadecimal' can be used as well).\n"))
     (parse-command-line args %options (list %default-options)
                         #:build-options? #f))
 
-  (define (vcs-file? file stat)
-    (case (stat:type stat)
-      ((directory)
-       (member (basename file) '(".bzr" ".git" ".hg" ".svn" "CVS")))
-      ((regular)
-       ;; Git sub-modules have a '.git' file that is a regular text file.
-       (string=? (basename file) ".git"))
-      (else
-       #f)))
-
   (let* ((opts (parse-options))
          (args (filter-map (match-lambda
                             (('argument . value)
@@ -150,18 +142,11 @@ and 'base16' ('hex' and 'hexadecimal' can be used as well).\n"))
       ;; Compute the hash of FILE.
       ;; Catch and gracefully report possible '&nar-error' conditions.
       (with-error-handling
-        (if (assoc-ref opts 'recursive?)
-            (let-values (((port get-hash)
-                          (open-hash-port (assoc-ref opts 'hash-algorithm))))
-              (write-file file port #:select? select?)
-              (force-output port)
-              (get-hash))
-            (match file
-              ("-" (port-hash (assoc-ref opts 'hash-algorithm)
-                              (current-input-port)))
-              (_   (call-with-input-file file
-                     (cute port-hash (assoc-ref opts 'hash-algorithm)
-                           <>)))))))
+        (match file
+          ("-" (port-hash (assoc-ref opts 'hash-algorithm)
+                          (current-input-port)))
+          (_   (file-hash* #:algorithm (assoc-ref opts 'hash-algorithm)
+                           #:recursive? (assoc-ref opts 'recursive?))))))
 
     (match args
       ((file)
-- 
2.33.0


[-- Attachment #1.4: 0003-gexp-Allow-computing-the-hash-of-the-local-file-in-a.patch --]
[-- Type: text/x-patch, Size: 3314 bytes --]

From cc54e1c5021119bfaba07849e83ea31f7099970e Mon Sep 17 00:00:00 2001
From: Maxime Devos <maximedevos@telenet.be>
Date: Sat, 4 Sep 2021 15:35:51 +0200
Subject: [PATCH 3/6] gexp: Allow computing the hash of the local file in
 advance.

The new field is currently unused.  The following patches will
populate and use the field to reduce the time-to-derivation
when the file is already interned in the store.

* guix/gexp.scm
  (<local-file>): Add sha256 field.
  (%local-file): Add sha256 argument for populating the field.
  (local-file-compiler): Adjust 'match' expression.
---
 guix/gexp.scm | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/guix/gexp.scm b/guix/gexp.scm
index f3d278b3e6..a633984688 100644
--- a/guix/gexp.scm
+++ b/guix/gexp.scm
@@ -419,13 +419,16 @@ Here TARGET is bound to the cross-compilation triplet or #f."
 ;; A local file name.  FILE is the file name the user entered, which can be a
 ;; relative file name, and ABSOLUTE is a promise that computes its canonical
 ;; absolute file name.  We keep it in a promise to compute it lazily and avoid
-;; repeated 'stat' calls.
+;; repeated 'stat' calls.  Allow computing the hash of the file in advance,
+;; to avoid having to send the file to the daemon when it is already interned
+;; in the store.
 (define-record-type <local-file>
-  (%%local-file file absolute name recursive? select?)
+  (%%local-file file absolute name sha256 recursive? select?)
   local-file?
   (file       local-file-file)                    ;string
   (absolute   %local-file-absolute-file-name)     ;promise string
   (name       local-file-name)                    ;string
+  (sha256     local-file-sha256)                  ;sha256 bytevector | #f
   (recursive? local-file-recursive?)              ;Boolean
   (select?    local-file-select?))                ;string stat -> Boolean
 
@@ -434,6 +437,7 @@ Here TARGET is bound to the cross-compilation triplet or #f."
 (define* (%local-file file promise #:optional (name (basename file))
                       #:key
                       (literal? #t) location
+                      sha256
                       recursive? (select? true))
   ;; This intermediate procedure is part of our ABI, but the underlying
   ;; %%LOCAL-FILE is not.
@@ -441,7 +445,7 @@ Here TARGET is bound to the cross-compilation triplet or #f."
     (warning (and=> location source-properties->location)
              (G_ "resolving '~a' relative to current directory~%")
              file))
-  (%%local-file file promise name recursive? select?))
+  (%%local-file file promise name sha256 recursive? select?))
 
 (define (absolute-file-name file directory)
   "Return the canonical absolute file name for FILE, which lives in the
@@ -517,7 +521,7 @@ appears."
 (define-gexp-compiler (local-file-compiler (file <local-file>) system target)
   ;; "Compile" FILE by adding it to the store.
   (match file
-    (($ <local-file> file (= force absolute) name recursive? select?)
+    (($ <local-file> file (= force absolute) name sha256 recursive? select?)
      ;; Canonicalize FILE so that if it's a symlink, it is resolved.  Failing
      ;; to do that, when RECURSIVE? is #t, we could end up creating a dangling
      ;; symlink in the store, and when RECURSIVE? is #f 'add-to-store' would
-- 
2.33.0


[-- Attachment #1.5: 0004-gexp-Allow-overriding-the-absolute-file-name.patch --]
[-- Type: text/x-patch, Size: 2450 bytes --]

From 1937edd906b817dd15648fa682d55d3b3f779e45 Mon Sep 17 00:00:00 2001
From: Maxime Devos <maximedevos@telenet.be>
Date: Sat, 4 Sep 2021 16:25:22 +0200
Subject: [PATCH 4/6] gexp: Allow overriding the absolute file name.

This will be used by the next patch to implement search-patch in
terms of local-file.

* guix/gexp.scm
  (precanonicalized-file-name): New macro.
  (local-file): Use the absolute file name from precanonicalized-file-name
  when available.
---
 guix/gexp.scm | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/guix/gexp.scm b/guix/gexp.scm
index a633984688..c69e4aa299 100644
--- a/guix/gexp.scm
+++ b/guix/gexp.scm
@@ -51,6 +51,7 @@
             gexp-input-output
             gexp-input-native?
 
+            precanonicalized-file-name
             assume-valid-file-name
             local-file
             local-file?
@@ -463,6 +464,12 @@ the given file name is valid, even if it's not a string literal, and thus not
 warn about it."
   file)
 
+(define-syntax-rule (precanonicalized-file-name file absolute)
+  "This is a syntactic keyword to tell 'local-file' that it can assume that
+the given file name FILE has ABSOLUTE as absolute file name and 'local-file'
+does not need to compute the absolute file name by itself."
+  absolute)
+
 (define-syntax local-file
   (lambda (s)
     "Return an object representing local file FILE to add to the store; this
@@ -481,7 +488,7 @@ where FILE is the entry's absolute file name and STAT is the result of
 This is the declarative counterpart of the 'interned-file' monadic procedure.
 It is implemented as a macro to capture the current source directory where it
 appears."
-    (syntax-case s (assume-valid-file-name)
+    (syntax-case s (assume-valid-file-name precanonicalized-file-name)
       ((_ file rest ...)
        (string? (syntax->datum #'file))
        ;; FILE is a literal, so resolve it relative to the source directory.
@@ -495,6 +502,9 @@ appears."
        #'(%local-file file
                       (delay (absolute-file-name file (getcwd)))
                       rest ...))
+      ((_ (precanonicalized-file-name file absolute) rest ...)
+       ;; Use the given file name ABSOLUTE as absolute file name.
+       #'(%local-file file (delay absolute) rest ...))
       ((_ file rest ...)
        ;; Resolve FILE relative to the current directory.
        (with-syntax ((location (datum->syntax s (syntax-source s))))
-- 
2.33.0


[-- Attachment #1.6: 0005-packages-Compute-the-hash-of-patches-in-advance-when.patch --]
[-- Type: text/x-patch, Size: 9373 bytes --]

From e3b14fdf63e78a504a4f6e8a6ed85d5f8b08acb7 Mon Sep 17 00:00:00 2001
From: Maxime Devos <maximedevos@telenet.be>
Date: Sat, 4 Sep 2021 17:25:58 +0200
Subject: [PATCH 5/6] packages: Compute the hash of patches in advance when
 possible.

* gnu/packages.scm
  (search-patch): Rename to ...
  (%search-patch): ... this.
  (try-search-patch): New procedure, extracted from ...
  (%search-patch): ... this procedure.
  (%local-patch-file): New procedure.
  (true): New procedure.
  (search-patch): New macro, behaving like %search-patch, but computing the
  hash at expansion time when possible.
* gnu/packages/chromium.scm
  (%guix-patches): Use search-patches instead of local-file +
  assume-valid-file-name + search-patch.
* gnu/packages/gnuzilla.scm
  (icecat-source)[gnuzilla-fixes-patch]: Use search-patch instead of
  local-file + assule-valid-file-name + search-patch.
  (icecat-source)[makeicecat-patch]: Likewise.
* gnu/packages/embedded.scm
  (gcc-arm-none-eabi-4.9)[source]{patches}: Expect patches to be
  local-file objects instead of strings.
  of strings.
* guix/lint.scm (check-patch-file-names): Allow local-file objects.
---
 gnu/packages.scm          | 42 +++++++++++++++++++++++++++++++++++++--
 gnu/packages/chromium.scm |  4 +---
 gnu/packages/embedded.scm |  3 ++-
 gnu/packages/gnuzilla.scm |  8 ++------
 guix/lint.scm             | 28 ++++++++++++++++----------
 5 files changed, 62 insertions(+), 23 deletions(-)

diff --git a/gnu/packages.scm b/gnu/packages.scm
index ccfc83dd11..f5552e5a9b 100644
--- a/gnu/packages.scm
+++ b/gnu/packages.scm
@@ -4,6 +4,7 @@
 ;;; Copyright © 2014 Eric Bavier <bavier@member.fsf.org>
 ;;; Copyright © 2016, 2017 Alex Kost <alezost@gmail.com>
 ;;; Copyright © 2016 Mathieu Lirzin <mthl@gnu.org>
+;;; Copyright © 2021 Maxime Devos <maximedevos@telenet.be>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -21,11 +22,13 @@
 ;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.
 
 (define-module (gnu packages)
+  #:use-module (guix gexp)
   #:use-module (guix packages)
   #:use-module (guix ui)
   #:use-module (guix utils)
   #:use-module (guix diagnostics)
   #:use-module (guix discovery)
+  #:use-module (guix hash)
   #:use-module (guix memoization)
   #:use-module ((guix build utils)
                 #:select ((package-name->name+version
@@ -90,12 +93,47 @@
   "Search the auxiliary FILE-NAME.  Return #f if not found."
   (search-path (%auxiliary-files-path) file-name))
 
-(define (search-patch file-name)
+(define (try-search-patch file-name)
+  "Search the patch FILE-NAME.  Return #f if not found."
+  (search-path (%patch-path) file-name))
+
+(define (%search-patch file-name)
   "Search the patch FILE-NAME.  Raise an error if not found."
-  (or (search-path (%patch-path) file-name)
+  (or (try-search-patch file-name)
       (raise (formatted-message (G_ "~a: patch not found")
                                 file-name))))
 
+(define (%local-patch-file file-name hash)
+  "Search the patch FILE-NAME, which is known to have HASH."
+  (local-file (precanonicalized-file-name file-name (%search-patch file-name))
+              #:sha256 hash #:recursive? #t))
+
+(define true (const #t))
+
+(define-syntax search-patch
+  (lambda (s)
+    "Search the patch FILE-NAME and compute its hash at expansion time
+if possible.  Return #f if not found."
+    (syntax-case s ()
+      ((_ file-name)
+       (string? (syntax->datum #'file-name))
+       ;; FILE-NAME is a constant string, so the hash can be computed
+       ;; in advance.
+       (let ((patch (try-search-patch (syntax->datum #'file-name))))
+         (if patch
+             #`(%local-patch-file file-name #,(file-hash* patch #:select? true))
+             (begin
+               (warning (source-properties->location
+                         (syntax-source #'file-name))
+                        (G_ "~a: patch not found at expansion time")
+                        (syntax->datum #'ile-name))
+               #'(%search-patch file-name)))))
+      ;; FILE-NAME is variable, so the hash cannot be pre-computed.
+      ((_ file-name) #'(%search-patch file-name))
+      ;; search-patch is being used used in a construct like
+      ;; (map search-patch ...).
+      (id (identifier? #'id) #'%search-patch))))
+
 (define-syntax-rule (search-patches file-name ...)
   "Return the list of absolute file names corresponding to each
 FILE-NAME found in %PATCH-PATH."
diff --git a/gnu/packages/chromium.scm b/gnu/packages/chromium.scm
index 26ae1e2550..cf419cf41b 100644
--- a/gnu/packages/chromium.scm
+++ b/gnu/packages/chromium.scm
@@ -351,9 +351,7 @@
       "0wbcbjzh5ak4nciahqw4yvxc4x8ik4x0iz9h4kfy0m011sxzy174"))))
 
 (define %guix-patches
-  (list (local-file
-         (assume-valid-file-name
-          (search-patch "ungoogled-chromium-extension-search-path.patch")))))
+  (search-patches "ungoogled-chromium-extension-search-path.patch"))
 
 ;; This is a source 'snippet' that does the following:
 ;; *) Applies various patches for unbundling purposes and libstdc++ compatibility.
diff --git a/gnu/packages/embedded.scm b/gnu/packages/embedded.scm
index f388c11c3d..826f5655c3 100644
--- a/gnu/packages/embedded.scm
+++ b/gnu/packages/embedded.scm
@@ -30,6 +30,7 @@
   #:use-module (guix packages)
   #:use-module (guix download)
   #:use-module (guix svn-download)
+  #:use-module (guix gexp)
   #:use-module (guix git-download)
   #:use-module ((guix licenses) #:prefix license:)
   #:use-module (guix build-system cmake)
@@ -91,7 +92,7 @@
          ;; Remove the one patch that doesn't apply to this 4.9 snapshot (the
          ;; patch is for 4.9.4 and later but this svn snapshot is older).
          (patches (remove (lambda (patch)
-                            (string=? (basename patch)
+                            (string=? (local-file-name patch)
                                       "gcc-arm-bug-71399.patch"))
                           (origin-patches (package-source xgcc))))))
       (native-inputs
diff --git a/gnu/packages/gnuzilla.scm b/gnu/packages/gnuzilla.scm
index 576bc2586f..be674dce8f 100644
--- a/gnu/packages/gnuzilla.scm
+++ b/gnu/packages/gnuzilla.scm
@@ -736,14 +736,10 @@ from forcing GEXP-PROMISE."
              (base32
               "00ws3540x5whpicc5fx4k949ff73cqvajz6jp13ahn49wqdads47"))))
 
-         ;; 'search-patch' returns either a valid file name or #f, so wrap it
-         ;; in 'assume-valid-file-name' to avoid 'local-file' warnings.
          (gnuzilla-fixes-patch
-          (local-file (assume-valid-file-name
-                       (search-patch "icecat-use-older-reveal-hidden-html.patch"))))
+          (search-patch "icecat-use-older-reveal-hidden-html.patch"))
          (makeicecat-patch
-          (local-file (assume-valid-file-name
-                       (search-patch "icecat-makeicecat.patch")))))
+          (search-patch "icecat-makeicecat.patch")))
 
     (origin
       (method computed-origin-method)
diff --git a/guix/lint.scm b/guix/lint.scm
index 3a7f3be327..b0a2fbc327 100644
--- a/guix/lint.scm
+++ b/guix/lint.scm
@@ -46,6 +46,7 @@
                                 gexp->approximate-sexp))
   #:use-module (guix licenses)
   #:use-module (guix records)
+  #:use-module (guix gexp)
   #:use-module (guix grafts)
   #:use-module (guix upstream)
   #:use-module (guix utils)
@@ -928,6 +929,8 @@ patch could not be found."
                    (starts-with-package-name? (basename patch)))
                   ((? origin? patch)
                    (starts-with-package-name? (origin-actual-file-name patch)))
+                  ((? local-file? patch)
+                   (starts-with-package-name? (local-file-name patch)))
                   (_  #f))     ;must be some other file-like object
                 patches)
          '()
@@ -941,19 +944,22 @@ patch could not be found."
      (let ((prefix (string-length (%distro-directory)))
            (margin (string-length "guix-2.0.0rc3-10000-1234567890/"))
            (max    99))
+       (define (test-patch-name file-name)
+         (if (> (+ margin (if (string-prefix? (%distro-directory) file-name)
+                              (- (string-length file-name) prefix)
+                              (string-length file-name)))
+                max)
+             (make-warning
+              package
+              (G_ "~a: file name is too long")
+              (list (basename file-name))
+              #:field 'patch-file-names)
+             #f))
        (filter-map (match-lambda
                      ((? string? patch)
-                      (if (> (+ margin (if (string-prefix? (%distro-directory)
-                                                           patch)
-                                           (- (string-length patch) prefix)
-                                           (string-length patch)))
-                             max)
-                          (make-warning
-                           package
-                           (G_ "~a: file name is too long")
-                           (list (basename patch))
-                           #:field 'patch-file-names)
-                          #f))
+                      (test-patch-name patch))
+                     ((? local-file? patch)
+                      (test-patch-name (local-file-absolute-file-name patch)))
                      (_ #f))
                    patches)))))
 
-- 
2.33.0


[-- Attachment #1.7: 0006-gexp-Do-not-intern-if-the-file-is-already-in-the-sto.patch --]
[-- Type: text/x-patch, Size: 3059 bytes --]

From 0fc54bdd9ccc9729fff54f5935a552e5e608a1d0 Mon Sep 17 00:00:00 2001
From: Maxime Devos <maximedevos@telenet.be>
Date: Sat, 4 Sep 2021 18:10:32 +0200
Subject: [PATCH 6/6] gexp: Do not intern if the file is already in the store.

* guix/gexp.scm (local-file-compiler): When the file is already in the
  store, re-use the fixed output path instead of interning the file
  again.
---
 guix/gexp.scm | 38 +++++++++++++++++++++++++++++++-------
 1 file changed, 31 insertions(+), 7 deletions(-)

diff --git a/guix/gexp.scm b/guix/gexp.scm
index c69e4aa299..da1e918801 100644
--- a/guix/gexp.scm
+++ b/guix/gexp.scm
@@ -531,13 +531,37 @@ appears."
 (define-gexp-compiler (local-file-compiler (file <local-file>) system target)
   ;; "Compile" FILE by adding it to the store.
   (match file
-    (($ <local-file> file (= force absolute) name sha256 recursive? select?)
-     ;; Canonicalize FILE so that if it's a symlink, it is resolved.  Failing
-     ;; to do that, when RECURSIVE? is #t, we could end up creating a dangling
-     ;; symlink in the store, and when RECURSIVE? is #f 'add-to-store' would
-     ;; just throw an error, both of which are inconvenient.
-     (interned-file absolute name
-                    #:recursive? recursive? #:select? select?))))
+    ;; Delay computing the absolute file name until 'intern', as this
+    ;; might be a relatively expensive computation (e.g. if search-patch
+    ;; is used), especially on a spinning disk.
+    (($ <local-file> file absolute-promise name sha256 recursive? select?)
+     (let ()
+       (define (intern)
+         ;; Canonicalize FILE so that if it's a symlink, it is resolved.
+         ;; Failing to do that, when RECURSIVE? is #t, we could end up creating
+         ;; a dangling symlink in the store, and when RECURSIVE? is #f
+         ;; 'add-to-store' would just throw an error, both of which are
+         ;; inconvenient.
+         (interned-file (force absolute-promise) name
+                        #:recursive? recursive? #:select? select?))
+       (if sha256
+           (let ((path (fixed-output-path name sha256 #:recursive? recursive?)))
+             ;; If the hash is known in advance and the store already has the
+             ;; item, there is no need to intern the file.
+             (if (file-exists? path)
+                 (mbegin %store-monad
+                   ;; Tell the GC that PATH will be used, such that it won't
+                   ;; be deleted.
+                   ((store-lift add-temp-root) path)
+                   ;; The GC could have deleted the item before add-temp-root
+                   ;; completed, so check again if PATH exists.
+                   (if (file-exists? path)
+                       (return path)
+                       ;; If it has been removed, fall-back interning.
+                       (intern)))
+                 ;; If PATH does not yet exist, fall back to interning.
+                 (intern)))
+           (intern))))))
 
 (define-record-type <plain-file>
   (%plain-file name content references)
-- 
2.33.0


[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 260 bytes --]

             reply	other threads:[~2021-09-04 21:18 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-04 21:17 Maxime Devos [this message]
2021-09-04 21:47 ` [bug#50384] [PATCH] Optimise search-patch (reducing I/O) Ludovic Courtès
2021-09-04 22:04 ` Ludovic Courtès
2021-09-05 19:48   ` [bug#50384] [PATCH v2] " Maxime Devos
2021-09-05 22:40     ` Maxime Devos
2021-09-06  8:39     ` zimoun
2021-09-06 10:06       ` Maxime Devos
2021-09-09 14:51     ` [bug#50384] [PATCH] " Ludovic Courtès
2021-09-21 16:55       ` [bug#50384] [PATCH v4] " Ludovic Courtès
2021-09-23 17:26         ` Maxime Devos
2021-09-27 16:17           ` Ludovic Courtès
2021-10-04 16:46             ` [bug#50384] [PATCH] " zimoun
2021-10-08  7:41               ` Ludovic Courtès
2021-10-11  8:09                 ` [bug#39258] bug#50384: " zimoun
2021-09-09 20:25 ` [bug#50384] [PATCH v3] " Maxime Devos
2021-09-10  9:54   ` bug#50384: " Maxime Devos

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8900fa8c8eef7f72fc97adc2408be26c88de7803.camel@telenet.be \
    --to=maximedevos@telenet.be \
    --cc=50384@debbugs.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/guix.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.