From: "Ludovic Courtès" <ludo@gnu.org>
To: guix-patches@gnu.org
Cc: "Ludovic Courtès" <ludo@gnu.org>,
65720@debbugs.gnu.org, "Josselin Poiret" <dev@jpoiret.xyz>,
"Simon Tournier" <zimon.toutoune@gmail.com>,
"Christopher Baines" <guix@cbaines.net>,
"Mathieu Othacehe" <othacehe@gnu.org>,
"Ricardo Wurmus" <rekado@elephly.net>,
"Tobias Geerinckx-Rice" <me@tobias.gr>
Subject: bug#65720: [PATCH] git: Shell out to ‘git gc’ when necessary.
Date: Fri, 20 Oct 2023 18:15:12 +0200
Message-ID: <f588bb38b4b9fdaff29dd8af8c62aa3c55902f7c.1697818202.git.ludo@gnu.org>
In-Reply-To: <87jzswsrlt.fsf@gnu.org>

Fixes <https://issues.guix.gnu.org/65720>.

This fixes a bug whereby libgit2-managed checkouts would keep growing as
we fetch.

* guix/git.scm (packs-in-git-repository, maybe-run-git-gc): New
procedures.
(update-cached-checkout): Use it.
---
guix/git.scm | 39 ++++++++++++++++++++++++++++++++++++---
1 file changed, 36 insertions(+), 3 deletions(-)

Hi!

This is a radical fix/workaround for the unbounded Git checkout growth
problem, shelling out to ‘git gc’ when it’s likely needed (“too many”
pack files around).

I thought we might be able to implement a ‘git gc’ approximation using
the libgit2 “packbuilder” interface, but I haven’t got around to doing
it: <https://libgit2.org/libgit2/#HEAD/search/pack>.

Once again, shelling out is not my favorite option, but it’s a bug we
should fix sooner rather than later, hence this compromise.

Thoughts?

Ludo’.

diff --git a/guix/git.scm b/guix/git.scm
index b7182305cf..d704b62333 100644
--- a/guix/git.scm
+++ b/guix/git.scm
@@ -1,6 +1,6 @@
 ;;; GNU Guix --- Functional package management for GNU
 ;;; Copyright © 2017, 2020 Mathieu Othacehe <m.othacehe@gmail.com>
-;;; Copyright © 2018-2022 Ludovic Courtès <ludo@gnu.org>
+;;; Copyright © 2018-2023 Ludovic Courtès <ludo@gnu.org>
 ;;; Copyright © 2021 Kyle Meyer <kyle@kyleam.com>
 ;;; Copyright © 2021 Marius Bakke <marius@gnu.org>
 ;;; Copyright © 2022 Maxime Devos <maximedevos@telenet.be>
@@ -29,15 +29,16 @@ (define-module (guix git)
   #:use-module (guix cache)
   #:use-module (gcrypt hash)
   #:use-module ((guix build utils)
-                #:select (mkdir-p delete-file-recursively))
+                #:select (mkdir-p delete-file-recursively invoke/quiet))
   #:use-module (guix store)
   #:use-module (guix utils)
   #:use-module (guix records)
   #:use-module (guix gexp)
   #:autoload (guix git-download)
   (git-reference-url git-reference-commit git-reference-recursive?)
+  #:autoload (guix config) (%git)
   #:use-module (guix sets)
-  #:use-module ((guix diagnostics) #:select (leave warning))
+  #:use-module ((guix diagnostics) #:select (leave warning info))
   #:use-module (guix progress)
   #:autoload (guix swh) (swh-download commit-id?)
   #:use-module (rnrs bytevectors)
@@ -428,6 +429,35 @@ (define (delete-checkout directory)
     (rename-file directory trashed)
     (delete-file-recursively trashed)))
 
+(define (packs-in-git-repository directory)
+  "Return the number of pack files under DIRECTORY, a Git checkout."
+  (catch 'system-error
+    (lambda ()
+      (let ((directory (opendir (in-vicinity directory ".git/objects/pack"))))
+        (let loop ((count 0))
+          (match (readdir directory)
+            ((? eof-object?)
+             (closedir directory)
+             count)
+            (str
+             (loop (if (string-suffix? ".pack" str)
+                       (+ 1 count)
+                       count)))))))
+    (const 0)))
+
+(define (maybe-run-git-gc directory)
+  "Run 'git gc' in DIRECTORY if needed."
+  ;; XXX: As of libgit2 1.3.x (used by Guile-Git), there's no support for GC.
+  ;; Each time a checkout is pulled, a new pack is created, which eventually
+  ;; takes up a lot of space (lots of small, poorly-compressed packs). As a
+  ;; workaround, shell out to 'git gc' when the number of packs in a
+  ;; repository has become "too large", potentially wasting a lot of space.
+  ;; See <https://issues.guix.gnu.org/65720>.
+  (when (> (packs-in-git-repository directory) 25)
+    (info (G_ "compressing cached Git repository at '~a'...~%")
+          directory)
+    (invoke/quiet %git "-C" directory "gc")))
+
 (define* (update-cached-checkout url
                                  #:key
                                  (ref '())
@@ -515,6 +545,9 @@ (define* (update-cached-checkout url
                    seconds seconds
                    nanoseconds nanoseconds))))
 
+      ;; Run 'git gc' if needed.
+      (maybe-run-git-gc cache-directory)
+
       ;; When CACHE-DIRECTORY is a sub-directory of the default cache
       ;; directory, remove expired checkouts that are next to it.
       (let ((parent (dirname cache-directory)))
base-commit: 6b0a32196982a0a2f4dbb59d35e55833a5545ac6
--
2.41.0
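
For anyone who wants to try the pack-count heuristic outside of Guix, here is a minimal standalone sketch in plain Guile. It is not the code from the patch: it swaps the `opendir'/`readdir' loop for `scandir' from (ice-9 ftw), and it assumes a `git' executable on $PATH instead of `%git' from (guix config). The names `%max-pack-count', `count-pack-files', `gc-needed?' and `maybe-gc!' are made up for this example; only the threshold of 25 and the `.git/objects/pack' location come from the patch.

```scheme
(use-modules (ice-9 ftw))               ;for 'scandir'

;; Same threshold as in the patch; the variable name is hypothetical.
(define %max-pack-count 25)

(define (count-pack-files checkout)
  "Return the number of '.pack' files in the object store of CHECKOUT, or 0
if the pack directory cannot be read."
  (let ((packs (scandir (in-vicinity checkout ".git/objects/pack")
                        (lambda (name)
                          (string-suffix? ".pack" name)))))
    ;; 'scandir' returns #f when the directory is missing or unreadable.
    (if packs (length packs) 0)))

(define (gc-needed? checkout)
  "Return true if CHECKOUT holds more than %MAX-PACK-COUNT pack files."
  (> (count-pack-files checkout) %max-pack-count))

(define (maybe-gc! checkout)
  "Run 'git gc' in CHECKOUT when it holds too many packs.  Return #t if
'git gc' ran and exited successfully, #f otherwise."
  (and (gc-needed? checkout)
       ;; Assumption: 'git' is found on $PATH.
       (zero? (system* "git" "-C" checkout "gc"))))
```

Calling (maybe-gc! "/path/to/checkout") on a non-bare checkout does nothing until more than 25 packs have piled up, which mirrors the condition in `maybe-run-git-gc' above. Like the patch, the sketch treats an unreadable pack directory as "zero packs" rather than as an error.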