all messages for Guix-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: "Ludovic Courtès" <ludo@gnu.org>
To: 65720@debbugs.gnu.org
Subject: bug#65720: Guile-Git-managed checkouts grow way too much
Date: Tue, 19 Sep 2023 00:35:28 +0200	[thread overview]
Message-ID: <87jzsnf6tr.fsf@gnu.org> (raw)
In-Reply-To: <87bkejc7go.fsf@inria.fr> ("Ludovic Courtès"'s message of "Sun, 03 Sep 2023 22:44:39 +0200")

Ludovic Courtès <ludo@gnu.org> skribis:

> As reported by Tobias on IRC (in the context of ‘hpcguix-web’),
> checkouts managed by Guile-Git appear to grow beyond reason.  As an
> example, here’s the same ‘.git’ managed with Guile-Git and with Git:
>
> $ du -hs ~/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq
> 6.7G    /home/ludo/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq
> $ du -hs .git
> 517M    .git

More data…  The biggest file in that repo is a pack that was created
when that repo was first cloned (Aug. 2021):

--8<---------------cut here---------------start------------->8---
$ du /home/ludo/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq/.git/objects/pack/* |sort -k1 -n| tail -3
44272	/home/ludo/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq/.git/objects/pack/pack-3c2f1857501b01c321bc67ba1f30704deb9e18e9.pack
47272	/home/ludo/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq/.git/objects/pack/pack-30d5b35ad14a8398464e49e224811b162f673d66.pack
191492	/home/ludo/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq/.git/objects/pack/pack-d39507858782209d1ad87e389e4dffd4b6ff7ea2.pack
$ ls -l /home/ludo/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq/.git/objects/pack/pack-d39507858782209d1ad87e389e4dffd4b6ff7ea2.pack
-r--r--r-- 1 ludo users 196079671 Aug  9  2021 /home/ludo/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq/.git/objects/pack/pack-d39507858782209d1ad87e389e4dffd4b6ff7ea2.pack
$ ls -ld /home/ludo/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq/.git/config
-rw-r--r-- 1 ludo users 266 Aug  9  2021 /home/ludo/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq/.git/config
--8<---------------cut here---------------end--------------->8---

The pack starts with things from Aug. 2021:

--8<---------------cut here---------------start------------->8---
$ git show-index < pack-d39507858782209d1ad87e389e4dffd4b6ff7ea2.idx|sort -k1 -n|head -3
12 30289f4d4638452520f52c1a36240220d0d940ff (852d8cb3)
927 d7ffc535c52f49177a8e5553569cdb1e321b5bc6 (2007c5d0)
1800 0a379de3249d5e9ff66fb404f7e5aa8ce2cb3d24 (b1e69aa4)
$ git show 30289f4d4638452520f52c1a36240220d0d940ff
commit 30289f4d4638452520f52c1a36240220d0d940ff
Author: Milkey Mouse <milkeymouse@meme.institute>
Date:   Sun Aug 8 22:15:40 2021 -0700

[…]
--8<---------------cut here---------------end--------------->8---

… and at the bottom (large offsets) it contains very old blogs from the
Nix repo that somehow made it here.

I figured we still had a ‘nix’ branch from the early days, that contains
the history of Nix.  I’ve now removed it, which helps a bit:

--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> ,use(git)
scheme@(guile-user)> ,t (clone "https://git.savannah.gnu.org/git/guix.git" "/tmp/guix")
$5 = #<git-repository 91a7b0>
;; 600.534529s real time, 435.260926s run time.  0.000000s spent in GC.
scheme@(guile-user)> ,t (clone "https://git.savannah.gnu.org/git/guix.git" "/tmp/guix-after-removing-nix-branch")
$6 = #<git-repository 4465a50>
;; 420.321511s real time, 398.772963s run time.  0.000000s spent in GC.
--8<---------------cut here---------------end--------------->8---

… and more importantly:

--8<---------------cut here---------------start------------->8---
$ du -hs /tmp/guix/.git
373M	/tmp/guix/.git
$ du -hs /tmp/guix-after-removing-nix-branch/.git
362M	/tmp/guix-after-removing-nix-branch/.git
--8<---------------cut here---------------end--------------->8---

Anyway, what seems to happen is that every pull (every call to
‘remote-fetch’) creates a new pack (see ‘git_fetch_download_pack’ in
libgit2), which becomes inefficient in the long run (lots of small
poorly-compressed packs).  That’s at least one possible explanation.

To be continued…

Ludo’.




  parent reply	other threads:[~2023-09-18 22:36 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-03 20:44 bug#65720: Guile-Git-managed checkouts grow way too much Ludovic Courtès
2023-09-04 21:47 ` Ludovic Courtès
2023-09-05  8:18   ` Josselin Poiret via Bug reports for GNU Guix
2023-09-05 14:18     ` Ludovic Courtès
2023-09-06  8:04       ` Josselin Poiret via Bug reports for GNU Guix
2023-09-08 17:08         ` Ludovic Courtès
2023-09-11  7:00           ` Csepp
2023-09-11  8:42           ` bug#65720: Digression about Git implementations (was Re: bug#65720: Guile-Git-managed checkouts grow way too much) Simon Tournier
2023-09-11 14:42           ` bug#65720: Guile-Git-managed checkouts grow way too much wolf
2023-09-13 18:10             ` Ludovic Courtès
2023-09-13 22:36               ` Simon Tournier
2023-09-07  0:41       ` Simon Tournier
2023-09-08 17:09         ` Ludovic Courtès
2023-09-09 10:31           ` Simon Tournier
2023-09-11  7:06             ` Csepp
2023-09-11 14:37       ` Ludovic Courtès
2023-10-20 16:15         ` bug#65720: [PATCH] git: Shell out to ‘git gc’ when necessary Ludovic Courtès
2023-10-23 10:08           ` Simon Tournier
2023-10-23 22:27             ` Tobias Geerinckx-Rice via Bug reports for GNU Guix
2023-10-23 23:28               ` bug#65720: Guile-Git-managed checkouts grow way too much Simon Tournier
2023-10-30 12:02           ` bug#65720: [bug#66650] [PATCH] git: Shell out to ‘git gc’ when necessary Christopher Baines
2023-11-14  9:19             ` Ludovic Courtès
2023-11-14  9:32               ` Simon Tournier
2023-11-16 12:12                 ` [bug#66650] " Ludovic Courtès
2023-11-16 13:24                   ` Simon Tournier
2023-11-22 11:17                     ` bug#65720: " Ludovic Courtès
2023-11-22 11:57                       ` [bug#66650] bug#65720: Guile-Git-managed checkouts grow way too much Simon Tournier
2023-11-22 11:57                         ` Simon Tournier
2023-11-22 16:00                         ` bug#66650: " Ludovic Courtès
2023-09-05  8:22   ` Jelle Licht
2023-09-05 14:20     ` Ludovic Courtès
2023-09-05 18:59   ` Simon Tournier
2023-09-05 14:11 ` Ludovic Courtès
2023-09-18 22:35 ` Ludovic Courtès [this message]
2023-09-19  7:19   ` Simon Tournier
2023-11-23 11:35 ` Ludovic Courtès

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87jzsnf6tr.fsf@gnu.org \
    --to=ludo@gnu.org \
    --cc=65720@debbugs.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/guix.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.