unofficial mirror of guix-devel@gnu.org 
From: "Ludovic Courtès" <ludo@gnu.org>
To: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Cc: guix-devel <guix-devel@gnu.org>
Subject: Re: RFC: libgit2 is slow/inefficient; switch to git command?
Date: Wed, 23 Nov 2022 23:16:18 +0100
Message-ID: <87mt8he2q5.fsf@gnu.org>
In-Reply-To: <87cz9fpw4x.fsf@gmail.com> (Maxim Cournoyer's message of "Mon, 21 Nov 2022 21:21:02 -0500")

Hi,

Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:

> While attempting to bisect against the Linux kernel tree, the
> performance of libgit2 quickly became problematic, to the point where
> simply cloning the repo became a multi-hour affair, using upwards of
> 3 GiB of RAM for the clone and indexing of the objects (!)

Did you confirm with a pure Guile-Git snippet that calls ‘clone’ that
this is the behavior observed?

> Given that:
>
> * the git CLI doesn't suffer from such poor performance;
> * This kind of performance problem has been known for years in libgit2
>   [0] with no fix in sight;

This report talks about 5x wall-clock time, which is obviously not
great, but it doesn’t talk about memory usage, does it?

It talks about SHAttered though; that’s a key consideration to make sure
we’re doing an apples-to-apples comparison.

> * other projects such as Cargo support using the git CLI, and
>   projects are using it for that reason [1];

Should we follow Cargo’s lead for packaging as well?  :-)

> Would it make sense to switch to use the git command directly instead of
> calling into this libgit2 C library that ends up being slower?  It would
> provide a hefty speed-up when using 'guix refresh' or building new
> packages fetched from git without substitutes, or using 'git-checkout',
> etc.
>
> What do you think?

I think that’s not an option.  The level of integration we have in (guix
git), (guix channels), etc. is not achievable by shelling out to ‘git’.

"Philip McGrath" <philip@philipmcgrath.com> skribis:

> Along those lines, there’s an implementation of clone/checkout in pure Racket (for the package manager) that could probably be ported to Guile relatively easily. I’d expect libgit2 to be faster for the things that it supports, but the Racket implementation does support shallow checkout, so it might pay off if that skips a lot of work.
>
> Code: https://github.com/racket/racket/blob/master/racket/collects/net/git-checkout.rkt
> Docs: https://docs.racket-lang.org/net/git-checkout.html

That sounds like a worthy avenue; support for shallow clones would
already be an improvement.

> (More broadly, I haven’t investigated performance issues, but my basic inclination would be toward improving libgit2 over running the git executable.)

Same here.  The way I see it, we could gradually move bits of Guile-Git
to being pure Scheme.  So perhaps the first step would be to provide a
pure Scheme ‘clone’ based on the Racket code above?

Thanks,
Ludo’.


Thread overview: 8+ messages
2022-11-22  2:21 RFC: libgit2 is slow/inefficient; switch to git command? Maxim Cournoyer
2022-11-22 15:39 ` zimoun
2022-11-22 16:49   ` Philip McGrath
2022-11-22 17:51     ` Wojtek Kosior via Development of GNU Guix and the GNU System distribution.
2022-11-22 21:15       ` Phil
2022-11-23  9:57         ` zimoun
2022-11-23 22:04         ` Ludovic Courtès
2022-11-23 22:16 ` Ludovic Courtès [this message]
