all messages for Guix-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: Simon Tournier <zimon.toutoune@gmail.com>
To: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Cc: guix-devel@gnu.org, "Ludovic Courtès" <ludo@gnu.org>
Subject: comparing commit-relation using Scheme+libgit2 vs shellout plumbing Git
Date: Tue, 12 Sep 2023 00:48:30 +0200	[thread overview]
Message-ID: <865y4gz5q9.fsf@gmail.com> (raw)
In-Reply-To: <87msxswoph.fsf@gmail.com>

Hi,

On Mon, 11 Sep 2023 at 14:26, Maxim Cournoyer <maxim.cournoyer@gmail.com> wrote:

> In the grand scheme of things (pun intended), we'd like every
> programming to be feasible via nice Scheme APIs, which is what Guile-Git
> provides to work with git repositories.  The appeal is to have a single
> language to rule them all, reducing friction among Guix contributors.
> The alternative here is to have an API reduced to invoking system
> commands with string arguments, which is less expressive and lacks
> elegance.

As Maxim noticed in the message that I am proposing to revisit, it seems
that libgit2 comes with some performance penalties.  As wolf is
illustrating in the message:

        bug#65720: Guile-Git-managed checkouts grow way too much
        wolf <wolf@wolfsden.cz>
        Mon, 11 Sep 2023 16:42:59 +0200
        id:ZP8nc1m8rN_34XV-@ws
        https://issues.guix.gnu.org//65720
        https://issues.guix.gnu.org/msgid/ZP8nc1m8rN_34XV-@ws
        https://yhetil.org/guix/ZP8nc1m8rN_34XV-@ws

it might be possible to use an invocation of plain Git command which is
much faster in this case.  Well, that’s need to be investigated, IMHO.

For instance, instead of the current ’commit-relation’ implementation,

        (define (commit-relation old new)
          "Return a symbol denoting the relation between OLD and NEW, two commit
        objects: 'ancestor (meaning that OLD is an ancestor of NEW), 'descendant, or
        'unrelated, or 'self (OLD and NEW are the same commit)."
          (if (eq? old new)
              'self
              (let ((newest (commit-closure new)))
                (if (set-contains? newest old)
                    'ancestor
                    (let* ((seen   (list->setq (commit-parents new)))
                           (oldest (commit-closure old seen)))
                      (if (set-contains? oldest new)
                          'descendant
                          'unrelated))))))

which relies on ’commit-closure’, they propose to use a plumbing Git
commands, as:

        (define (shelling-commit-relation old new)
          (let ((h-old (oid->string (commit-id old)))
                (h-new (oid->string (commit-id new))))
            (cond ((eq? old new)
                   'self)
                  ((zero? (git-C %repo "merge-base" "--is-ancestor" h-old h-new))
                   'ancestor)
                  ((zero? (git-C %repo "merge-base" "--is-ancestor" h-new h-old))
                   'descendant)
                  (else
                   'unrelated))))

Well, this needs to be checked (read the Git documentation which is
probable harder than read some Scheme implementation ;-)) in order to
see if these invocations are doing the same.


>> I’m quite confident this would be slow
>
> My version is ~2000x faster compared to (guix git):
>
>     Guix: 1048.620992ms
>     Git:  0.532143ms

On my machine, I get something less spectacular for a history with 1000
commits in between.

--8<---------------cut here---------------start------------->8---
scheme@(guix-user)> ,time (commit-relation* 1000th newest)
$1 = ancestor
;; 0.128948s real time, 0.082921s run time.  0.046578s spent in GC.
scheme@(guix-user)> ,time (commit-relation 1000th newest)
$2 = ancestor
;; 4.588075s real time, 5.521358s run time.  1.404764s spent in GC.
--8<---------------cut here---------------end--------------->8---

I did something very similar as wolf is proposing and named it
’commit-relation*’.

Well, considering the implementation of ’commit-relation’, I think the
slowness is expected compared to the plain plumbing Git command.
Basically, ’commit-closure’ walks the Git history and for sure the loop
cannot be as efficient as an optimized Git specific implementation.

Hum, I think the most annoying is the time spent in GC.  Basically,
’commit-closure’ is building a set with many visited elements and that
set must be garbage collected.  And this GC time is not nothing compared
to the whole time, IMHO.

I agree with the grand scheme of things and that’s why I started this
thread. :-) However, for what it is worth, today I am less convinced
that manipulating libgit2 is able to provide “not-so-worse” performance
compared to what plain plumbing Git commands could offer.

Cheers,
simon


  reply	other threads:[~2023-09-11 22:51 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-11 15:17 hard dependency on Git? (was bug#65866: [PATCH 0/8] Add built-in builder for Git checkouts) Simon Tournier
2023-09-11 17:51 ` wolf
2023-09-11 18:26   ` Maxim Cournoyer
2023-09-11 22:48     ` Simon Tournier [this message]
2023-09-12 11:07       ` comparing commit-relation using Scheme+libgit2 vs shellout plumbing Git Attila Lendvai
2023-09-14 10:30       ` Ludovic Courtès
2023-09-14 11:56         ` Simon Tournier
2023-09-11 17:52 ` hard dependency on Git? (was bug#65866: [PATCH 0/8] Add built-in builder for Git checkouts) Simon Tournier
2023-09-11 18:20 ` Maxim Cournoyer
2023-09-12  9:06   ` Josselin Poiret
2023-09-12 12:56     ` Maxim Cournoyer
2023-09-12 14:08       ` wolf
2023-09-14 10:22     ` Ludovic Courtès
2023-09-14 16:51   ` Ludovic Courtès
2023-09-14 17:28     ` Simon Tournier
2023-09-17  2:16     ` Maxim Cournoyer
2023-09-18 13:56       ` [bug#65866] " Ludovic Courtès
2023-09-18 14:45         ` Simon Tournier
2023-09-19 14:43           ` bug#65866: [PATCH 0/8] Add built-in builder for Git checkouts Ludovic Courtès
2023-09-19 17:09             ` Simon Tournier
2023-09-11 19:35 ` hard dependency on Git? (was bug#65866: [PATCH 0/8] Add built-in builder for Git checkouts) Vagrant Cascadian
2023-09-11 21:23   ` Csepp
2023-09-12  7:44   ` Simon Tournier

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=865y4gz5q9.fsf@gmail.com \
    --to=zimon.toutoune@gmail.com \
    --cc=guix-devel@gnu.org \
    --cc=ludo@gnu.org \
    --cc=maxim.cournoyer@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/guix.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.