unofficial mirror of bug-guix@gnu.org 
 help / color / mirror / code / Atom feed
From: wolf <wolf@wolfsden.cz>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: Josselin Poiret <dev@jpoiret.xyz>, 65720@debbugs.gnu.org
Subject: bug#65720: Guile-Git-managed checkouts grow way too much
Date: Mon, 11 Sep 2023 16:42:59 +0200	[thread overview]
Message-ID: <ZP8nc1m8rN_34XV-@ws> (raw)
In-Reply-To: <87pm2s385m.fsf@gnu.org>


[-- Attachment #1.1: Type: text/plain, Size: 2814 bytes --]

On 2023-09-08 19:08:05 +0200, Ludovic Courtès wrote:
> Hello!
> 
> Josselin Poiret <dev@jpoiret.xyz> skribis:
> 
> > Right, although I wouldn't necessarily say that the former doesn't have
> > a proper API, but rather that it has a Unix-oriented API.  That leads to
> > performance issues on e.g. Windows but on Linux I'm not sure there's
> > much of a difference.
> 
> [...]
> 
> > We could consider replacing the guile-git dependency with another
> > library built directly on top of git-minimal, and have this be a
> > dependency of Guix.  Not ideal though, and not really scalable either:
> > we can't just add every VCS as direct dependencies.
> 
> I cannot imagine a viable implementation of things like ‘commit-closure’
> and ‘commit-relation’ from (guix git) done by shelling out to ‘git’.

I am sure I must be missing some part of the contract of the function, but at
least the commit-relation seems fairly straightforward:

    (define (shelling-commit-relation old new)
      (let ((h-old (oid->string (commit-id old)))
            (h-new (oid->string (commit-id new))))
        (cond ((eq? old new)
               'self)
              ((zero? (git-C %repo "merge-base" "--is-ancestor" h-old h-new))
               'ancestor)
              ((zero? (git-C %repo "merge-base" "--is-ancestor" h-new h-old))
               'descendant)
              (else
               'unrelated))))

I would argue it is even somewhat more readable than the current implementation.

> I’m quite confident this would be slow

My version is ~2000x faster compared to (guix git):

    Guix: 1048.620992ms
    Git:  0.532143ms

Again, I am sure I must have miss something, either in the implementation or in
the measurements, because it is pretty hard to believe there is so much room for
improvement.

The full script I used is attached to this email.

> and brittle.

In general git plumbing command are design to have stable CLI interface in order
to be usable in scripting.  So I am not sure where the brittleness would come
from.

> 
> It looks like there’s no option other than carrying the two
> implementations.

Assuming I made no mistake (hard to believe), it is probably worth exploring the
feasibility of just shelling out to the git binary some more.

> 
> ~~~
> 
> Years ago, Andy Wingo sketched a plan for GNU hackers to implement Git
> in pure Scheme.  That was on April 1st though, so people mistakenly
> assumed it was a joke and the project was never carried out.
> 
> I digress, but I wonder: is there not even a viable Haskell or OCaml
> implementation of Git?
> 
> Thanks,
> Ludo’.
>

W.

-- 
There are only two hard things in Computer Science:
cache invalidation, naming things and off-by-one errors.

[-- Attachment #1.2: test.scm --]
[-- Type: text/plain, Size: 1986 bytes --]

#!/bin/sh
# -*-scheme-*-
exec guile -s "$0" "$@"
!#

(use-modules (git)
             (guix git))

(define %repo "/tmp/guix-fork")

(define h1 "72745172d155e489936f694d6b9013cb76272370")
(define h2 "6d60d7ccba5a8e06c17d55a1772fa7f4529b5eff")
(define h3 "c3db650680f995f0556d3ddce567cdc1c33e4603")

;;; r has to still be defined when the commit-relation is called.  There is *no*
;;; error, but it always returns 'unrelated.  Quite a footgun.
(define r (repository-open %repo))
(define c1 (commit-lookup r (string->oid h1)))
(define c2 (commit-lookup r (string->oid h2)))
(define c3 (commit-lookup r (string->oid h3)))

(define (git-C dir . args)
  (apply system* "git" "-C" dir args))

(define (shelling-commit-relation old new)
  (let ((h-old (oid->string (commit-id old)))
        (h-new (oid->string (commit-id new))))
    (cond ((eq? old new)
           'self)
          ;; In real code, git-C should probably return #t (for 0), #f (for 1)
          ;; or raise (for anything else).
          ((zero? (git-C %repo "merge-base" "--is-ancestor" h-old h-new))
           'ancestor)
          ((zero? (git-C %repo "merge-base" "--is-ancestor" h-new h-old))
           'descendant)
          (else
           'unrelated))))

;;; Make sure it actually works.
(let ((tests `((,c1 . ,c1)
               (,c1 . ,c2)
               (,c2 . ,c1)
               (,c1 . ,c3))))
  (for-each (λ (c)
              (format #t "Guix: ~a\nGit:  ~a\n\n"
                      (commit-relation (car c) (cdr c))
                      (shelling-commit-relation (car c) (cdr c))))
            tests))

(define (time proc)
  (let* ((start (get-internal-run-time))
         (_ (proc))
         (end   (get-internal-run-time)))
    (exact->inexact (* 1000 (/ (- end start) internal-time-units-per-second)))))

(format #t "Guix: ~ams\nGit:  ~ams\n"
        (time (λ () (commit-relation c1 c2)))
        (time (λ () (shelling-commit-relation c1 c2))))

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  parent reply	other threads:[~2023-09-11 14:44 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-03 20:44 bug#65720: Guile-Git-managed checkouts grow way too much Ludovic Courtès
2023-09-04 21:47 ` Ludovic Courtès
2023-09-05  8:18   ` Josselin Poiret via Bug reports for GNU Guix
2023-09-05 14:18     ` Ludovic Courtès
2023-09-06  8:04       ` Josselin Poiret via Bug reports for GNU Guix
2023-09-08 17:08         ` Ludovic Courtès
2023-09-11  7:00           ` Csepp
2023-09-11  8:42           ` bug#65720: Digression about Git implementations (was Re: bug#65720: Guile-Git-managed checkouts grow way too much) Simon Tournier
2023-09-11 14:42           ` wolf [this message]
2023-09-13 18:10             ` bug#65720: Guile-Git-managed checkouts grow way too much Ludovic Courtès
2023-09-13 22:36               ` Simon Tournier
2023-09-07  0:41       ` Simon Tournier
2023-09-08 17:09         ` Ludovic Courtès
2023-09-09 10:31           ` Simon Tournier
2023-09-11  7:06             ` Csepp
2023-09-11 14:37       ` Ludovic Courtès
2023-10-20 16:15         ` bug#65720: [PATCH] git: Shell out to ‘git gc’ when necessary Ludovic Courtès
2023-10-23 10:08           ` Simon Tournier
2023-10-23 22:27             ` Tobias Geerinckx-Rice via Bug reports for GNU Guix
2023-10-23 23:28               ` bug#65720: Guile-Git-managed checkouts grow way too much Simon Tournier
2023-10-30 12:02           ` bug#65720: [bug#66650] [PATCH] git: Shell out to ‘git gc’ when necessary Christopher Baines
2023-11-14  9:19             ` Ludovic Courtès
2023-11-14  9:32               ` Simon Tournier
     [not found]                 ` <87h6ll28yh.fsf@gnu.org>
     [not found]                   ` <CAJ3okZ2-W_Me-Gao44+LeKGCm7dhb8VkLfC2doL4NE9VO88HYg@mail.gmail.com>
2023-11-22 11:17                     ` bug#65720: [bug#66650] " Ludovic Courtès
2023-11-22 11:57                       ` bug#65720: Guile-Git-managed checkouts grow way too much Simon Tournier
2023-09-05  8:22   ` Jelle Licht
2023-09-05 14:20     ` Ludovic Courtès
2023-09-05 18:59   ` Simon Tournier
2023-09-05 14:11 ` Ludovic Courtès
2023-09-18 22:35 ` Ludovic Courtès
2023-09-19  7:19   ` Simon Tournier
2023-11-23 11:35 ` Ludovic Courtès

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://guix.gnu.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZP8nc1m8rN_34XV-@ws \
    --to=wolf@wolfsden.cz \
    --cc=65720@debbugs.gnu.org \
    --cc=dev@jpoiret.xyz \
    --cc=ludo@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).