unofficial mirror of guix-patches@gnu.org 
 help / color / mirror / code / Atom feed
From: Sarah Morgensen <iskarian@mgsn.dev>
To: Xinglu Chen <public@yoctocell.xyz>
Cc: 50359@debbugs.gnu.org
Subject: [bug#50359] [PATCH] import: Add 'generic-git' updater.
Date: Sat, 04 Sep 2021 17:19:40 -0700	[thread overview]
Message-ID: <86k0jvkh5v.fsf@mgsn.dev> (raw)
In-Reply-To: <e2bec21e5757204c45b8a06308b4ec183acabe86.1630683779.git.public@yoctocell.xyz>

[-- Attachment #1: Type: text/plain, Size: 9501 bytes --]

Hello,

Thanks for the patch!  Glad to see this idea becoming more polished.

Xinglu Chen <public@yoctocell.xyz> writes:

> * guix/import/git.scm: New file.
> * doc/guix.texi (Invoking guix refresh): Document it.
> * Makefile.am (MODULES): Register it.
> ---
> This patch adds a new ‘generic-git’ updater which can check for new tags
> for package hosted on Git repos.  However, it cannot download Git repos
> and update the package definitions, i.e. ‘guix refresh -u’.  There is a
> pending patch that would add this feature though[1].
>
> ‘guix refresh -L’ now reports
>
>   Available updaters:
>   […]
>   94.5% of the packages are covered by these updaters.
>
> We are getting close to 100% :-)

Wow, that is close!

>
> See it in action!
>
> $ ./pre-inst-env guix refresh harmonist scdoc gmnisrv
> gnu/packages/web.scm:7931:4: warning: no tags were found for package `gmnisrv'
> gnu/packages/web.scm:7931:4: warning: 'generic-git' updater failed to determine available releases for gmnisrv
> gnu/packages/man.scm:339:12: scdoc would be upgraded from 1.10.1 to 1.11.1
> gnu/packages/games.scm:9433:2: warning: failed to fetch Git repository for package `harmonist'
> gnu/packages/games.scm:9433:2: warning: 'generic-git' updater failed to determine available releases for harmonist

FWIW, harmonist and a few other packages fail to work because they use
an old git protocol which is not supported by libgit2.

[...]
> +
> +@itemize
> +@item @code{tag-prefix}: a regular expression for matching a prefix of
> +the tag name.
> +
> +@item @code{tag-suffix}: a regular expression for matching a suffix of
> +the tag name.
> +
> +@item @code{tag-version-delimiter}: a string used as the delimiter in
> +the tag name for separating the numbers of the version.
> +@end itemize
> +
> +@lisp
> +(package
> +  (name "foo")
> +  ;; ...
> +  (properties
> +    '((tag-prefix . "^release0-")
> +      (tag-suffix . "[a-z]?$")
> +      (tag-version-delimiter . ":"))))
> +@end lisp
             ^ extra whitespace

I do like the selection of (prefix, suffix, delimiter), though I think
there are only one or two packages which use a different delimiter.

[...]
> +;;; Errors & warnings
> +
> +(define-condition-type &git-tag-error &error
> +  git-tag-error?
> +  (kind git-tag-error-kind))
> +
> +(define (git-tag-error kind)
> +  (raise (condition (&message (message (format "bad `~a' property")))
> +                    (&git-tag-error
> +                     (kind kind)))))

When I trigger this error, I get:
--8<---------------cut here---------------start------------->8---
In ice-9/exceptions.scm:
   406:15  6 (latest-git-release _)
In ice-9/boot-9.scm:
  1752:10  5 (with-exception-handler _ _ #:unwind? _ # _)
In guix/import/git.scm:
    59:39  4 (get-version _ _ #:prefix _ #:suffix _ #:delim _)
In unknown file:
           3 (simple-format #f "bad `~a' property")
In ice-9/boot-9.scm:
  1685:16  2 (raise-exception _ #:continuable? _)
  1683:16  1 (raise-exception _ #:continuable? _)
  1685:16  0 (raise-exception _ #:continuable? _)

ice-9/boot-9.scm:1685:16: In procedure raise-exception:
In procedure simple-format: FORMAT: Missing argument for ~a
--8<---------------cut here---------------end--------------->8---

> +
> +(define (git-tag-warning package c)
> +  (warning (package-location package)
> +           (G_ "~a for package `~a'~%")
> +           (condition-message c)
> +           (package-name package)))
> +
> +(define-condition-type &git-no-tags-error &error
> +  git-no-tags-error?)
> +
> +(define (git-no-tags-error)
> +  (raise (condition (&message (message "no tags were found"))
> +                    (&git-no-tags-error))))
> +
> +(define (git-no-tags-warning package c)
> +  (warning (package-location package)
> +           (G_ "~a for package `~a'~%")
> +           (condition-message c)
> +           (package-name package)))
> +
> +(define (git-fetch-warning package)
> +  (warning (package-location package)
> +           (G_ "failed to fetch Git repository for package `~a'~%")
> +           (package-name package)))
> +
> +\f
> +;;; Helper functions
> +
> +(define (string-split* str delim)
> +  "Like `string-split', but DELIM is a string instead of a
> +char-set."
> +  (filter (lambda (str) (not (equal? str "")))
> +          (string-split str (string->char-set delim))))

  (string-split* "1:2.3" ":.") -> ("1" "2" "3")
  (string-split* "1a2b3" "ab") -> ("1" "2" "3")

Is this what you intended?  The documentation above makes it sound like
the whole string serves as the delimiter.

> +
> +(define* (get-version package tag #:key prefix suffix delim)

PACKAGE is not used by this procedure.

> +  (define delim* (if delim delim "."))
> +  (define prefix-regexp "^[^0-9]*")
> +  (define suffix-regexp (string-append "[^0-9" (regexp-quote delim*) "]*$"))

With a delimiter of '.', this would say the suffix of '1.2.3.prerelease'
is 'prerelease', not '.prerelease'.  Is this correct?  (I would be
tempted to just remove delim* from this.)

> +  (define delim-regexp (string-append "^[0-9]+" (regexp-quote delim*) "[0-9]+"))

This fails to account for versions which use non-numerics, such as (all
taken from the package-version field of packages using git-fetch and
which use this version as the tag):

1.0.0-beta.0
0.0.9.4f
4.4-git.1
5.2.0-alpha
0.2.0-alpha-199-g3e7a475
20200701.154658.b0d6223
12-068oasis4
4.0.0.dev8
0.32-14-gcdfe14e
2.8-fix-2

There are about 50-60 packages like this.

I'm not sure how much effort should be spent including them, and for
some of them I'm not sure what our ideal behavior *is*.  Even if we
could reliably detect them, should "alpha" or "dev" packages be returned
by the updater?

Upon investigation, there is a deeper problem: version-compare thinks
"5.2.0" is a lower version than "5.2.0-alpha", and that "4.0.0" is lower
than "4.0.0.dev8".

scheme@(guile-user)> (version-compare "5.1.9" "5.2.0")
$5 = <
scheme@(guile-user)> (version-compare "5.2.0" "5.2.0-alpha")
$6 = <
scheme@(guile-user)> (version-compare "4.0.0" "4.0.0.dev8")
$7 = <

> +
> +  (define no-prefix
> +    (let ((match (string-match (or prefix prefix-regexp) tag)))
> +      (if match
> +          (regexp-substitute #f match 'post)
> +          (git-tag-error 'tag-prefix))))
> +
> +  (define no-suffix
> +    (let ((match (string-match (or suffix suffix-regexp) no-prefix)))
> +      (if match
> +          (regexp-substitute #f match 'pre)
> +          (git-tag-error 'tag-suffix))))
> +
> +  (define no-delims
> +    (if (string-match delim-regexp no-suffix)
> +        (string-split* no-suffix delim*)
> +        (git-tag-error 'tag-version-delimiter)))

This throws an error if the version doesn't have any delimiter.

Actually, it throws an error in a lot of other cases too, often saying
the 'tag-version-delimiter is wrong when it's something else.  Consider
the tags from the "openjpeg" package, sorted by 'sort-tags':

arelease
opj0-97
start
v2.1.1
v2.1.2
v2.2.0
v2.3.0
v2.3.1
v2.4.0
version.1.0
version.1.1
version.1.2
version.1.3
version.1.4
version.1.5
version.1.5.1
version.1.5.2
version.2.0
version.2.0.1
version.2.1
wg1n6848

At first, 'get-version' throws an error because "wg1n6848" doesn't have
a delimiter. But even disregarding that, it would return "version.2.1"
-> "2.1" as the latest version.

Probably we should process all tags with 'get-version' (simply skipping
any that don't parse) and use that to sort the tags.  If none parse with
'get-version' we could use the "no tags" error or have a separate error
for "there were tags but we couldn't process them".

And this lets us just do something like (untested):

(define* (get-version tag #:key prefix suffix delim)
  (define delim-rx (regexp-quote (or delim ".")))
  (define prefix-rx (or prefix "[^[:digit:]]*"))
  (define suffix-rx (or suffix ".*"))
  (define version-char-rx
   (string-append "[^" delim-rx "[:punct:]]"))

  (define tag-rx
   (string-append "^" prefix "(" version-char-rx "+("
                  delim-rx version-char-rx ")*)" suffix-rx "$"))

  (and=> (string-match tag-rx tag)
         (cut match-substring <> 1)))

Though at this point, 'tag-rx' should probably be constructed and
compiled outside the loop.

> +
> +  (string-join no-delims "."))
> +
> +(define (sort-tags tags)
> +  "Sort TAGS, a list if Git tags, such that the latest tag is the last element."
> +  (sort tags (lambda (a b)
> +               (eq? (version-compare a b) '<))))
> +
> +\f
> +;;; Updater
> +
> +(define (get-remote url git-uri)
> +  "Given a URL and GIT-URI, a <git-reference> record, return the ``origin'' remote."
> +  (let* ((checkout (update-cached-checkout url
> +                                           #:recursive?
> +                                           (git-reference-recursive? git-uri)))
> +         (repository (repository-open checkout)))
> +    (remote-lookup repository "origin")))

We surely don't want 'update-cached-checkout' since that fetches the
whole repo history!  I've attached a patch below (based on top of this
one) which brings the total time-per-package to under 1s.  I moved it to
(guix git) to make use of 'with-libgit2' which ensures we use system
certificates.

Apologies for such a long reply. I hope it was helpful :)

--
Sarah


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: git: Add 'ls-remote-refs'. --]
[-- Type: text/x-patch, Size: 4741 bytes --]

From 0b0973034711e15b52702c0aec0c653dfd41928c Mon Sep 17 00:00:00 2001
Message-Id: <0b0973034711e15b52702c0aec0c653dfd41928c.1630800771.git.iskarian@mgsn.dev>
From: Sarah Morgensen <iskarian@mgsn.dev>
Date: Fri, 3 Sep 2021 22:40:02 -0700
Subject: [PATCH] git: Add 'ls-remote-refs'.

---
 guix/git.scm        | 33 +++++++++++++++++++++++++++++++
 guix/import/git.scm | 47 ++++++++++-----------------------------------
 2 files changed, 43 insertions(+), 37 deletions(-)

diff --git a/guix/git.scm b/guix/git.scm
index 9c6f326c36..b784fd6d20 100644
--- a/guix/git.scm
+++ b/guix/git.scm
@@ -56,6 +56,8 @@
             commit-difference
             commit-relation
 
+            ls-remote-refs
+
             git-checkout
             git-checkout?
             git-checkout-url
@@ -556,6 +558,37 @@ objects: 'ancestor (meaning that OLD is an ancestor of NEW), 'descendant, or
               (if (set-contains? oldest new)
                   'descendant
                   'unrelated))))))
+\f
+;;
+;;; Remote operations.
+;;;
+
+(define* (ls-remote-refs url #:key tags?)
+  "Return the list of references advertised at Git repository URL.  If TAGS?
+is true, limit to only refs/tags."
+  (define (ref? ref)
+    ;; Like `git ls-remote --refs', only show actual references.
+    (and (string-prefix? "refs/" ref)
+         (not (string-suffix? "^{}" ref))))
+
+  (define (tag? ref)
+    (string-prefix? "refs/tags/" ref))
+
+  (define (include? ref)
+    (and ref?
+         (or (not tags?) (tag? ref))))
+
+  (with-libgit2
+   (with-temporary-directory
+    (lambda (cache-directory)
+      (let* ((repository (repository-init cache-directory))
+             ;; Create an in-memory remote so we don't touch disk.
+             (remote (remote-create-anonymous repository url)))
+        (remote-connect remote)
+        (remote-disconnect remote)
+        (repository-close! repository)
+
+        (filter include? (map remote-head-name (remote-ls remote))))))))
 
 \f
 ;;;
diff --git a/guix/import/git.scm b/guix/import/git.scm
index 9a654c1972..097a2f70bc 100644
--- a/guix/import/git.scm
+++ b/guix/import/git.scm
@@ -17,7 +17,6 @@
 ;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.
 
 (define-module (guix import git)
-  #:use-module (git)
   #:use-module (guix build utils)
   #:use-module (guix diagnostics)
   #:use-module (guix git)
@@ -126,40 +125,15 @@ char-set."
 \f
 ;;; Updater
 
-(define (get-remote url git-uri)
-  "Given a URL and GIT-URI, a <git-reference> record, return the ``origin'' remote."
-  (let* ((checkout (update-cached-checkout url
-                                           #:recursive?
-                                           (git-reference-recursive? git-uri)))
-         (repository (repository-open checkout)))
-    (remote-lookup repository "origin")))
-
-(define (get-latest-tag remote)
-  "Given a Git REMOTE, return that latest tag available."
-  (remote-connect remote)
-
-  (define tags
-    (sort-tags
-     (map (lambda (tag)
-            (string-drop tag (string-length "refs/tags/")))
-          (filter (lambda (ref)
-                    ;; Every tag has two refs:
-                    ;;
-                    ;; * refs/tags/1.2.3^{}
-                    ;; * refs/tags/1.2.3
-                    ;;
-                    ;; remove the one with the trailing ^{}
-                    (and (not (string-suffix? "^{}" ref))
-                         (string-prefix? "refs/tags/" ref)))
-                  (map (lambda (remote-head)
-                         (remote-head-name remote-head))
-                       (remote-ls remote))))))
-
-  (remote-disconnect remote)
-
-  (if (null? tags)
-      (git-no-tags-error)
-      (last tags)))
+(define (get-latest-tag url)
+  "Return the latest tag available from the Git repository at URL."
+  (let ((tags (map (cut string-drop <> (string-length "refs/tags/"))
+                   (ls-remote-refs url #:tags? #t))))
+
+    (if (null? tags)
+        (git-no-tags-error)
+        (last (sort-tags tags)))))
+
 
 (define (latest-git-tag-version package tag-prefix tag-suffix
                                 tag-version-delimiter)
@@ -177,8 +151,7 @@ properties of PACKAGE, returns the latest version of PACKAGE."
     (let* ((source (package-source package))
            (git-uri (origin-uri source))
            (url (git-reference-url (origin-uri source)))
-           (remote (get-remote url git-uri))
-           (latest-tag (get-latest-tag remote)))
+           (latest-tag (get-latest-tag url)))
       (get-version package
                    latest-tag
                    #:prefix tag-prefix

base-commit: 522a3bf99cbc21a9093f63280b9508cd69b94ff0
prerequisite-patch-id: c60e771d96884a78a014e145723562a619c1a0e0
-- 
2.32.0


  reply	other threads:[~2021-09-05  0:20 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-03 15:50 [bug#50359] [PATCH] import: Add 'generic-git' updater Xinglu Chen
2021-09-05  0:19 ` Sarah Morgensen [this message]
2021-09-05  1:03   ` Sarah Morgensen
2021-09-05 10:36   ` Xinglu Chen
2021-09-06  5:40     ` Sarah Morgensen
2021-09-06 12:20       ` Xinglu Chen
2021-09-07  1:00         ` Sarah Morgensen
2021-09-07 19:13           ` Xinglu Chen
2021-09-08 18:28             ` Xinglu Chen
2021-09-10  8:36               ` Ludovic Courtès
2021-09-10 13:23                 ` Xinglu Chen
2021-09-05 13:11   ` Xinglu Chen
2021-09-06  3:14     ` Sarah Morgensen
2021-09-10 16:20 ` [bug#50359] [PATCH 0/3] " Xinglu Chen
2021-09-10 16:21   ` [bug#50359] [PATCH 1/3] tests: git: Don't read from the users global Git config file Xinglu Chen
2021-09-10 16:21   ` [bug#50359] [PATCH 2/3] tests: git: Make 'tag' directive non-interactive Xinglu Chen
2021-09-13  8:03     ` Ludovic Courtès
2021-09-10 16:21   ` [bug#50359] [PATCH 3/3] import: Add 'generic-git' updater Xinglu Chen
2021-09-13  8:07     ` Ludovic Courtès
2021-09-16  9:09     ` Sarah Morgensen
2021-09-16 12:48       ` Xinglu Chen
2021-09-16 23:42         ` Sarah Morgensen
2021-09-17  7:48           ` Xinglu Chen
2021-09-17  8:04   ` [bug#50359] [PATCH v3 0/3] " Xinglu Chen
2021-09-17  8:04     ` [bug#50359] [PATCH v3 1/3] tests: git: Don't read from the users global Git config file Xinglu Chen
2021-09-17  8:04     ` [bug#50359] [PATCH v3 2/3] tests: git: Make 'tag' directive non-interactive Xinglu Chen
2021-09-17  8:04     ` [bug#50359] [PATCH v3 3/3] import: Add 'generic-git' updater Xinglu Chen
2021-09-18 17:47     ` bug#50359: [PATCH v3 0/3] " Ludovic Courtès
2021-09-15  8:44 ` [bug#50359] [PATCH 3/3] import: " iskarian
2021-09-15 11:59   ` Xinglu Chen
2021-09-16  9:46     ` Sarah Morgensen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://guix.gnu.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=86k0jvkh5v.fsf@mgsn.dev \
    --to=iskarian@mgsn.dev \
    --cc=50359@debbugs.gnu.org \
    --cc=public@yoctocell.xyz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).