unofficial mirror of guix-patches@gnu.org 
 help / color / mirror / code / Atom feed
From: Xinglu Chen <public@yoctocell.xyz>
To: Sarah Morgensen <iskarian@mgsn.dev>
Cc: 50359@debbugs.gnu.org
Subject: [bug#50359] [PATCH] import: Add 'generic-git' updater.
Date: Sun, 05 Sep 2021 12:36:24 +0200	[thread overview]
Message-ID: <87h7ez48d3.fsf@yoctocell.xyz> (raw)
In-Reply-To: <86k0jvkh5v.fsf@mgsn.dev>

[-- Attachment #1: Type: text/plain, Size: 10852 bytes --]

On Sat, Sep 04 2021, Sarah Morgensen wrote:

> Hello,
>
> Thanks for the patch!  Glad to see this idea becoming more polished.
>
> Xinglu Chen <public@yoctocell.xyz> writes:
>
>> * guix/import/git.scm: New file.
>> * doc/guix.texi (Invoking guix refresh): Document it.
>> * Makefile.am (MODULES): Register it.
>> ---
>> This patch adds a new ‘generic-git’ updater which can check for new tags
>> for package hosted on Git repos.  However, it cannot download Git repos
>> and update the package definitions, i.e. ‘guix refresh -u’.  There is a
>> pending patch that would add this feature though[1].
>>
>> ‘guix refresh -L’ now reports
>>
>>   Available updaters:
>>   […]
>>   94.5% of the packages are covered by these updaters.
>>
>> We are getting close to 100% :-)
>
> Wow, that is close!
>
>>
>> See it in action!
>>
>> $ ./pre-inst-env guix refresh harmonist scdoc gmnisrv
>> gnu/packages/web.scm:7931:4: warning: no tags were found for package `gmnisrv'
>> gnu/packages/web.scm:7931:4: warning: 'generic-git' updater failed to determine available releases for gmnisrv
>> gnu/packages/man.scm:339:12: scdoc would be upgraded from 1.10.1 to 1.11.1
>> gnu/packages/games.scm:9433:2: warning: failed to fetch Git repository for package `harmonist'
>> gnu/packages/games.scm:9433:2: warning: 'generic-git' updater failed to determine available releases for harmonist
>
> FWIW, harmonist and a few other packages fail to work because they use
> an old git protocol which is not supported by libgit2.
>
> [...]
>> +
>> +@itemize
>> +@item @code{tag-prefix}: a regular expression for matching a prefix of
>> +the tag name.
>> +
>> +@item @code{tag-suffix}: a regular expression for matching a suffix of
>> +the tag name.
>> +
>> +@item @code{tag-version-delimiter}: a string used as the delimiter in
>> +the tag name for separating the numbers of the version.
>> +@end itemize
>> +
>> +@lisp
>> +(package
>> +  (name "foo")
>> +  ;; ...
>> +  (properties
>> +    '((tag-prefix . "^release0-")
>> +      (tag-suffix . "[a-z]?$")
>> +      (tag-version-delimiter . ":"))))
>> +@end lisp
>              ^ extra whitespace
>
> I do like the selection of (prefix, suffix, delimiter), though I think
> there are only one or two packages which use a different delimiter.
>
> [...]
>> +;;; Errors & warnings
>> +
>> +(define-condition-type &git-tag-error &error
>> +  git-tag-error?
>> +  (kind git-tag-error-kind))
>> +
>> +(define (git-tag-error kind)
>> +  (raise (condition (&message (message (format "bad `~a' property")))
>> +                    (&git-tag-error
>> +                     (kind kind)))))
>
> When I trigger this error, I get:
> --8<---------------cut here---------------start------------->8---
> In ice-9/exceptions.scm:
>    406:15  6 (latest-git-release _)
> In ice-9/boot-9.scm:
>   1752:10  5 (with-exception-handler _ _ #:unwind? _ # _)
> In guix/import/git.scm:
>     59:39  4 (get-version _ _ #:prefix _ #:suffix _ #:delim _)
> In unknown file:
>            3 (simple-format #f "bad `~a' property")
> In ice-9/boot-9.scm:
>   1685:16  2 (raise-exception _ #:continuable? _)
>   1683:16  1 (raise-exception _ #:continuable? _)
>   1685:16  0 (raise-exception _ #:continuable? _)
>
> ice-9/boot-9.scm:1685:16: In procedure raise-exception:
> In procedure simple-format: FORMAT: Missing argument for ~a
> --8<---------------cut here---------------end--------------->8---

Oops, it should be

  (format "bad `~a' property" kind)
  
>> +
>> +(define (git-tag-warning package c)
>> +  (warning (package-location package)
>> +           (G_ "~a for package `~a'~%")
>> +           (condition-message c)
>> +           (package-name package)))
>> +
>> +(define-condition-type &git-no-tags-error &error
>> +  git-no-tags-error?)
>> +
>> +(define (git-no-tags-error)
>> +  (raise (condition (&message (message "no tags were found"))
>> +                    (&git-no-tags-error))))
>> +
>> +(define (git-no-tags-warning package c)
>> +  (warning (package-location package)
>> +           (G_ "~a for package `~a'~%")
>> +           (condition-message c)
>> +           (package-name package)))
>> +
>> +(define (git-fetch-warning package)
>> +  (warning (package-location package)
>> +           (G_ "failed to fetch Git repository for package `~a'~%")
>> +           (package-name package)))
>> +
>> +\f
>> +;;; Helper functions
>> +
>> +(define (string-split* str delim)
>> +  "Like `string-split', but DELIM is a string instead of a
>> +char-set."
>> +  (filter (lambda (str) (not (equal? str "")))
>> +          (string-split str (string->char-set delim))))
>
>   (string-split* "1:2.3" ":.") -> ("1" "2" "3")
>   (string-split* "1a2b3" "ab") -> ("1" "2" "3")
>
> Is this what you intended?  The documentation above makes it sound like
> the whole string serves as the delimiter.

It’s not what I wanted, indeed.  I will try to fix it.

>> +
>> +(define* (get-version package tag #:key prefix suffix delim)
>
> PACKAGE is not used by this procedure.

Good catch, it was some leftover I forgot to remove.

>> +  (define delim* (if delim delim "."))
>> +  (define prefix-regexp "^[^0-9]*")
>> +  (define suffix-regexp (string-append "[^0-9" (regexp-quote delim*) "]*$"))
>
> With a delimiter of '.', this would say the suffix of '1.2.3.prerelease'
> is 'prerelease', not '.prerelease'.  Is this correct?  (I would be
> tempted to just remove delim* from this.)

Good point, I think removing ‘delim*’ would be a good idea.

>> +  (define delim-regexp (string-append "^[0-9]+" (regexp-quote delim*) "[0-9]+"))
>
> This fails to account for versions which use non-numerics, such as (all
> taken from the package-version field of packages using git-fetch and
> which use this version as the tag):
>
> 1.0.0-beta.0
> 0.0.9.4f
> 4.4-git.1
> 5.2.0-alpha
> 0.2.0-alpha-199-g3e7a475
> 20200701.154658.b0d6223
> 12-068oasis4
> 4.0.0.dev8
> 0.32-14-gcdfe14e
> 2.8-fix-2
>
> There are about 50-60 packages like this.
>
> I'm not sure how much effort should be spent including them, and for
> some of them I'm not sure what our ideal behavior *is*.  Even if we
> could reliably detect them, should "alpha" or "dev" packages be returned
> by the updater?

I don’t think we usually include alpha or rc releases, so updater
probably shouldn’t return them either.  Not sure how we would try to
detect alpha/beta/rc releases, though, besides running something like

  (string-match? "(alpha|beta|rc|dev)" TAG)

On each tag.

> Upon investigation, there is a deeper problem: version-compare thinks
> "5.2.0" is a lower version than "5.2.0-alpha", and that "4.0.0" is lower
> than "4.0.0.dev8".
>
> scheme@(guile-user)> (version-compare "5.1.9" "5.2.0")
> $5 = <
> scheme@(guile-user)> (version-compare "5.2.0" "5.2.0-alpha")
> $6 = <
> scheme@(guile-user)> (version-compare "4.0.0" "4.0.0.dev8")
> $7 = <

Maybe we should filter the tags before comparing them; that should
get rid of these pre-release tags.

>> +
>> +  (define no-prefix
>> +    (let ((match (string-match (or prefix prefix-regexp) tag)))
>> +      (if match
>> +          (regexp-substitute #f match 'post)
>> +          (git-tag-error 'tag-prefix))))
>> +
>> +  (define no-suffix
>> +    (let ((match (string-match (or suffix suffix-regexp) no-prefix)))
>> +      (if match
>> +          (regexp-substitute #f match 'pre)
>> +          (git-tag-error 'tag-suffix))))
>> +
>> +  (define no-delims
>> +    (if (string-match delim-regexp no-suffix)
>> +        (string-split* no-suffix delim*)
>> +        (git-tag-error 'tag-version-delimiter)))
>
> This throws an error if the version doesn't have any delimiter.

Setting the ‘tag-version-delimiter’ prefix to an empty string would
solve this, right?  Or, maybe we should just get rid of the delimiter
thing since only a few packages use a different delimiter.

> Actually, it throws an error in a lot of other cases too, often saying
> the 'tag-version-delimiter is wrong when it's something else.  Consider
> the tags from the "openjpeg" package, sorted by 'sort-tags':
>
> arelease
> opj0-97
> start
> v2.1.1
> v2.1.2
> v2.2.0
> v2.3.0
> v2.3.1
> v2.4.0
> version.1.0
> version.1.1
> version.1.2
> version.1.3
> version.1.4
> version.1.5
> version.1.5.1
> version.1.5.2
> version.2.0
> version.2.0.1
> version.2.1
> wg1n6848
>
> At first, 'get-version' throws an error because "wg1n6848" doesn't have
> a delimiter. But even disregarding that, it would return "version.2.1"
> -> "2.1" as the latest version.
>
> Probably we should process all tags with 'get-version' (simply skipping
> any that don't parse) and use that to sort the tags.  If none parse with
> 'get-version' we could use the "no tags" error or have a separate error
> for "there were tags but we couldn't process them".

Ah, yes, that would be a good idea.

> And this lets us just do something like (untested):
>
> (define* (get-version tag #:key prefix suffix delim)
>   (define delim-rx (regexp-quote (or delim ".")))
>   (define prefix-rx (or prefix "[^[:digit:]]*"))
>   (define suffix-rx (or suffix ".*"))
>   (define version-char-rx
>    (string-append "[^" delim-rx "[:punct:]]"))
>
>   (define tag-rx
>    (string-append "^" prefix "(" version-char-rx "+("
>                   delim-rx version-char-rx ")*)" suffix-rx "$"))
>
>   (and=> (string-match tag-rx tag)
>          (cut match-substring <> 1)))
>
> Though at this point, 'tag-rx' should probably be constructed and
> compiled outside the loop.
>
>> +
>> +  (string-join no-delims "."))
>> +
>> +(define (sort-tags tags)
>> +  "Sort TAGS, a list if Git tags, such that the latest tag is the last element."
>> +  (sort tags (lambda (a b)
>> +               (eq? (version-compare a b) '<))))
>> +
>> +\f
>> +;;; Updater
>> +
>> +(define (get-remote url git-uri)
>> +  "Given a URL and GIT-URI, a <git-reference> record, return the ``origin'' remote."
>> +  (let* ((checkout (update-cached-checkout url
>> +                                           #:recursive?
>> +                                           (git-reference-recursive? git-uri)))
>> +         (repository (repository-open checkout)))
>> +    (remote-lookup repository "origin")))
>
> We surely don't want 'update-cached-checkout' since that fetches the
> whole repo history!  I've attached a patch below (based on top of this
> one) which brings the total time-per-package to under 1s.  I moved it to
> (guix git) to make use of 'with-libgit2' which ensures we use system
> certificates.
>
> Apologies for such a long reply. I hope it was helpful :)

No worries, it definitely helped a lot, thank you!

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 861 bytes --]

  parent reply	other threads:[~2021-09-05 10:37 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-03 15:50 [bug#50359] [PATCH] import: Add 'generic-git' updater Xinglu Chen
2021-09-05  0:19 ` Sarah Morgensen
2021-09-05  1:03   ` Sarah Morgensen
2021-09-05 10:36   ` Xinglu Chen [this message]
2021-09-06  5:40     ` Sarah Morgensen
2021-09-06 12:20       ` Xinglu Chen
2021-09-07  1:00         ` Sarah Morgensen
2021-09-07 19:13           ` Xinglu Chen
2021-09-08 18:28             ` Xinglu Chen
2021-09-10  8:36               ` Ludovic Courtès
2021-09-10 13:23                 ` Xinglu Chen
2021-09-05 13:11   ` Xinglu Chen
2021-09-06  3:14     ` Sarah Morgensen
2021-09-10 16:20 ` [bug#50359] [PATCH 0/3] " Xinglu Chen
2021-09-10 16:21   ` [bug#50359] [PATCH 1/3] tests: git: Don't read from the users global Git config file Xinglu Chen
2021-09-10 16:21   ` [bug#50359] [PATCH 2/3] tests: git: Make 'tag' directive non-interactive Xinglu Chen
2021-09-13  8:03     ` Ludovic Courtès
2021-09-10 16:21   ` [bug#50359] [PATCH 3/3] import: Add 'generic-git' updater Xinglu Chen
2021-09-13  8:07     ` Ludovic Courtès
2021-09-16  9:09     ` Sarah Morgensen
2021-09-16 12:48       ` Xinglu Chen
2021-09-16 23:42         ` Sarah Morgensen
2021-09-17  7:48           ` Xinglu Chen
2021-09-17  8:04   ` [bug#50359] [PATCH v3 0/3] " Xinglu Chen
2021-09-17  8:04     ` [bug#50359] [PATCH v3 1/3] tests: git: Don't read from the users global Git config file Xinglu Chen
2021-09-17  8:04     ` [bug#50359] [PATCH v3 2/3] tests: git: Make 'tag' directive non-interactive Xinglu Chen
2021-09-17  8:04     ` [bug#50359] [PATCH v3 3/3] import: Add 'generic-git' updater Xinglu Chen
2021-09-18 17:47     ` bug#50359: [PATCH v3 0/3] " Ludovic Courtès
2021-09-15  8:44 ` [bug#50359] [PATCH 3/3] import: " iskarian
2021-09-15 11:59   ` Xinglu Chen
2021-09-16  9:46     ` Sarah Morgensen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://guix.gnu.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87h7ez48d3.fsf@yoctocell.xyz \
    --to=public@yoctocell.xyz \
    --cc=50359@debbugs.gnu.org \
    --cc=iskarian@mgsn.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).