all messages for Guix-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: Xinglu Chen <public@yoctocell.xyz>
To: Sarah Morgensen <iskarian@mgsn.dev>
Cc: 50359@debbugs.gnu.org
Subject: [bug#50359] [PATCH] import: Add 'generic-git' updater.
Date: Tue, 07 Sep 2021 21:13:22 +0200	[thread overview]
Message-ID: <87wnnsyzal.fsf@yoctocell.xyz> (raw)
In-Reply-To: <86pmtli4hn.fsf@mgsn.dev>

[-- Attachment #1: Type: text/plain, Size: 16391 bytes --]

On Mon, Sep 06 2021, Sarah Morgensen wrote:

> Hi,
>
> Xinglu Chen <public@yoctocell.xyz> writes:
>
>> Any trick you used to find all of there weird version numbers?  :-)
>
> This monstrosity:
>
>   rg -U -B4 --pcre2 '(?!\(let.*(\n.*){0,1})\(version "([^\n"]*[^0-9\.][^"\n]*)".*(\n.*){0,10}commit.*version' gnu/packages
>
> and to show just the versions:
>
>   rg -Uor '$2' --pcre2 --no-filename --no-line-number 

Wow!  I will try that and see for myself!  :-)

>>> IMO, just get rid of the delimiter.  If we wanted to be *that* flexible,
>>> we could make it so they provide a tag->version proc instead of (prefix,
>>> suffix, delimiter).
>>
>> a ‘tag->version’ procedure would probably make things a bit too
>> complicated for the people writing package definitions.  For example,
>> having a delimiter would make it easy to match a tag like
>> “2021-01-01-release”
>>
>> Delimiter is “.” (sorry if this hurts your eyes ;-))
>>
>> scheme@(guile-user)> (match:substring (string-match "^[^0-9]*([^\\.[:punct:]]+(\\.[^\\.[:punct:]]+)*).*$" "2021-01-01-release") 1)
>> $28 = "2021"
>>
>> Delimiter is “-”
>>
>> scheme@(guile-user)> (match:substring (string-match "^[^0-9]*([^-[:punct:]]+(-[^-[:punct:]]+)*).*$" "2021-01-01-release") 1)
>> $29 = "2021-01-01-release"
>>
>> And then, setting the suffix to “-release” would match just the version
>> part.
>
> Right.  I missed that.
>
> In that vein, should we keep the dashes in "2021-01-01" or convert them
> to periods?

Having periods would be more consistent, then could have a
‘date->version’ procedure that replaces the hyphens with dots and have

  (git-reference
    (url "https://git.example.org")
    (commit (date->version version)))

> What about when a tag has underscores?

Hmm, not sure about that, below is a list of packages I could find which had
underscores as delimiters

gnu/packages/graphics.scm
239:          (commit "DIRECTFB_1_7_7")))
gnu/packages/gstreamer.scm
326:          (commit "ESOUND_0_2_41")))
gnu/packages/java.scm
13925:                      (commit "jboss-transaction-api_1.2_spec-1.1.1.Final")))

They all seem to use periods in the ‘version’ field, though, so I would
say that the underscroes, should also be converted to periods.

> What if a repo has tags in both formats?  Then "3.0.1" would be
> considered older than "2011-01-01".

That’s tricky, there isn’t really a way to know how old “3.0.1” is,
without looking at the metadata of the tag.  Maybe this is one of those
corner cases which can’t really automatically determine the latest
release.  Should we have a ‘no-refresh?’ property to tell the refresh to
not try to update the package?

> Maybe we should just add an extra bit to detect a date format and only
> consider it when there's no "proper versions"?

That could be a good idea!

> Aaaand I fell down a rabbit hole after that :) I've attached a patch
> with what I've done.  It still has lots of issues--it requires the tag
> to contain at least one version delimiter, it requires the first
> character of the version to be a number... it might not even be better
> than before I touched it, and even so the added complexity might not be
> worth it.  But if you'd like to take it for a spin, I've attached it (it
> applies straight on master).

Great! I will try it out and see how it compares to my current WIP
version.

Not having characters in the first version number probably isn’t such a
big deal, most version that contain characters end with a character.
E.g., “1.2.3a” is not to uncommon, but “a1.2.3” is rarely seen.

> --
> Sarah
>
> From 08bd59a7fa1aa9735a1794672ce8d1f683d3d6db Mon Sep 17 00:00:00 2001
> Message-Id: <08bd59a7fa1aa9735a1794672ce8d1f683d3d6db.1630975873.git.iskarian@mgsn.dev>
> From: Xinglu Chen <public@yoctocell.xyz>
> Date: Fri, 3 Sep 2021 17:50:56 +0200
> Subject: [PATCH] import: Add 'generic-git' updater.
>
> * guix/import/git.scm: New file.
> * doc/guix.texi (Invoking guix refresh): Document it.
> * Makefile.am (MODULES): Register it.
> ---
>  Makefile.am         |   1 +
>  doc/guix.texi       |  27 +++++++
>  guix/git.scm        |  33 ++++++++
>  guix/import/git.scm | 191 ++++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 252 insertions(+)
>  create mode 100644 guix/import/git.scm
>
> diff --git a/Makefile.am b/Makefile.am
> index 3c79760734..c4d3a456b1 100644
> --- a/Makefile.am
> +++ b/Makefile.am
> @@ -254,6 +254,7 @@ MODULES =					\
>    guix/import/egg.scm   			\
>    guix/import/elpa.scm   			\
>    guix/import/gem.scm				\
> +  guix/import/git.scm                           \
>    guix/import/github.scm   			\
>    guix/import/gnome.scm				\
>    guix/import/gnu.scm				\
> diff --git a/doc/guix.texi b/doc/guix.texi
> index 36a0c7f5ec..26afb1607a 100644
> --- a/doc/guix.texi
> +++ b/doc/guix.texi
> @@ -11920,6 +11920,33 @@ the updater for @uref{https://launchpad.net, Launchpad} packages.
>  @item generic-html
>  a generic updater that crawls the HTML page where the source tarball of
>  the package is hosted, when applicable.
> +@item generic-git
> +a generic updater for packages hosted on Git repositories.  It tries to
> +be smart about parsing Git tag names, but if it is not able to parse the
> +tag name and compare tags correctly, users can define the following
> +properties for a package.
> +
> +@itemize
> +@item @code{tag-prefix}: a regular expression for matching a prefix of
> +the tag name.
> +
> +@item @code{tag-suffix}: a regular expression for matching a suffix of
> +the tag name.
> +
> +@item @code{tag-version-delimiter}: a string used as the delimiter in
> +the tag name for separating the numbers of the version.
> +@end itemize
> +
> +@lisp
> +(package
> +  (name "foo")
> +  ;; ...
> +  (properties
> +    '((tag-prefix . "^release0-")
> +      (tag-suffix . "[a-z]?$")
> +      (tag-version-delimiter . ":"))))
> +@end lisp      
> +
>  @end table
>  
>  For instance, the following command only checks for updates of Emacs
> diff --git a/guix/git.scm b/guix/git.scm
> index 9c6f326c36..c5d0d2da8e 100644
> --- a/guix/git.scm
> +++ b/guix/git.scm
> @@ -56,6 +56,8 @@
>              commit-difference
>              commit-relation
>  
> +            ls-remote-refs
> +
>              git-checkout
>              git-checkout?
>              git-checkout-url
> @@ -556,6 +558,37 @@ objects: 'ancestor (meaning that OLD is an ancestor of NEW), 'descendant, or
>                (if (set-contains? oldest new)
>                    'descendant
>                    'unrelated))))))
> +\f
> +;;
> +;;; Remote operations.
> +;;;
> +
> +(define* (ls-remote-refs url #:key tags?)
> +  "Return the list of references advertised at Git repository URL.  If TAGS?
> +is true, limit to only refs/tags."
> +  (define (ref? ref)
> +    ;; Like `git ls-remote --refs', only show actual references.
> +    (and (string-prefix? "refs/" ref)
> +         (not (string-suffix? "^{}" ref))))
> +
> +  (define (tag? ref)
> +    (string-prefix? "refs/tags/" ref))
> +
> +  (define (include? ref)
> +    (and (ref? ref)
> +         (or (not tags?) (tag? ref))))
> +
> +  (with-libgit2
> +   (call-with-temporary-directory
> +    (lambda (cache-directory)
> +      (let* ((repository (repository-init cache-directory))
> +             ;; Create an in-memory remote so we don't touch disk.
> +             (remote (remote-create-anonymous repository url)))
> +        (remote-connect remote)
> +        (remote-disconnect remote)
> +        (repository-close! repository)
> +
> +        (filter include? (map remote-head-name (remote-ls remote))))))))
>  
>  \f
>  ;;;
> diff --git a/guix/import/git.scm b/guix/import/git.scm
> new file mode 100644
> index 0000000000..8568981af2
> --- /dev/null
> +++ b/guix/import/git.scm
> @@ -0,0 +1,191 @@
> +;;; GNU Guix --- Functional package management for GNU
> +;;; Copyright © 2021 Xinglu Chen <public@yoctocell.xyz>
> +;;; Copyright © 2021 Sarah Morgensen <iskarian@mgsn.dev>
> +;;;
> +;;; This file is part of GNU Guix.
> +;;;
> +;;; GNU Guix is free software; you can redistribute it and/or modify it
> +;;; under the terms of the GNU General Public License as published by
> +;;; the Free Software Foundation; either version 3 of the License, or (at
> +;;; your option) any later version.
> +;;;
> +;;; GNU Guix is distributed in the hope that it will be useful, but
> +;;; WITHOUT ANY WARRANTY; without even the implied warranty of
> +;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +;;; GNU General Public License for more details.
> +;;;
> +;;; You should have received a copy of the GNU General Public License
> +;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.
> +
> +(define-module (guix import git)
> +  #:use-module (guix build utils)
> +  #:use-module (guix diagnostics)
> +  #:use-module (guix git)
> +  #:use-module (guix git-download)
> +  #:use-module (guix i18n)
> +  #:use-module (guix packages)
> +  #:use-module (guix upstream)
> +  #:use-module (guix utils)
> +  #:use-module (ice-9 format)
> +  #:use-module (ice-9 match)
> +  #:use-module (ice-9 rdelim)
> +  #:use-module (ice-9 regex)
> +  #:use-module (srfi srfi-1)
> +  #:use-module (srfi srfi-26)
> +  #:use-module (srfi srfi-34)
> +  #:use-module (srfi srfi-35)
> +  #:use-module (srfi srfi-71)
> +  #:export (%generic-git-updater))
> +
> +;;; Commentary:
> +;;;
> +;;; This module provides a generic package updater for packages hosted on Git
> +;;; repositories.
> +;;;
> +;;; It tries to be smart about tag names, but if it is not automatically able
> +;;; to parse the tag names correctly, users can set the `tag-prefix',
> +;;; `tag-suffix' and `tag-version-delimiter' properties of the package to make
> +;;; the updater parse the Git tag name correctly.
> +;;;
> +;;; Code:
> +
> +;;; Errors & warnings
> +
> +(define-condition-type &git-no-valid-tags-error &error
> +  git-no-valid-tags-error?)
> +
> +(define (git-no-valid-tags-error)
> +  (raise (condition (&message (message "no valid tags found"))
> +                    (&git-no-valid-tags-error))))
> +
> +(define-condition-type &git-no-tags-error &error
> +  git-no-tags-error?)
> +
> +(define (git-no-tags-error)
> +  (raise (condition (&message (message "no tags were found"))
> +                    (&git-no-tags-error))))
> +
> +\f
> +;;; Updater
> +
> +(define* (get-version-mapping tags #:key prefix suffix delim)
> +  (define (guess-delim)
> +    (let ((total (length tags))
> +          (dots (reduce + 0 (map (cut string-count <> #\.) tags)))
> +          (dashes (reduce + 0 (map (cut string-count <> #\-) tags)))
> +          (underscores (reduce + 0 (map (cut string-count <> #\_) tags))))
> +      ;; (format #t "total: ~d, dots: ~d, dashes ~d, underscores ~d~%"
> +      ;;         total dots dashes underscores)
> +      (cond
> +       ((> dots (* total 0.35)) ".")
> +       ((> dashes (* total 0.8)) "-")
> +       ((> underscores (* total 0.8)) "_")
> +       (else "."))))

These numbers seem rather arbitrary, how did you arrive at them?

Also, AFAICS, versions without delimiters won’t be matched.

> +  (define delim-rx (regexp-quote (or delim (guess-delim))))
> +  (define suffix-rx (or suffix ""))
> +
> +  (define prefix-rx
> +    (make-regexp (string-append "^" (or prefix ".*") "$")))

Why is there a $ here?

> +  (define tag-rx
> +    (make-regexp
> +     (string-append "([[:digit:]][^" delim-rx "[:punct:]]*"
> +                    "(" delim-rx "[^" delim-rx "]+)+"
> +                    ")" suffix-rx "$")))
> +
> +  (define (get-version tag)
> +    (let ((tag-match (regexp-exec tag-rx tag)))
> +      (and tag-match
> +           (regexp-exec prefix-rx (match:prefix tag-match))
> +           (regexp-substitute/global
> +            #f delim-rx (match:substring tag-match 1)
> +            'pre "." 'post))))
> +
> +  (define (entry<? a b)
> +    (eq? (version-compare (car a) (car b)) '<))
> +
> +  (let ((mapping (fold alist-cons '() (map get-version tags) tags)))
> +    (stable-sort! (filter car mapping) entry<?)))
> +
> +(define* (get-latest-tag url #:key prefix suffix delim)
> +  "Return the latest tag available from the Git repository at URL."
> +  (define (pre-release? tag)
> +    (any (cut string-contains tag <>)
> +         '("alpha" "beta" "rc" "dev" "test")))

This would only match lower-case characters, but sometimes upper-case
characters are used.

  (define (pre-release? tag)
    (any (lambda (rx) (regexp-exec (make-regexp rx regexp/icase) tag))
              '("alpha" "beta" "rc" "dev" "test")))
              
> +  (let* ((tags (map (cut string-drop <> (string-length "refs/tags/"))
> +                    (ls-remote-refs url #:tags? #t)))
> +         (versions->tags
> +          (get-version-mapping (filter (negate pre-release?) tags)
> +                               #:prefix prefix
> +                               #:suffix suffix
> +                               #:delim delim)))
> +    (cond
> +     ((null? tags)
> +      (git-no-tags-error))
> +     ((null? versions->tags)
> +      (git-no-valid-tags-error))
> +     (else
> +      (match (last versions->tags)
> +        ((version . tag)
> +         (values version tag)))))))
> +
> +(define (latest-git-tag-version package tag-prefix tag-suffix
> +                                tag-version-delimiter)
> +  "Given a PACKAGE, the TAG-PREFIX, TAG-SUFFIX, and TAG-VERSION-DELIMITER
> +properties of PACKAGE, returns the latest version of PACKAGE."
> +  (guard (c ((or (git-no-tags-error? c) (git-no-valid-tags-error? c))
> +             (warning (or (package-field-location package 'source)
> +                          (package-location package))
> +                      (G_ "~a for ~a~%")
> +                      (condition-message c)
> +                      (package-name package))
> +             #f)
> +            ((eq? (exception-kind c) 'git-error)
> +             (warning (or (package-field-location package 'source)
> +                          (package-location package))
> +                      (G_ "failed to fetch Git repository for ~a~%")
> +                      (package-name package))
> +             #f))
> +    (let* ((source (package-source package))
> +           (url (git-reference-url (origin-uri source))))
> +      ;;(format #t "~a~%" (package-name package))
> +      (get-latest-tag url #:prefix tag-prefix #:suffix tag-suffix
> +                      #:delim tag-version-delimiter))))
> +
> +(define (git-package? package)
> +  "Whether the origin of PACKAGE is a Git repostiory."
> +  (match (package-source package)
> +    ((? origin? origin)
> +     (and (eq? (origin-method origin) git-fetch)
> +          (git-reference? (origin-uri origin))))
> +    (_ #f)))
> +
> +(define (latest-git-release package)
> +  "Return the latest release of PACKAGE."
> +  (let* ((name (package-name package))
> +         (properties (package-properties package))
> +         (tag-prefix (assq-ref properties 'tag-prefix))
> +         (tag-suffix (assq-ref properties 'tag-suffix))
> +         (tag-version-delimiter (assq-ref properties 'tag-version-delimiter))
> +         (old-version (package-version package))
> +         (url (git-reference-url (origin-uri (package-source package))))
> +         (new-version (latest-git-tag-version package
> +                                              tag-prefix
> +                                              tag-suffix
> +                                              tag-version-delimiter)))
> +
> +    (if new-version
> +        (upstream-source
> +         (package name)
> +         (version new-version)
> +         (urls (list url)))
> +        ;; No new release or no tags available.
> +        #f)))
> +
> +(define %generic-git-updater
> +  (upstream-updater
> +   (name 'generic-git)
> +   (description "Updater for packages hosted on Git repositories")
> +   (pred git-package?)
> +   (latest latest-git-release)))
>
> base-commit: 522a3bf99cbc21a9093f63280b9508cd69b94ff0
> -- 
> 2.32.0

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 861 bytes --]

  reply	other threads:[~2021-09-07 19:14 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-03 15:50 [bug#50359] [PATCH] import: Add 'generic-git' updater Xinglu Chen
2021-09-05  0:19 ` Sarah Morgensen
2021-09-05  1:03   ` Sarah Morgensen
2021-09-05 10:36   ` Xinglu Chen
2021-09-06  5:40     ` Sarah Morgensen
2021-09-06 12:20       ` Xinglu Chen
2021-09-07  1:00         ` Sarah Morgensen
2021-09-07 19:13           ` Xinglu Chen [this message]
2021-09-08 18:28             ` Xinglu Chen
2021-09-10  8:36               ` Ludovic Courtès
2021-09-10 13:23                 ` Xinglu Chen
2021-09-05 13:11   ` Xinglu Chen
2021-09-06  3:14     ` Sarah Morgensen
2021-09-10 16:20 ` [bug#50359] [PATCH 0/3] " Xinglu Chen
2021-09-10 16:21   ` [bug#50359] [PATCH 1/3] tests: git: Don't read from the users global Git config file Xinglu Chen
2021-09-10 16:21   ` [bug#50359] [PATCH 2/3] tests: git: Make 'tag' directive non-interactive Xinglu Chen
2021-09-13  8:03     ` Ludovic Courtès
2021-09-10 16:21   ` [bug#50359] [PATCH 3/3] import: Add 'generic-git' updater Xinglu Chen
2021-09-13  8:07     ` Ludovic Courtès
2021-09-16  9:09     ` Sarah Morgensen
2021-09-16 12:48       ` Xinglu Chen
2021-09-16 23:42         ` Sarah Morgensen
2021-09-17  7:48           ` Xinglu Chen
2021-09-17  8:04   ` [bug#50359] [PATCH v3 0/3] " Xinglu Chen
2021-09-17  8:04     ` [bug#50359] [PATCH v3 1/3] tests: git: Don't read from the users global Git config file Xinglu Chen
2021-09-17  8:04     ` [bug#50359] [PATCH v3 2/3] tests: git: Make 'tag' directive non-interactive Xinglu Chen
2021-09-17  8:04     ` [bug#50359] [PATCH v3 3/3] import: Add 'generic-git' updater Xinglu Chen
2021-09-18 17:47     ` bug#50359: [PATCH v3 0/3] " Ludovic Courtès
2021-09-15  8:44 ` [bug#50359] [PATCH 3/3] import: " iskarian
2021-09-15 11:59   ` Xinglu Chen
2021-09-16  9:46     ` Sarah Morgensen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87wnnsyzal.fsf@yoctocell.xyz \
    --to=public@yoctocell.xyz \
    --cc=50359@debbugs.gnu.org \
    --cc=iskarian@mgsn.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/guix.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.