unofficial mirror of guix-patches@gnu.org 
 help / color / mirror / code / Atom feed
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
To: 65230@debbugs.gnu.org
Cc: "Maxim Cournoyer" <maxim.cournoyer@gmail.com>,
	"Christopher Baines" <guix@cbaines.net>,
	"Josselin Poiret" <dev@jpoiret.xyz>,
	"Ludovic Courtès" <ludo@gnu.org>,
	"Mathieu Othacehe" <othacehe@gnu.org>,
	"Ricardo Wurmus" <rekado@elephly.net>,
	"Simon Tournier" <zimon.toutoune@gmail.com>,
	"Tobias Geerinckx-Rice" <me@tobias.gr>
Subject: [bug#65230] [PATCH v4 08/10] gnu-maintenance: Add support to rewrite version in URL path.
Date: Tue, 22 Aug 2023 12:52:25 -0400	[thread overview]
Message-ID: <a509442a23966c2f48a4bd8b1e94f84991699872.1692723147.git.maxim.cournoyer@gmail.com> (raw)
In-Reply-To: <06b6c57b1af15b6ddca780182fc4a5e5264a67db.1692723147.git.maxim.cournoyer@gmail.com>

Fixes <https://issues.guix.gnu.org/64015>.
Fixes <https://issues.guix.gnu.org/65304>.

Previously, the generic HTML updater would only look for the list of files
found at the parent of its current source URL, ignoring that the URL may embed
the version elsewhere in its path.  This could cause 'guix refresh' to report
no updates available, while in fact there were, such as for 'libuv'.

* guix/gnu-maintenance.scm (strip-trailing-slash): New procedure.
(%version-rx): New variable.
(rewrite-url): New procedure.
(import-html-release): New rewrite-url? argument.  When true, use the above
procedure.
(import-html-updatable-release): Call import-html-release with #:rewrite-url
set to #t.
* tests/gnu-maintenance.scm ("rewrite-url, to-version specified")
("rewrite-url, without to-version"): New tests.
---

- Rebase and mention it also fixes #65304 in commit message

 guix/gnu-maintenance.scm  | 102 ++++++++++++++++++++++++++++++++++++--
 tests/gnu-maintenance.scm |  43 ++++++++++++++++
 2 files changed, 142 insertions(+), 3 deletions(-)

diff --git a/guix/gnu-maintenance.scm b/guix/gnu-maintenance.scm
index 9eff98217e..228a84bd4b 100644
--- a/guix/gnu-maintenance.scm
+++ b/guix/gnu-maintenance.scm
@@ -3,6 +3,7 @@
 ;;; Copyright © 2012, 2013 Nikita Karetnikov <nikita@karetnikov.org>
 ;;; Copyright © 2021 Simon Tournier <zimon.toutoune@gmail.com>
 ;;; Copyright © 2022 Maxime Devos <maximedevos@telenet.be>
+;;; Copyright © 2023 Maxim Cournoyer <maxim.cournoyer@gmail.com>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -26,6 +27,7 @@ (define-module (guix gnu-maintenance)
   #:use-module (ice-9 regex)
   #:use-module (ice-9 match)
   #:use-module (srfi srfi-1)
+  #:use-module (srfi srfi-2)
   #:use-module (srfi srfi-11)
   #:use-module (srfi srfi-26)
   #:use-module (rnrs io ports)
@@ -61,6 +63,7 @@ (define-module (guix gnu-maintenance)
             gnu-package?
 
             uri-mirror-rewrite
+            rewrite-url
 
             release-file?
             releases
@@ -518,9 +521,93 @@ (define (canonicalize-url url base-url)
          ;; within a directory.
          (string-append (dirname base-url) "/" url))))
 
+(define (strip-trailing-slash s)
+  "Strip any trailing slash from S, a string."
+  (if (string-suffix? "/" s)
+      (string-drop-right s 1)
+      s))
+
+;;; TODO: Extend to support the RPM and GNOME version schemes?
+(define %version-rx "[0-9.]+")
+
+(define* (rewrite-url url version #:key to-version)
+  "Rewrite URL so that the URL path components matching the current VERSION or
+VERSION-MAJOR.VERSION-MINOR are updated with that of the latest version found
+by crawling the corresponding URL directories.  Alternatively, when TO-VERSION
+is specified, rewrite version matches directly to it without crawling URL.
+
+For example, the URL
+\"https://dist.libuv.org/dist/v1.45.0/libuv-v1.45.0.tar.gz\" could be
+rewritten to something like
+\"https://dist.libuv.org/dist/v1.46.0/libuv-v1.46.0.tar.gz\"."
+  ;; XXX: major-minor may be #f if version is not a triplet but a single
+  ;; number such as "2".
+  (let* ((major-minor (false-if-exception (version-major+minor version)))
+         (to-major-minor (false-if-exception
+                          (and=> to-version version-major+minor)))
+         (uri (string->uri url))
+         (url-prefix (string-drop-right url (string-length (uri-path uri))))
+         (url-prefix-components (string-split url-prefix #\/))
+         (path (uri-path uri))
+         ;; Strip a forward slash on the path to avoid a double slash when
+         ;; string-joining later.
+         (path (if (string-prefix? "/" path)
+                   (string-drop path 1)
+                   path))
+         (path-components (string-split path #\/)))
+    (string-join
+     (reverse
+      (fold
+       (lambda (s parents)
+         (if to-version
+             ;; Direct rewrite case; the archive is assumed to exist.
+             (let ((u (string-replace-substring s version to-version)))
+               (cons (if (and major-minor to-major-minor)
+                         (string-replace-substring u major-minor to-major-minor)
+                         u)
+                     parents))
+             ;; More involved HTML crawl case.
+             (let* ((pattern (if major-minor
+                                 (format #f "(~a|~a)" version major-minor)
+                                 (format #f "(~a)" version)))
+                    (m (string-match pattern s)))
+               (if m
+                   ;; Crawl parent and rewrite current component.
+                   (let* ((parent-url (string-join (reverse parents) "/"))
+                          (links (url->links parent-url))
+                          ;; The pattern matching the version.
+                          (pattern (string-append "^" (match:prefix m)
+                                                  "(" %version-rx ")"
+                                                  (match:suffix m) "$"))
+                          (candidates (filter-map
+                                       (lambda (l)
+                                         ;; Links may be followed by a
+                                         ;; trailing '/' in the case of
+                                         ;; directories.
+                                         (and-let*
+                                             ((l (strip-trailing-slash l))
+                                              (m (string-match pattern l))
+                                              (v (match:substring m 1)))
+                                           (cons v l)))
+                                       links)))
+                     ;; Retrieve the item having the largest version.
+                     (if (null? candidates)
+                         (error "no candidates found in rewrite-url")
+                         (cons (cdr (first (sort candidates
+                                                 (lambda (x y)
+                                                   (version>? (car x)
+                                                              (car y))))))
+                               parents)))
+                   ;; No version found in path component; continue.
+                   (cons s parents)))))
+       (reverse url-prefix-components)
+       path-components))
+     "/")))
+
 (define* (import-html-release base-url package
                               #:key
-                              (version #f)
+                              rewrite-url?
+                              version
                               (directory (string-append
                                           "/" (package-upstream-name package)))
                               file->signature)
@@ -534,11 +621,19 @@ (define* (import-html-release base-url package
 When FILE->SIGNATURE is omitted or #f, guess the detached signature file name,
 if any.  Otherwise, FILE->SIGNATURE must be a procedure; it is passed a source
 file URL and must return the corresponding signature URL, or #f it signatures
-are unavailable."
-  (let* ((name (package-upstream-name package))
+are unavailable.
+
+When REWRITE-URL? is #t, versioned components in BASE-URL and/or DIRECTORY are
+also updated to the latest version, as explained in the doc of the
+\"rewrite-url\" procedure used."
+  (let* ((current-version (package-version package))
+         (name (package-upstream-name package))
          (url (if (string-null? directory)
                   base-url
                   (string-append base-url directory "/")))
+         (url (if rewrite-url?
+                  (rewrite-url url current-version #:to-version version)
+                  url))
          (links (map (cut canonicalize-url <> url) (url->links url))))
 
     (define (file->signature/guess url)
@@ -877,6 +972,7 @@ (define* (import-html-updatable-release package #:key (version #f))
                         (dirname (uri-path uri)))))
     (false-if-networking-error
      (import-html-release base package
+                          #:rewrite-url? #t
                           #:version version
                           #:directory directory))))
 
diff --git a/tests/gnu-maintenance.scm b/tests/gnu-maintenance.scm
index 516e02ec6a..196a6f9092 100644
--- a/tests/gnu-maintenance.scm
+++ b/tests/gnu-maintenance.scm
@@ -147,4 +147,47 @@ (define-module (test-gnu-maintenance)
            (equal? (list expected-signature-url)
                    (upstream-source-signature-urls update))))))
 
+(test-equal "rewrite-url, to-version specified"
+  "https://download.qt.io/official_releases/qt/6.5/6.5.2/\
+submodules/qtbase-everywhere-src-6.5.2.tar.xz"
+  (rewrite-url "https://download.qt.io/official_releases/qt/6.3/6.3.2/\
+submodules/qtbase-everywhere-src-6.3.2.tar.xz" "6.3.2" #:to-version "6.5.2"))
+
+(test-equal "rewrite-url, without to-version"
+  "https://dist.libuv.org/dist/v1.46.0/libuv-v1.46.0.tar.gz"
+  (with-http-server
+      ;; First reply, crawling https://dist.libuv.org/dist/.
+      `((200 "\
+<!DOCTYPE html>
+<html>
+<head><title>Index of dist</title></head>
+<body>
+<a href=\"../\">../</a>
+<a href=\"v1.44.0/\" title=\"v1.44.0/\">v1.44.0/</a>
+<a href=\"v1.44.1/\" title=\"v1.44.1/\">v1.44.1/</a>
+<a href=\"v1.44.2/\" title=\"v1.44.2/\">v1.44.2/</a>
+<a href=\"v1.45.0/\" title=\"v1.45.0/\">v1.45.0/</a>
+<a href=\"v1.46.0/\" title=\"v1.46.0/\">v1.46.0/</a>
+</body>
+</html>")
+        ;; Second reply, crawling https://dist.libuv.org/dist/v1.46.0/.
+        (200 "\
+<!DOCTYPE html>
+<html>
+<head><title>Index of dist/v1.46.0</title></head>
+<body>
+<a href=\"../\">../</a>
+<a href=\"libuv-v1.46.0-dist.tar.gz\" title=\"libuv-v1.46.0-dist.tar.gz\">
+   libuv-v1.46.0-dist.tar.gz</a>
+<a href=\"libuv-v1.46.0-dist.tar.gz.sign\"
+   title=\"libuv-v1.46.0-dist.tar.gz.sign\">libuv-v1.46.0-dist.tar.gz.sign</a>
+<a href=\"libuv-v1.46.0.tar.gz\" title=\"libuv-v1.46.0.tar.gz\">
+   libuv-v1.46.0.tar.gz</a>
+<a href=\"libuv-v1.46.0.tar.gz.sign\" title=\"libuv-v1.46.0.tar.gz.sign\">
+   libuv-v1.46.0.tar.gz.sign</a>
+</body>
+</html>"))
+    (rewrite-url "https://dist.libuv.org/dist/v1.45.0/libuv-v1.45.0.tar.gz"
+                 "1.45.0")))
+
 (test-end)
-- 
2.41.0





  parent reply	other threads:[~2023-08-22 16:57 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-11 18:42 [bug#65230] [PATCH 00/13] Fix 'guix refresh' for Qt and other packages Maxim Cournoyer
2023-08-11 18:44 ` [bug#65230] [PATCH 01/13] gnu-maintenance: Make base-url argument of import-html-release required Maxim Cournoyer
2023-08-11 18:44   ` [bug#65230] [PATCH 02/13] download: Add mirrors for Qt Maxim Cournoyer
2023-08-11 18:44   ` [bug#65230] [PATCH 03/13] gnu: qt: Streamline qt-urls Maxim Cournoyer
2023-08-11 18:44   ` [bug#65230] [PATCH 04/13] gnu: qt-creator: Use mirror://qt for source URI Maxim Cournoyer
2023-08-11 18:44   ` [bug#65230] [PATCH 05/13] gnu-maintenance: Fix docstring Maxim Cournoyer
2023-08-11 18:44   ` [bug#65230] [PATCH 06/13] gnu-maintenance: Extract url->links procedure Maxim Cournoyer
2023-08-11 18:44   ` [bug#65230] [PATCH 07/13] gnu-maintenance: Fix indentation Maxim Cournoyer
2023-08-11 18:44   ` [bug#65230] [PATCH 08/13] gnu-maintenance: Accept package object in 'import-html-release' procedure Maxim Cournoyer
2023-08-11 18:44   ` [bug#65230] [PATCH 09/13] gnu-maintenance: Document nested procedures in 'import-html-release' Maxim Cournoyer
2023-08-11 18:44   ` [bug#65230] [PATCH 10/13] gnu-maintenance: Extract 'canonicalize-url' from 'import-html-release' Maxim Cournoyer
2023-08-11 18:44   ` [bug#65230] [PATCH 11/13] gnu-maintenance: Add support to rewrite version in URL path Maxim Cournoyer
2023-08-11 18:44   ` [bug#65230] [PATCH 12/13] gnu-maintenance: Allow mirror URLs to fallback to the generic HTML updater Maxim Cournoyer
2023-08-11 18:45   ` [bug#65230] [PATCH 13/13] gnu-maintenance: Consider Qt source tarballs as "release files" Maxim Cournoyer
2023-08-15 20:29 ` [bug#65230] [PATCH v2 01/13] gnu-maintenance: Make base-url argument of import-html-release required Maxim Cournoyer
2023-08-15 20:29   ` [bug#65230] [PATCH v2 02/13] download: Add mirrors for Qt Maxim Cournoyer
2023-08-15 20:29   ` [bug#65230] [PATCH v2 03/13] gnu: qt: Streamline qt-urls Maxim Cournoyer
2023-08-15 20:29   ` [bug#65230] [PATCH v2 04/13] gnu: qt-creator: Use mirror://qt for source URI Maxim Cournoyer
2023-08-15 20:29   ` [bug#65230] [PATCH v2 05/13] gnu-maintenance: Fix docstring Maxim Cournoyer
2023-08-15 20:29   ` [bug#65230] [PATCH v2 06/13] gnu-maintenance: Extract url->links procedure Maxim Cournoyer
2023-08-15 20:29   ` [bug#65230] [PATCH v2 07/13] gnu-maintenance: Fix indentation Maxim Cournoyer
2023-08-15 20:29   ` [bug#65230] [PATCH v2 08/13] gnu-maintenance: Accept package object in 'import-html-release' procedure Maxim Cournoyer
2023-08-15 20:29   ` [bug#65230] [PATCH v2 09/13] gnu-maintenance: Document nested procedures in 'import-html-release' Maxim Cournoyer
2023-08-15 20:29   ` [bug#65230] [PATCH v2 10/13] gnu-maintenance: Extract 'canonicalize-url' from 'import-html-release' Maxim Cournoyer
2023-08-15 20:29   ` [bug#65230] [PATCH v2 11/13] gnu-maintenance: Add support to rewrite version in URL path Maxim Cournoyer
2023-08-15 20:29   ` [bug#65230] [PATCH v2 12/13] gnu-maintenance: Allow mirror URLs to fallback to the generic HTML updater Maxim Cournoyer
2023-08-15 20:29   ` [bug#65230] [PATCH v2 13/13] gnu-maintenance: Consider Qt source tarballs as "release files" Maxim Cournoyer
2023-08-21 18:06 ` [bug#65230] [PATCH v3 01/10] gnu-maintenance: Make base-url argument of import-html-release required Maxim Cournoyer
2023-08-21 18:06   ` [bug#65230] [PATCH v3 02/10] gnu-maintenance: Fix docstring Maxim Cournoyer
2023-08-21 18:06   ` [bug#65230] [PATCH v3 03/10] gnu-maintenance: Extract url->links procedure Maxim Cournoyer
2023-08-21 18:06   ` [bug#65230] [PATCH v3 04/10] gnu-maintenance: Fix indentation Maxim Cournoyer
2023-08-21 18:06   ` [bug#65230] [PATCH v3 05/10] gnu-maintenance: Accept package object in 'import-html-release' procedure Maxim Cournoyer
2023-08-21 18:06   ` [bug#65230] [PATCH v3 06/10] gnu-maintenance: Document nested procedures in 'import-html-release' Maxim Cournoyer
2023-08-21 18:06   ` [bug#65230] [PATCH v3 07/10] gnu-maintenance: Extract 'canonicalize-url' from 'import-html-release' Maxim Cournoyer
2023-08-21 18:06   ` [bug#65230] [PATCH v3 08/10] gnu-maintenance: Add support to rewrite version in URL path Maxim Cournoyer
2023-08-21 18:06   ` [bug#65230] [PATCH v3 09/10] gnu-maintenance: Allow mirror URLs to fallback to the generic HTML updater Maxim Cournoyer
2023-08-21 18:06   ` [bug#65230] [PATCH v3 10/10] gnu-maintenance: Consider Qt source tarballs as "release files" Maxim Cournoyer
2023-08-22 16:52 ` [bug#65230] [PATCH v4 01/10] gnu-maintenance: Make base-url argument of import-html-release required Maxim Cournoyer
2023-08-22 16:52   ` [bug#65230] [PATCH v4 02/10] gnu-maintenance: Fix docstring Maxim Cournoyer
2023-08-22 16:52   ` [bug#65230] [PATCH v4 03/10] gnu-maintenance: Extract url->links procedure Maxim Cournoyer
2023-08-22 16:52   ` [bug#65230] [PATCH v4 04/10] gnu-maintenance: Fix indentation Maxim Cournoyer
2023-08-22 16:52   ` [bug#65230] [PATCH v4 05/10] gnu-maintenance: Accept package object in 'import-html-release' procedure Maxim Cournoyer
2023-08-22 16:52   ` [bug#65230] [PATCH v4 06/10] gnu-maintenance: Document nested procedures in 'import-html-release' Maxim Cournoyer
2023-08-22 16:52   ` [bug#65230] [PATCH v4 07/10] gnu-maintenance: Extract 'canonicalize-url' from 'import-html-release' Maxim Cournoyer
2023-08-22 16:52   ` Maxim Cournoyer [this message]
2023-08-26 20:21     ` bug#65230: [PATCH v4 08/10] gnu-maintenance: Add support to rewrite version in URL path Maxim Cournoyer
2023-08-22 16:52   ` [bug#65230] [PATCH v4 09/10] gnu-maintenance: Allow mirror URLs to fallback to the generic HTML updater Maxim Cournoyer
2023-08-22 16:52   ` [bug#65230] [PATCH v4 10/10] gnu-maintenance: Consider Qt source tarballs as "release files" Maxim Cournoyer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://guix.gnu.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a509442a23966c2f48a4bd8b1e94f84991699872.1692723147.git.maxim.cournoyer@gmail.com \
    --to=maxim.cournoyer@gmail.com \
    --cc=65230@debbugs.gnu.org \
    --cc=dev@jpoiret.xyz \
    --cc=guix@cbaines.net \
    --cc=ludo@gnu.org \
    --cc=me@tobias.gr \
    --cc=othacehe@gnu.org \
    --cc=rekado@elephly.net \
    --cc=zimon.toutoune@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).