From: "Ludovic Courtès" <ludovic.courtes@inria.fr>
To: zimoun <zimon.toutoune@gmail.com>
Cc: 43442@debbugs.gnu.org
Subject: [bug#43442] [PATCH] Fixes init of #42162: gforge.inria.fr down Dec. 2020
Date: Mon, 20 Mar 2023 15:09:11 +0100
Message-ID: <87jzzbms54.fsf_-_@gnu.org>
In-Reply-To: <87y2knhei3.fsf@gnu.org> ("Ludovic Courtès"'s message of "Sat, 03 Oct 2020 10:59:00 +0200")

[-- Attachment #1: Type: text/plain, Size: 3744 bytes --]

Hi!

Ludovic Courtès <ludo@gnu.org> skribis:

> Ah yes, under “extra_headers” there’s the SVN revision number.  (guix
> swh) doesn’t expose “extra_headers” yet, but once it does, we can walk
> snapshots until we find the SVN revision we’re looking for.
>
> scheme@(guile-user)> (lookup-origin "https://scm.gforge.inria.fr/anonscm/svn/mpfi/")
> $3 = #<<origin> visits-url: "https://archive.softwareheritage.org/api/1/origin/https://scm.gforge.inria.fr/anonscm/svn/mpfi/visits/" type: #f url: "https://scm.gforge.inria.fr/anonscm/svn/mpfi">
> scheme@(guile-user)> (origin-visits $3)
> $4 = (#<<visit> date: #<date nanosecond: 902765 second: 25 minute: 53 hour: 21 day: 21 month: 9 year: 2020 zone-offset: 0> origin: "https://scm.gforge.inria.fr/anonscm/svn/mpfi" url: "https://archive.softwareheritage.org/api/1/origin/https://scm.gforge.inria.fr/anonscm/svn/mpfi/visit/1/" snapshot-url: "https://archive.softwareheritage.org/api/1/snapshot/e7fdd4dc6230f710dbc55c1b308804fa1b5f51f0/" status: full number: 1>)
> scheme@(guile-user)> (visit-snapshot (car $4))
> $5 = #<<snapshot> branches: (#<<branch> name: "HEAD" target-type: revision target-url: "https://archive.softwareheritage.org/api/1/revision/f7b445a6bdc38bf075f29265120ca49824f698ea/">)>
>
> So the next step is to augment (guix swh) with a
> ‘lookup-subversion-revision’ procedure.

The attached patch does that:

--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> (lookup-subversion-revision "https://scm.gforge.inria.fr/anonscm/svn/mpfi" 680)
$12 = #<<revision> id: "72102de7605a2459ebcb016338ebbf1a997e8c8d" date: #<date nanosecond: 938388 second: 35 minute: 32 hour: 11 day: 6 month: 9 year: 2018 zone-offset: 0> directory: "5c89c025a4cd9d16befdfec12dfc23f7318d0d5b" directory-url: "https://archive.softwareheritage.org/api/1/directory/5c89c025a4cd9d16befdfec12dfc23f7318d0d5b/" parents-ids: ("16da41f1848d77a93aec565320b72b460c429b61") extra-headers: (("svn_repo_uuid" . "e2f78e0c-bb60-4709-9413-9660a31d4696") ("svn_revision" . "680"))>
scheme@(guile-user)> (lookup-subversion-revision "https://scm.gforge.inria.fr/anonscm/svn/mpfi" 666)
$13 = #<<revision> id: "148eb1e7206b111af4075c73c656e54c9efed6cb" date: #<date nanosecond: 654167 second: 2 minute: 52 hour: 12 day: 2 month: 8 year: 2018 zone-offset: 0> directory: "ed7b0bd7019fb85cd86d948a97c23b9d43aa8728" directory-url: "https://archive.softwareheritage.org/api/1/directory/ed7b0bd7019fb85cd86d948a97c23b9d43aa8728/" parents-ids: ("0ba2aa7e0d3fc0a1eb3ba72b32094515415ae47a") extra-headers: (("svn_repo_uuid" . "e2f78e0c-bb60-4709-9413-9660a31d4696") ("svn_revision" . "666"))>
--8<---------------cut here---------------end--------------->8---

The implementation is pretty bad though, because it walks the revision
history until it finds the right revision number—so you’re likely to
reach the bandwidth rate limit before you’ve found the revision you’re
looking for.
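
To illustrate the cost, here is a minimal self-contained sketch (not the
patch itself).  It assumes a hypothetical in-memory history, %history,
and a hypothetical 'mock-lookup-revision' that stands in for one request
to the SWH API; the walk has the same shape as the one in the patch.

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical history: each entry is (ID SVN-REVISION PARENT-IDS).
;; With the real API, every lookup below would be an HTTP request.
(define %history
  '(("c" 682 ("b"))
    ("b" 681 ("a"))
    ("a" 680 ())))

(define lookups 0)                      ;number of simulated requests

(define (mock-lookup-revision id)
  "Simulate one API request: return the entry for ID or #f."
  (set! lookups (+ 1 lookups))
  (assoc id %history))

(define (find-svn-revision id target)
  "Walk back from revision ID until the entry numbered TARGET is found."
  (let loop ((entry (mock-lookup-revision id)))
    (and entry
         (let ((number (second entry)))
           (cond ((= number target)
                  entry)                ;found it
                 ((< number target)
                  #f)                   ;already older than TARGET: stop
                 (else
                  ;; One more simulated request per parent.
                  (any loop
                       (filter-map mock-lookup-revision
                                   (third entry)))))))))
```

Evaluating (find-svn-revision "c" 680) returns the entry for "a" and
leaves 'lookups' at 3: one request per revision between the branch tip
and the target, so the number of requests grows linearly with the
distance to the tip.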

More importantly, most svn origins cannot be found, or at least not
by passing the URL as-is:

  https://sympa.inria.fr/sympa/arc/swh-devel/2023-03/msg00009.html

This whole hack looks like a dead end.

It would be ideal if SWH computed nar hashes, as you proposed:

  https://gitlab.softwareheritage.org/swh/meta/-/issues/4538

As a stopgap, I wonder if we could use “double hashing” on our side, but
only for svn: we’d store both the nar sha256 as we currently do, plus
the swhid.  It still seems to me that it’d be hard to scale and to
maintain that over time, even if it’s limited to svn.  Plus, there’d
still be the problem of ‘svn-multi-fetch’, which is what most TeX Live
packages use.
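
A rough sketch of what “double hashing” could look like as data,
assuming a hypothetical <svn-source> record and a hypothetical
'fetch-plan' procedure (neither exists in Guix).  It only shows the two
identifiers stored side by side and the choice between upstream and SWH;
in both cases the result would still be checked against the nar sha256.

```scheme
(use-modules (srfi srfi-9))

;; Hypothetical record: not an existing Guix data type.
(define-record-type <svn-source>
  (make-svn-source url revision nar-sha256 swhid)
  svn-source?
  (url        svn-source-url)
  (revision   svn-source-revision)
  (nar-sha256 svn-source-nar-sha256)    ;what is recorded today
  (swhid      svn-source-swhid))        ;extra identifier for SWH lookups

(define (fetch-plan source upstream-alive?)
  "Return a symbolic description of how SOURCE would be fetched."
  (if upstream-alive?
      `(svn-checkout ,(svn-source-url source)
                     ,(svn-source-revision source))
      ;; Upstream is gone: address the directory directly by SWHID
      ;; instead of walking the revision history.
      `(swh-directory ,(svn-source-swhid source))))

(define %mpfi-source
  (make-svn-source "https://scm.gforge.inria.fr/anonscm/svn/mpfi"
                   680
                   "<nar-sha256>"       ;placeholder, not a real hash
                   "swh:1:dir:5c89c025a4cd9d16befdfec12dfc23f7318d0d5b"))
```

The SWHID above is built from the directory identifier shown for
revision 680 earlier in this message; the sha256 field is left as a
placeholder.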

Thoughts?

Ludo’.


[-- Attachment #2: Type: text/x-patch, Size: 4366 bytes --]

diff --git a/guix/swh.scm b/guix/swh.scm
index c7c1c873a2..a65635b1db 100644
--- a/guix/swh.scm
+++ b/guix/swh.scm
@@ -1,5 +1,5 @@
 ;;; GNU Guix --- Functional package management for GNU
-;;; Copyright © 2018, 2019, 2020, 2021 Ludovic Courtès <ludo@gnu.org>
+;;; Copyright © 2018, 2019, 2020, 2021, 2023 Ludovic Courtès <ludo@gnu.org>
 ;;; Copyright © 2020 Jakub Kądziołka <kuba@kadziolka.net>
 ;;; Copyright © 2021 Xinglu Chen <public@yoctocell.xyz>
 ;;; Copyright © 2021 Simon Tournier <zimon.toutoune@gmail.com>
@@ -75,8 +75,10 @@ (define-module (guix swh)
             revision-id
             revision-date
             revision-directory
+            revision-parents
             lookup-revision
             lookup-origin-revision
+            lookup-subversion-revision
 
             content?
             content-checksums
@@ -207,6 +209,14 @@ (define string*
     ((? null?) #f)                                ;Guile-JSON 3.x
     ('null #f)))                                  ;Guile-JSON 4.x
 
+(define pair-vector->alist
+  (match-lambda
+    ('null '())
+    ((= vector->list lst)
+     (map (match-lambda
+            (#(key value) (cons key value)))
+          lst))))
+
 (define %allow-request?
   ;; Takes a URL and method (e.g., the 'http-get' procedure) and returns true
   ;; to keep going.  This can be used to disallow requests when
@@ -346,7 +356,14 @@ (define-json-mapping <revision> make-revision revision?
   (id            revision-id)
   (date          revision-date "date" (maybe-null string->date*))
   (directory     revision-directory)
-  (directory-url revision-directory-url "directory_url"))
+  (directory-url revision-directory-url "directory_url")
+  (parents-ids   revision-parent-ids "parents"
+                 (lambda (vector)
+                   (map (lambda (alist)
+                          (assoc-ref alist "id"))
+                        (vector->list vector))))
+  (extra-headers revision-extra-headers      ;alist--e.g., with "svn_revision"
+                 "extra_headers" pair-vector->alist))
 
 ;; <https://archive.softwareheritage.org/api/1/content/>
 (define-json-mapping <content> make-content content?
@@ -524,6 +541,50 @@ (define (lookup-origin-revision url tag)
         (()
          #f)))))
 
+(define (revision-parents revision)
+  "Return the parent revision(s) of REVISION."
+  (filter-map lookup-revision (revision-parent-ids revision)))
+
+(define (lookup-subversion-revision-in-history revision revision-number)
+  "Look for Subversion REVISION-NUMBER starting from REVISION and going back
+in history."
+  (let loop ((revision revision))
+    (let ((number (and=> (assoc-ref (revision-extra-headers revision)
+                                    "svn_revision")
+                         string->number)))
+      (and number
+           (cond ((= number revision-number)
+                  ;; Found it!
+                  revision)
+                 ((< number revision-number)
+                  ;; REVISION is ancestor of REVISION-NUMBER, so stop here.
+                  #f)
+                 (else
+                  ;; Check the parent(s) of REVISION.
+                  (any loop (revision-parents revision))))))))
+
+(define (lookup-subversion-revision url revision-number)
+  "Return either #f or the revision of the Subversion repository once
+available at URL with the given REVISION-NUMBER."
+  (match (lookup-origin url)
+    (#f #f)
+    (origin
+      (match (filter (lambda (visit)
+                       ;; Return #f if (visit-snapshot VISIT) would return #f.
+                       (and (visit-snapshot-url visit)
+                            (eq? 'full (visit-status visit))))
+                     (origin-visits origin))
+        (()
+         #f)
+        ((visit . _)
+         (any (lambda (branch)
+                (match (branch-target branch)
+                  ((? revision? revision)
+                   (lookup-subversion-revision-in-history revision
+                                                          revision-number))
+                  (_ #f)))
+              (snapshot-branches (visit-snapshot visit))))))))
+
 (define (release-target release)
   "Return the revision that is the target of RELEASE."
   (match (release-target-type release)

