Hi!

Ludovic Courtès writes:

> Ah yes, under “extra_headers” there’s the SVN revision number.  (guix
> swh) doesn’t expose “extra_headers” yet, but once it does, we can walk
> snapshots until we find the SVN revision we’re looking for.
>
> scheme@(guile-user)> (lookup-origin "https://scm.gforge.inria.fr/anonscm/svn/mpfi/")
> $3 = #< visits-url: "https://archive.softwareheritage.org/api/1/origin/https://scm.gforge.inria.fr/anonscm/svn/mpfi/visits/" type: #f url: "https://scm.gforge.inria.fr/anonscm/svn/mpfi">
> scheme@(guile-user)> (origin-visits $3)
> $4 = (#< date: # origin: "https://scm.gforge.inria.fr/anonscm/svn/mpfi" url: "https://archive.softwareheritage.org/api/1/origin/https://scm.gforge.inria.fr/anonscm/svn/mpfi/visit/1/" snapshot-url: "https://archive.softwareheritage.org/api/1/snapshot/e7fdd4dc6230f710dbc55c1b308804fa1b5f51f0/" status: full number: 1>)
> scheme@(guile-user)> (visit-snapshot (car $4))
> $5 = #< branches: (#< name: "HEAD" target-type: revision target-url: "https://archive.softwareheritage.org/api/1/revision/f7b445a6bdc38bf075f29265120ca49824f698ea/">)>
>
> So the next step is to augment (guix swh) with a
> ‘lookup-subversion-revision’ procedure.

The attached patch does that:

--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> (lookup-subversion-revision "https://scm.gforge.inria.fr/anonscm/svn/mpfi" 680)
$12 = #< id: "72102de7605a2459ebcb016338ebbf1a997e8c8d" date: # directory: "5c89c025a4cd9d16befdfec12dfc23f7318d0d5b" directory-url: "https://archive.softwareheritage.org/api/1/directory/5c89c025a4cd9d16befdfec12dfc23f7318d0d5b/" parents-ids: ("16da41f1848d77a93aec565320b72b460c429b61") extra-headers: (("svn_repo_uuid" . "e2f78e0c-bb60-4709-9413-9660a31d4696") ("svn_revision" . "680"))>
scheme@(guile-user)> (lookup-subversion-revision "https://scm.gforge.inria.fr/anonscm/svn/mpfi" 666)
$13 = #< id: "148eb1e7206b111af4075c73c656e54c9efed6cb" date: # directory: "ed7b0bd7019fb85cd86d948a97c23b9d43aa8728" directory-url: "https://archive.softwareheritage.org/api/1/directory/ed7b0bd7019fb85cd86d948a97c23b9d43aa8728/" parents-ids: ("0ba2aa7e0d3fc0a1eb3ba72b32094515415ae47a") extra-headers: (("svn_repo_uuid" . "e2f78e0c-bb60-4709-9413-9660a31d4696") ("svn_revision" . "666"))>
--8<---------------cut here---------------end--------------->8---

The implementation is pretty bad though, because it walks the revision
history until it finds the right revision number, so you’re likely to
reach the bandwidth rate limit before you’ve found the revision you’re
looking for.

More importantly, most svn origins cannot be found, or at least not by
passing the URL as-is:

  https://sympa.inria.fr/sympa/arc/swh-devel/2023-03/msg00009.html

This whole hack looks like a dead end.

It would be ideal if SWH would compute nar hashes, as you proposed:

  https://gitlab.softwareheritage.org/swh/meta/-/issues/4538

As a stopgap, I wonder if we could use “double hashing” on our side, but
only for svn: we’d store both the nar sha256 as we currently do, plus
the swhid.  It still seems to me that it’d be hard to scale and to
maintain that over time, even if it’s limited to svn.  Plus, there’d
still be the problem of ‘svn-multi-fetch’, which is what most TeX Live
packages use.

Thoughts?

Ludo’.