* bug#44187: whishlist: time-machine --channel falls back to SWH
@ 2020-10-23 22:17 zimoun
  2021-03-05 14:51 ` Ludovic Courtès
  2021-09-17  8:02 ` bug#44187: Channel clones lack SWH fallback zimoun
  0 siblings, 2 replies; 15+ messages in thread
From: zimoun @ 2020-10-23 22:17 UTC (permalink / raw)
  To: 44187

Dear,

Let’s describe the use case.  Consider that:

    guix time-machine -C channels -- install foo

is provided in some documentation, say a scientific paper, where the
channels.scm file is completely described:

--8<---------------cut here---------------start------------->8---
(list (channel
        (name 'kikoo)
        (url "https://example.org/that-great.git")
        (commit "353bdae32f72b720c7ddd706576ccc40e2b43f95")))
--8<---------------cut here---------------end--------------->8---

In the future, if https://example.org/that-great.git disappears, then
building or installing the package ’foo’ becomes difficult, if not
impossible.

However, let’s consider that the repo ’that-great’ had been saved in SWH
(say manually), since it is a regular Git repo.  Guix should be able to
fall back to it transparently.

Obviously, another wishlist item is to have something to ease the save
request of the channel on SWH.  Maybe the latter could be part of the
several-times discussed “guix channel” subcommand. :-)

All the best,
simon

^ permalink raw reply	[flat|nested] 15+ messages in thread
* bug#44187: whishlist: time-machine --channel falls back to SWH
  2020-10-23 22:17 bug#44187: whishlist: time-machine --channel falls back to SWH zimoun
@ 2021-03-05 14:51 ` Ludovic Courtès
  2021-09-10 14:34   ` bug#44187: [PATCH 0/3] Fall back to Software Heritage (SWH) for Git clones Ludovic Courtès
  2021-09-17  8:02 ` bug#44187: Channel clones lack SWH fallback zimoun
  1 sibling, 1 reply; 15+ messages in thread
From: Ludovic Courtès @ 2021-03-05 14:51 UTC (permalink / raw)
  To: zimoun; +Cc: 44187

[-- Attachment #1: Type: text/plain, Size: 1635 bytes --]

Hi,

zimoun <zimon.toutoune@gmail.com> skribis:

> Let’s describe the use case.  Consider that:
>
>     guix time-machine -C channels -- install foo
>
> is provided in some documentation, say a scientific paper, where the
> channels.scm file is completely described:
>
> (list (channel
>         (name 'kikoo)
>         (url "https://example.org/that-great.git")
>         (commit
>          "353bdae32f72b720c7ddd706576ccc40e2b43f95")))
>
> In the future, if https://example.org/that-great.git disappears, then
> building or installing the package ’foo’ becomes difficult, if not
> impossible.
>
> However, let’s consider that the repo ’that-great’ had been saved in SWH
> (say manually), since it is a regular Git repo.  Guix should be able to
> fall back to it transparently.

I went head-down to add SWH fallback to ‘latest-repository-commit’… but
that’s of no use because (guix channels) wants a complete clone so that
it can determine commit relations (to detect downgrades).

The SWH vault gives access to checkouts primarily, but it’s also
possible to get a full repo in ‘git fast-import’ format, which is what
we need:

  https://archive.softwareheritage.org/api/1/vault/revision/gitfast/doc/

However, according to the SWH developers, this API will eventually be
replaced by some other solution, possibly a bare Git repo export, so it
may not be a good idea to build upon it.

If we were able, using the SWH API, to map “revisions” to “origins”, we could find potential mirrors hosting a given commit, but apparently that’s not possible. To be continued… Ludo’. [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: Type: text/x-patch, Size: 2914 bytes --] diff --git a/guix/git.scm b/guix/git.scm index a5103547d3..449011c51a 100644 --- a/guix/git.scm +++ b/guix/git.scm @@ -32,6 +32,7 @@ #:use-module (guix records) #:use-module (guix gexp) #:use-module (guix sets) + #:autoload (guix swh) (swh-download) #:use-module ((guix diagnostics) #:select (leave)) #:use-module (guix progress) #:use-module (rnrs bytevectors) @@ -459,22 +460,43 @@ Log progress and checkout info to LOG-PORT." (eq? 'regular (stat:type stat)))))) (format log-port "updating checkout of '~a'...~%" url) - (let*-values - (((checkout commit _) - (update-cached-checkout url - #:recursive? recursive? - #:ref ref - #:cache-directory - (url-cache-directory url cache-directory - #:recursive? - recursive?) - #:log-port log-port)) - ((name) - (url+commit->name url commit))) - (format log-port "retrieved commit ~a~%" commit) - (values (add-to-store store name #t "sha256" checkout - #:select? (negate dot-git?)) - commit))) + + (catch 'git-error + (lambda () + (let*-values + (((checkout commit _) + (update-cached-checkout (pk 'l-r-c url) + #:recursive? recursive? + #:ref ref + #:cache-directory + (url-cache-directory url cache-directory + #:recursive? + recursive?) + #:log-port log-port)) + ((name) + (url+commit->name url commit))) + (format log-port "retrieved commit ~a~%" commit) + (values (add-to-store store name #t "sha256" checkout + #:select? (negate dot-git?)) + commit))) + (lambda (key err . rest) + ;; XXX: 'swh-download' currently doesn't support submodules. + (when recursive? + (apply throw key err rest)) + + (pk 'err key err rest) + (match ref + (('commit . commit) + ;; Attempt to fetch COMMIT from SWH. 
+ (call-with-temporary-directory + (lambda (directory) + (unless (swh-download url commit directory) + (apply throw key err rest)) + (values (add-to-store store (url+commit->name url commit) + #t "sha256" directory) + commit)))) + (_ + (apply throw key err rest)))))) (define (print-git-error port key args default-printer) (match args ^ permalink raw reply related [flat|nested] 15+ messages in thread
* bug#44187: [PATCH 0/3] Fall back to Software Heritage (SWH) for Git clones 2021-03-05 14:51 ` Ludovic Courtès @ 2021-09-10 14:34 ` Ludovic Courtès 2021-09-10 14:34 ` bug#44187: [PATCH 1/3] swh: Support downloads of bare Git repositories Ludovic Courtès ` (3 more replies) 0 siblings, 4 replies; 15+ messages in thread From: Ludovic Courtès @ 2021-09-10 14:34 UTC (permalink / raw) To: 44187 Hi! A bit of context: we already had automatic SWH fallback for Git checkouts, which is to say that any origin that uses ‘git-fetch’ would have its checkout transparently fetched from SWH if upstream vanished (this dates back to commit 608d3dca89d73fe7260e97a284a8aeea756a3e11, Nov. 2018). What this patch series provides is SWH fallback for full Git clones (as opposed to flat checkouts). It works for anything that uses (guix git). That includes <git-checkout>, used by transformation options: --8<---------------cut here---------------start------------->8--- $ ./pre-inst-env guix build footswitch --with-git-url=footswitch=http://example.org/sdf --with-commit=footswitch=1eabc563ca5692b3e08d84f1f0e6fd2283284469 -n updating checkout of 'http://example.org/sdf'... 
SWH: found revision 1eabc563ca5692b3e08d84f1f0e6fd2283284469 with directory at 'https://archive.softwareheritage.org/api/1/directory/ad8976564375ee55f645387bbcdf4b66e6582fbf/' swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/ swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/HEAD swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/branches/ swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/config swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/description swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/ swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/applypatch-msg.sample swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/commit-msg.sample swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/fsmonitor-watchman.sample swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/post-update.sample swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/pre-applypatch.sample swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/pre-commit.sample swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/pre-push.sample swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/pre-rebase.sample swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/pre-receive.sample swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/prepare-commit-msg.sample swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/update.sample swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/info/ swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/info/exclude swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/info/refs swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/objects/ swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/objects/info/ swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/objects/info/packs swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/objects/pack/ 
swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/objects/pack/pack-ed28f44a2599fe2d0a5f1b1a84c247c43afd14a1.idx swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/objects/pack/pack-ed28f44a2599fe2d0a5f1b1a84c247c43afd14a1.pack swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/refs/ swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/refs/heads/ swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/refs/heads/master swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/refs/tags/ retrieved commit 1eabc563ca5692b3e08d84f1f0e6fd2283284469 substitute: updating substitutes from 'https://ci.guix.gnu.org'... 100.0% substitute: updating substitutes from 'https://bayfront.guix.gnu.org'... 100.0% The following derivation would be built: /gnu/store/39kzsy5kgj5150q6zgckc2hbxp999adw-footswitch-git.1eabc56.drv --8<---------------cut here---------------end--------------->8--- In the example above, we pass a bogus Git URL, but since the target commit is known, (guix git) automatically fetches a bare Git repository from the SWH vault. It also works for channels, which is what zimoun reported here: --8<---------------cut here---------------start------------->8--- $ cat /tmp/chan.scm (list (channel (name 'guix) (url "https://git.savannah.gnu.org/git/guix.git") (commit "f91ae9425bb385b60396a544afe27933896b8fa3") (introduction (make-channel-introduction "9edb3f66fd807b096b48283debdcddccfea34bad" (openpgp-fingerprint "BBB0 2DDF 2CEA F6A8 0D1D E643 A2A0 6DF2 A33A 54FA")))) (channel (name 'guix-past) (url "https://does-not-exist.inria.fr/guix-hpc/guix-past") (commit "77e183dc7ade307ad3409fad4b71f12e266de910") #;(introduction (make-channel-introduction "0c119db2ea86a389769f4d2b9c6f5c41c027e336" (openpgp-fingerprint "3CE4 6455 8A84 FDC6 9DB4 0CFB 090B 1199 3D9A EBB5"))))) $ ./pre-inst-env guix time-machine -C /tmp/chan.scm -- describe Updating channel 'guix' from Git repository at 'https://git.savannah.gnu.org/git/guix.git'... 
Updating channel 'guix-past' from Git repository at 'https://does-not-exist.inria.fr/guix-hpc/guix-past'... SWH: found revision 77e183dc7ade307ad3409fad4b71f12e266de910 with directory at 'https://archive.softwareheritage.org/api/1/directory/7c6aa10e1e0fa54199566145c6a453731872b87d/' swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/ swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/HEAD swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/branches/ swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/config swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/description swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/hooks/ swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/info/ swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/info/exclude swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/info/refs swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/objects/ swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/objects/info/ swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/objects/info/packs swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/objects/pack/ swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/objects/pack/pack-e6c0a4813509178eed735708dd60503353a50b9c.idx swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/objects/pack/pack-e6c0a4813509178eed735708dd60503353a50b9c.pack swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/refs/ swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/refs/heads/ swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/refs/heads/master swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/refs/tags/ Computing Guix derivation for 'x86_64-linux'... \ C-c C-c --8<---------------cut here---------------end--------------->8--- Here, the ‘guix-past’ channel is transparently cloned from SWH. This is pretty cool, because having the whole repo around is what permits things like downgrade prevention¹ and news support². 
Finally we can enjoy content-addressability and brittle URLs
are becoming a thing of the past!*

Limitations
~~~~~~~~~~~~

Yes, there are a couple of them.

First, fallback is implemented only for fresh clones, not for updates.
Thus, if I rerun the first example with a different commit, now that the
clone is in ~/.cache/guix/checkouts, I get:

--8<---------------cut here---------------start------------->8---
$ ./pre-inst-env guix build footswitch --with-git-url=footswitch=http://example.org/sdf --with-commit=footswitch=aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa -n
updating checkout of 'http://example.org/sdf'...
guix build: error: Git failure while fetching http://example.org/sdf: unexpected http status code: 404
--8<---------------cut here---------------end--------------->8---

Second, clones from SWH only contain the one branch that the revision is
on.  For channels, that means that the ‘keyring’ branch is not fetched,
which is why I commented out ‘introduction’ in /tmp/chan.scm above.  If
I uncomment it, I get:

--8<---------------cut here---------------start------------->8---
$ ./pre-inst-env guix time-machine -C /tmp/chan.scm -- describe
Updating channel 'guix' from Git repository at 'https://git.savannah.gnu.org/git/guix.git'...
Updating channel 'guix-past' from Git repository at 'https://does-not-exist.inria.fr/guix-hpc/guix-past'...
guix time-machine: error: Git error: cannot locate remote-tracking branch 'origin/keyring'
--8<---------------cut here---------------end--------------->8---

The SWH folks tell me it’ll eventually be possible to map a revision to
its containing snapshot(s) via the HTTP API, and to obtain entire
snapshots (i.e., the repo and all its branches) from the vault.  That’s
what we need to fix this issue.

*Third, and this answers the asterisk above, we must keep in mind that
this is content-addressability *with SHA1*.
Generating a chosen-prefix collision is becoming affordable³, so
users absolutely need an additional mechanism to authenticate code they
fetched.  For origins, we have the content SHA256, so we’re fine.  For
channels, we have Guix’s authentication mechanism¹, except it’s not
available yet via SWH, as I wrote above.  For the footswitch example
above using ‘--with-commit’, we don’t have any authentication method,
but in fact, that’s the situation of Git repositories in general: they
can rarely be authenticated.

Overall, I think it’s a step in the right direction.

Thoughts?

Thanks to vlorentz and olasd on #swh-devel for their support!

Thanks,
Ludo’.

¹ https://guix.gnu.org/en/blog/2020/securing-updates/
² https://guix.gnu.org/en/blog/2019/spreading-the-news/
³ https://sha-mbles.github.io/

Ludovic Courtès (3):
  swh: Support downloads of bare Git repositories.
  git: 'update-cached-checkout' can fall back to SWH when cloning.
  git: 'reference-available?' recognizes 'tag-or-commit'.

 guix/git.scm | 45 +++++++++++++++++++++++++++++++++++++++++++--
 guix/swh.scm | 52 ++++++++++++++++++++++++++++++++++++++++------------
 2 files changed, 83 insertions(+), 14 deletions(-)

-- 
2.33.0

^ permalink raw reply	[flat|nested] 15+ messages in thread
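To make the content-addressing point concrete: a Git object ID is the
SHA-1 of a short header followed by the object's bytes, which is why an
ID alone is enough to look an object up in an archive, whatever URL it
was first published at.  The sketch below is in Python rather than Guile
so that it runs without Guix; it hashes a blob, not a commit, and the
name `git_blob_id` is made up for illustration.

```python
import hashlib

def git_blob_id(data: bytes) -> str:
    """Return the Git object ID of a blob holding DATA."""
    # Git hashes the header "blob <size>\0" followed by the raw content.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()

# Same bytes give the same ID, regardless of where the file is hosted.
print(git_blob_id(b""))   # the well-known ID of the empty blob
```

The identifier is only as strong as SHA-1, hence the need for the
separate SHA256 or channel authentication discussed in the message.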
* bug#44187: [PATCH 1/3] swh: Support downloads of bare Git repositories. 2021-09-10 14:34 ` bug#44187: [PATCH 0/3] Fall back to Software Heritage (SWH) for Git clones Ludovic Courtès @ 2021-09-10 14:34 ` Ludovic Courtès 2021-09-17 17:31 ` bug#44187: Channel clones lack SWH fallback zimoun 2021-09-10 14:34 ` bug#44187: [PATCH 2/3] git: 'update-cached-checkout' can fall back to SWH when cloning Ludovic Courtès ` (2 subsequent siblings) 3 siblings, 1 reply; 15+ messages in thread From: Ludovic Courtès @ 2021-09-10 14:34 UTC (permalink / raw) To: 44187; +Cc: Ludovic Courtès From: Ludovic Courtès <ludovic.courtes@inria.fr> * guix/swh.scm (swh-download-archive): New procedure. (swh-download-directory): Rewrite in terms of 'swh-download-archive'. (swh-download): Add #:archive-type and honor it. Use 'swh-download-archive' instead of 'swh-download-directory'. --- guix/swh.scm | 52 ++++++++++++++++++++++++++++++++++++++++------------ 1 file changed, 40 insertions(+), 12 deletions(-) diff --git a/guix/swh.scm b/guix/swh.scm index 3d5d2a410a..707551a799 100644 --- a/guix/swh.scm +++ b/guix/swh.scm @@ -645,20 +645,29 @@ delete it when leaving the dynamic extent of this call." (lambda () (false-if-exception (delete-file-recursively tmp-dir)))))) -(define* (swh-download-directory id output - #:key (log-port (current-error-port))) - "Download from Software Heritage the directory with the given ID, and -unpack it to OUTPUT. Return #t on success and #f on failure" +(define* (swh-download-archive swhid output + #:key + (archive-type 'flat) + (log-port (current-error-port))) + "Download from Software Heritage the directory or revision with the given +SWID, in the ARCHIVE-TYPE format (one of 'flat or 'git-bare), and unpack it to +OUTPUT. Return #t on success and #f on failure." 
(call-with-temporary-directory (lambda (directory) - (match (vault-fetch id 'directory #:log-port log-port) + (match (vault-fetch swhid + #:archive-type archive-type + #:log-port log-port) (#f (format log-port - "SWH: directory ~a could not be fetched from the vault~%" - id) + "SWH: object ~a could not be fetched from the vault~%" + swhid) #f) ((? port? input) - (let ((tar (open-pipe* OPEN_WRITE "tar" "-C" directory "-xzvf" "-"))) + (let ((tar (open-pipe* OPEN_WRITE "tar" "-C" directory + (match archive-type + ('flat "-xzvf") ;gzipped + ('git-bare "-xvf")) ;uncompressed + "-"))) (dump-port input tar) (close-port input) (let ((status (close-pipe tar))) @@ -672,6 +681,14 @@ unpack it to OUTPUT. Return #t on success and #f on failure" #:log (%make-void-port "w")) #t)))))))) +(define* (swh-download-directory id output + #:key (log-port (current-error-port))) + "Download from Software Heritage the directory with the given ID, and +unpack it to OUTPUT. Return #t on success and #f on failure." + (swh-download-archive (string-append "swh:1:dir:" id) output + #:archive-type 'flat + #:log-port log-port)) + (define (commit-id? reference) "Return true if REFERENCE is likely a commit ID, false otherwise---e.g., if it is a tag name. This is based on a simple heuristic so use with care!" @@ -679,8 +696,11 @@ it is a tag name. This is based on a simple heuristic so use with care!" (string-every char-set:hex-digit reference))) (define* (swh-download url reference output - #:key (log-port (current-error-port))) - "Download from Software Heritage a checkout of the Git tag or commit + #:key + (archive-type 'flat) + (log-port (current-error-port))) + "Download from Software Heritage a checkout (if ARCHIVE-TYPE is 'flat) or a +full Git repository (if ARCHIVE-TYPE is 'git-bare) of the Git tag or commit REFERENCE originating from URL, and unpack it in OUTPUT. Return #t on success and #f on failure. @@ -694,7 +714,15 @@ wait until it becomes available, which could take several minutes." 
(format log-port "SWH: found revision ~a with directory at '~a'~%" (revision-id revision) (swh-url (revision-directory-url revision))) - (swh-download-directory (revision-directory revision) output - #:log-port log-port)) + (swh-download-archive (match archive-type + ('flat + (string-append + "swh:1:dir:" (revision-directory revision))) + ('git-bare + (string-append + "swh:1:rev:" (revision-id revision)))) + output + #:archive-type archive-type + #:log-port log-port)) (#f #f))) -- 2.33.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
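The dispatch this patch adds to 'swh-download' and 'swh-download-archive'
boils down to a small table: a 'flat' request targets the revision's
directory and comes back gzipped, while a 'git-bare' request targets the
revision itself and comes back as an uncompressed tarball.  Below is a
minimal sketch in Python (not Guile, so that it runs without Guix).
`vault_request` is a hypothetical name; it only computes the identifier
and the tar flags and does not talk to the vault.

```python
def vault_request(archive_type, revision_id, directory_id):
    """Return (swhid, tar_flags) for ARCHIVE_TYPE, mirroring the patch."""
    if archive_type == "flat":
        # Flat checkout: ask for the directory, unpack a gzipped tarball.
        return "swh:1:dir:" + directory_id, "-xzvf"
    if archive_type == "git-bare":
        # Bare repository: ask for the revision, tarball is uncompressed.
        return "swh:1:rev:" + revision_id, "-xvf"
    raise ValueError("unknown archive type: " + repr(archive_type))

# Identifiers taken from the footswitch log in the cover letter.
rev = "1eabc563ca5692b3e08d84f1f0e6fd2283284469"
tree = "ad8976564375ee55f645387bbcdf4b66e6582fbf"
print(vault_request("git-bare", rev, tree))
```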
* bug#44187: Channel clones lack SWH fallback
  2021-09-10 14:34   ` bug#44187: [PATCH 1/3] swh: Support downloads of bare Git repositories Ludovic Courtès
@ 2021-09-17 17:31     ` zimoun
  2021-09-18 10:05       ` Ludovic Courtès
  0 siblings, 1 reply; 15+ messages in thread
From: zimoun @ 2021-09-17 17:31 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 44187, Ludovic Courtès

Hi Ludo,

The patch LGTM although there is a redundancy, from my understanding.

On Fri, 10 Sep 2021 at 16:34, Ludovic Courtès <ludo@gnu.org> wrote:

> @@ -694,7 +714,15 @@ wait until it becomes available, which could take several minutes."
>         (format log-port "SWH: found revision ~a with directory at '~a'~%"
>                 (revision-id revision)
>                 (swh-url (revision-directory-url revision)))
> -       (swh-download-directory (revision-directory revision) output
> -                               #:log-port log-port))
> +       (swh-download-archive (match archive-type
> +                               ('flat
> +                                (string-append
> +                                 "swh:1:dir:" (revision-directory revision)))
> +                               ('git-bare
> +                                (string-append
> +                                 "swh:1:rev:" (revision-id revision))))

Here the ’swhid’ depends on the ’archive-type’…

> +                             output
> +                             #:archive-type archive-type

…which is also passed.  Then this is propagated.  For instance,
’swh-download-directory’:

> +(define* (swh-download-directory id output
> +                                 #:key (log-port (current-error-port)))
> +  "Download from Software Heritage the directory with the given ID, and
> +unpack it to OUTPUT.  Return #t on success and #f on failure."
> +  (swh-download-archive (string-append "swh:1:dir:" id) output
> +                        #:archive-type 'flat
> +                        #:log-port log-port))
> +

Does it make sense to pass a ’swhid’ of the form ’swh:1:rev’ with the
’flat’ archive-type?  Another instance is:

> +       (match (vault-fetch swhid
> +                           #:archive-type archive-type
> +                           #:log-port log-port)

and from my understanding, again ’swhid’ depends on ’archive-type’.
Therefore, it is error-prone.  The best option seems to be to pass
’(archive-type . swhid)’ and pattern-match on that.

Yeah, it potentially breaks the public API… but there is no claim about
stability (and I am not convinced this (guix swh) module is used outside
Guix :-)).

Cheers,
simon

^ permalink raw reply	[flat|nested] 15+ messages in thread
* bug#44187: Channel clones lack SWH fallback
  2021-09-17 17:31     ` bug#44187: Channel clones lack SWH fallback zimoun
@ 2021-09-18 10:05       ` Ludovic Courtès
  2021-09-18 10:27         ` zimoun
  0 siblings, 1 reply; 15+ messages in thread
From: Ludovic Courtès @ 2021-09-18 10:05 UTC (permalink / raw)
  To: zimoun; +Cc: 44187

Hi!

zimoun <zimon.toutoune@gmail.com> skribis:

> The patch LGTM although there is a redundancy, from my understanding.
>
> On Fri, 10 Sep 2021 at 16:34, Ludovic Courtès <ludo@gnu.org> wrote:
>
>> @@ -694,7 +714,15 @@ wait until it becomes available, which could take several minutes."
>>         (format log-port "SWH: found revision ~a with directory at '~a'~%"
>>                 (revision-id revision)
>>                 (swh-url (revision-directory-url revision)))
>> -       (swh-download-directory (revision-directory revision) output
>> -                               #:log-port log-port))
>> +       (swh-download-archive (match archive-type
>> +                               ('flat
>> +                                (string-append
>> +                                 "swh:1:dir:" (revision-directory revision)))
>> +                               ('git-bare
>> +                                (string-append
>> +                                 "swh:1:rev:" (revision-id revision))))
>
> Here the ’swhid’ depends on the ’archive-type’…
>
>> +                             output
>> +                             #:archive-type archive-type
>
> …which is also passed.  Then this is propagated.  For instance,
> ’swh-download-directory’:
>
>> +(define* (swh-download-directory id output
>> +                                 #:key (log-port (current-error-port)))
>> +  "Download from Software Heritage the directory with the given ID, and
>> +unpack it to OUTPUT.  Return #t on success and #f on failure."
>> +  (swh-download-archive (string-append "swh:1:dir:" id) output
>> +                        #:archive-type 'flat
>> +                        #:log-port log-port))
>> +
>
> Does it make sense to pass a ’swhid’ of the form ’swh:1:rev’ with the
> ’flat’ archive-type?  Another instance is:
>
>> +       (match (vault-fetch swhid
>> +                           #:archive-type archive-type
>> +                           #:log-port log-port)
>
> and from my understanding, again ’swhid’ depends on ’archive-type’.
> Therefore, it is error-prone.

‘git-bare’ only makes sense for a revision, not a directory, but I
wonder if ‘flat’ can be used for a revision (in which case it’d be
equivalent to getting the corresponding directory)?

I agree there’s some redundancy between directory/revision and
flat/git-bare, but it’s the SWH API that looks like this, so I’d be
tempted to just keep it as is.  Maybe we could ask for guidance on
#swh-devel.

Thanks!

Ludo’.

^ permalink raw reply	[flat|nested] 15+ messages in thread
* bug#44187: Channel clones lack SWH fallback
  2021-09-18 10:05       ` Ludovic Courtès
@ 2021-09-18 10:27         ` zimoun
  0 siblings, 0 replies; 15+ messages in thread
From: zimoun @ 2021-09-18 10:27 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 44187

Hi,

On Sat, 18 Sept 2021 at 12:05, Ludovic Courtès <ludo@gnu.org> wrote:
> zimoun <zimon.toutoune@gmail.com> skribis:

> > Does it make sense to pass a ’swhid’ of the form ’swh:1:rev’ with the
> > ’flat’ archive-type?  Another instance is:

[...]

> > and from my understanding, again ’swhid’ depends on ’archive-type’.
> > Therefore, it is error-prone.
>
> ‘git-bare’ only makes sense for a revision, not a directory, but I

So it does not seem possible to form a 'swhid' as "swh:1:dir" and pass
'archive-type' as 'git-bare'.  And conversely with 'swh:1:rev' and
'flat'.  Right?  I have not tried though. :-)

If so, it means that both arguments, 'swhid' and 'archive-type', are
linked, so the function should accept a single unified argument and not
two independent ones.  IMHO.

> wonder if ‘flat’ can be used for a revision (in which case it’d be
> equivalent to getting the corresponding directory)?
>
> I agree there’s some redundancy between directory/revision and
> flat/git-bare, but it’s the SWH API that looks like this, so I’d be
> tempted to just keep it as is.  Maybe we could ask for guidance on
> #swh-devel.

Well, let's postpone the refactoring. :-)  However, if it works as I
understand, then the refactoring seems the correct way, so I would not
accept a backward compatibility argument. ;-)

Have a nice week-end,
simon

^ permalink raw reply	[flat|nested] 15+ messages in thread
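One way to read the suggestion in this sub-thread is to let the SWHID
itself say which kind of object is requested and to reject combinations
that patch 1/3 never produces.  The Python sketch below uses hypothetical
names (the real module is Guile) and handles core identifiers only, with
no qualifiers.  It assumes that only the two pairings used in the patch
are valid; whether 'flat' also works with a revision is the question left
open above.

```python
# Pairings used by patch 1/3; anything else is treated as a caller error.
ALLOWED = {("flat", "dir"), ("git-bare", "rev")}

def parse_swhid(swhid):
    """Split 'swh:1:<type>:<hex>' into (object_type, object_id)."""
    # Unpacking raises ValueError if there are not exactly four fields.
    scheme, version, object_type, object_id = swhid.split(":")
    if scheme != "swh" or version != "1":
        raise ValueError("not a version 1 SWHID: " + swhid)
    return object_type, object_id

def check_request(archive_type, swhid):
    """Return SWHID unchanged if it is consistent with ARCHIVE_TYPE."""
    object_type, _ = parse_swhid(swhid)
    if (archive_type, object_type) not in ALLOWED:
        raise ValueError(archive_type + " cannot be used with " + swhid)
    return swhid
```

With such a check at the entry point, a mismatched pair fails early
instead of producing a vault request that cannot succeed.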
* bug#44187: [PATCH 2/3] git: 'update-cached-checkout' can fall back to SWH when cloning. 2021-09-10 14:34 ` bug#44187: [PATCH 0/3] Fall back to Software Heritage (SWH) for Git clones Ludovic Courtès 2021-09-10 14:34 ` bug#44187: [PATCH 1/3] swh: Support downloads of bare Git repositories Ludovic Courtès @ 2021-09-10 14:34 ` Ludovic Courtès 2021-09-10 14:34 ` bug#44187: [PATCH 3/3] git: 'reference-available?' recognizes 'tag-or-commit' Ludovic Courtès 2021-09-13 16:07 ` bug#44187: [PATCH 0/3] Fall back to Software Heritage (SWH) for Git clones zimoun 3 siblings, 0 replies; 15+ messages in thread From: Ludovic Courtès @ 2021-09-10 14:34 UTC (permalink / raw) To: 44187; +Cc: Ludovic Courtès From: Ludovic Courtès <ludovic.courtes@inria.fr> Fixes <https://issues.guix.gnu.org/44187>. Reported by zimoun <zimon.toutoune@gmail.com>. * guix/git.scm (GITERR_HTTP): New variable. (clone-from-swh, clone/swh-fallback): New procedures. (update-cached-checkout): Use 'clone/swh-fallback' instead of 'clone*'. --- guix/git.scm | 42 +++++++++++++++++++++++++++++++++++++++++- 1 file changed, 41 insertions(+), 1 deletion(-) diff --git a/guix/git.scm b/guix/git.scm index acc48fd12f..377e09888a 100644 --- a/guix/git.scm +++ b/guix/git.scm @@ -36,6 +36,7 @@ #:use-module (guix sets) #:use-module ((guix diagnostics) #:select (leave)) #:use-module (guix progress) + #:autoload (guix swh) (swh-download) #:use-module (rnrs bytevectors) #:use-module (ice-9 format) #:use-module (ice-9 match) @@ -180,6 +181,13 @@ the 'SSL_CERT_FILE' and 'SSL_CERT_DIR' environment variables." (lambda args (make-fetch-options auth-method))))) +(define GITERR_HTTP + ;; Guile-Git <= 0.5.2 lacks this constant. + (let ((errors (resolve-interface '(git errors)))) + (if (module-defined? errors 'GITERR_HTTP) + (module-ref errors 'GITERR_HTTP) + 34))) + (define (clone* url directory) "Clone git repository at URL into DIRECTORY. Upon failure, make sure no empty directory is left behind." 
@@ -342,6 +350,38 @@ definitely available in REPOSITORY, false otherwise." (_ #f))) +(define (clone-from-swh url tag-or-commit output) + "Attempt to clone TAG-OR-COMMIT (a string), which originates from URL, using +a copy archived at Software Heritage." + (call-with-temporary-directory + (lambda (bare) + (and (swh-download url tag-or-commit bare + #:archive-type 'git-bare) + (let ((repository (clone* bare output))) + (remote-set-url! repository "origin" url) + repository))))) + +(define (clone/swh-fallback url ref cache-directory) + "Like 'clone', but fallback to Software Heritage if the repository cannot be +found at URL." + (define (inaccessible-url-error? err) + (let ((class (git-error-class err)) + (code (git-error-code err))) + (or (= class GITERR_HTTP) ;404 or similar + (= class GITERR_NET)))) ;unknown host, etc. + + (catch 'git-error + (lambda () + (clone* url cache-directory)) + (lambda (key err) + (match ref + (((or 'commit 'tag-or-commit) . commit) + (if (inaccessible-url-error? err) + (or (clone-from-swh url commit cache-directory) + (throw key err)) + (throw key err))) + (_ (throw key err)))))) + (define cached-checkout-expiration ;; Return the expiration time procedure for a cached checkout. ;; TODO: Honor $GUIX_GIT_CACHE_EXPIRATION. @@ -408,7 +448,7 @@ it unchanged." (let* ((cache-exists? (openable-repository? cache-directory)) (repository (if cache-exists? (repository-open cache-directory) - (clone* url cache-directory)))) + (clone/swh-fallback url ref cache-directory)))) ;; Only fetch remote if it has not been cloned just before. (when (and cache-exists? (not (reference-available? repository ref))) -- 2.33.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
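The control flow of 'clone/swh-fallback' is easy to lose in the diff, so
here is a minimal sketch of it in Python (not Guile, so that it runs
without Guix or Guile-Git).  `GitError`, `clone_with_fallback` and the
two callables are stand-ins invented for illustration.  The value 34 for
GITERR_HTTP comes from the patch; the value 12 for GITERR_NET is an
assumption based on libgit2's error classes.

```python
GITERR_NET = 12    # assumed value, from libgit2's error classes
GITERR_HTTP = 34   # value hard-coded in the patch for older Guile-Git

class GitError(Exception):
    """Stand-in for a 'git-error' exception carrying an error class."""
    def __init__(self, error_class, message=""):
        super().__init__(message)
        self.error_class = error_class

def clone_with_fallback(url, ref, clone, clone_from_archive):
    """Try CLONE(url); if URL is unreachable and REF is pinned, try the archive."""
    try:
        return clone(url)
    except GitError as err:
        kind, value = ref
        pinned = kind in ("commit", "tag-or-commit")
        unreachable = err.error_class in (GITERR_HTTP, GITERR_NET)
        if pinned and unreachable:
            repository = clone_from_archive(url, value)
            if repository:
                return repository
        raise  # re-raise the original error, as the patch does
```

A branch reference never triggers the fallback, since there is no commit
to look up in the archive, and any other class of Git error is passed
through untouched.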
* bug#44187: [PATCH 3/3] git: 'reference-available?' recognizes 'tag-or-commit'. 2021-09-10 14:34 ` bug#44187: [PATCH 0/3] Fall back to Software Heritage (SWH) for Git clones Ludovic Courtès 2021-09-10 14:34 ` bug#44187: [PATCH 1/3] swh: Support downloads of bare Git repositories Ludovic Courtès 2021-09-10 14:34 ` bug#44187: [PATCH 2/3] git: 'update-cached-checkout' can fall back to SWH when cloning Ludovic Courtès @ 2021-09-10 14:34 ` Ludovic Courtès 2021-09-13 16:07 ` bug#44187: [PATCH 0/3] Fall back to Software Heritage (SWH) for Git clones zimoun 3 siblings, 0 replies; 15+ messages in thread From: Ludovic Courtès @ 2021-09-10 14:34 UTC (permalink / raw) To: 44187 * guix/git.scm (reference-available?): Handle 'tag-or-commit' with a 40-digit hex string. --- guix/git.scm | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/guix/git.scm b/guix/git.scm index 377e09888a..33a111b84a 100644 --- a/guix/git.scm +++ b/guix/git.scm @@ -36,7 +36,7 @@ #:use-module (guix sets) #:use-module ((guix diagnostics) #:select (leave)) #:use-module (guix progress) - #:autoload (guix swh) (swh-download) + #:autoload (guix swh) (swh-download commit-id?) #:use-module (rnrs bytevectors) #:use-module (ice-9 format) #:use-module (ice-9 match) @@ -340,7 +340,8 @@ dynamic extent of EXP." "Return true if REF, a reference such as '(commit . \"cabba9e\"), is definitely available in REPOSITORY, false otherwise." (match ref - (('commit . commit) + ((or ('commit . commit) + ('tag-or-commit . (? commit-id? commit))) (let ((len (string-length commit)) (oid (string->oid commit))) (false-if-git-not-found -- 2.33.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
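The new pattern in 'reference-available?' relies on the 'commit-id?'
heuristic from (guix swh): a 'tag-or-commit' reference is treated as a
commit only when it is a 40-digit hexadecimal string.  A minimal Python
sketch of that decision follows (hypothetical names, written in Python
so that it runs without Guix).

```python
import string

def looks_like_commit_id(reference):
    """True if REFERENCE is 40 hexadecimal digits; a heuristic only."""
    return len(reference) == 40 and all(c in string.hexdigits for c in reference)

def names_a_commit(ref):
    """Mirror the pattern added to 'reference-available?'."""
    kind, value = ref
    if kind == "commit":
        # Explicit commits are accepted as is, abbreviated or not.
        return True
    return kind == "tag-or-commit" and looks_like_commit_id(value)

print(names_a_commit(("tag-or-commit", "v1.0")))
```

A tag that happens to be named with 40 hexadecimal digits would be
misclassified, which is why the original docstring says to use the
heuristic with care.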
* bug#44187: [PATCH 0/3] Fall back to Software Heritage (SWH) for Git clones
  2021-09-10 14:34   ` bug#44187: [PATCH 0/3] Fall back to Software Heritage (SWH) for Git clones Ludovic Courtès
                       ` (2 preceding siblings ...)
  2021-09-10 14:34     ` bug#44187: [PATCH 3/3] git: 'reference-available?' recognizes 'tag-or-commit' Ludovic Courtès
@ 2021-09-13 16:07     ` zimoun
  2021-09-14 13:37       ` Ludovic Courtès
  3 siblings, 1 reply; 15+ messages in thread
From: zimoun @ 2021-09-13 16:07 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 44187

Hi Ludo,

Cool!  However, the patch does not apply on top of 53f54d4aa2.
That's why the option '--base' of "git format-patch" is really helpful. ;-)

Onto which commit does the patch set apply?  I ask in order to try and
review it. :-)

Cheers,
simon

^ permalink raw reply	[flat|nested] 15+ messages in thread
* bug#44187: [PATCH 0/3] Fall back to Software Heritage (SWH) for Git clones 2021-09-13 16:07 ` bug#44187: [PATCH 0/3] Fall back to Software Heritage (SWH) for Git clones zimoun @ 2021-09-14 13:37 ` Ludovic Courtès 0 siblings, 0 replies; 15+ messages in thread From: Ludovic Courtès @ 2021-09-14 13:37 UTC (permalink / raw) To: zimoun; +Cc: 44187 Hello, zimoun <zimon.toutoune@gmail.com> skribis: > Cool! However, the patch does not apply on top of 53f54d4aa2. > That's why the option '--base' of "git format-patch" is really helpful. ;-) Ah! It should apply on top of ff613c2b68aac539262822490448e637d8f315ba. If not, I can rebase it and send an updated patch (I’ve been fiddling with code in this area lately…). Thanks, Ludo’. ^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#44187: Channel clones lack SWH fallback 2020-10-23 22:17 bug#44187: whishlist: time-machine --channel falls back to SWH zimoun 2021-03-05 14:51 ` Ludovic Courtès @ 2021-09-17 8:02 ` zimoun 2021-09-18 21:10 ` Ludovic Courtès 1 sibling, 1 reply; 15+ messages in thread From: zimoun @ 2021-09-17 8:02 UTC (permalink / raw) To: Ludovic Courtès; +Cc: 44187 Hi, On Fri, 10 Sept 2021 at 16:34, Ludovic Courtès <ludo@gnu.org> wrote: > Finally we can enjoy content-addressability and brittle URLs > are becoming a thing of the past!* Yeah, it is awesome! The original URL of the channel was: <https://github.com/zimoun/channel-example.git>. And this channel defines a package where the upstream has also disappeared <https://github.com/zimoun/hello-example.git>. Note the URL in the package definition is not bogus… but using one was already working. :-) All is saved on SWH, so now all is transparent! From my point of view, this is a killer feature for scientific folks. :-) --8<---------------cut here---------------start------------->8--- $ cat /tmp/channels.scm (list (channel (name 'guix) (url "/home/sitour/src/guix/guix") (branch "fix-44187") (commit "cdea76a2fdaf7705583a02081a6468d436b8df05")) (channel (name 'example) (url "https://example.org/foo.git") (commit "67c9f2143aa6f545419ae913b4ae02af4cd3effc"))) $ ./pre-inst-env guix time-machine -C /tmp/channels.scm --disable-authentication -- build hi Updating channel 'guix' from Git repository at '/home/sitour/src/guix/guix'... guix time-machine: warning: channel authentication disabled Updating channel 'example' from Git repository at 'https://example.org/foo.git'... SWH: found revision 67c9f2143aa6f545419ae913b4ae02af4cd3effc with directory at 'https://archive.softwareheritage.org/api/1/directory/fe423e88ce277d3fc230c88d408e42b14a3a458c/' SWH vault: requested bundle cooking, waiting for completion... 
swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/ swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/HEAD swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/branches/ swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/config swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/description swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/hooks/ swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/info/ swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/info/exclude swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/info/refs swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/objects/ swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/objects/info/ swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/objects/info/packs swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/objects/pack/ swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/objects/pack/pack-4e9279a1b64e4dda7bd9d84bb6b50bb1f80def08.idx swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/objects/pack/pack-4e9279a1b64e4dda7bd9d84bb6b50bb1f80def08.pack swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/refs/ swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/refs/heads/ swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/refs/heads/master swh:1:rev:67c9f2143aa6f545419ae913b4ae02af4cd3effc.git/refs/tags/ guix time-machine: warning: channel authentication disabled [...] Computing Guix derivation for 'x86_64-linux'... - [...] building /gnu/store/6g9qlysbbk7p4609xrv82j0wzbib1y4r-git-checkout.drv... guile: warning: failed to install locale environment variable `PATH' set to `/gnu/store/378zjf2kgajcfd7mfr98jn5xyc5wa3qv-gzip-1.10/bin:/gnu/store/sf3rbvb6iqcphgm1afbplcs72hsywg25-tar-1.32/bin' hint: Using 'master' as the name for the initial branch. This default branch name hint: is subject to change. 
To configure the initial branch name to use in all hint: of your new repositories, which will suppress this warning, call: hint: hint: git config --global init.defaultBranch <name> hint: hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and hint: 'development'. The just-created branch can be renamed via this command: hint: hint: git branch -m <name> Initialized empty Git repository in /gnu/store/884nsva9r8wkp40kbqyvpj1ad57jc5dd-git-checkout/.git/ fatal: could not read Username for 'https://github.com': No such device or address Failed to do a shallow fetch; retrying a full fetch... fatal: could not read Username for 'https://github.com': No such device or address git-fetch: '/gnu/store/5vai7bfrfkzv22dx13bxpszjrqyi78x6-git-minimal-2.33.0/bin/git fetch origin' failed with exit code 128 Trying content-addressed mirror at berlin.guix.gnu.org... Trying content-addressed mirror at berlin.guix.gnu.org... Trying to download from Software Heritage... SWH: found revision e1eefd033b8a2c4c81babc6fde08ebb116c6abb8 with directory at 'https://archive.softwareheritage.org/api/1/directory/c3e538ed2de412d54c567ed7c8cfc46cbbc35d07/' swh:1:dir:c3e538ed2de412d54c567ed7c8cfc46cbbc35d07/ swh:1:dir:c3e538ed2de412d54c567ed7c8cfc46cbbc35d07/ABOUT-NLS swh:1:dir:c3e538ed2de412d54c567ed7c8cfc46cbbc35d07/AUTHORS swh:1:dir:c3e538ed2de412d54c567ed7c8cfc46cbbc35d07/COPYING [...] swh:1:dir:c3e538ed2de412d54c567ed7c8cfc46cbbc35d07/tests/hello-1 swh:1:dir:c3e538ed2de412d54c567ed7c8cfc46cbbc35d07/tests/last-1 swh:1:dir:c3e538ed2de412d54c567ed7c8cfc46cbbc35d07/tests/traditional-1 successfully built /gnu/store/6g9qlysbbk7p4609xrv82j0wzbib1y4r-git-checkout.drv building /gnu/store/jx1r7w8xaw768176pjl0j0q1l1529w75-hi-2.10.drv... starting phase `set-SOURCE-DATE-EPOCH' phase `set-SOURCE-DATE-EPOCH' succeeded after 0.0 seconds [...] 
successfully built /gnu/store/jx1r7w8xaw768176pjl0j0q1l1529w75-hi-2.10.drv /gnu/store/jn8d031zx4znxy7s5zhj4dbr6xjsfq9v-hi-2.10 --8<---------------cut here---------------end--------------->8--- Well, the fallback for tarballs and non-Git fetch methods is still missing; with it, the story will be more than awesome! :-) > Limitations > ~~~~~~~~~~~~ > > Yes, there’s a couple of them. Well, yes, some limitations, but not that many. ;-) > First, fallback is implemented only for fresh clones, not for updates. > Thus, if I rerun the first example, having now the clone in > ~/.cache/guix/checkouts, with a different commit, I get: SWH is not a forge but an archive. :-) Therefore, this update case does not make sense to me. I mean, --8<---------------cut here---------------start------------->8--- $ git -C ~/.cache/guix/checkouts/6k7wvrcpbdsw3pje5b4squybw3jfn3viyrj7gcl7fipa5yjflaza fetch fatal: repository 'http://example.org/sdf/' not found --8<---------------cut here---------------end--------------->8--- Well, maybe this cache could be removed when the commit is not found in it, and the fetch retried from SWH. Obviously, the downgrade case works. Note that on a fresh clone, the error message could be improved: --8<---------------cut here---------------start------------->8--- $ ./pre-inst-env guix build guix --with-git-url=guix=https://example.org --with-commit=guix=ff613c2b68aac539262822490448e637d8f315ba -n updating checkout of 'https://example.org'... guix build: error: Git failure while fetching https://example.org: unexpected http status code: 404 --8<---------------cut here---------------end--------------->8--- where https://example.org is bogus and ff613c2b68aac539262822490448e637d8f315ba is not yet archived on SWH. It would be nice to warn, in addition to the 404, that it is not found on SWH either. WDYT? > Second, clones from SWH only contain the one branch that the revision > is on. 
For channels, that means that the ‘keyring’ branch is not fetched, > which is why I commented out ‘introduction’ in /tmp/chan.scm above. To me, it is not an issue, because you reach a commit from the past knowing its hash. My opinion aside, I wanted to know which kind of metadata we get back from the Git repo, so I tried: --8<---------------cut here---------------start------------->8--- $ guix build guix --with-git-url=guix=https://example.org --with-commit=guix=c75b30d58f0becb0a5cd6a8bfe69d1063b0d1ada -n updating checkout of 'https://example.org'... SWH: found revision c75b30d58f0becb0a5cd6a8bfe69d1063b0d1ada with directory at 'https://archive.softwareheritage.org/api/1/directory/ca2e8a7222b4850c7bea935dff86b9c2a905efd6/' SWH vault: requested bundle cooking, waiting for completion... SWH vault: Processing... [...] --8<---------------cut here---------------end--------------->8--- then after several hours, I get this: --8<---------------cut here---------------start------------->8--- SWH vault: failure: Internal Server Error. This incident will be reported. SWH vault: retrying... SWH vault: requested bundle cooking, waiting for completion... SWH vault: Processing... --8<---------------cut here---------------end--------------->8--- and after more than 12h, the status is still: «SWH vault: Processing...» and nothing is complete. About this ’keyring’ branch: it could just as well live in a separate repo, so why not actually do that? :-) I mean, take the branch as it is and mirror it in another Git repo saved on SWH, then fall back to that repo if the ’keyring’ branch is not there. I do not know… Or simply wait for SWH to improve things on their side. :-) > *Third, and this answers the asterisk above, we must keep in mind that > this is content-addressability *with SHA1*. Generating a chosen-prefix > collision is becoming affordable³, so users absolutely need an additional > mechanism to authenticate code they fetched. > > For origins, we have the content SHA256, so we’re fine. 
For channels, > we have Guix’s authentication mechanism¹, except it’s not available yet > via SWH, as I wrote above. For the footswitch example above using > ‘--with-commit’, we don’t have any authentication method, but in fact, > that’s the situation of Git repositories in general: they can rarely be > authenticated. How could a chosen-prefix attack work here? I understand why a second-preimage attack is an issue. But I fail to see how a SHA-1 chosen-prefix attack could be exploited here to compromise the user, because the hash is provided by this very same user. > Ludovic Courtès (3): > swh: Support downloads of bare Git repositories. > git: 'update-cached-checkout' can fall back to SWH when cloning. > git: 'reference-available?' recognizes 'tag-or-commit'. LGTM! Cheers, simon ^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#44187: Channel clones lack SWH fallback 2021-09-17 8:02 ` bug#44187: Channel clones lack SWH fallback zimoun @ 2021-09-18 21:10 ` Ludovic Courtès 2021-09-20 9:27 ` zimoun 0 siblings, 1 reply; 15+ messages in thread From: Ludovic Courtès @ 2021-09-18 21:10 UTC (permalink / raw) To: zimoun; +Cc: 44187-done Hello! zimoun <zimon.toutoune@gmail.com> skribis: > The original URL of the channel was: > <https://github.com/zimoun/channel-example.git>. And this channel > defines a package where the upstream has also disappeared > <https://github.com/zimoun/hello-example.git>. Note the URL in the > package definition is not bogus… but using one was already working. :-) > > All is saved on SWH, so now all is transparent! From my point of view, > this is a killer feature for scientific folks. :-) Yay! Great that you came up with a nice example to test it on! >> First, fallback is implemented only for fresh clones, not for updates. >> Thus, if I rerun the first example, having now the clone in >> ~/.cache/guix/checkouts, with a different commit, I get: > > SWH is not a forge but an archive. :-) Therefore, this update case does > not make sense to me. I mean, > > $ git -C ~/.cache/guix/checkouts/6k7wvrcpbdsw3pje5b4squybw3jfn3viyrj7gcl7fipa5yjflaza fetch > fatal: repository 'http://example.org/sdf/' not found Right, that’s a reasonable limitation. > Well, maybe this cache could be removed when the commit is not found > in it, and the fetch retried from SWH. Obviously, the > downgrade case works. It’s still useful to keep it cached around in case the user is going to use it several times in a row. > Note that on a fresh clone, the error message could be improved: > > $ ./pre-inst-env guix build guix --with-git-url=guix=https://example.org --with-commit=guix=ff613c2b68aac539262822490448e637d8f315ba -n > updating checkout of 'https://example.org'... 
> guix build: error: Git failure while fetching https://example.org: unexpected http status code: 404 > > > where https://example.org is bogus and > ff613c2b68aac539262822490448e637d8f315ba is not yet archived on SWH. It > would be nice to warn, in addition to the 404, that it is not found on > SWH either. WDYT? Agreed; I’ve made this change (actually ‘swh-download’ prints something upon failure since commit 60b42bec8413aa9844e625fb1903257f1bc1e55c, but it looks more like a debugging message.) > $ guix build guix --with-git-url=guix=https://example.org --with-commit=guix=c75b30d58f0becb0a5cd6a8bfe69d1063b0d1ada -n > updating checkout of 'https://example.org'... > SWH: found revision c75b30d58f0becb0a5cd6a8bfe69d1063b0d1ada with directory at 'https://archive.softwareheritage.org/api/1/directory/ca2e8a7222b4850c7bea935dff86b9c2a905efd6/' > SWH vault: requested bundle cooking, waiting for completion... > SWH vault: Processing... > [...] > > > then after several hours, I get this: > > SWH vault: failure: Internal Server Error. This incident will be reported. > SWH vault: retrying... > SWH vault: requested bundle cooking, waiting for completion... > SWH vault: Processing... > > and after more than 12h, the status is still: «SWH vault: Processing...» > and nothing is complete. Did it eventually succeed? We obviously have no guarantee as to how long it might take to cook a bundle. > About this ’keyring’ branch: it could just as well live in a separate repo, so > why not actually do that? :-) I mean, take the branch as it is and > mirror it in another Git repo saved on SWH, then fall back to that repo if the > ’keyring’ branch is not there. I do not know… Or simply wait for SWH > to improve things on their side. :-) Yeah, they’re planning to support it eventually. >> *Third, and this answers the asterisk above, we must keep in mind that >> this is content-addressability *with SHA1*. 
Generating a chosen-prefix >> collision is becoming affordable³, so users absolutely need an additional >> mechanism to authenticate code they fetched. [...] > How could a chosen-prefix attack work here? I understand why a second-preimage > attack is an issue. But I fail to see how a SHA-1 chosen-prefix attack > could be exploited here to compromise the user, because the hash is provided > by this very same user. I think you’re right, it’s rather second-preimage attacks that would be a serious problem. My point is: as time passes, assuming that a SHA1 resolves to a single revision on SWH is becoming more and more questionable. >> swh: Support downloads of bare Git repositories. >> git: 'update-cached-checkout' can fall back to SWH when cloning. >> git: 'reference-available?' recognizes 'tag-or-commit'. I’ve pushed this after adding the warning as you suggested: dce2cf311b * git: 'reference-available?' recognizes 'tag-or-commit'. 05f44c2d85 * git: 'update-cached-checkout' can fall back to SWH when cloning. 6ec81c31c0 * swh: Support downloads of bare Git repositories. Thanks a lot for reviewing and testing on real-world examples! Ludo’. ^ permalink raw reply [flat|nested] 15+ messages in thread
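The clone-time fallback discussed in this message can be pictured roughly as follows. This is only a sketch of the control flow under stated assumptions: 'clone-with-fallback', 'clone-upstream' and 'clone-from-archive' are hypothetical names passed in as plain procedures, not procedures from guix/git.scm or guix/swh.scm, and the pushed code may well be structured differently.

```scheme
;; Hypothetical sketch; none of these names are part of Guix.
(define (clone-with-fallback clone-upstream clone-from-archive url commit)
  "Call CLONE-UPSTREAM on URL and COMMIT.  If it throws, call
CLONE-FROM-ARCHIVE on COMMIT instead; when the archive returns #f,
re-throw the original error."
  (catch #t
    (lambda ()
      (clone-upstream url commit))
    (lambda (key . args)
      (or (clone-from-archive commit)
          (apply throw key args)))))

;; Usage with stub procedures: upstream has vanished, the archive has
;; a copy, so the value returned by the archive stub is the result.
(clone-with-fallback
 (lambda (url commit) (error "upstream is gone:" url))
 (lambda (commit) (string-append "/tmp/archive-clone-of-" commit))
 "https://example.org/foo.git"
 "67c9f2143aa6f545419ae913b4ae02af4cd3effc")
```

Re-throwing the original error when the archive has nothing keeps the upstream failure (the 404 in the example earlier in the thread) visible to the user; an extra "not found on SWH" warning would be printed alongside it.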
* bug#44187: Channel clones lack SWH fallback 2021-09-18 21:10 ` Ludovic Courtès @ 2021-09-20 9:27 ` zimoun 2021-09-22 10:03 ` Ludovic Courtès 0 siblings, 1 reply; 15+ messages in thread From: zimoun @ 2021-09-20 9:27 UTC (permalink / raw) To: Ludovic Courtès; +Cc: 44187-done Hi, On Sat, 18 Sept 2021 at 23:10, Ludovic Courtès <ludo@gnu.org> wrote: > zimoun <zimon.toutoune@gmail.com> skribis: > > and after more than 12h, the status is still: «SWH vault: Processing...» > > and nothing is complete. > > Did it eventually succeed? We obviously have no guarantee as to how > long it might take to cook a bundle. No, I stopped it, and I reported the issue on #swh-devel. Something might be wrong on their side. Yeah, cooking a bundle can take long, especially for a large repo such as Guix (lots of commits and a couple of files). I think it is OK to leave the code as it is now. > >> *Third, and this answers the asterisk above, we must keep in mind that > >> this is content-addressability *with SHA1*. Generating a chosen-prefix > >> collision is becoming affordable³, so users absolutely need an additional > >> mechanism to authenticate code they fetched. > > [...] > > > How could a chosen-prefix attack work here? I understand why a second-preimage > > attack is an issue. But I fail to see how a SHA-1 chosen-prefix attack > > could be exploited here to compromise the user, because the hash is provided > > by this very same user. > > I think you’re right, it’s rather second-preimage attacks that would be > a serious problem. My point is: as time passes, assuming that a SHA1 > resolves to a single revision on SWH is becoming more and more > questionable. Well, SHA-1 has 2^160 (~10^48.2) possible values, to be compared with 10^50, the estimated number of atoms in the Earth. Speaking about content-addressability, SHA-1 seems fine. However, for security, yeah time flies. :-) > >> swh: Support downloads of bare Git repositories. > >> git: 'update-cached-checkout' can fall back to SWH when cloning. 
> >> git: 'reference-available?' recognizes 'tag-or-commit'. > > I’ve pushed this after adding the warning as you suggested: > > dce2cf311b * git: 'reference-available?' recognizes 'tag-or-commit'. > 05f44c2d85 * git: 'update-cached-checkout' can fall back to SWH when cloning. > 6ec81c31c0 * swh: Support downloads of bare Git repositories. Cool! It would deserve a --news entry. ;-) Cheers, simon ^ permalink raw reply [flat|nested] 15+ messages in thread
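The order-of-magnitude figure quoted for SHA-1 in this message can be checked by hand. The short derivation below only converts the 160-bit output size into a power of ten and compares it with the 10^50 estimate given in the message; that estimate is taken as-is from the message and is not verified here.

```latex
\begin{align*}
  \log_{10}\left(2^{160}\right) &= 160 \log_{10} 2
      \approx 160 \times 0.30103 \approx 48.16, \\
  2^{160} &\approx 10^{48.16} \approx 1.46 \times 10^{48}, \\
  \frac{10^{50}}{2^{160}} &\approx 10^{50 - 48.16} = 10^{1.84} \approx 68.
\end{align*}
```

So "~10^48.2" is the rounded exponent, and the number of possible SHA-1 values sits a little under two orders of magnitude below the 10^50 figure.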
* bug#44187: Channel clones lack SWH fallback 2021-09-20 9:27 ` zimoun @ 2021-09-22 10:03 ` Ludovic Courtès 0 siblings, 0 replies; 15+ messages in thread From: Ludovic Courtès @ 2021-09-22 10:03 UTC (permalink / raw) To: zimoun; +Cc: 44187-done Hi, zimoun <zimon.toutoune@gmail.com> skribis: > On Sat, 18 Sept 2021 at 23:10, Ludovic Courtès <ludo@gnu.org> wrote: [...] >> > How could a chosen-prefix attack work here? I understand why a second-preimage >> > attack is an issue. But I fail to see how a SHA-1 chosen-prefix attack >> > could be exploited here to compromise the user, because the hash is provided >> > by this very same user. >> >> I think you’re right, it’s rather second-preimage attacks that would be >> a serious problem. My point is: as time passes, assuming that a SHA1 >> resolves to a single revision on SWH is becoming more and more >> questionable. > > Well, SHA-1 has 2^160 (~10^48.2) possible values, to be compared with 10^50, > the estimated number of atoms in the Earth. Speaking about > content-addressability, SHA-1 seems fine. However, for security, yeah > time flies. :-) True! >> >> swh: Support downloads of bare Git repositories. >> >> git: 'update-cached-checkout' can fall back to SWH when cloning. >> >> git: 'reference-available?' recognizes 'tag-or-commit'. >> >> I’ve pushed this after adding the warning as you suggested: >> >> dce2cf311b * git: 'reference-available?' recognizes 'tag-or-commit'. >> 05f44c2d85 * git: 'update-cached-checkout' can fall back to SWH when cloning. >> 6ec81c31c0 * swh: Support downloads of bare Git repositories. > > Cool! It would deserve a --news entry. ;-) That’s a good idea, I’ve added one. Thanks, Ludo’. ^ permalink raw reply [flat|nested] 15+ messages in thread