* Re: Disarchive update
2021-10-09 10:05 Disarchive update Ludovic Courtès
@ 2021-10-09 10:37 ` Mathieu Othacehe
2021-10-10 13:22 ` Ludovic Courtès
2021-10-12 9:19 ` zimoun
` (2 subsequent siblings)
3 siblings, 1 reply; 15+ messages in thread
From: Mathieu Othacehe @ 2021-10-09 10:37 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guix-devel
Hey Ludo,
> https://ci.guix.gnu.org/eval/29213?status=succeeded
Nice! It looks like an expensive operation, maybe we should increase its
period to 24 hours or so?
> 2. On berlin, add an mcron job that periodically copies the output of
> the latest “disarchive-collection” build to a directory, say
> /srv/disarchive. Thus, the database would accumulate tarball
> metadata over time.
We could add the result as a "build-product" so that it is available at:
https://ci.guix.gnu.org/search/latest/disarchive-collection. The mcron
job could use this URL to fetch the latest archive.
> How does that sound? Thoughts?
Sounds great, happy to see more use-cases for Cuirass :)
Thanks,
Mathieu
* Re: Disarchive update
2021-10-09 10:37 ` Mathieu Othacehe
@ 2021-10-10 13:22 ` Ludovic Courtès
2021-10-12 8:41 ` Mathieu Othacehe
0 siblings, 1 reply; 15+ messages in thread
From: Ludovic Courtès @ 2021-10-10 13:22 UTC (permalink / raw)
To: Mathieu Othacehe; +Cc: guix-devel
Hi!
Mathieu Othacehe <othacehe@gnu.org> skribis:
>> https://ci.guix.gnu.org/eval/29213?status=succeeded
>
> Nice! It looks like an expensive operation, maybe we should increase its
> period to 24 hours or so?
Yes, I’ve made it 12 hours now. :-)
It shouldn’t be too expensive: there’s one derivation per tarball
disarchive and very few of them get rebuilt between subsequent
evaluations; disarchive-collection.drv depends on all of them.
However, I think the current model of Cuirass means that those
intermediate derivations aren’t retrieved on berlin so we’re potentially
building things multiple times?
>> 2. On berlin, add an mcron job that periodically copies the output of
>> the latest “disarchive-collection” build to a directory, say
>> /srv/disarchive. Thus, the database would accumulate tarball
>> metadata over time.
>
> We could add the result as a "build-product" so that it is available at:
> https://ci.guix.gnu.org/search/latest/disarchive-collection. The mcron
> job could use this URL to fetch the latest archive.
That’d be nice! How do we do that again?
I was planning on retrieving the derivation file name in the mcron job
using the (guix ci) API, but having a build product may simplify things
a bit.
Thanks for your feedback!
Ludo’.
* Re: Disarchive update
2021-10-10 13:22 ` Ludovic Courtès
@ 2021-10-12 8:41 ` Mathieu Othacehe
2021-10-14 14:06 ` Ludovic Courtès
0 siblings, 1 reply; 15+ messages in thread
From: Mathieu Othacehe @ 2021-10-12 8:41 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guix-devel
Hey,
> That’d be nice! How do we do that again?
The build-outputs field of the <specification> record must be used as
explained here:
https://guix.gnu.org/cuirass/manual/html_node/Specifications.html#Specifications.
This field cannot be manipulated via the web interface yet.
I think the easiest way to proceed would be to create the "disarchive"
specification in the (maintenance sysadmin services) module, like this:
--8<---------------cut here---------------start------------->8---
(specification
 (name "disarchive")
 (build '(manifests "etc/disarchive-manifest.scm"))
 (build-outputs
  (list
   (build-output
    (job "disarchive-collection*")
    (type "archive")
    (path ""))))
 (notifications #$(cuirass-notifications))
 (period 43200)                         ;12 hours, in seconds
 (priority 7)
 (systems '("x86_64-linux")))
--8<---------------cut here---------------end--------------->8---
I can take care of that if it's ok for you.
Thanks,
Mathieu
* Re: Disarchive update
2021-10-12 8:41 ` Mathieu Othacehe
@ 2021-10-14 14:06 ` Ludovic Courtès
0 siblings, 0 replies; 15+ messages in thread
From: Ludovic Courtès @ 2021-10-14 14:06 UTC (permalink / raw)
To: Mathieu Othacehe; +Cc: guix-devel
Hi,
Mathieu Othacehe <othacehe@gnu.org> skribis:
> I think the easiest way to proceed would be to create the "disarchive"
> specification in the (maintenance sysadmin services) module, like this:
>
> (specification
>  (name "disarchive")
>  (build '(manifests "etc/disarchive-manifest.scm"))
>  (build-outputs
>   (list
>    (build-output
>     (job "disarchive-collection*")
>     (type "archive")
>     (path ""))))
>  (notifications #$(cuirass-notifications))
>  (period 43200)
>  (priority 7)
>  (systems '("x86_64-linux")))
Thanks for the tip. I went ahead, committed it to maintenance.git and
deployed it. It works! :-)
Ludo’.
* Re: Disarchive update
2021-10-09 10:05 Disarchive update Ludovic Courtès
2021-10-09 10:37 ` Mathieu Othacehe
@ 2021-10-12 9:19 ` zimoun
2021-10-14 14:02 ` Ludovic Courtès
2021-10-13 14:54 ` Timothy Sample
2021-10-14 14:31 ` Ludovic Courtès
3 siblings, 1 reply; 15+ messages in thread
From: zimoun @ 2021-10-12 9:19 UTC (permalink / raw)
To: Ludovic Courtès, guix-devel
Hi Ludo,
On Sat, 09 Oct 2021 at 12:05, Ludovic Courtès <ludovic.courtes@inria.fr> wrote:
> If you run:
>
> guix build /gnu/store/nnl67m8c2x9rwqbnych1agc6p7g5473g-disarchive-collection.drv
Oh, cool!
> and if you’re patient :-), you eventually get a 579 MB directory
> containing Disarchive metadata for 8,413 tarballs out of 9,113 (the
> missing tarballs are those that “disarchive disassemble” fails to
> handle, for instance because it couldn’t guess what compression method
> is being used.)
Timothy made this table months ago:
  tar+gz        9090   52.0%
  git           5294   30.3%
  tar+xz        1184   06.8%
  tar+bz2        775   04.4%
  tar            393   02.2%
  zip            273   01.6%
  svn-multi      175   01.0%
  svn            125   00.7%
  file            51   00.3%
  computed        38   00.2%
  hg              36   00.2%
  unknown-uri     20   00.1%
  tar+gz?         15   00.1%
  tar+lz          13   00.1%
  tar+Z            4   00.0%
  cvs              3   00.0%
  bzr              3   00.0%
  tar+lzma         2   00.0%
  total        17494  100.0%
What is really missing is XZ and Bzip2 support in Disarchive, I guess.
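As a rough check of how much those two formats weigh, the counts from the
table can be added up. A minimal sketch using awk; the figures are copied
from the table, and the variable names are only illustrative:

```shell
# Counts copied from the table above.
tar_xz=1184
tar_bz2=775
total=17494

# Share of all sources that are xz- or bzip2-compressed tarballs.
share=$(LC_ALL=C awk -v a="$tar_xz" -v b="$tar_bz2" -v t="$total" \
  'BEGIN { printf "%.1f", 100 * (a + b) / t }')
echo "tar+xz and tar+bz2: $((tar_xz + tar_bz2)) sources, ${share}% of the total"
```

That is roughly one source in nine, counting Git checkouts and other
non-tarball origins in the total.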
> Where to go from here? Timothy Sample had already set up a Disarchive
> database at <https://disarchive.ngyro.com>, which (guix download) uses
> as a fallback; I’m not sure exactly how it’s populated. The goal here
> would be for the Guix project to set up infrastructure populating a
> database automatically and creating backups, possibly via SWH (we’ll
> have to discuss it with them).
Timothy was working on feeding the database using each release. You
can have a look at:
<https://git.ngyro.com/preservation-of-guix>
Then something along these lines:
$ sqlite3 /tmp/pog.db < schema.sql
$ guix repl -L . <(echo '
(use-modules (pog))
(ingest "6298c3ffd9654d3231a6f25390b056483e8f407c"
"/tmp/pog.db")
')
where the commit hash corresponds to v1.0.0. I do not know whether it
would be equivalent to run:
guix time-machine --commit=6298c3ffd9654d3231a6f25390b056483e8f407c \
-- build -m etc/disarchive-manifest.scm
> A plan we can already deploy would be:
>
> 1. Add the disarchive.guix.gnu.org DNS entry, pointing to berlin.
>
> 2. On berlin, add an mcron job that periodically copies the output of
> the latest “disarchive-collection” build to a directory, say
> /srv/disarchive. Thus, the database would accumulate tarball
> metadata over time.
>
> 3. Add an nginx route so that /srv/disarchive is served at
> https://disarchive.guix.gnu.org.
>
> 4. Add disarchive.guix.gnu.org to (guix download).
To replace (or add to) the current ’%disarchive-mirrors’, right?
Going down this road (using Cuirass), why not generate sources.json
the same way, instead of the hack using the website builder?
On my side, I will try to resume what I started months ago: measuring the
SWH coverage. For instance, of these ~92% of tarballs, how many are
currently stored in SWH? Well, do not hold your breath, and I would be
happy if someone beats me to it. ;-)
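The ~92% figure follows from the numbers quoted earlier in this message
(8,413 tarballs handled out of 9,113). A quick sketch of the arithmetic,
with illustrative variable names:

```shell
# Figures quoted earlier in this message.
handled=8413
total=9113

# Percentage of tarballs for which Disarchive metadata was produced.
coverage=$(LC_ALL=C awk -v h="$handled" -v t="$total" \
  'BEGIN { printf "%.1f", 100 * h / t }')
missing=$((total - handled))
echo "coverage: ${coverage}% (${missing} tarballs not handled)"
```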
Cheers,
simon
* Re: Disarchive update
2021-10-12 9:19 ` zimoun
@ 2021-10-14 14:02 ` Ludovic Courtès
2021-10-14 19:17 ` zimoun
0 siblings, 1 reply; 15+ messages in thread
From: Ludovic Courtès @ 2021-10-14 14:02 UTC (permalink / raw)
To: zimoun; +Cc: guix-devel
Hey!
zimoun <zimon.toutoune@gmail.com> skribis:
> Timothy made this table months ago:
>
>   tar+gz        9090   52.0%
>   git           5294   30.3%
>   tar+xz        1184   06.8%
>   tar+bz2        775   04.4%
>   tar            393   02.2%
>   zip            273   01.6%
>   svn-multi      175   01.0%
>   svn            125   00.7%
>   file            51   00.3%
>   computed        38   00.2%
>   hg              36   00.2%
>   unknown-uri     20   00.1%
>   tar+gz?         15   00.1%
>   tar+lz          13   00.1%
>   tar+Z            4   00.0%
>   cvs              3   00.0%
>   bzr              3   00.0%
>   tar+lzma         2   00.0%
>   total        17494  100.0%
>
> What is really missing is XZ and Bzip2 support in Disarchive, I guess.
Definitely, we know what to work on next!
> Timothy was working on feeding the database using each release. You
> can have a look at:
>
> <https://git.ngyro.com/preservation-of-guix>
Ah nice! I had completely overlooked this.
[...]
>> A plan we can already deploy would be:
>>
>> 1. Add the disarchive.guix.gnu.org DNS entry, pointing to berlin.
>>
>> 2. On berlin, add an mcron job that periodically copies the output of
>> the latest “disarchive-collection” build to a directory, say
>> /srv/disarchive. Thus, the database would accumulate tarball
>> metadata over time.
>>
>> 3. Add an nginx route so that /srv/disarchive is served at
>> https://disarchive.guix.gnu.org.
>>
>> 4. Add disarchive.guix.gnu.org to (guix download).
>
> To replace (or add to) the current ’%disarchive-mirrors’ right?
Exactly.
> Going down this road (using Cuirass), why not generate sources.json
> the same way, instead of the hack using the website builder?
I guess that would also work, indeed. Then we could make /sources.json
redirect to ci.guix.gnu.org/whatever/latest.
> On my side, I will try to resume what I started months ago: measuring the
> SWH coverage. For instance, of these ~92% of tarballs, how many are
> currently stored in SWH? Well, do not hold your breath, and I would be
> happy if someone beats me to it. ;-)
Yup, we definitely need that kind of info now!
Ludo’.
* Re: Disarchive update
2021-10-14 14:02 ` Ludovic Courtès
@ 2021-10-14 19:17 ` zimoun
2021-10-21 19:41 ` Ludovic Courtès
0 siblings, 1 reply; 15+ messages in thread
From: zimoun @ 2021-10-14 19:17 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guix-devel
Hi,
On Thu, 14 Oct 2021 at 16:02, Ludovic Courtès <ludo@gnu.org> wrote:
>> Going down this road (using Cuirass), why not generate sources.json
>> the same way, instead of the hack using the website builder?
>
> I guess that would also work, indeed. Then we could make /sources.json
> redirect to ci.guix.gnu.org/whatever/latest.
I had a look, but it is not yet clear to me how to do it. Pointers or tips
welcome. :-)
>> On my side, I will try to resume what I started months ago: measuring the
>> SWH coverage. For instance, of these ~92% of tarballs, how many are
>> currently stored in SWH? Well, do not hold your breath, and I would be
>> happy if someone beats me to it. ;-)
>
> Yup, we definitely need that kind of info now!
Using the authentication mode from SWH [1] and this trivial patch, the
rate limit is 1200 requests, which makes it possible to check and archive
some packages. For instance, now,
--8<---------------cut here---------------start------------->8---
# Lint every package whose name contains "julia-".
for p in $(guix package -A | cut -f1 | grep "julia-")
do
  ./pre-inst-env guix lint -c archival "$p"
done
--8<---------------cut here---------------end--------------->8---
passes. The remaining work is to ask the SWH folks for a higher
value than this 1200 limit, to get a token associated with an account on
the Software Heritage Authentication service, and to set up a cron task
“somewhere” running:
./pre-inst-env guix lint -c archival
WDYT?
Cheers,
simon
1: <https://archive.softwareheritage.org/api/>
2: <https://archive.softwareheritage.org/oidc/login/>
[-- Attachment #2: swh-auth.patch --]
[-- Type: text/x-diff, Size: 1488 bytes --]
diff --git a/guix/swh.scm b/guix/swh.scm
index 5c41685a24..1aaf733b5d 100644
--- a/guix/swh.scm
+++ b/guix/swh.scm
@@ -153,12 +153,20 @@ (define url
url
(string-append url "/")))
+(define token
+ 'xxXxxXxxXxXXXxXxXxXxXxXxxXXxXxXxXxxXXxxxxxxxXxXxXXXxXXXxXXXxXxxxXxXxXXXxXXXxXXXxXxxxXxXxXxxxXXXxXXXxxX.xxXxXXXxXxXxXxXxXxxxXxXxXxxxxXXxXxXxXxxxXXXxXXxxXxxxXXXxXxxxXXxxXXXxXXXxXXxxXxXxXXXxXxxxxxXxXxxxxXXxXxxxXXXxxXxxxxXxxxXxXXxxxxxxXXxxXxxxXxxxxXXxXxXxXXxxxxxXxxXxxxXxXXxxxxxxXXxxXxxxXXXxXxxxxXXxxXXxXxxxxXXxXxXxXxXxXXXxxXXxxXXxXxXxxxXxXxXxxXxxxxXxxXxxXxXxXxXxXXXxXXXxxXXxXxXxXXXxxXXxXxXxXXXxXXxxXxxxXXXxXXXxXXxxXXXxXxxxXXxxXXXxXxXxXxXxXXXxxXXxXxXXXxXxxXxxXxxxXXxxXxxxxxxxXXxxXxXxXxXxxxXxxxxxxxXxxXXxXxXxXXXxXxxxXxxxXxxxXXXxXxXxXxXxXxxxXxxxXXXxXxXxXxXxXXXxXxxxXXXxXxxxXXxxXXXxXxXxxXxxXxXxXxXxxxXxxxxxxXxxXXXxXXxxXxx.xxXxxxxXxxxxXxxxxXXxXXxxxXXXX-xxx_xxxXXxxxx
+ )
+
;; XXX: Work around a bug in Guile 3.0.2 where #:verify-certificate? would
;; be ignored (<https://bugs.gnu.org/40486>).
(define* (http-get* uri #:rest rest)
- (apply http-request uri #:method 'GET rest))
+ (apply http-request uri #:method 'GET
+ #:headers `((authorization . (Bearer ,token)))
+ rest))
(define* (http-post* uri #:rest rest)
- (apply http-request uri #:method 'POST rest))
+ (apply http-request uri #:method 'POST
+ #:headers `((authorization . (Bearer ,token)))
+ rest))
(define %date-regexp
;; Match strings like "2014-11-17T22:09:38+01:00" or
* Re: Disarchive update
2021-10-14 19:17 ` zimoun
@ 2021-10-21 19:41 ` Ludovic Courtès
2021-10-21 19:57 ` zimoun
0 siblings, 1 reply; 15+ messages in thread
From: Ludovic Courtès @ 2021-10-21 19:41 UTC (permalink / raw)
To: zimoun; +Cc: guix-devel
Hi,
zimoun <zimon.toutoune@gmail.com> skribis:
> Using the authentication mode from SWH [1] and this trivial patch, the
> rate limit is 1200 requests, which makes it possible to check and archive
> some packages. For instance, now,
>
> for p in $(guix package -A | cut -f1 | grep "julia-")
> do
>   ./pre-inst-env guix lint -c archival "$p"
> done
>
> passes. The remaining work is to ask the SWH folks for a higher
> value than this 1200 limit, to get a token associated with an account on
> the Software Heritage Authentication service, and to set up a cron task
> “somewhere” running:
>
> ./pre-inst-env guix lint -c archival
>
> WDYT?
I think you made progress on this in the meantime: this is great!
Really cool of the SWH folks to give you a higher rate limit.
Thanks,
Ludo’.
* Re: Disarchive update
2021-10-09 10:05 Disarchive update Ludovic Courtès
2021-10-09 10:37 ` Mathieu Othacehe
2021-10-12 9:19 ` zimoun
@ 2021-10-13 14:54 ` Timothy Sample
2021-10-14 14:04 ` Ludovic Courtès
2021-10-14 14:31 ` Ludovic Courtès
3 siblings, 1 reply; 15+ messages in thread
From: Timothy Sample @ 2021-10-13 14:54 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guix-devel
Hi Ludovic,
Ludovic Courtès <ludovic.courtes@inria.fr> writes:
> This job is disassembling all the .tar.gz files packages refer to, using
> the recently-added ‘etc/disarchive-manifest.scm’ file:
>
> https://ci.guix.gnu.org/jobset/disarchive
>
> It has just succeeded for the first time. :-)
Fantastic! I feel bad that I left you holding the bag on this one,
though. Sorry. I’ve been a little adrift this summer. Thanks for
picking it up!
> Where to go from here? Timothy Sample had already set up a Disarchive
> database at <https://disarchive.ngyro.com>, which (guix download) uses
> as a fallback; I’m not sure exactly how it’s populated.
Basically the same as what you are doing now. I have many Cuirass jobs,
and I use the build outputs mechanism (mentioned by Mathieu elsewhere
in this thread). I don’t have a “disarchive-collection” job, so I have
to use the Cuirass API to dig through the recent build outputs to find
new results. This happens from a cron job, which uploads each new
result to my server.
One simple but satisfying thing that I do is serve the files compressed.
That is, they are compressed on disk and nginx just passes them along
(using the “gzip_static” module). Because of Disarchive’s verbose and
repetitive output format, this makes for a huge reduction in storage
requirements.
> The goal here would be for the Guix project to set up infrastructure
> populating a database automatically and creating backups, possibly via
> SWH (we’ll have to discuss it with them).
>
> A plan we can already deploy would be:
>
> 1. Add the disarchive.guix.gnu.org DNS entry, pointing to berlin.
>
> 2. On berlin, add an mcron job that periodically copies the output of
> the latest “disarchive-collection” build to a directory, say
> /srv/disarchive. Thus, the database would accumulate tarball
> metadata over time.
>
> 3. Add an nginx route so that /srv/disarchive is served at
> https://disarchive.guix.gnu.org.
>
> 4. Add disarchive.guix.gnu.org to (guix download).
>
> How does that sound? Thoughts?
This is great! I can offer some past metadata, too. Specifically, I
have ~14000 files that I generated while digging into SWH coverage.
(That’s a project I’d like to return to, but I’m still trying to get my
head back in the game and pick up where I left off.)
-- Tim
* Re: Disarchive update
2021-10-13 14:54 ` Timothy Sample
@ 2021-10-14 14:04 ` Ludovic Courtès
0 siblings, 0 replies; 15+ messages in thread
From: Ludovic Courtès @ 2021-10-14 14:04 UTC (permalink / raw)
To: Timothy Sample; +Cc: guix-devel
Hey Timothy!
Timothy Sample <samplet@ngyro.com> skribis:
> Fantastic! I feel bad that I left you holding the bag on this one,
> though. Sorry. I’ve been a little adrift this summer. Thanks for
> picking it up!
No problem, I’m glad to see you chime in now! :-)
> One simple but satisfying thing that I do is serve the files compressed.
> That is, they are compressed on disk and nginx just passes them along
> (using the “gzip_static” module). Because of Disarchive’s verbose and
> repetitive output format, this makes for a huge reduction in storage
> requirements.
Oh nice, thanks for sharing this tip!
> This is great! I can offer some past metadata, too. Specifically, I
> have ~14000 files that I generated while digging into SWH coverage.
> (That’s a project I’d like to return to, but I’m still trying to get my
> head back in the game and pick up where I left off.)
Alright.
Thanks!
Ludo’.
* Re: Disarchive update
2021-10-09 10:05 Disarchive update Ludovic Courtès
` (2 preceding siblings ...)
2021-10-13 14:54 ` Timothy Sample
@ 2021-10-14 14:31 ` Ludovic Courtès
2021-10-14 21:44 ` zimoun
2021-10-21 19:44 ` Ludovic Courtès
3 siblings, 2 replies; 15+ messages in thread
From: Ludovic Courtès @ 2021-10-14 14:31 UTC (permalink / raw)
To: guix-devel
Hello Guix!
Ludovic Courtès <ludovic.courtes@inria.fr> skribis:
> This job is disassembling all the .tar.gz files packages refer to, using
> the recently-added ‘etc/disarchive-manifest.scm’ file:
>
> https://ci.guix.gnu.org/jobset/disarchive
[...]
> A plan we can already deploy would be:
>
> 1. Add the disarchive.guix.gnu.org DNS entry, pointing to berlin.
Done:
https://git.savannah.gnu.org/cgit/guix/maintenance.git/commit/?id=df9e9b7f51abceb5999aabc9a7b71396600cffa4
https://git.savannah.gnu.org/cgit/guix/maintenance.git/commit/?id=12195160432871b80d0e1eac996a9aa7d8500697
Sample URLs:
https://disarchive.guix.gnu.org/sha256/53cf3e14c71f3a149f29d13a0da64120b3c1d3334fba39c4af3e520be053982a
https://disarchive.guix.gnu.org/sha256/39052f59ff474a4a69cefc25cf3caf8429400889deba010ee6403ca188f8b311
https://disarchive.guix.gnu.org/sha256/03a71d53055bd9ec528d55e07afaf15c09dec9856cba734904bfd05acbc6cf12
Aren’t those Disarchive sexps really cute? :-)
> 2. On berlin, add an mcron job that periodically copies the output of
> the latest “disarchive-collection” build to a directory, say
> /srv/disarchive. Thus, the database would accumulate tarball
> metadata over time.
First, there’s a script to populate the database; it copies files from
the latest successful “disarchive-collection” build to a specified
directory, gzipping them on their way. It’s atomic, so the directory in
question can be directly served by nginx or similar:
https://git.savannah.gnu.org/cgit/guix/maintenance.git/commit/?id=fb83b3d8de189c6d6c33c4cdc2ebabf6eae1463e
If you want to try it at home, just run:
./sync-disarchive-db.scm /tmp/db
It’s pretty fast! The output is only 70 MiB, now that individual files
are gzipped.
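One way to get the atomicity described above is to write each compressed
file under a temporary name and rename it into place. The sketch below is
not the actual sync-disarchive-db.scm: it is a hypothetical shell rendition
of the idea, and the function name `sync_db`, its arguments, and the `.gz`
suffix on stored files are all assumptions made for illustration.

```shell
# Hypothetical helper: copy every regular file found under SRC into DEST,
# gzip-compressed, without ever exposing a partially written file.
sync_db() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # List files relative to SRC so the directory layout is preserved.
    (cd "$src" && find . -type f) | while IFS= read -r file; do
        target="$dest/${file#./}.gz"
        # Entries already present are left alone, so the database only grows.
        [ -e "$target" ] && continue
        mkdir -p "$(dirname "$target")" || return 1
        tmp=$(mktemp "$target.XXXXXX") || return 1
        # Compress into a temporary file in the same directory, then rename:
        # a rename within one file system is atomic.
        gzip -9 -c "$src/$file" > "$tmp" \
            && chmod 644 "$tmp" \
            && mv "$tmp" "$target"
    done
}
```

Calling `sync_db /path/to/build-output /tmp/db` twice in a row should leave
the second run with nothing to do, since existing entries are skipped.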
Then there’s the mcron job that runs it once a day on berlin:
https://git.savannah.gnu.org/cgit/guix/maintenance.git/commit/?id=27dc74fbe33a9d929b37994e825dc202385f87c0
We could run it as well on bayfront so we have a backup.
> 3. Add an nginx route so that /srv/disarchive is served at
> https://disarchive.guix.gnu.org.
Done here:
https://git.savannah.gnu.org/cgit/guix/maintenance.git/commit/?id=9ffb2db81a2fbee67b99c76217be874ec0fd6bde
> 4. Add disarchive.guix.gnu.org to (guix download).
Done:
https://git.savannah.gnu.org/cgit/guix.git/commit/?id=f9a506aa6a5aaeb2c06c97d5b663d01d2103db69
As I was once again modifying files by hand to test the download
fallback mechanisms, I figured we could just as well add a variable to
enable testing, which is what I did here:
https://git.savannah.gnu.org/cgit/guix.git/commit/?id=c4a7aa82e25503133a1bd33148d17968c899a5f5
So you can do, say:
GUIX_DOWNLOAD_FALLBACK_TEST=disarchive-mirrors guix build -S r-ebimage --check
or:
GUIX_DOWNLOAD_FALLBACK_TEST=content-addressed-mirrors guix build -S r-ebimage --check
to check whether these fallback mechanisms work as expected. (They do,
but I’ll update the ‘guix’ package because the current one has a bug
that breaks the Disarchive/SWH fallback.)
I think we’re making progress! :-)
Thanks,
Ludo’.
* Re: Disarchive update
2021-10-14 14:31 ` Ludovic Courtès
@ 2021-10-14 21:44 ` zimoun
2021-10-21 19:44 ` Ludovic Courtès
1 sibling, 0 replies; 15+ messages in thread
From: zimoun @ 2021-10-14 21:44 UTC (permalink / raw)
To: Ludovic Courtès, guix-devel
Hi,
On Thu, 14 Oct 2021 at 16:31, Ludovic Courtès <ludo@gnu.org> wrote:
>> 4. Add disarchive.guix.gnu.org to (guix download).
>
> Done:
[...]
> I think we’re making progress! :-)
I added the SWH authentication token in patch#51216 [1]. Using a valid
TOKEN from an account of the Software Heritage Authentication service,
it reads something along these lines,
GUIX_SWH_TOKEN=${TOKEN} guix lint -c archival
By default, the token allows 1200 requests instead of 120. The
interesting part, as far as the recent Disarchive additions are
concerned, is these bits:
--8<---------------cut here---------------start------------->8---
Disarchive entry refers to non-existent SWH directory 'aeae11cb3c33ab33374e222dc3bdf17039808a5b'
Disarchive entry refers to non-existent SWH directory 'b25414c9864a270899ca1ff494e7ba4c437b166d'
Disarchive entry refers to non-existent SWH directory '128bbe76a82dd0b38b725565ed703a7148257ae0'
Disarchive entry refers to non-existent SWH directory '92625e2c6dbe3ad7c4f44a061ada24ce00637087'
Disarchive entry refers to non-existent SWH directory '6000a273dfff9de62725b53e41562fff711069c1'
Disarchive entry refers to non-existent SWH directory 'c68ff8714c6fd360a38158f3d8f22e555c061452'
Disarchive entry refers to non-existent SWH directory 'cb52aaa9500df2b674bf7922811deeea1b766139'
Disarchive entry refers to non-existent SWH directory '3e574043a04d77dd7231d23210547c4fe065a40c'
Disarchive entry refers to non-existent SWH directory 'aa763150704fe06f34097b38e839409cee52366d'
Disarchive entry refers to non-existent SWH directory '127c0a03c7ccba74870aef7dac36019af35798cc'
Disarchive entry refers to non-existent SWH directory 'd9745f29da983c6ad674871e68ac96362c4f11cc'
Disarchive entry refers to non-existent SWH directory '7d7ed9f88ee649a90493f54d3988a062c3ddeafb'
Disarchive entry refers to non-existent SWH directory 'f5bd0fe7450175196c57d6f6d5aca8905393e814'
Disarchive entry refers to non-existent SWH directory '92bd3b93caa9a4b0840c70ddb96ac75b0684d7ec'
--8<---------------cut here---------------end--------------->8---
We need to investigate why the Disarchive database contains some
entries that SWH does not know about. It probably means that there is an
inconsistency coming from sources.json.
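To follow up on these, the directory identifiers can be pulled out of the
lint output. A minimal sketch, assuming the messages keep exactly the
wording shown above; the function name `missing_swh_dirs` is hypothetical,
and the two sample lines are copied from the output above:

```shell
# Print the unique SWH directory identifiers mentioned in lint messages
# read from standard input.
missing_swh_dirs() {
    sed -n "s/.*non-existent SWH directory '\([0-9a-f]\{40\}\)'.*/\1/p" | sort -u
}

# Example with two of the messages shown above.
printf '%s\n' \
  "Disarchive entry refers to non-existent SWH directory 'aeae11cb3c33ab33374e222dc3bdf17039808a5b'" \
  "Disarchive entry refers to non-existent SWH directory 'b25414c9864a270899ca1ff494e7ba4c437b166d'" \
  | missing_swh_dirs
```

The resulting list could then be compared against sources.json, one
identifier per line.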
1: <http://issues.guix.gnu.org/issue/51216>
2: <https://archive.softwareheritage.org/oidc/login/>
Cheers,
simon
* Re: Disarchive update
2021-10-14 14:31 ` Ludovic Courtès
2021-10-14 21:44 ` zimoun
@ 2021-10-21 19:44 ` Ludovic Courtès
1 sibling, 0 replies; 15+ messages in thread
From: Ludovic Courtès @ 2021-10-21 19:44 UTC (permalink / raw)
To: guix-devel
Hi!
Ludovic Courtès <ludo@gnu.org> skribis:
> Then there’s the mcron job that runs it once a day on berlin:
>
> https://git.savannah.gnu.org/cgit/guix/maintenance.git/commit/?id=27dc74fbe33a9d929b37994e825dc202385f87c0
>
> We could run it as well on bayfront so we have a backup.
I did that without thinking much but it won’t work: as written,
sync-disarchive-db.scm assumes ci.guix.gnu.org substitutes are
authorized, which is not the case on bayfront.
So I suppose we need to do things differently there, such as
fetching/unpacking substitutes straight from sync-disarchive-db.scm
instead of going through the daemon.
I’ll take a look sometime, but it’d be great if someone else did. :-)
Thanks,
Ludo’.