unofficial mirror of guix-patches@gnu.org 
* [bug#47336] Disarchive as a fallback for downloads
       [not found] <87eeg6o50b.fsf@ngyro.com>
@ 2021-03-23  9:35 ` zimoun
  2021-03-23 14:31   ` Timothy Sample
       [not found] ` <20210323045213.9419-1-samplet@ngyro.com>
  2021-05-14 21:36 ` Ludovic Courtès
  2 siblings, 1 reply; 12+ messages in thread
From: zimoun @ 2021-03-23  9:35 UTC (permalink / raw)
  To: Timothy Sample, 47336, Mathieu Othacehe

Hi Timothy,

(CC Mathieu to advise whether it could be a feature of Cuirass.)


On Tue, 23 Mar 2021 at 00:42, Timothy Sample <samplet@ngyro.com> wrote:

> This patch series adds Disarchive assembly (backed by SWH lookup) as a
> fallback for downloads.

Awesome!


> You also need to make sure that regular downloads are unavailable.  I do
> this by adjusting the “try” loop at the end of “url-fetch” in
> “guix/build/download.scm”.  I replace the usual list of URLs with ‘()’:
>
>     (let try ((uri (append uri content-addressed-uris)))
>       (match '() ; uri
>         ...))
>
> Now you can ask Guix for a recent .tar.gz source package:
>
>     $ ./pre-inst-env guix build --no-substitutes -S python-httpretty

Neat!  Now, there is a way to easily check the coverage, right?  Since
SWH is ingesting the tarball using <http://guix.gnu.org/sources.json>,
there is now a means to report what Guix is able to rebuild.

>     Checking httpretty-1.0.5 digest... ok

What happens if it is not ok?

>     Assembling the tarball httpretty-1.0.5.tar
>     Checking httpretty-1.0.5.tar digest... ok
>     Assembling the Gzip file httpretty-1.0.5.tar.gz
>     Checking httpretty-1.0.5.tar.gz digest... ok
>     Copying result to /gnu/store/kbcnm57y2q1jvhvd8zw1g5vdiwlv19y9-httpretty-1.0.5.tar.gz

Where is the assembly done?  In /tmp/, right?

>     successfully built /gnu/store/k0b3c7kgzyn1nlyhx192pcbcgbfnhnwa-httpretty-1.0.5.tar.gz.drv

Just to be sure, when does Guix check the integrity checksum?  I mean,
does Guix check the checksum after ’disassemble’ re-assembled the source?


> First, it looks up the metadata on my server.  This is fine for a demo,
> but not what we want forever.  The patch series supports adding
> several

As we discussed before, how does the database scale?  Do you have some
numbers for the current demo?  They would help to extrapolate what it
means for a server to «store the metadata».

> mirrors for looking up the metadata.  In the past, we talked about
> putting everything on one or a few of the big Git hosting platforms like
> GitHub or Gitlab.  That way, it would be easily picked up by SWH and
> archived “forever”.  Right now, I have Cuirass set up to build the
> metadata, and a little script that moves it from the build server to my
> Web server.  It would be simple enough to adjust that script to push it
> to a remote Git repo.  (Of course, the next step is to move this setup
> to Guix infrastructure.)  Thoughts?

Maybe this database could be a package, say “guix-tarball-db”, updated
in step with the package “guix”.  The source of this
“guix-tarball-db” would be a big remote Git hosting platform like
GitHub or whatever, and not stored on Guix infrastructure, or maybe
stored on Guix infra.

Regularly, i.e., when the package “guix” is updated, the package
“guix-tarball-db” is updated at the same time.  The “guix lint -c
archival” sends the saving request to SWH.  Even if this saving request
should be automated soon. :-)

Then Cuirass could have a feature to disassemble and update the Git
repo.

Last, a service should run like your demo.  But in the long term, this
service could disappear (assuming SWH does not :-)).  Therefore, we could
imagine installing “guix-tarball-db” and then tweaking some parameters of
the guix-daemon and “guix build <foo>”.  Both installing and building
would fetch from SWH if both upstream disappear.

Or this “guix-tarball-db” should not be a plain package but only an
input as origin for the package “guix”.


> Hopefully everything else is more-or-less fine.  :)

Thanks!  That’s awesome!


Cheers,
simon





* [bug#47336] Disarchive as a fallback for downloads
  2021-03-23  9:35 ` [bug#47336] Disarchive as a fallback for downloads zimoun
@ 2021-03-23 14:31   ` Timothy Sample
  2021-03-27 10:39     ` Ludovic Courtès
  0 siblings, 1 reply; 12+ messages in thread
From: Timothy Sample @ 2021-03-23 14:31 UTC (permalink / raw)
  To: zimoun; +Cc: Mathieu Othacehe, 47336

Hi zimoun,

You make a lot of good points here.  Let me at least provide some quick
answers even if I’m not ready to comment on some of the bigger picture
stuff.

zimoun <zimon.toutoune@gmail.com> writes:

> (CC Mathieu to advise whether it could be a feature of Cuirass.)

So far I have been using Cuirass with only a tiny patch.  I’m not sure
we need anything more than what Cuirass already provides.  (The tiny
patch is for allowing sorting the “latestbuilds” results by “stoptime”
and “id”.  This in turn allows paging through all the builds from the
API.)

> On Tue, 23 Mar 2021 at 00:42, Timothy Sample <samplet@ngyro.com> wrote:
>
>> Now you can ask Guix for a recent .tar.gz source package:
>>
>>     $ ./pre-inst-env guix build --no-substitutes -S python-httpretty
>
> Neat!  Now, there is a way to easily check the coverage, right?  Since
> SWH is ingesting the tarball using <http://guix.gnu.org/sources.json>,
> there is now a means to report what Guix is able to rebuild.

I’m not sure I fully understand.  Disarchive covers about 4,300 Gzip’ed
tarballs (no XZ yet).  There are about 100 for which compression
parameters cannot be found, and a handful (about 5) that have a
particularly funny idea about what a tarball is.  The metadata builds
for my database started one week ago and have been continuously updating
since then.

Are you asking if we could check what SWH has?  Yes!  Each metadata
file contains the SWHID of the input directory.  You could use
Disarchive to get this value or a simple “grep swhid” would do it.  :)

    $ curl https://disarchive.ngyro.com/sha256/67989614004773db349791c37675efb914d084bdb221356a05e4369c35e7eb62 | grep swhid
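
As a rough illustration of what can be done with that value once it is
extracted, here is a minimal Guile sketch that splits a directory SWHID
into its parts, using the same ‘string-split’ pattern as the patch.  The
procedure name ‘swhid->directory-id’ and the sample identifier are made
up for the example; they do not come from Disarchive or Guix.

```scheme
(use-modules (ice-9 match))

;; Hypothetical helper: return the hexadecimal directory identifier of
;; SWHID, or #f if SWHID is not a version-1 directory identifier.
(define (swhid->directory-id swhid)
  (match (string-split swhid #\:)
    (("swh" "1" "dir" id) id)
    (_ #f)))

;; Made-up sample value, for illustration only.
(define sample "swh:1:dir:0123456789abcdef0123456789abcdef01234567")

(display (swhid->directory-id sample))
(newline)
```

Anything that is not a “swh:1:dir:” identifier (a revision or content
SWHID, for instance) makes the helper return #f rather than raise an
error.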

It would be neat to have a big database of archive coverage from Guix
1.0 through to the present.  It’s quite a big project though.

Of course, you know all about the SWH rate limit....

>>     Checking httpretty-1.0.5 digest... ok
>
> What happens if it is not ok?

For that particular digest, it means the source directory is wrong.
Since we get the source from SWH, it means that the SWH archive is
wrong.  You will have to look elsewhere, I guess (this seems pretty
unlikely).  (There is a vanishing possibility that Disarchive
miscomputed the SWHID and managed to come up with a different, but still
valid SWHID....)

The other digest checks are more likely to fail.  They would indicate
that Disarchive no longer knows how to interpret the metadata.  Maybe
there will be a subtle bug in Disarchive 0.3.0 that causes this.  Either
use an old version of Disarchive or try to fix the current version.  :)
I worry about this, because it would be annoying, but the metadata does
have all the information needed to recover the original archive, so
nothing is really lost (except the user’s time).

>>     Assembling the tarball httpretty-1.0.5.tar
>>     Checking httpretty-1.0.5.tar digest... ok
>>     Assembling the Gzip file httpretty-1.0.5.tar.gz
>>     Checking httpretty-1.0.5.tar.gz digest... ok
>>     Copying result to
>> /gnu/store/kbcnm57y2q1jvhvd8zw1g5vdiwlv19y9-httpretty-1.0.5.tar.gz
>
> Where is the assembly done?  In /tmp/, right?

Yes.

>>     successfully built
>> /gnu/store/k0b3c7kgzyn1nlyhx192pcbcgbfnhnwa-httpretty-1.0.5.tar.gz.drv
>
> Just to be sure, when does Guix check the integrity checksum?  I mean,
> does Guix check the checksum after ’disassemble’ re-assembled the source?

Disarchive checks the result against the metadata to make sure it didn’t
make a mistake.  Guix also checks the final result to make sure the
fixed-output derivation is correct.  A fixed-output derivation is
basically just a checksum with a hint about how the data can be
obtained.  Guix really only cares about the checksum, the hint can do
whatever as long as it produces the result Guix wants.  With this patch
series, Disarchive is part of the hint.
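
To make the “checksum plus hint” idea concrete, here is a small Guile
sketch.  It is only a model of the principle, not how the daemon is
implemented: ‘toy-checksum’ is a deliberately weak stand-in for a real
hash such as SHA-256, and ‘fetch-fixed-output’ and the three hints are
hypothetical names invented for the example.

```scheme
(use-modules (srfi srfi-1))

;; Toy stand-in for a real cryptographic hash such as SHA-256: the sum
;; of the character codes of STR.  Do not use this for real integrity
;; checks.
(define (toy-checksum str)
  (fold + 0 (map char->integer (string->list str))))

;; Try each of HINTS (thunks returning a string, or failing) in order
;; and return the first result whose checksum equals EXPECTED, or #f
;; when no hint produces the expected data.
(define (fetch-fixed-output expected hints)
  (any (lambda (hint)
         (let ((data (false-if-exception (hint))))
           (and (string? data)
                (= (toy-checksum data) expected)
                data)))
       hints))

(define wanted (toy-checksum "hello"))

(define hints
  (list (lambda () (error "upstream is gone")) ;regular download fails
        (lambda () "corrupted")                ;wrong content is rejected
        (lambda () "hello")))                  ;fallback, e.g. reassembly

(display (fetch-fixed-output wanted hints))
(newline)
```

The caller never learns which hint succeeded; only data matching the
expected checksum is accepted, which is why adding a new fallback does
not weaken the guarantee.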

>> First, it looks up the metadata on my server.  This is fine for a demo,
>> but not what we want forever.  The patch series supports adding
>> several
>
> As we discussed before, how does the database scale?  Do you have some
> numbers for the current demo?  They would help to extrapolate what it
> means for a server to «store the metadata».

With “gzip -9”, the average metadata file is 6.8KiB.  It’s pretty
manageable.  There’s room for improvement on the Disarchive side, too.
It still stores some redundant information.  Uncompressed, it’s more
like 112KiB per file.  This is still pretty okay, really.  It means we
might hit tens of GiB over a couple years.  (It would take just over
100GiB to store a million uncompressed metadata files.)  The compression
ratio is what drove me to skip Git for now.
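
The arithmetic behind those figures can be checked with a few lines of
Guile.  This is just a back-of-the-envelope sketch; the helper names are
made up, and one million files is the same round number used above.

```scheme
;; Convert a size in KiB to GiB.
(define (kib->gib kib)
  (/ kib (* 1024 1024)))

;; Total size in GiB of COUNT files averaging KIB-PER-FILE each.
(define (total-gib kib-per-file count)
  (exact->inexact (kib->gib (* kib-per-file count))))

(define files 1000000)

(format #t "compressed:   ~a GiB~%" (total-gib 6.8 files))
(format #t "uncompressed: ~a GiB~%" (total-gib 112 files))
```

That comes to roughly 6.5 GiB compressed against roughly 107 GiB
uncompressed for a million files, which matches the “just over 100GiB”
estimate.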

>> mirrors for looking up the metadata.  In the past, we talked about
>> putting everything on one or a few of the big Git hosting platforms like
>> GitHub or Gitlab.  That way, it would be easily picked up by SWH and
>> archived “forever”.  Right now, I have Cuirass set up to build the
>> metadata, and a little script that moves it from the build server to my
>> Web server.  It would be simple enough to adjust that script to push it
>> to a remote Git repo.  (Of course, the next step is to move this setup
>> to Guix infrastructure.)  Thoughts?
>
> Maybe this database could be a package, say “guix-tarball-db”, updated
> in step with the package “guix”.  The source of this
> “guix-tarball-db” would be a big remote Git hosting platform like
> GitHub or whatever, and not stored on Guix infrastructure, or maybe
> stored on Guix infra.
>
> Regularly, i.e., when the package “guix” is updated, the package
> “guix-tarball-db” is updated at the same time.  The “guix lint -c
> archival” sends the saving request to SWH.  Even if this saving request
> should be automated soon. :-)
>
> Then Cuirass could have a feature to disassemble and update the Git
> repo.
>
> Last, a service should run like your demo.  But in the long term, this
> service could disappear (assuming SWH does not :-)).  Therefore, we could
> imagine installing “guix-tarball-db” and then tweaking some parameters of
> the guix-daemon and “guix build <foo>”.  Both installing and building
> would fetch from SWH if both upstream disappear.
>
> Or this “guix-tarball-db” should not be a plain package but only an
> input as origin for the package “guix”.

This is an interesting idea, but one that I would have to think about
more.  :)


-- Tim





* [bug#47336] Disarchive as a fallback for downloads
  2021-03-23 14:31   ` Timothy Sample
@ 2021-03-27 10:39     ` Ludovic Courtès
  0 siblings, 0 replies; 12+ messages in thread
From: Ludovic Courtès @ 2021-03-27 10:39 UTC (permalink / raw)
  To: Timothy Sample; +Cc: Mathieu Othacehe, 47336, zimoun

Hi!

Timothy Sample <samplet@ngyro.com> skribis:

> With “gzip -9”, the average metadata file is 6.8KiB.  It’s pretty
> manageable.  There’s room for improvement on the Disarchive side, too.
> It still stores some redundant information.  Uncompressed, it’s more
> like 112KiB per file.  This is still pretty okay, really.  It means we
> might hit tens of GiB over a couple years.  (It would take just over
> 100GiB to store a million uncompressed metadata files.)  The compression
> ratio is what drove me to skip Git for now.

If needed, the sexp serialization could still be made more compact:
using ‘write’ instead of ‘pretty-print’, shortening field names (but
that’d be incompatible).

We could also use CBOR or canonical sexp serialization, though maybe
gzipped sexps are more compact than what we could achieve?

Anyway, these are surface syntax optimizations that can always be made
at a later point in time when we feel a need for them.
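
For a feel of the ‘write’ versus ‘pretty-print’ difference, here is a
small Guile sketch.  The record below is a made-up stand-in, not the
actual Disarchive format, and ‘serialize-with’ is a helper invented for
the example.

```scheme
(use-modules (ice-9 pretty-print))

;; Made-up nested record, standing in for a real metadata file.
(define sample
  '(example-archive
    (name "example-1.0.tar.gz")
    (members
     (member (name "example-1.0/README") (mode 420) (size 1024))
     (member (name "example-1.0/COPYING") (mode 420) (size 35147))
     (member (name "example-1.0/src/main.c") (mode 420) (size 2048)))))

;; Serialize SEXP to a string using PROC, a procedure that takes an
;; object and an output port, such as 'write' or 'pretty-print'.
(define (serialize-with proc sexp)
  (call-with-output-string
    (lambda (port) (proc sexp port))))

(define compact (serialize-with write sample))
(define pretty  (serialize-with pretty-print sample))

(format #t "write: ~a characters, pretty-print: ~a characters~%"
        (string-length compact) (string-length pretty))
```

Both strings read back to the same datum, so switching between them
stays compatible; how much of the difference survives gzip is a separate
question that this sketch does not measure.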

Ludo’.





* [bug#47336] Disarchive as a fallback for downloads
       [not found] ` <20210323045213.9419-1-samplet@ngyro.com>
@ 2021-03-27 10:40   ` Ludovic Courtès
  2021-04-10 20:52     ` Ludovic Courtès
  2021-04-26  9:49     ` Ludovic Courtès
       [not found]   ` <20210323045213.9419-2-samplet@ngyro.com>
  1 sibling, 2 replies; 12+ messages in thread
From: Ludovic Courtès @ 2021-03-27 10:40 UTC (permalink / raw)
  To: Timothy Sample; +Cc: 47336

Timothy Sample <samplet@ngyro.com> skribis:

> * guix/swh.scm (swh-directory-download): New procedure (with
> implementation extracted from 'swh-download').
> (swh-download): Use it to download the revision directory.

LGTM!





* [bug#47336] Disarchive as a fallback for downloads
       [not found]   ` <20210323045213.9419-2-samplet@ngyro.com>
@ 2021-03-27 10:57     ` Ludovic Courtès
  0 siblings, 0 replies; 12+ messages in thread
From: Ludovic Courtès @ 2021-03-27 10:57 UTC (permalink / raw)
  To: Timothy Sample; +Cc: 47336

Hi!

Timothy Sample <samplet@ngyro.com> skribis:

> * guix/download.scm (%disarchive-mirrors): New variable.
> (%disarchive-mirror-file): New variable.
> (built-in-download): Add 'disarchive-mirrors' keyword argument and
> pass its value along to the 'builtin:download' derivation.
> (url-fetch): Pass '%disarchive-mirror-file' to 'built-in-download'.
> * guix/scripts/perform-download.scm (perform-download): Read
> Disarchive mirrors from the environment and pass them to
> 'url-fetch'.
> * guix/build/download.scm (disarchive-fetch/any): New procedure.
> (url-fetch): Add 'disarchive-mirrors' keyword argument, use it to
> make a list of URIs, and use the new procedure to fetch the file if
> all other methods fail.

[...]

> +  #:use-module (guix base16)
>    #:use-module (guix base64)
>    #:use-module (guix ftp-client)
>    #:use-module (guix build utils)
>    #:use-module (guix progress)
> +  #:use-module (guix swh)

Maybe #:autoload them.

> +(define* (disarchive-fetch/any uris file
> +                               #:key (timeout 10))
> +  "Fetch a Disarchive specification from any of URIS, assemble it,
> +and write the output to FILE."
> +  (define (fetch-specification uris)
> +    (any (lambda (uri)
> +           (false-if-exception*
> +            (let-values (((port size) (http-fetch uri
> +                                                  #:verify-certificate? #t
> +                                                  #:timeout timeout)))

Perhaps add #:key (verify-certificate? #t) and have the caller pass it?
Currently (guix scripts perform-download) sets it to #f, which is a good
idea IMO.

> +  (match (and=> (resolve-module '(disarchive) #:ensure #f)
> +                (lambda (disarchive)
> +                  (cons (module-ref disarchive '%disarchive-log-port)
> +                        (module-ref disarchive 'disarchive-assemble))))
> +    (#f #f)
> +    ((%disarchive-log-port . disarchive-assemble)
> +     (format #t "Trying to use Disarchive to assemble ~a~%" file)
> +     (match (fetch-specification uris)
> +       (#f #f)
> +       (spec (parameterize ((%disarchive-log-port (current-output-port)))
> +               (disarchive-assemble spec file #:resolver resolve)))))))

So we would normally arrange so that the ‘guix’ package depends on
Disarchive, such that the above ‘resolve-module’ call works when done
via ‘guix perform-download’, right?

In the #f case, perhaps we should print something like “Disarchive not
found, bailing out”?
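
A minimal sketch of that pattern in isolation, assuming nothing beyond
core Guile: look a binding up in a module that may or may not be
installed, and say so when it is missing.  The name ‘optional-binding’
and the message wording are hypothetical, and (srfi srfi-1) is used only
because it is always available, where the patch would ask for
(disarchive).

```scheme
;; Return the value bound to SYMBOL in the module named MODULE-NAME, or
;; #f (after printing a note) if that module cannot be found.
(define (optional-binding module-name symbol)
  (let ((module (resolve-module module-name #:ensure #f)))
    (if module
        (module-ref module symbol)
        (begin
          (format #t "~a not found, bailing out~%" module-name)
          #f))))

;; A module that exists, and one that does not.
(define fold* (optional-binding '(srfi srfi-1) 'fold))
(define missing (optional-binding '(no such module) 'whatever))

(display (and fold* (fold* + 0 '(1 2 3))))
(newline)
```

Because the lookup happens at run time, the file containing it compiles
even when the optional module is absent.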

That’s all I have to say; it looks great to me!

That’s quite a milestone, it’d be great to have that in the upcoming
release.  Next we can discuss how to populate the Disarchive database
and where to do that (or your hosting fees could easily skyrocket :-)).
I suppose we could run that in Berlin and/or we could make an argument
about using SWH or Inria resources for that.

Thanks,
Ludo’.





* [bug#47336] Disarchive as a fallback for downloads
  2021-03-27 10:40   ` Ludovic Courtès
@ 2021-04-10 20:52     ` Ludovic Courtès
  2021-04-26  9:49     ` Ludovic Courtès
  1 sibling, 0 replies; 12+ messages in thread
From: Ludovic Courtès @ 2021-04-10 20:52 UTC (permalink / raw)
  To: Timothy Sample; +Cc: 47336

Ping!  :-)

Ludovic Courtès <ludo@gnu.org> skribis:

> Timothy Sample <samplet@ngyro.com> skribis:
>
>> * guix/swh.scm (swh-directory-download): New procedure (with
>> implementation extracted from 'swh-download').
>> (swh-download): Use it to download the revision directory.
>
> LGTM!





* [bug#47336] Disarchive as a fallback for downloads
  2021-03-27 10:40   ` Ludovic Courtès
  2021-04-10 20:52     ` Ludovic Courtès
@ 2021-04-26  9:49     ` Ludovic Courtès
  2021-04-28  2:30       ` bug#47336: " Timothy Sample
  1 sibling, 1 reply; 12+ messages in thread
From: Ludovic Courtès @ 2021-04-26  9:49 UTC (permalink / raw)
  To: Timothy Sample; +Cc: 47336

Hi Timothy,

Ping²!

Let me know if you’d like me to apply the patches on your behalf.

Ludo’.

Ludovic Courtès <ludo@gnu.org> skribis:

> Timothy Sample <samplet@ngyro.com> skribis:
>
>> * guix/swh.scm (swh-directory-download): New procedure (with
>> implementation extracted from 'swh-download').
>> (swh-download): Use it to download the revision directory.
>
> LGTM!





* bug#47336: Disarchive as a fallback for downloads
  2021-04-26  9:49     ` Ludovic Courtès
@ 2021-04-28  2:30       ` Timothy Sample
  2021-04-28  7:01         ` [bug#47336] " Timothy Sample
  0 siblings, 1 reply; 12+ messages in thread
From: Timothy Sample @ 2021-04-28  2:30 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 47336-done

Hi,

Ludovic Courtès <ludo@gnu.org> writes:

> Ping²!
>
> Let me know if you’d like me to apply the patches on your behalf.

No, no.  I’m just a little distracted over here.  I just pushed this
series with the updates you suggested (using #:autoload, passing
#:verify-certificate?, and being a bit more chatty).  Sorry for the
delay and thanks for the reminder.

Next, I’ll convert my Cuirass 0.x setup to a Cuirass 1.x setup, and then
I can start a discussion about moving the metadata builds to
ci.guix.gnu.org.

Also, to answer your other question:

> So we would normally arrange so that the ‘guix’ package depends on
> Disarchive, such that the above ‘resolve-module’ call works when done
> via ‘guix perform-download’, right?

That’s the idea.  I’m not confident about updating the ‘guix’ package
myself, though....


-- Tim





* [bug#47336] Disarchive as a fallback for downloads
  2021-04-28  2:30       ` bug#47336: " Timothy Sample
@ 2021-04-28  7:01         ` Timothy Sample
  2021-04-29  7:48           ` Ludovic Courtès
  0 siblings, 1 reply; 12+ messages in thread
From: Timothy Sample @ 2021-04-28  7:01 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: control, 47336

reopen 47336
thanks

Hi again,

Timothy Sample <samplet@ngyro.com> writes:

> I just pushed this series [...]

And broke “guix pull”!!  (I somehow fooled myself into thinking that I
had already tested with “guix pull --url=...” locally.)  I reverted the
offending commit.

It turns out that adding a reference from “(guix build download)” to
“(guix swh)” breaks “compute-guix-derivation” in
“build-aux/build-self.scm”.  This is because “(guix swh)” references
“(json)”, which is not available in the “compute-guix-derivation”
environment.  I tried mimicking the “fake-git” trick, but it didn’t work
(I guess it needs the “define-json-mapping” macro at compile time).

Everything works if I remove the #:autoload for “(guix swh)” and put

  ;; If we import (guix swh) directly, we introduce a compile-time
  ;; dependency on Guile-JSON.  This breaks the "build-self" code, which
  ;; needs to build this module without Guile-JSON.  Hence, we track
  ;; down the following procedure at runtime.
  (define swh-download-directory
    (module-ref (resolve-module '(guix swh)) 'swh-download-directory))

inside of “disarchive-fetch/any” (just before it’s needed).  Does this
approach look okay?


-- Tim





* [bug#47336] Disarchive as a fallback for downloads
  2021-04-28  7:01         ` [bug#47336] " Timothy Sample
@ 2021-04-29  7:48           ` Ludovic Courtès
  2021-04-29 17:24             ` bug#47336: " Timothy Sample
  0 siblings, 1 reply; 12+ messages in thread
From: Ludovic Courtès @ 2021-04-29  7:48 UTC (permalink / raw)
  To: Timothy Sample; +Cc: 47336

[-- Attachment #1: Type: text/plain, Size: 1689 bytes --]

Hi!

Timothy Sample <samplet@ngyro.com> skribis:

> And broke “guix pull”!!  (I somehow fooled myself into thinking that I
> had already tested with “guix pull --url=...” locally.)  I reverted the
> offending commit.

You can test with ‘guix pull’ (you need to make sure to specify the
right file:// URL *and* branch), or you can run “make as-derivation”.

> It turns out that adding a reference from “(guix build download)” to
> “(guix swh)” breaks “compute-guix-derivation” in
> “build-aux/build-self.scm”.  This is because “(guix swh)” references
> “(json)”, which is not available in the “compute-guix-derivation”
> environment.  I tried mimicking the “fake-git” trick, but it didn’t work
> (I guess it needs the “define-json-mapping” macro at compile time).
>
> Everything works if I remove the #:autoload for “(guix swh)” and put
>
>   ;; If we import (guix swh) directly, we introduce a compile-time
>   ;; dependency on Guile-JSON.  This breaks the "build-self" code, which
>   ;; needs to build this module without Guile-JSON.  Hence, we track
>   ;; down the following procedure at runtime.
>   (define swh-download-directory
>     (module-ref (resolve-module '(guix swh)) 'swh-download-directory))
>
> inside of “disarchive-fetch/any” (just before it’s needed).  Does this
> approach look okay?

That’s one possibility.

The patch below takes another approach.  I think it is aesthetically
slightly more pleasant because we don’t have to play ‘resolve-module’
tricks for obscure reasons.  WDYT?

(It also fixes a format string argument mismatch.)

Thanks!

Ludo’.


[-- Attachment #2: Type: text/x-patch, Size: 1552 bytes --]

diff --git a/build-aux/build-self.scm b/build-aux/build-self.scm
index 853a2f328f..f100ff4aae 100644
--- a/build-aux/build-self.scm
+++ b/build-aux/build-self.scm
@@ -250,6 +250,7 @@ interface (FFI) of Guile.")
     (match-lambda
       (('guix 'config) #f)
       (('guix 'channels) #f)
+      (('guix 'build 'download) #f)             ;autoloaded by (guix download)
       (('guix _ ...)   #t)
       (('gnu _ ...)    #t)
       (_               #f)))
diff --git a/guix/build/download.scm b/guix/build/download.scm
index 5431d7c682..ce31038b05 100644
--- a/guix/build/download.scm
+++ b/guix/build/download.scm
@@ -650,7 +650,7 @@ and write the output to FILE."
            (('swhid swhid)
             (match (string-split swhid #\:)
               (("swh" "1" "dir" id)
-               (format #t "Downloading from Software Heritage...~%" file)
+               (format #t "Downloading ~a from Software Heritage...~%" file)
                (false-if-exception*
                 (swh-download-directory id output)))
               (_ #f)))
diff --git a/guix/self.scm b/guix/self.scm
index 3154d180ac..7181205610 100644
--- a/guix/self.scm
+++ b/guix/self.scm
@@ -878,7 +878,8 @@ itself."
                    ("guix/store/schema.sql"
                     ,(local-file "../guix/store/schema.sql")))
 
-                 #:extensions (list guile-gcrypt)
+                 #:extensions (list guile-gcrypt
+                                    guile-json)   ;for (guix swh)
                  #:guile-for-build guile-for-build))
 
   (define *extra-modules*


* bug#47336: Disarchive as a fallback for downloads
  2021-04-29  7:48           ` Ludovic Courtès
@ 2021-04-29 17:24             ` Timothy Sample
  0 siblings, 0 replies; 12+ messages in thread
From: Timothy Sample @ 2021-04-29 17:24 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 47336-done

Hello,

Ludovic Courtès <ludo@gnu.org> writes:

> Timothy Sample <samplet@ngyro.com> skribis:
>
>> And broke “guix pull”!!  (I somehow fooled myself into thinking that I
>> had already tested with “guix pull --url=...” locally.)  I reverted the
>> offending commit.
>
> You can test with ‘guix pull’ (you need to make sure to specify the
> right file:// URL *and* branch), or you can run “make as-derivation”.

I will definitely be more careful with this in the future.

>> [...]  Does this approach look okay?
>
> That’s one possibility.
>
> The patch below takes another approach.  I think it is aesthetically
> slightly more pleasant because we don’t have to play ‘resolve-module’
> tricks for obscure reasons.  WDYT?

This is exactly what I was hoping for, but I couldn’t quite connect all
the dots in “build-self.scm”.  Thanks!

> (It also fixes a format string argument mismatch.)

Good catch!

I’ve pushed the updated patch and am closing the issue.  :)


-- Tim





* [bug#47336] Disarchive as a fallback for downloads
       [not found] <87eeg6o50b.fsf@ngyro.com>
  2021-03-23  9:35 ` [bug#47336] Disarchive as a fallback for downloads zimoun
       [not found] ` <20210323045213.9419-1-samplet@ngyro.com>
@ 2021-05-14 21:36 ` Ludovic Courtès
  2 siblings, 0 replies; 12+ messages in thread
From: Ludovic Courtès @ 2021-05-14 21:36 UTC (permalink / raw)
  To: Timothy Sample; +Cc: 47336

Hi!

Timothy Sample <samplet@ngyro.com> skribis:

> This patch series adds Disarchive assembly (backed by SWH lookup) as a
> fallback for downloads.
>
> To try it, make sure you are running the daemon in an environment with
> Disarchive available:
>
>     $ ./pre-inst-env guix environment --ad-hoc guile disarchive
>     # ./pre-inst-env guix-daemon --build-users-group=guixbuild
>
> Don’t forget to stop your existing Guix Daemon.  :)
>
> You also need to make sure that regular downloads are unavailable.  I do
> this by adjusting the “try” loop at the end of “url-fetch” in
> “guix/build/download.scm”.  I replace the usual list of URLs with ‘()’:
>
>     (let try ((uri (append uri content-addressed-uris)))
>       (match '() ; uri
>         ...))
>
> Now you can ask Guix for a recent .tar.gz source package:
>
>     $ ./pre-inst-env guix build --no-substitutes -S python-httpretty
>
> You should see:
>
>     Trying to use Disarchive to assemble /gnu/store/kbcnm57y2q1jvhvd8zw1g5vdiwlv19y9-httpretty-1.0.5.tar.gz
>     Assembling the directory httpretty-1.0.5
>     Downloading from Software Heritage...
>     7903d608efc89c14afb4d692a3721156e31a43e2/
>     7903d608efc89c14afb4d692a3721156e31a43e2/httpretty-1.0.5/
>     7903d608efc89c14afb4d692a3721156e31a43e2/httpretty-1.0.5/COPYING
>     [...]
>     Checking httpretty-1.0.5 digest... ok
>     Assembling the tarball httpretty-1.0.5.tar
>     Checking httpretty-1.0.5.tar digest... ok
>     Assembling the Gzip file httpretty-1.0.5.tar.gz
>     Checking httpretty-1.0.5.tar.gz digest... ok
>     Copying result to /gnu/store/kbcnm57y2q1jvhvd8zw1g5vdiwlv19y9-httpretty-1.0.5.tar.gz
>     successfully built /gnu/store/k0b3c7kgzyn1nlyhx192pcbcgbfnhnwa-httpretty-1.0.5.tar.gz.drv

Commits 67bf61255414115ffae0141df9dd3623bc742bff and
0b1f70d1a792af40aa0d13b3d227fde88f02d061 add the dependency on
Disarchive, so this fallback path is now enabled!

> There’s lots to talk about though....
>
> First, it looks up the metadata on my server.  This is fine for a demo,
> but not what we want forever.  The patch series supports adding several
> mirrors for looking up the metadata.  In the past, we talked about
> putting everything on one or a few of the big Git hosting platforms like
> GitHub or Gitlab.  That way, it would be easily picked up by SWH and
> archived “forever”.  Right now, I have Cuirass set up to build the
> metadata, and a little script that moves it from the build server to my
> Web server.  It would be simple enough to adjust that script to push it
> to a remote Git repo.  (Of course, the next step is to move this setup
> to Guix infrastructure.)  Thoughts?

We should talk to SWH, giving them the figures you gave earlier in this
thread.  But yeah, a Git repo looks best to me (it would be useful to
keep track of changes, for example if we eventually update metadata to a
new format) and it simplifies archival to SWH.

The second thing we need to figure out is where to create this database.
If you have a Cuirass job already, we should run it on ci.guix.  WDYT?

> On the code level, there were two things I couldn’t figure out for
> myself.
>
> I made the mirror list just simple strings.  AIUI, the client and the
> daemon have to agree about the format of the mirror list.  Given that
> running old daemons is common, changing the format is difficult.  Is it
> worth it to copy the more flexible interface used by the content
> addressed mirrors?  If yes, do I have to do the same ‘module-autoload!’
> dance to use ‘bytevector->base16-string’?  :)  (I probably would have
> just copied it, but that part confused me a bit.)

I had overlooked this suggestion of yours.  Yes, I think it’s best to
copy the SWH scheme.  Don’t worry about ‘module-autoload!’: nowadays we
can safely assume (guix base16) is available.

When we change from list-of-strings to list-of-procedures, we’ll have to
adjust the (guix build download) code so that it can deal with both.
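
One way to handle both formats is sketched below in plain Guile.  It is
an illustration under assumptions, not the interface used by the
content-addressed mirrors: ‘mirror->url’ is a hypothetical name, the
procedure-style mirror takes an algorithm symbol and a hexadecimal hash
only for the sake of the example, and the example.org hosts are
placeholders.  The “base/algorithm/hash” layout for string mirrors
follows the lookup URL shown earlier in the thread.

```scheme
(use-modules (ice-9 match))

;; Return the URL to query on MIRROR for the file whose hash, computed
;; with ALGORITHM (a symbol), is HEX-HASH (a hexadecimal string).
;; MIRROR is either a base URL string (old format) or a procedure
;; taking ALGORITHM and HEX-HASH (new format).
(define (mirror->url mirror algorithm hex-hash)
  (match mirror
    ((? string? base)
     (string-append base (symbol->string algorithm) "/" hex-hash))
    ((? procedure? proc)
     (proc algorithm hex-hash))))

;; Placeholder mirrors, one in each format.
(define mirrors
  (list "https://disarchive.example.org/"            ;old style
        (lambda (algorithm hex-hash)                 ;new style
          (string-append "https://mirror.example.org/lookup?algo="
                         (symbol->string algorithm)
                         "&hash=" hex-hash))))

(for-each (lambda (mirror)
            (display (mirror->url mirror 'sha256 "abc123"))
            (newline))
          mirrors)
```

Dispatching on the type of each entry lets a list coming from an older
client keep working while newer clients send procedures.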

> I imported some modules from “guix/build/download.scm” (well, just
> “base16” and “swh”).  It feels weird to use a bunch of host-side modules
> from what’s nominally a “guix/build” module.  This is okay because
> “guix/build/download.scm” is not /really/ build-side code.  It’s more
> like daemon (-ish) code that just happens to live in “guix/build”, which
> is why importing host-side modules is OK... right?

Yup.  :-)  In the end, the whole point is to reuse code on both sides,
and that’s what’s being done here.

Thanks,
Ludo’.






Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git
