unofficial mirror of guix-devel@gnu.org 
From: zimoun <zimon.toutoune@gmail.com>
To: Christopher Baines <mail@cbaines.net>
Cc: Guix Devel <guix-devel@gnu.org>
Subject: Automation of SWH save (was: Cuirass: "lint -c archival"?)
Date: Fri, 25 Sep 2020 10:56:53 +0200
Message-ID: <CAJ3okZ0MAKW4BVAMOo03yDQOMnqXQVzq1Agq6jNu-KYRVUohrg@mail.gmail.com>
In-Reply-To: <874knnouzr.fsf@cbaines.net>

Hi,

On Thu, 24 Sep 2020 at 21:06, Christopher Baines <mail@cbaines.net> wrote:
> zimoun <zimon.toutoune@gmail.com> writes:

> So, my understanding is that Software Heritage is a potential store for
> source material for Guix packages. I think the majority of builds
> Cuirass does are because inputs change, rather than the source of a
> package.

To be precise, Software Heritage stores only upstream source code.
Their API entry point for "save" takes the URL of a Git, Mercurial or
Subversion repository, and they then ingest the content that this very
URL serves.

And it is not necessary to build the package to send a "save" request;
"guix lint -c archival foo" sends the request for Guix packages whose
source is a git-reference.

Note that Guix does not send the result of "guix build -S" but the
real upstream URL.
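
To illustrate what such a request targets, here is a minimal sketch in
Python.  It assumes the SWH "save" endpoint has the form
/api/1/origin/save/<visit type>/url/<origin URL>/; the helper name and
the example repository URL are hypothetical, and the code only builds
the URL, it does not contact the archive.

```python
# Assumed base URL of the Software Heritage web API.
SWH_API = "https://archive.softwareheritage.org/api/1"

# Repository kinds mentioned above: Git, Mercurial, Subversion.
VISIT_TYPES = ("git", "hg", "svn")


def save_request_url(visit_type, origin_url):
    """Return the URL a "save" request would be POSTed to.

    The origin is the real upstream URL, not a store item or the
    result of "guix build -S".
    """
    if visit_type not in VISIT_TYPES:
        raise ValueError(f"unsupported visit type: {visit_type}")
    return f"{SWH_API}/origin/save/{visit_type}/url/{origin_url}/"


if __name__ == "__main__":
    # Hypothetical upstream repository, for illustration only.
    print(save_request_url("git", "https://example.org/foo.git"))
```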

> I'm not sure hooking this up to Cuirass would make the most sense,
> because of the above point.
>
> Also, unfortunately, the Guix Data Service doesn't have the ideal data
> for this, as it doesn't really store the package source information in
> the way that would be useful for this.

Somehow, the GDS has this information because it reports Lint Warnings
(for example [1]: bottom "no lint warnings").  However, if I read
correctly, you added the option "--no-network" to only use the linters
which do not require network access.

Does the GDS run the linters by itself or does it use the log from Cuirass?

[1] <https://data.guix.gnu.org/revision/c385bd69ad407f608e3da3156fed0ac915574313/package/git/2.28.0>


BTW, please consider patch #43261 [2], which fixes an issue in the
current implementation of "--no-network". :-)

[2] <http://issues.guix.gnu.org/issue/43261>


> Personally though (and I'm rather biased), I think the Guix Data Service
> might still be an approach. If you take the view on this that the
> Software Heritage is a means to a store item (which I think is right?),
> the Guix Data Service knows about those store items (like [1]).
>
> 1: https://data.guix.gnu.org/gnu/store/5h4dz6ild4fkida5yfv5fhh59vfd8hvk-python-boolean.py-3.6-checkout

Currently, Guix does not provide machinery to send its source
substitutes.  I am not convinced it makes sense to do so.  The model I
am imagining is:

 - short term:
    + a script runs as a cron job to lint all the packages, say once
      per day (packages will be missed but it is better than what we
      currently have)
    + try to implement the save request for hg and svn (I am working
      on it if no one beats me :-))
 - middle term: add a hook (Cuirass or GDS) to trigger action if the
   package passes.
 - long term: SWH ingests everything via sources.json
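
The short-term cron job could be sketched roughly as below.  This is
only an illustration under assumptions: the list of package names is
supplied by the caller, the function names are hypothetical, and the
exit status only tells whether the command ran, not whether SWH
accepted the save request.

```python
import subprocess


def archival_lint_command(package):
    """Return the command line quoted in this message for one package."""
    return ["guix", "lint", "-c", "archival", package]


def lint_all(packages, run=subprocess.run):
    """Run the archival linter on each package in turn.

    Returns the names of the packages for which the command exited
    with a non-zero status.  The `run` parameter exists so that the
    loop can be exercised without Guix installed.
    """
    failed = []
    for name in packages:
        result = run(archival_lint_command(name))
        if result.returncode != 0:
            failed.append(name)
    return failed
```

A daily cron entry would then call lint_all with whatever package list
the machine can compute.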

Sending all the source substitutes should be done once, at the
transition from short to middle term.  Currently, SWH ingests all the
tarballs (via sources.json) and a few git-reference packages: the ones
for which the packager/reviewer ran "guix lint -c archival".  I am
proposing to automate this instead of relying on the goodwill of a
packager/reviewer. :-)

Well, from a wider point of view, the hook could send a save request to
SWH, or we could also imagine that the hook does whatever it wants with
the results (store item): push it somewhere, or disassemble the tarball
(if any) and save it to the database (to be able to fetch it from SWH
later).

Note that the long term does not depend on the Guix side but on the
SWH side.  So the term could be shorter. :-)

Does this make sense?


> To make the information actionable though, it would be necessary to
> store more information about the sources for packages in the Guix Data
> Service database.
>
> This is much more work than just using the existing linter, but it does
> have the advantage that you'd be able to look at coverage statistics and
> things like that, which the checker doesn't really afford.

Yes.

In summary, SWH limits the number of requests per hour (10 save
requests and 120 query requests), so it is impossible to automate the
saving mechanism.  I am proposing to ask them to change this rate
limit for one specific trusted machine (for example, if I understand
correctly, the Nix and Debian projects are doing so).  Therefore, the
questions are:

 - which machine?
 - what is the automation process? (see above)
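
To give an idea of what the current limit means in practice, here is a
small sketch of the arithmetic.  The figure of 10 save requests per
hour is the one quoted above; the package count in the example is a
made-up number, not a measurement, and the function names are
hypothetical.

```python
import math

# Save-request limit quoted above.
SAVE_REQUESTS_PER_HOUR = 10


def hours_to_submit(n_packages, per_hour=SAVE_REQUESTS_PER_HOUR):
    """Hours needed to submit one save request per package."""
    return math.ceil(n_packages / per_hour)


def hourly_batches(packages, per_hour=SAVE_REQUESTS_PER_HOUR):
    """Split the packages into batches that each fit in one hour."""
    return [packages[i:i + per_hour]
            for i in range(0, len(packages), per_hour)]


if __name__ == "__main__":
    # Hypothetical count of git-reference packages.
    print(hours_to_submit(1000))
```

With these assumptions, 1000 packages would need 100 hours, i.e. more
than four days for a single pass, which is why a higher limit for one
trusted machine matters.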


WDYT?

All the best,
simon



Thread overview: 5+ messages
2020-09-23 16:58 Cuirass: "lint -c archival"? zimoun
2020-09-24  7:28 ` Mathieu Othacehe
2020-09-24 10:29   ` zimoun
2020-09-24 19:06 ` Christopher Baines
2020-09-25  8:56   ` zimoun [this message]
