unofficial mirror of guix-devel@gnu.org 
* Cuirass: "lint -c archival"?
@ 2020-09-23 16:58 zimoun
  2020-09-24  7:28 ` Mathieu Othacehe
  2020-09-24 19:06 ` Christopher Baines
  0 siblings, 2 replies; 5+ messages in thread
From: zimoun @ 2020-09-23 16:58 UTC
  To: Guix Devel, Mathieu Othacehe, Ricardo Wurmus,
	Ludovic Courtès, Christopher Baines

Dear,

Does it make sense to add "lint -c archival" when a package is built
by Cuirass?  Or on the Guix Data Services?

The idea is then to ask the SWH folks to increase the rate limit
for a specific IP (or a couple of IPs).  Today, the SWH limit is 10 save
requests per hour, i.e., 240 per day (more or less), while the new
chart [1] shows that there are ~2000 builds per day.  Ouch! :-)

[1] <https://ci.guix.gnu.org/metrics>
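
To make the gap concrete, here is a quick sketch of the arithmetic in
Python.  It assumes the worst case of one save request per build, and
the ~2000 figure is only an eyeball reading of the chart, so treat the
result as an order of magnitude and nothing more.

```python
# Back-of-the-envelope comparison of the SWH save quota with the build rate.
SAVE_REQUESTS_PER_HOUR = 10   # SWH limit quoted above
BUILDS_PER_DAY = 2000         # rough figure read off the metrics chart

daily_quota = SAVE_REQUESTS_PER_HOUR * 24      # requests allowed per day
shortfall = BUILDS_PER_DAY - daily_quota       # builds left without a request
coverage = daily_quota / BUILDS_PER_DAY        # fraction that could be covered

print(f"quota: {daily_quota}/day, shortfall: {shortfall}, coverage: {coverage:.0%}")
```

In other words, under these assumptions the current limit would cover
roughly one build in eight.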

If that is not possible, does it make sense to add a script to etc/
instead?  If SWH agrees to raise the rate limit for a specific machine,
the script (fold-packages+save-origin) could run with some delay and
save all the missing Git references.

Also, I do not know what the GitLab CI in Bordeaux is doing with Guix
packages; I guess some things there already send save requests
automatically.

WDYT?

All the best,
simon



* Re: Cuirass: "lint -c archival"?
  2020-09-23 16:58 Cuirass: "lint -c archival"? zimoun
@ 2020-09-24  7:28 ` Mathieu Othacehe
  2020-09-24 10:29   ` zimoun
  2020-09-24 19:06 ` Christopher Baines
  1 sibling, 1 reply; 5+ messages in thread
From: Mathieu Othacehe @ 2020-09-24  7:28 UTC
  To: zimoun; +Cc: Guix Devel


Hello zimoun,

> The idea behind is then to ask SWH folks to increase the rate limit
> for a specific IP (or couple of IPs).  Today, the SWH rate is 10 save
> requests per hour, i.e., 240 per day (more or less).  And the new
> chart [1] shows that there are ~2000 builds per day.  Ouch! :-)

Yesterday almost 18,000 derivations were added, and even if only 10,000
were built, that is indeed quite substantial.

> If it is not possible, then instead does it make sense to add a script
> to etc/?  If SWH accepts to increase the rate for a specific machine,
> the script (fold-packages+save-origin) could run with some delay and
> save all the missing Git references.

Adding some sort of "post build" hook to Cuirass that would trigger an
SWH archival would be possible, although this mechanism would first
have to be implemented.

Having a cron job archiving missing references would also be possible I
guess, but I may have a preference for the first option.

Thanks,

Mathieu

-- 
https://othacehe.org



* Re: Cuirass: "lint -c archival"?
  2020-09-24  7:28 ` Mathieu Othacehe
@ 2020-09-24 10:29   ` zimoun
  0 siblings, 0 replies; 5+ messages in thread
From: zimoun @ 2020-09-24 10:29 UTC
  To: Mathieu Othacehe; +Cc: Guix Devel

Hi,

On Thu, 24 Sep 2020 at 09:28, Mathieu Othacehe <othacehe@gnu.org> wrote:

> > The idea behind is then to ask SWH folks to increase the rate limit
> > for a specific IP (or couple of IPs).  Today, the SWH rate is 10 save
> > requests per hour, i.e., 240 per day (more or less).  And the new
> > chart [1] shows that there are ~2000 builds per day.  Ouch! :-)
>
> Yesterday almost 18.000 derivations were added, and even if only 10.000
> were built, it is indeed quite substantial.

That's good news. :-)

On average, it is ~2000, right?

Well, we could set a limit for the busier days, sending only the first
X builds, where X is agreed with SWH.
It would be far from perfect and some packages would not be saved, but
it seems better than the current situation (which depends on the
submitter/reviewer only).

This would be a stopgap in the meantime, while waiting for the SWH
sources.json loader to accept more than 'url-fetch' sources.
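
To illustrate the capped approach, here is a minimal sketch in Python
(not the Scheme script itself).  The function name, the cap value, and
the example.org URLs are all made up for illustration; the real X would
be whatever SWH agrees to.

```python
def select_for_saving(origin_urls, daily_cap):
    """Split distinct URLs into (sent today, left over), keeping their order."""
    seen = set()
    distinct = []
    for url in origin_urls:
        # Several packages may share one upstream repository; ask only once.
        if url not in seen:
            seen.add(url)
            distinct.append(url)
    return distinct[:daily_cap], distinct[daily_cap:]


# Hypothetical candidates collected during one day.
candidates = [
    "https://example.org/a.git",
    "https://example.org/b.git",
    "https://example.org/a.git",   # duplicate origin
    "https://example.org/c.git",
]
to_send, left_over = select_for_saving(candidates, daily_cap=2)
print(to_send, left_over)
```

The left-over URLs are exactly the packages that "would not be saved"
on that day; carrying them over to the next run would be one way to
soften that.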


> > If it is not possible, then instead does it make sense to add a script
> > to etc/?  If SWH accepts to increase the rate for a specific machine,
> > the script (fold-packages+save-origin) could run with some delay and
> > save all the missing Git references.
>
> Adding some sort of "post build" hook to Cuirass that would trigger an
> SHW archival would be possible, even though it would require to
> implement this mechanism.

Cool!  Yakafonkon. ;-)

> Having a cron job archiving missing references would also be possible I
> guess, but I may have a preference for the first option.

Because I am lazy, the "post build" hook appears to me more
complicated to implement than a cron job with a Scheme script (that I
almost already have :-)).  Hey, "Now is better than never. Although
never is often better than *right* now." :-)


Thanks,
simon



* Re: Cuirass: "lint -c archival"?
  2020-09-23 16:58 Cuirass: "lint -c archival"? zimoun
  2020-09-24  7:28 ` Mathieu Othacehe
@ 2020-09-24 19:06 ` Christopher Baines
  2020-09-25  8:56   ` Automation of SWH save (was: Cuirass: "lint -c archival"?) zimoun
  1 sibling, 1 reply; 5+ messages in thread
From: Christopher Baines @ 2020-09-24 19:06 UTC
  To: zimoun; +Cc: Guix Devel



zimoun <zimon.toutoune@gmail.com> writes:

> Does it make sense to add "lint -c archival" when a package is built
> by Cuirass?  Or on the Guix Data Services?
>
> The idea behind is then to ask SWH folks to increase the rate limit
> for a specific IP (or couple of IPs).  Today, the SWH rate is 10 save
> requests per hour, i.e., 240 per day (more or less).  And the new
> chart [1] shows that there are ~2000 builds per day.  Ouch! :-)
>
> [1] <https://ci.guix.gnu.org/metrics>
>
> If it is not possible, then instead does it make sense to add a script
> to etc/?  If SWH accepts to increase the rate for a specific machine,
> the script (fold-packages+save-origin) could run with some delay and
> save all the missing Git references.
>
> Well, I do not know what the GitLab CI in Bordeaux is doing?  About
> Guix packages because there are already some things saving requests
> automatically, I guess.
>
> WDYT?

So, my understanding is that Software Heritage is a potential store for
source material for Guix packages. I think the majority of builds
Cuirass does are because inputs change, rather than the source of a
package.

I'm not sure hooking this up to Cuirass would make the most sense,
because of the above point.

Also, unfortunately, the Guix Data Service doesn't have the ideal data
for this, as it doesn't really store the package source information in
the way that would be useful for this.

Personally though (and I'm rather biased), I think the Guix Data Service
might still be an approach. If you take the view on this that the
Software Heritage is a means to a store item (which I think is right?),
the Guix Data Service knows about those store items (like [1]).

1: https://data.guix.gnu.org/gnu/store/5h4dz6ild4fkida5yfv5fhh59vfd8hvk-python-boolean.py-3.6-checkout

It's already storing if substitute servers have a nar for that store
item, so I don't think storing if it's available elsewhere is
particularly out of place.

To make the information actionable though, it would be necessary to
store more information about the sources for packages in the Guix Data
Service database.

This is much more work than just using the existing linter, but it does
have the advantage that you'd be able to look at coverage statistics and
things like that, which the checker doesn't really afford.

Chris


* Automation of SWH save (was: Cuirass: "lint -c archival"?)
  2020-09-24 19:06 ` Christopher Baines
@ 2020-09-25  8:56   ` zimoun
  0 siblings, 0 replies; 5+ messages in thread
From: zimoun @ 2020-09-25  8:56 UTC
  To: Christopher Baines; +Cc: Guix Devel

Hi,

On Thu, 24 Sep 2020 at 21:06, Christopher Baines <mail@cbaines.net> wrote:
> zimoun <zimon.toutoune@gmail.com> writes:

> So, my understanding is that Software Heritage is a potential store for
> source material for Guix packages. I think the majority of builds
> Cuirass does are because inputs change, rather than the source of a
> package.

To be precise, Software Heritage stores only the upstream source code.
Their API entry point for "save" takes the URL of a Git, Mercurial, or
Subversion repository, and they then ingest the content that this very
URL serves.

And it is not necessary to build the package to send a "save" request;
"guix lint -c archival foo" sends the request for the git-reference
source of a Guix package.

Note that Guix does not send the result of "guix build -S" but the
real upstream URL.
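
For illustration, here is a sketch in Python of what such a "save"
request looks like.  The endpoint layout is my understanding of the SWH
Web API and should be checked against their documentation; the function
name and the example.org URL are hypothetical.  The request is only
built, not sent.

```python
import urllib.request

# Assumed base URL and endpoint layout of the SWH Web API.
SWH_API = "https://archive.softwareheritage.org/api/1"
VISIT_TYPES = {"git", "hg", "svn"}


def save_request(visit_type, origin_url):
    """Build (without sending) a POST request asking SWH to save ORIGIN_URL."""
    if visit_type not in VISIT_TYPES:
        raise ValueError(f"unsupported visit type: {visit_type}")
    # The upstream URL itself is part of the endpoint path.
    endpoint = f"{SWH_API}/origin/save/{visit_type}/url/{origin_url}/"
    return urllib.request.Request(endpoint, method="POST")


req = save_request("git", "https://example.org/foo.git")
print(req.get_method(), req.full_url)
```

The point is that only the upstream URL travels to SWH; nothing from
the store is uploaded.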

> I'm not sure hooking this up to Cuirass would make the most sense,
> because of the above point.
>
> Also, unfortunately, the Guix Data Service doesn't have the ideal data
> for this, as it doesn't really store the package source information in
> the way that would be useful for this.

Somehow, the GDS has this information because it reports Lint Warnings
(for example [1]: bottom "no lint warnings").  However, if I read
correctly, you added the option "--no-network" to only use the linters
which do not require network access.

Does the GDS run the linters by itself or does it use the log from Cuirass?

[1] <https://data.guix.gnu.org/revision/c385bd69ad407f608e3da3156fed0ac915574313/package/git/2.28.0>


BTW, please consider the patch #43261 [2], which fixes an issue in the
current implementation of "--no-network". :-)

[2] <http://issues.guix.gnu.org/issue/43261>


> Personally though (and I'm rather biased), I think the Guix Data Service
> might still be an approach. If you take the view on this that the
> Software Heritage is a means to a store item (which I think is right?),
> the Guix Data Service knows about those store items (like [1]).
>
> 1: https://data.guix.gnu.org/gnu/store/5h4dz6ild4fkida5yfv5fhh59vfd8hvk-python-boolean.py-3.6-checkout

Currently, Guix does not provide machinery to send its source
substitutes.  I am not convinced it makes sense to do so.  The model I
am imagining is:

 - short term:
    + a script runs as a cron job to lint all the packages, say once
      per day (packages will be missed but it is better than what we
      currently have)
    + try to implement the save request for hg and svn (I am working
      on it if no one beats me to it :-))
 - middle term: add a hook (Cuirass or GDS) to trigger the action if
   the package passes.
 - long term: SWH ingests everything via sources.json
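
For the short-term cron job, the throttling part could look like the
following Python sketch.  It only computes when each request should be
sent, spreading them evenly so that the hourly limit is respected; the
function name is hypothetical and the list of origins would come from
the package set.

```python
def plan_save_requests(origin_urls, per_hour=10):
    """Return (delay in seconds, url) pairs spaced to respect PER_HOUR."""
    if per_hour <= 0:
        raise ValueError("per_hour must be positive")
    # Integer arithmetic: request i + per_hour is sent exactly 3600 s
    # after request i, so no one-hour window gets more than per_hour.
    return [((i * 3600) // per_hour, url) for i, url in enumerate(origin_urls)]


plan = plan_save_requests(["url-a", "url-b", "url-c"])
print(plan)
```

With the current limit of 10 per hour, the last of 2000 origins would
be sent a bit more than eight days after the first, which is why a
higher limit for one machine matters.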

Sending all the source substitutes should somehow be done once, when
moving from the short to the middle term.  Currently, SWH ingests all
the tarballs (via sources.json) and a few git-reference packages: the
ones for which the packager/reviewer ran "guix lint -c archival".  I am
proposing to automate this instead of relying on the goodwill of a
packager/reviewer. :-)

Well, from a wider point of view, the hook could send a save request to
SWH, or we could also imagine the hook doing something else with the
result (the store item): pushing it somewhere, or disassembling the
tarball (if any) and saving it to the database (to be able to fetch it
from SWH later).

Note that the long term does not depend on the Guix side but on the
SWH side.  So the term could be shorter. :-)

Does this make sense?


> To make the information actionable though, it would be necessary to
> store more information about the sources for packages in the Guix Data
> Service database.
>
> This is much more work than just using the existing linter, but it does
> have the advantage that you'd be able to look at coverage statistics and
> things like that, which the checker doesn't really afford.

Yes.

In summary, SWH limits the number of requests per hour (10 save
requests and 120 query requests), so it is currently impossible to
automate the saving mechanism.  I am proposing to ask them to raise this
rate limit for one specific trusted machine (as, if I understand
correctly, is already done for the Nix and Debian projects).  Therefore,
the questions are:

 - which machine?
 - what is the automation process? (see above)


WDYT?

All the best,
simon

