unofficial mirror of guix-devel@gnu.org 
* Indication of build failure from substitute servers?
@ 2024-07-23  2:38 Jonathan Frederickson
  2024-08-02 10:19 ` Marek Paśnikowski
  2024-08-06  9:39 ` Ricardo Wurmus
  0 siblings, 2 replies; 5+ messages in thread
From: Jonathan Frederickson @ 2024-07-23  2:38 UTC (permalink / raw)
  To: guix-devel

Hi folks - I had a thought I wanted to bring up to you all.

I frequently end up with Guix attempting to build packages on my lower-powered machines when there are no substitutes available. However, a common reason that substitutes aren't available for a package is that the package failed to build in CI! And I usually discover this when the package fails to build locally, usually for the same reason, and usually after a relatively long build process.

Would it make sense to have some mechanism for substitute servers to be able to provide a sort of "non-existence proof" for a given package? Something that the CI system could publish to indicate that its build attempt for that package failed, and that clients could use to optionally abort without attempting a local build?

My reasoning for this is that, especially on some of my smaller ARM systems, a build attempt for some of these larger packages can take several hours, and if it's likely to fail I'd really prefer to know that ahead of time.
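
A minimal sketch of the client-side decision this would enable, assuming the substitute server could report one of three states per package. The `Status` names, the `plan` function and the `abort_on_known_failure` flag are all hypothetical; nothing like this exists in Guix today.

```python
from enum import Enum


class Status(Enum):
    AVAILABLE = "available"   # a substitute can be downloaded
    FAILED = "failed"         # CI tried to build it and the build failed
    UNKNOWN = "unknown"       # no substitute and no failure record


def plan(statuses, abort_on_known_failure=True):
    """Decide what to do before any local build starts.

    `statuses` maps a package name to the Status reported by the
    (hypothetical) substitute server.  Returns a pair (action, names)
    where action is "abort" or "proceed".
    """
    failed = sorted(n for n, s in statuses.items() if s is Status.FAILED)
    if failed and abort_on_known_failure:
        # Stop before spending hours on a build that CI already saw fail.
        return ("abort", failed)
    # Everything that is not AVAILABLE has to be built locally.
    to_build = sorted(n for n, s in statuses.items()
                      if s is not Status.AVAILABLE)
    return ("proceed", to_build)
```

Keeping the abort optional matters: a CI failure can be specific to the build farm, so a user may still want to try the build locally.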


* Re: Indication of build failure from substitute servers?
  2024-07-23  2:38 Indication of build failure from substitute servers? Jonathan Frederickson
@ 2024-08-02 10:19 ` Marek Paśnikowski
  2024-08-06  9:00   ` Ricardo Wurmus
  2024-08-06 16:53   ` pelzflorian (Florian Pelz)
  2024-08-06  9:39 ` Ricardo Wurmus
  1 sibling, 2 replies; 5+ messages in thread
From: Marek Paśnikowski @ 2024-08-02 10:19 UTC (permalink / raw)
  To: guix-devel

"Jonathan Frederickson" <jonathan@terracrypt.net> writes:

> I frequently end up with Guix attempting to build packages on my
> lower-powered machines when there are no substitutes
> available. However, a common reason that substitutes aren't available
> for a package is that the package failed to build in CI! And I usually
> discover this when the package fails to build locally, usually for the
> same reason, and usually after a relatively long build process.

I am also annoyed by this artifact of Nix-based systems. Some systems
are physically incapable of building their own binaries. For example, a
microcomputer absolutely needs its kernel, yet the device does not have
enough memory to build it. This is why I believe that a clean solution is
to guarantee proper substitute availability for systems that require it.

> Would it make sense to have some mechanism for substitute servers to
> be able to provide a sort of "non-existence proof" for a given
> package? Something that the CI system could publish to indicate that
> its build attempt for that package failed, and that clients could use
> to optionally abort without attempting a local build?

I have carried the following idea for a long time with the intent of
actually implementing it before sharing it ("if you want something done,
do it yourself" mentality). But seeing others' frustration with this
problem, I can at least share it. Here it is:

The proof of availability is in the workflow itself. The project committers
NEVER commit anything to the master branch. Only the CI system
does. Instead, the committers push to a "pre-main" branch, and the CI
system picks the commits up one by one and attempts to build them as
usual. IMPORTANT POINT: *if* the commit builds correctly, it gets pushed
by CI to the master branch, and the substitute is already available. *If*
the commit does not build, it gets rejected, and it never goes to
master.

I currently do not know enough about Git to confidently propose a
solution to the problem of how to handle the reordering of the queued
work on a build failure, but I have a feeling it is not that hard to
solve.
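
The gated workflow can be sketched as a small simulation. This is only an illustration under stated assumptions: `run_gate` and the `builds_ok` callback are hypothetical names, real commits are reduced to plain identifiers, and a rejected commit is simply dropped while later commits are retried on the unchanged master, which is one possible answer to the reordering question and not necessarily the right one.

```python
from collections import deque


def run_gate(pending, builds_ok):
    """Promote commits from a staging queue to master one at a time.

    `pending` is an iterable of commit identifiers in push order.
    `builds_ok(history, commit)` is a caller-supplied stand-in for CI:
    it returns True if `commit` builds on top of `history`.
    Returns (master, rejected).
    """
    queue = deque(pending)
    master, rejected = [], []
    while queue:
        commit = queue.popleft()
        if builds_ok(master, commit):
            master.append(commit)      # only CI ever advances master
        else:
            rejected.append(commit)    # never reaches master
            # Later commits stay queued and are tried on the unchanged master.
    return master, rejected


if __name__ == "__main__":
    # Hypothetical queue: "c" depends on "b", and "b" does not build.
    deps = {"a": [], "b": [], "c": ["b"], "d": ["a"]}
    broken = {"b"}

    def ci(history, commit):
        return commit not in broken and all(d in history for d in deps[commit])

    print(run_gate(["a", "b", "c", "d"], ci))
    # (['a', 'd'], ['b', 'c'])
```

In this model a commit that depends on a rejected one is rejected as well, so a single failure can cascade through the queue.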

One could argue that this process delays availability of software
updates, but I believe this is the correct price to pay. The CI latency
would still be negligible when compared to the latency of developers who
perform the real work of software maintenance.

There is also the issue of software bugs which cause problems at
runtime. However, this is an independent problem, which should be
managed by other QA processes. The art of good engineering is to find
the simplest mechanisms possible that achieve the tasks well. And this
requires breaking problems down into atomic parts.



* Re: Indication of build failure from substitute servers?
  2024-08-02 10:19 ` Marek Paśnikowski
@ 2024-08-06  9:00   ` Ricardo Wurmus
  2024-08-06 16:53   ` pelzflorian (Florian Pelz)
  1 sibling, 0 replies; 5+ messages in thread
From: Ricardo Wurmus @ 2024-08-06  9:00 UTC (permalink / raw)
  To: Marek Paśnikowski; +Cc: guix-devel

Marek Paśnikowski <marek@marekpasnikowski.pl> writes:

> The proof of availability is in the workflow itself. The project committers
> NEVER commit anything to the master branch. Only the CI system
> does. Instead, the committers push to a "pre-main" branch, and the CI
> system picks the commits up one by one and attempts to build them as
> usual. IMPORTANT POINT: *if* the commit builds correctly, it gets pushed
> by CI to the master branch, and the substitute is already available. *If*
> the commit does not build, it gets rejected, and it never goes to
> master.
>
> I currently do not know enough about Git to confidently propose a
> solution to the problem of how to handle the reordering of the queued
> work on a build failure, but I have a feeling it is not that hard to
> solve.

Prior art: https://docs.gitlab.com/ee/ci/pipelines/merge_trains.html

An open problem is preserving commit signatures or perhaps changing the
meaning of signatures by signing not the commit but the changes only.
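
To illustrate why the signature is a problem, here is a toy model, not Git's actual object format: `commit_id` and `change_id` are hypothetical helpers. A commit identifier covers its parent, so it changes whenever CI replays the commit onto a different tip, while an identifier of the change alone does not. Git's `git patch-id` command computes an identifier of this general kind for a diff.

```python
import hashlib


def commit_id(parent_id, author, diff):
    """Toy commit hash: covers the parent, so it changes on every rebase."""
    data = "\n".join([parent_id, author, diff]).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def change_id(author, diff):
    """Toy hash of the change alone: independent of where it is applied."""
    data = "\n".join([author, diff]).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


if __name__ == "__main__":
    diff = '+(version "1.2.3")'
    queued = commit_id("tip-of-pre-main", "alice", diff)
    landed = commit_id("tip-of-master", "alice", diff)
    # A signature made over `queued` says nothing about `landed`.
    print(queued == landed)
    # The change identifier takes no parent, so it survives the replay.
    print(change_id("alice", diff))
```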

-- 
Ricardo



* Re: Indication of build failure from substitute servers?
  2024-07-23  2:38 Indication of build failure from substitute servers? Jonathan Frederickson
  2024-08-02 10:19 ` Marek Paśnikowski
@ 2024-08-06  9:39 ` Ricardo Wurmus
  1 sibling, 0 replies; 5+ messages in thread
From: Ricardo Wurmus @ 2024-08-06  9:39 UTC (permalink / raw)
  To: Jonathan Frederickson; +Cc: guix-devel

"Jonathan Frederickson" <jonathan@terracrypt.net> writes:

> Would it make sense to have some mechanism for substitute servers to be able to provide a sort of "non-existence proof" for a given
> package? Something that the CI system could publish to indicate that its build attempt for that package failed, and that clients could use
> to optionally abort without attempting a local build?

It's something I've been wanting for the past decade.  The CI system
knows when a build has failed but when checking for substitutes there is
no endpoint to ask whether the CI build has failed.

In the past we had discussed enhancements to the substitution mechanism
(with the background of making use of information from "guix challenge")
that would allow people to have a bit more control over it.  In the
meantime we have added a way to let "guix pull" determine the latest
commit with substitutes for the derivations needed for "guix pull" ---
in the same vein we could enhance the substitution mechanism to check an
endpoint on CI and abort if the remote build has been marked as failed.
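
A rough sketch of what such a check could look like on the client, assuming a per-item status endpoint. The `/build-status/<hash>` path, its JSON body and the `remote_build_failed` helper are hypothetical; no such endpoint exists today. The HTTP request is passed in as a function so the logic can be exercised without a server.

```python
import json


def remote_build_failed(store_hash, fetch):
    """Return True only if CI positively reports a failed build.

    `fetch(path)` is a caller-supplied function returning
    (status_code, body) for a request to the substitute server.
    """
    status, body = fetch(f"/build-status/{store_hash}")
    if status != 200:
        return False  # no record: behave exactly as today
    try:
        record = json.loads(body)
    except ValueError:
        return False  # a malformed answer is treated as "unknown"
    return isinstance(record, dict) and record.get("status") == "failed"
```

Anything other than an explicit "failed" answer falls back to the current behaviour, so a server without the endpoint changes nothing for the client.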

If you'd like, you could think this through and come up with a minimal
set of proposed changes needed to make it work.  Then we could discuss
this here and decide which of the possible approaches would be most
appropriate.

-- 
Ricardo



* Re: Indication of build failure from substitute servers?
  2024-08-02 10:19 ` Marek Paśnikowski
  2024-08-06  9:00   ` Ricardo Wurmus
@ 2024-08-06 16:53   ` pelzflorian (Florian Pelz)
  1 sibling, 0 replies; 5+ messages in thread
From: pelzflorian (Florian Pelz) @ 2024-08-06 16:53 UTC (permalink / raw)
  To: Marek Paśnikowski; +Cc: guix-devel

Marek Paśnikowski <marek@marekpasnikowski.pl> writes:
> The proof of availability is in the workflow itself. The project committers
> NEVER commit anything to the master branch. Only the CI system
> does. Instead, the committers push to a "pre-main" branch, and the CI
> system picks the commits up one by one and attempts to build them as
> usual. IMPORTANT POINT: *if* the commit builds correctly, it gets pushed
> by CI to the master branch, and the substitute is already available. *If*
> the commit does not build, it gets rejected, and it never goes to
> master.

In theory we have QA badges on patches at the top of
e.g. <https://issues.guix.gnu.org/72101>.  Committers could wait for it
and for its substitutes.

In practice, QA is too slow or does not prioritize enough yet. [1]

Regards,
Florian

[1]
https://lists.gnu.org/archive/html/bug-guix/2024-05/msg00116.html
https://yhetil.org/guix-bugs/87le4czh0z.fsf@gmail.com/


