unofficial mirror of guix-devel@gnu.org 
* Prototype tool for building derivations
@ 2020-04-17 20:22 Christopher Baines
  2020-04-22 20:31 ` Ludovic Courtès
  2020-04-24 13:35 ` zimoun
  0 siblings, 2 replies; 5+ messages in thread
From: Christopher Baines @ 2020-04-17 20:22 UTC
  To: guix-devel

Hey,

Over the last couple of weeks, I've made some time to implement
something I was thinking about for a while.

In terms of getting to a point where Guix packages build reliably and
reproducibly, I think more testing is what's going to help. By taking
packages and building them more, on a wide variety of hardware and
software configurations, we'll get data on what works, what doesn't, and
where improvements and fixes can be made.

It's very much a prototype, but I've pushed some code up here [1] now;
the README.org file [2] contains usage instructions as well as a
description of the architecture.

1: https://git.cbaines.net/guix/build-coordinator/
2: https://git.cbaines.net/guix/build-coordinator/about/

So far, I've mostly done the boring stuff, but I'm excited about what
this could support.

Because the allocation/scheduling of builds is controlled, this offers
the possibility of doing some builds before others. If you were using
this for providing substitutes for example, it could be valuable to try
and prioritise building things that are requested more often, or those
that are more expensive (in time or space) to build.
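
A minimal sketch of the prioritisation idea above, written in Python for illustration rather than taken from the coordinator's Guile code. The scoring function, the field names, and the example derivation names and numbers are all assumptions; the prototype may weigh request counts and build cost quite differently.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class PendingBuild:
    # heapq pops the smallest item first, so the priority is stored negated.
    sort_key: float
    derivation: str = field(compare=False)


def priority(request_count, build_seconds):
    """Hypothetical score: often-requested, expensive builds rank higher."""
    return request_count * build_seconds


def schedule(builds):
    """Return derivation names in the order they would be allocated.

    `builds` is an iterable of (derivation, request_count, build_seconds).
    """
    heap = [PendingBuild(-priority(count, seconds), drv)
            for drv, count, seconds in builds]
    heapq.heapify(heap)
    order = []
    while heap:
        order.append(heapq.heappop(heap).derivation)
    return order


# Made-up example data, not measurements.
order = schedule([
    ("hello.drv", 5, 10.0),
    ("gcc.drv", 50, 3600.0),
    ("emacs.drv", 20, 600.0),
])
print(order)
```

Because the queue lives outside the daemon, the score could in principle be recomputed whenever new request data arrives.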

Often there are concurrency issues with builds, so I want to add a way of
specifying where builds should run. This would make it easy to test
building the same derivation in different setups, then capture where it
succeeds, fails, and how the output differs (if at all) across the
different environments.
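
One way to picture the comparison described above, as a hedged Python sketch: each build result is reduced to an (environment, output hash) pair, with `None` standing for a failed build. The environment labels, the hash strings, and the `summarise` helper are hypothetical and only illustrate the bookkeeping, not how the coordinator stores results.

```python
from collections import defaultdict


def summarise(results):
    """Group the results of building one derivation in several environments.

    `results` is a list of (environment, output_hash) pairs; a hash of
    None means the build failed in that environment.
    """
    failed = sorted(env for env, output_hash in results if output_hash is None)
    by_hash = defaultdict(list)
    for env, output_hash in results:
        if output_hash is not None:
            by_hash[output_hash].append(env)
    return {
        "failed": failed,
        "outputs": {h: sorted(envs) for h, envs in by_hash.items()},
        # True only when every successful build produced the same output.
        "outputs_agree": len(by_hash) == 1,
    }


# Made-up example data.
report = summarise([
    ("x86_64-a", "sha256:aaa"),
    ("x86_64-b", "sha256:aaa"),
    ("aarch64-a", "sha256:bbb"),
    ("armhf-a", None),
])
print(report)
```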

I think it would be good to get to a point where there are many different
individuals and groups providing independent sources of Guix packages,
such that users can have a high level of confidence that the substitutes
they're getting correspond to the source code. Getting there will be
easier if substitute servers are easy to operate, and part of that I
think comes down to how easy it is to see what's going on. With the
current daemon implementation, I'm not sure how to get much data out
(this could be possible, I haven't looked very closely). This approach,
however, where the scheduling is done outside the daemon, makes the
information more accessible.

I think some of the design decisions here are quite short-sighted. I
think it would be better if some of this functionality could be handled
by the guix-daemon, especially things like providing information on
builds that are going to happen, but haven't happened yet. Once there's
a Guile implementation of the guix-daemon, hopefully some of this
technical debt can be repaid.

Just let me know if you have any questions or comments!

Thanks,

Chris

* Re: Prototype tool for building derivations
  2020-04-17 20:22 Prototype tool for building derivations Christopher Baines
@ 2020-04-22 20:31 ` Ludovic Courtès
  2020-04-26 19:59   ` Christopher Baines
  2020-04-24 13:35 ` zimoun
  1 sibling, 1 reply; 5+ messages in thread
From: Ludovic Courtès @ 2020-04-22 20:31 UTC
  To: Christopher Baines; +Cc: guix-devel

Hello!

Christopher Baines <mail@cbaines.net> skribis:

> In terms of getting to a point where Guix packages build reliably and
> reproducibly, I think more testing is what's going to help. By taking
> packages and building them more, on a wide variety of hardware and
> software configurations, we'll get data on what works, what doesn't, and
> where improvements and fixes can be made.
>
> It's very much a prototype, but I've pushed some code up here [1] now;
> the README.org file [2] contains usage instructions as well as a
> description of the architecture.
>
> 1: https://git.cbaines.net/guix/build-coordinator/
> 2: https://git.cbaines.net/guix/build-coordinator/about/
>
> So far, I've mostly done the boring stuff, but I'm excited about what
> this could support.

Neat!  I like the architecture and the fact that it’s easy to track down
what was built and where.  (guix scripts offload) currently doesn’t save
that info.

A note about the usage as explained in the README: be sure to register
GC roots for derivations before passing them around processes.  :-)

Having an HTTP interface is really nice (I recently had someone ask
whether one could coordinate offloading over HTTP rather than SSH; this
is helpful in some contexts).

That also makes me wonder whether we could implement some of the store
RPCs over HTTP (I think Nix does something along these lines for
substitutes nowadays).  Because the coordinator interface is in fact
close to a subset of the daemon’s RPC interface.

> Because the allocation/scheduling of builds is controlled, this offers
> the possibility of doing some builds before others. If you were using
> this for providing substitutes for example, it could be valuable to try
> and prioritise building things that are requested more often, or those
> that are more expensive (in time or space) to build.

Yup, this highlights another shortcoming of ‘guix offload’ as well as
Cuirass actually, whereby there’s no way to observe scheduling decisions
nor to influence them dynamically.

> Often there are concurrency issues with builds, so I want to add a way of
> specifying where builds should run. This would make it easy to test
> building the same derivation in different setups, then capture where it
> succeeds, fails, and how the output differs (if at all) across the
> different environments.

Yup.

> I think it would be good to get to a point where there are many different
> individuals and groups providing independent sources of Guix packages,
> such that users can have a high level of confidence that the substitutes
> they're getting correspond to the source code. Getting there will be
> easier if substitute servers are easy to operate, and part of that I
> think comes down to how easy it is to see what's going on. With the
> current daemon implementation, I'm not sure how to get much data out
> (this could be possible, I haven't looked very closely). This approach,
> however, where the scheduling is done outside the daemon, makes the
> information more accessible.

That’s a worthy goal!  I’m not sure the coordinator is necessarily
helping directly there, because it’s another component (or two!) to set
up, in addition to ‘guix publish’ and something like Cuirass that
monitors a channel and builds it.

However I think it could be a way to restructure (guix scripts offload)
and/or Cuirass: they could talk to the coordinator instead of doing
their own thing.

In fact, I think it would be quite easy to reimplement (guix scripts
offload) using the coordinator (see ‘process-request’ for the protocol
with the daemon), and it would be interesting to see how that works.

> I think some of the design decisions here are quite short-sighted. I
> think it would be better if some of this functionality could be handled
> by the guix-daemon, especially things like providing information on
> builds that are going to happen, but haven't happened yet. Once there's
> a Guile implementation of the guix-daemon, hopefully some of this
> technical debt can be repaid.

Yup!

Lots of food for thought!  :-)

Thanks,
Ludo’.

* Re: Prototype tool for building derivations
  2020-04-17 20:22 Prototype tool for building derivations Christopher Baines
  2020-04-22 20:31 ` Ludovic Courtès
@ 2020-04-24 13:35 ` zimoun
  2020-04-26 20:08   ` Christopher Baines
  1 sibling, 1 reply; 5+ messages in thread
From: zimoun @ 2020-04-24 13:35 UTC
  To: Christopher Baines; +Cc: Guix Devel

Hi Chris,

On Fri, 17 Apr 2020 at 22:22, Christopher Baines <mail@cbaines.net> wrote:

> Just let me know if you have any questions or comments!

From what I understand of both your prototype and build systems, you
should be interested in reading this paper [1]: "Build systems à la carte:
Theory and practice". It needs some imagination if you are not
familiar with Haskell notation. And you should be interested in section
4 about Schedulers and section 5 about Rebuilders; especially Table 2
p.28, and also subsections 8.2 about Parallelism and 8.4 about Cloud
implementation.

[1] https://doi.org/10.1017/S0956796820000088


Thank you for the initiative and sharing your perspectives.

Cheers,
simon

* Re: Prototype tool for building derivations
  2020-04-22 20:31 ` Ludovic Courtès
@ 2020-04-26 19:59   ` Christopher Baines
  0 siblings, 0 replies; 5+ messages in thread
From: Christopher Baines @ 2020-04-26 19:59 UTC
  To: Ludovic Courtès; +Cc: guix-devel

Ludovic Courtès <ludo@gnu.org> writes:

> Hello!
>
> Christopher Baines <mail@cbaines.net> skribis:
>
>> In terms of getting to a point where Guix packages build reliably and
>> reproducibly, I think more testing is what's going to help. By taking
>> packages and building them more, on a wide variety of hardware and
>> software configurations, we'll get data on what works, what doesn't, and
>> where improvements and fixes can be made.
>>
>> It's very much a prototype, but I've pushed some code up here [1] now;
>> the README.org file [2] contains usage instructions as well as a
>> description of the architecture.
>>
>> 1: https://git.cbaines.net/guix/build-coordinator/
>> 2: https://git.cbaines.net/guix/build-coordinator/about/
>>
>> So far, I've mostly done the boring stuff, but I'm excited about what
>> this could support.
>
> Neat!  I like the architecture and the fact that it’s easy to track down
> what was built and where.  (guix scripts offload) currently doesn’t save
> that info.

Thanks :)

> A note about the usage as explained in the README: be sure to register
> GC roots for derivations before passing them around processes.  :-)

Yeah, there's a lot to be desired in terms of this kind of polish.

> Having an HTTP interface is really nice (I recently had someone ask
> whether one could coordinate offloading over HTTP rather than SSH; this
> is helpful in some contexts).
>
> That also makes me wonder whether we could implement some of the store
> RPCs over HTTP (I think Nix does something along these lines for
> substitutes nowadays).  Because the coordinator interface is in fact
> close to a subset of the daemon’s RPC interface.

I have been thinking about accessing the daemon using HTTP; however, I'm
still not sure if this would be better as a core feature, or a bridge
that talks to the daemon on one end, handles authentication and talks
HTTP on the other.

...

>> I think it would be good to get to a point where there are many different
>> individuals and groups providing independent sources of Guix packages,
>> such that users can have a high level of confidence that the substitutes
>> they're getting correspond to the source code. Getting there will be
>> easier if substitute servers are easy to operate, and part of that I
>> think comes down to how easy it is to see what's going on. With the
>> current daemon implementation, I'm not sure how to get much data out
>> (this could be possible, I haven't looked very closely). This approach,
>> however, where the scheduling is done outside the daemon, makes the
>> information more accessible.
>
> That’s a worthy goal!  I’m not sure the coordinator is necessarily
> helping directly there, because it’s another component (or two!) to set
> up, in addition to ‘guix publish’ and something like Cuirass that
> monitors a channel and builds it.

Indeed, there's a real risk in solving problems by building new things.

One thing I've been thinking about with the build-coordinator is that it
would be neat if it was easy to build around and extend. So I've added
some "hooks" which mean you can just add Guile code that runs when
builds succeed/fail or when an input can't be found.
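
To illustrate the hook idea in general terms, here is a small Python sketch of a hook registry. It is not the coordinator's API: the real hooks are Guile procedures, and the `Hooks` class, the event names other than build-success, and the build identifiers are assumptions made for this example.

```python
class Hooks:
    """Map event names to lists of callables that run when the event fires."""

    def __init__(self):
        self._handlers = {}

    def register(self, event, handler):
        self._handlers.setdefault(event, []).append(handler)

    def run(self, event, build_id):
        # Events with no registered handler are simply ignored.
        return [handler(build_id) for handler in self._handlers.get(event, [])]


hooks = Hooks()
log = []

# Hypothetical handlers: one could publish outputs, another report failures.
hooks.register("build-success", lambda build_id: log.append(("publish", build_id)))
hooks.register("build-failure", lambda build_id: log.append(("report", build_id)))

hooks.run("build-success", "build-1")
hooks.run("build-missing-inputs", "build-2")  # nothing registered, nothing runs
print(log)
```

Keeping the handlers outside the core means site-specific behaviour, such as publishing outputs, can be added without changing the scheduling code.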

I've been experimenting in the last few days with adding the necessary
options to the build-coordinator so that it can be used to build things
for providing substitutes. This is in the form of some options in the
build-coordinator, a script [3] to query data.guix.gnu.org for
derivations (this has downsides, but it was a simple place to start) and
a build-success hook [4] that performs a similar function to guix
publish, but without a daemon, just moving the nar file into the right
place, and generating a corresponding narinfo file.

3: https://git.cbaines.net/guix/build-coordinator/tree/scripts/guix-build-coordinator-queue-builds-from-guix-data-service.in
4: https://git.cbaines.net/guix/build-coordinator/tree/guix-build-coordinator/hooks.scm?id=43312f9d977aac0f35bc5ce9b63e81cd5116d980#n46

Now this approach is far less mature than Cuirass + guix publish, but I
think it has some advantages such as not requiring everything you want
to provide substitutes for to be in a single store. Additionally, while
you don't get the file level deduplication when you have compressed nar
files, I think that this approach will reduce the disk space
requirements compared to Cuirass + guix publish, as I think you'd
probably have the compressed nar plus the item in the store in that
case.

> However I think it could be a way to restructure (guix scripts offload)
> and/or Cuirass: they could talk to the coordinator instead of doing
> their own thing.

That's an interesting idea.

> In fact, I think it would be quite easy to reimplement (guix scripts
> offload) using the coordinator (see ‘process-request’ for the protocol
> with the daemon), and it would be interesting to see how that works.

Yeah, I'll take a look.

>> I think some of the design decisions here are quite short-sighted. I
>> think it would be better if some of this functionality could be handled
>> by the guix-daemon, especially things like providing information on
>> builds that are going to happen, but haven't happened yet. Once there's
>> a Guile implementation of the guix-daemon, hopefully some of this
>> technical debt can be repaid.
>
> Yup!
>
> Lots of food for thought!  :-)

Indeed :)

Thanks,

Chris

* Re: Prototype tool for building derivations
  2020-04-24 13:35 ` zimoun
@ 2020-04-26 20:08   ` Christopher Baines
  0 siblings, 0 replies; 5+ messages in thread
From: Christopher Baines @ 2020-04-26 20:08 UTC
  To: zimoun; +Cc: Guix Devel

zimoun <zimon.toutoune@gmail.com> writes:

> Hi Chris,
>
> On Fri, 17 Apr 2020 at 22:22, Christopher Baines <mail@cbaines.net> wrote:
>
>> Just let me know if you have any questions or comments!
>
> From what I understand of both your prototype and build systems, you
> should be interested in reading this paper [1]: "Build systems à la carte:
> Theory and practice". It needs some imagination if you are not
> familiar with Haskell notation. And you should be interested in section
> 4 about Schedulers and section 5 about Rebuilders; especially Table 2
> p.28, and also subsections 8.2 about Parallelism and 8.4 about Cloud
> implementation.
>
> [1] https://doi.org/10.1017/S0956796820000088
>
>
> Thank you for the initiative and sharing your perspectives.

Thanks, yeah, I'll have a read of the scheduling sections as that could
be useful!

Thanks,

Chris
