all messages for Guix-related lists mirrored at yhetil.org
* Deterministic Library Calls when Building
@ 2016-03-20 10:04 Karl Semich
  2016-03-20 12:51 ` Thompson, David
  0 siblings, 1 reply; 5+ messages in thread
From: Karl Semich @ 2016-03-20 10:04 UTC
  To: guix-devel

Hi,

I recently learned about Guix, and I haven't found any information on
approaching deterministic builds by changing library and kernel
functions to have deterministic behavior.  Has anybody done this?

For example, I would imagine if I needed timestamps to no longer be a
factor, I might change how the current time is reported to the build
environment, such that it is always precisely equal to the time of
last modification of the source package.  Similarly /dev/*random
should return deterministic numbers seeded by perhaps the hash of the
source package and all dependencies.

Has there been a discussion of this somewhere?
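
To make the idea concrete, here is a minimal sketch in Python, under
stated assumptions: the fixed clock value is taken to be the newest file
modification time in an unpacked source tree, and the random seed is a
SHA-256 digest over the source hash and the dependency hashes.  The
function names `newest_mtime` and `derive_seed` are hypothetical and not
part of Guix; how a build environment would actually feed these values
to a build is left open.

```python
import hashlib
import os


def newest_mtime(source_dir: str) -> int:
    """Return the latest modification time (whole seconds) of any file under source_dir."""
    latest = 0
    for root, _dirs, files in os.walk(source_dir):
        for name in files:
            mtime = int(os.stat(os.path.join(root, name)).st_mtime)
            latest = max(latest, mtime)
    return latest


def derive_seed(source_hash: str, dependency_hashes: list) -> bytes:
    """Combine the source hash and the dependency hashes into one 32-byte seed."""
    digest = hashlib.sha256()
    digest.update(source_hash.encode())
    # Sort so that the order in which dependencies are listed does not matter.
    for dep in sorted(dependency_hashes):
        digest.update(b"\0" + dep.encode())
    return digest.digest()
```

Sorting the dependency hashes keeps the seed independent of the order in
which the dependencies happen to be listed.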

* Re: Deterministic Library Calls when Building
  2016-03-20 10:04 Deterministic Library Calls when Building Karl Semich
@ 2016-03-20 12:51 ` Thompson, David
  2016-03-20 16:53   ` Karl Semich
  0 siblings, 1 reply; 5+ messages in thread
From: Thompson, David @ 2016-03-20 12:51 UTC
  To: Karl Semich; +Cc: guix-devel

On Sun, Mar 20, 2016 at 6:04 AM, Karl Semich <fuzzytew@gmail.com> wrote:
> Hi,
>
> I recently learned about Guix, and I haven't found any information on
> approaching deterministic builds by changing library and kernel
> functions to have deterministic behavior.  Has anybody done this?
>
> For example, I would imagine if I needed timestamps to no longer be a
> factor, I might change how the current time is reported to the build
> environment, such that it is always precisely equal to the time of
> last modification of the source package.  Similarly /dev/*random
> should return deterministic numbers seeded by perhaps the hash of the
> source package and all dependencies.
>
> Has there been a discussion of this somewhere?

I'm not sure if there has been an on-the-record discussion of this
anywhere, but I have thought about similar things re: random numbers.
Maybe this thread is the place to discuss it? :)

- Dave

* Re: Deterministic Library Calls when Building
  2016-03-20 12:51 ` Thompson, David
@ 2016-03-20 16:53   ` Karl Semich
  2016-03-20 17:35     ` Jookia
  2016-03-20 21:05     ` Ludovic Courtès
  0 siblings, 2 replies; 5+ messages in thread
From: Karl Semich @ 2016-03-20 16:53 UTC
  To: Thompson, David; +Cc: guix-devel

It seems to me this would be the most reliable, future-proof way, but it
might have the downside of making it a step harder for people without the
special environment to reproduce the build.

I'm pretty new at looking under the hood of Linux, but I can imagine these
approaches at least:
- preload system library wrappers around key nondeterministic functions
- replace /dev/*random with fakes (could be named pipes, dummy devices fed
by modules, or just flat files!)
- replace system libraries with full-blown libraries with nondeterministic
calls rewritten (could merge changes upstream, provide a flag)
- create a kernel module which alters the behavior of the running kernel to
be more deterministic
- change the kernel itself to have a "deterministic mode" (could merge
upstream)

The goal of making packages deterministic would change from modifying the
packages themselves, to modifying the build environment, with the hope of
making a build environment that always creates deterministic builds for
normal software packages.  This should be very possible.

The approach of small library wrappers and/or replacing device files could
be pretty fast to implement, but not as "far thinking" as the other end of
the spectrum, where changes to glibc and Linux could be merged upstream.
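
The named-pipe variant of the fake /dev/*random could be sketched as
follows.  This is an illustration only, assuming a POSIX system with
`os.mkfifo`; the names `deterministic_stream`, `feed_fifo`, and
`read_fake_random` are made up, the hash-plus-counter generator is a
stand-in that is not suitable for real cryptographic use, and nothing
here shows how the pipe would be placed at /dev/urandom inside a build
environment.

```python
import hashlib
import os
import tempfile
import threading


def deterministic_stream(seed: bytes):
    """Yield an endless sequence of 32-byte blocks derived from seed and a counter."""
    counter = 0
    while True:
        yield hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1


def feed_fifo(path: str, seed: bytes) -> None:
    """Write deterministic bytes into the FIFO until the reader closes it."""
    # Opening for writing blocks until some process opens the pipe for reading.
    with open(path, "wb", buffering=0) as fifo:
        try:
            for block in deterministic_stream(seed):
                fifo.write(block)
        except BrokenPipeError:
            pass  # the reader went away; stop feeding


def read_fake_random(seed: bytes, count: int) -> bytes:
    """Create a FIFO, feed it from a background thread, and read count bytes back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "fake-urandom")
        os.mkfifo(path)
        feeder = threading.Thread(target=feed_fifo, args=(path, seed), daemon=True)
        feeder.start()
        with open(path, "rb") as fifo:
            data = fifo.read(count)
        feeder.join(timeout=5)
        return data
```

Two calls such as `read_fake_random(b"seed", 100)` with the same seed
return the same bytes, while the stream itself never runs dry the way a
flat file would.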

* Re: Deterministic Library Calls when Building
  2016-03-20 16:53   ` Karl Semich
@ 2016-03-20 17:35     ` Jookia
  2016-03-20 21:05     ` Ludovic Courtès
  1 sibling, 0 replies; 5+ messages in thread
From: Jookia @ 2016-03-20 17:35 UTC
  To: Karl Semich; +Cc: guix-devel

On Sun, Mar 20, 2016 at 12:53:42PM -0400, Karl Semich wrote:
> It seems to me this would be the most reliable, future-proof way, but it
> might have the downside of making it a step harder for people without the
> special environment to reproduce the build.
> 
> I'm pretty new at looking under the hood of Linux, but I can imagine these
> approaches at least:
> - preload system library wrappers around key nondeterministic functions
> - replace /dev/*random with fakes (could be named pipes, dummy devices fed
> by modules, or just flat files!)
> - replace system libraries with full-blown libraries with nondeterministic
> calls rewritten (could merge changes upstream, provide a flag)
> - create a kernel module which alters the behavior of the running kernel to
> be more deterministic
> - change the kernel itself to have a "deterministic mode" (could merge
> upstream)
> 
> The goal of making packages deterministic would change from modifying the
> packages themselves, to modifying the build environment, with the hope of
> making a build environment that always creates deterministic builds for
> normal software packages.  This should be very possible.
> 
> The approach of small library wrappers and/or replacing device files could
> be pretty fast to implement, but not as "far thinking" as the other end of
> the spectrum, where changes to glibc and Linux could be merged upstream.

I think this would only really be useful if the build environment could
detect when these nondeterministic sources or functions are being used, so
that the packages using them can be flagged for patching upstream.
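
Intercepting the calls themselves is one way to detect such use.  A
coarser check, which is a different technique from the one described
above, is to build twice and report which output files differ, so that
the packages producing them can be flagged.  A minimal sketch, assuming
two finished build trees on disk and using the hypothetical helper names
`hash_tree` and `differing_files`:

```python
import hashlib
import os


def hash_tree(root: str) -> dict:
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    hashes = {}
    for directory, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(directory, name)
            with open(full, "rb") as handle:
                content_hash = hashlib.sha256(handle.read()).hexdigest()
            hashes[os.path.relpath(full, root)] = content_hash
    return hashes


def differing_files(build_a: str, build_b: str) -> list:
    """Return the sorted relative paths that are missing from one tree or differ in content."""
    a, b = hash_tree(build_a), hash_tree(build_b)
    return sorted(path for path in a.keys() | b.keys() if a.get(path) != b.get(path))
```

An empty result means the two builds are byte-for-byte identical; any
path in the list points at a file worth inspecting.  The sketch compares
contents only and ignores metadata such as permissions and timestamps.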

* Re: Deterministic Library Calls when Building
  2016-03-20 16:53   ` Karl Semich
  2016-03-20 17:35     ` Jookia
@ 2016-03-20 21:05     ` Ludovic Courtès
  1 sibling, 0 replies; 5+ messages in thread
From: Ludovic Courtès @ 2016-03-20 21:05 UTC
  To: Karl Semich; +Cc: guix-devel

Hello!

Karl Semich <fuzzytew@gmail.com> skribis:

> I'm pretty new at looking under the hood of Linux, but I can imagine these
> approaches at least:
> - preload system library wrappers around key nondeterministic functions

Often, preloaded libraries are a bit fragile, in part because they need
to intercept a wide range of libc system call wrappers and to maintain
state.

An example is libfaketime, which people have tried to use in the context
of Debian’s reproducible builds effort.  In this particular case,
there’s the additional problem that tools that heavily rely on
timestamps, such as Make, break in unexpected ways in such environments.

> - replace /dev/*random with fakes (could be named pipes, dummy devices fed
> by modules, or just flat files!)

This one sounds like it could easily be implemented.

I think /dev/*random cannot be a flat file, because applications such as
test suites expect some sort of randomness, but it could be seeded with
a constant value.

In practice I don’t think we’ve ever identified a case where
/dev/*random was having a visible effect on build results.

> - replace system libraries with full-blown libraries with nondeterministic
> calls rewritten (could merge changes upstream, provide a flag)
> - create a kernel module which alters the behavior of the running kernel to
> be more deterministic
> - change the kernel itself to have a "deterministic mode" (could merge
> upstream)

In practice, the non-determinism issues we stumble upon are often either
“trivial” (e.g., a timestamp is stored somewhere) or super tricky
(e.g., the result depends on thread/process scheduling).

Because of that, I think that the changes that are easy to implement
would in fact be of little help, and that the changes that would be the
most helpful (e.g., deterministic scheduling) would take a lot of
effort and/or make the build environment much more complex.
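
A toy illustration of the scheduling case, not taken from any real
package: each thread records its result when it finishes, so the
resulting order follows the scheduler.  The name `build_objects` and the
`.o` suffix are invented for the example.  Short of deterministic
scheduling in the environment, the fix shown here (sorting) has to live
in the program itself.

```python
import threading


def build_objects(names, deterministic):
    """Simulate compiling each name in its own thread and return the resulting order."""
    finished = []
    lock = threading.Lock()

    def compile_one(name):
        with lock:
            finished.append(name + ".o")  # recorded in completion order

    threads = [threading.Thread(target=compile_one, args=(n,)) for n in names]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    # Completion order depends on how the threads were scheduled; sorting
    # removes that dependency, but only by changing the program itself.
    return sorted(finished) if deterministic else finished
```

With `deterministic=False` the list may come back in a different order
from one run to the next; with `deterministic=True` the order is fixed
regardless of scheduling.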

But I may well be too pessimistic, and I think it’s good to investigate
how we could improve things!

Thanks,
Ludo’.
