all messages for Guix-related lists mirrored at yhetil.org
* Reproducible and deterministic builds for GNU Guix (and Nix)
       [not found] ` <CAJ67tHrONL3_EMcfq+rPVx1k8RHMQc2ddBo4EuO6WYDqCwQSdw@mail.gmail.com>
@ 2016-04-08  6:59   ` Pjotr Prins
  2016-04-08  7:13     ` Erlang: " Pjotr Prins
  2016-04-08 15:06     ` Joe Armstrong
  0 siblings, 2 replies; 5+ messages in thread
From: Pjotr Prins @ 2016-04-08  6:59 UTC (permalink / raw)
  To: Magnus Henoch; +Cc: guix-devel, Erlang, Joe Armstrong

On Tue, Apr 05, 2016 at 02:07:46PM +0100, Magnus Henoch wrote:
>    Debian has included a patch that lets you use the environment variable
>    SOURCE_DATE_EPOCH to fix the compile time, and thus obtain identical
>    output (given the same compiler version and other things):
>    [1]https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=795834
> 
>    This was briefly discussed on this mailing list:
>    [2]http://erlang.org/pipermail/erlang-questions/2015-January/082699.html

You may have heard of GNU Guix, the modern (functional) package
manager of the GNU project. We are trying to add Erlang and Elixir to
Guix, but we are running into the problem that building the Erlang
compiler is not deterministic and therefore not reproducible, i.e. the
beam files contain time stamps. 

For normal software built by Erlang this can be overridden with
SOURCE_DATE_EPOCH (as per the Debian patch mentioned above), but for
the compiler itself we have not found how to do this.

Do you have a suggestion how to bootstrap the compiler with
SOURCE_DATE_EPOCH set or disable the time stamps? I am sure as a FP
compiler designer you can appreciate determinism. Because GNU Guix is
deterministic there is no need to keep track of time stamps. For hot
reloading we can assume the start of EPOCH will do the trick, right?
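
The fallback being asked for (use SOURCE_DATE_EPOCH when it is set, otherwise behave as before) can be sketched as follows. This is a minimal illustration in Python rather than Erlang, and it is not the Debian patch; the function names `build_timestamp` and `generated_at_line` are hypothetical.

```python
import os
import time
from datetime import datetime, timezone


def build_timestamp(environ=os.environ):
    """Return the time to embed in build output, in seconds since the epoch.

    SOURCE_DATE_EPOCH wins when it is set; otherwise fall back to the
    current time, which is the non-reproducible behaviour.
    """
    value = environ.get("SOURCE_DATE_EPOCH")
    if value is not None:
        return int(value)
    return int(time.time())


def generated_at_line(kind, environ=os.environ):
    """Format a '%% ... generated at' header line (hypothetical helper)."""
    stamp = datetime.fromtimestamp(build_timestamp(environ), tz=timezone.utc)
    return f"%% {kind} generated at {stamp:%Y-%m-%d %H:%M:%S} UTC"
```

With SOURCE_DATE_EPOCH=0 the header always reads as the start of the epoch, so two builds of the same input produce the same text.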

Pj.

On Mon, Apr 04, 2016 at 01:49:44PM -0400, Leo Famulari wrote:
> On Mon, Apr 04, 2016 at 12:50:12PM -0400, Leo Famulari wrote:
> > On Mon, Apr 04, 2016 at 10:28:02AM +0200, Pjotr Prins wrote:
> > > On Sun, Apr 03, 2016 at 11:39:24PM -0400, Leo Famulari wrote:
> > > > Debian's package exhibits this problem. The timestamps are generated in
> > > > the following places in the source code. I don't know how to approach
> > > > this problem.
> > > > 
> > > > lib/kernel/test/global_SUITE_data/global_trace.erl:    io:format("The trace was generated at ~p~n", [EndTime]),
> > > > lib/reltool/bin/reltool.escript:    lists:flatten(io_lib:format("%% ~s generated at ~w ~w\n~p.\n\n",
> > > > lib/reltool/src/reltool_server.erl:    IoList = io_lib:format("%% config generated at ~w ~w\n~p.\n\n",
> > > > lib/reltool/src/reltool_target.erl:    RelIoList = io_lib:format("%% rel generated at ~w ~w\n~p.\n\n",
> > > > lib/reltool/src/reltool_target.erl:    ScriptIoList = io_lib:format("%% script generated at ~w ~w\n~p.\n\n",
> > > > lib/reltool/src/reltool_target.erl:            AppIoList = io_lib:format("%% app generated at ~w ~w\n~p.\n\n",
> > > > lib/reltool/src/reltool_target.erl:            AppIoList = io_lib:format("%% app generated at ~w ~w\n~p.\n\n",
> > > > lib/runtime_tools/src/erts_alloc_config.erl:	"generated at ~w-~2..0w-~2..0w ~2..0w:~2..0w.~2..0w by "
> > > > lib/sasl/src/systools_make.erl:	    io:format(Fd, "%% script generated at ~w ~w\n~p.\n",
> > > > lib/wx/src/gen/gl.erl:%% The program object's information log is updated and the program is generated at the time
> > > 
> > > If there is no easy work around I suggest simply patching them. Fortunately
> > > the Erlang compiler does not change much at this level.
> > 
> > The ideal solution would be to use the value of the environment variable
> > SOURCE_DATE_EPOCH if it is set, and else to behave as it does now.
> > 
> > > We can also contact Joe Armstrong, the author of Erlang, to discuss
> > > this point. He appears to be approachable. I am sure he is open to
> > > the idea of deterministic builds in a deterministic build system ;)
> > 
> > I could go to the Erlang IRC channel or forums (whatever they use) and
> > ask for advice. Since you are actually using Erlang, I think you would
> > be the better person to contact Joe Armstrong himself, if we decide to
> > do that.
> 
> I presented the situation on IRC and it was recommended that I start the
> discussion on a mailing list.
> 
> I think that the erlang-questions list [0] could be a good place to
> start.
> 
> Pjotr, would you like to start the conversation? I can do it if you are
> too busy or something.
> 
> [0]
> http://www.erlang.org/community
> 

-- 

> 
>    Regards,
>    Magnus
> 
>    On Mon, Apr 4, 2016 at 8:59 PM, Joe Armstrong <[3]erlang@gmail.com> wrote:
> 
>      Hello,
> 
>      I think I've asked this before but cannot find the answer:
> 
>      I want the beam file produced by
> 
>        $ erlc file.erl
> 
>      to always have the same sha1 checksum - there was, if I remember
>      correctly, a hidden flag that removed the time of compilation etc from
>      the beam code. Any ideas how to do this?
> 
>      /Joe
>      _______________________________________________
>      erlang-questions mailing list
>      [4]erlang-questions@erlang.org
>      [5]http://erlang.org/mailman/listinfo/erlang-questions
> 
> References
> 
>    Visible links
>    1. https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=795834
>    2. http://erlang.org/pipermail/erlang-questions/2015-January/082699.html
>    3. mailto:erlang@gmail.com
>    4. mailto:erlang-questions@erlang.org
>    5. http://erlang.org/mailman/listinfo/erlang-questions



* Erlang: Reproducible and deterministic builds for GNU Guix (and Nix)
  2016-04-08  6:59   ` Reproducible and deterministic builds for GNU Guix (and Nix) Pjotr Prins
@ 2016-04-08  7:13     ` Pjotr Prins
  2016-04-08 15:06     ` Joe Armstrong
  1 sibling, 0 replies; 5+ messages in thread
From: Pjotr Prins @ 2016-04-08  7:13 UTC (permalink / raw)
  To: guix-devel

Sorry, the subject should have said Erlang.


* Re: Reproducible and deterministic builds for GNU Guix (and Nix)
  2016-04-08  6:59   ` Reproducible and deterministic builds for GNU Guix (and Nix) Pjotr Prins
  2016-04-08  7:13     ` Erlang: " Pjotr Prins
@ 2016-04-08 15:06     ` Joe Armstrong
  2016-04-08 19:46       ` Pjotr Prins
  1 sibling, 1 reply; 5+ messages in thread
From: Joe Armstrong @ 2016-04-08 15:06 UTC (permalink / raw)
  To: Pjotr Prins; +Cc: guix-devel, Leo Famulari, Erlang, Magnus Henoch

On Fri, Apr 8, 2016 at 8:59 AM, Pjotr Prins <pjotr.public12@thebird.nl> wrote:
> On Tue, Apr 05, 2016 at 02:07:46PM +0100, Magnus Henoch wrote:
>>    Debian has included a patch that lets you use the environment variable
>>    SOURCE_DATE_EPOCH to fix the compile time, and thus obtain identical
>>    output (given the same compiler version and other things):

This is very hacky. It might work by accident, but you'd want a
stronger guarantee: compiling the same file many times with the same
compiler should always give the same object code file.

As I see things this is crucial to making reproducible builds. Having the
compilation time in the beam file is really bad (I don't remember if I did this,
but if so I apologise) - it should not matter what time you compiled the file.
If you want this information, stick it in a log file, or somewhere else.

If the beam file is uniquely determined by the version of the compiler,
the source and the macro definitions used when it was compiled, then
we can use the SHA1 checksum of the beam file as a key, and inject the
code into a distributed hash table - this is the first step to making
a global revision control system with strong guarantees on version
consistency.
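
A minimal sketch of that idea, in Python for illustration: an in-memory dictionary stands in for the distributed hash table, and the class name `BlobStore` is hypothetical. It only works as a key scheme if the compiler output is byte-for-byte deterministic.

```python
import hashlib


class BlobStore:
    """In-memory stand-in for a distributed hash table keyed by SHA-1."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The key is derived from the content, so identical compiler
        # output always lands under the same key.
        key = hashlib.sha1(data).hexdigest()
        self._blobs[key] = data
        return key

    def get(self, key: str) -> bytes:
        data = self._blobs[key]
        # Re-hash on the way out: a reader can verify the blob against
        # the key without trusting whoever stored it.
        if hashlib.sha1(data).hexdigest() != key:
            raise ValueError("stored blob does not match its key")
        return data
```

Storing the same bytes twice returns the same key, which is what makes the checksum usable as a version identifier.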

I see absolutely no reason for the dozens of different version and
revision control systems that pollute the planet when all that is
needed is a DHT containing blobs identified by some checksum (like
SHA1 or something).

>>    [1]https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=795834
>>
>>    This was briefly discussed on this mailing list:
>>    [2]http://erlang.org/pipermail/erlang-questions/2015-January/082699.html
>
> You may have heard of GNU Guix, the modern (functional) package
> manager of the GNU project. We are trying to add Erlang and Elixir to
> Guix, but we are running into the problem that building the Erlang
> compiler is not deterministic and therefore not reproducible, i.e. the
> beam files contain time stamps.

Great - I've not looked at Guix but I've been following Nix - I've wanted
GitTorrent (= Git + BitTorrent) so both these seem like a step in the
right direction.

Re the time stamps - you can post-process the beam code to remove
the time stamps, but I'd like stronger guarantees than that.
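
A rough sketch of such post-processing, in Python for illustration. It assumes the beam file is an IFF-style container (a "FOR1" header, a big-endian length, the form type "BEAM", then chunks padded to four bytes) and that the compile time lives in a chunk named "CInf". Both are assumptions to check against the Erlang documentation, and a real tool would more likely be written in Erlang itself. Dropping the whole chunk also discards whatever else it holds.

```python
import struct


def iter_chunks(data: bytes):
    """Yield (chunk_id, payload) pairs from an IFF-style 'FOR1' container."""
    magic, _size, form = struct.unpack(">4sI4s", data[:12])
    if magic != b"FOR1" or form != b"BEAM":
        raise ValueError("not a FOR1/BEAM container")
    offset = 12
    while offset + 8 <= len(data):
        chunk_id, length = struct.unpack(">4sI", data[offset:offset + 8])
        yield chunk_id, data[offset + 8:offset + 8 + length]
        # Each chunk is padded to a multiple of four bytes.
        offset += 8 + (length + 3) // 4 * 4


def build_container(chunks) -> bytes:
    """Serialise (chunk_id, payload) pairs back into a container."""
    body = b"BEAM"
    for chunk_id, payload in chunks:
        padding = b"\0" * (-len(payload) % 4)
        body += struct.pack(">4sI", chunk_id, len(payload)) + payload + padding
    return b"FOR1" + struct.pack(">I", len(body)) + body


def strip_chunks(data: bytes, drop=(b"CInf",)) -> bytes:
    """Return a copy of the container without the named chunks."""
    kept = [(cid, p) for cid, p in iter_chunks(data) if cid not in drop]
    return build_container(kept)
```

Two containers that differ only in the dropped chunk become byte-identical after stripping, so their checksums agree.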

What would happen if two different implementations of a module
produced the same beam file? (I think just rearranging the comments
would achieve this, though I haven't tested it.) Should this be
allowed?

Personally, I think that the SHA of the source should be included in the
beam file, which will identify the code used to create the beam file.

For your purposes I can write a script to call erlc and strip out the
parts that make compilation non-reproducible. In the long term we should
discuss this, figure out what the correct thing to do is, and then do it.

Cheers

/Joe


* Re: Reproducible and deterministic builds for GNU Guix (and Nix)
  2016-04-08 15:06     ` Joe Armstrong
@ 2016-04-08 19:46       ` Pjotr Prins
  2016-04-08 19:52         ` Pjotr Prins
  0 siblings, 1 reply; 5+ messages in thread
From: Pjotr Prins @ 2016-04-08 19:46 UTC (permalink / raw)
  To: Joe Armstrong; +Cc: guix-devel, Leo Famulari, Erlang, Magnus Henoch

On Fri, Apr 08, 2016 at 05:06:49PM +0200, Joe Armstrong wrote:
> For your purposes I can write a script to call erlc and strip out the
> parts that make compilation non-reproducible. In the long term we should
> discuss this, figure out what the correct thing to do is, and then do it.

On the Erlang mailing list there is a discussion about removing time
stamps and creating a compile time switch for that. That would do, or
a post-processing script/hack would work for the time being - we can
easily incorporate that. The sooner we can do reproducible builds, the
sooner we can have Erlang and Elixir in Guix :)

BTW, having a SHA value inside beam/object files may sound useful, but
in the context of GNU Guix and Nix, compiled files are immutable,
i.e. we guarantee they cannot be modified (well, there may be a way
by overriding read-only permissions, but that would mean a built-in SHA can
also be changed). I guess you need it for hot reloading, but can't you
calculate the SHA on the fly for that? You'd need to open beam files
anyway to get the value(s) and they will be loaded by the OS unless
they are very large (> 8 MB). I'd go by file time stamps and, if they
differ, do a checksum. It only has to happen once for the original file.
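
The "file stamps first, checksum only if different" idea can be sketched like this, in Python for illustration. The class name `ChecksumCache` is hypothetical, and comparing the file size as well as the modification time is an extra assumption added here to catch rewrites that land within the same clock tick.

```python
import hashlib
import os


class ChecksumCache:
    """Recompute a file's SHA-1 only when its stat data has changed."""

    def __init__(self):
        self._entries = {}  # path -> (mtime_ns, size, sha1 hex digest)

    def checksum(self, path):
        st = os.stat(path)
        cached = self._entries.get(path)
        if cached and cached[0] == st.st_mtime_ns and cached[1] == st.st_size:
            # Cheap path: the file stamp is unchanged, so reuse the digest.
            return cached[2]
        with open(path, "rb") as handle:
            digest = hashlib.sha1(handle.read()).hexdigest()
        self._entries[path] = (st.st_mtime_ns, st.st_size, digest)
        return digest
```

The file is read once for the original checksum and again only after the stamp says it changed.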

Pj.


* Re: Reproducible and deterministic builds for GNU Guix (and Nix)
  2016-04-08 19:46       ` Pjotr Prins
@ 2016-04-08 19:52         ` Pjotr Prins
  0 siblings, 0 replies; 5+ messages in thread
From: Pjotr Prins @ 2016-04-08 19:52 UTC (permalink / raw)
  To: Pjotr Prins; +Cc: guix-devel, Leo Famulari, Erlang, Magnus Henoch

> also be changed). I guess you need it for hot reloading, but can't you
> calculate the SHA on the fly for that? You'd need to open beam files
> anyway to get the value(s) and they will be loaded by the OS unless

Aw, sorry Joe, my bad. The stored SHA value is calculated on the
original source file, so that is not opened. 

Guix/Nix calculate their SHA values over the source and build
environment and that path is stored in an immutable store. So, you
already have a strong guarantee that the sands don't shift under you. 
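
As a toy illustration of that principle (not the actual Guix or Nix algorithm, whose details differ), a store path can be derived from a hash over the source and a description of the build environment. The function name `store_path`, the `/store` root and the input names are all hypothetical.

```python
import hashlib
import json


def store_path(name, source: bytes, build_inputs: dict, root="/store"):
    """Derive an output path from the source and the build environment."""
    recipe = {
        "name": name,
        "source_sha256": hashlib.sha256(source).hexdigest(),
        "inputs": build_inputs,
    }
    # Sorting the keys gives a canonical serialisation, so the order in
    # which inputs were listed does not change the resulting path.
    canonical = json.dumps(recipe, sort_keys=True).encode("utf-8")
    digest = hashlib.sha256(canonical).hexdigest()[:32]
    return f"{root}/{digest}-{name}"
```

Any change to the source or to an input (for example a different compiler version) yields a different path, so an existing path never changes meaning.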

But then people will use Erlang outside GNU Guix - and there a SHA
value makes sense if you want to have a guarantee that it has the same
source. Much better than a time stamp.

Pj.
