* Re: Removing compilers that cannot be bootstrapped
2016-03-21 17:54 Removing compilers that cannot be bootstrapped Thompson, David
@ 2016-03-21 19:15 ` Taylan Ulrich Bayırlı/Kammer
2016-03-21 19:22 ` Taylan Ulrich Bayırlı/Kammer
2016-03-21 19:32 ` Andreas Enge
2016-03-21 22:43 ` rain1
` (2 subsequent siblings)
3 siblings, 2 replies; 26+ messages in thread
From: Taylan Ulrich Bayırlı/Kammer @ 2016-03-21 19:15 UTC (permalink / raw)
To: Thompson, David; +Cc: guix-devel
"Thompson, David" <dthompson2@worcester.edu> writes:
> Haskell, OCaml, Chicken, and other compilers that we package have a
> serious issue that many of us are aware of: they cannot be built from
> source! They rely upon pre-built binaries of the same compiler. I
> understand that it's very inconvenient to not have these compilers
> available to us, and all of the software that is written in their
> respective languages, but I feel like all of our work is undermined by
> making exceptions for them. I would like to remove compilers that
> don't have a bunch of dependent packages yet such as Chicken until
> upstream fixes the issue. But we have tons of Haskell packages and a
> handful of OCaml packages and it would be heartbreaking to some to
> remove all of that hard work.
>
> What can we possibly do to avoid being yet another distro that relies
> on a bunch of blobs (leaving the *true* bootstrap binaries out of it
> for now)?
A while back Mark raised the idea of hosting one pre-compiled bootstrap
version of each such compiler, and use that to compile further versions.
This way the number of blobs is one per such compiler, instead of one
for every new version of each such compiler.
It seemed like a good medium-term solution to me. I'm not sure how it
would be implemented.
I think the original proposal had it that we keep an internal bootstrap
version of the package, which works by downloading a blob, and this is
used to compile the true, public package. However, we would need to
update the bootstrap version whenever it becomes too old to compile the
newest version. So new untrusted blobs enter the picture every once in
a while. Maybe frequently, if some of these compilers don't care much
about supporting the ability to be compiled with somewhat older versions
of themselves.
A solution to that in turn might be to keep a growing list of
intermediate versions of the compiler in addition to the bootstrap and
newest versions. The first version in this list can be compiled via the
bootstrap version, each nth version in the list can be compiled with the
n-1th version, and the last version in the list is used to compile the
current, newest version.
    (define %foo-compiler-bootstrap
      (package ... (source some-blob) ...))

    (define %foo-compiler-intermediate-versions-list
      '("1.1" "1.7" "2.2" ...))

    (define %foo-compiler-intermediate-versions
      (magic %foo-compiler-bootstrap
             %foo-compiler-intermediate-versions-list))

    (define foo-compiler
      (package
        ...
        (native-inputs
         `(("foo-compiler" ,(last %foo-compiler-intermediate-versions))
           ...))
        ...))
So when someone instructs guix to rebuild the world from scratch, it
downloads the bootstrap blob, then builds 1.1 with it, then builds 1.7
with that, then 2.2 with that, and so on, and ultimately the current
version.
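As a rough sketch of what `magic` could do, the chain can be built by folding over the version list. Everything below is made up for illustration: a "compiler" is just a two-element list here rather than a real Guix package, and `compiler-chain` stands in for `magic`.

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical stand-in for a package: a version string plus the
;; compiler that was used to build it.
(define (make-compiler version built-with)
  (list version built-with))
(define compiler-version car)
(define compiler-built-with cadr)

;; Build the chain in order: the first version is built with the
;; bootstrap blob, every later version with its predecessor.
(define (compiler-chain bootstrap versions)
  (reverse
   (fold (lambda (version chain)
           (cons (make-compiler version
                                (if (null? chain) bootstrap (car chain)))
                 chain))
         '()
         versions)))
```

With this, `(last (compiler-chain blob '("1.1" "1.7" "2.2")))` would be the 2.2 entry, built with 1.7, which is the one the public package would use as its native input.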
This way it's truly one blob for each such compiler in guix. And one
day, that blob can be replaced with a verified-safe one.
I just came up with this and haven't thought much about it. Just
throwing it out there.
> - Dave
Taylan
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Removing compilers that cannot be bootstrapped
2016-03-21 19:15 ` Taylan Ulrich Bayırlı/Kammer
@ 2016-03-21 19:22 ` Taylan Ulrich Bayırlı/Kammer
2016-03-21 19:32 ` Andreas Enge
1 sibling, 0 replies; 26+ messages in thread
From: Taylan Ulrich Bayırlı/Kammer @ 2016-03-21 19:22 UTC (permalink / raw)
To: Thompson, David; +Cc: guix-devel
taylanbayirli@gmail.com (Taylan Ulrich "Bayırlı/Kammer") writes:
> "Thompson, David" <dthompson2@worcester.edu> writes:
>
>> Haskell, OCaml, Chicken, and other compilers that we package have a
>> serious issue that many of us are aware of: they cannot be built from
>> source! They rely upon pre-built binaries of the same compiler. I
>> understand that it's very inconvenient to not have these compilers
>> available to us, and all of the software that is written in their
>> respective languages, but I feel like all of our work is undermined by
>> making exceptions for them. I would like to remove compilers that
>> don't have a bunch of dependent packages yet such as Chicken until
>> upstream fixes the issue. But we have tons of Haskell packages and a
>> handful of OCaml packages and it would be heartbreaking to some to
>> remove all of that hard work.
>>
>> What can we possibly do to avoid being yet another distro that relies
>> on a bunch of blobs (leaving the *true* bootstrap binaries out of it
>> for now)?
>
> A while back Mark raised the idea of hosting one pre-compiled bootstrap
> version of each such compiler, and use that to compile further versions.
I now found Mark's mail in the archive:
https://lists.gnu.org/archive/html/guix-devel/2015-02/msg00814.html
Taylan
* Re: Removing compilers that cannot be bootstrapped
2016-03-21 19:15 ` Taylan Ulrich Bayırlı/Kammer
2016-03-21 19:22 ` Taylan Ulrich Bayırlı/Kammer
@ 2016-03-21 19:32 ` Andreas Enge
1 sibling, 0 replies; 26+ messages in thread
From: Andreas Enge @ 2016-03-21 19:32 UTC (permalink / raw)
To: Taylan Ulrich Bayırlı/Kammer; +Cc: guix-devel
On Mon, Mar 21, 2016 at 08:15:34PM +0100, Taylan Ulrich Bayırlı/Kammer wrote:
> So when someone instructs guix to rebuild the world from scratch, it
> downloads the bootstrap blob, then builds 1.1 with it, then builds 1.7
> with that, then 2.2 with that, and so on, and ultimately the current
> version.
This is an interesting idea, but maybe not enough, assuming that the build
process requires additional inputs to build besides the bootstrap blob.
If we need additional libraries A to Z to build the compiler, then it may
be that 1.7 and 2.2 require different versions of these libraries. And of
course, each library may depend recursively on another set of libraries.
Ultimately, this might force us to keep a whole tree (well, DAG) of frozen
inputs for versions 1.7 and 2.2 to compile the current one. This might
quickly become completely unmanageable. (An example is the latest trial
of updating libreoffice, which was actually impossible given that libreoffice
itself and some of its inputs all depended on a certain library, but in
different versions).
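To make the concern concrete, here is a small sketch with invented data: the library names and version numbers are purely hypothetical, and only the compiler versions 1.7 and 2.2 come from the example above. It lists every distinct library version that would have to stay packaged.

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical pinned inputs for each compiler version in the chain,
;; written as (library . version) pairs.
(define %pinned-inputs
  '(("1.7" . (("libA" . "0.9") ("libB" . "2.0")))
    ("2.2" . (("libA" . "1.4") ("libB" . "2.0")))))

;; Every distinct (library . version) pair that must be kept around.
(define (frozen-inputs pinned)
  (delete-duplicates (append-map cdr pinned)))
```

Here libB is shared, but libA has to be kept in two versions, and the same would apply recursively to whatever libA itself depends on.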
Andreas
* Re: Removing compilers that cannot be bootstrapped
2016-03-21 17:54 Removing compilers that cannot be bootstrapped Thompson, David
2016-03-21 19:15 ` Taylan Ulrich Bayırlı/Kammer
@ 2016-03-21 22:43 ` rain1
2016-03-22 16:23 ` Ludovic Courtès
2016-03-21 22:48 ` Ludovic Courtès
2016-03-26 6:51 ` John Darrington
3 siblings, 1 reply; 26+ messages in thread
From: rain1 @ 2016-03-21 22:43 UTC (permalink / raw)
To: guix-devel
On 2016-03-21 17:54, Thompson, David wrote:
> Haskell, OCaml, Chicken, and other compilers that we package have a
> serious issue that many of us are aware of: they cannot be built from
> source! They rely upon pre-built binaries of the same compiler. I
> understand that it's very inconvenient to not have these compilers
> available to us, and all of the software that is written in their
> respective languages, but I feel like all of our work is undermined by
> making exceptions for them. I would like to remove compilers that
> don't have a bunch of dependent packages yet such as Chicken until
> upstream fixes the issue. But we have tons of Haskell packages and a
> handful of OCaml packages and it would be heartbreaking to some to
> remove all of that hard work.
>
> What can we possibly do to avoid being yet another distro that relies
> on a bunch of blobs (leaving the *true* bootstrap binaries out of it
> for now)?
>
> - Dave
I would like to propose one idea to manage this: What about adding a
field to the system configuration for a list of 'trusted-binaries'?
guix requires gcc binaries (and some other things) to bootstrap off so
these things are all implicitly trusted binaries. My suggestion is to
make that list explicit and allow people to add things like the ocaml
compiler to it.
Trying to install the ocaml compiler could give an error about an
untrusted binary unless the user has added that to their system
configuration.
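A minimal sketch of the check, assuming the trusted binaries are kept as a plain list of names. The names `%trusted-binaries`, `untrusted-binaries`, `check-installable` and the binary names themselves are invented for illustration; none of this is an existing Guix interface.

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical list of binaries the user has chosen to trust.
(define %trusted-binaries '("gcc-bootstrap" "ocaml-bootstrap"))

;; Return the required binaries that are missing from TRUSTED.
(define (untrusted-binaries required trusted)
  (remove (lambda (binary) (member binary trusted)) required))

;; Signal an error if any required binary is untrusted;
;; return #t otherwise.
(define (check-installable required trusted)
  (let ((missing (untrusted-binaries required trusted)))
    (if (null? missing)
        #t
        (error "untrusted binaries:" missing))))
```

Installing a package would then call something like `check-installable` with the bootstrap binaries found in its dependency graph.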
* Re: Removing compilers that cannot be bootstrapped
2016-03-21 17:54 Removing compilers that cannot be bootstrapped Thompson, David
2016-03-21 19:15 ` Taylan Ulrich Bayırlı/Kammer
2016-03-21 22:43 ` rain1
@ 2016-03-21 22:48 ` Ludovic Courtès
2016-03-22 9:56 ` Jookia
` (2 more replies)
2016-03-26 6:51 ` John Darrington
3 siblings, 3 replies; 26+ messages in thread
From: Ludovic Courtès @ 2016-03-21 22:48 UTC (permalink / raw)
To: Thompson, David; +Cc: guix-devel
"Thompson, David" <dthompson2@worcester.edu> skribis:
> Haskell, OCaml, Chicken, and other compilers that we package have a
> serious issue that many of us are aware of: they cannot be built from
> source!
(And GCC, but let’s put it aside for now.)
> They rely upon pre-built binaries of the same compiler. I understand
> that it's very inconvenient to not have these compilers available to
> us, and all of the software that is written in their respective
> languages, but I feel like all of our work is undermined by making
> exceptions for them. I would like to remove compilers that don't have
> a bunch of dependent packages yet such as Chicken until upstream fixes
> the issue. But we have tons of Haskell packages and a handful of
> OCaml packages and it would be heartbreaking to some to remove all of
> that hard work.
I definitely sympathize with your concerns, and also in the case of
whole-distro bootstrapping.
However, removing things seems really harsh, and also sidestepping the
problem (not to mention that once we’d done that, we couldn’t ignore
GCC’s bootstrapping.)
taylanbayirli@gmail.com (Taylan Ulrich "Bayırlı/Kammer") skribis:
> A while back Mark raised the idea of hosting one pre-compiled bootstrap
> version of each such compiler, and use that to compile further versions.
>
> This way the number of blobs is one per such compiler, instead of one
> for every new version of each such compiler.
>
> It seemed like a good medium-term solution to me. I'm not sure how it
> would be implemented.
I like the idea.
Often, in their implementation history, compilers are bootstrapped from
something else initially, and only later do they become self-hosted and
unbootstrappable.
So in theory, it’d be possible to find, say, an old-enough GHC that only
requires a C compiler (?), and use that to build the next version and so
on, until we reach the latest version. I suspect the same applies to
many compilers.
This is technically possible. The main difficulty is to find what exact
chain of compiler versions will work, and then to make sure that the
super-old compilers can build. The risk, as Andreas suggests, is that
maintaining those old versions will require dragging a whole graph of
old dependencies, recursively.
But really, we won’t know until we’ve actually tried ;-), and it’ll
be different for each compiler.
I would suggest that people pick a compiler they’re more or less
familiar with, and give it a try. MIT/GNU Scheme might be a good start,
since we should be able to talk with the folks behind it and reach
mutual understanding. ;-)
For GCC, an idea discussed at
<https://reproducible-builds.org/events/athens2015/bootstrapping/> would
be to build GCC 4.7 (the last version written in plain C) with something
more auditable like TinyCC, and then use this g++ 4.7 to build whatever
GCC version we want. Again, sounds like it should work, but we need to
actually try.
Thoughts?
BTW, the “good news” is that more and more compilers build upon LLVM,
and for those there’s no bootstrapping problem if we take the C++
compiler for granted.
Ludo’.
* Re: Removing compilers that cannot be bootstrapped
2016-03-21 22:48 ` Ludovic Courtès
@ 2016-03-22 9:56 ` Jookia
2016-03-22 16:25 ` Ludovic Courtès
2016-03-22 14:57 ` Eric Bavier
2016-03-22 22:29 ` Christopher Allan Webber
2 siblings, 1 reply; 26+ messages in thread
From: Jookia @ 2016-03-22 9:56 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guix-devel
On Mon, Mar 21, 2016 at 11:48:40PM +0100, Ludovic Courtès wrote:
> Often, in their implementation history, compilers are bootstrapped from
> something else initially, and only later do they become self-hosted and
> unbootstrappable.
>
> So in theory, it’d be possible to find, say, an old-enough GHC that only
> requires a C compiler (?), and use that to build the next version and so
> on, until we reach the latest version. I suspect the same applies to
> many compilers.
I'm not sure about this. Bootstrapping older compilers means there's often less
support for the platform you're on, which means we'll end up in a situation
where we're bootstrapping from machines and cross-compiling, and I foresee the
problem being that we'll have to rely on nonfree code or machines as our huge
backhaul in a decade where we're on some cool free hardware RISC architecture.
For instance, to run GHC on ARM you can only use a recent GHC, all the old
versions didn't support it. Sure you could go from C to get an old GHC on ARM,
but it wouldn't have support for outputting ARM assembly.
> This is technically possible. The main difficulty is to find what exact
> chain of compiler versions will work, and then to make sure that the
> super-old compilers can build. The risk, as Andreas suggests, is that
> maintaining those old versions will require dragging a whole graph of
> old dependencies, recursively.
Guix could be well suited to this with its ability to create environments with
bitrotted tools.
> But really, we won’t know until we’ve actually tried ;-), and it’ll
> be different for each compiler.
>
> I would suggest that people pick a compiler they’re more or less
> familiar with, and give it a try. MIT/GNU Scheme might be a good start,
> since we should be able to talk with the folks behind it and reach
> mutual understanding. ;-)
>
> For GCC, an idea discussed at
> <https://reproducible-builds.org/events/athens2015/bootstrapping/> would
> be to build GCC 4.7 (the last version written in plain C) with something
> more auditable like TinyCC, and then use this g++ 4.7 to build whatever
> GCC version we want. Again, sounds like it should work, but we need to
> actually try.
Sounds interesting, and even better if we could compile the rest of the
bootstrap with just TinyCC.
> Thoughts?
Many! :)
> BTW, the “good news” is that more and more compilers build upon LLVM,
> and for those there’s no bootstrapping problem if we take the C++
> compiler for granted.
Is this true? I know a lot of compilers *use* LLVM as a backend, but not sure
about their frontends.
That said, LLVM's a bit of a problem in itself, to the point GHC has decided to
fix a lot of their issues with the ARM platform by bundling a LLVM version to be
consistent across distros. I think there's also been talk of taking advantage of
this to remove useless phases or add Haskell-specific ones to LLVM, so we'd end
up with LLVM just being another part of the toolchain. This is a bit off topic though
since it doesn't relate to bootstrapping but to bundling. ;)
> Ludo’.
Jookia.
* Re: Removing compilers that cannot be bootstrapped
2016-03-22 9:56 ` Jookia
@ 2016-03-22 16:25 ` Ludovic Courtès
0 siblings, 0 replies; 26+ messages in thread
From: Ludovic Courtès @ 2016-03-22 16:25 UTC (permalink / raw)
To: Jookia; +Cc: guix-devel
Jookia <166291@gmail.com> skribis:
> On Mon, Mar 21, 2016 at 11:48:40PM +0100, Ludovic Courtès wrote:
>> Often, in their implementation history, compilers are bootstrapped from
>> something else initially, and only later do they become self-hosted and
>> unbootstrappable.
>>
>> So in theory, it’d be possible to find, say, an old-enough GHC that only
>> requires a C compiler (?), and use that to build the next version and so
>> on, until we reach the latest version. I suspect the same applies to
>> many compilers.
>
> I'm not sure about this. Bootstrapping older compilers means there's often less
> support for the platform you're on, which means we'll end up in a situation
> where we're bootstrapping from machines and cross-compiling, and I foresee the
> problem being that we'll have to rely on nonfree code or machines as our huge
> backhaul in a decade where we're on some cool free hardware RISC architecture.
>
> For instance, to run GHC on ARM you can only use a recent GHC, all the old
> versions didn't support it. Sure you could go from C to get an old GHC on ARM,
> but it wouldn't have support for outputting ARM assembly.
Good point, indeed.
>> For GCC, an idea discussed at
>> <https://reproducible-builds.org/events/athens2015/bootstrapping/> would
>> be to build GCC 4.7 (the last version written in plain C) with something
>> more auditable like TinyCC, and then use this g++ 4.7 to build whatever
>> GCC version we want. Again, sounds like it should work, but we need to
>> actually try.
>
> Sounds interesting, and even better if we could compile the rest of the
> bootstrap with just TinyCC.
Yes, though we need a C++ compiler anyway to build current GCCs.
>> BTW, the “good news” is that more and more compilers build upon LLVM,
>> and for those there’s no bootstrapping problem if we take the C++
>> compiler for granted.
>
> Is this true? I know a lot of compilers *use* LLVM as a backend, but not sure
> about their frontends.
Right, it may be that front-ends are still mostly written in the target
language.
Ludo’.
* Re: Removing compilers that cannot be bootstrapped
2016-03-21 22:48 ` Ludovic Courtès
2016-03-22 9:56 ` Jookia
@ 2016-03-22 14:57 ` Eric Bavier
2016-03-22 16:22 ` Ludovic Courtès
2016-03-22 22:29 ` Christopher Allan Webber
2 siblings, 1 reply; 26+ messages in thread
From: Eric Bavier @ 2016-03-22 14:57 UTC (permalink / raw)
To: ludo; +Cc: guix-devel, guix-devel-bounces+ericbavier=openmailbox.org
On 2016-03-21 17:48, ludo@gnu.org wrote:
> taylanbayirli@gmail.com (Taylan Ulrich "Bayırlı/Kammer") skribis:
>
>> A while back Mark raised the idea of hosting one pre-compiled
>> bootstrap
>> version of each such compiler, and use that to compile further
>> versions.
>>
>> This way the number of blobs is one per such compiler, instead of one
>> for every new version of each such compiler.
>>
>> It seemed like a good medium-term solution to me. I'm not sure how it
>> would be implemented.
>
> I like the idea.
>
> Often, in their implementation history, compilers are bootstrapped from
> something else initially, and only later do they become self-hosted and
> unbootstrappable.
>
> So in theory, it’d be possible to find, say, an old-enough GHC that
> only
> requires a C compiler (?), and use that to build the next version and
> so
> on, until we reach the latest version. I suspect the same applies to
> many compilers.
>
> This is technically possible. The main difficulty is to find what
> exact
> chain of compiler versions will work, and then to make sure that the
> super-old compilers can build. The risk, as Andreas suggests, is that
> maintaining those old versions will require dragging a whole graph of
> old dependencies, recursively.
>
> But really, we won’t know until we’ve actually tried ;-), and it’ll
> be different for each compiler.
My initial attempt at packaging GHC before our current package went this
route, ultimately to no avail. The earliest publicly-available GHC
source tarball is indeed "just C", but it is machine-generated C code
much like Chicken Scheme's.
Some software archeology might be able to produce a still earlier
version of GHC from a developer's private collection.
--
`~Eric
* Re: Removing compilers that cannot be bootstrapped
2016-03-22 14:57 ` Eric Bavier
@ 2016-03-22 16:22 ` Ludovic Courtès
0 siblings, 0 replies; 26+ messages in thread
From: Ludovic Courtès @ 2016-03-22 16:22 UTC (permalink / raw)
To: Eric Bavier; +Cc: guix-devel, guix-devel-bounces+ericbavier=openmailbox.org
Eric Bavier <ericbavier@openmailbox.org> skribis:
> On 2016-03-21 17:48, ludo@gnu.org wrote:
>> taylanbayirli@gmail.com (Taylan Ulrich "Bayırlı/Kammer") skribis:
>>
>>> A while back Mark raised the idea of hosting one pre-compiled
>>> bootstrap
>>> version of each such compiler, and use that to compile further
>>> versions.
>>>
>>> This way the number of blobs is one per such compiler, instead of one
>>> for every new version of each such compiler.
>>>
>>> It seemed like a good medium-term solution to me. I'm not sure how it
>>> would be implemented.
>>
>> I like the idea.
>>
>> Often, in their implementation history, compilers are bootstrapped from
>> something else initially, and only later do they become self-hosted and
>> unbootstrappable.
>>
>> So in theory, it’d be possible to find, say, an old-enough GHC that
>> only
>> requires a C compiler (?), and use that to build the next version
>> and so
>> on, until we reach the latest version. I suspect the same applies to
>> many compilers.
>>
>> This is technically possible. The main difficulty is to find what
>> exact
>> chain of compiler versions will work, and then to make sure that the
>> super-old compilers can build. The risk, as Andreas suggests, is that
>> maintaining those old versions will require dragging a whole graph of
>> old dependencies, recursively.
>>
>> But really, we won’t know until we’ve actually tried ;-), and it’ll
>> be different for each compiler.
>
> My initial attempt at packaging GHC before our current package went
> this route, ultimately to no avail. The earliest publicly-available
> GHC source tarball is indeed "just C", but it is machine-generated C
> code much like Chicken Scheme's.
Oh, good to hear that you tried and failed.
It sounds like GHC bootstrapping will be very hard to address.
Ludo’.
* Re: Removing compilers that cannot be bootstrapped
2016-03-21 22:48 ` Ludovic Courtès
2016-03-22 9:56 ` Jookia
2016-03-22 14:57 ` Eric Bavier
@ 2016-03-22 22:29 ` Christopher Allan Webber
2016-03-23 22:12 ` Ludovic Courtès
2 siblings, 1 reply; 26+ messages in thread
From: Christopher Allan Webber @ 2016-03-22 22:29 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guix-devel
Ludovic Courtès writes:
> "Thompson, David" <dthompson2@worcester.edu> skribis:
>
>> Haskell, OCaml, Chicken, and other compilers that we package have a
>> serious issue that many of us are aware of: they cannot be built from
>> source!
>
> (And GCC, but let’s put it aside for now.)
>
>> They rely upon pre-built binaries of the same compiler. I understand
>> that it's very inconvenient to not have these compilers available to
>> us, and all of the software that is written in their respective
>> languages, but I feel like all of our work is undermined by making
>> exceptions for them. I would like to remove compilers that don't have
>> a bunch of dependent packages yet such as Chicken until upstream fixes
>> the issue. But we have tons of Haskell packages and a handful of
>> OCaml packages and it would be heartbreaking to some to remove all of
>> that hard work.
>
> I definitely sympathize with your concerns, and also in the case of
> whole-distro bootstrapping.
>
> However, removing things seems really harsh, and also sidestepping the
> problem (not to mention that once we’d done that, we couldn’t ignore
> GCC’s bootstrapping.)
I agree that removing things seems really harsh... I'd even say too
harsh, in the case of Haskell at least. I'd really like Guix to be a
good long-term solution for Haskell people.
> taylanbayirli@gmail.com (Taylan Ulrich "Bayırlı/Kammer") skribis:
>
>> A while back Mark raised the idea of hosting one pre-compiled bootstrap
>> version of each such compiler, and use that to compile further versions.
>>
>> This way the number of blobs is one per such compiler, instead of one
>> for every new version of each such compiler.
>>
>> It seemed like a good medium-term solution to me. I'm not sure how it
>> would be implemented.
>
> I like the idea.
It sounds good to me too.
Let me give an even shorter-term solution: maybe there is a way to mark
things as risky from a trust perspective when it comes to bootstrapping?
Maybe we could do something like:
    (define-public ghc
      (package
        (name "ghc")
        (version "7.10.2")
        ;; [... bla bla ...]
        (properties '(("bootstrap-untrusted" #t)))))
... or bootstrap-risky, or etc.
This can allow us to move forward with these languages for now while
leaving clear documentation and a way to check trust via the dependency
hierarchy.
Obviously we want there to be no bootstrap-untrusted, and we can work
towards that...
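Checking trust through the dependency hierarchy could look roughly like the sketch below. It uses toy package lists instead of real Guix records, and writes the property as a symbol-keyed alist entry; both choices, and all the names, are assumptions made for the example.

```scheme
(use-modules (srfi srfi-1))

;; Toy package: (name properties inputs).  Not the Guix <package> record.
(define (make-pkg name properties inputs)
  (list name properties inputs))
(define pkg-name first)
(define pkg-properties second)
(define pkg-inputs third)

;; True if PKG itself carries the hypothetical marker.
(define (bootstrap-untrusted? pkg)
  (assq-ref (pkg-properties pkg) 'bootstrap-untrusted))

;; Names of all marked packages in PKG's dependency graph,
;; PKG included.
(define (untrusted-closure pkg)
  (delete-duplicates
   (append (if (bootstrap-untrusted? pkg) (list (pkg-name pkg)) '())
           (append-map untrusted-closure (pkg-inputs pkg)))))
```

A warning, or a stricter install mode, could then be driven by whether `untrusted-closure` returns an empty list for the package being installed.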
- Chris
* Re: Removing compilers that cannot be bootstrapped
2016-03-22 22:29 ` Christopher Allan Webber
@ 2016-03-23 22:12 ` Ludovic Courtès
2016-03-23 22:49 ` Christopher Allan Webber
0 siblings, 1 reply; 26+ messages in thread
From: Ludovic Courtès @ 2016-03-23 22:12 UTC (permalink / raw)
To: Christopher Allan Webber; +Cc: guix-devel
Christopher Allan Webber <cwebber@dustycloud.org> skribis:
> Let me give an even shorter-term solution: maybe there is a way to mark
> things as risky from a trust perspective when it comes to bootstrapping?
> Maybe we could do something like:
>
> (define-public ghc
> (package
> (name "ghc")
> (version "7.10.2")
> ;; [... bla bla ...]
> (properties '(("bootstrap-untrusted" #t)))))
Why not, but what would be the corresponding warning, and the expected
effect?
On one hand, a warning might annoy people since there’s nothing they can
do; on the other hand, it can help raise awareness.
Thoughts?
Ludo’.
* Re: Removing compilers that cannot be bootstrapped
2016-03-23 22:12 ` Ludovic Courtès
@ 2016-03-23 22:49 ` Christopher Allan Webber
2016-03-24 3:11 ` Leo Famulari
0 siblings, 1 reply; 26+ messages in thread
From: Christopher Allan Webber @ 2016-03-23 22:49 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guix-devel
Ludovic Courtès writes:
> Christopher Allan Webber <cwebber@dustycloud.org> skribis:
>
>> Let me give an even shorter-term solution: maybe there is a way to mark
>> things as risky from a trust perspective when it comes to bootstrapping?
>> Maybe we could do something like:
>>
>> (define-public ghc
>> (package
>> (name "ghc")
>> (version "7.10.2")
>> ;; [... bla bla ...]
>> (properties '(("bootstrap-untrusted" #t)))))
>
> Why not, but what would be the corresponding warning, and the expected
> effect?
A warning, or maybe even also a:
guix package -i foo --only-reproducible
which could error?
> On one hand, a warning might annoy people since there’s nothing they can
> do; on the other hand, it can help raise awareness.
>
> Thoughts?
>
> Ludo’.
* Re: Removing compilers that cannot be bootstrapped
2016-03-23 22:49 ` Christopher Allan Webber
@ 2016-03-24 3:11 ` Leo Famulari
2016-03-25 23:08 ` Ludovic Courtès
0 siblings, 1 reply; 26+ messages in thread
From: Leo Famulari @ 2016-03-24 3:11 UTC (permalink / raw)
To: Christopher Allan Webber; +Cc: guix-devel
On Wed, Mar 23, 2016 at 03:49:33PM -0700, Christopher Allan Webber wrote:
> Ludovic Courtès writes:
>
> > Christopher Allan Webber <cwebber@dustycloud.org> skribis:
> >
> >> Let me give an even shorter-term solution: maybe there is a way to mark
> >> things as risky from a trust perspective when it comes to bootstrapping?
> >> Maybe we could do something like:
> >>
> >> (define-public ghc
> >> (package
> >> (name "ghc")
> >> (version "7.10.2")
> >> ;; [... bla bla ...]
> >> (properties '(("bootstrap-untrusted" #t)))))
> >
> > Why not, but what would be the corresponding warning, and the expected
> > effect?
>
> A warning, or maybe even also a:
>
> guix package -i foo --only-reproducible
>
> which could error?
If we decide to do something like that, we should decide if we want the
word 'reproducible' to mean bit-for-bit reproducibility.
Personally, I think use of that word should include that meaning.
>
> > On one hand, a warning might annoy people since there’s nothing they can
> > do; on the other hand, it can help raise awareness.
> >
> > Thoughts?
> >
> > Ludo’.
>
>
* Re: Removing compilers that cannot be bootstrapped
2016-03-24 3:11 ` Leo Famulari
@ 2016-03-25 23:08 ` Ludovic Courtès
2016-03-26 0:22 ` Leo Famulari
0 siblings, 1 reply; 26+ messages in thread
From: Ludovic Courtès @ 2016-03-25 23:08 UTC (permalink / raw)
To: Leo Famulari; +Cc: guix-devel
Leo Famulari <leo@famulari.name> skribis:
> On Wed, Mar 23, 2016 at 03:49:33PM -0700, Christopher Allan Webber wrote:
>> Ludovic Courtès writes:
>>
>> > Christopher Allan Webber <cwebber@dustycloud.org> skribis:
>> >
>> >> Let me give an even shorter-term solution: maybe there is a way to mark
>> >> things as risky from a trust perspective when it comes to bootstrapping?
>> >> Maybe we could do something like:
>> >>
>> >> (define-public ghc
>> >> (package
>> >> (name "ghc")
>> >> (version "7.10.2")
>> >> ;; [... bla bla ...]
>> >> (properties '(("bootstrap-untrusted" #t)))))
>> >
>> > Why not, but what would be the corresponding warning, and the expected
>> > effect?
>>
>> A warning, or maybe even also a:
>>
>> guix package -i foo --only-reproducible
>>
>> which could error?
Hmm or --only-traceable?
> If we decide to do something like that, we should decide if we want the
> word 'reproducible' to mean bit-for-bit reproducibility.
The problem is that big binary blobs like GHC’s are necessarily
bit-for-bit reproducible. :-)
Ludo’.
* Re: Removing compilers that cannot be bootstrapped
2016-03-25 23:08 ` Ludovic Courtès
@ 2016-03-26 0:22 ` Leo Famulari
2016-03-26 6:40 ` Chris Marusich
0 siblings, 1 reply; 26+ messages in thread
From: Leo Famulari @ 2016-03-26 0:22 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guix-devel
On Sat, Mar 26, 2016 at 12:08:09AM +0100, Ludovic Courtès wrote:
> Leo Famulari <leo@famulari.name> skribis:
>
> > On Wed, Mar 23, 2016 at 03:49:33PM -0700, Christopher Allan Webber wrote:
> >> Ludovic Courtès writes:
> >>
> >> > Christopher Allan Webber <cwebber@dustycloud.org> skribis:
> >> >
> >> >> Let me give an even shorter-term solution: maybe there is a way to mark
> >> >> things as risky from a trust perspective when it comes to bootstrapping?
> >> >> Maybe we could do something like:
> >> >>
> >> >> (define-public ghc
> >> >> (package
> >> >> (name "ghc")
> >> >> (version "7.10.2")
> >> >> ;; [... bla bla ...]
> >> >> (properties '(("bootstrap-untrusted" #t)))))
> >> >
> >> > Why not, but what would be the corresponding warning, and the expected
> >> > effect?
> >>
> >> A warning, or maybe even also a:
> >>
> >> guix package -i foo --only-reproducible
> >>
> >> which could error?
>
> Hmm or --only-traceable?
>
> > If we decide to do something like that, we should decide if we want the
> > word 'reproducible' to mean bit-for-bit reproducibility.
>
> The problem is that big binary blobs like GHC’s are necessarily
> bit-for-bit reproducible. :-)
`wget https://blob` doesn't count as reproducible :)
Another useful word could be 'deterministic'.
>
> Ludo’.
* Re: Removing compilers that cannot be bootstrapped
2016-03-26 0:22 ` Leo Famulari
@ 2016-03-26 6:40 ` Chris Marusich
2016-03-26 6:55 ` Chris Marusich
2016-03-26 8:12 ` Ricardo Wurmus
0 siblings, 2 replies; 26+ messages in thread
From: Chris Marusich @ 2016-03-26 6:40 UTC (permalink / raw)
To: Leo Famulari; +Cc: guix-devel
Leo Famulari <leo@famulari.name> writes:
> `wget https://blob` doesn't count as reproducible :)
Very true.
Self-hosting compilers are a cute trick, but they're a far cry from
being reproducible. They're just inscrutable binary blobs. If we want
true reproducibility from the bottom up, then it seems like the only way
to do it is via a strategy like the following:
1) Write the simplest possible program (or collection of programs) in
the simplest possible machine code. This program serves only one
purpose: to enable you to write more code at a higher level of
abstraction. It is effectively a compiler for a very primitive
language, but the language it compiles will be one layer of abstraction
above machine code, which is a step in the right direction. This first
program must be a "binary blob", since we cannot rely on any existing
tools to build it. It must be simple enough that someone can read and
understand it using e.g. a hex editor, provided that they have access to
the right reference materials. Since this program exists only as
machine code, it must be documented thoroughly to make it easier to
understand.
2) Write source code which, when compiled using the compiler/toolchain
from the previous step, produces a new compiler/toolchain that will
allow you to write more expressive source code at a higher layer of
abstraction.
3) Repeat step (2) as many times as necessary to produce a compiler that
is capable of compiling GCC from source.
4) Use the compiler from (3) to compile GCC from source.
5) Use the GCC from (4) to compile the rest of the world from source.
If we want to free ourselves from reliance on inscrutable binary blobs,
isn't something like that the only way?
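As a rough illustration of steps 1 to 5, the chain can be modelled as a list of stages, each built with a tool from an earlier stage. This is a sketch under assumptions: the stage names are invented, and `audit` is a hypothetical helper that only checks the shape of the chain, not the contents of any binary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stage:
    name: str
    built_with: Optional[str]  # None: hand-written machine code, no tool needed


# Hypothetical chain; all stage names are made up for illustration.
CHAIN = [
    Stage("hex-seed", None),               # step 1: tiny, auditable binary
    Stage("macro-assembler", "hex-seed"),  # step 2
    Stage("tiny-c", "macro-assembler"),    # step 3: step 2 repeated
    Stage("gcc", "tiny-c"),                # step 4
    Stage("everything-else", "gcc"),       # step 5
]


def audit(chain):
    """Return the stages that must be trusted as opaque binaries.

    Raises ValueError if a stage relies on a tool that was not built
    earlier in the chain (for example, a compiler that needs itself).
    """
    available = set()
    seeds = []
    for stage in chain:
        if stage.built_with is None:
            seeds.append(stage.name)
        elif stage.built_with not in available:
            raise ValueError(
                f"{stage.name} needs {stage.built_with}, which is not built yet"
            )
        available.add(stage.name)
    return seeds
```

For the chain above, `audit(CHAIN)` reports a single seed, which is the property the strategy aims for. A self-hosting compiler with no other route, such as `Stage("ghc", "ghc")`, is rejected.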
--
Chris
* Re: Removing compilers that cannot be bootstrapped
2016-03-26 6:40 ` Chris Marusich
@ 2016-03-26 6:55 ` Chris Marusich
2016-03-26 9:02 ` Jookia
2016-03-26 14:05 ` Alex Vong
2016-03-26 8:12 ` Ricardo Wurmus
1 sibling, 2 replies; 26+ messages in thread
From: Chris Marusich @ 2016-03-26 6:55 UTC (permalink / raw)
To: Leo Famulari; +Cc: guix-devel
Chris Marusich <cmmarusich@gmail.com> writes:
> Leo Famulari <leo@famulari.name> writes:
>
>> `wget https://blob` doesn't count as reproducible :)
>
> Very true.
>
> Self-hosting compilers are a cute trick, but they're a far cry from
> being reproducible. They're just inscrutable binary blobs. If we want
> true reproducibility from the bottom up, then it seems like the only way
> to do it is via a strategy like the following:
>
> 1) Write the simplest possible program (or collection of programs) in
> the simplest possible machine code. This program serves only one
> purpose: to enable you to write more code at a higher level of
> abstraction. It is effectively a compiler for a very primitive
> language, but the language it compiles will be one layer of abstraction
> above machine code, which is a step in the right direction. This first
> program must be a "binary blob", since we cannot rely on any existing
> tools to build it. It must be simple enough that someone can read and
> understand it using e.g. a hex editor, provided that they have access to
> the right reference materials. Since this program exists only as
> machine code, it must be documented thoroughly to make it easier to
> understand.
>
> 2) Write source code which, when compiled using the compiler/toolchain
> from the previous step, produces a new compiler/toolchain that will
> allow you to write more expressive source code at a higher layer of
> abstraction.
>
> 3) Repeat step (2) as many times as necessary to produce a compiler that
> is capable of compiling GCC from source.
>
> 4) Use the compiler from (3) to compile GCC from source.
>
> 5) Use the GCC from (4) to compile the rest of the world from source.
>
> If we want to free ourselves from reliance on inscrutable binary blobs,
> isn't something like that the only way?
Sorry for replying to my own post, but I couldn't help myself. If
anyone thinks the above sounds too paranoid, remember the Ken Thompson
hack:
http://www.c2.com/cgi/wiki?TheKenThompsonHack
Chilling!
--
Chris
* Re: Removing compilers that cannot be bootstrapped
2016-03-26 6:55 ` Chris Marusich
@ 2016-03-26 9:02 ` Jookia
2016-03-26 14:05 ` Alex Vong
1 sibling, 0 replies; 26+ messages in thread
From: Jookia @ 2016-03-26 9:02 UTC (permalink / raw)
To: Chris Marusich; +Cc: guix-devel
Hey,
I have a few quick notes that aren't that relevant.
On Fri, Mar 25, 2016 at 11:55:44PM -0700, Chris Marusich wrote:
> > 1) Write the simplest possible program (or collection of programs) in
> > the simplest possible machine code. This program serves only one
> > purpose: to enable you to write more code at a higher level of
> > abstraction. It is effectively a compiler for a very primitive
> > language, but the language it compiles will be one layer of abstraction
> > above machine code, which is a step in the right direction. This first
> > program must be a "binary blob", since we cannot rely on any existing
> > tools to build it. It must be simple enough that someone can read and
> > understand it using e.g. a hex editor, provided that they have access to
> > the right reference materials. Since this program exists only as
> > machine code, it must be documented thoroughly to make it easier to
> > understand.
> >
> > 2) Write source code which, when compiled using the compiler/toolchain
> > from the previous step, produces a new compiler/toolchain that will
> > allow you to write more expressive source code at a higher layer of
> > abstraction.
> >
> > 3) Repeat step (2) as many times as necessary to produce a compiler that
> > is capable of compiling GCC from source.
> >
> > 4) Use the compiler from (3) to compile GCC from source.
> >
> > 5) Use the GCC from (4) to compile the rest of the world from source.
> >
> > If we want to free ourselves from reliance on inscrutable binary blobs,
> > isn't something like that the only way?
I think so. We also have to write bootstrap compilers for other languages, which
is the other end of the stick being burned here. We also need to figure out how
to get the correct source to people with signing.
> Sorry for replying to my own post, but I couldn't help myself. If
> anyone thinks the above sounds too paranoid, remember the Ken Thompson
> hack:
>
> http://www.c2.com/cgi/wiki?TheKenThompsonHack
>
> Chilling!
I try not to think about security too much when it comes to efforts, after all,
it's just a cat and mouse game of probability. I think there was a talk on this
by perhaps Ludovic or Dave about how freedom relates to reproducibility. If we
can't reproduce something, how can we truly have the source code to modify it?
Philosophy aside, I think it's important to know that the clear security benefit
here isn't that we can prevent these attacks, but we will have a smaller set of
code to audit when we have the tools to take a threat model and prevent threats.
For now the closest we could have is to trust whatever platform we're using to
boot code that the CPU will faithfully execute. For now it could be BIOS.
> --
> Chris
Jookia.
* Re: Removing compilers that cannot be bootstrapped
2016-03-26 6:55 ` Chris Marusich
2016-03-26 9:02 ` Jookia
@ 2016-03-26 14:05 ` Alex Vong
1 sibling, 0 replies; 26+ messages in thread
From: Alex Vong @ 2016-03-26 14:05 UTC (permalink / raw)
To: Chris Marusich; +Cc: guix-devel
Chris Marusich <cmmarusich@gmail.com> writes:
> Chris Marusich <cmmarusich@gmail.com> writes:
>
>
> Sorry for replying to my own post, but I couldn't help myself. If
> anyone thinks the above sounds too paranoid, remember the Ken Thompson
> hack:
>
> http://www.c2.com/cgi/wiki?TheKenThompsonHack
>
There is a way to defeat this kind of attack. It is called Diverse
Double-Compiling.
Graphical explanation: <https://imgur.com/a/BWbnU#0>
David A. Wheeler’s Page: <http://www.dwheeler.com/trusting-trust/>
> Chilling!
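A toy model may help show why Diverse Double-Compiling catches what self-recompilation misses. This is a sketch, not Wheeler's tooling: `Binary`, `compile_with`, `ddc_matches`, and the names `sA` and `sT` are invented here. It assumes deterministic compilers and a trusted compiler that is not subverted in the same way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Binary:
    implements: str       # name of the source this executable was built from
    codegen: str          # code-generation style of the compiler that built it
    trojan: bool = False  # hidden self-propagating backdoor, absent from source


# Which code generator each compiler *source* describes (toy data).
BACKEND = {"sA": "A-style", "sT": "T-style"}


def compile_with(compiler: Binary, source: str) -> Binary:
    """Run `compiler` on `source` and return the resulting executable."""
    return Binary(
        implements=source,
        codegen=BACKEND[compiler.implements],
        # A trojaned compiler re-inserts the backdoor into its own source.
        trojan=compiler.trojan and source == compiler.implements,
    )


def ddc_matches(under_test: Binary, source: str, trusted: Binary) -> bool:
    """Rebuild `source` via the trusted compiler and compare bit-for-bit."""
    stage1 = compile_with(trusted, source)  # behaves like sA, built by T
    stage2 = compile_with(stage1, source)   # sA compiled by an untainted sA
    return stage2 == under_test


trusted = Binary("sT", "T-style")
honest = Binary("sA", "A-style")
evil = Binary("sA", "A-style", trojan=True)
```

In this model the trojaned compiler regenerates itself exactly when it compiles its own source, so a plain self-rebuild looks clean. The second stage built through the independent compiler differs from it, which is the mismatch the technique looks for.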
* Re: Removing compilers that cannot be bootstrapped
2016-03-26 6:40 ` Chris Marusich
2016-03-26 6:55 ` Chris Marusich
@ 2016-03-26 8:12 ` Ricardo Wurmus
2016-03-26 9:23 ` Jookia
2016-03-26 14:31 ` Ludovic Courtès
1 sibling, 2 replies; 26+ messages in thread
From: Ricardo Wurmus @ 2016-03-26 8:12 UTC (permalink / raw)
To: Chris Marusich; +Cc: guix-devel
Chris Marusich <cmmarusich@gmail.com> writes:
> Leo Famulari <leo@famulari.name> writes:
>
>> `wget https://blob` doesn't count as reproducible :)
>
> Very true.
>
> Self-hosting compilers are a cute trick, but they're a far cry from
> being reproducible. They're just inscrutable binary blobs. If we want
> true reproducibility from the bottom up, then it seems like the only way
> to do it is via a strategy like the following:
>
> 1) Write the simplest possible program (or collection of programs) in
> the simplest possible machine code. This program serves only one
> purpose: to enable you to write more code at a higher level of
> abstraction. It is effectively a compiler for a very primitive
> language, but the language it compiles will be one layer of abstraction
> above machine code, which is a step in the right direction. This first
> program must be a "binary blob", since we cannot rely on any existing
> tools to build it. It must be simple enough that someone can read and
> understand it using e.g. a hex editor, provided that they have access to
> the right reference materials. Since this program exists only as
> machine code, it must be documented thoroughly to make it easier to
> understand.
>
> 2) Write source code which, when compiled using the compiler/toolchain
> from the previous step, produces a new compiler/toolchain that will
> allow you to write more expressive source code at a higher layer of
> abstraction.
>
> 3) Repeat step (2) as many times as necessary to produce a compiler that
> is capable of compiling GCC from source.
>
> 4) Use the compiler from (3) to compile GCC from source.
>
> 5) Use the GCC from (4) to compile the rest of the world from source.
>
> If we want to free ourselves from reliance on inscrutable binary blobs,
> isn't something like that the only way?
GCC itself is not sufficient to build many compilers. For GHC, for
example, you need a Haskell compiler such as GHC. I looked at nhc98 and
other defunct Haskell compilers, but they all have a bootstrapping step
that either requires a Haskell compiler or building from generated C
sources.
In the case of Haskell I think it would make sense to try to figure out
how to bootstrap with Hugs, a Haskell interpreter, or with Yale Haskell,
which I’m trying to port to Guile.
It is probably easier for us to try to write primitive compilers in
Guile than to start from scratch each time. Then the only blob we need
to figure out how to bootstrap would be Guile itself.
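The dependency problem described here can be pictured as a reachability check over build requirements. The sketch below is only an illustration: the `ghc-via-hugs` entry is hypothetical and stands for whatever interpreter-based route might turn out to work, and `gcc` is simply treated as given.

```python
# Build-time requirements as described in the thread. The "ghc-via-hugs"
# route is hypothetical and only shows what breaking the cycle looks like.
BUILD_DEPS = {
    "gcc": [],                        # treated as given in this sketch
    "hugs": ["gcc"],                  # assumption: buildable with a C compiler
    "ghc": ["gcc", "ghc"],            # needs itself: the bootstrapping problem
    "ghc-via-hugs": ["gcc", "hugs"],  # hypothetical alternative route
}


def buildable_from(roots, deps):
    """Return the set of packages that can be built starting only from `roots`."""
    built = set(roots)
    changed = True
    while changed:
        changed = False
        for pkg, needs in deps.items():
            if pkg not in built and all(n in built for n in needs):
                built.add(pkg)
                changed = True
    return built
```

Starting from `{"gcc"}`, the self-dependent `ghc` entry is never reached, while the interpreter route is. That is the whole argument for looking at Hugs or Yale Haskell in miniature.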
~~ Ricardo
* Re: Removing compilers that cannot be bootstrapped
2016-03-26 8:12 ` Ricardo Wurmus
@ 2016-03-26 9:23 ` Jookia
2016-03-26 14:31 ` Ludovic Courtès
1 sibling, 0 replies; 26+ messages in thread
From: Jookia @ 2016-03-26 9:23 UTC (permalink / raw)
To: Ricardo Wurmus; +Cc: guix-devel
On Sat, Mar 26, 2016 at 09:12:52AM +0100, Ricardo Wurmus wrote:
> GCC itself is not sufficient to build many compilers. For GHC, for
> example, you need a Haskell compiler such as GHC. I looked at nhc98 and
> other defunct Haskell compilers, but they all have a bootstrapping step
> that either requires a Haskell compiler or building from generated C
> sources.
>
> In the case of Haskell I think it would make sense to try to figure out
> how to bootstrap with Hugs, a Haskell interpreter, or with Yale Haskell,
> which I’m trying to port to Guile.
>
> It is probably easier for us to try to write primitive compilers in
> Guile than to start from scratch each time. Then the only blob we need
> to figure out how to bootstrap would be Guile itself.
I think this is one of those things that need help from upstream to create a
simpler compiler that we can write an interpreter for.
> ~~ Ricardo
Jookia.
* Re: Removing compilers that cannot be bootstrapped
2016-03-26 8:12 ` Ricardo Wurmus
2016-03-26 9:23 ` Jookia
@ 2016-03-26 14:31 ` Ludovic Courtès
2016-03-26 17:19 ` Christopher Allan Webber
1 sibling, 1 reply; 26+ messages in thread
From: Ludovic Courtès @ 2016-03-26 14:31 UTC (permalink / raw)
To: Ricardo Wurmus; +Cc: guix-devel
Ricardo Wurmus <rekado@elephly.net> skribis:
> It is probably easier for us to try to write primitive compilers in
> Guile than to start from scratch each time. Then the only blob we need
> to figure out how to bootstrap would be Guile itself.
+1
Though of course, writing a faithful C or Haskell or OCaml compiler is
a huge task, and chances are we’ll always miss compilers for some
languages.
So I think we should take both routes: on one hand try to come up with
minimalist Guile implementations of languages (your work with Yale
Haskell, or the project on Bournish), and on the other, with help from
the reproducible-builds.org group, raise awareness about the issue among
compiler writers and users so that the ability to bootstrap from
another, common programming language becomes a requirement.
Of course Guile itself is pretty big, so we may eventually have to think
about a “PreScheme” language, as is used to bootstrap Scheme48¹.
Epsilon²³ contains interesting ideas as well.
Ludo’.
¹ https://en.wikipedia.org/wiki/PreScheme
² https://www.gnu.org/software/epsilon/
³ http://ageinghacker.net/publications/#phd-thesis
* Re: Removing compilers that cannot be bootstrapped
2016-03-26 14:31 ` Ludovic Courtès
@ 2016-03-26 17:19 ` Christopher Allan Webber
0 siblings, 0 replies; 26+ messages in thread
From: Christopher Allan Webber @ 2016-03-26 17:19 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guix-devel
Ludovic Courtès writes:
> Ricardo Wurmus <rekado@elephly.net> skribis:
>
>> It is probably easier for us to try to write primitive compilers in
>> Guile than to start from scratch each time. Then the only blob we need
>> to figure out how to bootstrap would be Guile itself.
>
> +1
>
> Though of course, writing a faithful C or Haskell or OCaml compiler is
> a huge task, and chances are we’ll always miss compilers for some
> languages.
>
> So I think we should take both routes: on one hand try to come up with
> minimalist Guile implementations of languages (your work with Yale
> Haskell, or the project on Bournish), and on the other, with help from
> the reproducible-builds.org group, raise awareness about the issue among
> compiler writers and users so that the ability to bootstrap from
> another, common programming language becomes a requirement.
>
> Of course Guile itself is pretty big, so we may eventually have to think
> about a “PreScheme” language, as is used to bootstrap Scheme48¹.
> Epsilon²³ contains interesting ideas as well.
>
>
> Ludo’.
>
> ¹ https://en.wikipedia.org/wiki/PreScheme
> ² https://www.gnu.org/software/epsilon/
> ³ http://ageinghacker.net/publications/#phd-thesis
It's a pretty thrilling and enticing idea. It would be a ton of work.
I think we'd need to get more people excited about and successfully
contributing to Guile for it to happen. Thus I think we'd really need
to advertise more just how cool Guile's compiler tower is and get people
jazzed up about it. But bumping into Guile's compiler tower
documentation is one of the reasons I started paying attention to Guile
about a year and a half ago:
http://dustycloud.org/blog/javascript-beyond-javascript/
I'm sure others would find that narrative compelling.
It would be interesting to see the motivations of multi-language-Guile
from its earliest days come back to the forefront with style and
compelling motivation.
- Chris
* Re: Removing compilers that cannot be bootstrapped
2016-03-21 17:54 Removing compilers that cannot be bootstrapped Thompson, David
` (2 preceding siblings ...)
2016-03-21 22:48 ` Ludovic Courtès
@ 2016-03-26 6:51 ` John Darrington
3 siblings, 0 replies; 26+ messages in thread
From: John Darrington @ 2016-03-26 6:51 UTC (permalink / raw)
To: Thompson, David; +Cc: guix-devel
Doesn't gcc also belong in this category?
J'
On Mon, Mar 21, 2016 at 01:54:24PM -0400, Thompson, David wrote:
Haskell, OCaml, Chicken, and other compilers that we package have a
serious issue that many of us are aware of: they cannot be built from
source! They rely upon pre-built binaries of the same compiler. I
understand that it's very inconvenient to not have these compilers
available to us, and all of the software that is written in their
respective languages, but I feel like all of our work is undermined by
making exceptions for them. I would like to remove compilers that
don't have a bunch of dependent packages yet such as Chicken until
upstream fixes the issue. But we have tons of Haskell packages and a
handful of OCaml packages and it would be heartbreaking to some to
remove all of that hard work.
What can we possibly do to avoid being yet another distro that relies
on a bunch of blobs (leaving the *true* bootstrap binaries out of it
for now)?
- Dave
--
Avoid eavesdropping. Send strong encrypted email.
PGP Public key ID: 1024D/2DE827B3
fingerprint = 8797 A26D 0854 2EAB 0285 A290 8A67 719C 2DE8 27B3
See http://sks-keyservers.net or any PGP keyserver for public key.