From: Ricardo Wurmus <rekado@elephly.net>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: guix-devel <guix-devel@gnu.org>
Subject: Re: Storing serialised graph along with packages
Date: Mon, 24 Jul 2017 18:43:23 +0200
Message-ID: <87pocp7plg.fsf@elephly.net>
In-Reply-To: <87vamim2uy.fsf@gnu.org>

Hi,

> Ricardo Wurmus <rekado@elephly.net> skribis:
>
>> it always bothered me that after building a package we lose all of the
>> beautiful features that Guix as a Guile library gives us. We always
>> need to keep track of the Guix version at the time of building the
>> package and only then can we hope to rebuild the same thing again at
>> some point in the future.
>>
>> What do you think about storing the serialised subset of the package
>> graph in a separate output of the package? Currently, the only place
>> where we store anything meta is the database. Wouldn’t it be great if
>> we could “dump an image” of the state of Guile when it has evaluated the
>> section of the package graph that is needed to build it?
>>
>> Then we could just load the serialised state into Guile at a later point
>> and inspect the package graph as if we had Guix checked out at the given
>> version. I suppose we could also store this kind of information in the
>> database.
>>
>> I’d really like the graph to stay alive even after Guix has moved on to
>> later versions. It also sounds like a really lispy thing to do.
>
> I sympathize with the goal, and I like the parallel with Lisp.
>
> However I’m skeptical about our ability to do something that is robust
> enough. The package → bag → derivation compilation process is “lossy”
> in the sense that at each layer we lose a bit of context from the higher
> layers. Each arrow potentially involves all the code and package
> definitions of Guix, as opposed to just a subset of the package
> definitions. We could certainly serialize package objects to sexps, but
> that would not capture the implementation of build systems,
> ‘package-derivation’, or even lower-level primitives. So this would be
> a rough approximation, at best.

Yes, indeed. My goal is to get a *better* approximation than what the
references database currently gives us.

Out of curiosity I’ve been playing with serialisation on the train ride,
and build systems are indeed a problem. In my tests I simply skip them
for now, until I figure something out.

I first tried cutting the source of the package expression out of its
file (using “package-location”) and compiling the record to a file.
Unfortunately, this won’t work for packages that are the result of
generator procedures (like “gfortran”), since for those there is no
self-contained package expression to cut out.

My current approach is simply to go through each field of a package
record to generate an S-expression representing the package object, and
then to compile that. In a clean environment I can load that module
along with copies of the modules under the “guix” directory that
implement things like “url-fetch” or the search-path-specification
record.

To be able to traverse the dependency graph, one must load additional
modules for each of the store items making up the package closure.
(This means that, in addition to the embedded references, we would need
to record the store items that were present at build time, but that’s
easy.)

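As a rough illustration of the field-by-field approach and of walking
the closure from recorded build-time store items, here is a small sketch.
It is written in Python rather than Guile, and it is not Guix code: the
"Package" class, its fields, the store item names and the helper
functions are all made up for the example. Build systems are skipped, as
in the tests described above.

```python
from dataclasses import dataclass, field, fields


# Hypothetical stand-in for a package record; NOT the real Guix <package>.
@dataclass
class Package:
    name: str
    version: str
    source_url: str
    build_system: str = "gnu"                    # skipped when serialising
    inputs: list = field(default_factory=list)   # names of store items

# Fields we do not know how to serialise yet (assumption for this sketch).
SKIPPED_FIELDS = {"build_system"}


def to_sexp(value):
    """Render a string or a (nested) list as S-expression text."""
    if isinstance(value, str):
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'
    if isinstance(value, (list, tuple)):
        return "(" + " ".join(to_sexp(v) for v in value) + ")"
    raise TypeError(f"cannot serialise {value!r}")


def package_to_sexp(pkg):
    """Go through each field of the record and emit (package (field value) ...)."""
    parts = []
    for fld in fields(pkg):
        if fld.name in SKIPPED_FIELDS:
            continue
        key = fld.name.replace("_", "-")
        parts.append(f"({key} {to_sexp(getattr(pkg, fld.name))})")
    return "(package " + " ".join(parts) + ")"


def closure(root, recorded):
    """List the store items reachable from ROOT via recorded build-time inputs."""
    seen, stack = [], [root]
    while stack:
        item = stack.pop()
        if item in seen:
            continue
        seen.append(item)
        stack.extend(recorded[item].inputs)
    return seen


# Illustrative data only: one record per store item present at build time.
recorded = {
    "hello-2.10": Package("hello", "2.10",
                          "mirror://gnu/hello/hello-2.10.tar.gz",
                          inputs=["glibc-2.25"]),
    "glibc-2.25": Package("glibc", "2.25",
                          "mirror://gnu/glibc/glibc-2.25.tar.xz"),
}

for item in closure("hello-2.10", recorded):
    print(package_to_sexp(recorded[item]))
```

The mapping from store item to record is what makes the traversal
possible without a Guix checkout; in this sketch it is just a
dictionary.
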
> The safe way to achieve what you want would be to store the whole Guix
> tree (+ GUIX_PACKAGE_PATH), or a pointer to that (a Git commit).
>
> There’s also the problem of bit-for-bit reproducibility: there’s an
> infinite set of source trees that can lead to a given store item. If we
> stored along with, say, Emacs, the Guix source tree/commit that led to
> it, then we’d effectively remove that equivalence (whitespace would
> become significant, for instance[*].)

Hmm, that’s true. And it’s not just a problem of sources. We might
still introduce unimportant differences even if we serialised only the
compiled objects and completely excluded the plain-text source code,
e.g. when we refactor supporting code in a way that has no impact on the
value of the result but still changes the compiled module.

Can we separate the two? Instead of installing modules (or the whole
Guix tree) into the output directory of a store item, could we treat
them like a table in the database? Building that part would not be part
of the package derivation; it would just be a pre- or post-processing
step, like registering the references in the database.

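To make the "table in the database" idea a bit more concrete, here is a
minimal sketch using SQLite from Python. It is only an illustration: the
table name "PackageGraphs", its columns, the helper functions and the
store path are assumptions made up for the example, not the schema of the
actual store database. The point is that the serialised graph is
registered in a separate step and never ends up in the output directory,
so it cannot affect the store item itself.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the database and create the (hypothetical) side table."""
    db = sqlite3.connect(path)
    db.execute("""CREATE TABLE IF NOT EXISTS PackageGraphs (
                      store_path TEXT PRIMARY KEY,
                      graph      TEXT NOT NULL)""")
    return db


def register_graph(db, store_path, graph_sexp):
    """Post-build step: record the serialised graph for STORE_PATH."""
    with db:  # commits on success, rolls back on error
        db.execute("INSERT OR REPLACE INTO PackageGraphs VALUES (?, ?)",
                   (store_path, graph_sexp))


def lookup_graph(db, store_path):
    """Return the serialised graph for STORE_PATH, or None if unknown."""
    row = db.execute("SELECT graph FROM PackageGraphs WHERE store_path = ?",
                     (store_path,)).fetchone()
    return row[0] if row else None


# Placeholder store path; a real one would start with a hash.
db = open_db()
register_graph(db, "/gnu/store/placeholder-emacs-25.2",
               '(package (name "emacs") (version "25.2"))')
print(lookup_graph(db, "/gnu/store/placeholder-emacs-25.2"))
```
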
--
Ricardo
GPG: BCA6 89B6 3655 3801 C3C6 2150 197A 5888 235F ACAC
https://elephly.net