From: ludo@gnu.org (Ludovic Courtès)
Subject: Re: Storing serialised graph along with packages
Date: Tue, 25 Jul 2017 10:14:10 +0200
Message-ID: <87a83tndbh.fsf@gnu.org>
References: <87a83waerc.fsf@elephly.net> <87vamim2uy.fsf@gnu.org> <87pocp7plg.fsf@elephly.net>
In-Reply-To: <87pocp7plg.fsf@elephly.net> (Ricardo Wurmus's message of "Mon, 24 Jul 2017 18:43:23 +0200")
List-Id: "Development of GNU Guix and the GNU System distribution."
To: Ricardo Wurmus
Cc: guix-devel

Hi,

Ricardo Wurmus writes:

> Yes, indeed.  My goal is to get a *better* approximation than what the
> references database currently gives us.

I think the problem is that this would remain an approximation; people
might get a false sense that they can “decompile” a store item to a
package object and then be disappointed.

> Out of curiosity I’ve been playing with serialisation on the train ride
> and build systems are indeed a problem.  In my tests I just skipped
> them until I figured something out.
>
> I played with cutting out the sources for the package expression (using
> “package-location”) and compiling the record to a file.  Unfortunately,
> this won’t work for packages that are the result of generator procedures
> (like “gfortran”).
>
> My current approach is just to go through each field of a package record
> to generate an S-expression representing the package object, and then to
> compile that.  In a clean environment I can load that module along with
> copies of the modules under the “guix” directory that implement things
> like “url-fetch” or the search-path-specifications record.
>
> To be able to traverse the dependency graph, one must load additional
> modules for each of the store items making up the package closure.
> (This would require that in addition to just embedded references we
> would need to record the store items that were present at build time,
> but that’s easy.)

‘source-module-closure’ might be helpful:

--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> ,use(guix)
scheme@(guile-user)> ,use(guix modules)
scheme@(guile-user)> (length (source-module-closure '((gnu packages gcc))))
$2 = 272
--8<---------------cut here---------------end--------------->8---

>> The safe way to achieve what you want would be to store the whole Guix
>> tree (+ GUIX_PACKAGE_PATH), or a pointer to that (a Git commit).
>>
>> There’s also the problem of bit-for-bit reproducibility: there’s an
>> infinite set of source trees that can lead to a given store item.  If we
>> stored along with, say, Emacs, the Guix source tree/commit that led to
>> it, then we’d effectively remove that equivalence (whitespace would
>> become significant, for instance[*].)
>
> Hmm, that’s true.  And it’s not just a problem of sources.  We might
> still introduce unimportant differences if we only serialised the
> compiled objects and completely excluded the plain text source code,
> e.g. when we refactor supporting code that has no impact on the value of
> the result but which would lead to a change in the compiled module.
>
> Can we separate the two?
> Instead of installing modules (or the whole
> Guix tree) into the output directory of a store item, could we instead
> treat them like a table in the database?  Building that part would not
> be part of the package derivation; it would just be a pre- or
> post-processing step, like registering the references in the database.

To me the source/store mapping should be a separate service.  I imagine
we could have some sort of a ledger that maps Git commits to sets of
store items (we could even call that a “blockchain” and be
buzzword-compliant ;-)).  Guix could come with a library to maintain
such a database, and ‘guix publish’ could even publish it.  We’d have
tools to query that database both for mappings and reverse-mappings,
things like that.

(There are also connections with the “binary transparency” ledger
discussed at the R-B summit.)

WDYT?

Ludo’.
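
Addendum: the ledger that maps Git commits to sets of store items, with
lookups in both directions, could be sketched roughly as follows.  This
is only an illustration in Python, not an existing Guix interface: the
`Ledger` class, its method names, the commit IDs and the store paths
are all hypothetical and made up for the example.

```python
from collections import defaultdict


class Ledger:
    """Hypothetical commit <-> store-item ledger (not an existing Guix API)."""

    def __init__(self):
        # Forward mapping: Git commit -> set of store items built from it.
        self._items_by_commit = defaultdict(set)
        # Reverse mapping: store item -> set of commits that produce it.
        self._commits_by_item = defaultdict(set)

    def record(self, commit, store_items):
        """Register that COMMIT produced each of STORE_ITEMS."""
        for item in store_items:
            self._items_by_commit[commit].add(item)
            self._commits_by_item[item].add(commit)

    def items_for(self, commit):
        """Forward lookup: store items known for COMMIT."""
        return frozenset(self._items_by_commit.get(commit, ()))

    def commits_for(self, store_item):
        """Reverse lookup: commits known to produce STORE_ITEM."""
        return frozenset(self._commits_by_item.get(store_item, ()))


ledger = Ledger()

# Placeholder commit IDs and store paths, invented for illustration.
ledger.record("commit-a", {"/gnu/store/aaaa-emacs", "/gnu/store/bbbb-gcc"})
# A whitespace-only change yields a new commit but the same store item,
# so one store item can map back to several commits.
ledger.record("commit-b", {"/gnu/store/aaaa-emacs"})

print(sorted(ledger.commits_for("/gnu/store/aaaa-emacs")))
print(sorted(ledger.items_for("commit-b")))
```

Keeping the mapping outside the store item is what preserves the
equivalence discussed above: several commits can point at the same
store item without the item itself recording any of them.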