From mboxrd@z Thu Jan  1 00:00:00 1970
From: =?UTF-8?Q?G=C3=A1bor_Boskovits?= <boskovits@gmail.com>
Subject: Content addressable store
Date: Wed, 15 May 2019 10:33:18 +0200
Message-ID: <CAE4v=pj029SPi7o+cjwhrwbq1W0CB2SD-dS2qipfV_RgaY8aWQ@mail.gmail.com>
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary="0000000000000fea2b0588e901a6"
Return-path: <guix-devel-bounces+gcggd-guix-devel=m.gmane.org@gnu.org>
Received: from eggs.gnu.org ([209.51.188.92]:40612)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <boskovits@gmail.com>) id 1hQpLx-0006c3-FF
	for guix-devel@gnu.org; Wed, 15 May 2019 04:33:34 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <boskovits@gmail.com>) id 1hQpLw-00053J-5y
	for guix-devel@gnu.org; Wed, 15 May 2019 04:33:33 -0400
Received: from mail-ed1-x52b.google.com ([2a00:1450:4864:20::52b]:38203)
	by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16)
	(Exim 4.71) (envelope-from <boskovits@gmail.com>) id 1hQpLv-00052W-QK
	for guix-devel@gnu.org; Wed, 15 May 2019 04:33:32 -0400
Received: by mail-ed1-x52b.google.com with SMTP id w11so2955327edl.5
	for <guix-devel@gnu.org>; Wed, 15 May 2019 01:33:31 -0700 (PDT)
List-Id: "Development of GNU Guix and the GNU System distribution."
	<guix-devel.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/guix-devel>,
	<mailto:guix-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/archive/html/guix-devel/>
List-Post: <mailto:guix-devel@gnu.org>
List-Help: <mailto:guix-devel-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/guix-devel>,
	<mailto:guix-devel-request@gnu.org?subject=subscribe>
Errors-To: guix-devel-bounces+gcggd-guix-devel=m.gmane.org@gnu.org
Sender: "Guix-devel" <guix-devel-bounces+gcggd-guix-devel=m.gmane.org@gnu.org>
To: Guix-devel <guix-devel@gnu.org>

--0000000000000fea2b0588e901a6
Content-Type: text/plain; charset="UTF-8"

On IRC on May 14, 2019, the idea of a content-addressable store was
discussed.

This is also discussed here:
page 143 of https://nixos.org/~eelco/pubs/phd-thesis.pdf (or 135) on the
intensional model
and
https://github.com/NixOS/nix/issues/296.

Thanks for the links, roptat.

After reading these, an initial idea came up, which looks like this:

1. Solve the content addressability problem as proposed in the thesis:
- build the derivation as we do now
- rewrite the self-references to a known constant
- compute the hash after the rewrite
- relocate the package to the store path indicated by the new hash

2. after the packager builds the package, the content address can be added
to the definition

3. Fail the package build if it declares a content address that does not
match the produced artifact.

4. Use flags to allow installing to the original path and to the content
addressed path.
I propose defaults such that the package installs to the original path if
no content address is specified, and to the content addressed path if one
is specified.
(This might come in handy during the transitional period, so that we can
install the package to both locations.)
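
To make steps 1, 3 and 4 concrete, here is a rough Python sketch. It is
only an illustration under simplifying assumptions: an output is modelled
as a dict of file name to bytes, the store hash is a 32-byte string, and
PLACEHOLDER and all function names are hypothetical, not existing Guix or
Nix code.

```python
import hashlib
from typing import Optional

# Hypothetical constant written in place of every self-reference. It has the
# same length as the hash part, so file sizes and offsets do not change.
PLACEHOLDER = b"0" * 32


def content_address(files: dict[str, bytes], self_hash: bytes) -> str:
    """Hash an output after rewriting its own store hash to PLACEHOLDER."""
    if len(self_hash) != len(PLACEHOLDER):
        raise ValueError("self_hash must be as long as PLACEHOLDER")
    digest = hashlib.sha256()
    for name in sorted(files):  # fixed order keeps the result deterministic
        rewritten = files[name].replace(self_hash, PLACEHOLDER)
        # Length prefixes keep the name/content boundaries unambiguous.
        for field in (name.encode(), rewritten):
            digest.update(len(field).to_bytes(8, "big"))
            digest.update(field)
    return digest.hexdigest()[:32]  # truncated so it fits the hash slot


def relocate(files: dict[str, bytes], self_hash: bytes, address: str) -> dict[str, bytes]:
    """Point the self-references at the content-addressed location."""
    new = address.encode()
    return {name: data.replace(self_hash, new) for name, data in files.items()}


def check_declared(declared: Optional[str], actual: str) -> None:
    """Step 3: fail when a declared content address does not match."""
    if declared is not None and declared != actual:
        raise RuntimeError(
            f"content address mismatch: declared {declared}, got {actual}")


def install_hash(declared: Optional[str], original_hash: str) -> str:
    """Step 4 default: use the content address if one is declared."""
    return declared if declared is not None else original_hash
```

With this model, two builds that differ only in their own store hash get
the same content address, and a relocated output hashes to the address it
was relocated to.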

There are two issues with the approach:
1. Only reproducible packages can be content addressed.
2. When a package has a content address, its dependents resolve it to
that address, which opens up the possibility that the package points to
the output of a derivation other than the one defined in the package.
Since, per the discussion, a user of a channel trusts the channel code, it
was concluded that malicious injection can be ignored. What might still
happen is that a package is updated without its content address being
modified, so the dependents still resolve to the old content address and
have no way of knowing that the package definition does not actually
build. With proper workflow support this might be manageable.
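
One possible shape for such workflow support, purely as an assumption on
my side: record next to each content address the derivation hash it was
computed from, and let a lint step flag packages whose definition has
changed since. The Package class and its fields below are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Package:
    name: str
    derivation_hash: str                    # changes with the definition
    content_address: Optional[str] = None   # declared by the packager
    address_from: Optional[str] = None      # derivation the address came from


def stale_addresses(packages: list[Package]) -> list[str]:
    """Names of packages whose content address predates their definition."""
    return [
        p.name
        for p in packages
        if p.content_address is not None
        and p.address_from != p.derivation_hash
    ]
```

A flagged package would then have to be rebuilt and its content address
refreshed before dependents are allowed to resolve to it.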

Benefits of this approach:
- the content addresses do not need a centralized database
- the complications resulting from derivations building to different
outputs are eliminated
- a very good reproducibility indicator is gained
- it can peacefully coexist with our current store.

Wdyt?

--0000000000000fea2b0588e901a6--