From: Christine Lemmer-Webber <cwebber@dustycloud.org>
To: "Jonathan Frederickson" <jonathan@terracrypt.net>
Cc: "Sergio Pastor Pérez" <sergio.pastorperez@outlook.es>,
	"Marek Paśnikowski" <marek@marekpasnikowski.pl>,
	"Ludovic Courtès" <ludo@gnu.org>,
	guix-devel@gnu.org, guix-sysadmin <guix-sysadmin@gnu.org>
Subject: P2P Guix package building and distribution
Date: Wed, 21 Aug 2024 18:07:58 -0400
Message-ID: <87msl5o7gh.fsf@dustycloud.org>
In-Reply-To: <be7d660f-0b57-47f1-be6f-d61ac4458548@app.fastmail.com> (Jonathan Frederickson's message of "Tue, 13 Aug 2024 19:38:41 -0400")

"Jonathan Frederickson" <jonathan@terracrypt.net> writes:

> On Tue, Aug 13, 2024, at 12:23 PM, Sergio Pastor Pérez wrote:
>
>> Wouldn't it be enough to have a few independent seeders that have the
>> same derivation output? We could have a field in the p2p service type
>> which allows the user to configure a "level of trust", where the user
>> specifies the minimum number of seeders with the same output for the
>> daemon to accept the substitute.
>
> This might be enough if you could do it, but the trouble is
> identifying "independent" seeders. If you get the same output from
> five different seeders, that could be five different people... or I
> could have set up five different nodes participating in the swarm
> serving my malicious substitutes. (This is known as a Sybil attack.)
>
> But maybe taking inspiration from this... perhaps you could do
> something more akin to some of the web-of-trust features of
> e.g. PGP. In other words, you might have the ability to partially
> trust a server's substitutes such that you'll only use a substitute if
> N other partially trusted servers (or at least one fully trusted
> server) serve up the same content. This would still not let you have a
> totally permissionless set of P2P substitutes, but it would allow the
> community to build a list of individuals who are at least trusted not
> to collude with one another, if not fully trusted.
>
> Though there's a detail that might need addressing for this to
> work... you would want this to be an indication that multiple
> individuals were able to reproducibly build the same packages
> bit-for-bit. But my impression is that substitutes served by 'guix
> publish' are always signed with the substitute server's signing key,
> regardless of where they were built. That does mean that if 4 people
> were to pull substitutes of a package from one other person, those 5
> people would end up serving substitutes originating from one
> person. You may want a way for someone running a substitute server to
> additionally attest that they had individually built the derivation in
> question.

I definitely think that this is a future we'd want with Guix.

Goals:
 - That our software be fully reproducible in the first place
 - P2P distribution mechanisms for inputs (this one is relatively easy!)
 - "Community participation" in building derivations
 - P2P distribution of built artifacts (actually, if you have p2p
   distribution of inputs, you get this one almost for free)

There are challenges with all of this, but we know well enough what
p2p content-addressed infrastructure looks like; that isn't the hard
part.  Figuring out how to build a set of "semi-trusted sources", and
the UX around it, is the hard part.
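
To make the "semi-trusted sources" idea a bit more concrete, here is a
minimal sketch of the acceptance rule Jonathan describes above: accept a
substitute if one fully trusted server vouches for it, or if at least N
partially trusted servers report the same hash.  Everything here is
hypothetical (the names `Trust` and `acceptable_hash`, the server names,
the plain-string hashes); it is not how the Guix daemon works today, and
signature checking is assumed to have happened already.

```python
from collections import defaultdict
from enum import Enum


class Trust(Enum):
    FULL = "full"
    PARTIAL = "partial"


def acceptable_hash(claims, trust, threshold):
    """Return the output hash to accept for one store item, or None.

    claims:    dict mapping server name -> hash that server signed
    trust:     dict mapping server name -> Trust level; servers that are
               not listed are ignored, so extra anonymous nodes in the
               swarm cannot add votes
    threshold: number of partially trusted servers that must agree
    """
    full = set()
    partial = defaultdict(set)
    for server, digest in claims.items():
        level = trust.get(server)
        if level is Trust.FULL:
            full.add(digest)
        elif level is Trust.PARTIAL:
            partial[digest].add(server)

    if full:
        # A single fully trusted server suffices, but if fully trusted
        # servers disagree with each other we refuse to pick a side.
        return full.pop() if len(full) == 1 else None

    winners = [d for d, servers in partial.items() if len(servers) >= threshold]
    # Two different hashes both reaching the threshold is also a conflict.
    return winners[0] if len(winners) == 1 else None
```

Note that this only counts who *serves* a hash.  It does not address
Jonathan's last point: to count as independent evidence, each server
would also have to attest that it built the derivation itself.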

(In a weird way, compiling and verifying software is a "soft trap door
problem", I have been thinking.  Certainly not as much so as the
functions we require for cryptography to work, but it's still a trap
door, which is why build farms are expensive.)

I guess a worthwhile question is "where are the costs coming from"?
Ludovic said:

> The various options and back-of-the-envelope estimates we came up with
> are as follows:
>
>  1. Buying and hosting hardware:
>      250k€ for hardware
>      3k€/month (36k€/year)
>
>  2. Renting machines (e.g., on Hetzner):
>      6k€/month (72k€/year)
>
>  3. Sponsored:
>      get hardware and/or hosting sponsored (by academic institutions or
>      companies).

So I am guessing bandwidth costs are significant, but the 250k EUR for
hardware indicates this is primarily a build farm issue rather than a
content distribution / bandwidth issue.  (Do I have that right?)

Regardless, something I have thought about... #2 and #3 are cheaper but
"less preferable" than #1 because of security concerns, per my
understanding.

But... what if we managed to make #2 and #3 *more secure*?  Here is an
idea that is semi-p2p, and maybe a path towards a more full p2p option,
that we could possibly pursue.

 - We have machines hosted that we trust a bit less, at some hosting
   facility, or possibly sponsored from someplace we trust even less.
   Let's imagine we went with the imaginary MegaCloud Inc.

 - We have a set of keys for semi-trusted "Guix Builders", who have
   machines that they run in their houses/lodgings/etc. which sit around
   compiling Guix packages all day.  These could be, e.g., people who
   aren't even committers but have gained the trust of committers and
   maybe have even come to something like Guix Days in person.

Now imagine for a moment that I wanted to download the latest version
of... let's go oldschool in our FOSS references and say some expensive
to compile browser named IceWeasel. ;)

I want to download the latest version of IceWeasel.  I could compile it
myself, or I could get a substitute.  #1 feels like the most
"trustworthy" option at first glance, but it could actually be a
single point of failure and a single target for attack.

Okay, but what if instead I had the option to download something signed
off by *all of* the MegaCloud build service and two "Guix Builders", and
they all came to the same hash?

This seems even better than #1 from a security/integrity perspective, I
think.
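
As a rough sketch of that check (all names here are made up for
illustration, including `unanimous_hash`, "megacloud", and the builder
names; real substitute signatures are not modeled, only the hashes each
signer attests to):

```python
def unanimous_hash(signed_hashes, required, builders, min_builders=2):
    """Return the hash everyone agrees on, or None.

    signed_hashes: dict mapping signer name -> hash that signer attests to
    required:      signers that must all be present (e.g. the hosted farm)
    builders:      set of semi-trusted volunteer builder names
    min_builders:  how many of those builders must also attest
    """
    if not all(name in signed_hashes for name in required):
        return None

    attesting = [name for name in builders if name in signed_hashes]
    if len(attesting) < min_builders:
        return None

    # Every participating signer must report the same hash.  Any
    # disagreement means either a non-reproducible build or a compromised
    # builder, so we refuse rather than take a majority vote.
    hashes = {signed_hashes[name] for name in list(required) + attesting}
    return hashes.pop() if len(hashes) == 1 else None
```

With this rule, an attacker who compromises only MegaCloud, or only one
volunteer's machine, cannot get a tampered IceWeasel accepted; the client
would fall back to another source or to building locally.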

Just speculating...
 - Christine


