unofficial mirror of guix-devel@gnu.org 
* [GSoC 23] distributed substitutes, cost of storage
@ 2023-03-25 19:00 Attila Lendvai
  2023-03-26 20:06 ` Vijaya Anand
  2023-03-30 11:08 ` Maxime Devos
  0 siblings, 2 replies; 17+ messages in thread
From: Attila Lendvai @ 2023-03-25 19:00 UTC (permalink / raw)
  To: Vijaya Anand; +Cc: pukkamustard, guix-devel

welcome on board Anand!


> In case a user requests a substitute and there is a missing
> block in the decoding process, an HTTP request for the block would be
> sent to the substitute server and the server will encode the
> corresponding block in real time and push it back into the
> network. The block will be searched again and retrieved.


something to consider here: whose responsibility should it be that a block, that is missing from a p2p network, is (re-)uploaded there? the clients? or the current substitute server?

my gut instinct says that it's better if the clients do the (re-)upload of the blocks.

in this architecture the substitute server is just another storage mechanism alongside the other storage backends (although with different reliability characteristics), and it's the clients that are doing the mirroring/spreading/distribution of the blocks among the various backends. the clients of course will/should keep the current substitute servers at the bottom of their list of backends in their configuration.

this way the load is distributed, and we don't need to add (too much) extra complexity to the substitute server codebase, and the actors are less tightly coupled.

it's another question whether this mirroring should be enabled by default in the clients. probably it shouldn't, and the project infrastructure should be running clients where it is turned on. altruistic third parties could also enable this mirroring feature, and donate their bandwidth/resources.
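
A minimal sketch of the client-side arrangement described above, assuming a purely hypothetical `Backend` class and `fetch_block` helper (neither is an existing Guix API). Backends are tried in order with the substitute server last, and mirroring of missing blocks is opt-in:

```python
from typing import Dict, List, Optional


class Backend:
    """Hypothetical in-memory block store; stands in for a p2p network
    or for the existing HTTP substitute server."""

    def __init__(self, name: str, writable: bool = True) -> None:
        self.name = name
        self.writable = writable
        self.blocks: Dict[str, bytes] = {}

    def get(self, ref: str) -> Optional[bytes]:
        return self.blocks.get(ref)

    def put(self, ref: str, block: bytes) -> None:
        self.blocks[ref] = block


def fetch_block(ref: str, backends: List[Backend],
                mirror: bool = False) -> Optional[bytes]:
    """Try each backend in order. When `mirror` is set, copy the block
    into every writable backend that was asked first and did not have it."""
    missed: List[Backend] = []
    for backend in backends:
        block = backend.get(ref)
        if block is not None:
            if mirror:
                for earlier in missed:
                    if earlier.writable:
                        earlier.put(ref, block)
            return block
        missed.append(backend)
    return None


# p2p backend first, the current substitute server at the bottom of the list.
p2p = Backend("p2p")
server = Backend("substitute-server", writable=False)
server.blocks["ref-1"] = b"block data"

first = fetch_block("ref-1", [p2p, server])                # mirroring off (default)
second = fetch_block("ref-1", [p2p, server], mirror=True)  # opt-in mirroring
```

After the second call the block is also present in the `p2p` backend, so later requests no longer reach the substitute server.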

there's an issue with this, though:

some p2p storage backends will require some form of payment/credentials to use their resources. arguably, all p2p storage networks that will survive into the future will have some mechanism to limit the infinite abuse of their resources. it is to be researched how these payment mechanisms work on the various p2p networks, and whether it is possible that the Guix project pays for the storage globally, and then the random clients will have the necessary credentials to (re-)upload the missing blocks.

this architecture shouldn't be impossible, because the content is authenticated by its hash, and if the payment/authorization mechanism is based on the hashes of the blocks (probably), then any client could (re-)upload a missing block that was already paid for.
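
To illustrate the "if" in the paragraph above: a rough sketch of a store that authorizes uploads by block hash. The `PrepaidStore` class is hypothetical and is not a description of how Swarm or any other network actually handles payment; BLAKE2b-256 is used here only as an example hash.

```python
import hashlib
from typing import Dict, Set


def block_ref(block: bytes) -> str:
    # Example content hash; an assumption, any collision-resistant hash would do.
    return hashlib.blake2b(block, digest_size=32).hexdigest()


class PrepaidStore:
    """Hypothetical store: someone (e.g. the project) has paid for a set of
    block hashes, and anyone may upload a block whose hash is in that set."""

    def __init__(self, paid_refs: Set[str]) -> None:
        self.paid_refs = set(paid_refs)
        self.blocks: Dict[str, bytes] = {}

    def upload(self, block: bytes) -> bool:
        ref = block_ref(block)
        if ref not in self.paid_refs:
            return False  # nobody paid for this content
        self.blocks[ref] = block
        return True


block = b"some encoded block"
store = PrepaidStore({block_ref(block)})

accepted = store.upload(block)          # any client holding the block can restore it
rejected = store.upload(b"other data")  # content that was never paid for
```

Because the authorization is tied to the hash, the uploader needs no credentials of its own in this model.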

i'll look into this, especially in the context of Swarm.

meta: i think such specific discussions should be kept off-list, but the financing of the storage fees is probably something that should be known about more widely.

--
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
Every lie is a debt to the truth.




* Re: [GSoC 23] distributed substitutes, cost of storage
  2023-03-25 19:00 [GSoC 23] distributed substitutes, cost of storage Attila Lendvai
@ 2023-03-26 20:06 ` Vijaya Anand
  2023-03-26 21:19   ` Attila Lendvai
  2023-03-30 11:08 ` Maxime Devos
  1 sibling, 1 reply; 17+ messages in thread
From: Vijaya Anand @ 2023-03-26 20:06 UTC (permalink / raw)
  To: Attila Lendvai; +Cc: pukkamustard, guix-devel

Hi Attila

Thanks for the welcome!
I agree that the responsibility of re-uploading the blocks back to the
network should be with the clients rather than the substitute server. Also,
I didn't really think about the point about having to pay for the p2p
services at some point in time. In this case we will have to pay for the
storage of substitutes both on the p2p storage backend as well as for
storage in the substitute server, am I right? So ideally we will want to
eliminate the usage of these substitute servers and shift totally to p2p
services, and in this case we will have to shift the responsibility of
re-uploading the blocks to the clients themselves.
Also, if we don't keep the re-uploading blocks option as default for the
users, won't users usually not choose to enable it? Maybe we can keep it on
as default and resource-conscious users can choose to turn it off? Please
let me know your thoughts on these points and I will change the
implementation part of my proposal accordingly.

Thank you
Vijaya Anand


* Re: [GSoC 23] distributed substitutes, cost of storage
  2023-03-26 20:06 ` Vijaya Anand
@ 2023-03-26 21:19   ` Attila Lendvai
  2023-03-28 20:19     ` Vijaya Anand
  0 siblings, 1 reply; 17+ messages in thread
From: Attila Lendvai @ 2023-03-26 21:19 UTC (permalink / raw)
  To: Vijaya Anand; +Cc: pukkamustard, guix-devel

> Also I didn't really think about the point about having to pay for
> the p2p services at some point of time.


a quick note here: i forgot to mention that e.g. the Swarm Foundation has programs for supporting open source projects. so, chances are high that the storage needs for Guix would be paid for by the foundation.


> In this case we will have to pay for the storage of substitutes both
> on the p2p storage backend as well as for storage in the substitute
> server am I right?


well, the substitute servers are currently already operated (and paid for) by the Guix team. i don't think that p2p storage solutions have reached a point of maturity that we could rely on them alone. there should definitely be some time where both infrastructures are running in parallel. somewhere down the road a choice could be made to stop running the current substitute servers, but we are far away from that.

also, running swarm nodes that serve the network can earn money. so, the cost of running enough swarm nodes to pay for the storage needs of Guix on the swarm network should be in the same ballpark as the costs of running the current substitute servers. the storage price will be market based (this part is just being rolled out on the live network), so it's reasonable to expect that people will fire up nodes if the storage price goes well above the VPS costs.

and not all p2p storage networks are made equal. e.g. IPFS is only a registry of who is serving what. if you want to keep your data alive on IPFS, then you need to run some nodes and make sure that they are serving the content that you care about... and bear the costs of running these nodes. i.e. the DoS attack surface of IPFS is much smaller. (IPFS stores only the metadata in the DHT (i.e. where is what), while Swarm stores there the data itself -- they are different architectures with different features)

(i need to learn more about GNUnet)


> So ideally we will want to eliminate the usage of these substitute
> servers and shift totally to p2p services, and in this case we will
> have to shift the responsibility of re-uploading the blocks to the
> clients themselves.


yep, that's my way of thinking, too.

note that 'client' here has two meanings:

 1) some part of the codebase

 2) a program that is running on the computers of the Guix users

i was using it in the first sense.

without a functional Web of Trust solution, the Guix team will have to run nodes that compile packages, sign them with their signing keys, and make them available somewhere. currently they are published through an HTTP-based service that we call 'substitute servers'. this GSoC project is about adding more storage backends.

but those backends don't solve the problem that the Guix users need to trust someone with a private key who compiles and signs packages, regardless of the transport mechanism that gets the packages to the clients.

i can dream about a future where there's a social network that is based on digital signatures and encryption, and my Guix client authorizes compiled binaries based on some weighted transitive closure of signatures of my trusted peers... but we are not there yet. for now it's either trusting the Guix team's signature, or setting up your own substitute servers and build workers (or trusting someone else, but i'm not aware of any third party offering substitutes).


> Also, if we don't keep the re-uploading blocks option as default for
> the users, won't users usually not choose to enable it? Maybe we can
> keep it on as default and resource-conscious users can choose to
> turn it off? Please let me know your thoughts on these points and I
> will change the implementation part of my proposal accordingly.


first, it's a philosophical question: i value consensual relationships, and that implies that the other party is well informed. and clogging someone's network bandwidth is not an expected behavior from installing a linux distribution.

there's also a technical issue: these p2p backends will need to be configured. i have my doubts that we could ship a default config with Guix with which these p2p backends could just work out of the box, but... let's hope that i'm wrong!

and then there's the issue of payments: it's not obvious that a random client can just upload binaries into a p2p storage network. on some p2p networks someone needs to pay for that, and i don't yet understand well enough how the data is authorized through payments (they are called postage stamps in swarm).

a good, high level comparison of p2p storage solutions would be useful.

-- 
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“There is something in the human spirit that will survive and prevail, there is a tiny and brilliant light burning in the heart of man that will not go out no matter how dark the world becomes.”
	— Leo Tolstoy (1828–1910)




* Re: [GSoC 23] distributed substitutes, cost of storage
  2023-03-26 21:19   ` Attila Lendvai
@ 2023-03-28 20:19     ` Vijaya Anand
  2023-03-29  8:45       ` Andreas Enge
  2023-03-29  9:34       ` pukkamustard
  0 siblings, 2 replies; 17+ messages in thread
From: Vijaya Anand @ 2023-03-28 20:19 UTC (permalink / raw)
  To: Attila Lendvai; +Cc: guix-devel, pukkamustard

Hi,

Sorry for the late reply.
So in the case where we are running swarm nodes that serve the network and
hence help fund the substitute server, we can also use these to upload
ERIS-encoded substitute blocks onto the network, am I right? The total cost
will thus be: cost to run the swarm nodes + storage cost of substitute
blocks in the network + cost to run substitute servers - money earned by
running swarm nodes. But when we don't run swarm nodes that serve the
network, the total cost is not really affected, right, as the cost to run
swarm nodes will be lessened and no money will be earned?
So in the context of the fallback mechanism, the user client can send a
request to the substitute server for the missing block, and the substitute
server will serve the ERIS-encoded data block back to the user (using
HTTP). The responsibility of uploading these missing blocks back to the
network lies with the third-party nodes (which are running to serve the
desired content to the network). But how do we send a message to the node
to report the missing block on the network? Can it be done by the user
itself?
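
The cost accounting at the start of this message can be written out as a small function. All figures below are made up for illustration; none come from the thread.

```python
def total_cost(node_cost: float, storage_cost: float,
               server_cost: float, node_earnings: float) -> float:
    """Cost of running swarm nodes + storage cost of blocks on the network
    + cost of running substitute servers - money earned by the swarm nodes."""
    return node_cost + storage_cost + server_cost - node_earnings


# Illustrative numbers only (arbitrary units per month).
with_nodes = total_cost(node_cost=100, storage_cost=50,
                        server_cost=200, node_earnings=100)
without_nodes = total_cost(node_cost=0, storage_cost=50,
                           server_cost=200, node_earnings=0)
```

The two totals coincide only when the nodes earn exactly what they cost to run; if earnings exceed the running cost, running nodes lowers the total, and if they fall short, it raises it.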

"i can dream about a future where there's a social network that is based on
digital signatures and encryption, and my Guix client authorizes compiled
binaries based on some weighted transitive closure of signatures of my
trusted peers"....Interesting! In the case of accessing Guix substitutes
from p2p network, we ensure authorization by Guix team by making sure the
urn of the substitute is the urn mentioned in the narinfo (which we get
from a trusted source like the substitute server). So in the case of
accessing some random compiled binary from the network, we just need to
verify the authority of the document providing the urn of the content?

"i value consensual relationships, and that implies that the other party is
well informed. and clogging someone's network bandwidth is not an expected
behavior from installing a linux distribution."
I agree with this point, and also, since there are already specialised nodes
doing the work of uploading blocks onto the network, I guess for now it is
better to assign the task to them. Also, I will read up on how
networks charge for uploading data onto them. As you had mentioned before,
if the payment is associated with a block id, then maybe we could have any
random client upload data onto the network (if, of course, we are able to
ship the configuration to upload data onto the said network with Guix
itself).

Vijaya Anand





* Re: [GSoC 23] distributed substitutes, cost of storage
  2023-03-28 20:19     ` Vijaya Anand
@ 2023-03-29  8:45       ` Andreas Enge
  2023-03-29  9:26         ` pukkamustard
  2023-03-29  9:34       ` pukkamustard
  1 sibling, 1 reply; 17+ messages in thread
From: Andreas Enge @ 2023-03-29  8:45 UTC (permalink / raw)
  To: Vijaya Anand; +Cc: Attila Lendvai, guix-devel, pukkamustard

Hello,

On Wed, Mar 29, 2023 at 01:49:23AM +0530, Vijaya Anand wrote:
> In the case of accessing Guix substitutes from p2p
> network, we ensure authorization by Guix team by making sure the urn of the
> substitute is the urn mentioned in the narinfo

no, currently substitutes are authenticated by a digital signature from one
of the substitute servers (the user has control over which signing keys are
accepted, see /etc/guix/acl). It happens after the download.

And see
   https://guix.gnu.org/en/manual/devel/en/guix.html#Substitute-Server-Authorization .

Andreas




* Re: [GSoC 23] distributed substitutes, cost of storage
  2023-03-29  8:45       ` Andreas Enge
@ 2023-03-29  9:26         ` pukkamustard
  0 siblings, 0 replies; 17+ messages in thread
From: pukkamustard @ 2023-03-29  9:26 UTC (permalink / raw)
  To: Andreas Enge; +Cc: Vijaya Anand, Attila Lendvai, guix-devel


Andreas Enge <andreas@enge.fr> writes:

> Hello,
>
> On Wed, Mar 29, 2023 at 01:49:23AM +0530, Vijaya Anand wrote:
>> In the case of accessing Guix substitutes from p2p
>> network, we ensure authorization by Guix team by making sure the urn of the
>> substitute is the urn mentioned in the narinfo
>
> no, currently substitutes are authenticated by a digital signature from one
> of the substitute servers (the user has control over which signing keys are
> accepted, see /etc/guix/acl). It happens after the download.
>

Slight elaboration:

Currently the official Guix substitute servers provide a signed Narinfo
that contains the SHA256 sum of the substitute. The SHA256 sum of a
downloaded substitute is checked to match what is in the signed
Narinfo.

With the ERIS patches (https://issues.guix.gnu.org/52555) the signed
Narinfo also contains the ERIS URN. When getting a substitute this
signed ERIS URN is used. Decoding content from an ERIS URN guarantees
integrity, thus we also have authenticity.

Nevertheless, we still compute the SHA256 sum and check it. This is not
really necessary for ensuring authenticity but, imho, good practice for
now to be really sure we only use authenticated substitutes. Especially
when developing transparent fallback mechanisms that might go back to
just downloading the entire substitute from HTTP.
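
A minimal sketch of that final SHA256 check, assuming the narinfo signature has already been verified and that the expected hash is available as a hex string (the actual narinfo encoding of the hash may differ; the function name is hypothetical):

```python
import hashlib
import hmac


def nar_matches_narinfo(nar_bytes: bytes, expected_sha256_hex: str) -> bool:
    """Return True when the downloaded nar hashes to the value taken from
    the signed narinfo. Works the same whether the bytes arrived via ERIS
    blocks or via a plain HTTP download."""
    actual = hashlib.sha256(nar_bytes).hexdigest()
    return hmac.compare_digest(actual, expected_sha256_hex.lower())


nar = b"example nar contents"
signed_hash = hashlib.sha256(nar).hexdigest()  # stands in for the narinfo value

ok = nar_matches_narinfo(nar, signed_hash)
tampered = nar_matches_narinfo(nar + b"!", signed_hash)
```

Running the same check on every path means a fallback to HTTP cannot silently bypass authentication.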

-pukkamustard



* Re: [GSoC 23] distributed substitutes, cost of storage
  2023-03-28 20:19     ` Vijaya Anand
  2023-03-29  8:45       ` Andreas Enge
@ 2023-03-29  9:34       ` pukkamustard
  1 sibling, 0 replies; 17+ messages in thread
From: pukkamustard @ 2023-03-29  9:34 UTC (permalink / raw)
  To: Vijaya Anand; +Cc: Attila Lendvai, guix-devel


Vijaya Anand <sunrockers8@gmail.com> writes:

> Sorry for the late reply.
> So in the case where we are running swarm nodes that serve the network and hence help fund the substitute server, we can also use these
> to upload ERIS-encoded substitute blocks onto the network, am I right? The total cost will thus be: cost to run the swarm nodes +
> storage cost of substitute blocks in the network + cost to run substitute servers - money earned by running swarm nodes. But when
> we don't run swarm nodes that serve the network, the total cost is not really affected, right, as the cost to run swarm nodes will
> be lessened and no money will be earned?
> So in the context of the fallback mechanism, the user client can send a request to the substitute server for the missing block, and the
> substitute server will serve the ERIS-encoded data block back to the user (using HTTP). The responsibility of uploading these missing
> blocks back to the network lies with the third-party nodes (which are running to serve the desired content to the network). But how do we
> send a message to the node to report the missing block on the network? Can it be done by the user itself?

Some remarks:

- Maybe we don't need to do too much of an economic cost estimation, the
  current p2p networks (blocks over HTTP and IPFS) as well as ones that
  will quite possibly work soonish (GNUnet) do not incur any additional
  monetary storage cost. I think we should focus on resource usage
  (memory, disk, network) of users, servers and caching peers instead.

- One fallback strategy we absolutely need is to get a substitute
  using the existing mechanism (substitute is a single nar file that is
  retrieved over HTTP without ERIS or any other p2p stuff).

- I like the principle "Think globally, act locally". Maybe users
  downloading substitutes who want to improve access to substitutes over
  p2p should only try to do so within the scope of what they can do
  locally, i.e. by making the blocks available on the p2p network from
  the local machine. For IPFS and GNUnet this works very well and
  eliminates the necessity of more RPC endpoints.
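
The whole-nar fallback from the second remark could look roughly like this. It is a sketch only: `fetch_substitute` and the two fetcher callables are hypothetical names, not part of Guix or of the ERIS patches.

```python
from typing import Callable, Optional, Tuple

Fetcher = Callable[[], Optional[bytes]]


def fetch_substitute(p2p_fetch: Fetcher, http_fetch: Fetcher) -> Tuple[bytes, str]:
    """Try the p2p/ERIS path first; if it fails or yields nothing, fall back
    to the existing mechanism (one nar file retrieved over HTTP)."""
    try:
        data = p2p_fetch()
    except OSError:
        data = None  # network-level failure: treat as "not available"
    if data is not None:
        return data, "p2p"
    data = http_fetch()
    if data is None:
        raise LookupError("substitute unavailable from every source")
    return data, "http"


def broken_p2p() -> Optional[bytes]:
    raise ConnectionError("no peers have the blocks")


def http_nar() -> Optional[bytes]:
    return b"nar bytes"


data, source = fetch_substitute(broken_p2p, http_nar)
```

Whichever path delivers the bytes, the result would still go through the existing signature and hash checks before use.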

> "i can dream about a future where there's a social network that is based on digital signatures and encryption, and my Guix client
> authorizes compiled binaries based on some weighted transitive closure of signatures of my trusted peers"....Interesting! In the case
> of accessing Guix substitutes from p2p network, we ensure authorization by Guix team by making sure the urn of the substitute is the
> urn mentioned in the narinfo (which we get from a trusted source like the substitute server). So in the case of accessing some random
> compiled binary from the network, we just need to verify the authority of the document providing the urn of the content?

This is a very nice vision, that I share! However, maybe we should keep
it out of scope from the GSoC project and rely on the existing signature
mechanism for authenticity.

A web-of-trust-like system for substitutes would be an excellent
and very interesting follow-up project.

-pukkamustard


* Re: [GSoC 23] distributed substitutes, cost of storage
  2023-03-25 19:00 [GSoC 23] distributed substitutes, cost of storage Attila Lendvai
  2023-03-26 20:06 ` Vijaya Anand
@ 2023-03-30 11:08 ` Maxime Devos
  2023-04-04 10:53   ` Attila Lendvai
  1 sibling, 1 reply; 17+ messages in thread
From: Maxime Devos @ 2023-03-30 11:08 UTC (permalink / raw)
  To: Attila Lendvai, Vijaya Anand; +Cc: pukkamustard, guix-devel


[-- Attachment #1.1.1: Type: text/plain, Size: 3872 bytes --]



On 25-03-2023 at 20:00, Attila Lendvai wrote:
> welcome on board Anand!
> 
> 
>> In case a user requests for a substitute and there is a missing
>> block in the decoding process, a HTTP request for block would sent
>> to the substitute server and the server will encode the
>> corresponding block in real time and push it back into the
>> network. The block will be searched again and retrieved.
>
> something to consider here: whose responsibility should it be that a block, that is missing from a p2p network, is (re-)uploaded there? the clients? or the current substitute server?
> 
> my gut instinct says that it's better if the clients do the (re-)upload of the blocks.
> 
> in this architecture the substitute server is just another storage mechanism along the other storage backends (although with a different reliability characteristics), and it's the clients that are doing the mirroring/spreading/distribution of the blocks among the various backends. the clients of course will/should keep the current substitute servers at the bottom of their list of backends in their configuration.
> 
> this way the load is distributed, and we don't need to add (too much) extra complexity to the substitute server codebase, and the actors are less tightly coupled.
> 
> it's another question whether this mirroring should be enabled by default in the clients. probably it shouldn't,

It probably should -- if things aren't mirrored, then it's not p2p; you 
would lose the main performance benefit of p2p systems.

More cynically, some p2p systems (e.g. GNUnet) have mechanisms to 
disincentivize freeloaders -- clients that aren't being peers will get 
worse downloading speed.

> and the project infrastructure should be running clients where it is turned on. altruistic third parties could also enable this mirroring feature, and donate their bandwidth/resources.
> 
> there's an issue with this, though:
> 
> some p2p storage backends will require some form of payment/credentials to use their resources. arguably, all p2p storage networks that will survive into the future will have some mechanism to limit the infinite abuse of their resources. it is to be researched how these payment mechanisms work on the various p2p networks, and whether it is possible that the Guix project pays for the storage globally, and then the random clients will have the necessary credentials to (re-)upload the missing blocks.

GNUnet has a built-in mechanism for mirroring and for avoiding overuse 
of resources.  From what I recall of the documentation and the GNUnet 
papers:

* The more a peer A requests stuff of peer B,
   the more peer B dislikes peer A.
* Likewise, the more peer A fulfills requests of peer B,
   the more peer B likes peer A.
* Requests by liked peers are prioritized.

(If you squint, I suppose this could be considered a form of payment, 
but no literal currencies are involved, so no need for any financing.)
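
A minimal sketch of the three rules above, as seen from one local peer. This is an illustration only, not GNUnet's actual algorithm or API: the class name, the integer "standing" score, and the cost/value units are all assumptions.

```python
from collections import defaultdict


class ReciprocityLedger:
    """Toy per-peer standing kept by one local peer (hypothetical model)."""

    def __init__(self):
        # Positive standing means "liked", negative means "disliked".
        self.standing = defaultdict(int)

    def record_request_from(self, peer, cost=1):
        # A peer that asks us for something uses our resources: standing drops.
        self.standing[peer] -= cost

    def record_served_by(self, peer, value=1):
        # A peer that fulfilled one of our requests: standing rises.
        self.standing[peer] += value

    def prioritize(self, pending):
        # Serve requests from the best-liked peers first. sorted() is stable,
        # so requests from equally liked peers keep their arrival order.
        return sorted(pending, key=lambda req: -self.standing[req[0]])


ledger = ReciprocityLedger()
ledger.record_request_from("A", cost=3)  # A asked a lot of us
ledger.record_served_by("C", value=2)    # C answered our requests
queue = [("A", "block-1"), ("B", "block-2"), ("C", "block-3")]
ordered = ledger.prioritize(queue)
print(ordered)
```

Here C is served first, then the unknown peer B, then the heavy requester A; no currency is involved, only locally kept bookkeeping.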

Mirroring:

* When putting a resource on the network, a few copies
   are stored in the network.  (I assume this decreases the dislike of
   the peer that received the copy by the peer that sent the copy, and
   increases the dislike of the peer that sent the copy by the peer that
   received the copy.)

* The more popular a resource is, the more replicas are stored in
   the network (I don't recall the mechanism, but IIRC this is an
   automatic process).

* Peers set a quota on how many bytes they are willing to store;
   when exceeded, they throw out old stuff and low-priority stuff.

(The only way to be 100% sure a resource remains on the network, is to 
have a local copy in the local peer, so you can't really reliably ‘save’ 
something on the network, but you can use it as a CDN to spread the load 
and tolerate occasional downtime.)
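
The quota behaviour described above could look roughly like the sketch below. It is not GNUnet's datastore; the class, the priority numbers, and the "lowest priority first, then oldest" eviction order are assumptions chosen to match the description.

```python
class QuotaStore:
    """Toy block store with a byte quota (hypothetical, for illustration)."""

    def __init__(self, quota_bytes):
        self.quota_bytes = quota_bytes
        self.blocks = {}  # key -> (priority, insertion_tick, data)
        self.tick = 0

    def used(self):
        return sum(len(data) for _, _, data in self.blocks.values())

    def put(self, key, data, priority=0):
        self.tick += 1
        self.blocks[key] = (priority, self.tick, data)
        # Over quota: evict the lowest-priority block, oldest first among
        # equals, until the store fits again.
        while self.used() > self.quota_bytes and self.blocks:
            victim = min(self.blocks, key=lambda k: self.blocks[k][:2])
            del self.blocks[victim]

    def get(self, key):
        entry = self.blocks.get(key)
        return None if entry is None else entry[2]


store = QuotaStore(quota_bytes=10)
store.put("old", b"aaaa", priority=1)
store.put("popular", b"bbbb", priority=5)
store.put("new", b"cccc", priority=1)  # 12 bytes > 10: "old" is evicted
print(sorted(store.blocks))
```

Because any block can be evicted by a remote peer this way, the only copy that is guaranteed to survive is the one kept locally, which is the point made in the parenthetical above.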

Greetings,
Maxime

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 929 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 236 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [GSoC 23] distributed substitutes, cost of storage
  2023-03-30 11:08 ` Maxime Devos
@ 2023-04-04 10:53   ` Attila Lendvai
  2023-04-04 18:51     ` Maxime Devos
  2023-04-06  8:13     ` Simon Tournier
  0 siblings, 2 replies; 17+ messages in thread
From: Attila Lendvai @ 2023-04-04 10:53 UTC (permalink / raw)
  To: Maxime Devos; +Cc: Vijaya Anand, pukkamustard, guix-devel

> > it's another question whether this mirroring should be enabled by default in the clients. probably it shouldn't,
>
>
> It probably should -- if things aren't mirrored, then it's not p2p; you
> would lose the main performance benefit of p2p systems.
>
> More cynically, some p2p systems (e.g. GNUnet) have mechanisms to
> disincentivize freeloaders -- clients that aren't being peers will get
> worse downloading speed.


any successful p2p solution must have an incentive system that makes attacks expensive (freeloading, DoS'ing, censorship, etc). arguably, the most important difference between the various solutions is what this incentive system looks like.

from a bird's eye view perspective, there are two fundamental architectures of p2p storage networks (that i know of):

 1) ipfs-like, or torrent-like, where the nodes register/publish what
    they have in their local store, and other nodes may request it
    from them

 2) swarm-like, where the nodes are responsible for storing whatever
    content "is" in their "neighborhood". (block hashes and node ids
    are in the same domain, so there's a distance metric between a
    block and a node). put another way: Swarm stores not only the
    metadata in the DHT, but also the data itself.
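
to make the "same domain" idea in 2) concrete, here is a small sketch. it assumes a Kademlia-style XOR distance and SHA-256 addresses; the function names and the replica count are made up for illustration, and the real Swarm scheme may differ in its details.

```python
import hashlib


def address(data: bytes) -> int:
    # Blocks and nodes both get an address in the same 256-bit space.
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def distance(a: int, b: int) -> int:
    # XOR metric (an assumption, as used in Kademlia-style DHTs).
    return a ^ b


def responsible_nodes(block_addr, node_ids, replicas=2):
    # The nodes closest to the block's address are the ones expected
    # to store the block itself, not just a pointer to it.
    return sorted(node_ids, key=lambda n: distance(n, block_addr))[:replicas]


# Tiny 4-bit example so the arithmetic can be checked by hand.
nodes = [0b0001, 0b0100, 0b1000, 0b1110]
closest = responsible_nodes(0b1100, nodes)
print([bin(n) for n in closest])

# Same idea with real hashes (node names are hypothetical).
chunk_addr = address(b"some substitute chunk")
node_addrs = [address(name) for name in (b"node-a", b"node-b", b"node-c")]
holders = responsible_nodes(chunk_addr, node_addrs)
```

in the 4-bit example the distances to 0b1100 are 13, 8, 4 and 2, so the block lands on 0b1110 and 0b1000. in an architecture of type 1) there would be no such placement step: a node would only announce what it already holds.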

in 1) there's no need to pay for, and to upload content into the network. a node just registers as a source for whatever content it has locally, and then serves the incoming requests.

but if you have content that you want to make available in 2) then you need to make sure that this content gets to a set of distant nodes that will store it. this is very different from 1) from a game theoretic perspective, and can't be done without some form of payments/accounting.

in 1) it's simpler for a node to share: just give away your storage and bandwidth to the network.

in 2) it's more complicated, because if your node is requesting other nodes to do stuff, then you're spending a more complex set of resources than just your bandwidth, potentially including some crypto coin payments if the balance goes way off.

but both cases are fundamentally the same: users are spending their resources, and i wouldn't expect that installing a linux distro will start spending my network bandwidth, or any other resource than my machine's local resources.

but this of course can change, too: maybe a future Guix release can advertise with big red letters on the download page that installing it will use your network bandwidth to serve other guix nodes, unless it is turned off. and then all is well WRT informed consent.

--
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“Historically, the most terrible things - war, genocide, and slavery - have resulted not from disobedience, but from obedience.”
	— Howard Zinn (1922–2010)



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [GSoC 23] distributed substitutes, cost of storage
  2023-04-04 10:53   ` Attila Lendvai
@ 2023-04-04 18:51     ` Maxime Devos
  2023-04-05  7:19       ` Attila Lendvai
  2023-04-06  8:13     ` Simon Tournier
  1 sibling, 1 reply; 17+ messages in thread
From: Maxime Devos @ 2023-04-04 18:51 UTC (permalink / raw)
  To: Attila Lendvai; +Cc: Vijaya Anand, pukkamustard, guix-devel


[-- Attachment #1.1.1: Type: text/plain, Size: 8791 bytes --]



On 04-04-2023 at 12:53, Attila Lendvai wrote:
> Subject:
> Re: [GSoC 23] distributed substitutes, cost of storage
> From:
> Attila Lendvai <attila@lendvai.name>
> Date:
> 04-04-2023 12:53
> 
> To:
> Maxime Devos <maximedevos@telenet.be>
> CC:
> Vijaya Anand <sunrockers8@gmail.com>, pukkamustard 
> <pukkamustard@posteo.net>, guix-devel@gnu.org
> 
> 
>>> it's another question whether this mirroring should be enabled by default in the clients. probably it shouldn't,
>>
>> It probably should -- if things aren't mirrored, then it's not p2p; you
>> would lose the main performance benefit of p2p systems.
>>
>> More cynically, some p2p systems (e.g. GNUnet) have mechanisms to
>> disincentivize freeloaders -- clients that aren't being peers will get
>> worse downloading speed.
> 
> any successful p2p solution must have an incentive system that makes attacks expensive (freeloading, DoS'ing, censorship, etc). arguably, the most important difference between the various solutions is what this incentive system looks like.
> 
> from a bird's eye view perspective, there are two fundamental architectures of p2p storage networks (that i know of):
> 
>   1) ipfs-like, or torrent-like, where the nodes register/publish what
>      they have in their local store, and other nodes may request it
>      from them
> 
>   2) swarm-like, where the nodes are responsible for storing whatever
>      content "is" in their "neighborhood". (block hashes and node ids
>      are in the same domain, so there's a distance metric between a
>      block and a node). put another way: Swarm stores not only the
>      metadata in the DHT, but also the data itself.
> 
> in 1) there's no need to pay for, and to upload content into the network. a node just registers as a source for whatever content it has locally, and then serves the incoming requests.
> 
> but if you have content that you want to make available in 2) then you need to make sure that this content gets to a set of distant nodes that will store it. this is very different from 1) from a game theoretic perspective, and can't be done without some form of payments/accounting.
> 
> in 1) it's simpler for a node to share: just give away your storage and bandwidth to the network.
> 
> in 2) it's more complicated, because if your node is requesting other nodes to do stuff, then you're spending a more complex set of resources than just your bandwidth, potentially including some crypto coin payments if the balance goes way off.

GNUnet is (1) but also more than that, because of the automatic pushing 
to other nodes.  To my understanding it's not (2), but at the same time 
your comment about (2) applies.

Also, this crypto coin balance problem can be avoided by simply not 
basing your P2P system on money (crypto coins or otherwise); it's a 
problem that those systems invented for themselves.

> but both cases are fundamentally the same: users are spending their resources, and i wouldn't expect that installing a linux distro will start spending my network bandwidth, or any other resource than my machine's local resources.

Network bandwidth (and storage) _is_ a local resource.

Also, how are you going to keep your distribution up to date or install 
new software without allowing your distribution to spend network 
bandwidth? -- For non-P2P systems, it is already the case that that 
network bandwidth is spent by the local machine; P2P systems just make 
it more symmetrical and hence fairer.

More to the point, recalling that this is a reply to my statement that 
mirroring should be enabled by default:

 >> it's another question whether this mirroring should be enabled by 
default in the clients. probably it shouldn't,
 >
 > It probably should -- if things aren't mirrored, then it's not p2p; you
 > would lose the main performance benefit of p2p systems.
 >
 > More cynically, some p2p systems (e.g. GNUnet) have mechanisms to
 > disincentivize freeloaders -- clients that aren't being peers will get
 > worse downloading speed.

... and noticing that you are making a distinction between the resources 
of the user and others:

‘users are spending _their_ resources, and i wouldn't expect that [...] 
will start spending _my_ network bandwidth, [...], _my_ machine [...]’
(emphasis added)

... it appears that your view is that it's ok to spend resources of 
other people even without trying to reciprocate (*), and that it is 
unreasonable to expect reciprocation by default?

(*) I'm not claiming that not reciprocating is always bad -- it's a 
reasonable thing to not do when on a very limited plan.  Rather, the 
point is that reciprocating by default is reasonable and that in 
reasonable circumstances, not reciprocating is unreasonable.

I mean, given how you are a proponent of crypto, you appear to be a 
capitalist, so I'd think you are familiar with the idea that to use 
resources of other people, you need to compensate them (in money like 
with Swarm or in kind like with P2P systems (*)).

(*) I don't consider Swarm to be a P2P system -- Swarm _by design and 
intentionally_ actively maintains a class distinction between customers 
(people paying for storage and uploading) and, let's say, entrepreneurs 
(people getting paid for storage and downloading).  While sometimes a 
customer might also be an entrepreneur, by this inherent difference 
between customers and entrepreneurs in Swarm, by definition they aren't 
peers.

What also confuses me in that you appear to simultaneously subscribe to 
the view that it's fine to not compensate people _and_ the view that 
stuff should be paid -- for P2P systems with a quid-pro-quo system (like 
e.g. GNUnet), you believe it's unreasonable to automatically do the quid 
pro quo by default:

‘but both cases are fundamentally the same: users are spending their 
resources, and i wouldn't expect that installing a linux distro will 
start spending my network bandwidth, or any other resource than my 
machine's local resources.’

whereas at the same time you are a proponent of monetary systems like 
Swarm that are based on literally paying the person whose resources you 
are using.

More explicitly, I have a question: what makes the 'quid-pro-quo' and 
'literally money' systems so different that you think it's 
_unreasonable_ to expect people to _follow the basic principle of the 
quid-pro-quo system_ by default and _reasonable_ to just exploit the 
quid-pro-quo systems (i.e. by not doing the expected quid-pro-quo), and 
unreasonable to do exploiting (i.e. not paying) on 'literally money' 
systems?

> but this of course can change, too: maybe a future Guix release can advertise with big red letters on the download page that installing it will use your network bandwidth to serve other guix nodes, unless it is turned off. and then all is well WRT informed consent.

That's ‘consent’ the same way that cookie banners without a "Reject" 
button (*) are consent.  It's certainly ‘Informed’ and it's useful to 
warn people with low or expensive bandwidth to minimize the bandwidth 
limits in the GNUnet configuration, but to call it ‘consent’ is 
doublespeak.  I would prefer to not have doublespeak and instead to be 
honest that it's a requirement for installing Guix instead of twisting 
things into ‘consent’.

(*) Some variations are possible, e.g. a 'Reject all’ button that 
ignores the ‘Legitimate interest’ parts, and where you need to disable 
all illegitimate interests one-by-one which just takes a huge amount of 
time. Also cf., e.g., contracts of adhesion.

This paragraph also makes a false assumption: it assumes that Guix 
_will_ use network bandwidth to serve other Guix nodes, even though it 
would presumably still be an option to use the central substitution 
server (likely not the recommended option, but still an option, at least 
for a transition period).

Furthermore, this seems an illogical place to put such a warning -- 
there already is a place to put installation instructions and warnings: 
the manual! The Guix manual already has several warnings (just search 
for ‘Warning’), and there does not appear to be a fundamental difference 
between the new proposed warning about network bandwidth and the old 
warnings.

IIRC, the manual has a section on configuring which substitute servers 
to use (and presumably that part of the manual will later be extended 
with info about P2P substitution).  Likewise, IIRC, substitution is not 
enabled automatically, instead, there is some menu for configuring 
substitution.  The manual and that configuration menu seem way better 
places to me.

Greetings,
Maxime.

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 929 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 236 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [GSoC 23] distributed substitutes, cost of storage
  2023-04-04 18:51     ` Maxime Devos
@ 2023-04-05  7:19       ` Attila Lendvai
  0 siblings, 0 replies; 17+ messages in thread
From: Attila Lendvai @ 2023-04-05  7:19 UTC (permalink / raw)
  To: Maxime Devos; +Cc: Vijaya Anand, pukkamustard, guix-devel

thanks for the detailed elaboration Maxime!

prior to reading your email i was blind to the (rather obvious) fact that the current Guix servers are already run by someone (a peer), and they consume quite some resources, and it's currently financed through donations.

considering this, i now find it reasonable to have the resharing enabled by default. thanks for fixing this glitch in my model of reality, and sorry for being too dim here!


> Also, this crypto coin balance problem can be avoided by simply not
> basing your P2P system on money (crypto coins or otherwise); it's a
> problem that those systems invented for themselves.


the root problem is efficient ways of locking out non-cooperating agents from the fruits of the cooperation. using a balance sheet, and monetary settlements above a certain threshold, is one attempt at solving this task. it's yet to be seen which solution will survive this evolutionary landscape.


> ... it appears that your view is that it's ok to spend resources of
> other people even without trying to reciprocate (*), and that it is
> unreasonable to expect reciprocation by default?


no. i just lacked the necessary level of understanding of the terrain here.


> (*) I don't consider Swarm to be a P2P system -- Swarm by design and
> intentionally actively maintains a class distinction between customers
> (people paying for storage and uploading) and, let's say, entrepreneurs
> (people getting paid for storage and downloading). While sometimes a
> customer might also be an entrepreneur, by this inherent difference
> between customers and entrepreneurs in Swarm, by definition they aren't
> peers.


in the model proposed by Swarm every participant plays by the same rules. and on top of that, as long as someone's use of the shared resources is balanced with their contribution to the cooperation, then there are no monetary transactions involved. i don't see how this wouldn't qualify as p2p.

the only "class distinctions" i see here are the issuance of their crypto token, and the "unfair advantage" of the early investors and the founders (except the disadvantage of those who may end up losing their invested time/money if the project fails to deliver).


> That's ‘consent’ the same way that cookie banners without a "Reject"
> button (*) are consent. It's certainly ‘Informed’ and it's useful to
> warn people with low or expensive bandwidth to minimize the bandwidth
> limits in the GNUnet configuration, but to call it ‘consent’ is
> doublespeak. I would prefer to not have doublespeak and instead to be
> honest that it's a requirement for installing Guix instead of twisting
> things into ‘consent’.


i lost you here, and -- possibly due to that -- i find the doublespeak reference unfair.

consent means that i understand what's happening, and i agree to it, while i have the option to reject the situation/association without major harms to my interests.

if i'm aware that Guix will use my upstream (think of metered connections), and i install it anyway, and then i don't turn this feature off... then by those actions i implicitly consent to this happening.

it's somewhat tangential here, but the "reject cookies button" is not always a viable option. sometimes in life the only option besides agreeing to something offered is to "close the browser tab"; i.e. stop associating, which is not installing Guix in this context, which is not a major hindrance to anyone. (although, this is arguable... :)

-- 
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“War is a racket. It always has been. It is possibly the oldest, easily the most profitable, surely the most vicious. It is the only one international in scope. It is the only one in which the profits are reckoned in dollars and the losses in lives.”
	— Smedley Butler (1881–1940), 'War is a racket' (1935), US Marine major general (highest rank at that time)



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [GSoC 23] distributed substitutes, cost of storage
  2023-04-04 10:53   ` Attila Lendvai
  2023-04-04 18:51     ` Maxime Devos
@ 2023-04-06  8:13     ` Simon Tournier
  2023-04-07 22:45       ` Attila Lendvai
  1 sibling, 1 reply; 17+ messages in thread
From: Simon Tournier @ 2023-04-06  8:13 UTC (permalink / raw)
  To: Attila Lendvai, Maxime Devos; +Cc: Vijaya Anand, pukkamustard, guix-devel

Hi,

On Tue, 04 Apr 2023 at 10:53, Attila Lendvai <attila@lendvai.name> wrote:

>  2) swarm-like, where the nodes are responsible for storing whatever
>     content "is" in their "neighborhood". (block hashes and node ids
>     are in the same domain, so there's a distance metric between a
>     block and a node). put another way: Swarm stores not only the
>     metadata in the DHT, but also the data itself.

If like me, some reader does not know what Swarm means, I guess it
refers to “hard disk of the world computer” that the Ethereum Foundation
envisions.  From my rough understanding, it is the way the Ethereum
cryptocurrency [1] stores its blockchain.

The only reference I am able to find – and I have not read it at all –
is this book of 287 pages [2]:

                           the book of Swarm

  (storage and communication infrastructure for self-sovereign digital
           society back-end stack for the decentralised web)

Well, fully ignorant on this topic, I am missing how its design
specifically targeting one blockchain dedicated to cryptocurrency could
be adapted to share Guix substitutes.  However, somehow, I am not
convinced that Guix should introduce some mechanisms to tackle some free
rider problems [3].

Could you provide some details to satisfy my curiosity?

1: https://ethereum.org/en/what-is-ethereum/
2: https://www.ethswarm.org/The-Book-of-Swarm.pdf
3: https://en.wikipedia.org/wiki/Free-rider_problem

Cheers,
simon


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [GSoC 23] distributed substitutes, cost of storage
  2023-04-06  8:13     ` Simon Tournier
@ 2023-04-07 22:45       ` Attila Lendvai
  2023-04-08  0:46         ` Csepp
  2023-04-08  9:30         ` Simon Tournier
  0 siblings, 2 replies; 17+ messages in thread
From: Attila Lendvai @ 2023-04-07 22:45 UTC (permalink / raw)
  To: Simon Tournier; +Cc: Maxime Devos, Vijaya Anand, pukkamustard, guix-devel

> Could you provide some details to satisfy my curiosity?


this page also has a much shorter whitepaper:

https://www.ethswarm.org/why

it started out as a storage layer for Ethereum, and as such one of its main tasks is to store the blockchain data, but it's not limited to that. there's nothing special about the blockchain data from this perspective, and reflecting that, the Swarm project is not under the Ethereum umbrella anymore.


> However, somehow, I am not convinced that Guix should introduce some
> mechanisms to tackle some free rider problems [3].


isn't the main task of every p2p storage solution to tackle the free rider problem?

--
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“A proper association is united by ideas, not by men, and its members are loyal to the ideas, not to the group.”
	— Ayn Rand (1905–1982), 'Philosophy: Who Needs It' (1982)
-- 
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“At a distance, you only see my light. Come closer and know that I am you.”
	— Rumi (1207–1273)



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [GSoC 23] distributed substitutes, cost of storage
  2023-04-07 22:45       ` Attila Lendvai
@ 2023-04-08  0:46         ` Csepp
  2023-04-08 16:05           ` Attila Lendvai
  2023-04-08  9:30         ` Simon Tournier
  1 sibling, 1 reply; 17+ messages in thread
From: Csepp @ 2023-04-08  0:46 UTC (permalink / raw)
  To: Attila Lendvai
  Cc: Simon Tournier, Maxime Devos, Vijaya Anand, pukkamustard,
	guix-devel


Attila Lendvai <attila@lendvai.name> writes:

>> Could you provide some details to satisfy my curiosity?
>
>
> this page also has a much shorter whitepaper:
>
> https://www.ethswarm.org/why
>
> it started out as a storage layer for Ethereum, and as such one of its
> main tasks is to store the blockchain data, but it's not limited to
> that. there's nothing special about the blockchain data from this
> perspective, and reflecting that, the Swarm project is not under the
> Ethereum umbrella anymore.

Haven't read the Swarm thing, going more off of the general vibe of
these cryptocurrency related projects that keep popping up:
Using some kind of (optional) web of trust for clients makes more sense
to me than making people pay with cryptocurrencies.

I should be able to set up two computers on a LAN in the middle of
nowhere without having to care about some blockchain's global
consistency.

NDN can do this right and has been able to for a while.
Personally, I would try that and other established non-ponzi
technologies first for distributed substitutes.
Not being "permissionless" should not be considered a must have.

I have some more thoughts about this but I really should be sleeping.  I
already spent way too much redrafting this email.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [GSoC 23] distributed substitutes, cost of storage
  2023-04-07 22:45       ` Attila Lendvai
  2023-04-08  0:46         ` Csepp
@ 2023-04-08  9:30         ` Simon Tournier
  2023-04-08 15:53           ` Attila Lendvai
  1 sibling, 1 reply; 17+ messages in thread
From: Simon Tournier @ 2023-04-08  9:30 UTC (permalink / raw)
  To: Attila Lendvai; +Cc: Maxime Devos, Vijaya Anand, pukkamustard, guix-devel

Hi,

Please skip if it interferes in the discussion. :-)  That’s just my
partial understanding as an ignorant person on this topic of P2P storage
system.


First of all, thanks for considering the two architectures of P2P
storage networks: (1) one node publishes their local content or (2) one
node publishes a chunk of the global content.

Among all the drawbacks and/or difficulties of (2), one corollary
feature appears interesting to me: enforcing a global policy that
provides redundancy, so that data is never lost.  À la
git-annex-numcopies [1].

However, I am still missing how Swarm-like could help…


On Fri, 07 Apr 2023 at 22:45, Attila Lendvai <attila@lendvai.name> wrote:

> this page also has a much shorter whitepaper:
>
> https://www.ethswarm.org/why
>
> it started out as a storage layer for Ethereum, and as such one of its
> main tasks is to store the blockchain data, but it's not limited to
> that. there's nothing special about the blockchain data from this
> perspective, and reflecting that, the Swarm project is not under the
> Ethereum umbrella anymore.

… Thanks for the pointer.

Well, I do not understand how Swarm-like does not rely on some
cryptocurrency, here BZZ for Swarm itself.  Since Swarm-like enforces
that I must store data that I do not care about, this strong
constraint imposed by the system somehow breaks my altruism, and I
would only do so if financially compensated, which is what they implement:

        p.1
        Swarm is a peer-to-peer network of nodes that collectively
        provide a decentralised storage and communication service. This
        system is economically self-sustaining due to a built-in
        incentive system which is enforced through smart contracts on
        the Ethereum blockchain and powered by the BZZ token.

        p.2
        Built-in incentives seek to optimise the allocation of bandwidth
        and storage resources and render Swarm economically
        self-sustaining. Swarm nodes track their relative bandwidth
        contribution on each peer connection, and excess debt due to
        unequal consumption can be settled in BZZ. Publishers in Swarm
        must spend BZZ to purchase the right to write data to Swarm and
        prepay some rent for long term storage.

        p.7
        As nodes relay requests and responses, they keep track of their
        relative consumption of bandwidth with each of their
        peers. Within bounds peers engage in a service-for-service
        exchange. However, once a limit is reached, the party in debt
        can either wait until their liabilities are amortised over time,
        or can pay by sending cheques that cash out in BZZ on the
        blockchain (see figure 5).

        [...]

        Nodes are financially motivated to help each in relaying
        messages, because each node that successfully routes a request
        closer to the destination earns BZZ when the request was
        successfully served. If that node is not storing the data
        itself, it pays a small amount of money to request chunks from
        an even closer node. By doing such trades, nodes earn a little
        profit when serving a request. This implies that nodes are
        motivated to cache chunks as, after purchasing the chunk once
        from a closer node, any subsequent requests for the same chunk
        will earn pure profit.

        p.8
        A contract on the blockchain allows advance purchase of a
        postage batch for BZZ tokens. A batch entitles the owner to
        issue a limited number of stamps. These stamps then serve as a
        fiduciary signal indicating how much it is worth for a user to
        persist the associated content in Swarm. By using this value to
        prioritise which chunks to remove from the reserve first, storer
        nodes maximise the utility of the DISC (see figure 6).

    From the whitepaper: <https://www.ethswarm.org/swarm-whitepaper.pdf>.
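
To check my reading of the p.7 excerpt, here is a rough sketch of the per-peer accounting it describes.  It is only a toy: the class, the unit of "bandwidth", and the threshold value are hypothetical, the amortisation-over-time option is left out, and a real cheque would be cashed out in BZZ on the blockchain rather than appended to a list.

```python
class PeerAccount:
    """Toy bandwidth balance with one peer (hypothetical, not Swarm's code)."""

    def __init__(self, payment_threshold):
        self.payment_threshold = payment_threshold
        self.balance = 0  # positive: the peer owes us; negative: we owe the peer
        self.cheques_sent = []

    def we_served(self, units):
        # We relayed or served data for the peer.
        self.balance += units

    def peer_served(self, units):
        # The peer relayed or served data for us.
        self.balance -= units
        debt = -self.balance
        if debt > self.payment_threshold:
            # Service-for-service is exhausted: settle the whole debt.
            self.cheques_sent.append(debt)
            self.balance = 0


account = PeerAccount(payment_threshold=10)
account.peer_served(6)  # we owe 6
account.we_served(4)    # we owe 2: still plain service-for-service
account.peer_served(9)  # we owe 11 > 10: a cheque for 11 is issued
print(account.cheques_sent, account.balance)
```

As long as both sides serve each other in roughly equal amounts, no cheque is ever issued; the token only enters once the imbalance crosses the limit.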


In other words, if the underlying cryptocurrency does not have a
financial value (which would be the case, IMHO, for some Swarm-like
system applied to Guix), then I am not convinced that this “way” to
solve the free-rider problem is better (more sustainable) than other
P2P systems such as IPFS-like, Torrent-like, GNUnet-like, etc.


>> However, somehow, I am not convinced that Guix should introduce some
>> mechanisms to tackle some free rider problems [3].
>
> isn't the main task of every p2p storage solution to tackle the free
> rider problem? 

Yes, it is the task of P2P storage system.  Is Guix one P2P storage
solution?  Or should Guix exploit already implemented P2P storage
systems?

Cheers,
simon

1: https://git-annex.branchable.com/git-annex-numcopies/


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [GSoC 23] distributed substitutes, cost of storage
  2023-04-08  9:30         ` Simon Tournier
@ 2023-04-08 15:53           ` Attila Lendvai
  0 siblings, 0 replies; 17+ messages in thread
From: Attila Lendvai @ 2023-04-08 15:53 UTC (permalink / raw)
  To: Simon Tournier; +Cc: Maxime Devos, Vijaya Anand, pukkamustard, guix-devel

> Yes, it is the task of P2P storage system. Is Guix one P2P storage
> solution? Or should Guix exploit already implemented P2P storage
> systems?


i automatically assumed the latter, because p2p storage is a non-trivial task that multiple teams are working to solve, and it's yet to be seen which one will work out in the long run.

BTW, due to this it's a good idea to have ERIS between Guix and the various storage backends.
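
a minimal sketch of that layering in Python, under loud assumptions: this is not the ERIS API or anything in Guix. `MemoryBackend`, `block_ref` and `fetch_block` are hypothetical names, and a block reference is modelled as a plain BLAKE2b-256 digest of the block; the encryption and tree structure of the real ERIS encoding are left out. it only shows that a client can walk an ordered list of backends, verify what it gets, and optionally re-upload a block to the backends that were missing it.

```python
import hashlib


def block_ref(block: bytes) -> bytes:
    """Content address of a block (assumption: plain BLAKE2b-256 digest)."""
    return hashlib.blake2b(block, digest_size=32).digest()


class MemoryBackend:
    """Hypothetical stand-in for one storage backend (a p2p network, an HTTP server, ...)."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def get(self, ref):
        """Return the block stored under ref, or None if this backend lacks it."""
        return self.blocks.get(ref)

    def put(self, block):
        """Store a block and return its reference."""
        ref = block_ref(block)
        self.blocks[ref] = block
        return ref


def fetch_block(ref, backends, mirror=False):
    """Try backends in order and return the first block that verifies.

    With mirror=True, the block is re-uploaded to every backend that was
    tried earlier and did not have it.
    """
    missed = []
    for backend in backends:
        block = backend.get(ref)
        # Content addressing lets the client verify a block from any backend.
        if block is not None and block_ref(block) == ref:
            if mirror:
                for other in missed:
                    other.put(block)
            return block
        missed.append(backend)
    return None
```

usage would be something like `fetch_block(ref, [p2p, substitute_server], mirror=True)`, with the substitute server last in the list. because every block is verified against its reference, the client does not need to trust any particular backend.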

a Guix-specific p2p storage solution would be less exposed to abuse initially, but probably not forever.

-- 
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“Heroes are heroes because they are heroic in behavior, not because they won or lost.”
	— Nassim Taleb (1960–), 'Fooled by Randomness' (2004)




* Re: [GSoC 23] distributed substitutes, cost of storage
  2023-04-08  0:46         ` Csepp
@ 2023-04-08 16:05           ` Attila Lendvai
  0 siblings, 0 replies; 17+ messages in thread
From: Attila Lendvai @ 2023-04-08 16:05 UTC (permalink / raw)
  To: Csepp; +Cc: Simon Tournier, Maxime Devos, Vijaya Anand, pukkamustard,
	guix-devel

> Haven't read the Swarm thing, going more off of the general vibe of
> these cryptocurrency related projects that keep popping up:
> Using some kind of (optional) web of trust for clients makes more sense
> to me than making people pay with cryptocurrencies.
> 
> I should be able to set up two computers on a LAN in the middle of
> nowhere without having to care about some blockchain's global
> consistency.


yes, but those are different tasks, solved by different tools.

Swarm (and IPFS, and their ilk) solve large-scale cooperation in storing content. and the larger the scale of cooperation, the larger the benefits (redundancy, fault tolerance, speed due to automatic caching, etc.).


> NDN can do this right and has been able to for a while.
> Personally, I would try that and other established non-ponzi
> technologies first for distributed substitutes.
> Not being "permissionless" should not be considered a must have.


no, it's not a must have. it's just one more storage backend for storing substitutes -- once an abstraction layer like ERIS is installed.

without using ERIS, i wouldn't advocate for adding Swarm as the sole p2p backend integrated into Guix. there are better candidates for such a hypothetical singular position, or even as the first backend to get integrated.

-- 
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“The greatest crimes in the world are not committed by people breaking the rules. It's people who follow orders that drop bombs and massacre villages. As a precaution to ever committing major acts of evil it is our solemn duty never to do what we're told, this is the only way we can be sure.”
	— Banksy, a graffiti artist from Bristol, 'Wall and Piece'




Thread overview: 17+ messages
2023-03-25 19:00 [GSoC 23] distributed substitutes, cost of storage Attila Lendvai
2023-03-26 20:06 ` Vijaya Anand
2023-03-26 21:19   ` Attila Lendvai
2023-03-28 20:19     ` Vijaya Anand
2023-03-29  8:45       ` Andreas Enge
2023-03-29  9:26         ` pukkamustard
2023-03-29  9:34       ` pukkamustard
2023-03-30 11:08 ` Maxime Devos
2023-04-04 10:53   ` Attila Lendvai
2023-04-04 18:51     ` Maxime Devos
2023-04-05  7:19       ` Attila Lendvai
2023-04-06  8:13     ` Simon Tournier
2023-04-07 22:45       ` Attila Lendvai
2023-04-08  0:46         ` Csepp
2023-04-08 16:05           ` Attila Lendvai
2023-04-08  9:30         ` Simon Tournier
2023-04-08 15:53           ` Attila Lendvai
