unofficial mirror of guix-devel@gnu.org 
* Yet another Hydra mirror: hydra-mirror.marusich.info
@ 2016-03-08  6:37 Chris Marusich
  2016-03-08  9:04 ` Andy Wingo
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Chris Marusich @ 2016-03-08  6:37 UTC (permalink / raw)
  To: guix-devel

Hi,

I've set up a caching proxy to serve substitutes from Hydra. If you want
to use it, it's available here:

http://hydra-mirror.marusich.info

Behind the scenes, this endpoint is set up to distribute hydra.gnu.org's
substitutes via Amazon CloudFront, which is a global content
distribution network with points of presence in the United States,
Europe, Asia, and South America. That means that almost no matter where
you are, if your neighbor uses it to download substitutes, then you'll
be able to download the same cached substitutes quickly from a nearby
location.

I'm making this mirror available to you in the hopes that you find it
useful. Please note that I make no guarantee about the continued
availability of the endpoint, so please don't rely on it for anything
important. Amazon CloudFront bills me for only the requests received and
bytes transferred via the endpoint, so the more people who use it, the
more it will cost. I reserve the right to disable it without prior
notice if it becomes too expensive.

Full disclosure: I'm employed by Amazon. However, I'm paying for this
mirror out of my own pocket. I'm personally making it available to you
because I like the Guix project a lot, and I want to help it in any way
I can. I do not represent Amazon, nor does the mirror, which I am
operating personally just like any other CloudFront customer.

I hope you find this mirror useful.

Chris


* Re: Yet another Hydra mirror: hydra-mirror.marusich.info
  2016-03-08  6:37 Yet another Hydra mirror: hydra-mirror.marusich.info Chris Marusich
@ 2016-03-08  9:04 ` Andy Wingo
  2016-03-08  9:57   ` Andreas Enge
  2016-03-08  9:13 ` Ludovic Courtès
  2016-04-06 13:43 ` Nils Gillmann
  2 siblings, 1 reply; 11+ messages in thread
From: Andy Wingo @ 2016-03-08  9:04 UTC (permalink / raw)
  To: Chris Marusich; +Cc: guix-devel

On Tue 08 Mar 2016 07:37, Chris Marusich <cmmarusich@gmail.com> writes:

> I've set up a caching proxy to serve substitutes from Hydra. If you want
> to use it, it's available here:
>
> http://hydra-mirror.marusich.info
>
> Behind the scenes, this endpoint is set up to distribute hydra.gnu.org's
> substitutes via Amazon CloudFront, which is a global content
> distribution network with points of presence in the United States,
> Europe, Asia, and South America. That means that almost no matter where
> you are, if your neighbor uses it to download substitutes, then you'll
> be able to download the same cached substitutes quickly from a nearby
> location.

Neat :)

Right now hydra.gnu.org is in this weird situation where people who use
it have to trust it, modulo "guix challenge" of course.  But really all
we have to trust is the mapping from the derivation (like the "foo"
package) to a hash of the build results; the actual build result could
be transferred from anywhere with no trust issues at all, provided that
we verify the hash.  (Do I understand the situation correctly?)  Anyway
it would be very interesting to be able to distribute the build products
using more scalable channels without having to trust more people.

Andy


* Re: Yet another Hydra mirror: hydra-mirror.marusich.info
  2016-03-08  6:37 Yet another Hydra mirror: hydra-mirror.marusich.info Chris Marusich
  2016-03-08  9:04 ` Andy Wingo
@ 2016-03-08  9:13 ` Ludovic Courtès
  2016-03-09  8:27   ` Chris Marusich
  2016-04-06 13:43 ` Nils Gillmann
  2 siblings, 1 reply; 11+ messages in thread
From: Ludovic Courtès @ 2016-03-08  9:13 UTC (permalink / raw)
  To: Chris Marusich; +Cc: guix-devel

Chris Marusich <cmmarusich@gmail.com> skribis:

> I've set up a caching proxy to serve substitutes from Hydra. If you want
> to use it, it's available here:
>
> http://hydra-mirror.marusich.info

Nice!  Are you using the nginx config that’s in guix-maintenance.git?

> Full disclosure: I'm employed by Amazon. However, I'm paying for this
> mirror out of my own pocket. I'm personally making it available to you
> because I like the Guix project a lot, and I want to help it in any way
> I can. I do not represent Amazon, nor does the mirror, which I am
> operating personally just like any other CloudFront customer.

Thank you for your contribution!  Redundancy is always an improvement.

Ludo’.


* Re: Yet another Hydra mirror: hydra-mirror.marusich.info
  2016-03-08  9:04 ` Andy Wingo
@ 2016-03-08  9:57   ` Andreas Enge
  2016-03-09 12:37     ` Ludovic Courtès
  0 siblings, 1 reply; 11+ messages in thread
From: Andreas Enge @ 2016-03-08  9:57 UTC (permalink / raw)
  To: Andy Wingo; +Cc: guix-devel

On Tue, Mar 08, 2016 at 10:04:33AM +0100, Andy Wingo wrote:
> Right now hydra.gnu.org is in this weird situation where people who use
> it have to trust it, modulo "guix challenge" of course.  But really all
> we have to trust is the mapping from the derivation (like the "foo"
> package) to a hash of the build results; the actual build result could
> be transferred from anywhere with no trust issues at all, provided that
> we verify the hash.  (Do I understand the situation correctly?)

Yes, if I understand you correctly :-)  Clearly, we need to trust someone;
it is hydra.gnu.org (or more precisely, a machine in its build farm) that
creates the mapping from a derivation to a build result. So we cannot do
without trusting it. The signature that hydra provides serves two purposes:
it creates the hash and adds this trust value.

> Anyway
> it would be very interesting to be able to distribute the build products
> using more scalable channels without having to trust more people.

This is the case for the web caches, which distribute the signature of
hydra.gnu.org with the packages. Actually, any distribution process would do,
a DHT or any kind of store.

Andreas


* Re: Yet another Hydra mirror: hydra-mirror.marusich.info
  2016-03-08  9:13 ` Ludovic Courtès
@ 2016-03-09  8:27   ` Chris Marusich
  2016-03-09 12:42     ` Ludovic Courtès
  0 siblings, 1 reply; 11+ messages in thread
From: Chris Marusich @ 2016-03-09  8:27 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel

ludo@gnu.org (Ludovic Courtès) writes:

> Nice!  Are you using the nginx config that’s in guix-maintenance.git?

No, I'm not using that at the moment. In the future, if I set up any
nginx servers to accomplish the same task, I will definitely use it. I
would prefer to run my own servers, but for now this is something I can
do immediately to help the project, so I decided to do it.

CloudFront is a service, so you use its API (or the AWS Management
Console, which is a web UI for the API) to create a "distribution" and
configure it to use hydra.gnu.org as its "origin". A little extra work
is required to glue everything together. For example, I had to create a
CNAME pointing from hydra-mirror.marusich.info to
d2xj50ygrk34qq.cloudfront.net, which is the canonical name of my
distribution. Once it's configured, all requests sent to
hydra-mirror.marusich.info are serviced by a nearby point of presence in
the CloudFront content distribution network, and the results are cached.

I've noticed that Hydra does not include cache-related headers (e.g.,
Cache-Control). Perhaps for this reason, the nginx config you linked
seems to pick arbitrary caching settings. When using CloudFront, a
distribution can be configured to respect the Cache-Control headers sent
by the origin server. Does nginx provide similar functionality? Would it
make sense to have hydra.gnu.org return such headers?

For now, I've configured hydra-mirror.marusich.info to cache all
successful requests for 1 week, and to respect cache-related headers
from hydra.gnu.org if it ever decides to send them. This seemed like
a reasonable configuration for data which is not expected to change.
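
The policy described above (honor the origin's cache headers when present, otherwise cache for one week) can be sketched roughly as follows. This is only an illustration, not CloudFront's or nginx's actual logic; the function name `cache_ttl` and the simplified handling of `no-cache` and `private` are assumptions made for the example.

```python
ONE_WEEK = 7 * 24 * 60 * 60  # fallback TTL in seconds (the 1-week setting)


def cache_ttl(headers, default_ttl=ONE_WEEK):
    """Return how long (in seconds) a shared cache may keep a response.

    Hypothetical helper: 0 means "do not cache".
    """
    # Header names are case-insensitive, so normalise them first.
    normalised = {name.lower(): value for name, value in headers.items()}
    cache_control = normalised.get("cache-control")
    if cache_control is None:
        # The origin sent no guidance (Hydra's current behaviour): use the default.
        return default_ttl

    # Split "public, max-age=60" into {"public": "", "max-age": "60"}.
    directives = {}
    for part in cache_control.split(","):
        name, _, value = part.strip().partition("=")
        directives[name.lower()] = value.strip('"')

    # Simplification: treat all three directives as "do not cache here".
    if any(d in directives for d in ("no-store", "no-cache", "private")):
        return 0

    # For a shared cache, s-maxage takes precedence over max-age.
    for name in ("s-maxage", "max-age"):
        if directives.get(name, "").isdigit():
            return int(directives[name])
    return default_ttl


print(cache_ttl({}))                               # 604800
print(cache_ttl({"Cache-Control": "max-age=60"}))  # 60
```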

Chris


* Re: Yet another Hydra mirror: hydra-mirror.marusich.info
  2016-03-08  9:57   ` Andreas Enge
@ 2016-03-09 12:37     ` Ludovic Courtès
  0 siblings, 0 replies; 11+ messages in thread
From: Ludovic Courtès @ 2016-03-09 12:37 UTC (permalink / raw)
  To: Andreas Enge; +Cc: guix-devel

Andreas Enge <andreas@enge.fr> skribis:

> On Tue, Mar 08, 2016 at 10:04:33AM +0100, Andy Wingo wrote:
>> Right now hydra.gnu.org is in this weird situation where people who use
>> it have to trust it, modulo "guix challenge" of course.  But really all
>> we have to trust is the mapping from the derivation (like the "foo"
>> package) to a hash of the build results; the actual build result could
>> be transferred from anywhere with no trust issues at all, provided that
>> we verify the hash.  (Do I understand the situation correctly?)
>
> Yes, if I understand you correctly :-)

I think you both understand correctly.  :-)

That is, hydra.gnu.org serves narinfos like:

  http://hydra.gnu.org/n0rgvy9c0cwv453k5bczwscp6iwqa4fc.narinfo

They contain all the meta-data for the corresponding store item,
including a hash of its content, and said meta-data is signed.  See
(guix pki) and
<https://www.gnu.org/software/guix/manual/html_node/Substitutes.html>
for details.

This is why we can mirror things as-is and have users benefit from it
without having to trust any additional party.
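
To make the idea concrete, here is a minimal sketch of the content check a client could run on an archive fetched from an untrusted mirror. Several things are assumptions for illustration only: the field name `NarHash`, the plain "Key: value" layout, the hex encoding of the hash, and the store path in the example. Verifying the signature on the narinfo itself is deliberately left out; the hash comparison only means something once that signature has been checked against a trusted key.

```python
import hashlib


def parse_narinfo(text):
    """Parse 'Key: value' lines into a dict (simplified, hypothetical layout)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            fields[key] = value
    return fields


def nar_matches(narinfo_fields, nar_bytes):
    """Check archive bytes against the hash recorded in the signed metadata."""
    algorithm, _, expected = narinfo_fields["NarHash"].partition(":")
    if algorithm != "sha256":
        raise ValueError("unsupported hash algorithm: " + algorithm)
    # Assumption: the hash is hex-encoded; a real narinfo may use another encoding.
    return hashlib.sha256(nar_bytes).hexdigest() == expected


# Made-up example data; the store path is hypothetical.
nar = b"example archive contents"
narinfo = (
    "StorePath: /gnu/store/example-item\n"
    "NarHash: sha256:" + hashlib.sha256(nar).hexdigest() + "\n"
)
fields = parse_narinfo(narinfo)
print(nar_matches(fields, nar))         # True
print(nar_matches(fields, nar + b"x"))  # False
```

Because the check depends only on the signed metadata, it does not matter which mirror, cache, or other channel delivered the bytes.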


Mirrors are nice because they’re easy to set up, completely transparent
for users, and allow our infrastructure to scale quickly.  Now, another
thing that would be great is to have independent build farms (running
‘guix publish’) so there is no single point of trust.

Ludo’.


* Re: Yet another Hydra mirror: hydra-mirror.marusich.info
  2016-03-09  8:27   ` Chris Marusich
@ 2016-03-09 12:42     ` Ludovic Courtès
  2016-03-11  4:08       ` Chris Marusich
  0 siblings, 1 reply; 11+ messages in thread
From: Ludovic Courtès @ 2016-03-09 12:42 UTC (permalink / raw)
  To: Chris Marusich; +Cc: guix-devel

Chris Marusich <cmmarusich@gmail.com> skribis:

> No, I'm not using that at the moment. In the future, if I set up any
> nginx servers to accomplish the same task, I will definitely use it. I
> would prefer to run my own servers, but for now this is something I can
> do immediately to help the project, so I decided to do it.
>
> CloudFront is a service, so you use its API (or the AWS Management
> Console, which is a web UI for the API) to create a "distribution" and
> configure it to use hydra.gnu.org as its "origin". A little extra work
> is required to glue everything together. For example, I had to create a
> CNAME pointing from hydra-mirror.marusich.info to
> d2xj50ygrk34qq.cloudfront.net, which is the canonical name of my
> distribution. Once it's configured, all requests sent to
> hydra-mirror.marusich.info are serviced by a nearby point of presence in
> the CloudFront content distribution network, and the results are cached.

OK.

Do you know exactly how much is cached?  Also, when does caching happen?

When using nginx as a proxy as on mirror.guixsd.org, it fetches things
lazily, so on a cache miss it connects to hydra.gnu.org.

> I've noticed that Hydra does not include cache-related headers (e.g.,
> Cache-Control). Perhaps for this reason, the nginx config you linked
> seems to pick arbitrary caching settings. When using CloudFront, a
> distribution can be configured to respect the Cache-Control headers sent
> by the origin server. Does nginx provide similar functionality? Would it
> make sense to have hydra.gnu.org return such headers?

Dunno, maybe!  Maybe we could tell nginx to add such headers?  What
would be the right thing?

> For now, I've configured hydra-mirror.marusich.info to cache all
> successful requests for 1 week, and to respect cache-related headers
> from hydra.gnu.org if it ever decides to send them. This seemed like
> a reasonable configuration for data which is not expected to change.

Yeah, mirror.guixsd.org also caches for one week now.  I wasn’t sure how
much disk space that would represent, but so far we’re around 10G, so
increasing to 1 week seemed reasonable.

Thanks,
Ludo’.


* Re: Yet another Hydra mirror: hydra-mirror.marusich.info
  2016-03-09 12:42     ` Ludovic Courtès
@ 2016-03-11  4:08       ` Chris Marusich
  2016-03-11 14:47         ` Ludovic Courtès
  0 siblings, 1 reply; 11+ messages in thread
From: Chris Marusich @ 2016-03-11  4:08 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel

ludo@gnu.org (Ludovic Courtès) writes:

> Yeah, mirror.guixsd.org also caches for one week now.  I wasn’t sure how
> much disk space that would represent, but so far we’re around 10G, so
> increasing to 1 week seemed reasonable.

Are there any URL paths a caching proxy should NOT cache for a long
time? My understanding is that it's OK to cache all URL paths for 1
week. Are there any URL paths for which I should return something
different from Hydra (e.g., /nix-cache-info)?

> OK.
>
> Do you know exactly how much is cached?  Also, when does caching happen?

Every OK result is cached for 7 days. My understanding is that negative
results (e.g., 404) are not cached. Storage is effectively
unlimited from my perspective, since CloudFront is a service.

> When using nginx as a proxy as on mirror.guixsd.org, it fetches things
> lazily, so on a cache miss it connects to hydra.gnu.org.

CloudFront operates the same way.
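
The lazy ("pull-through") behavior discussed here, where nothing is fetched until a client asks for it, successful results are kept for 7 days, and negative results are not cached, can be modeled in a few lines. This is a toy sketch, not how CloudFront or nginx is implemented; the class name `PullThroughCache` and the `fetch_from_origin` callable are hypothetical.

```python
import time

ONE_WEEK = 7 * 24 * 60 * 60  # seconds


class PullThroughCache:
    """Toy model of a lazy caching proxy (hypothetical, for illustration)."""

    def __init__(self, fetch_from_origin, ttl=ONE_WEEK, clock=time.time):
        self._fetch = fetch_from_origin  # callable: path -> (status, body)
        self._ttl = ttl
        self._clock = clock
        self._entries = {}               # path -> (expiry_time, body)

    def get(self, path):
        now = self._clock()
        entry = self._entries.get(path)
        if entry is not None and entry[0] > now:
            return 200, entry[1], "hit"
        # Cache miss (or expired entry): only now do we contact the origin.
        status, body = self._fetch(path)
        if status == 200:
            # Only successful responses are stored; 404s are fetched again next time.
            self._entries[path] = (now + self._ttl, body)
        return status, body, "miss"


def fake_origin(path):
    """Stand-in for the origin server; the paths are made up."""
    return (200, b"data") if path == "/nar/example" else (404, b"")


cache = PullThroughCache(fake_origin)
print(cache.get("/nar/example")[2])  # miss
print(cache.get("/nar/example")[2])  # hit
print(cache.get("/missing")[2])      # miss
print(cache.get("/missing")[2])      # miss
```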

>> I've noticed that Hydra does not include cache-related headers (e.g.,
>> Cache-Control). Perhaps for this reason, the nginx config you linked
>> seems to pick arbitrary caching settings. When using CloudFront, a
>> distribution can be configured to respect the Cache-Control headers sent
>> by the origin server. Does nginx provide similar functionality? Would it
>> make sense to have hydra.gnu.org return such headers?
>
> Dunno, maybe!  Maybe we could tell nginx to add such headers?  What
> would be the right thing?

I'm not sure. It's obviously not necessary, but if the common proxy
servers (e.g., nginx) obey such cache control headers by default, it
might be nice to return the headers. That way, Hydra could "recommend"
caching behavior, and it would be easier to set up a correctly
functioning caching proxy. However, if servers like nginx don't obey
cache control headers by default, or if Hydra does not require different
objects to be cached differently, it may not be worth the effort.

Chris


* Re: Yet another Hydra mirror: hydra-mirror.marusich.info
  2016-03-11  4:08       ` Chris Marusich
@ 2016-03-11 14:47         ` Ludovic Courtès
  0 siblings, 0 replies; 11+ messages in thread
From: Ludovic Courtès @ 2016-03-11 14:47 UTC (permalink / raw)
  To: Chris Marusich; +Cc: guix-devel

Chris Marusich <cmmarusich@gmail.com> skribis:

> ludo@gnu.org (Ludovic Courtès) writes:
>
>> Yeah, mirror.guixsd.org also caches for one week now.  I wasn’t sure how
>> much disk space that would represent, but so far we’re around 10G, so
>> increasing to 1 week seemed reasonable.
>
> Are there any URL paths a caching proxy should NOT cache for a long
> time? My understanding is that it's OK to cache all URL paths for 1
> week. Are there any URL paths for which I should return something
> different from Hydra (e.g., /nix-cache-info)?

Everything can be cached, where “everything” really means:

  /nix-cache-info
  /.*\.narinfo$
  /nar/.*

The rest should not be proxied at all, because it’s dynamic in nature
and some of it is quite expensive to generate (things that involve
complex SQL queries.)
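
For anyone writing their own proxy rules, the three patterns above can be expressed as a small allow-list check. This is just a sketch: the function name `should_proxy` is made up, the example dynamic path is hypothetical, and the path is assumed to arrive without a query string.

```python
import re

# The three patterns listed above, applied to the whole request path.
CACHEABLE_PATTERNS = [
    re.compile(r"/nix-cache-info"),
    re.compile(r"/.*\.narinfo$"),
    re.compile(r"/nar/.*"),
]


def should_proxy(path):
    """Return True if the path is static substitute data that is safe to cache."""
    return any(pattern.fullmatch(path) for pattern in CACHEABLE_PATTERNS)


print(should_proxy("/nix-cache-info"))                           # True
print(should_proxy("/n0rgvy9c0cwv453k5bczwscp6iwqa4fc.narinfo"))  # True
print(should_proxy("/jobset/example"))                           # False
```

Anything the function rejects would be refused by the proxy rather than forwarded, so that the expensive dynamic pages are never requested through the mirror.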

>> OK.
>>
>> Do you know exactly how much is cached?  Also, when does caching happen?
>
> Every OK result is cached for 7 days. My understanding is that negative
> results (e.g., 404) are not cached. Storage is effectively
> unlimited from my perspective, since CloudFront is a service.

Good.  :-)

>> When using nginx as a proxy as on mirror.guixsd.org, it fetches things
>> lazily, so on a cache miss it connects to hydra.gnu.org.
>
> CloudFront operates the same way.

Yeah, I’ve seen from the headers that it runs nginx.

Thanks,
Ludo’.


* Re: Yet another Hydra mirror: hydra-mirror.marusich.info
  2016-03-08  6:37 Yet another Hydra mirror: hydra-mirror.marusich.info Chris Marusich
  2016-03-08  9:04 ` Andy Wingo
  2016-03-08  9:13 ` Ludovic Courtès
@ 2016-04-06 13:43 ` Nils Gillmann
  2016-04-07  4:56   ` Chris Marusich
  2 siblings, 1 reply; 11+ messages in thread
From: Nils Gillmann @ 2016-04-06 13:43 UTC (permalink / raw)
  To: guix-devel

Chris Marusich <cmmarusich@gmail.com> writes:

> Hi,
>
> I've set up a caching proxy to serve substitutes from Hydra. If you want
> to use it, it's available here:
>
> http://hydra-mirror.marusich.info
>
> Behind the scenes, this endpoint is set up to distribute hydra.gnu.org's
> substitutes via Amazon CloudFront, which is a global content
> distribution network with points of presence in the United States,
> Europe, Asia, and South America. That means that almost no matter where
> you are, if your neighbor uses it to download substitutes, then you'll
> be able to download the same cached substitutes quickly from a nearby
> location.
>
> I'm making this mirror available to you in the hopes that you find it
> useful. Please note that I make no guarantee about the continued
> availability of the endpoint, so please don't rely on it for anything
> important. Amazon CloudFront bills me for only the requests received and
> bytes transferred via the endpoint, so the more people who use it, the
> more it will cost. I reserve the right to disable it without prior
> notice if it becomes too expensive.
>
> Full disclosure: I'm employed by Amazon. However, I'm paying for this
> mirror out of my own pocket. I'm personally making it available to you
> because I like the Guix project a lot, and I want to help it in any way
> I can. I do not represent Amazon, nor does the mirror, which I am
> operating personally just like any other CloudFront customer.
>
> I hope you find this mirror useful.
>
> Chris
>
>

Hi,

About a month has passed since you posted this.

Could you tell me how much inbound and outbound traffic it has consumed
(separate values if possible)? Given my current moving situation, I'm
deciding whether to test and include in my setup a mirror now or a
build server later.

-- 
ng0


* Re: Yet another Hydra mirror: hydra-mirror.marusich.info
  2016-04-06 13:43 ` Nils Gillmann
@ 2016-04-07  4:56   ` Chris Marusich
  0 siblings, 0 replies; 11+ messages in thread
From: Chris Marusich @ 2016-04-07  4:56 UTC (permalink / raw)
  To: Nils Gillmann; +Cc: guix-devel


Nils Gillmann <niasterisk@grrlz.net> writes:

> Chris Marusich <cmmarusich@gmail.com> writes:
>
>> Hi,
>>
>> I've set up a caching proxy to serve substitutes from Hydra. If you want
>> to use it, it's available here:
>>
>> http://hydra-mirror.marusich.info
>>
>> Behind the scenes, this endpoint is set up to distribute hydra.gnu.org's
>> substitutes via Amazon CloudFront, which is a global content
>> distribution network with points of presence in the United States,
>> Europe, Asia, and South America. That means that almost no matter where
>> you are, if your neighbor uses it to download substitutes, then you'll
>> be able to download the same cached substitutes quickly from a nearby
>> location.
>>
>> I'm making this mirror available to you in the hopes that you find it
>> useful. Please note that I make no guarantee about the continued
>> availability of the endpoint, so please don't rely on it for anything
>> important. Amazon CloudFront bills me for only the requests received and
>> bytes transferred via the endpoint, so the more people who use it, the
>> more it will cost. I reserve the right to disable it without prior
>> notice if it becomes too expensive.
>>
>> Full disclosure: I'm employed by Amazon. However, I'm paying for this
>> mirror out of my own pocket. I'm personally making it available to you
>> because I like the Guix project a lot, and I want to help it in any way
>> I can. I do not represent Amazon, nor does the mirror, which I am
>> operating personally just like any other CloudFront customer.
>>
>> I hope you find this mirror useful.
>>
>> Chris
>>
>>
>
> Hi,
>
> About a month has passed since you posted this.
>
> Could you tell me how much inbound and outbound traffic it has consumed
> (separate values if possible)? Given my current moving situation, I'm
> deciding whether to test and include in my setup a mirror now or a
> build server later.

Sure!  Since it only services reads, there is no in-traffic, only
out-traffic.  Since 2016-03-06, I've observed a modest total of 12.2 GB
of data transferred, of which 10.8 GB came from cache misses.  The cache
misses are likely due to the fact that I'm the only one currently using
it.
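
For reference, those two figures work out to a byte hit ratio of roughly 11%. The small calculation below uses only the numbers quoted above; the function name `byte_hit_ratio` is just for illustration.

```python
def byte_hit_ratio(total_gb, miss_gb):
    """Fraction of transferred bytes that were served from the cache."""
    return (total_gb - miss_gb) / total_gb


total_gb = 12.2  # total data transferred
miss_gb = 10.8   # portion that had to be fetched from the origin

ratio = byte_hit_ratio(total_gb, miss_gb)
print(f"served from cache: {total_gb - miss_gb:.1f} GB ({ratio:.1%})")
# prints "served from cache: 1.4 GB (11.5%)"
```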

Most of the time, daily traffic is quite low.  Sometimes, it spikes to
about 3-4 GB in a single day.  That probably corresponds to times when I
installed or upgraded GuixSD.

-- 
Chris



