From: Chris Marusich <cmmarusich@gmail.com>
To: guix-devel@gnu.org
Subject: CDN Test Results - Should We Continue Using a CDN?
Date: Sun, 10 Mar 2019 20:47:59 -0700
Message-ID: <87d0my1380.fsf@gmail.com>
Hi Guix!

Recently, the Guix project experimented with using a CDN to improve
substitute availability and performance. This email summarizes the
results of the test for your review. I also hope this email will start
a discussion about whether we should continue to use a CDN.

First, I'll summarize what we did. Starting on February 23rd, 2019, we
conducted a test using Amazon CloudFront. We configured ci.guix.info so
that all requests for substitutes via that domain name would go through
an Amazon CloudFront distribution that we set up for this purpose. The
test concluded on March 23rd, and the CDN is not currently being used.

Amazon CloudFront provides us with billing information and aggregate
usage statistics. Here's the information for the duration of the test:

  Duration:              28 days (February 23rd - March 23rd)
  Expense:               156.88 US Dollars
  Requests received:     3,732,919
  Average request size:  490 KB
  Bytes transferred:     1,744.5724 GB
  Bytes from misses:     684.3992 GB

  Hits:    2.14 M   (57.44%)
  Misses:  0.99 M   (26.41%)
  Errors:  602.91 K (16.15%)

  2xx:  2,983.24 K (79.92%)
  3xx:  146.753 K  (3.93%)
  4xx:  593.159 K  (15.89%)
  5xx:  9.471 K    (0.25%)
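
For anyone who wants to put these totals in perspective, here is a rough
Python sketch that derives a few ratios from the figures above. It only
does arithmetic on the reported numbers. Two things are assumptions on my
part: that the expense can be treated as one blended amount (the report
quoted here does not split it into transfer and request charges), and
that "Bytes from misses" means bytes that had to be fetched from the
origin. The function and key names are made up for illustration.

```python
# Figures copied from the CloudFront report quoted in this email.
EXPENSE_USD = 156.88
REQUESTS = 3_732_919
BYTES_TRANSFERRED_GB = 1_744.5724
BYTES_FROM_MISSES_GB = 684.3992


def summarize(expense_usd, requests, transferred_gb, from_misses_gb):
    """Derive blended cost and cache ratios from the reported totals."""
    return {
        # Blended cost: total expense divided by total bytes served.
        "usd_per_gb": expense_usd / transferred_gb,
        # Same expense expressed per million requests.
        "usd_per_million_requests": expense_usd / (requests / 1_000_000),
        # Assumption: bytes not attributed to misses were served from cache.
        "share_of_bytes_not_from_misses": 1 - from_misses_gb / transferred_gb,
    }


summary = summarize(
    EXPENSE_USD, REQUESTS, BYTES_TRANSFERRED_GB, BYTES_FROM_MISSES_GB
)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Under those assumptions this works out to roughly 0.09 USD per GB
transferred, with about 61% of the bytes not attributed to misses.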

Usage was fairly constant throughout the test. This means that the
daily statistics for requests received, bytes transferred, HTTP response
distribution, and hit rate neither grew nor fell significantly.

The average request size (490 KB) may seem small, since usually one
might expect substitutes to be large binaries. However, the size is
reasonable because narinfo files are small, and error responses (e.g.,
404) are probably being included in the average.
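
The 490 KB figure can be reproduced from the totals reported above. The
sketch below divides bytes transferred by requests received under two unit
conventions. Which convention CloudFront uses in this report is an
assumption here, not something the report states; the function name is
hypothetical.

```python
# Totals copied from the CloudFront report quoted in this email.
REQUESTS = 3_732_919
BYTES_TRANSFERRED_GB = 1_744.5724


def average_request_size(total_gb, requests, base):
    """Average size per request, in units of `base` bytes (1000 or 1024)."""
    total_bytes = total_gb * base**3  # "GB" interpreted as base**3 bytes
    return total_bytes / requests / base


avg_binary = average_request_size(BYTES_TRANSFERRED_GB, REQUESTS, 1024)
avg_decimal = average_request_size(BYTES_TRANSFERRED_GB, REQUESTS, 1000)
print(f"binary units:  {avg_binary:.1f} KiB per request")
print(f"decimal units: {avg_decimal:.1f} kB per request")
```

With binary units the result rounds to 490, which matches the reported
average; with decimal units it comes out near 467. So the average appears
to be taken over all requests received, which is consistent with small
responses pulling it down.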

The cache hit rate (57.44%) may also seem low, but it is reasonable
because it is aggregated over all of CloudFront's points of presence
worldwide. If one request in Seattle is a cache hit, and one request in
London is a cache miss, then that results in an overall cache hit rate
of 50%. Different points of presence don't generally share caches.
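
To make the aggregation effect concrete, here is a toy Python model. It is
not how CloudFront works internally: caches never evict, every first
request for an item is a miss, and the location names and the item name
are placeholders. It only shows why separate per-location caches yield a
lower aggregate hit rate than one shared cache would for the same traffic.

```python
def aggregate_hit_rate(per_pop):
    """Overall hit rate from {location: (hits, misses)} counts."""
    hits = sum(h for h, _ in per_pop.values())
    misses = sum(m for _, m in per_pop.values())
    total = hits + misses
    return hits / total if total else 0.0


def simulate(requests, shared_cache):
    """Replay (location, item) requests against toy caches.

    With shared_cache=False each location keeps its own cache; with
    shared_cache=True all locations use a single cache. No eviction.
    """
    caches = {}
    hits = 0
    for pop, item in requests:
        cache = caches.setdefault("shared" if shared_cache else pop, set())
        if item in cache:
            hits += 1
        else:
            cache.add(item)  # first request for this item here: a miss
    return hits / len(requests)


# The example from the text: one hit in Seattle, one miss in London.
print(aggregate_hit_rate({"Seattle": (1, 0), "London": (0, 1)}))

# The same item requested twice in each location.
requests = [("Seattle", "item-a"), ("Seattle", "item-a"),
            ("London", "item-a"), ("London", "item-a")]
print(simulate(requests, shared_cache=False))
print(simulate(requests, shared_cache=True))
```

In this toy run the separate caches give 50%, while a single shared cache
would give 75% for the same four requests.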

According to Amazon CloudFront, 11.75% of requests received came from
"Bot/Crawler", which CloudFront defines as "primarily requests from
search engines that are indexing your content".

In addition, CloudFront reports that traffic came from the following
locations (sorted by bytes transferred):

Location                         Request Count  Request %       Bytes
---------------------------------------------------------------------
United States                          933,448     25.01%   562.52 GB
Germany                                687,548     18.42%   174.53 GB
France                                 341,573      9.15%   167.36 GB
Canada                                 179,630      4.81%    96.31 GB
Russian Federation                     252,738      6.77%    94.28 GB
United Kingdom                         177,328      4.75%    81.55 GB
Spain                                   38,476      1.03%    70.49 GB
Netherlands                            118,902      3.19%    61.55 GB
Belgium                                 64,427      1.73%    54.16 GB
Australia                              101,173      2.71%    51.33 GB
Brazil                                  71,174      1.91%    31.01 GB
Czech Republic                          48,514      1.30%    29.60 GB
Sweden                                  45,446      1.22%    23.12 GB
Switzerland                             41,804      1.12%    21.85 GB
South Africa                            42,508      1.14%    17.94 GB
Poland                                  46,049      1.23%    17.12 GB
China                                   17,841      0.48%    16.45 GB
Israel                                  84,443      2.26%    14.78 GB
Norway                                  26,171      0.70%    14.49 GB
Japan                                   14,013      0.38%    13.73 GB
Reunion                                 19,144      0.51%    11.21 GB
India                                   19,751      0.53%    11.11 GB
Denmark                                 30,390      0.81%    10.24 GB
Belarus                                 25,943      0.69%     9.43 GB
Italy                                   25,359      0.68%     8.56 GB
Ecuador                                 13,321      0.36%     8.41 GB
Ukraine                                 68,807      1.84%     7.91 GB
Bolivia, Plurinational State of          8,932      0.24%     6.51 GB
Hungary                                 21,374      0.57%     5.99 GB
Romania                                 13,187      0.35%     5.65 GB
Mexico                                   7,299      0.20%     4.25 GB
Ireland                                  7,239      0.19%     4.05 GB
Greece                                   7,946      0.21%     3.98 GB
Iran, Islamic Republic of                7,730      0.21%     3.84 GB
Slovenia                                19,901      0.53%     3.62 GB
Argentina                                8,687      0.23%     3.57 GB
Finland                                  5,105      0.14%     3.51 GB
Turkey                                   7,287      0.20%     2.97 GB
Indonesia                                5,342      0.14%     2.21 GB
Chile                                    5,590      0.15%     2.08 GB
Bangladesh                               2,791      0.07%     1.51 GB
Estonia                                  4,315      0.12%     1.24 GB
Austria                                  5,267      0.14%     1.23 GB
Unknown                                  2,882      0.08%     1.10 GB
Lithuania                                3,319      0.09%     0.96 GB
New Zealand                              6,084      0.16%     0.85 GB
United Arab Emirates                     3,190      0.09%     0.78 GB
Colombia                                 7,136      0.19%   618.11 MB
Hong Kong                               27,178      0.73%   578.07 MB
Serbia                                   2,620      0.07%     4.80 MB

I'm not sure how accurate that geolocation information really is, but
it's exciting to think that people in so many different places are using
Guix!

Since the test has concluded, we are not currently using a CDN. Going
forward, we need to decide if we want to continue to use a CDN. Did you
notice an improvement in download speed or substitute availability
during the test period? Do you have metrics of your own that you can
share with us? If so, please share the information so we can understand
whether it's worth continuing to pay for a CDN.

One of the reasons why we wanted to use a CDN in the first place was to
free up resources so that the community could spend more time working on
better solutions. For example, some people have expressed an interest
in a distributed or peer-to-peer substitute mechanism using IPFS or
GNUnet. In fact, Ludo paved the way for this by submitting patches to
distribute substitutes over IPFS:

  https://issues.guix.info/issue/33899

However, it seems his work hasn't succeeded in exciting people enough to
carry the momentum forward. We need more people who are interested in
this and can work on it! Otherwise, it may never become a reality. So
if you care about distributed or peer-to-peer substitutes, please help!

I hope you found this information interesting. Thank you for your time!

--
Chris