Hi Guix!

Recently, the Guix project experimented with using a CDN to improve
substitute availability and performance. This email summarizes the
results of the test for your review. I also hope this email will start
a discussion about whether or not we should continue to use a CDN.

First, I'll summarize what we did. Starting on February 23rd, 2019 we
conducted a test using Amazon CloudFront. We configured ci.guix.info so
that all requests for substitutes via that domain name would go through
an Amazon CloudFront distribution that we set up for this purpose. The
test concluded on March 23rd, and the CDN is not currently being used.

Amazon CloudFront provides us with billing information and aggregate
usage statistics. Here's the information for the duration of the test:

  Duration: 28 days (February 23rd - March 23rd)
  Expense: 156.88 US Dollars
  Requests received: 3,732,919
  Average request size: 490 KB
  Bytes transferred: 1,744.5724 GB
  Bytes from misses: 684.3992 GB
  Hits: 2.14 M (57.44%)
  Misses: 0.99 M (26.41%)
  Errors: 602.91 K (16.15%)
  2xx: 2,983.24 K (79.92%)
  3xx: 146.753 K (3.93%)
  4xx: 593.159 K (15.89%)
  5xx: 9.471 K (0.25%)

Usage was fairly constant throughout the test. This means that the
daily statistics for requests received, bytes transferred, HTTP
response distribution, and hit rate neither grew nor fell significantly.

The average request size (490 KB) may seem small, since usually one
might expect substitutes to be large binaries. However, the size is
reasonable because narinfo files are small, and error responses (e.g.,
404) are probably being included in the average.

The cache hit rate (57.44%) may also seem low, but it's also reasonable
because it's aggregated over all of CloudFront's points of presence
worldwide. If one request in Seattle is a cache hit, and one request in
London is a cache miss, then that results in an overall cache hit rate
of 50%. Different points of presence don't generally share caches.

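For anyone who wants to check the arithmetic behind the last two
paragraphs, here is a minimal Python sketch. The function names and the
per-location counts are made up for illustration; CloudFront does not
give us per-location hit counts in this form. The size calculation
assumes the reported "GB" and "KB" are binary units (2**30 and 2**10
bytes), which is my guess rather than something the report states.

```python
def aggregate_hit_rate(per_pop):
    """Return the overall hit rate given {pop_name: (hits, misses)}."""
    hits = sum(h for h, _ in per_pop.values())
    total = sum(h + m for h, m in per_pop.values())
    return hits / total if total else 0.0


def average_request_kib(gib_transferred, requests):
    """Average response size in KiB, ASSUMING the total is in GiB (2**30 bytes)."""
    return gib_transferred * 2**30 / requests / 1024


# Hypothetical counts mirroring the Seattle/London example:
# one hit at one location, one miss at another.
example = {"Seattle": (1, 0), "London": (0, 1)}
print(f"aggregate hit rate: {aggregate_hit_rate(example):.0%}")

# Bytes transferred and requests received, taken from the report.
print(f"average request size: {average_request_kib(1744.5724, 3_732_919):.0f} KiB")
```

Under the binary-unit assumption the second figure comes out at about
490, which matches the report. With decimal units it would be closer to
467 KB, so the units are worth keeping in mind when comparing against
other measurements.
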
According to Amazon CloudFront, 11.75% of requests received came from
"Bot/Crawler", which CloudFront defines as "primarily requests from
search engines that are indexing your content". In addition, CloudFront
reports that traffic came from the following locations (sorted by bytes
transferred):

Location                          Request Count  Request %       Bytes
----------------------------------------------------------------------
United States                           933,448     25.01%   562.52 GB
Germany                                 687,548     18.42%   174.53 GB
France                                  341,573      9.15%   167.36 GB
Canada                                  179,630      4.81%    96.31 GB
Russian Federation                      252,738      6.77%    94.28 GB
United Kingdom                          177,328      4.75%    81.55 GB
Spain                                    38,476      1.03%    70.49 GB
Netherlands                             118,902      3.19%    61.55 GB
Belgium                                  64,427      1.73%    54.16 GB
Australia                               101,173      2.71%    51.33 GB
Brazil                                   71,174      1.91%    31.01 GB
Czech Republic                           48,514      1.30%    29.60 GB
Sweden                                   45,446      1.22%    23.12 GB
Switzerland                              41,804      1.12%    21.85 GB
South Africa                             42,508      1.14%    17.94 GB
Poland                                   46,049      1.23%    17.12 GB
China                                    17,841      0.48%    16.45 GB
Israel                                   84,443      2.26%    14.78 GB
Norway                                   26,171      0.70%    14.49 GB
Japan                                    14,013      0.38%    13.73 GB
Reunion                                  19,144      0.51%    11.21 GB
India                                    19,751      0.53%    11.11 GB
Denmark                                  30,390      0.81%    10.24 GB
Belarus                                  25,943      0.69%     9.43 GB
Italy                                    25,359      0.68%     8.56 GB
Ecuador                                  13,321      0.36%     8.41 GB
Ukraine                                  68,807      1.84%     7.91 GB
Bolivia, Plurinational State of           8,932      0.24%     6.51 GB
Hungary                                  21,374      0.57%     5.99 GB
Romania                                  13,187      0.35%     5.65 GB
Mexico                                    7,299      0.20%     4.25 GB
Ireland                                   7,239      0.19%     4.05 GB
Greece                                    7,946      0.21%     3.98 GB
Iran, Islamic Republic of                 7,730      0.21%     3.84 GB
Slovenia                                 19,901      0.53%     3.62 GB
Argentina                                 8,687      0.23%     3.57 GB
Finland                                   5,105      0.14%     3.51 GB
Turkey                                    7,287      0.20%     2.97 GB
Indonesia                                 5,342      0.14%     2.21 GB
Chile                                     5,590      0.15%     2.08 GB
Bangladesh                                2,791      0.07%     1.51 GB
Estonia                                   4,315      0.12%     1.24 GB
Austria                                   5,267      0.14%     1.23 GB
Unknown                                   2,882      0.08%     1.10 GB
Lithuania                                 3,319      0.09%     0.96 GB
New Zealand                               6,084      0.16%     0.85 GB
United Arab Emirates                      3,190      0.09%     0.78 GB
Colombia                                  7,136      0.19%   618.11 MB
Hong Kong                                27,178      0.73%   578.07 MB
Serbia                                    2,620      0.07%     4.80 MB

I'm not sure how accurate that geolocation information really is, but
it's exciting to think that people in so many different places are
using Guix!

Since the test has concluded, we are not currently using a CDN. Going
forward, we need to decide if we want to continue to use a CDN. Did you
notice an improvement in download speed or substitute availability
during the test period? Do you have metrics of your own that you can
share with us? If so, please share the information so we can understand
whether it's worth continuing to pay for a CDN.

One of the reasons why we wanted to use a CDN in the first place was to
free up resources so that the community could spend more time working
on better solutions. For example, some people have expressed an
interest in a distributed or peer-to-peer substitute mechanism using
IPFS or GNUnet. In fact, Ludo paved the way for this by submitting
patches to distribute substitutes over IPFS:

https://issues.guix.info/issue/33899

However, it seems his work hasn't succeeded in exciting people enough
to carry the momentum forward. We need more people who are interested
in this and can work on it! Otherwise, it may never become a reality.
So if you care about distributed or peer-to-peer substitutes, please
help!

I hope you found this information interesting. Thank you for your time!

-- 
Chris