From: Josh Marshall
Date: Tue, 17 Oct 2023 21:44:56 -0400
Subject: Re: Architecture to reduce download time when pulling multiple packages – historic success with magnet URLs, BTIHs, & Aria2c!
To: "James R. Haigh (+ML.GNU.Guix subaddress)"
Cc: help-guix@gnu.org

How long is traditional before I can bump a thread?

On Sun, Oct 15, 2023 at 2:21 PM Josh Marshall wrote:
>
> So it sounds like my first steps are to re-implement the downloads
> using aria2c. This would affect the minimum base package, no? Can I
> get some buy-in from maintainers that such changes are acceptable?
>
> On Fri, Oct 13, 2023 at 2:06 PM James R. Haigh (+ML.GNU.Guix
> subaddress) wrote:
> >
> > Hi Josh,
> >
> > At Z-0400=2023-10-13Fri12:36:01, Josh Marshall sent:
> > > This is to parallelize connections which should never hurt downloading but can help. Mirroring would be parallelizing for providing packages, what I want to implement is to parallelize obtaining packages. Server side vs client side.
> >
> > Please, if you are going to do something like this, use a torrent architecture like BitTorrent or GNUnet – I suggest Aria2c as a very good CLI download backend that can be daemonised and sent instructions over a socket to add, pause, remove downloads, etc., and it supports magnet URLs including the existing nontorrent servers (via ‘as’ parameters, iirc.).
> >
> > I actually implemented this in a local copy of APT Daemon many years ago (circa 2011), but the change was not accepted upstream on Launchpad (because I was not on the bleeding edge; I was too slow to keep up with upstream development). My fork got forgotten about, because to get the full benefit the server would have had to add a BitTorrent Info Hash (BTIH) to the metadata of each package, alongside the MD5, SHA-256, etc. that it already provided (not a big ask, really). That said, even without the full benefit of having the metadata, it provided immediate benefit, and I used it for many years, not upgrading the Ubuntu 11.04 Natty Narwhal that I was using back then until I really had to.
> >
> > The immediate benefit that it provided was exactly as you described: it allowed parallelisation of nontorrent downloads, be it from the same server or from multiple mirrors. Iirc., I achieved this by simply passing the download list to Aria2c in daemon mode; I think I also converted all the HTTP URLs to ‘as’ parameters in magnet links, so that multiple mirrors could be passed using multiple ‘as’ parameters in each magnet link. Then I simply relied on Aria2c being amazing at parallelising everything that I had given it! I then also implemented progress updates such that APT Daemon could reflect where Aria2c was up to.
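
The mirror-to-magnet conversion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_magnet`, the mirror URLs, and the file name are hypothetical, and whether a given aria2c version actually honours `as` (acceptable source) parameters is James's recollection, not something verified here. The `xt=urn:btih:` parameter is only added when an info hash is known.

```python
from urllib.parse import quote


def build_magnet(mirrors, btih=None, name=None):
    """Build a magnet URI that lists each mirror as an 'as' parameter.

    Hypothetical helper for illustration; not taken from APT Daemon or Guix.
    """
    params = []
    if btih is not None:
        # 'xt' (exact topic) carries the BitTorrent Info Hash, if the
        # repository metadata provides one.
        params.append("xt=urn:btih:" + btih)
    if name is not None:
        # 'dn' is the display name; percent-encode it fully.
        params.append("dn=" + quote(name, safe=""))
    for url in mirrors:
        # One 'as' parameter per mirror URL, percent-encoded.
        params.append("as=" + quote(url, safe=""))
    return "magnet:?" + "&".join(params)


# Placeholder mirrors (example.org is reserved for documentation).
mirrors = [
    "https://mirror-a.example.org/pkg/hello-2.12.tar.gz",
    "https://mirror-b.example.org/pkg/hello-2.12.tar.gz",
]
magnet = build_magnet(mirrors, name="hello-2.12.tar.gz")
print(magnet)
```

Because the hash is optional, the same helper covers both cases in the message: plain parallel HTTP(S) downloads today, and swarming later if BTIHs appear in the metadata.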
> >
> > The way I implemented this using Aria2c and magnet URLs meant that if additional hashes were known, they could be used as well, and so if the server metadata made the simple addition of BTIHs, it would allow swarming to occur, which in turn would massively reduce load on the central servers, and allow anyone who wants to be a mirror to be a mirror simply by seeding indefinitely. A default share ratio of 1.0 means that no user is a burden on the network, unless they deliberately change that. Users can donate to the running costs of the project simply by increasing their share ratio, which adds another means of contribution that they may find easier than the others.
> >
> > Anyone keen to keep old packages online can simply seed them indefinitely, so this is also really great for archival purposes. Even if the central project loses interest in the old packages and deletes them, anyone else can keep them up. The hashes ensure that they have not been tampered with.
> >
> > There is also a really cool benefit that occurs, or can occur, on a LAN. An entire network of computers can all swarm locally with each other, so that each package only needs to be downloaded through the metered last-mile bottleneck from the WAN precisely once – provided that local broadcasting is supported. I think this requires Avahi, and I seem to remember that Aria2c supports this, but I am not certain. I don't remember ever getting this bit working, but I also did not try hard, because it would have required the metadata that I didn't have until after download, so even if I had got it working it would not have been directly useful unless the APT repositories that I was using included the BTIHs.
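
To make concrete why the BTIH has to be published in the metadata rather than guessed by clients, here is a minimal sketch of how a BitTorrent (version 1) info hash is derived for a single file: it is the SHA-1 of the bencoded `info` dictionary, which contains the file name, length, piece length, and the concatenated SHA-1 digests of the pieces. The helper names (`bencode`, `single_file_btih`) and the sample payload are illustrative assumptions, not part of any existing tool.

```python
import hashlib


def bencode(value):
    """Minimal bencoder covering ints, byte/text strings, lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, str):
        return bencode(value.encode("utf-8"))
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must be emitted in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v)
             for k, v in value.items()),
            key=lambda kv: kv[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


def single_file_btih(data, name, piece_length=262144):
    """Return the hex info hash of a single-file torrent for `data`."""
    # 'pieces' is the concatenation of the SHA-1 digest of every piece.
    pieces = b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )
    info = {
        "length": len(data),
        "name": name,
        "piece length": piece_length,
        "pieces": pieces,
    }
    return hashlib.sha1(bencode(info)).hexdigest()


# Placeholder payload and file name, for illustration only.
payload = b"example package contents"
print(single_file_btih(payload, "example-1.0.tar.gz"))
```

Note that the result depends on the chosen name and piece length as well as on the file contents, so two parties hashing the same package independently can arrive at different BTIHs. That is one reason an authoritative value in the repository metadata matters.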
> >
> > So yeah, loads of great benefits to this architecture, and I highly recommend it: convert all existing URLs to magnet links (can be done client-side as I did; or server-side); optionally add any additional mirrors as additional ‘as’ parameters (again client-side or server-side); add ‘btih’ parameters to the magnet links (the BTIH must be included in the server metadata to get the full benefit of the swarming, but conversion to magnet link format can be done client-side or server-side); then simply pass all this to a really good parallelising backend such as Aria2c; then update any progress data and relay pause, resume, cancel, etc. to the backend.
> >
> > One final note, as I am sure that there are a lot of GNUnet fans on this list, is that I would try Aria2c first to see how well it can work, and then try GNUnet or whatever else once you have a standard to benchmark against. Both are Free Software, so no concern there. Aria2c is an all-round download manager CLI that works with or without swarming, i.e. it is just as good at HTTPS as it is at BitTorrent, and can do both at the same time. GNUnet has the advantage of working from SHA-256 iirc., which is generally already included in the metadata of the repositories of various distributions, but I think it lacks a lot of other features, as well as the stability and ecosystem of alternative backends, compared to the BitTorrent network.
> >
> > Of course, there is no harm in including other hashes along with BTIH, to allow people to experiment with alternative backends, while always ensuring that what works works well. Another hash that may be useful to include is the Tiger Tree Hash, which is structurally very similar to BTIH, but stronger, iirc.
> >
> > The first thing that the Guix project can do to signal interest in this architecture is to simply include the BTIH of each package in the repository metadata. Whether or not it is in magnet URL form does not matter, because the client can later convert it as needed. The important thing is an authoritative statement in the metadata that this version of this package has this BTIH. Once that metadata is available, the game is on to implement swarming support, be it with Aria2c as a backend (as I recommend at least starting with) or otherwise.
> >
> > I know that this architecture works well from first-hand experience with APT Daemon, written in Python. The only failure I had with it was lack of upstream support. So I consider it important to first obtain upstream approval before really investing more time into this. I seem to remember suggesting this to the Nix project many years ago and didn't get anywhere, and now I don't have the energy to try to improve upstream projects if they reject my ideas, so I'll be interested to see whether you have any success with your attempt to do the same.
> >
> > Good luck! ;-)
> >
> > Kind regards,
> > James.
> > --
> > Wealth doesn't bring happiness, but poverty brings sadness.
> > Sent from Debian with Claws Mail, using email subaddressing as an alternative to error-prone heuristical spam filtering.
> > Postal: James R. Haigh, Middle Farm, Vennington, nr. Westbury, nr. Shrewsbury, Salop, SY5 9RG, Britain
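
For the "pass it to a daemonised backend, then relay pause, resume, cancel and progress" part of the quoted workflow, one possible route is aria2's JSON-RPC interface, which is available when aria2c is started with `--enable-rpc`. The sketch below only builds request bodies and sends nothing; the helper names, the `request_id` value, the mirror URLs, and the GID are hypothetical. It is not a reconstruction of the APT Daemon patch, and a Guix integration might well look different.

```python
import json


def rpc_request(method, *params, secret=None, request_id="guix-dl"):
    """Serialise one aria2 JSON-RPC call as a JSON string."""
    args = list(params)
    if secret is not None:
        # When an RPC secret is configured, it is passed as the first parameter.
        args.insert(0, "token:" + secret)
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "aria2." + method,
        "params": args,
    })


def add_mirrors(mirrors, sha256=None, secret=None):
    """Queue one file that can be fetched from any of the given mirror URLs."""
    options = {}
    if sha256 is not None:
        # Ask aria2 to verify the finished file against a known digest.
        options["checksum"] = "sha-256=" + sha256
    return rpc_request("addUri", list(mirrors), options, secret=secret)


# Placeholder mirrors and GID, for illustration only.
body = add_mirrors(
    ["https://mirror-a.example.org/f.tar.gz",
     "https://mirror-b.example.org/f.tar.gz"],
    sha256="ab" * 32,
)
pause_body = rpc_request("pause", "2089b05ecca3d829")
status_body = rpc_request(
    "tellStatus", "2089b05ecca3d829",
    ["status", "completedLength", "totalLength"],
)
print(body)
```

Each body would be sent as an HTTP POST to the daemon's RPC endpoint (by default `http://localhost:6800/jsonrpc`). `aria2.unpause` and `aria2.remove` take a GID in the same way as `aria2.pause`, and the `completedLength` and `totalLength` fields returned by `aria2.tellStatus` are what a front end could use to display progress.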