From: Ludovic Courtès
To: zimoun
Subject: bug#42162: Recovering source tarballs
Date: Tue, 21 Jul 2020 23:22:00 +0200
Message-ID: <87k0ywlg1z.fsf@gnu.org>
In-Reply-To: (zimoun's message of "Mon, 20 Jul 2020 17:52:09 +0200")
References: <87mu4iv0gc.fsf@inria.fr> <86h7uq8fmk.fsf@gmail.com> <87d05etero.fsf@gnu.org> <87r1tit5j6.fsf_-_@gnu.org> <87365mzil1.fsf@gnu.org>
List-Id: Bug reports for GNU Guix
Cc:
42162@debbugs.gnu.org, Maurice Brémond

Hi!

zimoun wrote:

> On Mon, 20 Jul 2020 at 10:39, Ludovic Courtès wrote:
>> zimoun wrote:
>> > On Sat, 11 Jul 2020 at 17:50, Ludovic Courtès wrote:
>
>> There are many many comments in your message, so I took the liberty to
>> reply only to the essence of it. :-)
>
> Many comments because many open topics. ;-)

Understood, and they’re very valuable but (1) I choose not to just do
email :-), and (2) I like to separate issues in reasonable chunks rather
than long threads addressing all the problems we’ll have to deal with.

I think it really helps keep things tractable!

>> Lookup issue. :-) The hash in a CID is not just a raw blob hash.
>> Files are typically chunked beforehand, assembled as a Merkle tree, and
>> the CID is roughly the hash of the tree root. So it would seem we can’t
>> use IPFS as-is for tarballs.
>
> Using the Git-repo map/table, then it becomes an option, right?
> Well, SWH would be a backend and IPFS could be another one. Or any
> "cloudy" storage system that could appear in the future, right?

Sure, why not.

>> >> • If we no longer deal with tarballs but upstreams keep signing
>> >> tarballs (not raw directory hashes), how can we authenticate our
>> >> code after the fact?
>> >
>> > Does Guix automatically authenticate code using signed tarballs?
>>
>> Not automatically; packagers are supposed to authenticate code when they
>> add a package (‘guix refresh -u’ does that automatically).
>
> So I miss the point of having this authentication information in the
> future where upstream has disappeared.

What I meant above is that often, what we have is things like detached
signatures of raw tarballs, or documents referring to a tarball hash:

  https://sympa.inria.fr/sympa/arc/swh-devel/2016-07/msg00009.html

>> But today, we store tarball hashes, not directory hashes.
>
> We store what "guix hash" returns. ;-)
> So it is easy to migrate from tarball hashes to whatever else. :-)

True, but that other thing, as it stands, would be a nar hash (like for
‘git-fetch’), not a Git-tree hash (what SWH uses).

> I mean, it is "(sha256 (base32" and it is easy to have also
> "(sha256-tree (base32" or something like that.

Right, but that first and foremost requires daemon support.

It’s doable, but migration would have to take a long time, since this is
touching core parts of the “protocol”.

> I have not yet done clear back-of-the-envelope computations. Roughly,
> there are ~23 commits on average per day updating packages, so say 70%
> of them are url-fetch, it is ~16 new tarballs per day, on average.
> How will the model using a Git repo scale? Because, naively the
> output of "disassemble-archive" in full text (pretty-print format) for
> the hello-2.10.tar is 120KB and so 16*365*120K = ~700MB per year
> without considering all the Git internals. Obviously, it depends on
> the number of files and I do not know if hello is a representative
> example.

Interesting, thanks for making that calculation! We could make the
format more compact if needed.

Thanks,
Ludo’.
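
For reference, the storage estimate quoted above can be reproduced with a
few lines of Python. This is only a sketch of that arithmetic: the inputs
(about 23 package-update commits per day, 70% of them url-fetch, 120 KB of
"disassemble-archive" output per tarball) are the rough figures from the
thread, the function and variable names are made up for illustration, and
hello-2.10.tar may well not be a representative tarball.

```python
# Back-of-the-envelope estimate of yearly metadata growth.
# All inputs are the rough figures quoted in the thread; names are illustrative.

COMMITS_PER_DAY = 23     # average package-update commits per day (from the thread)
URL_FETCH_SHARE = 0.70   # assumed fraction of those commits that use url-fetch
KB_PER_TARBALL = 120     # pretty-printed metadata size measured for hello-2.10.tar


def yearly_growth_mb(commits_per_day, url_fetch_share, kb_per_tarball, days=365):
    """Return (tarballs per day, megabytes per year), taking 1 MB = 1000 KB."""
    tarballs_per_day = round(commits_per_day * url_fetch_share)
    kb_per_year = tarballs_per_day * days * kb_per_tarball
    return tarballs_per_day, kb_per_year / 1000


tarballs, megabytes = yearly_growth_mb(COMMITS_PER_DAY, URL_FETCH_SHARE, KB_PER_TARBALL)
print(f"~{tarballs} tarballs/day, ~{megabytes:.0f} MB/year before Git overhead")
```

The result scales linearly with the per-tarball size, so a more compact
format would shrink the yearly figure by the same factor; the estimate is
mostly sensitive to how representative the 120 KB sample is.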