From: Ludovic Courtès
To: 68741@debbugs.gnu.org
Cc: Timothy Sample, Antoine R. Dumont (@ardumont)
Subject: [bug#68741] [PATCH 0/6] Content-addressed downloads from Software Heritage
Date: Fri, 26 Jan 2024 18:25:37 +0100
Message-ID: <87y1ccm2ge.fsf@gnu.org>

Oops, I forgot to Cc: the fine people for the cover letter; fixed!
See .

Ludovic Courtès wrote:

> Hello Guix!
>
> For those who’ve been following along, you might remember that the
> main impedance mismatch between SWH and Guix is that SWH uses Git
> tree SHA1 hashes to identify directories whereas Guix uses nar SHA256
> hashes (and possibly other hash functions in the future):
>
>   https://guix.gnu.org/en/blog/2019/connecting-reproducible-deployment-to-a-long-term-source-code-archive/
>
> Because of this, the SWH fallback path for ‘git-download’ had two
> options:
>
>   1. If ‘git-reference’ specifies a full SHA1 commit ID, it would
>      look it up on SWH and fetch it.
>
>   2. If ‘git-reference’ specifies a tag, which is perhaps the
>      majority of cases, Guix would ask SWH for the commit that once
>      corresponded to that tag at that URL, and then fetch it.
>
> Case #1 is ideal: it’s content-addressed.  Case #2 is brittle: we’re
> hoping that the tag hasn’t been modified and that the URL hasn’t been
> reused for something else; if that’s not the case, SWH might return
> the “wrong” commit and we end up fetching something unrelated.
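To make the mismatch described above concrete, here is a minimal Python sketch. It computes a Git object identifier (SHA-1 over a "<kind> <size>\0" header plus the payload) next to a plain SHA-256 of the same bytes. The function names are made up for illustration, and the SHA-256 here is only a stand-in for "a different hashing scheme": a real nar hash is the SHA-256 of the nar serialization of a file tree, which this sketch does not implement.

```python
import hashlib


def git_object_id(kind: str, payload: bytes) -> str:
    """Return the SHA-1 identifier Git assigns to an object.

    Git hashes the header "<kind> <size>\\0" followed by the payload.
    """
    header = f"{kind} {len(payload)}\0".encode("ascii")
    return hashlib.sha1(header + payload).hexdigest()


def plain_sha256(payload: bytes) -> str:
    """SHA-256 of the raw bytes (a stand-in, NOT a nar hash)."""
    return hashlib.sha256(payload).hexdigest()


if __name__ == "__main__":
    content = b"hello\n"
    # Same content, two unrelated identifiers: knowing one does not
    # let you derive the other without having the content at hand.
    print("git blob id:", git_object_id("blob", content))
    print("sha256     :", plain_sha256(content))
```

Since neither identifier can be computed from the other, an archive indexed by one scheme cannot answer queries phrased in the other unless it stores an explicit mapping between the two.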
>
> The good news is that our friends at SWH have just deployed a new
> version of their code that lets us look up directories by some
> “external identifier” (“ExtID”), among which there’s ‘nar-sha256’:
>
>   https://archive.softwareheritage.org/api/1/extid/doc/
>
> And that, my friends, makes a huge difference: the impedance mismatch
> is gone, we can now use content-addressing to fetch our stuff from SWH!!
> And that works not just for Git, but also for Mercurial, SVN, CVS, etc.
>
> Well, there’s a caveat: currently the ‘nar-sha256’ is added only on
> new visits and it’s apparently not being added yet for Mercurial for
> unclear reasons.  So right now, we can get guile-sqlite3 0.1.3 (Git) by
> nar-sha256, but we cannot get guile-wisp (hg) nor in fact most things.
> That’ll improve over time though, and SWH comrades are open to adding
> those ExtIDs retroactively.
>
> The patches that follow do several things:
>
>   1. Follow redirects in the Vault: (guix swh) previously did not
>      do that (oops!) but the newly-deployed Vault now responds with
>      302 redirects so we have to handle that.
>
>   2. Add bindings for the ExtID HTTP interface.
>
>   3. Add ‘swh-download-directory-by-nar-hash’, which does what it
>      says.
>
>   4. Use that as the preferred fallback method for ‘git-fetch’.
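As a rough illustration of the ExtID lookup mentioned above, the Python sketch below builds a lookup URL and splits the SWHID that such a lookup returns. It assumes the endpoint layout is /api/1/extid/<type>/hex:<digest>/; that layout and the helper names `extid_url` and `parse_swhid` are assumptions for this example, so check the ExtID documentation linked above before relying on them. No network request is made.

```python
import re

# Root of the public archive's HTTP API, taken from the URL quoted above.
SWH_API = "https://archive.softwareheritage.org/api/1"


def extid_url(extid_type: str, hex_digest: str, base: str = SWH_API) -> str:
    """Build an ExtID lookup URL (assumed layout: <type>/hex:<digest>/)."""
    digest = hex_digest.lower()
    if len(digest) % 2 or not re.fullmatch(r"[0-9a-f]+", digest):
        raise ValueError("expected an even-length hexadecimal digest")
    return f"{base}/extid/{extid_type}/hex:{digest}/"


def parse_swhid(swhid: str) -> dict:
    """Split a core SWHID such as 'swh:1:dir:<40 hex digits>'."""
    match = re.fullmatch(r"swh:(\d+):(cnt|dir|rev|rel|snp):([0-9a-f]{40})", swhid)
    if match is None:
        raise ValueError(f"not a core SWHID: {swhid!r}")
    version, kind, digest = match.groups()
    return {"version": int(version), "type": kind, "hash": digest}


if __name__ == "__main__":
    # Hash and SWHID copied from the guile-sqlite3 example in this thread.
    nar_hash = "0b56ba94c2b83b8f74e3772887c1109135802eb3e8962b628377987fe97e1e63"
    print(extid_url("nar-sha256", nar_hash))
    print(parse_swhid("swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153"))
```

The directory SWHID obtained this way is what would then be requested from the Vault, following any 302 redirect as item 1 notes.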
>
> Here’s a REPLshot:
>
>   scheme@(guile-user)> (lookup-external-id "nar-sha256" (content-hash-value (origin-hash (package-source (@ (gnu packages guile) guile-sqlite3)))) )
>   $43 = #< value: "0b56ba94c2b83b8f74e3772887c1109135802eb3e8962b628377987fe97e1e63" type: "nar-sha256" version: 0 target: "swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153" target-url: "https://archive.softwareheritage.org/swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153">
>   scheme@(guile-user)> (swh-download-directory-by-nar-hash (content-hash-value(origin-hash (package-source (@ (gnu packages guile) guile-sqlite3)))) 'sha256 "/tmp/gsql")
>   SWH: found directory with nar-sha256 hash 0b56ba94c2b83b8f74e3772887c1109135802eb3e8962b628377987fe97e1e63 at 'swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153'
>   swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/
>   swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/.gitignore
>   swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/AUTHORS
>   swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/COPYING
>   swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/COPYING.LESSER
>   swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/ChangeLog
>   swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/Makefile.am
>   swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/NEWS
>   swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/README
>   swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/build-aux/
>   swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/build-aux/guile.am
>   swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/build-aux/test-driver.scm
>   swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/configure.ac
>   swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/env.in
>   swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/sqlite3.scm.in
>   swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/tests/
>   swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/tests/basic.scm
>   $46 = #t
>
> Huge thanks to everyone over at #swh-devel for helping me out
> over the past few days!
>
> Next tasks: implement download fallback for ‘hg-fetch’, change
> ‘guix lint -c archival’ to make ‘save-origin’ requests not just
> for Git repos, assess the situation with SVN and sub-directories
> to see what can be done.
>
> Thoughts?
>
> Ludo’.
>
> PS: Apologies for the wall of text!
>
> Ludovic Courtès (6):
>   swh: ‘vault-fetch’ follows redirects.
>   swh: Add bindings for the “ExtID” API.
>   swh: Add ‘swh-download-directory-by-nar-hash’.
>   lint: archival: Check with ‘lookup-directory-by-nar-hash’.
>   git-download: Download from SWH by nar hash when possible.
>   swh: Fix docstring of ‘lookup-directory’.
>
>  guix/build/git.scm                |  20 ++++--
>  guix/git-download.scm             |   4 +-
>  guix/lint.scm                     |  28 +++++---
>  guix/scripts/perform-download.scm |   4 +-
>  guix/swh.scm                      | 113 ++++++++++++++++++++++++++----
>  tests/lint.scm                    |  33 +++++++--
>  tests/swh.scm                     |  21 +++++-
>  7 files changed, 189 insertions(+), 34 deletions(-)
>
>
> base-commit: 8bee6bb9aaaf35c36fe325675d1eb2daebd69c25
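The fallback order the cover letter describes for ‘git-fetch’ (content-addressed lookup by nar hash first, then the older commit or tag lookups) could be sketched as below. This is a hypothetical Python illustration, not the code in guix/build/git.scm: the strategy names and the stub fetchers are invented, and each fetcher simply returns a path on success or None when the archive has no match.

```python
from typing import Callable, Iterable, Optional, Tuple

# A fetcher returns the path of the downloaded directory, or None on a miss.
Fetcher = Callable[[], Optional[str]]


def first_successful(
    strategies: Iterable[Tuple[str, Fetcher]]
) -> Optional[Tuple[str, str]]:
    """Try each (name, fetcher) pair in order; return the first hit."""
    for name, fetch in strategies:
        try:
            result = fetch()
        except LookupError:
            # Treat "not found" errors like a miss and keep going.
            result = None
        if result is not None:
            return name, result
    return None


if __name__ == "__main__":
    # Stub fetchers standing in for real downloads (hypothetical).
    def by_nar_hash() -> Optional[str]:
        return None  # e.g. no 'nar-sha256' ExtID recorded for this visit yet

    def by_commit_or_tag() -> Optional[str]:
        return "/tmp/checkout"

    outcome = first_successful(
        [("nar-hash", by_nar_hash), ("commit-or-tag", by_commit_or_tag)]
    )
    print(outcome)
```

Placing the nar-hash strategy first means the brittle tag-based lookup is only reached when no content-addressed match exists.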