From: Simon Tournier
To: Ludovic Courtès
Cc: Guix Devel
Subject: Re: intrinsic vs extrinsic identifier: toward more robustness?
In-Reply-To: <87a60cbnf7.fsf@gnu.org>
References: <87jzzxd7z8.fsf@gmail.com> <87a60cbnf7.fsf@gnu.org>
Date: Thu, 06 Apr 2023 14:15:56 +0200
Message-ID: <87lej5mcjn.fsf@gmail.com>

Hi,

On Thu, 16 Mar 2023 at 18:45, Ludovic Courtès wrote:

>> For sure, we have to fix the holes and bugs. :-)  However, I am asking
>> what we could add for having more robustness on the long term.

> Sources (fixed-output derivations) are already content-addressed, by
> definition (I prefer “content addressing” over “intrinsic
> identification” because that’s a more widely recognized term).

This is the case when you consider that the result of the fixed-output
derivation is already inside the Guix “ecosystem”…

> In a way, like Maxime was saying, the URL/URI is just a hint; what
> matters is the content hash that appears in the origin.

…but otherwise the URL/URI is not just a “hint”.
Or could you explain what you mean by a “hint”?  Maybe I misunderstand
something; from my understanding, the URL/URI is a “hint” only when
substitutes are available; otherwise Guix relies on the plain URL/URI
for fetching data.

--8<---------------cut here---------------start------------->8---
$ guix build hello -S --no-substitutes --check
The following derivation will be built:
  /gnu/store/3hxraqxb0zklq065zjrxcs199ynmvicy-hello-2.12.1.tar.gz.drv
building /gnu/store/3hxraqxb0zklq065zjrxcs199ynmvicy-hello-2.12.1.tar.gz.drv...

Starting download of /gnu/store/1s6xba6nafkxb242kafkg3x10jkdn2n9-hello-2.12.1.tar.gz
From https://ftpmirror.gnu.org/gnu/hello/hello-2.12.1.tar.gz...
following redirection to `https://mirror.cyberbits.eu/gnu/hello/hello-2.12.1.tar.gz'...
downloading from https://ftpmirror.gnu.org/gnu/hello/hello-2.12.1.tar.gz ...
warning: rewriting hashes in `/gnu/store/3dq55rw99wdc4g4wblz7xikc8a2jy7a3-hello-2.12.1.tar.gz'; cross fingers
--8<---------------cut here---------------end--------------->8---

In other words, when speaking about robustness (in the broad sense), I
think we cannot assume that the “content addressing” provided by the
derivation,

--8<---------------cut here---------------start------------->8---
Derive
([("out","/gnu/store/3dq55rw99wdc4g4wblz7xikc8a2jy7a3-hello-2.12.1.tar.gz","sha256","8d99142afd92576f30b0cd7cb42a8dc6809998bc5d607d88761f512e26c7db20")]
 ,[]
 ,["/gnu/store/0mxnx8l4fgigvd7gakwdk6hc6im4wnai-disarchive-mirrors","/gnu/store/ckxc05iflc8jagdxwh4z1cxc23mb6i6q-mirrors","/gnu/store/wg1yp2vx8gb7qmcgyibqnwblahpp4bjg-content-addressed-mirrors"]
 ,"x86_64-linux","builtin:download",[]
 ,[("content-addressed-mirrors","/gnu/store/wg1yp2vx8gb7qmcgyibqnwblahpp4bjg-content-addressed-mirrors")
  ,("disarchive-mirrors","/gnu/store/0mxnx8l4fgigvd7gakwdk6hc6im4wnai-disarchive-mirrors")
  ,("impureEnvVars","http_proxy https_proxy LC_ALL LC_MESSAGES LANG COLUMNS")
  ,("mirrors","/gnu/store/ckxc05iflc8jagdxwh4z1cxc23mb6i6q-mirrors")
  ,("out","/gnu/store/3dq55rw99wdc4g4wblz7xikc8a2jy7a3-hello-2.12.1.tar.gz")
  ,("preferLocalBuild","1")
  ,("url","\"mirror://gnu/hello/hello-2.12.1.tar.gz\"")])
--8<---------------cut here---------------end--------------->8---

is still available; when it is not, Guix has to rely on another system
(here ‘url’).  In a way, I am proposing to optionally add more “content
addressing” identifiers than the current NAR+SHA256 (and URL/URI), so
that Guix can then exploit other “content addressing” systems.

> So it seems to me that the basics are already in place.

Well, there are two possible choices: (1) rely on an external service
that would bridge the different content addressing systems (such as
extending the Disarchive database, or hoping SWH will do it :-)), but
this external service needs to be always available; or (2) extend the
information carried by packages (optional fields, etc.).

Moreover, about (1), all third-party channels would have to be ingested
by this external service.  For SWH, that’s possible.  For the
Disarchive database, it would mean registering each third-party channel
or having them maintain their own database.  This contrasts with (2),
where the identifier would optionally be part of the package definition.

> What’s missing, both in SWH and in Guix, is the ability to store
> multiple hashes.  SWH could certainly store several hashes, computed
> using different serialization and hash algorithm combinations.

Please note that currently Guix relies on a “hint” when SWH is used as
fallback.  For instance, in most git-fetch cases, Guix provides the
context (URL and Git tag) to the SWH API and lets SWH resolve it in
order to find the content addressing identifier.  It works in many
cases but it fails for “history of history” cases, e.g., when upstream
does in-place tag replacement.
And this strategy does not work with Subversion (svn-fetch), Mercurial
(hg-fetch) or other methods.  It requires more work on our side (parse
the result of the query, extract the relevant information, etc.).
Nothing impossible but far from done, IMHO. :-)

Well, I still have mixed feelings about the robustness of the SWH
fallback. :-)

> This is what you suggested at
> ; it was
> also discussed in the thread at
> .  It
> would be awesome if SWH would store Nar hashes; that would solve all our
> problems, as you explained.

Yeah, that’s nice. :-)  The progress is tracked by

    https://gitlab.softwareheritage.org/swh/meta/-/issues/4979

and the first part, computing the NAR, is now merged, IIUC, with:

    https://gitlab.softwareheritage.org/swh/devel/swh-loader-core/-/merge_requests/459

However, exposing this NAR via their API and then bridging NAR -> swhid
is not planned on the SWH side yet, AFAIK.

> The other option (storing multiple hashes for each origin in Guix)
> doesn’t sound practical: I can’t imagine packages storing and updating
> more than one content hash per package.  That doesn’t sound reasonable.
> Plus it would be a long-term solution and wouldn’t help today.

Storing a list of content addressing identifiers (NAR+SHA256, Git+SHA1,
GNUnet, IPFS, etc.) would add robustness, IMHO.
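
To make “several identifiers for one source” concrete, here is a
minimal Python sketch (not Guix code) that derives a few of them from
the same bytes.  The function name `identifiers` and the label
"none+sha256" are made up for illustration.  It only covers a single
flat file, so it computes neither the NAR serialization nor a
swh:1:dir: identifier for a directory.

```python
import hashlib


def identifiers(data: bytes) -> dict:
    """Return several content-derived identifiers for the same bytes.

    Hypothetical helper for illustration; the keys mimic the
    "serialization+hash" naming used in this message.
    """
    # Git hashes a blob as SHA-1 over a "blob <size>\0" header
    # followed by the raw content.
    git_blob = hashlib.sha1(b"blob %d\x00" % len(data) + data).hexdigest()
    return {
        # No serialization: plain hash of the file bytes.
        "none+sha1": hashlib.sha1(data).hexdigest(),
        "none+sha256": hashlib.sha256(data).hexdigest(),
        # Git serialization of a single file (a blob).
        "git+sha1": git_blob,
        # SWHIDs for file contents reuse the Git blob hash.
        "swhid": "swh:1:cnt:" + git_blob,
    }


if __name__ == "__main__":
    for name, value in identifiers(b"hello\n").items():
        print(name, value)
```

For a single file, the Git blob hash and the SWHID are the same value
under two spellings, so one stored identifier can serve two systems.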
Put differently, we cannot afford a dedicated ‘gnunet-fetch’ method as
proposed in [1], but we could optionally have:

  (origin
    (method url-fetch)
    (uri (string-append "mirror://gnu/hello/hello-" version
                        ".tar.gz"))
    (sha256
     (base32
      "086vqwk2wl8zfs47sq2xpjc9k066ilmb8z6dn0q6ymwjzlm196cd"))
    (identifiers
     (list
      (gnunet "Y48PGS5RVX643NT2B7GDNFCBT4DWG692PF4YNHERR96K6MSFRZ4ZWRPQ4KVKZV29MGRZTWAMY9ETTST4B6VFM47JR2JS5PWBTPVXB0.8A9HRYABJ7HDA7B0")
      (git+sha1 "swh:1:dir:013573086777370b558b1a9ecb6d0dca9bb8ea18")
      (none+sha1 "8f261739d33d31867ab9c5fa26f973c37da26ca5"))))

And we could also have the Git commit hash (for packages using the
git-fetch method), etc.

Having an optional ‘identifiers’ field would help today for all fetch
methods other than url-fetch and git-fetch.

For sure, it is not straightforward.  For instance, how do we ensure
consistency?  Via “guix lint”?  Something else?

Well, on the other hand, sometimes I would like to have a list of
sources using different fetch methods: say, first try this url-fetch,
then this git-fetch, then this SWH fallback, etc.

To me, the other viable option would be to extend the Disarchive
database and the services around it.

Thoughts?

Cheers,
simon

1: https://issues.guix.gnu.org/44199#0-lineno68
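
PS: a minimal sketch of the “try this fetch method, then that one”
idea, in Python rather than Guix code.  The name `fetch_with_fallbacks`
and the zero-argument fetcher callables are hypothetical.  The only
point is that any location is acceptable as long as the bytes match the
expected content hash; a plain SHA256 over the bytes is assumed here,
which fits a flat file but not a directory serialized as NAR.

```python
import hashlib
from typing import Callable, Iterable, Optional


def fetch_with_fallbacks(expected_sha256: str,
                         fetchers: Iterable[Callable[[], bytes]]
                         ) -> Optional[bytes]:
    """Try each fetcher in order; return the first verified content.

    Each fetcher is a zero-argument callable returning bytes, standing
    in for a url-fetch, a git-fetch, an archive lookup, and so on.
    """
    for fetch in fetchers:
        try:
            data = fetch()
        except Exception:
            # This source is unreachable or failed: try the next one.
            continue
        # Accept the content only if it matches the expected hash,
        # whatever location it came from.
        if hashlib.sha256(data).hexdigest() == expected_sha256:
            return data
    return None


if __name__ == "__main__":
    wanted = b"hello"
    digest = hashlib.sha256(wanted).hexdigest()

    def unreachable() -> bytes:
        raise OSError("mirror is down")

    sources = [unreachable, lambda: b"tampered", lambda: wanted]
    print(fetch_with_fallbacks(digest, sources) == wanted)
```

The order of the list encodes the preference, and a source that returns
the wrong bytes is treated the same way as one that is down.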