From: zimoun
To: Liliana Marie Prikler, guix-devel@gnu.org
Subject: Re: On raw strings in commit field
Date: Thu, 30 Dec 2021 13:43:40 +0100
Message-ID: <867dbmi7pf.fsf@gmail.com>
In-Reply-To: <899587fb6a76ddfa37d197d3d0fd23cdc7ad8592.camel@gmail.com>
References: <6e451a878b749d4afb6eede9b476e5faabb0d609.camel@gmail.com> <86y243kdoo.fsf@gmail.com> <899587fb6a76ddfa37d197d3d0fd23cdc7ad8592.camel@gmail.com>
List-Id: "Development of GNU Guix and the GNU System distribution."

Hi Liliana,

On Wed, 29 Dec 2021 at 21:25, Liliana Marie Prikler wrote:
> On Wednesday, 29.12.2021 at 09:39 +0100, zimoun wrote:
>> On Tue, 28 Dec 2021 at 21:55, Liliana Marie Prikler wrote:

> The notion of equivalence I am using here is the same as in the
> statement "5 ≡ 2 mod 3", wherein the ≡ symbol is ironically called
> IDENTICAL TO in Unicode despite being used very differently in
> mathematics.  Perhaps there is a language barrier here; in German we
> read that as "5 is equivalent to 2 modulo 3" and logic equivalence
> functions similarly.

I do not understand what you are arguing against, so I will skip it. :-)

> For the record, one could argue that I should have used that symbol for
> comparing Guix "1.2.3" to upstream "v1.2.3" because they are in fact
> not equal, only equivalent, but that's besides the point.
> The point is, with an upstream behaving as we want upstreams to behave
> (not just git ones, url-fetch suffers from the same issue with moving
> tarballs for instance), you can substitute one for the other without a
> change in meaning; both will fetch the same commit.

If I understand you correctly:

 - Guix "1.2.3" means the field ’version’
 - upstream “v1.2.3” means the upstream tag used by the field ’commit’
   of ’git-reference’.

and yes, it is strongly expected that these two fields match. :-)  But,
IMHO, this is irrelevant to your initial message, «commit tags are in
principle mutable and hence can not be relied on when fetching sources.
I do have a few issues with that explanation».

It is fortunate, and not robust, that ’commit’ matches ’version’ via the
upstream ’tag’, because ’commit’ and ’tag’ are defined differently.  I
cannot put it differently than: a Git commit depends only on the
content, whereas a ’tag’ does not.

A version (or tag) is a convenient name for humans.  It is easier to say
version 0.23.1 than 09rdbcr8dinzijyx9h940ann91yjlbg0fangx365llhvy354n840.
And we can deduce that 0.22.3 is older than 0.23.1, which is impossible
with commits.

If you prefer to keep the frame «you can substitute one for the other
without a change in meaning», then, for what my opinion is worth on that
matter, my probably wrong understanding of your words is that perhaps
you are missing a point about content-addressability.
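
To make the content-addressability point concrete, here is a minimal
Python sketch (not Guix code).  It computes the identifier Git assigns to
a blob, which is SHA-1 over a short header plus the content, and models a
tag as a plain name-to-id mapping.  The file contents, the `tags`
dictionary and the function name `git_blob_id` are illustrative
assumptions; a real commit id is computed over a commit object, which
also carries metadata, but the principle is the same.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object id Git assigns to a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical upstream contents for two different states of a release.
release_original = b"(define answer 42)\n"
release_retagged = b"(define answer 43)\n"

# A tag is only a name pointing at an id; nothing ties it to the content.
tags = {"v1.2.3": git_blob_id(release_original)}
pinned = tags["v1.2.3"]  # what a definition addressing by id would record

# Upstream moves the tag: same name, different content.
tags["v1.2.3"] = git_blob_id(release_retagged)

# The intrinsic id still designates the original content ...
print(pinned == git_blob_id(release_original))
# ... whereas the tag now designates something else.
print(tags["v1.2.3"] == pinned)
```

The id is recomputable from the content alone, so anyone holding the
content can check it; the tag can only be trusted as long as upstream
never moves it.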

>> From the content to the hash, three keys: 1) how to serialize and 2)
>> how to hash and 3) how to represent the hash.  For #1, Git uses their
>> own serializer and Guix, inheriting from Nix, uses another (Nar);
>> although the difference is minor.  For #2, Git uses by default SHA-1 as
>> hash function, although Guix uses SHA-256.  And for #3, Git uses
>> hexadecimal format and Guix uses nix-base32.

[...]

>> To make it explicit, the checksum hash of ’git-reference’ could be
>> removed because it is somehow redundant with the commit hash.
>> Obviously, it cannot because security reason (SHA-1 is considered as
>> weak).
>
> The other way also works.  If Git used a secure hashing function such
> as SHA-256 (or SHA-512 or Keccak) and Guix supported that hash, we
> could generate a git hash from the Guix hash (assuming also we allow
> the origin serializer to be configured, which would be required either
> way).

Yes, somehow.  To be on the same wavelength, we need to be precise when
we speak about a hash here, because a hash involves:

 - serializer: how to deal with all the bits making up the full content
   (files, folders, tree, etc.)
 - hashing function
 - format

So yes, in principle, instead of NAR + SHA-256 + Nix-base32, the Guix
project could have chosen Git + SHA-1 + Hex, or Git + SHA-512 + Base64,
or any other combination.  (I think this choice, inherited from Nix, is
rooted in the daemon implementation, and another triplet would have
required more changes when starting Guix, I guess.)
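
The triplet above (serializer, hashing function, format) can be sketched
in Python as follows.  This is an illustration under stated assumptions,
not how Guix or Git implement it: `toy_serialize` is a hypothetical
stand-in for a real serializer such as NAR or Git's tree format, and the
file tree is made up.  The nix-base32 helpers follow the encoding as I
understand it (the digest read as a little-endian number, written with 5
bits per character, most significant first).

```python
import base64
import hashlib

# Alphabet used by nix-base32 (the letters e, o, t and u are omitted).
NIX_BASE32 = "0123456789abcdfghijklmnpqrsvwxyz"


def nix_base32_encode(digest: bytes) -> str:
    """Format step: render raw digest bytes in nix-base32."""
    number = int.from_bytes(digest, "little")
    length = (len(digest) * 8 + 4) // 5  # ceil(bits / 5)
    return "".join(NIX_BASE32[(number >> (5 * n)) & 31]
                   for n in reversed(range(length)))


def nix_base32_decode(text: str, size: int) -> bytes:
    """Inverse of nix_base32_encode for a digest of `size` bytes."""
    number = 0
    for char in text:
        number = (number << 5) | NIX_BASE32.index(char)
    return number.to_bytes(size, "little")


def toy_serialize(files: dict) -> bytes:
    """Serializer step (hypothetical): sorted 'name NUL content NUL' records."""
    return b"".join(name.encode("utf-8") + b"\0" + data + b"\0"
                    for name, data in sorted(files.items()))


# Made-up content; any directory tree would do.
tree = {"README": b"hello\n", "main.scm": b"(display 42)\n"}
payload = toy_serialize(tree)

# Hashing-function step: two functions give two unrelated digests.
sha256_digest = hashlib.sha256(payload).digest()
sha1_digest = hashlib.sha1(payload).digest()

# Format step: three renderings of the very same SHA-256 digest.
as_hex = sha256_digest.hex()
as_base64 = base64.b64encode(sha256_digest).decode("ascii")
as_nix = nix_base32_encode(sha256_digest)
print(as_hex, as_base64, as_nix, sep="\n")

# Formats are reversible re-encodings of the same bytes.
print(bytes.fromhex(as_hex)
      == base64.b64decode(as_base64)
      == nix_base32_decode(as_nix, 32))

# The checksum quoted in this thread, re-rendered as hexadecimal.
GUIX_CHECKSUM = "09rdbcr8dinzijyx9h940ann91yjlbg0fangx365llhvy354n840"
print(nix_base32_decode(GUIX_CHECKSUM, 32).hex())
```

Only the last step of the triplet is reversible: changing the format is a
re-encoding of the same bytes, whereas changing the serializer or the
hashing function requires the content itself.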

However, knowing only the final Guix checksum hash (NAR + SHA-256 +
Nix-base32), say 09rdbcr8dinzijyx9h940ann91yjlbg0fangx365llhvy354n840,
you can easily convert it to any other format (Hex or Base64), but it is
not straightforward to compute the Git commit hash (here
c78b91edb7c17c6fbf3b294452f44e91d75e3c67) from this Guix checksum hash,
because the NAR and Git serializers have minor differences, and mainly
because one uses SHA-256 and the other SHA-1, and it is generally not
possible to convert a hash from one hashing function to another.

To make it short, my point is: a) a Git commit hash has the same
properties as any checksum hash and b) a string tag is obviously not a
checksum.

> I don't know too much about Disarchive here, so please enlighten me.
> If it used a pair of origin file name + hash, whether or not the
> git-reference uses tags would be irrelevant, no?  Do we have to take
> values from the uri field?

I am not sure I understand the questions.  Maybe the thread starting
here is worth reading:

Otherwise, could you explain more what you have in mind?

>> To me, robustness means make a map from intrinsic values to content;
>> as Disarchive is doing for instance.
>
> See above, I don't understand why Disarchive would need more than the
> content hash as an intrinsic value to do so.

Basically nothing more, so nothing to understand. :-)

Your initial message started with:

        when Ricardo recently added guile-aiscm to Guix, I was confused
        that both the version field of the package and the commit field
        of the git-reference used in its origin.  It turns out, that
        this is a rare pattern observed in less than 200 packages
        currently in Guix.  The reason to do so (as far as I understand
        and was explained to me in IRC) is that commit tags are in
        principle mutable and hence can not be relied on when fetching
        sources.  I do have a few issues with that explanation, but
        before that let's go a step back and discuss the relation of
        version and commit.

and my intent was to point out that the reason is not really the
“mutable” part; the reason is that it is better to rely on intrinsic
values (discussed in the link above).

Obviously, an intrinsic value is immutable but, IMHO, an intrinsic value
is above all a key for lookup in content-addressed systems.  The Git
commit hash is one way, SWH-ID is another, IPFS uses another, GNUnet
another, etc.  The recent ERIS [1,2] is an attempt to bridge them, IIUC.

Addressing ’origin’ by intrinsic values raises the question of which
ones, and The Right Thing is really hard to predict.

My opinion is that the robust long-term approach (i.e., the near future
I want) is to rely more on intrinsic values in ’source’ or ’origin’ and
less on tags, URLs, etc.

Well, I am fine if we disagree.  You asked «What do y'all think?», now
you know what I think. :-)

Last, sorry if I am misunderstanding you; back to your initial message.
You provided ’guile-aiscm’ as one example of something that confused
you.  Instead of the current definition, you would like this definition

--8<---------------cut here---------------start------------->8---
1 file changed, 1 insertion(+), 1 deletion(-)
gnu/packages/machine-learning.scm | 2 +-

modified   gnu/packages/machine-learning.scm
@@ -299,7 +299,7 @@ (define-public guile-aiscm
                (method git-fetch)
                (uri (git-reference
                      (url "https://github.com/wedesoft/aiscm")
-                     (commit "c78b91edb7c17c6fbf3b294452f44e91d75e3c67")))
+                     (commit (string-append "v" version))))
                (file-name (git-file-name name version))
                (sha256
                 (base32
--8<---------------cut here---------------end--------------->8---

?
Or something along these lines,

--8<---------------cut here---------------start------------->8---
(define-public guile-aiscm
  (let ((version "0.23.1")
        (commit "c78b91edb7c17c6fbf3b294452f44e91d75e3c67")
        (revision "0"))
    (package
      (name "guile-aiscm")
      (version (git-version version revision commit))
      (source (origin
                (method git-fetch)
                (uri (git-reference
                      (url "https://github.com/wedesoft/aiscm")
                      (commit commit)))
                (file-name (git-file-name name version))
                (sha256
                 (base32
                  "09rdbcr8dinzijyx9h940ann91yjlbg0fangx365llhvy354n840"))))
      [..]
--8<---------------cut here---------------end--------------->8---

?

And your point is that “0.23.1” is redundant with
“c78b91edb7c17c6fbf3b294452f44e91d75e3c67” because of Git, so why not
just use “0.23.1” in ’origin’.  Right?

As things currently stand, I do not think any rationale can be made in
favor of one of the three main possible definitions (addressing by tag,
by commit, using let).  The only weak justification for addressing by
commit hash is that the lookup when falling back to SWH is easier, i.e.,
it is easier when the Git commit hash is known instead of URL+tag.

These 200 packages can also be seen as real-world experiments
complementing the other ways of addressing, in order to find The Right
Way for robust addressing.

My personal preference, for what it is worth, is an explicit reference
to the commit, i.e., the current definition or the ’let’ one.

Note that this was also discussed: have convenient things such as
url+tag for ’uri’ and use the checksum coupled with an external service
such as disarchive.guix.gnu.org; but the definitions would not be
self-consistent anymore.  Heh, The Right Thing is not obvious. :-)

In other words, version and tag are currently first-class whereas commit
is second-class, somehow.  As you said, «it allows us to derive commit
from tag» (tag is mine).
And I think it is inherited from the long history of releasing software,
which is somehow inadequate these days.  Obviously, I do not know how to
do it, but it should be the contrary: commit as first-class, which
allows us to derive version as second-class.

1:
2:

Cheers,
simon

PS: You said in your initial email «(1) is more convenient; it allows us
to derive commit from version, which is often done through an affine
mapping.».

I do not understand the “affine mapping”.  Why would it be an affine
mapping?  Well, I miss what the affine space is here; I am able to
imagine the set, but what would be the vector space?  Bah, you are
probably referring to maths I have never studied. :-)