From: Liliana Marie Prikler <liliana.prikler@gmail.com>
To: Mark H Weaver <mhw@netris.org>, guix-devel@gnu.org
Subject: Re: On raw strings in <origin> commit field
Date: Sat, 01 Jan 2022 12:12:33 +0100
Message-ID: <b72eb9430694fe7bccb6b37f9cfce7d8e47f1385.camel@gmail.com>
In-Reply-To: <871r1smdu6.fsf@netris.org>

On Friday, 31.12.2021 at 20:41 -0500, Mark H Weaver wrote:
> I disagree with the last line above.  What makes you think that I'm
> presupposing that the tag does change?
> 
> There's a difference between "presupposing that the tag does change"
> and "not assuming that the tag will not change".  Do you see the
> difference?
I'm pretty sure ¬assume(¬X) = assume(¬¬X) in this context.  You have to
start with some assumptions, and while ideally we'd like to encode "I
don't care", we do not have a system that allows us to do so.

> > However, if we are always talking about more than one possible
> > "1.2.3" (with the included future tag that we have yet to witness),
> > we lose the basis by which we currently assign "1.2.3" as the
> > version 
> 
> I see what you're getting at here, but still I disagree.  Our basis
> for associating version "1.2.3" with commit XYZ is simply that
> upstream had indicated that version "1.2.3" was commit XYZ.  That
> historical fact is immutable.
History is a social construct; it's not immutable.

> If upstream later indicates that version "1.2.3" is now commit YYZ, I
> don't think that invalidates our basis for continuing to associate
> version "1.2.3" with commit XYZ.  The aforementioned immutable
> historical fact still remains our basis and justification for making
> that association.
I'm pretty sure it does, particularly to a future observer who may not
have the luxury of a history to distinguish that record from one in
which a malicious committer linked those versions and tag together and
then no one bothered to check.

> Perhaps some people would prefer to use a distro where version
> "1.2.3" of package FOO could mean a different thing tomorrow than it
> means today.  Personally, that's not what I want.
> 
> If upstream changes their mind about the meaning of version "1.2.3",
> I want that to correspond to a different version number in Guix,
> perhaps "1.2.3a" or something, as Taylan suggested.  Incidentally, I
> vaguely recall that we've done that in the past, but I don't know if
> we've done it consistently.
The entire point here is to use git-version in combination with
let-bound commit hashes, yes.
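
For illustration, a minimal sketch of that pattern.  The package name,
URL, commit and hash are placeholders rather than a real package, and
the remaining fields are filled in only so that the record is complete:

```scheme
(use-modules (guix packages)
             (guix git-download)
             (guix build-system gnu)
             ((guix licenses) #:prefix license:))

(define-public foo
  ;; Placeholder commit; a real definition would pin the commit that
  ;; upstream published as "1.2.3" at the time of packaging.
  (let ((commit "0000000000000000000000000000000000000000")
        (revision "1"))
    (package
      (name "foo")
      ;; Yields "1.2.3-1.0000000": base version, revision, short commit.
      (version (git-version "1.2.3" revision commit))
      (source
       (origin
         (method git-fetch)
         (uri (git-reference
               (url "https://example.org/foo.git") ;hypothetical URL
               (commit commit)))
         (file-name (git-file-name name version))
         ;; Placeholder hash, to be replaced by the real one.
         (sha256
          (base32
           "0000000000000000000000000000000000000000000000000000"))))
      (build-system gnu-build-system)
      (synopsis "Placeholder package")
      (description "Placeholder package illustrating a let-bound commit.")
      (home-page "https://example.org/foo")
      (license license:gpl3+))))
```

If upstream later moves the "1.2.3" tag, bumping `revision' and `commit'
gives the new state its own version string instead of silently
redefining the old one.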

> > As pointed out elsewhere, SWH keeps a history of the tags that we
> > could look up until one matches,
> 
> Only if SWH took a snapshot at the right time.  I would guess that
> mutations of release tags usually happen within a few days after the
> release tag is first created.  Relying on SWH to take a snapshot
> within that possibly quite small time interval doesn't sound very
> robust to me.
If the scenario is "within a few days of release" vs. "literally
forever", then for one, only a relatively short range of Guix versions
would carry a broken reference (limiting the impact), and for another,
the majority of the audience would also associate the latter with said
version.  There's really no good argument to be had here from the
robustness side.

> > and there'd also be the option to keep a secondary index ourselves
> > (or have a third party do it).
> 
> That's true, but then we'd be adding another piece of centralized
> infrastructure that users would need to rely upon in order to
> reliably reproduce their systems.  That infrastructure would have to
> be maintained indefinitely.  If we failed to keep up maintenance, the
> users could run into problems reproducing their older systems.
> 
> It seems to me clearly better to avoid relying on a piece of
> centralized infrastructure if it can be easily avoided, no?
We can make that a key-value store for which you write a distributed
MapReduce function in Erlang if it makes you happier.

> > > On the other hand, if we refer to git _commit hashes_, then it
> > > *is* feasible for us to fetch the archived source from SWH,
> > > regardless of what upstream has done to its tags in the meantime.
> > > 
> > > For that reason alone, I think that the way Ricardo wrote the
> > > guile-aiscm package definition is clearly the right approach,
> > > given Guix's longstanding goals.
> > To me, it rather sounds like a workaround for longstanding bugs [1,
> > 2].
> [...]
> > [1] https://issues.guix.gnu.org/28659
> > [2] https://issues.guix.gnu.org/39575
> 
> I don't understand how it's a workaround for those bugs.  Even if
> those bugs were fixed, we'd still need a reliable way to find the git
> commit that matches the one expected by a git-fetch <origin> record,
> i.e. the one that will produce a source checkout with the expected
> SHA256 hash.
> 
> Am I missing something?
We are working on the base assumption here that we have an (array of)
reachable fallbacks in any case.  I don't think it's too big of a leap
to assume that we can keep a mapping
  (origin-file-name x origin-hash) → canonicalized-uri
around, either as part of said fallbacks or in parallel.
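
As a rough sketch of such a mapping in plain Guile: the table, the
procedure names and the sample entry below are made up for
illustration, and a real index would live alongside the fallbacks
rather than in memory.

```scheme
;; Hypothetical in-memory index keyed by (file-name . hash).
;; hash-set!/hash-ref compare keys with equal?, so pairs of strings work.
(define %uri-index (make-hash-table))

(define (register-origin! file-name hash uri)
  "Record that the origin named FILE-NAME with content HASH lives at URI."
  (hash-set! %uri-index (cons file-name hash) uri))

(define (lookup-canonical-uri file-name hash)
  "Return the canonical URI for FILE-NAME and HASH, or #f if unknown."
  (hash-ref %uri-index (cons file-name hash) #f))

;; Sample entry; both the hash string and the URL are placeholders.
(register-origin! "foo-1.2.3-checkout" "placeholder-hash"
                  "https://example.org/foo.git")
```

Since the content hash is part of the key, a retagged "1.2.3" with
different contents simply misses the lookup instead of resolving to the
wrong source.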

> Regarding "Tricking Peer Review": I think it would be ideal for
> package definitions to include both the git tag _and_ the git commit
> hash, and to teach our linter to raise an alarm when the expected
> tags are missing or fail to match the expected commit hash.
That is among the solutions I've proposed here, so naturally I'd be
fine with it.
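
A sketch of what the linter's comparison could boil down to, in plain
Guile.  `check-tag' and its result symbols are hypothetical, and the
alist of upstream tags is assumed to have been obtained elsewhere (for
instance by listing the remote's tags):

```scheme
(define (check-tag upstream-tags tag expected-commit)
  "Compare TAG against EXPECTED-COMMIT.  UPSTREAM-TAGS is an alist of
(tag-name . commit-hash) strings as currently observed upstream.  Return
'ok, 'tag-missing or 'tag-mismatch."
  (let ((actual (assoc-ref upstream-tags tag)))
    (cond ((not actual) 'tag-missing)
          ((string=? actual expected-commit) 'ok)
          (else 'tag-mismatch))))

;; Made-up data: the recipe expects "v1.2.3" to be commit "aaaa".
(define %observed-tags
  '(("v1.2.2" . "9999")
    ("v1.2.3" . "bbbb")))          ;upstream has moved the tag

(check-tag %observed-tags "v1.2.3" "aaaa")   ;tag-mismatch
(check-tag %observed-tags "v1.2.4" "cccc")   ;tag-missing
```

Either non-ok result would be a warning rather than a build failure,
since the commit hash in the recipe still identifies the source.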

> For similar reasons, it would also be good to include the
> fingerprints of upstream PGP signing keys in our package definitions,
> and to teach our linter to check those signatures and that they match
> the SHA256 hashes in our recipes.
> 
> What do you think?
I think that is one of the main things we could import over from Guix
for Racket users (previously Xiden, currently denxi).  I.e. we could
have 

  (origin
    ...
    (sha256 some-hash)
    (sha512 some-other-hash)
    (pgp-signature sig)
    [other validation forms...]
    [patches and snippet])

We would have to break record ABI for that, but imo with field
sanitizers that's something we could code up.  If at some time in the
future SHA-2 is broken, we can then still rely on the fact that
breaking all of these checks at once would be difficult and perhaps
not worth it for GNU Hello.

Now obviously, there is a performance tradeoff here.  You don't want to
check every signature all the time for a relatively minor build
(particularly with Rust, where the build phase is literally copy-paste
for 90% of the packages).  So we'd have to add a configuration option
on a sliding scale from "only check the weakest", through "only check
the strongest" and "check at least N at random or all of them", to
"check everything always".
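
To make that scale a bit more concrete, here is a sketch in plain
Guile.  Nothing like this exists in Guix; the policy names, the
`select-checks' procedure and the numeric strength ranking of the checks
are all assumptions made for the example.

```scheme
(use-modules (srfi srfi-1)
             (ice-9 match))

(define (shuffle lst)
  "Return LST in random order by sorting on random keys."
  (map cdr
       (sort (map (lambda (x) (cons (random 1.0) x)) lst)
             (lambda (a b) (< (car a) (car b))))))

(define (select-checks checks policy)
  "CHECKS is a list of (name . strength) pairs, larger meaning stronger.
POLICY is 'weakest, 'strongest, 'all or (at-least N).  Return the checks
to run."
  (let ((sorted (sort checks (lambda (a b) (< (cdr a) (cdr b))))))
    (if (null? sorted)
        '()
        (match policy
          ('weakest (list (first sorted)))
          ('strongest (list (last sorted)))
          ;; N random checks, or all of them if there are fewer than N.
          (('at-least n) (take (shuffle sorted) (min n (length sorted))))
          ('all sorted)))))

;; Assumed ranking of the validation forms from the sketch above.
(define %checks
  '((sha512 . 2) (pgp-signature . 3) (sha256 . 1)))

(select-checks %checks 'weakest)        ;just the sha256 check
(select-checks %checks '(at-least 2))   ;two checks picked at random
```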

Cheers

