From: zimoun <zimon.toutoune@gmail.com>
To: Liliana Marie Prikler <liliana.prikler@gmail.com>, guix-devel@gnu.org
Subject: Re: On raw strings in <origin> commit field
Date: Wed, 29 Dec 2021 09:39:19 +0100
Message-ID: <86y243kdoo.fsf@gmail.com>
In-Reply-To: <6e451a878b749d4afb6eede9b476e5faabb0d609.camel@gmail.com>

Hi,

On Tue, 28 Dec 2021 at 21:55, Liliana Marie Prikler <liliana.prikler@gmail.com> wrote:

> Consider a package being added or updated in Guix.  At the time of
> commit, we have the tag v1.2.3 pointing towards commit deadbeef.  We
> therefore create a guix package with version "1.2.3" pointing to said
> commit (either directly or indirectly).  At this point, one of the
> following holds:
>   (1) Guix "1.2.3" -> upstream "v1.2.3" -> upstream "deadbeef"
>   (2) Guix "1.2.3" -> upstream "deadbeef" <- upstream "v1.2.3"
> From either, we can follow that Guix "1.2.3" = upstream "v1.2.3".  If
> upstream keeps their tags around, then both forms are equivalent, but
> (1) is more convenient; it allows us to derive commit from version,
> which is often done through an affine mapping.

No, tags and commit hashes are not equivalent.  A commit hash is
intrinsic: it depends only on the content.  Tags, by contrast, are
extrinsic: they depend on an external choice.

Going from the content to the hash involves three keys: 1) how to
serialize, 2) how to hash, and 3) how to represent the hash.  For #1,
Git uses its own serializer while Guix, inheriting from Nix, uses
another one (Nar); the difference is minor, though.  For #2, Git uses
SHA-1 by default as hash function, whereas Guix uses SHA-256.  And for
#3, Git uses the hexadecimal format and Guix uses nix-base32.

The subcommand “guix hash” with the options ’-S, -H’ and ’-f’ exposes
these 3 keys.  For instance:

        $ cat /tmp/foo.txt | git hash-object --stdin
        557db03de997c86a4a028e1ebd3a1ceb225be238
        $ ./pre-inst-env guix hash -S git -H sha1 -f hex /tmp/foo.txt
        557db03de997c86a4a028e1ebd3a1ceb225be238
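
As a rough illustration of the three keys, here is a minimal Python
sketch of what Git does for a single file (a blob).  The function names
are mine and purely illustrative; this is not how Guix implements
’guix hash’, only a restatement of the serialize / hash / represent
steps under the assumption that the input is one regular file.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Return the Git object ID of `data` stored as a blob."""
    # Key 1, serialization: Git prepends a "blob <size>\0" header.
    serialized = b"blob " + str(len(data)).encode("ascii") + b"\0" + data
    # Key 2, hash function: SHA-1, Git's default.
    digest = hashlib.sha1(serialized)
    # Key 3, representation: lowercase hexadecimal.
    return digest.hexdigest()


def plain_sha256_hex(data: bytes) -> str:
    """Same content, other keys: no header, SHA-256, hexadecimal."""
    return hashlib.sha256(data).hexdigest()


if __name__ == "__main__":
    # The empty blob has a well-known Git ID.
    print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
    # Changing any of the keys changes the identifier of the same content.
    print(plain_sha256_hex(b""))
```

The same bytes get a different identifier as soon as one of the three
keys changes, which is why the keys have to be stated together with the
hash.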


To make it explicit: in principle, the checksum hash of a
’git-reference’ could be removed, because it is somewhat redundant with
the commit hash.  Obviously, it cannot be, for security reasons (SHA-1
is considered weak).


> Problems arise, when upstreams move or delete tags.  At this point,
> guix packages that use them break and are no longer able to fetch their
> source code.  Raw commits are in principle resilient to this kind of
> denial of service; instead upstreams would have to actually delete the
> commits themselves, including also possible backups such as SWH to
> break it.  There is certainly an argument for robustness to be made
> here, particularly concerning `guix time-machine', though as noted it
> is not infallible.  

SWH provides ’swh:id’, which is another such triplet (really close to
Git's).  Basically, content means data plus metadata and, to make it
short, SWH handles metadata in its own way because of the scale it
operates at.  SWH also takes snapshots of Git repositories.

Therefore, to have something really robust, Guix has to rely on a map
from package definition to SWH.

Using the Git commit hash instead of the tag provides this map
directly.  With a tag, to have something robust, we need an external
map from the checksum hash to the SWH hash via the Git commit hash.
This “external” map is provided by Disarchive.
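
To illustrate why a commit hash gives the map for free while a tag does
not, here is a small Python sketch.  It assumes the SWH identifier
scheme ’swh:1:<type>:<40 hex digits>’, where the identifiers of
contents, directories and revisions coincide with Git's blob, tree and
commit IDs.  The helper name ’swhid’ is hypothetical and it does not
query SWH at all.

```python
import re

# SWH object types corresponding to Git object kinds (assumed mapping).
_KINDS = {"blob": "cnt", "tree": "dir", "commit": "rev"}
_HEX40 = re.compile(r"[0-9a-f]{40}")


def swhid(git_kind: str, git_object_id: str) -> str:
    """Build an SWH identifier from an intrinsic Git object ID.

    Raises ValueError for anything that is not a full SHA-1 in
    hexadecimal, e.g. a tag name: an extrinsic name cannot be turned
    into an identifier without an external lookup.
    """
    if git_kind not in _KINDS:
        raise ValueError(f"unsupported Git object kind: {git_kind!r}")
    if not _HEX40.fullmatch(git_object_id):
        raise ValueError(f"not an intrinsic identifier: {git_object_id!r}")
    return f"swh:1:{_KINDS[git_kind]}:{git_object_id}"


if __name__ == "__main__":
    # The blob ID from the 'git hash-object' example maps directly.
    print(swhid("blob", "557db03de997c86a4a028e1ebd3a1ceb225be238"))
    try:
        swhid("commit", "v1.2.3")  # a tag: nothing to compute from
    except ValueError as err:
        print("needs an external map:", err)
```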


> Long-term, we might want to support having multiple <git-references> in
> git-fetch -- if the first one fails due to a hash mismatch, we would
> warn about that instead of producing an error and thereafter continue
> with the second, third, etc. similar to how we currently have mirror://
> urls for some well-known mirrored repositories.  That way, we have a
> system to warn us about naughty upstreams while also providing
> robustness for the time machine.

I think the long-term goal is to remove tags completely and use only
commit hashes, as done for ’guile-aiscm’.  But I guess that will not
happen, for reasons of convenience.

What you are proposing mixes extrinsic values (tag, URL, etc.) with
intrinsic ones (commit hash, checksum hash, etc.).  Well, I do not know
whether the proposed fallback mechanism would ease maintenance or make
Guix more robust.

To me, robustness means building a map from intrinsic values to
content, as Disarchive does for instance.
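
A toy Python sketch of what I mean by a map from intrinsic values to
content: the lookup key is the hash itself, so it does not matter which
source answers, as long as what it returns hashes to the key.
Everything here (’fetch_by_hash’, the dict-backed sources) is
hypothetical and only illustrates the idea; it is not how Disarchive or
the Guix downloaders work.

```python
import hashlib
from typing import Callable, Iterable, Optional

# A source takes a SHA-256 hex digest and returns bytes, or None if unknown.
Source = Callable[[str], Optional[bytes]]


def fetch_by_hash(sha256_hex: str, sources: Iterable[Source]) -> bytes:
    """Return content whose SHA-256 is `sha256_hex`, trying each source."""
    for source in sources:
        data = source(sha256_hex)
        # Accept an answer only if it really hashes to the requested key.
        if data is not None and hashlib.sha256(data).hexdigest() == sha256_hex:
            return data
    raise LookupError(f"no source provides content for {sha256_hex}")


if __name__ == "__main__":
    content = b"source code of some package\n"
    key = hashlib.sha256(content).hexdigest()

    upstream = {key: b"tampered or moved content\n"}  # wrong answer
    archive = {key: content}                          # correct answer

    found = fetch_by_hash(key, [upstream.get, archive.get])
    print(found == content)
```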


Cheers,
simon

