From: wolf <wolf@wolfsden.cz>
To: 66268@debbugs.gnu.org
Subject: bug#66268: Guix makes invalid assumptions regarding guile-git guarantees leading to guix pull failing
Date: Fri, 29 Sep 2023 18:52:43 +0200
Message-ID: <ZRcA23RUYvBuE1JX@ws>
Table of Contents
_________________
1. Guix makes an invalid assumption regarding guile-git's guarantees
.. 1. Root cause analysis
..... 1. Object size
..... 2. Cache space
.. 2. Reproduction to verify the analysis
2. Possible solutions
.. 1. Do not use `eq?'
.. 2. Increase the cache size / limits
.. 3. Ensure the id stability in guile-git
3. Questions
.. 1. Is this a regression?
4. Attachments
.. 1. d51135e8477118dc63a7e5462355cd27e884f4fb
.. 2. test.scm
1 Guix makes an invalid assumption regarding guile-git's guarantees
===================================================================
There is an assumption made by Guix regarding guile-git, which is not
true. The problem is demonstrated using my fork, since that is where
I encountered it first, but official Guix will hit the same problem
sooner or later. I will also provide an independent repository for
the verification.
Guix made a design decision to compare commit objects using eq?, based
on the assumption that guile-git will return the same object for the
same commit. However that assumption is wrong and can lead to fun
issues like:
,----
| scheme@(guile-user)> (use-modules (git) (guix git))
| scheme@(guile-user)> (define %repo (repository-open "/tmp/my-fork"))
| scheme@(guile-user)> (define %hash "d51135e8477118dc63a7e5462355cd27e884f4fb")
| scheme@(guile-user)> (commit-relation
|                        (commit-lookup %repo (string->oid %hash))
|                        (commit-lookup %repo (string->oid %hash)))
| $5 = unrelated
`----
This does break (at least) `guix pull':
,----
| Updating channel 'guix' from Git repository at 'https://git.sr.ht/~graywolf/guix'...
| guix pull: error: aborting update of channel 'guix' to commit 4dbd25fa0e09b40ba2ab01d1e64fa36db652b501, which is not a descendant of d51135e8477118dc63a7e5462355cd27e884f4fb
| hint: This could indicate that the channel has been tampered with and is trying to force a roll-back, preventing you from getting the latest updates. If you think this is not the case,
| explicitly allow non-forward updates.
`----
The commit actually is a descendant, but it is not found in the
`commit-closure' due to the `eq?' comparison being used. The
verification of the relation between the commits:
,----
| $ git log --first-parent --oneline -4
| 4dbd25fa0e (HEAD -> master, origin/master) etc/fork-guix: Use absolute path for the patch file.
| 601029b97a Merge updates from the Guix proper
| afa5eabc93 git-authenticate: Fix tracking of trusted parents.
| d51135e847 Merge updates from the Guix proper
`----
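The failure mode itself does not need a Git repository to show. Below
is a minimal sketch in plain Guile; `<fake-commit>' and
`uncached-lookup' are hypothetical stand-ins of mine, not part of
guile-git. They model a lookup that misses the cache and therefore
hands out a fresh object each time.

```scheme
(use-modules (srfi srfi-9))

;; Hypothetical stand-in for a guile-git commit; only the ID matters here.
(define-record-type <fake-commit>
  (make-fake-commit id)
  fake-commit?
  (id fake-commit-id))

;; Models a lookup that misses the cache: a fresh object on every call.
(define (uncached-lookup id)
  (make-fake-commit id))

(define a (uncached-lookup "d51135e8477118dc63a7e5462355cd27e884f4fb"))
(define b (uncached-lookup "d51135e8477118dc63a7e5462355cd27e884f4fb"))

;; Identity comparison fails, comparison by ID succeeds.
(define same-object? (eq? a b))
(define same-id? (string=? (fake-commit-id a) (fake-commit-id b)))

(format #t "eq?: ~a, same ID: ~a\n" same-object? same-id?)
```

Both objects describe the same commit, yet only the comparison by ID
notices that.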
1.1 Root cause analysis
~~~~~~~~~~~~~~~~~~~~~~~
Guile-git is a wrapper around libgit2, and libgit2 (as far as I can
tell) makes no guarantees about returning the same commit object every
time a lookup is performed for the same hash. It does so only some of
the time, as a side effect of an internal cache.
Two rules decide whether a lookup is served from that cache.
1.1.1 Object size
-----------------
To be cached, an object has to be smaller than the configured size
limit for its object type. For commits, that limit is 4096B. It can
be increased
(`git_libgit2_opts' called with `GIT_OPT_SET_CACHE_OBJECT_LIMIT'), but
guile-git does not offer any binding for that functionality.
In the case illustrated above, the commit object shown in attachment
4.1 has a size of 4139B, and is therefore never cached. That means
that a new object is returned every time, and therefore two lookups
are never considered equal by Guix. As you can see, the commit
message, while long, is sensible and useful, so it is fairly easy to
hit the limit.
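To make the rule concrete, here is a toy model of a cache with a
per-object size limit. It is only a sketch of the behaviour described
above, not libgit2's implementation; `toy-lookup', `%cache' and
`<toy-object>' are names I made up, and only the 4096B limit and the
two sizes (210B and 4139B) come from this report.

```scheme
(use-modules (srfi srfi-9))

(define %size-limit 4096)   ;commit limit quoted above

;; Hypothetical object type; stands in for whatever the cache holds.
(define-record-type <toy-object>
  (make-toy-object id size)
  toy-object?
  (id toy-object-id)
  (size toy-object-size))

(define %cache (make-hash-table))

(define (toy-lookup id size)
  "Return the cached object for ID if there is one.  Otherwise create a
new object and cache it only when SIZE is below the limit."
  (or (hash-ref %cache id)
      (let ((object (make-toy-object id size)))
        (when (< size %size-limit)
          (hash-set! %cache id object))
        object)))

;; A small object is handed out again; a large one is rebuilt each time.
(define small-stable? (eq? (toy-lookup "small" 210) (toy-lookup "small" 210)))
(define large-stable? (eq? (toy-lookup "large" 4139) (toy-lookup "large" 4139)))

(format #t "small: ~a, large: ~a\n" small-stable? large-stable?)
```

In this model only the small object keeps its identity across lookups,
which is the pattern that breaks `eq?'.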
1.1.2 Cache space
-----------------
The cache is limited to 268435456B (256MiB), and it is pruned when
more space is needed. That is especially problematic because it can
lead to hard-to-debug failures that depend on the access pattern to
the repository.
A typical signed commit seems to be larger than 1kB. Based on the
cache size, a rough calculation suggests that Guix (which requires the
commits to be signed) can only store somewhere around 252052 commits
before running into possible issues. Currently there are 64416
commits since the channel introduction, which is already about a
quarter of that estimate, so it *will* be a problem in the future,
even if we keep the commits smallish.
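The estimate can be cross-checked with a few lines of Guile. All three
input numbers are taken from the text above; the per-commit size is
only what those numbers imply, not something I measured separately.

```scheme
(define cache-bytes 268435456)    ;cache limit quoted above
(define commit-capacity 252052)   ;estimated number of cacheable commits
(define commits-so-far 64416)     ;commits since the channel introduction

;; Exact rationals; converted to inexact only for printing.
(define bytes-per-commit (/ cache-bytes commit-capacity))
(define used-share (/ commits-so-far commit-capacity))

(format #t "cache size: ~a MiB\n" (/ cache-bytes (* 1024 1024)))
(format #t "implied size per commit: ~a B\n"
        (round (exact->inexact bytes-per-commit)))
(format #t "share of the estimate already used: ~a%\n"
        (round (* 100. used-share)))
```

The implied size is roughly 1065 bytes per commit, and the current
history already accounts for about a quarter of the estimated capacity.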
1.2 Reproduction to verify the analysis
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To verify that the analysis above is correct, I created a test
repository to run tests against. `head' refers to the commit at
HEAD; it is a small one (210B libgit2 entry size). `large' refers
to a commit with a large (4kB) commit message; it is the parent of
`head'. `root' refers to the root commit. All other commits are of
210B entry size. The repository is located here[0].
When the script from 4.2 (test.scm) is executed, the result seems to
support the conclusions reached in the previous section. The output
when executed on my machine is:
,----
| Checking (eq? (%commit X) (find-c (%hash X))) for X:
|
| ;;; (large #f)
|
| ;;; (head #t)
|
| ;;; (root #t)
| Collecting all commits...
| Checking if they match themselves...
| # of mismatches 300530 of 1578267 commits
| Relation between 'head and 'large: unrelated
`----
We can see that both `head' and `root' do match themselves, since they
are small and fit into the cache. However, `large' does not fit into
the cache, and therefore we get a new, distinct object from each
`commit-lookup' call. That seems to confirm the hypothesis regarding
the limit on object size that fits into the cache.
After that we record all commits into a vhash, overwhelming the cache
and forcing evictions. We then do a second sweep to check how many of
the commits still match themselves. We can see that about 19% of the
commits (300530 of 1578267) do not match, confirming the hypothesis
regarding the cache capacity.
And finally we see that `head' and `large' are reported as unrelated,
even though `large' is the parent of `head'. Git tooling does confirm
that:
,----
| $ git merge-base --is-ancestor 9b985229bcd 71f544c102a; echo $?
| 0
`----
0: <https://git.sr.ht/~graywolf/guix-guile-git-repro>
2 Possible solutions
====================
2.1 Do not use `eq?'
~~~~~~~~~~~~~~~~~~~~
The correct solution is to stop using `eq?' to compare the commits
(and other objects from guile-git, if that is being done). That will
come at some performance cost, but the benefit of being actually
correct outweighs that.
My partial solution is based on a new record type, `<commit-set>', to
use instead of (guix sets).
,----
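| (use-modules (srfi srfi-9)   ;define-record-type
|              (ice-9 vlist)   ;vhash-cons, vhash-assoc
|              (git))          ;commit-id, oid->string
|
| ;; Set of commits keyed by the hex string of their ID, so membership
| ;; does not depend on object identity.
| (define-record-type <commit-set>
|   (%make-commit-set vhash)
|   commit-set?
|   (vhash commit-set-vhash))
|
| (define (make-commit-set)
|   (%make-commit-set vlist-null))
|
| (define (commit-set-contains? commit commit-set)
|   (->bool (vhash-assoc (oid->string (commit-id commit))
|                        (commit-set-vhash commit-set))))
|
| (define (commit-set-insert commit commit-set)
|   (%make-commit-set (vhash-cons (oid->string (commit-id commit))
|                                 #t
|                                 (commit-set-vhash commit-set))))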
`----
It is not complete; for example, I do not check that the commits
come from the same repository.
However, I believe a similar approach is the preferred solution.
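To illustrate why keying on the ID is enough, here is a self-contained
sketch of an ancestry check that never looks at object identity. It
is not the code from Guix: `%history', `parents-of', `id-closure' and
`ancestor?' are hypothetical names, and the three-commit history only
mirrors the labels used in the reproduction above.

```scheme
(use-modules (ice-9 vlist))

;; Hypothetical history: each entry maps a commit ID to its parents' IDs.
(define %history
  '(("head"  . ("large"))
    ("large" . ("root"))
    ("root"  . ())))

(define (parents-of id)
  (or (assoc-ref %history id) '()))

(define (id-closure start)
  "Return a vhash whose keys are START and the IDs of all its ancestors."
  (let loop ((pending (list start))
             (visited vlist-null))
    (cond ((null? pending) visited)
          ((vhash-assoc (car pending) visited)   ;already seen, skip it
           (loop (cdr pending) visited))
          (else
           (loop (append (parents-of (car pending)) (cdr pending))
                 (vhash-cons (car pending) #t visited))))))

(define (ancestor? old new)
  "Return #t if OLD is NEW or one of its ancestors, comparing IDs only."
  (->bool (vhash-assoc old (id-closure new))))

(format #t "large is an ancestor of head: ~a\n" (ancestor? "large" "head"))
```

Since membership is decided by `equal?' on strings, it does not matter
whether two lookups of the same commit returned the same object.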
2.2 Increase the cache size / limits
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
While the size limit for objects to be cache-able is configurable, the
maximum size of the cache is hard-coded. Patching libgit2 is always
an option, but that does not really solve the issue, it just pushes it
down the line.
Removing the size limits is also an option, but would lead to unbounded
memory usage.
2.3 Ensure the id stability in guile-git
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Technically it should be possible to improve guile-git to ensure
object stability without patching libgit2, but it would basically
involve a second cache, without size limits. It could probably
utilize weak hash tables to keep the memory usage reasonable. However,
it would be tricky to make sure everything is correctly wrapped.
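A rough idea of what such a second cache could look like, using
Guile's weak-value hash tables. This is a sketch under my own
assumptions and not guile-git code: `<wrapped-commit>', `%interned' and
`intern-commit' are hypothetical, and a real version would have to wrap
the actual libgit2 pointers.

```scheme
(use-modules (srfi srfi-9))

;; Hypothetical wrapper; stands in for the Scheme-side commit object.
(define-record-type <wrapped-commit>
  (make-wrapped-commit id)
  wrapped-commit?
  (id wrapped-commit-id))

;; Weak values: an entry can be collected once nothing else references
;; the wrapper, so the table does not keep the whole history alive.
(define %interned (make-weak-value-hash-table))

(define (intern-commit id)
  "Return the wrapper already registered for ID, or register a fresh one."
  (or (hash-ref %interned id)
      (let ((commit (make-wrapped-commit id)))
        (hash-set! %interned id commit)
        commit)))

;; While C1 is referenced, a second lookup yields the very same object.
(define c1 (intern-commit "9b985229bcd447261b147c6bf70a86c2a345f234"))
(define c2 (intern-commit "9b985229bcd447261b147c6bf70a86c2a345f234"))
(define stable? (eq? c1 c2))

(format #t "same object on repeated lookup: ~a\n" stable?)
```

As long as any caller still holds a wrapper, further lookups for the
same ID return that exact wrapper, which is the guarantee `eq?' needs.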
3 Questions
===========
3.1 Is this a regression?
~~~~~~~~~~~~~~~~~~~~~~~~~
Since this was the approach chosen by Guix, did guile-git or libgit2
used to guarantee this? Should this be reported as a regression?
4 Attachments
=============
4.1 d51135e8477118dc63a7e5462355cd27e884f4fb
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
,----
| commit d51135e8477118dc63a7e5462355cd27e884f4fb
| Merge: 75daa689f1 0500af5556
| Author: Tomas Volf <wolf@wolfsden.cz>
| Date: Wed Sep 27 21:03:40 2023 +0200
|
| Merge updates from the Guix proper
|
| * upstream/master:
| Revert "build: Add missing guix-gc.timer file to binary tarball."
| gnu: tio: Update to 2.7.
| gnu: bcachefs-tools: Restore mount.bcachefs shell script version.
| gnu: bcachefs-tools: Remove obsolete phase.
| gnu: bcachefs-tools: Update to 1.2-0.1e35840.
| Revert "gnu: bcachefs-tools: Restyle format."
| gnu: sssd: Update to 2.9.2.
| gnu: wvkbd: Update to 0.14.1.
| gnu: yt-dlp: Update to 2023.09.24.
| gnu: lhasa: Update to 0.4.0.
| gnu: vcmi: Update to 1.3.2.
| gnu: cmark: Update to 0.30.3.
| gnu: python-tslearn: Update to 0.6.2.
| gnu: python-astropy: Update to 5.3.3.
| gnu: python-stdatamodels: Update to 1.8.0.
| gnu: python-roman-datamodels: Remove all test constraints.
| gnu: python-roman-datamodels: Update to 0.17.1.
| gnu: python-rad: Update to 0.17.1.
| gnu: python-pyvo: Update to 1.4.2.
| gnu: python-photutils: Update to 1.9.0.
| gnu: python-jwst: Update to 1.11.4.
| gnu: python-fitsio: Update to 1.2.0.
| gnu: python-crds: Update to 11.17.4.
| gnu: python-bayesicfitting: Update to 3.2.0.
| gnu: python-sunpy: Enable more tests.
| gnu: python-cdflib: Fix version detection.
| gnu: python-cdflib: Update to 1.1.0.
| gnu: python-astropy-healpix: Update to 1.0.0.
| gnu: splash: Update to 3.8.4.
| gnu: libxisf: Extend description.
| gnu: libxisf: Update to 0.2.9.
| gnu: python-coloful: Update to 0.5.5.
| gnu: python-pyotp: Update to 2.9.0.
| gnu: gajim: Clean up formatting.
| gnu: python-nbxmpp: Clean up formatting.
| gnu: gajim-openpgp: Update to 1.5.0.
| gnu: gajim-omemo: Update to 2.9.0.
| gnu: gajim: Update to 1.7.3.
| gnu: python-nbxmpp: Update to 4.2.2.
| gnu: transmission: Fix loading icons in pure environments.
| gnu: alex4: Remove non-free package.
| doc: Update bug-reference configuration snippet.
| tests: Assume ‘git’ is always available.
| git-download: Use “builtin:git-download” when available.
| perform-download: Use the ‘git’ command captured at configure time.
| build: Add dependency on Git.
| daemon: Add “git-download” built-in builder.
| perform-download: Remove unused one-argument clause.
| git-download: Honor the ‘GUIX_DOWNLOAD_FALLBACK_TEST’ environment variable.
| git-download: Move fallback code to (guix build git).
| tests: Adjust ‘guix graph --path’ test to latest Emacs changes.
| gnu: imgui: Update to 1.89.9.
| gnu: Add tracy.
| gnu: Add tracy-wayland.
| gnu: glfw: Patch dlopen calls.
| gnu: imgui: Enable freetype support.
| gnu: capstone: Update to 5.0.1.
| gnu: gtypist: Install the gtypist-mode Emacs major mode.
| multiqc: Don't propagate inputs.
| gnu: transmission: Restore HTML files in the default output.
| gnu: aalib: Really build the shared library on powerpc64le-linux.
| gnu: edk2-tools: Update to 202308.
| doc: Add new 'Circular Module Dependencies' section.
| gnu: embedded: Turn packages using top-level variables into procedures.
| gnu: avr: Delay all cross compilation packages.
| gnu: Add satdump.
| gnu: nng: Update to 1.5.2.
| gnu: sdrangel: Update to 7.16.0.
`----
4.2 test.scm
~~~~~~~~~~~~
,----
| (use-modules (ice-9 vlist)
|              (git)
|              (guix git))
|
| ;;; Tweak the path as necessary.
| (define %repo (repository-open "/home/wolf/tmp/guix-guile-git-repro"))
|
| ;;; All hashes that are of interest to us.
| (define %hashes '((large . "9b985229bcd447261b147c6bf70a86c2a345f234")
|                   (head . "71f544c102a658ed5f2f2258862f2d59cbe70b8b")
|                   (root . "db3f74122f4c384897ba7fddac73b893d19c1c67")))
| (define (%hash for)
|   "Return a sha1 hash for a specified key."
|   (assoc-ref %hashes for))
|
| (define %Xs
|   (map car %hashes))
|
| (define (find-c hash)
|   "Return a commit based on the string sha1 hash."
|   (commit-lookup %repo (string->oid hash)))
|
| ;;; Memoize the commits so that we can compare against them later.
| (define %commits (map (λ (k) `(,k . ,(find-c (%hash k))))
|                       %Xs))
| (define (%commit for)
|   "Return a memoized commit for a specified key."
|   (assoc-ref %commits for))
|
| (display "Checking (eq? (%commit X) (find-c (%hash X))) for X:\n")
| (for-each (λ (x)
|             (pk x (eq? (%commit x) (find-c (%hash x)))))
|           %Xs)
|
| (display "Collecting all commits...\n")
| (define all-commits
|   (let loop ((commits vlist-null)
|              (hash (%hash 'head)))
|     (let* ((c (find-c hash))
|            (parents (commit-parents c))
|            (commits (vhash-cons (oid->string (commit-id c))
|                                 c
|                                 commits)))
|       (if (null? parents)
|           commits
|           (loop commits
|                 (oid->string (commit-id (car parents))))))))
| (display "Checking if they match themselves...\n")
| (format #t "# of mismatches ~a of ~a commits\n"
|         (vlist-fold (λ (x total)
|                       (let ((hash (car x))
|                             (commit (cdr x)))
|                         (if (eq? commit (find-c hash))
|                             total
|                             (+ total 1))))
|                     0
|                     all-commits)
|         (vlist-length all-commits))
|
| (format #t "Relation between 'head and 'large: ~a\n"
|         (commit-relation (%commit 'head)
|                          (%commit 'large)))
`----
Have a nice day,
W.
--
There are only two hard things in Computer Science:
cache invalidation, naming things and off-by-one errors.