Continued where left off from 42023: Ludovic Courtès writes: >> From b992a3aaac7e3b30222e0bf1df09093f18e25e6a Mon Sep 17 00:00:00 2001 >> From: Caleb Ristvedt >> Date: Sat, 8 Aug 2020 11:25:57 -0500 >> Subject: [PATCH 3/6] deduplication: retry on ENOENT. >> >> It's possible for the garbage collector to remove the "canonical" link after >> it's been detected as existing by 'deduplicate'. This would cause an ENOENT >> error when replace-with-link attempts to create the temporary link. This >> changes it so that it will properly handle that by retrying. > > Would that ENOENT cause an error, or just a missed deduplication > opportunity? Depends on how we handle it. Previously the error would be uncaught entirely; simply skipping deduplication of that file would work, though it would be suboptimal. > >> * guix/store/deduplication.scm (replace-with-link): renamed to >> canonicalize-with-link, now also handles the case where the target link >> doesn't exist yet, and retries on ENOENT. >> (deduplicate): modified to use canonicalize-with-link. > > There’s an issue with this patch. I gave it a spin (offloading a few > builds) and it got stuck in a infinite loop: > > stat("/gnu/store/.links/0k63r0n3681r2gqd00blq4z5xd7cw1knv0x049p99f0pq31brhrk", 0x7ffe43898cd0) = -1 ENOENT (Dosiero aŭ dosierujo ne ekzistas) > link("/gnu/store/83jy739bn644w3pnwgb5kwjig0kzs92f-libwps-0.4.12/lib/libwps-0.4.so", "/gnu/store/.links/0k63r0n3681r2gqd00blq4z5xd7cw1knv0x049p99f0pq31brhrk") = -1 EEXIST (Dosiero jam ekzistas) > stat("/gnu/store/.links/0k63r0n3681r2gqd00blq4z5xd7cw1knv0x049p99f0pq31brhrk", 0x7ffe43898cd0) = -1 ENOENT (Dosiero aŭ dosierujo ne ekzistas) > link("/gnu/store/83jy739bn644w3pnwgb5kwjig0kzs92f-libwps-0.4.12/lib/libwps-0.4.so", "/gnu/store/.links/0k63r0n3681r2gqd00blq4z5xd7cw1knv0x049p99f0pq31brhrk") = -1 EEXIST (Dosiero jam ekzistas) > stat("/gnu/store/.links/0k63r0n3681r2gqd00blq4z5xd7cw1knv0x049p99f0pq31brhrk", 0x7ffe43898cd0) = -1 ENOENT (Dosiero aŭ dosierujo ne ekzistas) > link("/gnu/store/83jy739bn644w3pnwgb5kwjig0kzs92f-libwps-0.4.12/lib/libwps-0.4.so", "/gnu/store/.links/0k63r0n3681r2gqd00blq4z5xd7cw1knv0x049p99f0pq31brhrk") = -1 EEXIST (Dosiero jam ekzistas) > I believe I can explain this. In 'deduplicate' we currently treat anything that isn't a directory as a hardlinkable thing. This includes symlinks (although it's implementation-defined whether symlinks can be hardlinked to - we use CAN_LINK_SYMLINK to test this in nix/libstore/optimise-store.cc). This means that at present we unconditionally attempt to deduplicate symlinks (which happens to work on linux). However, 'file-exists?' uses stat, not lstat, to check for file existence. Thus, if there is a dangling symlink, 'file-exists?' will return #f when passed it, but of course attempting to call link() to create it will fail with EEXIST. Attached is a modified patch that tests for file existence with lstat instead. I expect that will fix the problem. We should probably still add a test in 'deduplicate' for whether symlinks can be hardlinked to. Tangent: I was curious why libwps-0.4.so would be a dangling symlink, and it turns out that it's actually a relative symlink, so when accessing it via /gnu/store/...-libwps-0.4.12/lib/libwps-0.4.so it isn't dangling, but when accessing it via /gnu/store/.links/0k63r... it is. > I think we should work on reducing the complexity of that code (e.g., > there are several layers of system-error handling). I've since flattened it down into just one layer of catch'es. It adds a bit of redundancy, but might make it clearer. - reepca