Hi Guix,

This is an update to the preservation of Guix report. There are no new commits or fixed-output derivations in this report, but I spent some time cleaning up the results, and I think the improvements are worth sharing. The last report generated a lot of questions. This one doesn’t answer all of them, but it’s a big improvement:

Since the last report, I added many more reference categories and moved them to the database. The new categories are 'hg', 'svn', 'cvs', 'bzr', 'tar-bz2', 'tar', 'zip', and 'text'. Of these, only 'tar' and 'text' are being processed. The rest are currently unsupported by my scripts.

Moving the categories to the database allows me to make manual corrections when needed. It also encouraged me to look through the references a bit more carefully to track down some of the weirder 'text' sources (like Bash patches) and fix up some other ones (in the style of “/tar_gz?download=yes”).

I also made the fetching code more tenacious. Now it uses the content-addressed mirrors from Guix and Nix to find regular files, and will recover “easy” Git references from SWH (“easy” means the commit is specified).

Between improving the fetching code and adding 'tar' and 'text' processing, I’ve computed another 2.5K SWHIDs. We now have SWHIDs for 86% of our fixed-output derivations. There are only 51 “unknown” non-recursive Git sources now (the list is attached).

But that’s not all! The scripts now categorize failures, so we have a better idea of what’s going on with the remaining 14% “unknown” sources:

    no-ref:         13
    disarchive:    863
    fetch:        1262
    bail:         3324
    ------------------
    total:        5462

The “bail” category is all the stuff my scripts don’t yet process, like Mercurial repositories and bzip2 tarballs.

The “fetch” category is everything the scripts couldn’t track down.

The “disarchive” category is all the tarballs Disarchive failed to process. An interesting thing here is that most of them are from Cargo.
Long story short: older versions of Cargo used the “miniz” implementation of DEFLATE (rewritten in Rust) to compress tarballs. Disarchive doesn’t support this (yet...?). There are 686 old-Cargo-produced tarballs in the “disarchive” category.

The “no-ref” category covers a few fixed-output derivations used in bootstrapping that do not come from an origin record. I will probably just load them by hand eventually.

(In the future I hope to put some of this in the report itself.)

One last thing to add is that the SWH folks were very quick to fix the loading error, so the increase in missing sources for recent commits is now gone.

-- Tim
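
A quick sanity check on the failure tally, written as a small Python sketch rather than anything taken from the actual scripts. It only re-adds the four category counts quoted in this message, works out each category's share of the “unknown” sources, and computes what fraction of the Disarchive failures the 686 old-Cargo tarballs account for. The variable names are made up for illustration.

```python
# Failure counts copied from the report text; nothing here is measured.
failures = {"no-ref": 13, "disarchive": 863, "fetch": 1262, "bail": 3324}

# Number of old-Cargo-produced tarballs within the "disarchive" category.
OLD_CARGO_TARBALLS = 686

total = sum(failures.values())

# Share of each category among all "unknown" sources.
shares = {name: count / total for name, count in failures.items()}

# Fraction of Disarchive failures attributable to old Cargo tarballs.
cargo_share = OLD_CARGO_TARBALLS / failures["disarchive"]

for name, count in sorted(failures.items(), key=lambda kv: kv[1]):
    print(f"{name:>12}: {count:5d}  ({shares[name]:.1%})")
print(f"{'total':>12}: {total:5d}")
print(f"old-Cargo share of disarchive failures: {cargo_share:.1%}")
```

The counts do add up to the stated total, and the old-Cargo tarballs make up roughly four fifths of the “disarchive” category, which is what “most of them are from Cargo” refers to.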