From: Timothy Sample
To: guix-devel@gnu.org
Subject: Preservation of Guix report for 2024-01-26
Date: Sat, 27 Jan 2024 18:47:27 -0600
Message-ID: <87a5oq8esg.fsf@ngyro.com>

Hello all,

For a while now, I’ve been tracking coverage of Guix sources in the
Software Heritage (SWH) archive. I maintain a dataset of sources that
goes back (almost five years) to Guix 1.0.0. Every once in a while, I
update this dataset and check it against SWH to see how much is missing.

I just put together a new report. The permalink is
https://ngyro.com/pog-reports/2024-01-26, but you can link to the
latest report, too: https://ngyro.com/pog-reports/latest/.

New in this edition is checking for Subversion sources and
bzip2-compressed tarballs. Subversion is well covered (98.5%), since
it is basically asking, “is TeX Live in SWH?”.
The bzip2 sources are similar to other compressed tarballs.

One of the benefits of this report is that it catches issues with our
integration with SWH. This is the second time that, while preparing
this report, I discovered that SWH had stopped loading sources from us.
When that happens, the number of missing sources starts climbing
steeply for recent commits. Before publishing this, I reached out to
SWH and they restarted the loader. It was able to bring in most of the
sources, but you can see a slight increase in missing sources about
halfway between September (when it stopped) and now. That’s likely due
to sources that came and went from our “sources.json” listing while
they weren’t looking.

Speaking of which, another benefit of this dataset is that we have a
list of ~6K historical sources that we would like to see added to SWH.
We are currently coordinating with them to load these sources. I plan
to update the report when we get results from that.

However, there remain a handful of missing sources that are current
and should be getting loaded. This suggests areas where we could
improve.

Here’s a not-quite-random sample of some of the current missing sources
(from commit 25bcf4e), and my thoughts as to why they are missing.

    mirror://gnupg/gpgme/gpgme-1.18.0.tar.bz2
    https://download.enlightenment.org/rel/apps/econnman/econnman-1.1.tar.gz
    https://ftp.heanet.ie/mirrors/ftp.xemacs.org/aux/compface-1.5.2.tar.gz
    mirror://cpan/authors/id/E/ET/ETHER/MooseX-Types-0.45.tar.gz
    mirror://apache/commons/daemon/source/commons-daemon-1.1.0-src.tar.gz

Some of these (I didn’t check them all) are in SWH as content rather
than directories. That’s kinda good, because Guix knows how to get
them, but also kinda mysterious. I’ve asked swh-devel about it.
Depending on the answer, I might have to adapt the checks to deal with
the possibility of SWH having the tarball rather than its contents.
In fact, that might be an improvement either way, but it muddies the
data model quite a bit.

    https://rubygems.org/downloads/rjb-1.6.7.gem
    https://rubygems.org/downloads/mspec-1.9.1.gem
    https://rubygems.org/downloads/cztop-0.12.2.gem
    https://rubygems.org/downloads/morecane-0.2.0.gem

This is an error on my side. I’ve been treating gems as regular files,
but they are (and SWH treats them as) tarballs.

    https://git.sr.ht/~abcdw/guile-ares-rs

This one was in SWH, but not up-to-date enough to have the tag we use.
I don’t think they regularly crawl git.sr.ht yet. Also, it looks like
they tried to visit this origin while SourceHut was down (around a week
ago). I used “Save code now” to fix this, and now this source is in
SWH. This kind of thing should be improved soon, as they are working
on new code that will pick up Git repositories from our “sources.json”
file.

Given that some of those tarballs and Ruby gems are in fact in SWH and
I’m just missing them, we are probably doing better than the report
suggests!

The short-term road map for this is to send the historical sources to
SWH and fix the Ruby gems, and then make a new report. So expect a
minor update with much better numbers soon-ish.

The long-term road map is to make it work like an archive. It will run
continuously and store *all* Guix sources. To make this easy
data-wise, it will only store what’s not covered by SWH. I avoided
this earlier out of fear of creating another point of failure. I’m
still afraid of this, but as it stands every source that is just out
there on the Internet and not in SWH is a point of failure. Surely
having them all in one place would be better, right?

-- Tim