From: Timothy Sample
To: Ludovic Courtès
Cc: guix-devel, guix-sysadmin@gnu.org, Simon Tournier
Subject: Re: Disarchive database synchronization
References: <877cvj2uqs.fsf@inria.fr>
Date: Sat, 18 Mar 2023 13:49:34 -0600
Message-ID: <87sfe1lu0h.fsf@ngyro.com>
List-Id: "Development of GNU Guix and the GNU System distribution."

Hey Ludo,

Ludovic Courtès writes:

> I copied over the 12K entries that were missing from
> disarchive.guix.gnu.org.  (Note that there are currently only two copies
> of the database: one at/in [bB]erlin, and one at/in [Bb]ordeaux.)
> disarchive.guix.gnu.org now weighs in at 1.8 GiB for 31,839 entries.

Wow – 12K!  For some reason I thought it would be fewer.  It’s very
good that we (finally) sync’d up the databases.  Also, my set is now at
31,821 after collecting the runoff from the latest Preservation of Guix
Report.  That’s shockingly close to the 31,839 you have.

> For the remaining entries, it’s trickier.  Sometimes it’s just the
> gzip compression parameters that differ, which could be addressed with a
> little bit more work:
>
>   $ file ffdc77f5e5cb2390b9309de63eb7be68d9fe631e898f4da6c04a8159daefc2c0.gz ../../disarchive/sha256/ffdc77f5e5cb2390b9309de63eb7be68d9fe631e898f4da6c04a8159daefc2c0.gz
>   ffdc77f5e5cb2390b9309de63eb7be68d9fe631e898f4da6c04a8159daefc2c0.gz: gzip compressed data, max compression, from Unix, original size modulo 2^32 446731
>   ../../disarchive/sha256/ffdc77f5e5cb2390b9309de63eb7be68d9fe631e898f4da6c04a8159daefc2c0.gz: gzip compressed data, max speed, from Unix, original size modulo 2^32 446731

I’m not sure getting the compressed files to match matters.
Disarchive cares a lot about that when it comes to source code tarballs,
because everybody signs and computes checksums over the compressed
versions.  However, for these files, the differences introduced by
compression can be ignored.

> Sometimes it’s trickier:
>
>   # diff -u <(gunzip -d < 0001f025c1425ffe36270a81cb091eade87dd8d29ac773735ae47e1a8c8066c9.gz) <(gunzip -d < ../../disarchive/sha256/0001f025c1425ffe36270a81cb091eade87dd8d29ac773735ae47e1a8c8066c9.gz)
>   --- /dev/fd/63  2023-03-14 16:13:21.635733426 +0100
>   +++ /dev/fd/62  2023-03-14 16:13:21.635733426 +0100
>   @@ -1,7 +1,7 @@
>    (disarchive
>      (version 0)
>      (gzip-member
>   -    (name "webview-sys-0.6.2.tar.gz")
>   +    (name "rust-webview-sys-0.6.2.tar.gz")
>        (digest
>          (sha256
>            "0001f025c1425ffe36270a81cb091eade87dd8d29ac773735ae47e1a8c8066c9"))
>   @@ -13,7 +13,7 @@
>        (footer (crc 1807070134) (isize 121344))
>        (compressor zlib-best)
>        (input (tarball
>   -             (name "webview-sys-0.6.2.tar")
>   +             (name "rust-webview-sys-0.6.2.tar")
>                 (digest
>                   (sha256
>                     "4fb18f3206838e11f7f8caba6fad9e0f796109428b502793b9f2f0613fe0f275"))
>   @@ -78,7 +78,7 @@
>                      (padding 0)
>                      (input (directory-ref
>                               (version 0)
>   -                           (name "webview-sys-0.6.2")
>   +                           (name "rust-webview-sys-0.6.2")
>                               (addresses
>                                 (swhid "swh:1:dir:fa41df38bf639ada28c900b0915661e787fe6d15"))
>                               (digest

The name field is not used for data reconstruction.  It’s for human
consumption (and it may have made some early examples of use at the
command line easier to explain).  Here, the difference is based on the
fact that Crate URIs are weird, and the Preservation of Guix code does
not keep the origin file name.  Hence, the PoG version extracts the
Crate name alone from the URI, and the Cuirass version uses the Guix
package name with the “rust-” prefix.

> As Tim pointed out, Disarchive disassembly is not fully deterministic
> and/or might change a bit over time as Disarchive evolves, and that’s
> prolly what we’re seeing here.

I honestly think this is a good thing.  My instincts tell me that we
should excise all sources of ambiguity, like we’re trying to do in the
big picture.  However, Disarchive will get better at describing things
over time.  For instance, it doesn’t handle tar extension headers
elegantly at the moment.  In the future, if I fix this, I might
consider creating a “migrate” feature that improves existing
specifications (e.g., converting the old, verbose representation of
extension headers into the new representation).  In particular, I’ve
left some warts in the software in order to ship it, and I would be sad
to try and commit to those for the rest of time!  We might also add
other resolver addresses besides SWHIDs....

Maybe I’m missing some perspective, but I don’t think trying to commit
to reproducible outputs for Disarchive makes sense.


-- 
Tim

P.S., we’ll have to do this dance again shortly, as I just computed
2,023 historical bzip2 specifications.  They’re not online yet, but
they’ll be up when I publish the next PoG report – which should take
less than a year this time!  :p
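
To make the two points above concrete (the gzip wrapper around a
specification can be ignored, and the name field is not used for data
reconstruction), here is a rough Python sketch of how two copies of a
specification might be compared.  It is not part of Disarchive or of
the sync scripts; the helper names (normalized_spec, specs_equivalent)
are made up for illustration, and the regular expression is a naive
assumption about how `(name "...")` forms look, based only on the diff
quoted above.

```python
import gzip
import re

# Assumption: name fields always look like (name "...") with no escaped
# quotes inside the string, as in the diff quoted above.
NAME_FORM = re.compile(r'\(name\s+"[^"]*"\)')


def normalized_spec(gz_bytes: bytes) -> str:
    """Decompress a gzip'd specification and blank out its name fields."""
    text = gzip.decompress(gz_bytes).decode("utf-8")
    return NAME_FORM.sub('(name "")', text)


def specs_equivalent(a: bytes, b: bytes) -> bool:
    """Compare two compressed specifications, ignoring the gzip
    parameters and the human-oriented name fields."""
    return normalized_spec(a) == normalized_spec(b)


# Toy inputs modelled on the diff above (heavily shortened).
pog = b'(gzip-member (name "webview-sys-0.6.2.tar.gz") (compressor zlib-best))'
cuirass = b'(gzip-member (name "rust-webview-sys-0.6.2.tar.gz") (compressor zlib-best))'

a = gzip.compress(pog, compresslevel=9)      # "max compression"
b = gzip.compress(cuirass, compresslevel=1)  # "max speed"

print("identical bytes:", a == b)
print("equivalent specs:", specs_equivalent(a, b))
```

A check like this would only say that two entries describe the same
data under the assumptions above.  It would not cover other ways in
which Disarchive output may change over time.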