From: Timothy Sample
To: Ludovic Courtès
Cc: guix-devel@gnu.org, guix-science@gnu.org
Subject: Re: Software Heritage fifth anniversary event
References: <87sfvc4q8j.fsf@inria.fr>
Date: Wed, 01 Dec 2021 13:04:18 -0500
Message-ID: <87tufsgq1p.fsf@ngyro.com>

Ludovic Courtès writes:

> I gave a 10–15mn talk on how Guix uses SWH, what Disarchive is, what
> the current status of the “preservation of Guix” is, and what remains
> to be done:
>
>   https://git.savannah.gnu.org/cgit/guix/maintenance.git/plain/talks/swh-unesco-2021/talk.20211130.pdf

Wow – great work!

> I chatted with the SWH tech team; they’re obviously very busy solving
> all sorts of scalability challenges :-) but they’re also truly
> interested in what we’re doing and in supporting our use case.  Off the
> top of my head, here are some of the topics discussed:
>
>   • ingesting past revisions: if we can give them ‘sources.json’ for
>     past revisions, they’re happy to ingest them;

This is something I can probably coax out of the Preservation of Guix
database.  That might be the cheapest way to do it.  Alternatively, when
we get “sources.json” built with Cuirass, we could tell Cuirass to build
out a sample of previous commits to get pretty good coverage.  (Side
note: eventually we could verify the coverage of the sampling approach
using the Data Service, which has processed a very exhaustive list of
commits.)

>   • rate limit: we can find an arrangement to raise it for the purposes
>     of statistics gathering like Simon and Timothy have been doing (we
>     can discuss the details off-list);

Cool!  So far it hasn’t been a concern for me, but it would help in the
future if we want to try and track down Git repositories that have gone
missing.

>   • Disarchive: they’d like to better understand the “unknowns” in the
>     PoG plots (I wasn’t sure if it was non-tar.gz tarballs or what) and
>     to work on the definitely-missing origins that show up there;

Many of the unknowns are there for me to track Disarchive progress.
It’s not really the clearest reporting, but it tracks more what Guix can
handle automatically than what we could theoretically know about.
Basically something is “known” if it can be downloaded from upstream,
and either: it’s a non-recursive Git reference; or it’s something
Disarchive can handle.  Hence, we know nothing about other version
control systems and, say, “.tar.bz2” archives.  Also, all these things
are based on heuristics.  :)  As we get closer to 100% known, we can
start analyzing everything more closely.

>     they’re not opposed to the idea of eventually hosting or maintaining
>     the Disarchive database (in fact one of the developers thought we
>     were hosting it in Git and that as such they were already archiving
>     it -- maybe we could go back to Git?);

It’s a possibility, but right now I’m hopeful that the database will be
in the care of SWH directly before too long.  I’d rather wait and see at
this point.  I’m sure we could manage it, but the uncompressed size of
the Disarchive specification of a Chromium tarball is 366M.  Storing all
the XZ specifications uncompressed is over 20G.  It would be a big Git
repo!

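For illustration only, the “known” rule described above (downloadable
from upstream, and either a non-recursive Git reference or something
Disarchive can handle) could be sketched roughly as follows in Python.
This is not the Preservation of Guix code: the `Source` record, its
field names, and the suffix list are hypothetical, and the real
classification rests on heuristics that this sketch does not reproduce.

```python
from dataclasses import dataclass

# Assumption: archive suffixes counted as "something Disarchive can handle".
# The message only states that ".tar.bz2" is not covered; this list is
# illustrative, not taken from Disarchive itself.
DISARCHIVE_SUFFIXES = (".tar.gz", ".tgz", ".tar.xz")


@dataclass(frozen=True)
class Source:
    """Hypothetical record for one package source (not a real PoG schema)."""
    kind: str                # e.g. "git", "url", "hg", "svn"
    location: str            # URL or file name of the source
    downloadable: bool       # can it still be fetched from upstream?
    recursive: bool = False  # for Git: does the reference pull in submodules?


def is_known(src: Source) -> bool:
    """Return True if the source counts as "known" under the rule above."""
    if not src.downloadable:
        return False
    if src.kind == "git":
        # Only non-recursive Git references are handled.
        return not src.recursive
    if src.kind == "url":
        # Plain downloads count only when the archive type is supported.
        return src.location.endswith(DISARCHIVE_SUFFIXES)
    # Other version control systems fall into the "unknown" bucket.
    return False
```

Under these assumptions, a “.tar.bz2” download or a Mercurial checkout
is reported as unknown even when upstream still serves it, which is why
the unknown share says more about tooling coverage than about what is
actually missing.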
>   • bit-for-bit archival: there’s a tension between making SWH a
>     “canonical” representation of VCS repos and making it a faithful,
>     bit-for-bit identical copy of the original, and there are different
>     opinions in the team here; our use case pretty much requires
>     bit-for-bit copies, and fortunately this is what SWH is giving us in
>     practice for Git repos, so checkout authentication (for example)
>     should work even when fetching Guix from SWH.

That’s interesting.  I’m sure most of us in the Guix camp are on team
bit-for-bit, but I’m sure we can all agree that it’s not easy to get
there.

> There were other discussions about Guix and Nix and I was pleased to see
> people were enthusiastic about functional package management and about
> our whole endeavor.
>
> Anyway I think we can take this as an opportunity to increase bandwidth
> with the SWH developers!

Good idea.  It’s nice when our efforts and experience produce something
useful to the broader free software community.  :)


-- 
Tim