From: zimoun
To: Timothy Sample, guix-devel@gnu.org
Subject: Re: Preservation of Guix Report
In-Reply-To: <87o87jjx54.fsf@ngyro.com>
References: <87o87jjx54.fsf@ngyro.com>
Date: Thu, 21 Oct 2021 09:39:27 +0200
Message-ID: <86sfwug72o.fsf@gmail.com>

Hi Timothy,

On Wed, 20 Oct 2021 at 15:48, Timothy Sample wrote:

> Early this summer I did a bunch of work trying to figure out which Guix
> sources are preserved by the SWH archive.  I’m finally ready to share
> some preliminary results!
>
>     https://ngyro.com/pog-reports/2021-10-20/

Cool!  Really interesting.

> What’s cool is that the report is automated.  Next on my list is to
> update the database and generate a new report.  Then, we can compare the
> results and see if we are improving.  (My read on the results so far is
> that improving “sources.json” will yield big improvements, but we might
> not be able to get to that before the next report.)

Here are two minor comments:

 1. For a couple of days now, I have been running:

      $ GUIX_SWH_TOKEN=$TOKEN guix lint -c archival

    where $TOKEN is provided by the SWH Authentication service [1].
    With the token, the rate limit is 1200 instead of 120.  Therefore,
    more ‘git-fetch’ packages are added.  I am in the process of
    automating that, but do not hold your breath. :-)

 2. For still unknown reasons, the bridge between SWH and Disarchive has
    some holes.
    For instance,

      $ guix lint -c archival znc
      gnu/packages/messaging.scm:996:12: znc@1.8.2: Disarchive entry refers to non-existent SWH directory '33a3b509b5ff8e9039626d11b7a800281884cf2a'

      $ wget https://guix.gnu.org/sources.json
      $ cat sources.json | jq | grep znc
            "integrity": "sha256-IwbxlQzncsWlmlf1SG1Zu5yrmEl8RfxJy8RawN7BGbs="
            "integrity": "sha256-q0jatpd+j0PW//szIo0ViGX2jd5wJtEjxpPXcznc8rs="
              "https://znc.in/releases/archive/znc-1.8.2.tar.gz"

      $ guix download https://znc.in/releases/archive/znc-1.8.2.tar.gz

      Starting download of /tmp/guix-file.hnjWTE
      From https://znc.in/releases/archive/znc-1.8.2.tar.gz...
       znc-1.8.2.tar.gz  2.0MiB    599KiB/s 00:03 [##################] 100.0%
      /gnu/store/58khbiwp2ghhzg00gnzdy2jlfv49vajm-znc-1.8.2.tar.gz
      03fyi0j44zcanj1rsdx93hkdskwfvhbywjiwd17f9q1a7yp8l8zz

    Therefore, something is wrong somewhere.  Thanks to #1, I am detecting
    many such examples.  I do not know whether the SWH-ID computed by
    Disarchive is incorrect or whether SWH has not ingested the content.
    Investigation required. :-)

1:

> It’s surprising to me that SWH is not already getting these from
> “sources.json”.  I picked an arbitrary one, “rust-quote-0.6”, and it’s
> simply not in “sources.json”.  On the other hand, I bet SWH would like a
> crates.io (and CRAN, etc.) loader, too.

According to the SWH doc, there is a CRAN lister [2], but I have not
checked what they concretely ingest.  On our side, we are using
‘url-fetch’, so it seems possible to me that there is a tiny mismatch
between what is inside the release tarball (what we concretely use) and
what SWH ingests directly from CRAN.

2:

To answer your question [3] about “sources.json”, I think the ingestion
started after commit 35bb77108fc7f2339da0b5be139043a5f3f21493 from
guix-artwork.  In other words, SWH started to ingest from “sources.json”
after July 2020; probably around September 2020.
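
A side note on the znc transcript above: the two "integrity" lines only
match the grep because their base64 text happens to contain "znc", and
‘guix download’ prints its hash in nix-base32 while “sources.json” uses
the "sha256-<base64>" form, so the two cannot be compared by eye.  Here
is a minimal sketch that converts one form into the other and selects
entries by URL instead.  It assumes that “sources.json” holds a
"sources" list whose entries carry "urls" and "integrity" fields; that
layout and all the function names are illustrative assumptions, not
something taken from Guix or SWH.

```python
import base64

# Alphabet of the nix-base32 encoding (the letters e, o, t and u are unused).
NIX_BASE32 = "0123456789abcdfghijklmnpqrsvwxyz"


def nix_base32_to_bytes(text, size=32):
    """Decode a nix-base32 string into `size` raw bytes (32 for SHA-256)."""
    value = 0
    for char in text:
        # str.index raises ValueError on a character outside the alphabet.
        value = value * 32 + NIX_BASE32.index(char)
    # Read as one big number, the digits give the hash bytes in
    # little-endian order; to_bytes raises OverflowError if it does not fit.
    return value.to_bytes(size, "little")


def to_sri(nix_hash):
    """Return the 'sha256-<base64>' spelling of a nix-base32 SHA-256 hash."""
    raw = nix_base32_to_bytes(nix_hash)
    return "sha256-" + base64.b64encode(raw).decode("ascii")


def entries_for_url(sources, fragment):
    """Select entries whose URLs contain `fragment`, ignoring the hashes.

    Assumed layout: {"sources": [{"urls": [...], "integrity": "..."}]}.
    """
    return [entry for entry in sources.get("sources", [])
            if any(fragment in url for url in entry.get("urls", []))]
```

With the file loaded via json.load, entries_for_url(data, "znc.in")
should return only the entries that really point to znc, and the value
of to_sri("03fyi0j44zcanj1rsdx93hkdskwfvhbywjiwd17f9q1a7yp8l8zz") can
then be compared with their "integrity" field.
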
3:

> One other way to help would be to suggest improvements to the report.  I
> don’t want to fiddle with it too much, but if there is some simple graph
> or table or list that should be there, I’m happy to give it a go.

For the Missing and Unknown fields, could you distinguish the kind of
origin?  Is it mainly git-fetch, url-fetch, or something else?  That would
help to spot which issues to work on (sources.json, SWH side, Disarchive,
etc.).

Cheers,
simon