From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mp2 ([2001:41d0:8:6d80::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by ms0.migadu.com with LMTPS id WJiwNkOLp2B2agEAgWs5BA (envelope-from ) for ; Fri, 21 May 2021 12:28:19 +0200 Received: from aspmx1.migadu.com ([2001:41d0:8:6d80::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by mp2 with LMTPS id sCJ4MkOLp2DiTgAAB5/wlQ (envelope-from ) for ; Fri, 21 May 2021 10:28:19 +0000 Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by aspmx1.migadu.com (Postfix) with ESMTPS id 16552182D2 for ; Fri, 21 May 2021 12:28:19 +0200 (CEST) Received: from localhost ([::1]:47152 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lk2OA-0003Z0-3n for larch@yhetil.org; Fri, 21 May 2021 06:28:18 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:51098) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lk2Nu-0003Tz-8F for guix-patches@gnu.org; Fri, 21 May 2021 06:28:02 -0400 Received: from debbugs.gnu.org ([209.51.188.43]:52584) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1lk2Nu-0003us-0k for guix-patches@gnu.org; Fri, 21 May 2021 06:28:02 -0400 Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1lk2Nt-0003Sj-Th for guix-patches@gnu.org; Fri, 21 May 2021 06:28:01 -0400 X-Loop: help-debbugs@gnu.org Subject: [bug#48437] [PATCH v2] lint: archival: Lookup content in Disarchive database. Resent-From: Ludovic =?UTF-8?Q?Court=C3=A8s?= Original-Sender: "Debbugs-submit" Resent-CC: guix-patches@gnu.org Resent-Date: Fri, 21 May 2021 10:28:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 48437 X-GNU-PR-Package: guix-patches X-GNU-PR-Keywords: patch To: 48437@debbugs.gnu.org Cc: Timothy Sample , Ludovic =?UTF-8?Q?Court=C3=A8s?= Received: via spool by 48437-submit@debbugs.gnu.org id=B48437.162159284013258 (code B ref 48437); Fri, 21 May 2021 10:28:01 +0000 Received: (at 48437) by debbugs.gnu.org; 21 May 2021 10:27:20 +0000 Received: from localhost ([127.0.0.1]:35897 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1lk2ND-0003Rl-L2 for submit@debbugs.gnu.org; Fri, 21 May 2021 06:27:20 -0400 Received: from eggs.gnu.org ([209.51.188.92]:57276) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1lk2NA-0003RX-Nf for 48437@debbugs.gnu.org; Fri, 21 May 2021 06:27:18 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:49118) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lk2N5-0003RZ-3g; Fri, 21 May 2021 06:27:11 -0400 Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=55948 helo=gnu.org) by fencepost.gnu.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1lk2N3-00023d-RD; Fri, 21 May 2021 06:27:10 -0400 From: Ludovic =?UTF-8?Q?Court=C3=A8s?= Date: Fri, 21 May 2021 12:27:03 +0200 Message-Id: <20210521102703.23383-1-ludo@gnu.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <87a6or67em.fsf_-_@gnu.org> References: <87a6or67em.fsf_-_@gnu.org> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-BeenThere: guix-patches@gnu.org List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guix-patches-bounces+larch=yhetil.org@gnu.org Sender: "Guix-patches" X-Migadu-Flow: FLOW_IN ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=yhetil.org; s=key1; t=1621592899; h=from:from:sender:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding:resent-cc: resent-from:resent-sender:resent-message-id:in-reply-to:in-reply-to: references:references:list-id:list-help:list-unsubscribe: list-subscribe:list-post; bh=KQRmmcctJhCPXreRtcZ9RysKZWI08As1dRDAK1RkPkY=; b=naTkJ4HB7mxjaXKTZ9pCsWND6rUG0GOzBHM418YnMTcoR7NAWtym7+j2eCY/QJiKlnaVGp Lbvo+BGU5qmA9aETAQfnK0ZpE1vvd1tPYr7pETvHOPibfkTW7HBHRn2nNbyS4qPExSrhaz jwG2BXwfYsddgSBm2DcyT8xeDJZoLJgSgxQApxColkA0ckW80GXnA+lOleOKXpNBqSp7sE xV2s2g+UrWDK/M6P9mc5ToO7iWXVBKRQGSLTRLHOOuorPDRCrOGPQ4ViQdyllk1DOesPzf qhmVOwRWmwJLjGq+7WKT7Cns1tDqPcEx/4KHf5fkIakRuwTLjR1YI3TW9yJRrg== ARC-Seal: i=1; s=key1; d=yhetil.org; t=1621592899; a=rsa-sha256; cv=none; b=gs+6rVj5esaAIDan1kb9j2KXD1W+GvVFP2O2NUM0pvtBs200YKKZ+aaRjgEa+j3hRakQN6 3lQTXpNQaijfsFyR9QYEmBxeRXnJStfHOeGX0OQmm1tGj4K6NWK8iyPXL8DBF2WhhIwZEo caPpf2PXtfixL03rw+QZzimqWWmWSXuZyRynXL5T3gH0mB8fsjz9UIowQjYzy4JqA/NLDe 36D6uG5YgcMiAdGCp1YicBMtMtIgYI8ywqs0MQasNjLPIzFHenSg6P4zWaruxrjA1T2TQy k3k41dUGzBeEeTVe2cLMVhQ8sK9aU0AX89sIG9n8dp2T2pl9GbHAO5vC+k8Jsg== ARC-Authentication-Results: i=1; aspmx1.migadu.com; dkim=none; dmarc=pass (policy=none) header.from=gnu.org; spf=pass (aspmx1.migadu.com: domain of guix-patches-bounces@gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=guix-patches-bounces@gnu.org X-Migadu-Spam-Score: -1.94 Authentication-Results: aspmx1.migadu.com; dkim=none; dmarc=pass (policy=none) header.from=gnu.org; spf=pass (aspmx1.migadu.com: domain of guix-patches-bounces@gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=guix-patches-bounces@gnu.org X-Migadu-Queue-Id: 16552182D2 X-Spam-Score: -1.94 X-Migadu-Scanner: scn0.migadu.com X-TUID: ZFIWoxtNPIof * guix/lint.scm (lookup-disarchive-spec): New procedure. (check-archival): When 'lookup-content' returns #f, call 'lookup-disarchive-spec'. Call 'lookup-directory' on the result of 'lookup-directory'. * guix/download.scm (%disarchive-mirrors): Make public. * tests/lint.scm ("archival: missing content"): Set '%disarchive-mirrors'. ("archival: content unavailable but disarchive available"): New test. --- guix/download.scm | 1 + guix/lint.scm | 62 ++++++++++++++++++++++++++++++++++++++++++++--- tests/lint.scm | 34 +++++++++++++++++++++++--- 3 files changed, 89 insertions(+), 8 deletions(-) Hi! This new version checks that the SWH IDs that appear in a Disarchive entry are indeed available at archive.softwareheritage.org. It also adds a test for that. Ludo'. diff --git a/guix/download.scm b/guix/download.scm index 72094e7318..b6eb97e6fa 100644 --- a/guix/download.scm +++ b/guix/download.scm @@ -35,6 +35,7 @@ #:use-module (srfi srfi-1) #:use-module (srfi srfi-26) #:export (%mirrors + %disarchive-mirrors (url-fetch* . url-fetch) url-fetch/executable url-fetch/tarbomb diff --git a/guix/lint.scm b/guix/lint.scm index 1bebfe03d3..a2d6418b85 100644 --- a/guix/lint.scm +++ b/guix/lint.scm @@ -30,6 +30,7 @@ (define-module (guix lint) #:use-module (guix store) + #:autoload (guix base16) (bytevector->base16-string) #:use-module (guix base32) #:use-module (guix diagnostics) #:use-module (guix download) @@ -1227,6 +1228,43 @@ upstream releases") #:field 'source))))))) +(define (lookup-disarchive-spec hash) + "If Disarchive mirrors have a spec for HASH, return the list of SWH +directory identifiers the spec refers to. Otherwise return #f." + (define (extract-swh-id spec) + ;; Return the list of SWH directory identifiers SPEC refers to, where SPEC + ;; is a Disarchive sexp. Instead of attempting to parse it, traverse it + ;; in a pretty unintelligent fashion. + (let loop ((sexp spec) + (ids '())) + (match sexp + ((? string? str) + (let ((prefix "swh:1:dir:")) + (if (string-prefix? prefix str) + (cons (string-drop str (string-length prefix)) ids) + ids))) + ((head tail ...) + (loop tail (loop head ids))) + (_ ids)))) + + (any (lambda (mirror) + (with-networking-fail-safe + (format #f (G_ "failed to access Disarchive database at ~a") + mirror) + #f + (guard (c ((http-get-error? c) #f)) + (let* ((url (string-append mirror + (symbol->string + (content-hash-algorithm hash)) + "/" + (bytevector->base16-string + (content-hash-value hash)))) + (port (http-fetch (string->uri url) #:text? #t)) + (spec (read port))) + (close-port port) + (extract-swh-id spec))))) + %disarchive-mirrors)) + (define (check-archival package) "Check whether PACKAGE's source code is archived on Software Heritage. If it's not, and if its source code is a VCS snapshot, then send a \"save\" @@ -1302,10 +1340,26 @@ try again later") (symbol->string (content-hash-algorithm hash))) (#f - (list (make-warning package - (G_ "source not archived on Software \ -Heritage") - #:field 'source))) + ;; If SWH doesn't have HASH as is, it may be because it's + ;; a hand-crafted tarball. In that case, check whether + ;; the Disarchive database has an entry for that tarball. + (match (lookup-disarchive-spec hash) + (#f + (list (make-warning package + (G_ "source not archived on Software \ +Heritage and missing from the Disarchive database") + #:field 'source))) + (directory-ids + (match (find (lambda (id) + (not (lookup-directory id))) + directory-ids) + (#f '()) + (id + (list (make-warning package + (G_ " +Disarchive entry refers to non-existent SWH directory '~a'") + (list id) + #:field 'source))))))) ((? content?) '()))) '())))) diff --git a/tests/lint.scm b/tests/lint.scm index a2c8665142..d54fafc1d2 100644 --- a/tests/lint.scm +++ b/tests/lint.scm @@ -1,7 +1,7 @@ ;;; GNU Guix --- Functional package management for GNU ;;; Copyright © 2012, 2013 Cyril Roelandt ;;; Copyright © 2014, 2015, 2016 Eric Bavier -;;; Copyright © 2014, 2015, 2016, 2017, 2018, 2019, 2020 Ludovic Courtès +;;; Copyright © 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021 Ludovic Courtès ;;; Copyright © 2015, 2016 Mathieu Lirzin ;;; Copyright © 2016 Hartmut Goebel ;;; Copyright © 2017 Alex Kost @@ -1008,10 +1008,13 @@ (method url-fetch) (uri "http://example.org/foo.tgz") (sha256 (make-bytevector 32)))) - (warnings (with-http-server '((404 "Not archived.")) + (warnings (with-http-server '((404 "Not archived.") + (404 "Not in Disarchive database.")) (parameterize ((%swh-base-url (%local-url))) - (check-archival (dummy-package "x" - (source origin))))))) + (mock ((guix download) %disarchive-mirrors + (list (%local-url))) + (check-archival (dummy-package "x" + (source origin)))))))) (warning-contains? "not archived" warnings))) (test-equal "archival: content available" @@ -1027,6 +1030,29 @@ (parameterize ((%swh-base-url (%local-url))) (check-archival (dummy-package "x" (source origin))))))) +(test-equal "archival: content unavailable but disarchive available" + '() + (let* ((origin (origin + (method url-fetch) + (uri "http://example.org/foo.tgz") + (sha256 (make-bytevector 32)))) + (disarchive (object->string + '(disarchive (version 0) + ... + "swh:1:dir:aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"))) + ;; https://archive.softwareheritage.org/api/1/directory/ + (directory "[ { \"checksums\": {}, + \"dir_id\": \"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\", + \"type\": \"file\", + \"name\": \"README\" + \"length\": 42 } ]")) + (with-http-server `((404 "") ;lookup-content + (200 ,disarchive) ;Disarchive database lookup + (200 ,directory)) ;lookup-directory + (mock ((guix download) %disarchive-mirrors (list (%local-url))) + (parameterize ((%swh-base-url (%local-url))) + (check-archival (dummy-package "x" (source origin)))))))) + (test-assert "archival: missing revision" (let* ((origin (origin (method git-fetch) -- 2.31.1