From: zimoun
To: Ludovic Courtès, guix-devel@gnu.org
Subject: Re: Disarchive update
In-Reply-To: <87r1cu1pj5.fsf@inria.fr>
References: <87r1cu1pj5.fsf@inria.fr>
Date: Tue, 12 Oct 2021 11:19:18 +0200
Message-ID: <86r1cqmwh5.fsf@gmail.com>
List-Id: "Development of GNU Guix and the GNU System distribution."

Hi Ludo,

On Sat, 09 Oct 2021 at 12:05, Ludovic Courtès wrote:

> If you run:
>
>   guix build /gnu/store/nnl67m8c2x9rwqbnych1agc6p7g5473g-disarchive-collection.drv

Oh, cool!

> and if you’re patient :-), you eventually get a 579 MB directory
> containing Disarchive metadata for 8,413 tarballs out of 9,113 (the
> missing tarballs are those that “disarchive disassemble” fails to
> handle, for instance because it couldn’t guess what compression method
> is being used.)

Timothy made this table months ago:

    tar+gz        9090   52.0%
    git           5294   30.3%
    tar+xz        1184   06.8%
    tar+bz2        775   04.4%
    tar            393   02.2%
    zip            273   01.6%
    svn-multi      175   01.0%
    svn            125   00.7%
    file            51   00.3%
    computed        38   00.2%
    hg              36   00.2%
    unknown-uri     20   00.1%
    tar+gz?         15   00.1%
    tar+lz          13   00.1%
    tar+Z            4   00.0%
    cvs              3   00.0%
    bzr              3   00.0%
    tar+lzma         2   00.0%
    total        17494  100.0%

What is really missing is XZ and Bzip2 support in Disarchive, I guess.

> Where to go from here?  Timothy Sample had already set up a Disarchive
> database at , which (guix download) uses
> as a fallback; I’m not sure exactly how it’s populated.  The goal here
> would be for the Guix project to set up infrastructure populating a
> database automatically and creating backups, possibly via SWH (we’ll
> have to discuss it with them).

Timothy was working on feeding the database from each release.  Well,
you can have a look at:

Then something along these lines:

  $ sqlite3 /tmp/pog.db < schema.sql
  $ guix repl -L . <(echo '
  (use-modules (pog))
  (ingest "6298c3ffd9654d3231a6f25390b056483e8f407c" "/tmp/pog.db")
  ')

where the commit hash corresponds to v1.0.0.  I do not know whether it
would be equivalent to run:

  guix time-machine --commit=6298c3ffd9654d3231a6f25390b056483e8f407c \
     -- build -m etc/disarchive-manifest.scm

> A plan we can already deploy would be:
>
>   1. Add the disarchive.guix.gnu.org DNS entry, pointing to berlin.
>
>   2. On berlin, add an mcron job that periodically copies the output of
>      the latest “disarchive-collection” build to a directory, say
>      /srv/disarchive.  Thus, the database would accumulate tarball
>      metadata over time.
>
>   3. Add an nginx route so that /srv/disarchive is served at
>      https://disarchive.guix.gnu.org.
>
>   4. Add disarchive.guix.gnu.org to (guix download).

To replace (or add to) the current ‘%disarchive-mirrors’, right?

Going down this road (using Cuirass), why not generate sources.json in
the same way, instead of the hack using the website builder?

On my side, I will try to resume what I started months ago: measuring
the SWH coverage.  For instance, of these ~92% of tarballs, how many are
currently stored in SWH?  Well, do not hold your breath, and I would be
happy if someone beats me to it. ;-)

Cheers,
simon
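
PS: as a rough cross-check of the figures quoted above, the arithmetic
can be sketched in shell.  This is only an illustration.  The ‘share’
helper and the variable names are made up for the example, the counts
are copied from Timothy's table, and treating every row whose name
starts with “tar” as a tarball is an assumption.  The table is months
old, so its totals do not match the 9,113 tarballs of the
disarchive-collection build.

```shell
#!/bin/sh
# Counts copied from the table above: source type, number of sources.
table='tar+gz 9090
git 5294
tar+xz 1184
tar+bz2 775
tar 393
zip 273
svn-multi 175
svn 125
file 51
computed 38
hg 36
unknown-uri 20
tar+gz? 15
tar+lz 13
tar+Z 4
cvs 3
bzr 3
tar+lzma 2'

# share PART WHOLE: print PART/WHOLE as a percentage with one decimal.
# (Hypothetical helper, only for this example.)
share() {
    awk -v part="$1" -v whole="$2" 'BEGIN { printf "%.1f\n", 100 * part / whole }'
}

# All sources, tar-based sources (assumption: names starting with "tar"),
# and the two formats mentioned as missing in Disarchive.
total=$(printf '%s\n' "$table" | awk '{ sum += $2 } END { print sum }')
tarballs=$(printf '%s\n' "$table" | awk '$1 ~ /^tar/ { sum += $2 } END { print sum }')
xz_bz2=$(printf '%s\n' "$table" \
    | awk '$1 == "tar+xz" || $1 == "tar+bz2" { sum += $2 } END { print sum }')

echo "disarchive-collection coverage: $(share 8413 9113)%"
echo "tar+xz and tar+bz2 among tarballs: $(share "$xz_bz2" "$tarballs")%"
echo "tar+xz and tar+bz2 among all sources: $(share "$xz_bz2" "$total")%"
```

The first line reproduces the ~92% mentioned above.  The other two give
an idea of how much XZ and Bzip2 support could add, under the
assumptions stated.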