From: zimoun
Date: Fri, 25 Sep 2020 10:56:53 +0200
Subject: Automation of SWH save (was: Cuirass: "lint -c archival"?)
To: Christopher Baines
References: <874knnouzr.fsf@cbaines.net>
In-Reply-To: <874knnouzr.fsf@cbaines.net>
Cc: Guix Devel

Hi,

On Thu, 24 Sep 2020 at 21:06, Christopher Baines wrote:
> zimoun writes:
> So, my understanding is that Software Heritage is a potential store for
> source material for Guix packages. I think the majority of builds
> Cuirass does are because inputs change, rather than the source of a
> package.

To be precise, Software Heritage stores only the upstream source code.
The entry point of their "save" API is the URL of a Git, Mercurial, or
Subversion repository; they then ingest the content that this very URL
serves.  And it is not necessary to build the package to send a "save"
request: "guix lint -c archival foo" sends the request for the
git-reference source of a Guix package.  Note that Guix does not send
the result of "guix build -S" but the real upstream URL.

> I'm not sure hooking this up to Cuirass would make the most sense,
> because of the above point.
>
> Also, unfortunately, the Guix Data Service doesn't have the ideal data
> for this, as it doesn't really store the package source information in
> the way that would be useful for this.

Somehow, the GDS has this information because it reports Lint Warnings
(for example [1]: bottom "no lint warnings").  However, if I read
correctly, you added the option "--no-network" to only use the linters
which do not require network access.  Does the GDS run the linters by
itself or does it use the log from Cuirass?

[1]

BTW, please consider the patch #43261 [2], which fixes an issue in the
current implementation of "--no-network". :-)

[2]

> Personally though (and I'm rather biased), I think the Guix Data Service
> might still be an approach. If you take the view on this that the
> Software Heritage is a means to a store item (which I think is right?),
> the Guix Data Service knows about those store items (like [1]).
>
> 1: https://data.guix.gnu.org/gnu/store/5h4dz6ild4fkida5yfv5fhh59vfd8hvk-python-boolean.py-3.6-checkout

Currently, Guix does not provide machinery to send its source
substitutes.  I am not convinced it makes sense to do so.

The model I am imagining is:

 - short term:
    + a script runs as a cron job to lint all the packages, say once per
      day (packages will be missed but it is better than what we
      currently have)
    + try to implement the save request for hg and svn (I am working on
      it if no one beats me :-))
 - middle term: add a hook (Cuirass or GDS) to trigger action if the
   package passes.
 - long term: SWH ingests everything via sources.json

Somehow, sending all the source substitutes should be done once, at the
moment we move from the short to the middle term.

Currently, SWH ingests all the tarballs (via sources.json) and a few
git-reference packages: the ones for which the packager or reviewer ran
"guix lint -c archival".  I am proposing to automate this instead of
relying on the goodwill of a packager or reviewer.
:-)

Well, from a wider point of view, the hook could send a save request to
SWH, or we could also imagine that the hook does whatever it wants with
the result (store item): push it somewhere, or disassemble the tarball
(if any) and save it to the database (to then be able to fetch from
SWH).

Note that the long term does not depend on the Guix side but on the SWH
side.  So the term could be shorter. :-)

Does this make sense?

> To make the information actionable though, it would be necessary to
> store more information about the sources for packages in the Guix Data
> Service database.
>
> This is much more work than just using the existing linter, but it does
> have the advantage that you'd be able to look at coverage statistics and
> things like that, which the checker doesn't really afford.

Yes.

In summary, SWH limits the number of requests per hour (10 save requests
and 120 query requests), so it is impossible to automate the saving
mechanism.  I am proposing to ask them to change this rate limit for one
specific trusted machine (for example, if I understand correctly, the
Nix and Debian projects are doing so).

Therefore, the questions are:

 - which machine?
 - what is the automation process? (see above)

WDYT?

All the best,
simon
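
PS: To make the short-term step concrete, here is a minimal sketch of
what the cron script could look like while the current limit of 10 save
requests per hour applies.  It is only an illustration under stated
assumptions, not an existing Guix tool: the helper names (plan_batches,
lint_archival, run_batches) are made up, the package names are expected
on the command line, and it assumes one "guix lint -c archival" call
costs at most one save request, which I have not checked.

```python
import subprocess
import sys
import time
from typing import Callable, Iterable, List

# Limit quoted in this message; the real value is decided on the SWH side.
SAVE_REQUESTS_PER_HOUR = 10


def plan_batches(packages: Iterable[str],
                 per_hour: int = SAVE_REQUESTS_PER_HOUR) -> List[List[str]]:
    """Split package names into batches of at most `per_hour` items."""
    if per_hour < 1:
        raise ValueError("per_hour must be at least 1")
    names = list(packages)
    return [names[i:i + per_hour] for i in range(0, len(names), per_hour)]


def lint_archival(package: str) -> int:
    """Run `guix lint -c archival PACKAGE` and return its exit status."""
    return subprocess.run(["guix", "lint", "-c", "archival", package]).returncode


def run_batches(batches: List[List[str]],
                lint: Callable[[str], int] = lint_archival,
                sleep: Callable[[float], None] = time.sleep,
                pause: float = 3600) -> List[str]:
    """Lint one batch per `pause` seconds.

    Returns the packages whose lint command exited with a non-zero status.
    """
    failed = []
    for index, batch in enumerate(batches):
        if index > 0:
            sleep(pause)  # wait before starting the next batch
        for package in batch:
            if lint(package) != 0:
                failed.append(package)
    return failed


if __name__ == "__main__":
    # Hypothetical usage: python swh-save.py hello python-boolean.py ...
    run_batches(plan_batches(sys.argv[1:]))
```

At 10 packages per hour such a script covers at most 240 packages per
day, which is why the rate limit matters more than the script itself.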