From: Ludovic Courtès
To: Maxim Cournoyer
Subject: Re: Speeding up archive export
Date: Tue, 15 Sep 2020 09:53:08 +0200
Cc: Guix-devel

Hi Maxim,

I’m a bad person, I realize I didn’t even follow up to my message to
point to , which I pushed just yesterday.  Sorry for the confusion!

Maxim Cournoyer wrote:

> Ludovic Courtès writes:
>
>> Hello Guix!
>>
>> If you’ve ever used offloading (or ‘guix copy’), you’ve probably noticed
>> that the time to send store items is proportional to the number of store
>> items to send rather than their total size.  Namely:
>>
>>   guix archive --export coreutils
>>
>> is fast, but:
>>
>>   guix archive --export $(guix build -d coreutils)
>>
>> is slow (there are lots of small files).
>>
>> Running ‘perf timechart record guix archive --export …’ confirms the
>> problem: guix-daemon is mostly idle, waiting for all the tiny ‘guix
>> authenticate’ programs it spawns to sign each and every store item.
>> Here’s the Gantt diagram (grey = idle, blue = busy):
>
> Very cool!  The timechart suggests the guix-authenticate programs are
> run sequentially?  Perhaps running them in parallel would be a cheap
> first step to improve performance?

The sequence goes like this:

  1. Export store item as nar and compute its hash.

  2. Pass hash to ‘guix authenticate sign’.

  3. Goto 1 for next store item.

So it’s not really parallelizable, and even pipelining is not really
feasible.

>>   1. Sign the whole bundle instead of each individual item.
>>
>>      That solves the problem, but that would prevent the receiver from
>>      storing individual store item signatures in the future (a few years
>>      ago Nix added signatures as part of the ‘ValidPathInfo’ table of
>>      the store database, and I think that’s something we might want to
>>      have too).
>
> Why?  Couldn't the receiver do the bookkeeping no matter if it received
> a signed bundle or a single file?  It could assign the bundle signature
> to individual store files in the database, for example.  This seems the
> obvious, easy solution.  We need good arguments to not implement it.

The idea of storing signatures is that you’d keep one signature per
store item.  That way, you have precise provenance tracking for each
store item.
Also, if you re-export them (via ‘guix publish’ or ‘guix
archive’), you can choose to serve those third-party signatures.

>>   2. Sign fewer items: we can do that by signing only store items that
>>      are not content-addressed, i.e., not resulting from a fixed-output
>>      derivation and not being a “source” coming from ‘add-to-store’ or
>>      similar.
>>
>>      That means we wouldn’t have to sign .drv and *-guile-builder, which
>>      would make a big difference and is generally advisable.
>>      Unfortunately, there’s no easy way to determine whether a store
>>      item is content-addressable.  Again Nix added
>>      “content-addressability claims” to ‘ValidPathInfo’, which might
>>      help, though it’s not entirely clear.
>>
>>   3. Reimplement ‘guix authenticate’ and a subset of (guix pki) in C++ (!).
>>      We could load the keys and the ACL only once, and we wouldn’t have
>>      to fork and all; I’m sure it’d be very fast… and very distracting
>>      too: I’d rather invest in the daemon rewrite in Scheme.
>>
>>   4. Spawn ‘guix authenticate’ once and talk to it over a pipe (similar
>>      to ‘guix offload’).  That might be the easiest short-term solution.
>
> Failing 1., 4. is my second favorite, because it seems the most Guixy of
> the remaining options, and should provide acceptable performance.

Yes, that’s what  does, with quite some success.  I’ve been testing it
locally and will now give it a spin on berlin.

Thanks for your feedback!

Ludo’.
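
For readers who want to see the difference between the per-item spawn
described in the thread and option 4 (spawn the helper once and talk to it
over a pipe), here is a minimal sketch in Python.  Everything in it is an
assumption made for illustration: the helper program, its one-line-per-request
protocol, and the fake "signature" (a plain SHA-256, not real cryptography).
It does not reproduce the actual protocol of ‘guix authenticate’ or the
daemon's code.

```python
import hashlib
import subprocess
import sys

# Hypothetical stand-in for the signing helper: it reads one hex digest per
# line on stdin and answers with one "signature" line on stdout.
HELPER = r"""
import hashlib, sys
for line in sys.stdin:
    digest = line.strip()
    # Placeholder "signature": NOT real cryptography, illustration only.
    fake = hashlib.sha256(b"demo-key:" + digest.encode()).hexdigest()
    sys.stdout.write(fake + "\n")
    sys.stdout.flush()
"""


def sign_with_fresh_process(digests):
    """Spawn one helper per digest (the slow pattern: one process per item)."""
    signatures = []
    for digest in digests:
        result = subprocess.run(
            [sys.executable, "-c", HELPER],
            input=digest + "\n", capture_output=True, text=True, check=True)
        signatures.append(result.stdout.strip())
    return signatures


def sign_with_persistent_process(digests):
    """Spawn the helper once and exchange one line per digest over its pipes."""
    signatures = []
    with subprocess.Popen(
            [sys.executable, "-c", HELPER],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True) as helper:
        for digest in digests:
            helper.stdin.write(digest + "\n")
            helper.stdin.flush()          # make sure the request is sent now
            signatures.append(helper.stdout.readline().strip())
        helper.stdin.close()              # end of input lets the helper exit
    return signatures


def demo_digests(count):
    """Made-up digests standing in for the nar hashes of store items."""
    return [hashlib.sha256(f"store-item-{i}".encode()).hexdigest()
            for i in range(count)]
```

Both functions return the same list for the same input.  The second one pays
the helper's startup cost once instead of once per item, which is the saving
option 4 is after when there are many small store items.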