Subject: bug#44760: Closure copy in ‘guix system init’ is inefficient
From: raingloom
To: Ludovic Courtès
Date: Sun, 22 Nov 2020 20:46:34 +0100
Message-ID: <20201122204634.2730df12@riseup.net>
In-Reply-To: <87h7pkffzy.fsf@inria.fr>
References: <87h7pkffzy.fsf@inria.fr>
Cc: 44760@debbugs.gnu.org

On Fri, 20 Nov 2020 12:02:25 +0100
Ludovic Courtès wrote:

> ‘guix system init’ ends by copying the system’s closure from the
> “host” store to the target store; it also initializes the database of
> that target store.
>
> That copy is inefficient for several reasons.  Let’s pick one file,
> shred.1.gz, that ends up being copied, and let’s look at its
> occurrences in the strace log of ‘guix system init config.scm
> /tmp/os’:
>
> --8<---------------cut here---------------start------------->8---
> $ grep -A2 '/shred.1.gz' ,,s
> lstat("/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0
> openat(AT_FDCWD, "/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", O_RDONLY) = 15
> fstat(15, {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0
> openat(AT_FDCWD, "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", O_WRONLY|O_CREAT|O_TRUNC, 0444) = 16
> read(15, "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"..., 8192) = 1490
> write(16, "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"..., 1490) = 1490
> --
> utimensat(AT_FDCWD, "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", [{tv_sec=1605721025, tv_nsec=616985411} /* 2020-11-18T18:37:05.616985411+0100 */, {tv_sec=1, tv_nsec=0} /* 1970-01-01T01:00:01+0100 */], 0) = 0
> lstat("/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/sleep.1.gz", {st_mode=S_IFREG|0444, st_size=813, ...}) = 0
> openat(AT_FDCWD, "/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/sleep.1.gz", O_RDONLY) = 15
> --
> lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0
> lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shuf.1.gz", {st_mode=S_IFREG|0444, st_size=972, ...}) = 0
> lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/sleep.1.gz", {st_mode=S_IFREG|0444, st_size=813, ...}) = 0
> --
> lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0
> openat(AT_FDCWD, "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", O_RDONLY) = 17
> lseek(17, 0, SEEK_CUR) = 0
> read(17, "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"..., 1490) = 1490
> --
> lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0
> openat(AT_FDCWD, "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", O_RDONLY) = 17
> lseek(17, 0, SEEK_CUR) = 0
> read(17, "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"..., 1490) = 1490
> --
> link("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", "/tmp/os/gnu/store/.links/0w0qcs5lp36i89yry91r2ixlghihzf0vc56bpd9yylj342gv82xl") = 0
> lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shuf.1.gz", {st_mode=S_IFREG|0444, st_size=972, ...}) = 0
> openat(AT_FDCWD, "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shuf.1.gz", O_RDONLY) = 17
> --8<---------------cut here---------------end--------------->8---
>
> First, /tmp/os/…/shred.1.gz is read entirely twice: once in
> ‘register-items’ (in the ‘nar-sha256’ call) to compute its hash, and a
> second time for deduplication (the ‘deduplicate’ call in there.)
>
> The ‘nar-sha256’ call could be avoided because the database of
> /gnu/store contains that value.  As for deduplication, we could
> perhaps create those ‘.links’ entries as we copy files instead of
> re-traversing the whole thing afterwards.
>
> Second, all of /tmp/os is traversed to reset timestamps, although we
> could have cleared those timestamps when we created those files in the
> first place ( prevents that though,
> unless we keep a bug-fixed copy of ‘copy-recursively’ in there.)
>
> Third, in the case of the installer, we’re really copying from
> /mnt/guix-inst/store to /mnt/gnu/store, which is likely the same
> device.  In this case we could create hard links instead of actually
> copying files.
>
> Fourth, we’re adding items one by one in the target store database,
> but it may be more efficient to more or less dump the subset of the
> source database in bulk.
>
> Surely we can do better.
>
> Ludo’.

Also, if a store is already present (e.g., because of a partial install),
it could make sense to (optionally) keep its contents. AFAIK this is
still not possible.
It was one of the bigger time sinks while I was working on the F2FS support.
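
To make the ideas in this thread a bit more concrete, here is a rough Python sketch (not Guix code, which is Scheme, and not a proposed patch). It keeps a file that is already present in the target, hard-links it when source and target are on the same device, and otherwise copies it in a single pass while hashing, resetting the timestamps and creating the ‘.links’ entry on the spot. The names install_file, copy_with_hash, links_dir and allow_hard_links are made up for illustration, and a plain SHA-256 of the file contents stands in for the nar hash that Guix really uses.

```python
import hashlib
import os


def copy_with_hash(src, dst, chunk_size=8192):
    """Copy SRC to DST in one pass; return the SHA-256 hex digest of the data.

    Assumption: a plain SHA-256 stands in for the nar hash used by Guix.
    """
    digest = hashlib.sha256()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while chunk := fin.read(chunk_size):
            digest.update(chunk)  # hash the same bytes we are writing
            fout.write(chunk)
    os.chmod(dst, 0o444)
    os.utime(dst, (1, 1))  # reset timestamps now, so no later traversal is needed
    return digest.hexdigest()


def install_file(src, dst, links_dir, allow_hard_links=True):
    """Place SRC at DST; return 'kept', 'linked', 'deduplicated' or 'copied'."""
    if os.path.lexists(dst):
        # Keep what a previous (partial) install already put there.
        # A real tool would have to verify the contents; this sketch does not.
        return "kept"

    target_dir = os.path.dirname(dst)
    os.makedirs(target_dir, exist_ok=True)
    os.makedirs(links_dir, exist_ok=True)

    # Same device: a hard link avoids reading and writing the data at all.
    if allow_hard_links and os.stat(src).st_dev == os.stat(target_dir).st_dev:
        try:
            os.link(src, dst)
            return "linked"
        except OSError:
            pass  # hard links not supported here; fall back to copying

    content_hash = copy_with_hash(src, dst)
    entry = os.path.join(links_dir, content_hash)
    if os.path.exists(entry):
        # Identical contents were already copied: share them via a hard link.
        os.unlink(dst)
        os.link(entry, dst)
        return "deduplicated"
    os.link(dst, entry)  # first occurrence: register it for later duplicates
    return "copied"
```

With something along these lines each file is read once at most, and a re-run over a half-populated target only touches what is missing. The database registration (the fourth point in the quoted message) is left out entirely.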