From: Ludovic Courtès
To: 44760@debbugs.gnu.org
Subject: bug#44760: Closure copy in ‘guix system init’ is inefficient
Date: Fri, 20 Nov 2020 12:02:25 +0100
Message-ID: <87h7pkffzy.fsf@inria.fr>

‘guix system init’ ends by copying the system’s closure from the “host”
store to the target store; it also initializes the database of that
target store.

That copy is inefficient for several reasons.  Let’s pick one file,
shred.1.gz, that ends up being copied, and let’s look at its occurrences
in the strace log of ‘guix system init config.scm /tmp/os’:

--8<---------------cut here---------------start------------->8---
$ grep -A2 '/shred.1.gz' ,,s
lstat("/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0
openat(AT_FDCWD, "/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", O_RDONLY) = 15
fstat(15, {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0
openat(AT_FDCWD, "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", O_WRONLY|O_CREAT|O_TRUNC, 0444) = 16
read(15, "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"..., 8192) = 1490
write(16, "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"..., 1490) = 1490
--
utimensat(AT_FDCWD, "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", [{tv_sec=1605721025, tv_nsec=616985411} /* 2020-11-18T18:37:05.616985411+0100 */, {tv_sec=1, tv_nsec=0} /* 1970-01-01T01:00:01+0100 */], 0) = 0
lstat("/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/sleep.1.gz", {st_mode=S_IFREG|0444, st_size=813, ...}) = 0
openat(AT_FDCWD, "/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/sleep.1.gz", O_RDONLY) = 15
--
lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0
lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shuf.1.gz", {st_mode=S_IFREG|0444, st_size=972, ...}) = 0
lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/sleep.1.gz", {st_mode=S_IFREG|0444, st_size=813, ...}) = 0
--
lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0
openat(AT_FDCWD, "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", O_RDONLY) = 17
lseek(17, 0, SEEK_CUR) = 0
read(17, "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"..., 1490) = 1490
--
lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0
openat(AT_FDCWD, "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", O_RDONLY) = 17
lseek(17, 0, SEEK_CUR) = 0
read(17, "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"..., 1490) = 1490
--
link("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", "/tmp/os/gnu/store/.links/0w0qcs5lp36i89yry91r2ixlghihzf0vc56bpd9yylj342gv82xl") = 0
lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shuf.1.gz", {st_mode=S_IFREG|0444, st_size=972, ...}) = 0
openat(AT_FDCWD, "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shuf.1.gz", O_RDONLY) = 17
--8<---------------cut here---------------end--------------->8---

First, /tmp/os/…/shred.1.gz is read entirely twice: once in
‘register-items’ (in the ‘nar-sha256’ call) to compute its hash, and a
second time for deduplication (the ‘deduplicate’ call in there.)

The ‘nar-sha256’ call could be avoided because the database of
/gnu/store contains that value.  As for deduplication, we could perhaps
create those ‘.links’ entries as we copy files instead of re-traversing
the whole thing afterwards.

Second, all of /tmp/os is traversed to reset timestamps, although we
could have cleared those timestamps when we created those files in the
first place ( prevents that though, unless we keep a bug-fixed copy of
‘copy-recursively’ in there.)

Third, in the case of the installer, we’re really copying from
/mnt/guix-inst/store to /mnt/gnu/store, which is likely the same device.
In this case we could create hard links instead of actually copying
files.

Fourth, we’re adding items one by one in the target store database, but
it may be more efficient to more or less dump the subset of the source
database in bulk.

Surely we can do better.

Ludo’.