unofficial mirror of bug-guix@gnu.org 
 help / color / mirror / Atom feed
* bug#44760: Closure copy in ‘guix system init’ is inefficient
@ 2020-11-20 11:02 Ludovic Courtès
  2020-11-22 19:46 ` raingloom
  0 siblings, 1 reply; 3+ messages in thread
From: Ludovic Courtès @ 2020-11-20 11:02 UTC (permalink / raw)
  To: 44760

‘guix system init’ ends by copying the system’s closure from the “host”
store to the target store; it also initializes the database of that
target store.

That copy is inefficient for several reasons.  Let’s pick one file,
shred.1.gz, that ends up being copied, and let’s look at its occurrences
in the strace log of ‘guix system init config.scm /tmp/os’:

--8<---------------cut here---------------start------------->8---
$ grep -A2 '/shred.1.gz' ,,s
lstat("/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0
openat(AT_FDCWD, "/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", O_RDONLY) = 15
fstat(15, {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0
openat(AT_FDCWD, "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", O_WRONLY|O_CREAT|O_TRUNC, 0444) = 16
read(15, "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"..., 8192) = 1490
write(16, "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"..., 1490) = 1490
--
utimensat(AT_FDCWD, "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", [{tv_sec=1605721025, tv_nsec=616985411} /* 2020-11-18T18:37:05.616985411+0100 */, {tv_sec=1, tv_nsec=0} /* 1970-01-01T01:00:01+0100 */], 0) = 0
lstat("/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/sleep.1.gz", {st_mode=S_IFREG|0444, st_size=813, ...}) = 0
openat(AT_FDCWD, "/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/sleep.1.gz", O_RDONLY) = 15
--
lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0
lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shuf.1.gz", {st_mode=S_IFREG|0444, st_size=972, ...}) = 0
lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/sleep.1.gz", {st_mode=S_IFREG|0444, st_size=813, ...}) = 0
--
lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0
openat(AT_FDCWD, "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", O_RDONLY) = 17
lseek(17, 0, SEEK_CUR)                  = 0
read(17, "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"..., 1490) = 1490
--
lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0
openat(AT_FDCWD, "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", O_RDONLY) = 17
lseek(17, 0, SEEK_CUR)                  = 0
read(17, "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"..., 1490) = 1490
--
link("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz", "/tmp/os/gnu/store/.links/0w0qcs5lp36i89yry91r2ixlghihzf0vc56bpd9yylj342gv82xl") = 0
lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shuf.1.gz", {st_mode=S_IFREG|0444, st_size=972, ...}) = 0
openat(AT_FDCWD, "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shuf.1.gz", O_RDONLY) = 17
--8<---------------cut here---------------end--------------->8---

First, /tmp/os/…/shred.1.gz is read entirely twice: once in
‘register-items’ (in the ‘nar-sha256’ call) to compute its hash, and a
second time for deduplication (the ‘deduplicate’ call in there.)

The ‘nar-sha256’ call could be avoided because the database of
/gnu/store contains that value.  As for deduplication, we could perhaps
create those ‘.links’ entries as we copy files instead of re-traversing
the whole thing afterwards.

Second, all of /tmp/os is traversed to reset timestamps, although we
could have cleared those timestamps when we created those files in the
first place (<https://issues.guix.gnu.org/44741> prevents that though,
unless we keep a bug-fixed copy of ‘copy-recursively’ in there.)

Third, in the case of the installer, we’re really copying from
/mnt/guix-inst/store to /mnt/gnu/store, which is likely the same
device.  In this case we could create hard links instead of actually
copying files.

Fourth, we’re adding items one by one in the target store database, but
it may be more efficient to more or less dump the subset of the source
database in bulk.

Surely we can do better.

Ludo’.




^ permalink raw reply	[flat|nested] 3+ messages in thread

* bug#44760: Closure copy in ‘guix system init’ is inefficient
  2020-11-20 11:02 bug#44760: Closure copy in ‘guix system init’ is inefficient Ludovic Courtès
@ 2020-11-22 19:46 ` raingloom
  2020-11-22 21:10   ` Ludovic Courtès
  0 siblings, 1 reply; 3+ messages in thread
From: raingloom @ 2020-11-22 19:46 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 44760

On Fri, 20 Nov 2020 12:02:25 +0100
Ludovic Courtès <ludo@gnu.org> wrote:

> ‘guix system init’ ends by copying the system’s closure from the
> “host” store to the target store; it also initializes the database of
> that target store.
> 
> That copy is inefficient for several reasons.  Let’s pick one file,
> shred.1.gz, that ends up being copied, and let’s look at its
> occurrences in the strace log of ‘guix system init config.scm
> /tmp/os’:
> 
> --8<---------------cut here---------------start------------->8---
> $ grep -A2 '/shred.1.gz' ,,s
> lstat("/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz",
> {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0 openat(AT_FDCWD,
> "/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz",
> O_RDONLY) = 15 fstat(15, {st_mode=S_IFREG|0444, st_size=1490, ...}) =
> 0 openat(AT_FDCWD,
> "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz",
> O_WRONLY|O_CREAT|O_TRUNC, 0444) = 16 read(15,
> "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"...,
> 8192) = 1490 write(16,
> "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"...,
> 1490) = 1490 -- utimensat(AT_FDCWD,
> "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz",
> [{tv_sec=1605721025, tv_nsec=616985411} /*
> 2020-11-18T18:37:05.616985411+0100 */, {tv_sec=1, tv_nsec=0} /*
> 1970-01-01T01:00:01+0100 */], 0) = 0
> lstat("/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/sleep.1.gz",
> {st_mode=S_IFREG|0444, st_size=813, ...}) = 0 openat(AT_FDCWD,
> "/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/sleep.1.gz",
> O_RDONLY) = 15 --
> lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz",
> {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0
> lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shuf.1.gz",
> {st_mode=S_IFREG|0444, st_size=972, ...}) = 0
> lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/sleep.1.gz",
> {st_mode=S_IFREG|0444, st_size=813, ...}) = 0 --
> lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz",
> {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0 openat(AT_FDCWD,
> "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz",
> O_RDONLY) = 17 lseek(17, 0, SEEK_CUR)                  = 0 read(17,
> "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"...,
> 1490) = 1490 --
> lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz",
> {st_mode=S_IFREG|0444, st_size=1490, ...}) = 0 openat(AT_FDCWD,
> "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz",
> O_RDONLY) = 17 lseek(17, 0, SEEK_CUR)                  = 0 read(17,
> "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"...,
> 1490) = 1490 --
> link("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shred.1.gz",
> "/tmp/os/gnu/store/.links/0w0qcs5lp36i89yry91r2ixlghihzf0vc56bpd9yylj342gv82xl")
> = 0
> lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shuf.1.gz",
> {st_mode=S_IFREG|0444, st_size=972, ...}) = 0 openat(AT_FDCWD,
> "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man1/shuf.1.gz",
> O_RDONLY) = 17 --8<---------------cut
> here---------------end--------------->8---
> 
> First, /tmp/os/…/shred.1.gz is read entirely twice: once in
> ‘register-items’ (in the ‘nar-sha256’ call) to compute its hash, and a
> second time for deduplication (the ‘deduplicate’ call in there.)
> 
> The ‘nar-sha256’ call could be avoided because the database of
> /gnu/store contains that value.  As for deduplication, we could
> perhaps create those ‘.links’ entries as we copy files instead of
> re-traversing the whole thing afterwards.
> 
> Second, all of /tmp/os is traversed to reset timestamps, although we
> could have cleared those timestamps when we created those files in the
> first place (<https://issues.guix.gnu.org/44741> prevents that though,
> unless we keep a bug-fixed copy of ‘copy-recursively’ in there.)
> 
> Third, in the case of the installer, we’re really copying from
> /mnt/guix-inst/store to /mnt/gnu/store, which is likely the same
> device.  In this case we could create hard links instead of actually
> copying files.
> 
> Fourth, we’re adding items one by one in the target store database,
> but it may be more efficient to more or less dump the subset of the
> source database in bulk.
> 
> Surely we can do better.
> 
> Ludo’.
> 
> 
> 

Also, if a store is already present (eg.: because of a partial
install), it could make sense to (optionally) keep its contents. AFAIK
this is still not possible. It was one the bigger time sinks while I
was working on the F2FS support.




^ permalink raw reply	[flat|nested] 3+ messages in thread

* bug#44760: Closure copy in ‘guix system init’ is inefficient
  2020-11-22 19:46 ` raingloom
@ 2020-11-22 21:10   ` Ludovic Courtès
  0 siblings, 0 replies; 3+ messages in thread
From: Ludovic Courtès @ 2020-11-22 21:10 UTC (permalink / raw)
  To: raingloom; +Cc: 44760

Hi,

raingloom <raingloom@riseup.net> skribis:

> Also, if a store is already present (eg.: because of a partial
> install), it could make sense to (optionally) keep its contents. AFAIK
> this is still not possible. It was one the bigger time sinks while I
> was working on the F2FS support.

It’s on purpose that ‘guix system init’ forcefully wipes store items
already present before copying them; some time ago it was not the case
and that led to problems:

  https://issues.guix.gnu.org/20722

I think it’s safer to keep it unchanged.

Ludo’.




^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2020-11-22 21:11 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-20 11:02 bug#44760: Closure copy in ‘guix system init’ is inefficient Ludovic Courtès
2020-11-22 19:46 ` raingloom
2020-11-22 21:10   ` Ludovic Courtès

unofficial mirror of bug-guix@gnu.org 

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://yhetil.org/guix-bugs/0 guix-bugs/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 guix-bugs guix-bugs/ https://yhetil.org/guix-bugs \
		bug-guix@gnu.org
	public-inbox-index guix-bugs

Example config snippet for mirrors.
Newsgroups are available over NNTP:
	nntp://news.yhetil.org/yhetil.gnu.guix.bugs
	nntp://news.gmane.io/gmane.comp.gnu.guix.bugs


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git