From: Janneke Nieuwenhuizen <janneke@gnu.org>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: Josselin Poiret <dev@jpoiret.xyz>,
	65456@debbugs.gnu.org, Simon Tournier <zimon.toutoune@gmail.com>,
	Mathieu Othacehe <othacehe@gnu.org>,
	Tobias Geerinckx-Rice <me@tobias.gr>,
	Ricardo Wurmus <rekado@elephly.net>,
	Christopher Baines <guix@cbaines.net>,
	Janneke Nieuwenhuizen <janneke@gnu.org>
Subject: bug#65456: [PATCH 0/2] Split guix build into more steps for 32bit hosts.
Date: Fri, 01 Sep 2023 14:48:38 +0200
Message-ID: <87h6oeys7t.fsf@verum.com>
In-Reply-To: <87zg2gy013.fsf_-_@gnu.org> ("Ludovic Courtès"'s message of "Thu, 24 Aug 2023 16:42:32 +0200")

Ludovic Courtès writes:

Hello!

> Janneke Nieuwenhuizen <janneke@gnu.org> skribis:
>
>>>From ad94f06620e53fcc1495a2e2479dfc627177047c Mon Sep 17 00:00:00 2001
>> Message-ID: <ad94f06620e53fcc1495a2e2479dfc627177047c.1692783678.git.janneke@gnu.org>
>> From: Janneke Nieuwenhuizen <janneke@gnu.org>
>> Date: Thu, 22 Jun 2023 08:30:25 +0200
>> Subject: [PATCH v4] self: Build directories in chunks of max 25 files at a
>>  time.
>>
>> Similar to split build of make-go in Makefile.am, this breaks-up building
>> directories into chunks of max 25 files.  Also force garbage collection.
>
> The big difference with ‘make-go’ is that ‘make-go’ spawns a new process
> for each chunk of files: each process starts with an empty heap, which
> is not the case here as we reuse the same process.

Right.

> However, (guix self) is already splitting gnu/packages/*.scm in two
> pieces: ‘guix-packages-base’ and ‘guix-packages’.  The former is the
> closure of (gnu packages base), and the latter contains the remaining
> files.  Unfortunately this is uneven:

Okay...

> $ readlink -f $(type -P guix)
> /gnu/store/12p5axbr4gjrghlrqa4ikmhsxwq2wgw3-guix-command
> $ guix gc -R /gnu/store/12p5axbr4gjrghlrqa4ikmhsxwq2wgw3-guix-command|grep packages-base
> /gnu/store/ivprgy9b2lv8wmkm10wkypf7k24cdifb-guix-packages-base
> /gnu/store/05pjlcfcfa0k9y833nnxxxjcn5mqr8zj-guix-packages-base-source
> /gnu/store/gnxjbyfwfmb216krz2x0cf1z5k1lla9x-guix-packages-base-modules
> $ find /gnu/store/ivprgy9b2lv8wmkm10wkypf7k24cdifb-guix-packages-base  -type f |wc -l
> 361
> $ guix gc -R /gnu/store/12p5axbr4gjrghlrqa4ikmhsxwq2wgw3-guix-command|grep packages$
> /gnu/store/8cda50hsayydrlw0qrhcy8q4dr9f1avx-guix-locale-guix-packages
> ludo@ribbon ~/src/guix [env]$ find /gnu/store/8cda50hsayydrlw0qrhcy8q4dr9f1avx-guix-locale-guix-packages | wc -l
> 64
> $ guix describe
> Generation 271  Aug 20 2023 23:48:59    (current)
>   guix a0f5885
>     repository URL: https://git.savannah.gnu.org/git/guix.git
>     branch: master
>     commit: a0f5885fefd93a3859b6e4b82b18a6db9faeee05
>
> Maxime Devos looked into this a while back:
>
>   https://issues.guix.gnu.org/54539

Oh my....

>> * guix/self.scm (compiled-modules)[process-directory]: Split building of
>> directories into chunks of max 25 files.
>> +              (for-each
>> +               (lambda (chunck)
>
> s/chunck/chunk/

Oops, fixed.
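
For context, the chunking itself amounts to something like the sketch
below.  It is only an illustration: `chunks' is a hypothetical helper
name, not necessarily what the patch to guix/self.scm uses, and the
`format' call merely stands in for the real compilation step.

```scheme
(use-modules (srfi srfi-1))             ;take, drop

;; Hypothetical helper, for illustration only.
(define (chunks lst size)
  "Split LST into consecutive sub-lists of at most SIZE elements."
  (let loop ((rest lst)
             (result '()))
    (if (null? rest)
        (reverse result)
        (let ((n (min size (length rest))))
          (loop (drop rest n)
                (cons (take rest n) result))))))

;; 60 dummy "files" end up in chunks of 25, 25 and 10.
(for-each (lambda (chunk)
            (format #t "would compile ~a files~%" (length chunk)))
          (chunks (iota 60) 25))
```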

> Can you confirm that this reduces memory usage observably?  One way to
> check that would be to print (gc-stats) from ‘process-directory’, with
> and without the change.  Could you give it a try?

What a good and seemingly simple question.  After a week of
instrumentation and testing, my answer can only be: I tried, and maybe
(see below).
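
A minimal sketch of that kind of instrumentation, assuming nothing
beyond Guile's built-in `gc-stats' and (ice-9 format);
`report-gc-stats' is a hypothetical helper name, not something that
exists in guix/self.scm:

```scheme
(use-modules (ice-9 format))            ;for the ~:d directive

;; Hypothetical helper, for illustration only.
(define (report-gc-stats label)
  "Print LABEL and every (gc-stats) entry to the current error port, and
return the statistics alist."
  (let ((stats (gc-stats)))
    (format (current-error-port) "~a:~%" label)
    (for-each (lambda (entry)
                ;; ~:d groups digits with commas, e.g. 1,360,330,752.
                (format (current-error-port) "  ~a: ~:d~%"
                        (car entry) (cdr entry)))
              stats)
    stats))

(report-gc-stats "after process-directory")
```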

> Intuitively, I don’t see why it would eat less memory; maybe peak memory
> usage is lower because we do less at once?

Okay...

> Also, I think we should remove the explicit (gc) call: it should not be
> necessary, and if we depend on that, something’s wrong.

> Anyhow, thanks for tackling this issue!

Hehe.  You've probably seen Josselin's recent GraphML backend effort
that might really help to address this?  I'm afraid this patch can maybe
only postpone what really needs to be done...

There is no gc-stats output from a successful `guix pull' or `make
as-derivation' on Guix/Hurd that I can show you: I've tried more
than 20 times and it always fails (OOM, hang, spontaneous reset, ...).

Below is a typical output of gc-stats on the Hurd for building self.scm,
when heap-size peaks (using the max 25 files patch):

--8<---------------cut here---------------start------------->8---
((gc-time-taken . 1530)
 (heap-size . 2,625,474,560)
 (heap-free-size . 1,127,989,248)
 (heap-total-allocated . 1,337,029,496)
 (heap-allocated-since-gc . 28,728)
 (protected-objects . 28)
 (gc-times . 324))
--8<---------------cut here---------------end--------------->8---

Notice that it's *much* bigger (more than twice) than my findings on
linux-64 below.  I have no idea why this is or what it might mean...

So I turned to Guix GNU/Linux to get some gc-stats measurements.  What
you see below is the maximum heap-size at any point.  (I also have
heap-total-allocated, but I think that's irrelevant; initially I also
didn't use a script that measured the time.)
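
Tracking such a maximum can be sketched as follows; `peak-heap-size'
and `note-heap-size!' are hypothetical names used for illustration, not
the instrumentation that produced the numbers below:

```scheme
;; Hypothetical names, for illustration only.
(define peak-heap-size 0)

(define (note-heap-size!)
  "Record the largest heap-size reported by (gc-stats) so far and
return it."
  (let ((current (assq-ref (gc-stats) 'heap-size)))
    (set! peak-heap-size (max peak-heap-size current))
    peak-heap-size))

;; Call this after each compiled file or chunk; the value returned by
;; the last call is the peak observed at any of those sampling points.
(note-heap-size!)
```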

--8<---------------cut here---------------start------------->8---
* guix/self.scm: Vanilla, not chunked; print gc-stats.
((gc-time-taken . 27,319,485,051)
 (heap-size . 1,360,330,752)
 (heap-free-size . 285,696,000)
 (heap-total-allocated . 74,067,590,944)
 (heap-allocated-since-gc . 186,250,144)
 (protected-objects . 28)
 (gc-times . 464))
real	24m36.643s

* guix/self.scm: Split building of directories into 26 chunks; print gc-stats.
 (heap-size . 1,131,298,816)

* guix/self.scm: Split building of directories into 26 chunks; no gc; print gc-stats.
 (heap-size . 1,121,116,160)

* guix/self.scm: Chunks of 25 files; run gc; print gc-stats.
 (heap-size . 1,066,725,376)

* guix/self.scm: Chunks of 50 files; no gc; print gc-stats.
 (heap-size . 1,299,230,720)
real	26m40.708s

* guix/self.scm: Chunks of 25 files; no gc; print gc-stats.
 (heap-size . 1,024,045,056)  ; 1st run
real	28m4.451s

* guix/self.scm: Chunks of 10 files; no gc; print gc-stats.
 (heap-size . 1,077,895,168)
real	30m14.049s
--8<---------------cut here---------------end--------------->8---

...strangely enough, if we assume that these statistics translate to the
Hurd, using chunks of max 25 files seems to be a sort of sweet spot:
25% less peak memory (~300MB) for "only" 12% (3'45") more build time,
though that is not great for GNU/Linux users...
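
The comparison can be redone from the figures in the listing above; a
small sketch, where `percent-change' and `seconds' are hypothetical
helper names.  With the vanilla run as baseline and the "chunks of 25
files; no gc" run as candidate, it gives roughly 25% less peak heap and
roughly 14% more wall-clock time, so the 12% quoted above may stem from
a different pair of runs.

```scheme
;; Hypothetical helpers, for illustration only.
(define (percent-change baseline value)
  "Return the change from BASELINE to VALUE as a percentage of BASELINE."
  (* 100. (/ (- value baseline) baseline)))

(define (seconds minutes secs)
  "Convert MINUTES and SECS to a number of seconds."
  (+ (* 60 minutes) secs))

;; Peak heap-size: vanilla vs. chunks of 25 files, no gc.
(percent-change 1360330752 1024045056)                   ;about -24.7

;; Wall-clock time: 24m36.643s vs. 28m4.451s.
(percent-change (seconds 24 36.643) (seconds 28 4.451))  ;about 14.1
```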

I have produced a handful of successful `guix pull's (from a local
checked-out worktree) using the 26-way split and chunks of max-25 files
patches, but sadly many more attempts failed.  Initially, when
creating this patch series, I was convinced this fixed building on the
Hurd, but I'm much less enthusiastic now.

So I still have a slight preference for using the latest max-25-files
patch, but I'm sorry to say that I cannot back it up with tangible data.
All in all a rather discouraging week with much effort spent for little
gain.  Hopefully Josselin can do some of his magic here :)

Greetings,
Janneke

-- 
Janneke Nieuwenhuizen <janneke@gnu.org>  | GNU LilyPond https://LilyPond.org
Freelance IT https://www.JoyOfSource.com | Avatar® https://AvatarAcademy.com



