From: Christopher Baines <mail@cbaines.net>
To: Guix Devel <guix-devel@gnu.org>
Subject: Persistent heap usage when computing derivations
Date: Sat, 09 Nov 2024 19:53:13 +0000
Message-ID: <87a5e8tc9i.fsf@cbaines.net>

Hey,

I've recently been putting more time and money into trying to get the QA
data service (data.qa.guix.gnu.org) to perform better, but
unfortunately I haven't had much success.

I've been trying to parallelise more, and while I think this should speed
things up, I'm having to reduce the actual parallelism due to lack of
memory (the machine I rent for data.qa.guix.gnu.org has just 32G).

One of the memory problems I'm having relates to the Guix inferior
processes that the data service uses when computing derivations. The
data service goes through the list of systems (x86_64-linux,
aarch64-linux, ...) and because the data cached for x86_64-linux
probably doesn't relate to aarch64-linux, there's some code that
attempts to clear the caches [1].

1: https://git.savannah.gnu.org/cgit/guix/data-service.git/tree/guix-data-service/jobs/load-new-guix-revision.scm#n1970

Unfortunately this code has to reach into Guix internals to do this.
It does reduce the heap usage significantly, but it doesn't result in
stable memory usage. Each system processed seems to add about 250MiB of
data to the Guile heap that isn't cleared out. To me that sounds like a
lot of memory, and there are also a lot of systems/targets, so overall
the inferior process ends up with around 6GiB of data in the heap after
processing all the systems/targets. This peak memory usage really
limits how much the machine can do.
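
To make the arithmetic above concrete, here is a rough back-of-the-envelope
sketch in Python. It is not code from the data service. It assumes the
heap grows linearly by about 250MiB per system/target, ignores any
baseline heap usage, and treats "32G" as 32GiB; the function names are
made up for illustration.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

# Figures quoted in this message; everything else is an assumption.
PER_SYSTEM_BYTES = 250 * MIB   # data retained per system/target
PEAK_HEAP_BYTES = 6 * GIB      # inferior heap after all systems/targets
MACHINE_BYTES = 32 * GIB       # assumes "32G" means 32GiB


def systems_to_reach(target_bytes, per_system_bytes=PER_SYSTEM_BYTES):
    """Smallest number of systems/targets whose retained data reaches
    target_bytes, assuming linear growth and no baseline usage."""
    return -(-target_bytes // per_system_bytes)  # ceiling division


def max_parallel_inferiors(machine_bytes=MACHINE_BYTES,
                           peak_bytes=PEAK_HEAP_BYTES):
    """Upper bound on inferiors that fit in memory at their peak,
    ignoring everything else running on the machine."""
    return machine_bytes // peak_bytes


print(systems_to_reach(PEAK_HEAP_BYTES))
print(max_parallel_inferiors())
```

Under those assumptions, around 25 systems/targets at 250MiB each account
for the 6GiB peak, and at most 5 inferiors at peak would fit in memory
before counting anything else the machine needs to run.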

These numbers come from this specific job that ran with a parallelism of
1 to get clear data [2].

2: https://data.qa.guix.gnu.org/job/60896

I've tried using the heap profiler that Ludo wrote, but nothing jumps
out at me about what this extra 250MiB of stuff in the heap relates
to. I'm also aware that my current cache cleanup doesn't actually remove
references to the hash tables themselves, but I doubt they take up this
much space.

Does anyone have any suggestions as to what might be taking up this
space on the heap, or how to try and find out?

Thanks,

Chris


Thread overview: 2+ messages
2024-11-09 19:53 Christopher Baines [this message]
2024-11-30 15:53 ` Persistent heap usage when computing derivations Ludovic Courtès