unofficial mirror of guix-devel@gnu.org 
 help / color / mirror / code / Atom feed
* Possible useful guix package performance data
@ 2024-11-01  6:27 Evan Cooney
  2024-11-01 13:14 ` Thompson, David
  0 siblings, 1 reply; 6+ messages in thread
From: Evan Cooney @ 2024-11-01  6:27 UTC (permalink / raw)
  To: guix-devel

[-- Attachment #1: Type: text/plain, Size: 653 bytes --]

Hi everyone, I used guile's statprof profiler to collect some performance
data for guix package. I added the profiler by using the statprof module in
guix/guix/scripts/package.scm and wrapping the call to guix-package* in the
guix package define command thing in a lambda, then in statprof. Here is
the code: paste.debian.net/1334168
I collected this data by running guix package -i krita inkscape
rust-rustup-toolchain. It uses a significant amount of time in the garbage
collector (~75% of total runtime) and ~45% of time is spend calling append.
What do y'all make of this? Here is the full data: paste.debian.net/1334174

Happy hacking!
Evan Cooney

[-- Attachment #2: Type: text/html, Size: 868 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Possible useful guix package performance data
  2024-11-01  6:27 Possible useful guix package performance data Evan Cooney
@ 2024-11-01 13:14 ` Thompson, David
  2024-11-04 22:22   ` Ludovic Courtès
  0 siblings, 1 reply; 6+ messages in thread
From: Thompson, David @ 2024-11-01 13:14 UTC (permalink / raw)
  To: Evan Cooney; +Cc: guix-devel

Hi Evan,

On Fri, Nov 1, 2024 at 2:28 AM Evan Cooney <evancooney71@gmail.com> wrote:
>
> Hi everyone, I used guile's statprof profiler to collect some performance data for guix package. I added the profiler by using the statprof module in guix/guix/scripts/package.scm and wrapping the call to guix-package* in the guix package define command thing in a lambda, then in statprof. Here is the code: paste.debian.net/1334168
> I collected this data by running guix package -i krita inkscape rust-rustup-toolchain. It uses a significant amount of time in the garbage collector (~75% of total runtime) and ~45% of time is spend calling append. What do y'all make of this? Here is the full data: paste.debian.net/1334174

What are you trying to measure, exactly? The data collected is about a
single thread of a Guile process, but `guix package -i` involves
interaction with the Guix build daemon, waiting for downloads, etc.
These things are outside the scope of what statprof is tracing.
Anything that involves reaching out to the internet is going to make
for an unreliable benchmark. I'm not actively working on Guix tools
but I don't see anything immediately actionable from the collected
data. I think getting some more clarity around the goal is necessary
in order to know what to profile and how.

- Dave


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Possible useful guix package performance data
  2024-11-01 13:14 ` Thompson, David
@ 2024-11-04 22:22   ` Ludovic Courtès
  2024-11-05 15:34     ` Efraim Flashner
  2024-11-07 15:14     ` Simon Tournier
  0 siblings, 2 replies; 6+ messages in thread
From: Ludovic Courtès @ 2024-11-04 22:22 UTC (permalink / raw)
  To: Thompson, David; +Cc: Evan Cooney, guix-devel

Hi,

"Thompson, David" <dthompson2@worcester.edu> skribis:

> What are you trying to measure, exactly? The data collected is about a
> single thread of a Guile process, but `guix package -i` involves
> interaction with the Guix build daemon, waiting for downloads, etc.
> These things are outside the scope of what statprof is tracing.
> Anything that involves reaching out to the internet is going to make
> for an unreliable benchmark. I'm not actively working on Guix tools
> but I don't see anything immediately actionable from the collected
> data. I think getting some more clarity around the goal is necessary
> in order to know what to profile and how.

Unless you’re specifically looking at performance of substitutes, I
recommend looking at profiles when using ‘--no-substitutes’; I would
also pass ‘--no-grafts’, to make sure you’re really looking at the core
of the whole package-to-derivation mechanism.

You may also want to check the ‘GUIX_PROFILING’ environment variable,
from the (guix profiling) module.  That can give a useful overview such
as the number of remote procedure calls (RPCs) made to guix-daemon and
the hit rate of the “object cache”:

--8<---------------cut here---------------start------------->8---
$ GUIX_PROFILING="gc rpc object-cache" guix build hello -d --no-grafts --no-substitutes
/gnu/store/79dhya6sngg4rf53m1cyxlhn8y4pnw2n-hello-2.12.1.drv
Object Cache:
  fresh caches:    16
  lookups:       5042
  hits:          4596 (91.2%)
  cache size:     445 entries
Remote procedure call summary: 382 RPCs
  built-in-builders              ...     1
  add-to-store/tree              ...     3
  add-to-store                   ...    48
  add-text-to-store              ...   330
Garbage collection statistics:
  heap size:        72.27 MiB
  allocated:        100.29 MiB
  GC times:         13
  time spent in GC: 0.42 seconds (45% of user time)
--8<---------------cut here---------------end--------------->8---

Keep in mind that the figures we’re looking at here are largely
dependent of the package graph being processed (that of ‘hello’ here).
So it also makes sense to look at the size of the package graph in
question first (in the case of ‘hello’, it’s probably fine, but let’s
say when Rust packages are in the loop, the graph might be much bigger
than needed/expected).

HTH,
Ludo’.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Possible useful guix package performance data
  2024-11-04 22:22   ` Ludovic Courtès
@ 2024-11-05 15:34     ` Efraim Flashner
  2024-11-06 17:15       ` Evan Cooney
  2024-11-07 15:14     ` Simon Tournier
  1 sibling, 1 reply; 6+ messages in thread
From: Efraim Flashner @ 2024-11-05 15:34 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: Thompson, David, Evan Cooney, guix-devel

[-- Attachment #1: Type: text/plain, Size: 4379 bytes --]

On Mon, Nov 04, 2024 at 11:22:01PM +0100, Ludovic Courtès wrote:
> Hi,
> 
> "Thompson, David" <dthompson2@worcester.edu> skribis:
> 
> > What are you trying to measure, exactly? The data collected is about a
> > single thread of a Guile process, but `guix package -i` involves
> > interaction with the Guix build daemon, waiting for downloads, etc.
> > These things are outside the scope of what statprof is tracing.
> > Anything that involves reaching out to the internet is going to make
> > for an unreliable benchmark. I'm not actively working on Guix tools
> > but I don't see anything immediately actionable from the collected
> > data. I think getting some more clarity around the goal is necessary
> > in order to know what to profile and how.
> 
> Unless you’re specifically looking at performance of substitutes, I
> recommend looking at profiles when using ‘--no-substitutes’; I would
> also pass ‘--no-grafts’, to make sure you’re really looking at the core
> of the whole package-to-derivation mechanism.
> 
> You may also want to check the ‘GUIX_PROFILING’ environment variable,
> from the (guix profiling) module.  That can give a useful overview such
> as the number of remote procedure calls (RPCs) made to guix-daemon and
> the hit rate of the “object cache”:
> 
> --8<---------------cut here---------------start------------->8---
> $ GUIX_PROFILING="gc rpc object-cache" guix build hello -d --no-grafts --no-substitutes
> /gnu/store/79dhya6sngg4rf53m1cyxlhn8y4pnw2n-hello-2.12.1.drv
> Object Cache:
>   fresh caches:    16
>   lookups:       5042
>   hits:          4596 (91.2%)
>   cache size:     445 entries
> Remote procedure call summary: 382 RPCs
>   built-in-builders              ...     1
>   add-to-store/tree              ...     3
>   add-to-store                   ...    48
>   add-text-to-store              ...   330
> Garbage collection statistics:
>   heap size:        72.27 MiB
>   allocated:        100.29 MiB
>   GC times:         13
>   time spent in GC: 0.42 seconds (45% of user time)
> --8<---------------cut here---------------end--------------->8---
> 
> Keep in mind that the figures we’re looking at here are largely
> dependent of the package graph being processed (that of ‘hello’ here).
> So it also makes sense to look at the size of the package graph in
> question first (in the case of ‘hello’, it’s probably fine, but let’s
> say when Rust packages are in the loop, the graph might be much bigger
> than needed/expected).

Did someone say rust?

I made a manifest of the packages that use the cargo-build-system and
aren't rust-* packages, or which have rust as an input. 102 packages:

$ GUIX_PROFILING="gc rpc object-cache" ./pre-inst-env guix build \
        --no-grafts -m ./etc/teams/rust/rusty-packages.scm \
        --no-substitutes -d

Object Cache:
  fresh caches:    20
  lookups:      361239
  hits:         351403 (97.3%)
  cache size:    9835 entries
Remote procedure call summary: 11780 RPCs
  built-in-builders              ...     1
  add-to-store/tree              ...    33
  add-to-store                   ...   329
  add-text-to-store              ... 11417
Garbage collection statistics:
  heap size:        223.68 MiB
  allocated:        1332.86 MiB
  GC times:         34
  time spent in GC: 7.10 seconds (33% of user time)

I also did a run with hello to get a comparison on the same machine:

$ GUIX_PROFILING="gc rpc object-cache" ./pre-inst-env guix build hello -d --no-grafts --no-substitutes
/gnu/store/1y2k2jkb8w67mcwj42b8bxjv3yshk5kf-hello-2.12.1.drv
Object Cache:
  fresh caches:    16
  lookups:       5042
  hits:          4596 (91.2%)
  cache size:     445 entries
Remote procedure call summary: 382 RPCs
  built-in-builders              ...     1
  add-to-store/tree              ...     3
  add-to-store                   ...    48
  add-text-to-store              ...   330
Garbage collection statistics:
  heap size:        79.57 MiB
  allocated:        122.75 MiB
  GC times:         14
  time spent in GC: 0.94 seconds (69% of user time)

-- 
Efraim Flashner   <efraim@flashner.co.il>   אפרים פלשנר
GPG key = A28B F40C 3E55 1372 662D  14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Possible useful guix package performance data
  2024-11-05 15:34     ` Efraim Flashner
@ 2024-11-06 17:15       ` Evan Cooney
  0 siblings, 0 replies; 6+ messages in thread
From: Evan Cooney @ 2024-11-06 17:15 UTC (permalink / raw)
  To: Ludovic Courtès, Thompson, David, Evan Cooney, guix-devel

[-- Attachment #1: Type: text/plain, Size: 4704 bytes --]

Hi,

This is amazingly helpful!! Thank you!

On Tue, Nov 5, 2024, 9:34 AM Efraim Flashner <efraim@flashner.co.il> wrote:

> On Mon, Nov 04, 2024 at 11:22:01PM +0100, Ludovic Courtès wrote:
> > Hi,
> >
> > "Thompson, David" <dthompson2@worcester.edu> skribis:
> >
> > > What are you trying to measure, exactly? The data collected is about a
> > > single thread of a Guile process, but `guix package -i` involves
> > > interaction with the Guix build daemon, waiting for downloads, etc.
> > > These things are outside the scope of what statprof is tracing.
> > > Anything that involves reaching out to the internet is going to make
> > > for an unreliable benchmark. I'm not actively working on Guix tools
> > > but I don't see anything immediately actionable from the collected
> > > data. I think getting some more clarity around the goal is necessary
> > > in order to know what to profile and how.
> >
> > Unless you’re specifically looking at performance of substitutes, I
> > recommend looking at profiles when using ‘--no-substitutes’; I would
> > also pass ‘--no-grafts’, to make sure you’re really looking at the core
> > of the whole package-to-derivation mechanism.
> >
> > You may also want to check the ‘GUIX_PROFILING’ environment variable,
> > from the (guix profiling) module.  That can give a useful overview such
> > as the number of remote procedure calls (RPCs) made to guix-daemon and
> > the hit rate of the “object cache”:
> >
> > --8<---------------cut here---------------start------------->8---
> > $ GUIX_PROFILING="gc rpc object-cache" guix build hello -d --no-grafts
> --no-substitutes
> > /gnu/store/79dhya6sngg4rf53m1cyxlhn8y4pnw2n-hello-2.12.1.drv
> > Object Cache:
> >   fresh caches:    16
> >   lookups:       5042
> >   hits:          4596 (91.2%)
> >   cache size:     445 entries
> > Remote procedure call summary: 382 RPCs
> >   built-in-builders              ...     1
> >   add-to-store/tree              ...     3
> >   add-to-store                   ...    48
> >   add-text-to-store              ...   330
> > Garbage collection statistics:
> >   heap size:        72.27 MiB
> >   allocated:        100.29 MiB
> >   GC times:         13
> >   time spent in GC: 0.42 seconds (45% of user time)
> > --8<---------------cut here---------------end--------------->8---
> >
> > Keep in mind that the figures we’re looking at here are largely
> > dependent of the package graph being processed (that of ‘hello’ here).
> > So it also makes sense to look at the size of the package graph in
> > question first (in the case of ‘hello’, it’s probably fine, but let’s
> > say when Rust packages are in the loop, the graph might be much bigger
> > than needed/expected).
>
> Did someone say rust?
>
> I made a manifest of the packages that use the cargo-build-system and
> aren't rust-* packages, or which have rust as an input. 102 packages:
>
> $ GUIX_PROFILING="gc rpc object-cache" ./pre-inst-env guix build \
>         --no-grafts -m ./etc/teams/rust/rusty-packages.scm \
>         --no-substitutes -d
>
> Object Cache:
>   fresh caches:    20
>   lookups:      361239
>   hits:         351403 (97.3%)
>   cache size:    9835 entries
> Remote procedure call summary: 11780 RPCs
>   built-in-builders              ...     1
>   add-to-store/tree              ...    33
>   add-to-store                   ...   329
>   add-text-to-store              ... 11417
> Garbage collection statistics:
>   heap size:        223.68 MiB
>   allocated:        1332.86 MiB
>   GC times:         34
>   time spent in GC: 7.10 seconds (33% of user time)
>
> I also did a run with hello to get a comparison on the same machine:
>
> $ GUIX_PROFILING="gc rpc object-cache" ./pre-inst-env guix build hello -d
> --no-grafts --no-substitutes
> /gnu/store/1y2k2jkb8w67mcwj42b8bxjv3yshk5kf-hello-2.12.1.drv
> Object Cache:
>   fresh caches:    16
>   lookups:       5042
>   hits:          4596 (91.2%)
>   cache size:     445 entries
> Remote procedure call summary: 382 RPCs
>   built-in-builders              ...     1
>   add-to-store/tree              ...     3
>   add-to-store                   ...    48
>   add-text-to-store              ...   330
> Garbage collection statistics:
>   heap size:        79.57 MiB
>   allocated:        122.75 MiB
>   GC times:         14
>   time spent in GC: 0.94 seconds (69% of user time)
>
> --
> Efraim Flashner   <efraim@flashner.co.il>   אפרים פלשנר
> GPG key = A28B F40C 3E55 1372 662D  14F7 41AA E7DC CA3D 8351
> Confidentiality cannot be guaranteed on emails sent or received unencrypted
>

[-- Attachment #2: Type: text/html, Size: 5852 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Possible useful guix package performance data
  2024-11-04 22:22   ` Ludovic Courtès
  2024-11-05 15:34     ` Efraim Flashner
@ 2024-11-07 15:14     ` Simon Tournier
  1 sibling, 0 replies; 6+ messages in thread
From: Simon Tournier @ 2024-11-07 15:14 UTC (permalink / raw)
  To: Ludovic Courtès, Thompson, David; +Cc: Evan Cooney, guix-devel

Hi,

On Mon, 04 Nov 2024 at 23:22, Ludovic Courtès <ludo@gnu.org> wrote:

> Garbage collection statistics:
>   heap size:        72.27 MiB
>   allocated:        100.29 MiB
>   GC times:         13
>   time spent in GC: 0.42 seconds (45% of user time)

I am puzzled by this.  Well, most of the time when the garbage
collection takes a relative amount of time compared to the whole task,
it potentially means some allocations are poorly designed.

How could I inspect that?


Cheers,
simon


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2024-11-07 17:55 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-01  6:27 Possible useful guix package performance data Evan Cooney
2024-11-01 13:14 ` Thompson, David
2024-11-04 22:22   ` Ludovic Courtès
2024-11-05 15:34     ` Efraim Flashner
2024-11-06 17:15       ` Evan Cooney
2024-11-07 15:14     ` Simon Tournier

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).