* marking overhead, and on the cost of conditionals in hot code
@ 2009-01-16 22:00 Andy Wingo
2009-01-17 18:48 ` Neil Jerram
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Andy Wingo @ 2009-01-16 22:00 UTC (permalink / raw)
To: guile-devel
I dropped into cachegrind, and it tells me this about scm_gc_mark in a
simple guile -c 1 run:
        .  void
        .  scm_gc_mark (SCM ptr)
  794,344  {
  155,170  => ???:0x00024917 (77585x)
  198,586    if (SCM_IMP (ptr))
        .      return;
        .
  513,038    if (SCM_GC_MARK_P (ptr))
        .      return;
        .
   84,580    if (!scm_i_marking)
        .      {
        .        static const char msg[]
        .          = "Should only call scm_gc_mark() during GC.";
        .        scm_c_issue_deprecation_warning (msg);
        .      }
        .
   42,290    SCM_SET_GC_MARK (ptr);
   63,435    scm_gc_mark_dependencies (ptr);
2,666,432  => /home/wingo/src/guile/vm/libguile/gc-mark.c:scm_gc_mark_dependencies (5222x)
      704  => /usr/src/debug////////glibc-20081113T2206/elf/../sysdeps/i386/dl-trampoline.S:_dl_runtime_resolve (1x)
  595,758  }
I think that the items on the left are cycle counts, and are of relative
importance. The => lines are the cumulative costs of the subroutines.
The salient point for me is that the scm_i_marking check slows down
this function by about 10%! Also, that the majority of the time in this
function is in the SCM_GC_MARK_P line.
If I thought that we'd keep our GC, I would work at inlining this
function, I think.
Andy
--
http://wingolog.org/
* Re: marking overhead, and on the cost of conditionals in hot code
From: Neil Jerram @ 2009-01-17 18:48 UTC (permalink / raw)
To: Andy Wingo; +Cc: guile-devel
2009/1/16 Andy Wingo <wingo@pobox.com>:
>
> If I thought that we'd keep our GC, I would work at inlining this
> function, I think.
It seems like a lot of things are starting to depend on whether or not
we move to BDW-GC. (This, the fix I just did for NetBSD,
scm_init_guile, forthcoming work on threads and mutex locking
inconsistencies, ...) We should aim to reach a definitive decision on
this soon!
Neil
* Re: marking overhead, and on the cost of conditionals in hot code
From: Ludovic Courtès @ 2009-01-17 22:30 UTC (permalink / raw)
To: guile-devel
Hello!
Andy Wingo <wingo@pobox.com> writes:
> I dropped into cachegrind, and it tells me this about scm_gc_mark in a
> simple guile -c 1 run:
>
>         .  void
>         .  scm_gc_mark (SCM ptr)
>   794,344  {
>   155,170  => ???:0x00024917 (77585x)
>   198,586    if (SCM_IMP (ptr))
>         .      return;
>         .
>   513,038    if (SCM_GC_MARK_P (ptr))
>         .      return;
>         .
>    84,580    if (!scm_i_marking)
>         .      {
>         .        static const char msg[]
>         .          = "Should only call scm_gc_mark() during GC.";
>         .        scm_c_issue_deprecation_warning (msg);
>         .      }
>         .
>    42,290    SCM_SET_GC_MARK (ptr);
>    63,435    scm_gc_mark_dependencies (ptr);
> 2,666,432  => /home/wingo/src/guile/vm/libguile/gc-mark.c:scm_gc_mark_dependencies (5222x)
>       704  => /usr/src/debug////////glibc-20081113T2206/elf/../sysdeps/i386/dl-trampoline.S:_dl_runtime_resolve (1x)
>   595,758  }
>
>
> I think that the items on the left are cycle counts, and are of relative
> importance. The => lines are the cumulative costs of the subroutines.
This is actually the output of Callgrind, and the left column is
instruction reads ("Ir"), which is not directly equivalent to the cycle
count, especially on a CISC arch (it's nevertheless a good
approximation, I'm just nitpicking ;-)).
> The salient point for me is that the scm_i_marking check slows down
> this function by about 10%! Also, that the majority of the time in this
> function is in the SCM_GC_MARK_P line.
>
> If I thought that we'd keep our GC, I would work at inlining this
> function, I think.
But it's a macro, isn't it?
Thanks,
Ludo'.
* Re: marking overhead, and on the cost of conditionals in hot code
From: Ludovic Courtès @ 2009-01-17 22:37 UTC (permalink / raw)
To: guile-devel
"Neil Jerram" <neiljerram@googlemail.com> writes:
> It seems like a lot of things are starting to depend on whether or not
> we move to BDW-GC. (This, the fix I just did for NetBSD,
> scm_init_guile, forthcoming work on threads and mutex locking
> inconsistencies, ...) We should aim to reach a definitive decision on
> this soon!
Right. Here's my (small) roadmap:
1. Experiment a bit more with static allocation, notably for subrs,
and see whether it's worth it.
2. Provide additional benchmarking results, based on those by Clinger,
Hansen et al., which are in the repo. I'd like to have a
reasonable understanding of what they do, though.
Additional feedback from interested parties could also be helpful in
trying to reach a decision.
Thanks,
Ludo'.
* Re: marking overhead, and on the cost of conditionals in hot code
From: Han-Wen Nienhuys @ 2009-01-19 3:35 UTC (permalink / raw)
To: guile-devel
Andy Wingo wrote:
> I dropped into cachegrind, and it tells me this about scm_gc_mark in a
> simple guile -c 1 run:
>
>
> I think that the items on the left are cycle counts, and are of relative
> importance. The => lines are the cumulative costs of the subroutines.
>
> The salient point for me is that the scm_i_marking check slows down
> this function by about 10%!
This can easily be remedied by splitting off the actual work into an internal
function that skips the check. The GC module could always call the internal
function.
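
A rough sketch of that split, to make the idea concrete. This is not
libguile code: obj, marking, warnings, mark_internal, mark_public and
collect are hypothetical stand-ins for the SCM type, scm_i_marking, the
deprecation warning, and the two halves of scm_gc_mark.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a heap object; not Guile's SCM layout. */
typedef struct obj
{
  bool marked;
  struct obj *child;            /* single dependency, to keep the sketch short */
} obj;

static bool marking = false;    /* stand-in for scm_i_marking */
static int warnings = 0;        /* counts would-be deprecation warnings */

/* Internal worker: does the marking without the "are we in GC?" check.
   Only the collector is supposed to call it. */
static void
mark_internal (obj *o)
{
  while (o != NULL && !o->marked)
    {
      o->marked = true;
      o = o->child;
    }
}

/* Public entry point: keeps the check for callers outside the GC. */
void
mark_public (obj *o)
{
  if (!marking)
    warnings++;                 /* stand-in for scm_c_issue_deprecation_warning */
  mark_internal (o);
}

/* Collector side: calls the worker directly, so the hot path never
   tests the flag. */
void
collect (obj *root)
{
  marking = true;
  mark_internal (root);
  marking = false;
}
```

With this shape the conditional is only paid by external callers of
mark_public; marking done through collect never evaluates it.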
> Also, that the majority of the time in this
> function is in the SCM_GC_MARK_P line.
Well, GC_MARK_P is bit fiddling plus a pointer dereference, with a possible cache miss.
Also, the code up to that point is executed much more often than what follows.
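
To illustrate the "bit fiddling plus a dereference" shape, here is a
minimal sketch of a mark-bit test against a side bitmap. It assumes a
made-up layout (size-aligned 4096-byte blocks with the bitmap at the
start of each block, 16-byte cells); BLOCK_SIZE, CELL_SIZE, mark_p and
set_mark are illustrative names, not the actual SCM_GC_MARK_P
definition.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 4096u        /* assumption: blocks are aligned to their size */
#define CELL_SIZE  16u          /* assumption: fixed-size cells */

/* Address arithmetic only: mask the cell pointer down to its block,
   whose first words hold the mark bitmap. */
static uint32_t *
block_bits (const void *cell)
{
  uintptr_t base = (uintptr_t) cell & ~(uintptr_t) (BLOCK_SIZE - 1);
  return (uint32_t *) base;
}

/* Index of the cell within its block. */
static size_t
cell_index (const void *cell)
{
  return ((uintptr_t) cell & (BLOCK_SIZE - 1)) / CELL_SIZE;
}

/* Test the mark bit.  The bitmap load is the memory access that can
   miss the cache; everything else is register arithmetic. */
bool
mark_p (const void *cell)
{
  size_t i = cell_index (cell);
  return (block_bits (cell)[i / 32] >> (i % 32)) & 1u;
}

/* Set the mark bit for a cell. */
void
set_mark (const void *cell)
{
  size_t i = cell_index (cell);
  block_bits (cell)[i / 32] |= (uint32_t) 1 << (i % 32);
}
```

Every call that gets past the immediate check pays for the bitmap load,
while only the objects found unmarked go on to the rest of the function,
which fits the profile above.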
--
Han-Wen Nienhuys - hanwen@xs4all.nl - http://www.xs4all.nl/~hanwen