From: ludo@gnu.org (Ludovic Courtès)
To: Andy Wingo <wingo@igalia.com>
Cc: Mark H Weaver <mhw@netris.org>, guile-devel <guile-devel@gnu.org>
Subject: Weak tables harmful to GC?
Date: Sun, 17 Sep 2017 15:56:49 +0200 [thread overview]
Message-ID: <87zi9th1i6.fsf_-_@gnu.org> (raw)
In-Reply-To: <87wp54z3p5.fsf@gnu.org> ("Ludovic \=\?utf-8\?Q\?Court\=C3\=A8s\=22'\?\= \=\?utf-8\?Q\?s\?\= message of "Tue, 12 Sep 2017 11:06:14 +0200")
Hello,
ludo@gnu.org (Ludovic Courtès) skribis:
> Total time: 66.515425859 seconds (31.859444991 seconds in GC)
So, more investigation on the GC side…
‘perf’ shows a profile like this:
--8<---------------cut here---------------start------------->8---
12.33% guile libgc.so.1.0.3 [.] GC_mark_from
10.67% guile libgc.so.1.0.3 [.] GC_move_disappearing_link_inner
8.47% guile libgc.so.1.0.3 [.] GC_header_cache_miss
7.06% guile libgc.so.1.0.3 [.] GC_mark_and_push
5.98% guile libgc.so.1.0.3 [.] GC_finalize
4.08% guile libpthread-2.25.so [.] pthread_mutex_trylock
4.03% guile libgc.so.1.0.3 [.] GC_register_disappearing_link_inner
3.95% guile libpthread-2.25.so [.] __pthread_mutex_unlock_usercnt
3.07% guile libguile-2.2.so.1.2.0 [.] weak_table_put_x
3.05% guile libgc.so.1.0.3 [.] GC_base
2.49% guile libguile-2.2.so.1.2.0 [.] resize_table
2.47% guile libgc.so.1.0.3 [.] GC_is_marked
1.76% guile libguile-2.2.so.1.2.0 [.] rob_from_rich
1.54% guile libgc.so.1.0.3 [.] GC_grow_table
1.22% guile libgc.so.1.0.3 [.] GC_find_header
1.19% guile libgc.so.1.0.3 [.] GC_clear_mark_bit
1.17% guile libguile-2.2.so.1.2.0 [.] mem2uinteger
1.13% guile libgc.so.1.0.3 [.] GC_malloc_kind
0.98% guile libguile-2.2.so.1.2.0 [.] peek_byte_or_eof
0.98% guile libguile-2.2.so.1.2.0 [.] scm_getc
0.71% guile libguile-2.2.so.1.2.0 [.] scm_get_byte_or_eof
0.68% guile libguile-2.2.so.1.2.0 [.] scm_to_uint64
0.68% guile libguile-2.2.so.1.2.0 [.] read_token
0.67% guile libgc.so.1.0.3 [.] GC_is_heap_ptr
0.64% guile libguile-2.2.so.1.2.0 [.] scm_unget_bytes
0.60% guile libguile-2.2.so.1.2.0 [.] read_inner_expression
0.59% guile libguile-2.2.so.1.2.0 [.] scm_flush
0.58% guile libguile-2.2.so.1.2.0 [.] peek_utf8_codepoint
0.55% guile libguile-2.2.so.1.2.0 [.] scm_set_cdr_x
0.53% guile libgc.so.1.0.3 [.] GC_push_marked
0.53% guile libguile-2.2.so.1.2.0 [.] scm_c_weak_set_lookup
0.51% guile libgc.so.1.0.3 [.] GC_call_with_alloc_lock
0.51% guile libguile-2.2.so.1.2.0 [.] scm_read_sexp
0.50% guile libguile-2.2.so.1.2.0 [.] mark_weak_key_table
0.49% guile libguile-2.2.so.1.2.0 [.] move_weak_entry.part.0
0.49% guile libguile-2.2.so.1.2.0 [.] scm_ungetc
0.47% guile libguile-2.2.so.1.2.0 [.] scm_to_int32
0.46% guile libguile-2.2.so.1.2.0 [.] flush_ws
0.45% guile libguile-2.2.so.1.2.0 [.] scm_i_string_ref
0.41% guile libgc.so.1.0.3 [.] GC_reclaim_clear
0.39% guile libgc.so.1.0.3 [.] GC_move_disappearing_link
--8<---------------cut here---------------end--------------->8---
As you hinted on #guix a while back Andy, the mark procedures Guile uses
break libgc’s expectations. Specifically, ‘mark_weak_key_table’
typically pushes lots of objects on the mark stack (in my example the
source properties table has a 3595271 ‘scm_t_weak_entry’ objects, so at
every mark phase, we push roughly all of that on the mark stack.)
Internally, libgc would never do that: in ‘GC_mark_from’ it has a
limited “credit” for marking, and it stops when it has consumed all of
it. However, with mark procedures, it cannot do that because the mark
procedure obviously keeps running until it’s done, and from libgc’s
viewpoint, the mark procedure marks one object (in this case, the big
weak entry array.)
(Besides, libgc recommends against using mark procedures in the first
place, but we cannot use the GC_DS_BITMAP approach here because it only
works for objects up to 30 words, not 200+ MiB arrays…)
In addition to this, ‘GC_move_disappearing_link’ is expensive, as shown
in the profile, yet it’s called quite a lot on table resizing.
I’ve come to the conclusion that the 2.2 weak-table implementation
strategy cannot work efficiently with libgc.
I’m also skeptical (perhaps that’s also because I’m insufficiently
informed, tell me!) about the “open-addressed” strategy that is used.
To me, it’s necessarily less space-efficient than a regular hash table
with chaining since we always have at least 10% more weak entries than
the number of actual entries in the table (and in practice it’s usually
much more than 10% AFAICS, because of the gap between subsequent sizes.)
All in all, given that these issues are very likely causes of the
execution time and memory consumption issues that plague the compiler
(where we have huge symbol and source property tables), I’m in favor of
switching back to the 2.0 implementation of weak hash tables. That can
be done in an API-compatible way, I think.
WDYT? Better ideas maybe?
Thanks,
Ludo’.
next prev parent reply other threads:[~2017-09-17 13:56 UTC|newest]
Thread overview: 33+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-05-16 16:19 Compiler memory consumption Ludovic Courtès
2017-05-16 20:14 ` Andy Wingo
2017-05-16 20:45 ` Ludovic Courtès
2017-09-12 9:06 ` Ludovic Courtès
2017-09-17 13:56 ` Ludovic Courtès [this message]
2017-10-09 13:10 ` Weak tables harmful to GC? Christopher Allan Webber
2017-10-22 1:52 ` Ludovic Courtès
2017-10-22 2:20 ` Christopher Allan Webber
2017-10-23 0:16 ` Ludovic Courtès
2017-10-24 15:02 ` Ricardo Wurmus
2017-10-24 15:32 ` Ricardo Wurmus
2017-10-25 6:38 ` Ricardo Wurmus
2017-10-26 7:03 ` Ludovic Courtès
2017-10-26 8:35 ` Ricardo Wurmus
2017-10-26 16:52 ` Ricardo Wurmus
2017-10-27 5:28 ` Ludovic Courtès
2017-10-28 9:56 ` Ricardo Wurmus
2017-10-30 12:35 ` Ludovic Courtès
2017-10-30 14:48 ` Ricardo Wurmus
2017-10-30 17:20 ` Ricardo Wurmus
2017-10-30 22:18 ` Ludovic Courtès
2017-10-30 17:29 ` bug#19180: " Andy Wingo
2017-10-30 23:13 ` Ludovic Courtès
2017-10-31 8:25 ` Andy Wingo
2017-10-31 16:56 ` Ludovic Courtès
2017-10-26 17:17 ` Ludovic Courtès
2017-10-24 22:45 ` Christopher Allan Webber
2017-10-24 22:58 ` Ludovic Courtès
2017-10-25 0:50 ` Christopher Allan Webber
2017-10-25 17:11 ` Ludovic Courtès
2017-10-25 17:42 ` Compiler memory consumption Ludovic Courtès
2017-10-30 15:16 ` Andy Wingo
2017-10-30 15:52 ` Ludovic Courtès
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://www.gnu.org/software/guile/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87zi9th1i6.fsf_-_@gnu.org \
--to=ludo@gnu.org \
--cc=guile-devel@gnu.org \
--cc=mhw@netris.org \
--cc=wingo@igalia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).