* GC: cons sweeping
@ 2007-07-02 15:01 Dmitry Antipov
2007-07-02 18:15 ` Stefan Monnier
0 siblings, 1 reply; 4+ messages in thread
From: Dmitry Antipov @ 2007-07-02 15:01 UTC (permalink / raw)
To: emacs-devel
[-- Attachment #1: Type: text/plain, Size: 4488 bytes --]
For the most common usage patterns, it was observed that from 50% to near 80%
of allocated cons blocks are full (for the dumped Emacs). Other blocks may
contains a free cons cells, and, when Emacs lifetime grows, the fragmentation for
such blocks usually increases (this is a weakness of all non-copying GCs we can't avoid).
Anyway, the number of full blocks is quite large, and some of non-full blocks may
contains large full subareas. This means that we may try to check the whole word of
the cons block bitmap first - if all appropriate conses are marked, it will be -1,
and, since there is nothing to link into free list from there, all mark bits might be
just cleared at once by setting this word to 0. Otherwise, we should scan over
each bit of this word and link non-marked conses to a free list.
Here is an implementation of a such sweeping scheme. It's a bit more complicated
than described because a) first cons block may be used partially and b) number of
bits in a cons block bitmap is not a multiple of a number of bits per word, so
last word of the bitmap might be scanned as usual. This patch also saves an old
sweeping code under #if and provides a simple timing of both sweeping methods
in order to compare the speed of them.
Finally, there are typical results of simple benchmarks I've performed.
Everything is started from 'src' subdirectory of the source tree. Cons sweeping
time (in us) for the original code is 2nd column, for the modified - 3rd.
Since the number of usecs is absolute and highly depends on your hardware,
only the difference (or ratio) between them makes sense.
1. Run 'emacs *.[ch]', then exit immediately:
GC0 612 514
GC1 692 578
GC2 814 674
GC3 741 522
GC4 778 534
GC5 776 505
GC6 779 490
GC7 856 543
GC8 912 590
GC9 941 611
GC10 971 597
GC11 1000 610
GC12 1031 620
GC13 1104 661
GC14 1099 664
GC15 1134 684
GC16 1149 679
GC17 1199 719
GC18 1215 717
GC19 1243 733
GC20 1277 780
GC21 1316 774
GC22 1422 802
GC23 1381 803
GC24 1400 823
GC25 1430 1009
GC26 1474 834
GC27 1497 840
GC28 1512 849
GC29 1536 866
GC30 1574 968
GC31 1595 954
GC32 1606 893
GC33 1671 921
GC34 1693 924
GC35 1727 937
GC36 1759 994
GC37 1856 987
GC38 1809 998
GC39 1831 987
2. Run 'emacs -batch -f batch-byte-compile ../lisp/textmodes/org.el':
GC0 602 504
GC1 561 383
GC2 613 438
GC3 650 451
GC4 676 412
GC5 696 411
GC6 712 457
GC7 799 509
GC8 825 526
GC9 850 515
GC10 914 556
GC11 893 500
GC12 1002 584
GC13 1052 619
GC14 1145 711
GC15 1340 909
GC16 1520 930
GC17 1333 894
GC18 1376 951
GC19 1402 992
GC20 1331 920
GC21 1381 1008
GC22 1382 970
GC23 1376 998
GC24 1441 1037
GC25 1384 991
GC26 1442 1065
GC27 1430 1060
GC28 1400 1017
GC29 1400 991
GC30 1521 1072
GC31 1427 1014
3. Run 'emacs ChangeLog', then '(replace-string "a" "__A__")' and exit
without saving the modified buffer:
GC0 607 510
GC1 697 593
GC2 809 670
GC3 757 537
GC4 736 599
GC5 863 679
GC6 937 661
GC7 1023 702
GC8 1144 753
GC9 1195 802
GC10 1308 829
GC11 1359 851
GC12 1509 900
GC13 1641 912
GC14 1648 969
GC15 1728 1004
GC16 1929 1047
GC17 2100 1146
GC18 2141 1175
GC19 2133 1249
GC20 2373 1317
GC21 2440 1369
GC22 2633 1464
GC23 2775 1521
GC24 2834 1615
GC25 3088 1703
GC26 3189 1823
GC27 3399 1915
GC28 3563 2143
GC29 3854 2163
GC30 4133 2262
GC31 4225 2373
GC32 4519 2580
GC33 4653 2605
GC34 4922 2768
GC35 5414 2910
Waiting for feedback,
Dmitry
[-- Attachment #2: new_cons_sweep.patch --]
[-- Type: text/plain, Size: 3493 bytes --]
Index: alloc.c
===================================================================
RCS file: /sources/emacs/emacs/src/alloc.c,v
retrieving revision 1.410
diff -u -r1.410 alloc.c
--- alloc.c 8 Jun 2007 19:59:46 -0000 1.410
+++ alloc.c 2 Jul 2007 14:54:46 -0000
@@ -5950,6 +5950,9 @@
static void
gc_sweep ()
{
+ EMACS_TIME beg, end, total;
+ char *sweep_type;
+
/* Remove or mark entries in weak hash tables.
This must be done before any object is unmarked. */
sweep_weak_hash_tables ();
@@ -5961,6 +5964,11 @@
#endif
/* Put all unmarked conses on free list */
+
+ EMACS_GET_TIME (beg);
+
+#if 0 /* Old straightforward sweep */
+
{
register struct cons_block *cblk;
struct cons_block **cprev = &cons_block;
@@ -5968,6 +5976,7 @@
register int num_free = 0, num_used = 0;
cons_free_list = 0;
+ sweep_type = "old";
for (cblk = cons_block; cblk; cblk = *cprev)
{
@@ -5991,6 +6000,88 @@
lim = CONS_BLOCK_SIZE;
/* If this block contains only free conses and we have already
seen more than two blocks worth of free conses then deallocate
+ this block. */
+ if (this_free == CONS_BLOCK_SIZE && num_free > CONS_BLOCK_SIZE)
+ {
+ *cprev = cblk->next;
+ /* Unhook from the free list. */
+ cons_free_list = cblk->conses[0].u.chain;
+ lisp_align_free (cblk);
+ n_cons_blocks--;
+ }
+ else
+ {
+ num_free += this_free;
+ cprev = &cblk->next;
+ }
+ }
+ total_conses = num_used;
+ total_free_conses = num_free;
+ }
+
+#else /* New, more compilcated, but hopefully faster, sweep */
+
+ {
+ register struct cons_block *cblk;
+ struct cons_block **cprev = &cons_block;
+ register int lim = cons_block_index;
+ register int num_free = 0, num_used = 0;
+
+ cons_free_list = 0;
+ sweep_type = "new";
+
+ for (cblk = cons_block; cblk; cblk = *cprev)
+ {
+ register int i = 0;
+ int this_free = 0;
+
+ while (1)
+ {
+ if (cblk->gcmarkbits[i] == -1)
+ {
+ /* Fast path - everything is marked. */
+ cblk->gcmarkbits[i++] = 0;
+ num_used += BITS_PER_INT;
+ }
+ else
+ {
+ /* Slow path - scan over each bit, from the beginning
+ of current word to 'min (word boundary, LIM)'. */
+ int start, stop;
+
+ start = i * BITS_PER_INT;
+ stop = lim - start;
+ if (stop > BITS_PER_INT)
+ stop = BITS_PER_INT;
+ stop += start;
+
+ while (start < stop)
+ {
+ if (!CONS_MARKED_P (&cblk->conses[start]))
+ {
+ this_free++;
+ cblk->conses[start].u.chain = cons_free_list;
+ cons_free_list = &cblk->conses[start];
+#if GC_MARK_STACK
+ cons_free_list->car = Vdead;
+#endif
+ }
+ else
+ {
+ num_used++;
+ CONS_UNMARK (&cblk->conses[start]);
+ }
+ start++;
+ }
+ if (stop < (++i) * BITS_PER_INT)
+ /* Whole bitmap is scanned or LIM is reached. */
+ break;
+ }
+ }
+
+ lim = CONS_BLOCK_SIZE;
+ /* If this block contains only free conses and we have already
+ seen more than two blocks worth of free conses then deallocate
this block. */
if (this_free == CONS_BLOCK_SIZE && num_free > CONS_BLOCK_SIZE)
{
@@ -6010,6 +6101,13 @@
total_free_conses = num_free;
}
+#endif /* Sweep types */
+
+ EMACS_GET_TIME (end);
+ EMACS_SUB_TIME (total, end, beg);
+ fprintf (stderr, "GC%u [%s sweep]: %u us\n", gcs_done,
+ sweep_type, EMACS_USECS (total));
+
/* Put all unmarked floats on free list */
{
register struct float_block *fblk;
[-- Attachment #3: Type: text/plain, Size: 142 bytes --]
_______________________________________________
Emacs-devel mailing list
Emacs-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-devel
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: GC: cons sweeping
2007-07-02 15:01 GC: cons sweeping Dmitry Antipov
@ 2007-07-02 18:15 ` Stefan Monnier
2007-07-02 22:39 ` Richard Stallman
2007-07-03 10:48 ` Dmitry Antipov
0 siblings, 2 replies; 4+ messages in thread
From: Stefan Monnier @ 2007-07-02 18:15 UTC (permalink / raw)
To: Dmitry Antipov; +Cc: emacs-devel
> GC39 1831 987
So it seems to be able to speed up sweeping of cons cells by anywhere between
20% and 50% (up to a factor of 2 speed up). Now, what fraction of time is
spent sweeping cons-cells?
Stefan
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: GC: cons sweeping
2007-07-02 18:15 ` Stefan Monnier
@ 2007-07-02 22:39 ` Richard Stallman
2007-07-03 10:48 ` Dmitry Antipov
1 sibling, 0 replies; 4+ messages in thread
From: Richard Stallman @ 2007-07-02 22:39 UTC (permalink / raw)
To: Stefan Monnier; +Cc: dmantipov, emacs-devel
So it seems to be able to speed up sweeping of cons cells by anywhere between
20% and 50% (up to a factor of 2 speed up). Now, what fraction of time is
spent sweeping cons-cells?
The change is simple enough that we have no reason to argue more
before we install it. We just need legal papers. So I sent Dmitry
a message about that.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: GC: cons sweeping
2007-07-02 18:15 ` Stefan Monnier
2007-07-02 22:39 ` Richard Stallman
@ 2007-07-03 10:48 ` Dmitry Antipov
1 sibling, 0 replies; 4+ messages in thread
From: Dmitry Antipov @ 2007-07-03 10:48 UTC (permalink / raw)
To: Stefan Monnier; +Cc: emacs-devel
Stefan Monnier wrote:
> So it seems to be able to speed up sweeping of cons cells by anywhere between
> 20% and 50% (up to a factor of 2 speed up). Now, what fraction of time is
> spent sweeping cons-cells?
In general, typical GC spent 80% of time in marking and 20% of time in sweeping
(so even a tiny optimization of the first phase might be more important than a
much more solid enhancement of a second - but this doesn't mean that we should
not try to optimize both :-), and cons sweeping time is typically 2-5 % of the
whole GC time. So, since an average GC takes 20-30 ms on a 3GHz x86 CPU, I agree
that this optimization is likely not to be seen by an average user on an average
modern hardware.
Dmitry
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2007-07-03 10:48 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-07-02 15:01 GC: cons sweeping Dmitry Antipov
2007-07-02 18:15 ` Stefan Monnier
2007-07-02 22:39 ` Richard Stallman
2007-07-03 10:48 ` Dmitry Antipov
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).