* profiling emacs-23.1 vs emacs-22.3
@ 2009-08-03 20:29 Dan Nicolaescu
2009-08-04 17:10 ` Leo
` (3 more replies)
0 siblings, 4 replies; 17+ messages in thread
From: Dan Nicolaescu @ 2009-08-03 20:29 UTC (permalink / raw)
To: emacs-devel
I ran a small test.
Take src/config.h and run C-x r t on it to insert a space in front of each
line.
Then build both emacs-23.1 and 22.3 with profiling enabled (-O2 -g -pg).
Then run:
emacs -Q -nw config.h
M-: (indent-region (point-min)(point-max)) RET
C-x C-c
The 22.3 gprof results:
Flat profile:
Each sample counts as 0.01 seconds.
     % cumulative     self               self    total
  time    seconds  seconds     calls  ms/call  ms/call  name
 41.29       7.13     7.13   4653244     0.00     0.00  forw_comment
 16.10       9.91     2.78     32473     0.09     0.31  scan_sexps_forward
  9.09      11.48     1.57     15132     0.10     0.15  re_search_2
  5.62      12.45     0.97     17286     0.06     0.15  scan_lists
  5.59      13.41     0.96  33462255     0.00     0.00  lookup_char_property
  3.36      13.99     0.58  18478958     0.00     0.00  next_interval
  2.14      14.37     0.37    129733     0.00     0.00  mark_object
  1.80      14.68     0.31    137020     0.00     0.00  skip_chars
  1.74      14.97     0.30  14459993     0.00     0.00  previous_interval
  1.62      15.26     0.28  57446793     0.00     0.00  Fcdr
  1.36      15.49     0.23   2156082     0.00     0.00  update_syntax_table
  1.27      15.71     0.22   3315615     0.00     0.00  re_match_2_internal
  1.19      15.91     0.20  23687203     0.00     0.00  Fassq
  0.87      16.07     0.15     13478     0.01     0.06  Fprevious_single_property_change
  0.75      16.20     0.13  33462255     0.00     0.00  textget
  0.75      16.32     0.13    228439     0.00     0.00  Fbyte_code
  0.52      16.41     0.09   1978403     0.00     0.00  update_interval
  0.46      16.50     0.08   1101056     0.00     0.00  find_interval
  0.29      16.55     0.05   3613429     0.00     0.00  buf_bytepos_to_charpos
  0.26      16.59     0.04  10137050     0.00     0.00  Fcar
  0.26      16.64     0.04   1154883     0.00     0.00  balance_an_interval
  0.23      16.68     0.04    579275     0.00     0.00  unbind_to
The 23.1 gprof results:
Flat profile:
Each sample counts as 0.01 seconds.
     % cumulative     self               self    total
  time    seconds  seconds     calls   s/call   s/call  name
 29.60       7.03     7.03 406111658     0.00     0.00  char_table_ref
 26.72      13.37     6.34   4653244     0.00     0.00  forw_comment
 11.80      16.17     2.80     32121     0.00     0.00  scan_sexps_forward
  5.98      17.59     1.42     23271     0.00     0.00  re_search_2
  5.27      18.84     1.25     17289     0.00     0.00  scan_lists
  4.13      19.82     0.98  33439981     0.00     0.00  lookup_char_property
  1.94      20.28     0.46  18488739     0.00     0.00  next_interval
  1.73      20.69     0.41  10038210     0.00     0.00  mark_object
  1.26      20.99     0.30   2156799     0.00     0.00  update_syntax_table
  1.20      21.27     0.29                              sub_char_table_ref
  1.10      21.53     0.26  14455134     0.00     0.00  previous_interval
  0.88      21.74     0.21  57391428     0.00     0.00  Fcdr
  0.88      21.95     0.21   3318263     0.00     0.00  re_match_2_internal
  0.72      22.12     0.17  23671456     0.00     0.00  Fassq
  0.67      22.28     0.16    196373     0.00     0.00  Fbyte_code
  0.59      22.42     0.14     13478     0.00     0.00  Fprevious_single_property_change
  0.51      22.54     0.12        38     0.00     0.02  Fgarbage_collect
  0.38      22.63     0.09  33439981     0.00     0.00  textget
  0.38      22.72     0.09  10115583     0.00     0.00  Fcar
[snip]
-----------------------------------------------
                0.05    0.22       612/32121          back_comment [28]
                2.75   11.33     31509/32121          Fparse_partial_sexp [6]
[5]     60.5    2.80   11.55     32121            scan_sexps_forward [5]
                5.17    3.76   3796695/4653244        forw_comment [7]
                1.62    0.00  93379139/406111658      char_table_ref [8]
                0.15    0.86   1092890/2156799        update_syntax_table [11]
                0.00    0.00      5096/10115583       Fcar [46]
                0.00    0.00      6552/57391428       Fcdr [33]
                0.00    0.00       147/952550         Fcons [105]
It can be seen that 23.1 is quite a bit slower, and that it has a lot of
extra calls to char_table_ref.
Are those calls necessary?
* Re: profiling emacs-23.1 vs emacs-22.3
2009-08-03 20:29 profiling emacs-23.1 vs emacs-22.3 Dan Nicolaescu
@ 2009-08-04 17:10 ` Leo
2009-08-04 19:50 ` Chong Yidong
` (2 subsequent siblings)
3 siblings, 0 replies; 17+ messages in thread
From: Leo @ 2009-08-04 17:10 UTC (permalink / raw)
To: emacs-devel
On 2009-08-03 21:29 +0100, Dan Nicolaescu wrote:
> It can be seen that 23.1 is quite a bit slower, and that it has a lot
> of extra calls to char_table_ref. Are those calls necessary?
Thank you for the quantitative results. From experience, there is a
slowdown in Emacs 23.
--
Leo's Emacs uptime: 17 hours, 6 minutes, 51 seconds
* Re: profiling emacs-23.1 vs emacs-22.3
2009-08-03 20:29 profiling emacs-23.1 vs emacs-22.3 Dan Nicolaescu
2009-08-04 17:10 ` Leo
@ 2009-08-04 19:50 ` Chong Yidong
2009-08-04 19:56 ` Dan Nicolaescu
2009-08-05 7:47 ` Dan Nicolaescu
2009-08-24 6:52 ` Kenichi Handa
3 siblings, 1 reply; 17+ messages in thread
From: Chong Yidong @ 2009-08-04 19:50 UTC (permalink / raw)
To: Dan Nicolaescu; +Cc: emacs-devel
Dan Nicolaescu <dann@ics.uci.edu> writes:
> I ran a small test.
> Take src/config.h and run C-x r t on it to insert a space in front of each
> line.
> Then build both emacs-23.1 and 22.3 with profiling enabled (-O2 -g -pg).
> Then run:
> emacs -Q -nw config.h
> M-: (indent-region (point-min)(point-max)) RET
> C-x C-c
>
> It can be seen that 23.1 is quite a bit slower, and that it has a lot of
> extra calls to char_table_ref.
Is there any difference in CVS trunk?
* Re: profiling emacs-23.1 vs emacs-22.3
2009-08-04 19:50 ` Chong Yidong
@ 2009-08-04 19:56 ` Dan Nicolaescu
0 siblings, 0 replies; 17+ messages in thread
From: Dan Nicolaescu @ 2009-08-04 19:56 UTC (permalink / raw)
To: Chong Yidong; +Cc: emacs-devel
Chong Yidong <cyd@stupidchicken.com> writes:
> Dan Nicolaescu <dann@ics.uci.edu> writes:
>
> > I ran a small test.
> > Take src/config.h and run C-x r t on it to insert a space in front of each
> > line.
> > Then build both emacs-23.1 and 22.3 with profiling enabled (-O2 -g -pg).
> > Then run:
> > emacs -Q -nw config.h
> > M-: (indent-region (point-min)(point-max)) RET
> > C-x C-c
> >
> > It can be seen that 23.1 is quite a bit slower, and that it has a lot of
> > extra calls to char_table_ref.
>
> Is there any difference in CVS trunk?
No
* Re: profiling emacs-23.1 vs emacs-22.3
2009-08-03 20:29 profiling emacs-23.1 vs emacs-22.3 Dan Nicolaescu
2009-08-04 17:10 ` Leo
2009-08-04 19:50 ` Chong Yidong
@ 2009-08-05 7:47 ` Dan Nicolaescu
2009-08-24 6:52 ` Kenichi Handa
3 siblings, 0 replies; 17+ messages in thread
From: Dan Nicolaescu @ 2009-08-05 7:47 UTC (permalink / raw)
To: emacs-devel
Here's another data point. This time from running a process (a
recursive grep for setq in the lisp subdirectory):
emacs --batch --eval '(progn (call-process "grep" nil (get-buffer-create "tmp") t "--color=never" "-r" "setq" "/tmp/Emacs-CVS/emacs/lisp/") (set-buffer "tmp") (write-file "/tmp/23"))'
emacs-22.3:
Each sample counts as 0.01 seconds.
     % cumulative     self               self    total
  time    seconds  seconds     calls  ms/call  ms/call  name
 20.59       0.07     0.07       111     0.63     0.63  decode_eol
 14.71       0.12     0.05      5018     0.01     0.01  re_search_2
 14.71       0.17     0.05       138     0.36     1.12  decode_coding
 11.76       0.21     0.04         1    40.00    40.00  find_safe_codings
  8.82       0.24     0.03       224     0.13     0.13  detect_coding_mask
  5.88       0.26     0.02         1    20.00    20.00  unencodable_char_position
  2.94       0.27     0.01   1401155     0.00     0.00  Faref
  2.94       0.28     0.01   1396772     0.00     0.00  translate_char
  2.94       0.29     0.01      6955     0.00     0.00  Fstring_equal
  2.94       0.30     0.01      3180     0.00     0.00  allocate_vector
  2.94       0.31     0.01      2062     0.00     0.00  mark_object
  2.94       0.32     0.01       509     0.02     0.02  analyze_first
  2.94       0.33     0.01       392     0.03     0.03  buf_charpos_to_bytepos
  2.94       0.34     0.01         5     2.00     2.51  Faccessible_keymaps
  0.00       0.34     0.00     84102     0.00     0.00  readchar
  0.00       0.34     0.00     22523     0.00     0.00  re_match_2_internal
[snip]
CVS HEAD:
Flat profile:
Each sample counts as 0.01 seconds.
     % cumulative     self               self    total
  time    seconds  seconds     calls  ms/call  ms/call  name
 18.18       0.06     0.06   2097964     0.00     0.00  mark_object
 15.15       0.11     0.05       966     0.05     0.05  consume_chars
  9.09       0.14     0.03       388     0.08     0.08  encode_coding_raw_text
  9.09       0.17     0.03       219     0.14     0.15  produce_chars
  6.06       0.19     0.02       140     0.14     0.21  detect_coding
  6.06       0.21     0.02        98     0.20     0.20  decode_coding_iso_2022
  6.06       0.23     0.02         1    20.00    20.00  Funencodable_char_position
  3.03       0.24     0.01    377383     0.00     0.00  read_hex
  3.03       0.25     0.01    219810     0.00     0.00  char_table_ref
  3.03       0.26     0.01      8391     0.00     0.00  mem_insert
  3.03       0.27     0.01      7036     0.00     0.00  Fcar
  3.03       0.28     0.01      1472     0.01     0.01  emacs_read
  3.03       0.29     0.01       467     0.02     0.02  Ftext_properties_at
  3.03       0.30     0.01       201     0.05     0.05  gap_left
  3.03       0.31     0.01        64     0.16     0.43  load_charset_map
  3.03       0.32     0.01         4     2.50     2.50  Fkill_buffer
  3.03       0.33     0.01         4     2.50     2.50  detect_coding_iso_2022
  0.00       0.33     0.00    221420     0.00     0.00  char_table_set
  0.00       0.33     0.00    106914     0.00     0.00  readchar
  0.00       0.33     0.00     82427     0.00     0.00  Faref
[snip]
So we have a lot more calls to mark_object. They do not matter much
for performance in this particular case, but it would be
interesting to find out why we get them now.
* Re: profiling emacs-23.1 vs emacs-22.3
2009-08-03 20:29 profiling emacs-23.1 vs emacs-22.3 Dan Nicolaescu
` (2 preceding siblings ...)
2009-08-05 7:47 ` Dan Nicolaescu
@ 2009-08-24 6:52 ` Kenichi Handa
[not found] ` <200908240807.n7O87ubg024643@godzilla.ics.uci.edu>
3 siblings, 1 reply; 17+ messages in thread
From: Kenichi Handa @ 2009-08-24 6:52 UTC (permalink / raw)
To: Dan Nicolaescu; +Cc: emacs-devel
In article <200908032029.n73KTi9h017528@godzilla.ics.uci.edu>, Dan Nicolaescu <dann@ics.uci.edu> writes:
> It can be seen that 23.1 is quite a bit slower, and that it has a lot of
> extra calls to char_table_ref.
> Are those calls necessary?
I found that the syntax of C is mostly defined in the parent of
CURRENT_SYNTAX_TABLE, and thus, in the call of SYNTAX_ENTRY (C),
the optimization for ASCII in this code (in lisp.h) doesn't work.
/* Almost equivalent to Faref (CT, IDX) with optimization for ASCII
characters. Do not check validity of CT. */
#define CHAR_TABLE_REF(CT, IDX) \
((ASCII_CHAR_P (IDX) \
&& SUB_CHAR_TABLE_P (XCHAR_TABLE (CT)->ascii) \
&& !NILP (XSUB_CHAR_TABLE (XCHAR_TABLE (CT)->ascii)->contents[IDX])) \
? XSUB_CHAR_TABLE (XCHAR_TABLE (CT)->ascii)->contents[IDX] \
: char_table_ref ((CT), (IDX)))
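As a rough illustration of why the fast path misses, here is a minimal C sketch. It is not Emacs code: `toy_table`, `toy_slow_ref`, `toy_ref_old`, and `slow_calls` are hypothetical names, 0 stands in for nil, and the table layout is simplified to one flat ASCII array per table.

```c
#include <assert.h>
#include <stddef.h>

#define ASCII_LIMIT 128

/* Toy stand-in for a char-table; 0 plays the role of nil.  */
struct toy_table {
  int ascii[ASCII_LIMIT];          /* per-table ASCII entries, 0 = unset */
  const struct toy_table *parent;  /* inherited table, or NULL */
};

unsigned long slow_calls;          /* how often the fallback ran */

/* Stand-in for char_table_ref: walks up the parent chain.  */
int
toy_slow_ref (const struct toy_table *t, int c)
{
  slow_calls++;
  for (; t != NULL; t = t->parent)
    if (t->ascii[c] != 0)
      return t->ascii[c];
  return 0;
}

/* Shape of the pre-patch CHAR_TABLE_REF: the inline fast path only
   looks at the table itself, never at its parent.  */
int
toy_ref_old (const struct toy_table *t, int c)
{
  assert (c >= 0 && c < ASCII_LIMIT);
  if (t->ascii[c] != 0)
    return t->ascii[c];
  return toy_slow_ref (t, c);
}
```

With a child table that overrides only a few characters, every lookup of a character defined in the parent goes through `toy_slow_ref`, which is the pattern the profile shows for char_table_ref.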
Could you try the attached patch? If it improves the
performance, I'll commit it.
---
Kenichi Handa
handa@m17n.org
--- lisp.h.~1.661.~ 2009-08-21 15:03:39.000000000 +0900
+++ lisp.h 2009-08-24 15:47:02.000000000 +0900
@@ -793,13 +793,37 @@
#define CHAR_TABLE_EXTRA_SLOTS(CT) \
(((CT)->size & PSEUDOVECTOR_SIZE_MASK) - CHAR_TABLE_STANDARD_SLOTS)
+#ifdef __GNUC__
+
+#define CHAR_TABLE_REF_ASCII(CT, IDX) \
+ ({struct Lisp_Char_Table *_tbl = NULL; \
+ Lisp_Object _val; \
+ do { \
+ _tbl = _tbl ? XCHAR_TABLE (_tbl->parent) : XCHAR_TABLE (CT); \
+ _val = (! SUB_CHAR_TABLE_P (_tbl->ascii) ? _tbl->ascii \
+ : XSUB_CHAR_TABLE (_tbl->ascii)->contents[IDX]); \
+ if (NILP (_val)) \
+ _val = _tbl->defalt; \
+ } while (NILP (_val) && ! NILP (_tbl->parent)); \
+ _val; })
+
+#else /* not __GNUC__ */
+
+#define CHAR_TABLE_REF_ASCII(CT, IDX) \
+ (! NILP (XCHAR_TABLE (CT)->ascii) \
+ ? (! SUB_CHAR_TABLE_P (XCHAR_TABLE (CT)->ascii) \
+ ? XCHAR_TABLE (CT)->ascii \
+ : ! NILP (XSUB_CHAR_TABLE (XCHAR_TABLE (CT)->ascii)->contents[IDX]) \
+ ? XSUB_CHAR_TABLE (XCHAR_TABLE (CT)->ascii)->contents[IDX] \
+ : char_table_ref ((CT), (IDX))) \
+ : char_table_ref ((CT), (IDX)))
+
+#endif /* not __GNUC__ */
+
/* Almost equivalent to Faref (CT, IDX) with optimization for ASCII
characters. Do not check validity of CT. */
-#define CHAR_TABLE_REF(CT, IDX) \
- ((ASCII_CHAR_P (IDX) \
- && SUB_CHAR_TABLE_P (XCHAR_TABLE (CT)->ascii) \
- && !NILP (XSUB_CHAR_TABLE (XCHAR_TABLE (CT)->ascii)->contents[IDX])) \
- ? XSUB_CHAR_TABLE (XCHAR_TABLE (CT)->ascii)->contents[IDX] \
+#define CHAR_TABLE_REF(CT, IDX) \
+ (ASCII_CHAR_P (IDX) ? CHAR_TABLE_REF_ASCII ((CT), (IDX)) \
: char_table_ref ((CT), (IDX)))
/* Almost equivalent to Faref (CT, IDX). However, if the result is
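The idea behind the GNU C branch of the patch can be sketched in plain C as follows. This is only a simplified model under stated assumptions: `toy_chartab` and `toy_ref_ascii` are hypothetical names, 0 stands in for nil, and each table holds a flat ASCII array instead of a sub-char-table.

```c
#include <assert.h>
#include <stddef.h>

#define ASCII_LIMIT 128

/* Toy stand-in for a char-table; 0 plays the role of nil.  */
struct toy_chartab {
  int ascii[ASCII_LIMIT];            /* per-table ASCII entries */
  int defalt;                        /* per-table default value */
  const struct toy_chartab *parent;  /* inherited table, or NULL */
};

/* Look up an ASCII character inline: try the table's own entry, then
   its default, and move to the parent only while the value is still
   nil and a parent exists.  */
int
toy_ref_ascii (const struct toy_chartab *t, int c)
{
  int val;

  assert (t != NULL && c >= 0 && c < ASCII_LIMIT);
  for (;;)
    {
      val = t->ascii[c];
      if (val == 0)
        val = t->defalt;
      if (val != 0 || t->parent == NULL)
        return val;
      t = t->parent;
    }
}
```

In this model a character defined only in the parent is resolved without any out-of-line call, and a non-nil default in the child shadows the parent, mirroring the loop in the macro above.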
* Re: profiling emacs-23.1 vs emacs-22.3
[not found] ` <200908240807.n7O87ubg024643@godzilla.ics.uci.edu>
@ 2009-08-24 11:39 ` Kenichi Handa
2009-08-24 18:26 ` Dan Nicolaescu
2009-08-24 22:18 ` Alan Mackenzie
0 siblings, 2 replies; 17+ messages in thread
From: Kenichi Handa @ 2009-08-24 11:39 UTC (permalink / raw)
To: Dan Nicolaescu; +Cc: emacs-devel
In article <200908240807.n7O87ubg024643@godzilla.ics.uci.edu>, Dan Nicolaescu <dann@ics.uci.edu> writes:
> Could you try the attached patch? If it improves the
> It does improve performance:
How much in real time?
[...]
> It's still slower than 22.3 though.
> One big difference is the time/number of calls to mark_object:
> 129733 vs 18834514, so 145 times more calls to mark_object.
> Do you know where those come from?
> The number of Fgarbage_collect calls does not increase that much:
> from 37 (for 22.3) to 43 (for 23.1).
It seems that c-indent-region has been changed a lot. I'm
not sure, but perhaps that is the reason. What do other
people think?
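For reference, the counts quoted above can be put in per-collection terms with a little arithmetic. The helper names `calls_per_gc` and `growth_factor` below are hypothetical; only the four input numbers come from the thread. They work out to roughly 3,500 mark_object calls per collection for 22.3 against roughly 438,000 for 23.1.

```c
#include <assert.h>

/* Average number of marker calls per garbage collection.  */
double
calls_per_gc (unsigned long mark_calls, unsigned long gc_runs)
{
  assert (gc_runs > 0);
  return (double) mark_calls / (double) gc_runs;
}

/* How many times larger NEW_COUNT is than OLD_COUNT.  */
double
growth_factor (unsigned long old_count, unsigned long new_count)
{
  assert (old_count > 0);
  return (double) new_count / (double) old_count;
}
```

Calling `growth_factor (129733, 18834514)` gives a value just above 145, which matches the figure quoted above, while the number of collections grows only from 37 to 43.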
---
Kenichi Handa
handa@m17n.org
* Re: profiling emacs-23.1 vs emacs-22.3
2009-08-24 11:39 ` Kenichi Handa
@ 2009-08-24 18:26 ` Dan Nicolaescu
2009-08-25 6:07 ` Kenichi Handa
2009-08-24 22:18 ` Alan Mackenzie
1 sibling, 1 reply; 17+ messages in thread
From: Dan Nicolaescu @ 2009-08-24 18:26 UTC (permalink / raw)
To: Kenichi Handa; +Cc: emacs-devel
Kenichi Handa <handa@m17n.org> writes:
> In article <200908240807.n7O87ubg024643@godzilla.ics.uci.edu>, Dan Nicolaescu <dann@ics.uci.edu> writes:
>
> > Could you try the attached patch? If it improves the
>
> > It does improve performance:
>
> How much in real time?
From 22 seconds to 19 (vs. about 16 for 22.3).
> [...]
> > It's still slower than 22.3 though.
>
> > One big difference is the time/number of calls to mark_object:
> > 129733 vs 18834514, so 145 times more calls to mark_object.
> > Do you know where those come from?
>
> > The number of Fgarbage_collect calls does not increase that much:
> > from 37 (for 22.3) to 43 (for 23.1).
>
> It seems that c-indent-region has been changed a lot. I'm
> not sure, but perhaps that is the reason. What do other
> people think?
I set the load path to cc-mode from 22.3; it did not make any significant
difference in the number of mark_object and Fgarbage_collect calls.
* Re: profiling emacs-23.1 vs emacs-22.3
2009-08-24 11:39 ` Kenichi Handa
2009-08-24 18:26 ` Dan Nicolaescu
@ 2009-08-24 22:18 ` Alan Mackenzie
1 sibling, 0 replies; 17+ messages in thread
From: Alan Mackenzie @ 2009-08-24 22:18 UTC (permalink / raw)
To: Kenichi Handa; +Cc: Dan Nicolaescu, emacs-devel
On Mon, Aug 24, 2009 at 08:39:47PM +0900, Kenichi Handa wrote:
> In article <200908240807.n7O87ubg024643@godzilla.ics.uci.edu>, Dan Nicolaescu <dann@ics.uci.edu> writes:
> It seems that c-indent-region has been changed a lot. I'm not sure,
> but perhaps that is the reason [for a slowdown]. What do other
> people think?
The change was a bug fix. It shouldn't have slowed down indentation, at
least not by more than epsilon.
> ---
> Kenichi Handa
> handa@m17n.org
--
Alan Mackenzie (Nuremberg, Germany).
* Re: profiling emacs-23.1 vs emacs-22.3
2009-08-24 18:26 ` Dan Nicolaescu
@ 2009-08-25 6:07 ` Kenichi Handa
2009-08-25 18:47 ` Dan Nicolaescu
0 siblings, 1 reply; 17+ messages in thread
From: Kenichi Handa @ 2009-08-25 6:07 UTC (permalink / raw)
To: Dan Nicolaescu; +Cc: emacs-devel
In article <200908241826.n7OIQRhS001092@godzilla.ics.uci.edu>, Dan Nicolaescu <dann@ics.uci.edu> writes:
> Kenichi Handa <handa@m17n.org> writes:
> In article <200908240807.n7O87ubg024643@godzilla.ics.uci.edu>, Dan Nicolaescu <dann@ics.uci.edu> writes:
>
> > Could you try the attached patch? If it improves the
>
> > It does improve performance:
>
> How much in real time?
> from 22 seconds to 19 (vs about 16 for 22.3 )
I see. Then it is worth installing that patch. I've just
done it.
> [...]
> > It's still slower than 22.3 though.
>
> > One big difference is the time/number of calls to mark_object:
> > 129733 vs 18834514, so 145 times more calls to mark_object.
> > Do you know where those come from?
>
> > The number of Fgarbage_collect calls does not increase that much:
> > from 37 (for 22.3) to 43 (for 23.1).
>
> It seems that c-indent-region has been changed a lot. I'm
> not sure, but perhaps that is the reason. What do other
> people think?
> I set the load path to cc-mode from 22.3; it did not make any significant
> difference in the number of mark_object and Fgarbage_collect calls.
OK, then I suspect that the slowness is because of the newly
introduced char-tables. I've also just installed the attached
change to improve the performance of object marking in
GC.
Please try again with the latest code.
---
Kenichi Handa
handa@m17n.org
2009-08-25 Kenichi Handa <handa@m17n.org>
* alloc.c (mark_char_table): New function.
(mark_object): Use mark_char_table for a char-table.
--- alloc.c.~1.448.~ 2009-08-17 21:17:19.000000000 +0900
+++ alloc.c 2009-08-25 15:01:28.000000000 +0900
@@ -5371,6 +5371,34 @@
return 1;
}
+/* Like mark_vectorlike but optimized for char-tables (and
+ sub-char-tables) assuming that the contents are mostly integers or
+ symbols. */
+
+static void
+mark_char_table (ptr)
+ struct Lisp_Vector *ptr;
+{
+ register EMACS_INT size = ptr->size & PSEUDOVECTOR_SIZE_MASK;
+ register int i;
+
+ VECTOR_MARK (ptr);
+ for (i = 0; i < size; i++)
+ {
+ Lisp_Object val = ptr->contents[i];
+
+ if (INTEGERP (val) || SYMBOLP (val) && XSYMBOL (val)->gcmarkbit)
+ continue;
+ if (SUB_CHAR_TABLE_P (val))
+ {
+ if (! VECTOR_MARKED_P (XVECTOR (val)))
+ mark_char_table (XVECTOR (val));
+ }
+ else
+ mark_object (val);
+ }
+}
+
void
mark_object (arg)
Lisp_Object arg;
@@ -5533,6 +5561,11 @@
VECTOR_MARK (XVECTOR (h->key_and_value));
}
}
+ else if (CHAR_TABLE_P (obj))
+ {
+ if (! VECTOR_MARKED_P (XVECTOR (obj)))
+ mark_char_table (XVECTOR (obj));
+ }
else
mark_vectorlike (XVECTOR (obj));
break;
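The effect of the specialized marker can be sketched with a toy model. None of this is Emacs code: `toy_obj`, `toy_mark_object`, `toy_mark_naive`, `toy_mark_table`, and `generic_calls` are hypothetical names, and the object representation is reduced to a kind tag plus a mark bit.

```c
#include <assert.h>
#include <stddef.h>

enum toy_kind { TOY_INT, TOY_SYMBOL, TOY_OTHER };

struct toy_obj {
  enum toy_kind kind;
  int marked;                  /* plays the role of the GC mark bit */
};

unsigned long generic_calls;   /* calls to the generic marker */

/* Stand-in for mark_object: the comparatively expensive generic path.  */
void
toy_mark_object (struct toy_obj *o)
{
  generic_calls++;
  if (o->kind != TOY_INT)
    o->marked = 1;
}

/* Naive marking: hand every slot to the generic marker.  */
void
toy_mark_naive (struct toy_obj **slots, size_t n)
{
  for (size_t i = 0; i < n; i++)
    toy_mark_object (slots[i]);
}

/* Shape of mark_char_table: skip integers and already-marked symbols
   inline, and fall back to the generic marker only for the rest.  */
void
toy_mark_table (struct toy_obj **slots, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      struct toy_obj *o = slots[i];

      if (o->kind == TOY_INT || (o->kind == TOY_SYMBOL && o->marked))
        continue;
      toy_mark_object (o);
    }
}
```

For a table whose slots are mostly integers or already-marked symbols, `toy_mark_table` makes far fewer generic calls than `toy_mark_naive`; slots holding any other kind of object still take the generic path.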
* Re: profiling emacs-23.1 vs emacs-22.3
2009-08-25 6:07 ` Kenichi Handa
@ 2009-08-25 18:47 ` Dan Nicolaescu
2009-08-26 6:01 ` Kenichi Handa
0 siblings, 1 reply; 17+ messages in thread
From: Dan Nicolaescu @ 2009-08-25 18:47 UTC (permalink / raw)
To: Kenichi Handa; +Cc: emacs-devel
Kenichi Handa <handa@m17n.org> writes:
> In article <200908241826.n7OIQRhS001092@godzilla.ics.uci.edu>, Dan Nicolaescu <dann@ics.uci.edu> writes:
>
> > Kenichi Handa <handa@m17n.org> writes:
> > In article <200908240807.n7O87ubg024643@godzilla.ics.uci.edu>, Dan Nicolaescu <dann@ics.uci.edu> writes:
> >
> > > Could you try the attached patch? If it improves the
> >
> > > It does improve performance:
> >
> > How much in real time?
>
> > from 22 seconds to 19 (vs about 16 for 22.3 )
>
> I see. Then it is worth installing that patch. I've just
> done it.
Thanks.
> > [...]
> > > It's still slower than 22.3 though.
> >
> > > One big difference is the time/number of calls to mark_object:
> > > 129733 vs 18834514, so 145 times more calls to mark_object.
> > > Do you know where those come from?
> >
> > > The number of Fgarbage_collect calls does not increase that much:
> > > from 37 (for 22.3) to 43 (for 23.1).
> >
> > It seems that c-indent-region has been changed a lot. I'm
> > not sure, but perhaps that is the reason. What do other
> > people think?
>
> > I set the load path to cc-mode from 22.3; it did not make any significant
> > difference in the number of mark_object and Fgarbage_collect calls.
>
> OK, then I suspect that the slowness is because of the newly
> introduced char-tables. I've also just installed the attached
> change to improve the performance of object marking in
> GC.
>
> Please try again with the latest code.
It's better, but still not too good:
     % cumulative     self               self    total
  time    seconds  seconds     calls   s/call   s/call  name
 43.03       8.55     8.55   4864529     0.00     0.00  forw_comment
 17.16      11.96     3.41     32860     0.00     0.00  scan_sexps_forward
  7.90      13.53     1.57     17306     0.00     0.00  scan_lists
  7.20      14.96     1.43     50443     0.00     0.00  re_search_2
  4.43      15.84     0.88  33382219     0.00     0.00  lookup_char_property
  2.42      16.32     0.48  18504890     0.00     0.00  next_interval
  2.21      16.76     0.44   8177518     0.00     0.00  mark_object
  1.86      17.13     0.37  14382565     0.00     0.00  previous_interval
  1.66      17.46     0.33   3329754     0.00     0.00  re_match_2_internal
  1.26      17.71     0.25   2156399     0.00     0.00  update_syntax_table
  1.16      17.94     0.23     13472     0.00     0.00  Fprevious_single_property_change
  1.01      18.14     0.20  23703332     0.00     0.00  Fassq
  1.01      18.34     0.20  57233041     0.00     0.00  Fcdr
  0.81      18.50     0.16    201989     0.00     0.00  Fbyte_code
  0.70      18.64     0.14        43     0.00     0.02  Fgarbage_collect
  0.45      18.73     0.09   3943360     0.00     0.00  mark_vectorlike
  0.45      18.82     0.09   1976191     0.00     0.00  update_interval
  0.40      18.90     0.08    103511     0.00     0.00  skip_chars
  0.40      18.98     0.08      3111     0.00     0.00  mark_char_table
  0.35      19.05     0.07  33382219     0.00     0.00  textget
  0.25      19.10     0.05   1076559     0.00     0.00  find_interval
  0.25      19.15     0.05    893966     0.00     0.00  re_iswctype
  0.25      19.20     0.05      1294     0.00     0.00  Fsetcar
  0.20      19.24     0.04     41380     0.00     0.00  re_compile_pattern
  0.15      19.27     0.03   3613707     0.00     0.00  buf_bytepos_to_charpos
  0.15      19.30     0.03    589363     0.00     0.00  set_point_both
  0.15      19.33     0.03    329641     0.00     0.00  validate_interval_range
  0.15      19.36     0.03     15287     0.00     0.00  adjust_for_invis_intang
  0.15      19.39     0.03                              interval_start_pos
  0.10      19.41     0.02    987278     0.00     0.00  Fcons
Compare that to doing the same thing in an emacs -Q -nw session:
     % cumulative     self               self    total
  time    seconds  seconds     calls   s/call   s/call  name
 46.55       9.10     9.10   4864529     0.00     0.00  forw_comment
 15.40      12.11     3.01     32860     0.00     0.00  scan_sexps_forward
  7.31      13.54     1.43     17306     0.00     0.00  scan_lists
  7.16      14.94     1.40     27938     0.00     0.00  re_search_2
  4.40      15.80     0.86  33368915     0.00     0.00  lookup_char_property
  2.35      16.26     0.46  18498165     0.00     0.00  next_interval
  1.74      16.60     0.34     13472     0.00     0.00  Fprevious_single_property_change
  1.64      16.92     0.32   3319319     0.00     0.00  re_match_2_internal
  1.53      17.22     0.30   5830028     0.00     0.00  mark_object
  1.53      17.52     0.30   2155779     0.00     0.00  update_syntax_table
  1.48      17.81     0.29  14382565     0.00     0.00  previous_interval
  1.28      18.06     0.25  23687804     0.00     0.00  Fassq
  0.72      18.20     0.14  57187184     0.00     0.00  Fcdr
  0.66      18.33     0.13    198744     0.00     0.00  Fbyte_code
  0.51      18.43     0.10  33368915     0.00     0.00  textget
  0.36      18.50     0.07     15287     0.00     0.00  adjust_for_invis_intang
  0.36      18.57     0.07      2744     0.00     0.00  Frassq
  0.36      18.64     0.07        38     0.00     0.01  Fgarbage_collect
  0.31      18.70     0.06    103511     0.00     0.00  skip_chars
  0.26      18.75     0.05  10044672     0.00     0.00  Fcar
  0.26      18.80     0.05   1976191     0.00     0.00  update_interval
  0.26      18.85     0.05      1862     0.00     0.00  mark_char_table
  0.20      18.89     0.04   1066193     0.00     0.00  find_interval
  0.20      18.93     0.04     73761     0.00     0.00  Flocal_variable_p
  0.20      18.97     0.04     20489     0.00     0.00  re_compile_pattern
  0.20      19.01     0.04      9351     0.00     0.00  back_comment
The -nw -Q session has fewer charsets loaded; the -Q version loads the
charset for BIG5 (and others) at startup even though it's running in a
LANG=C environment.
It seems that the charset data structure is too GC intensive; maybe it's
time to rethink it to make it less so.
* Re: profiling emacs-23.1 vs emacs-22.3
2009-08-25 18:47 ` Dan Nicolaescu
@ 2009-08-26 6:01 ` Kenichi Handa
2009-08-26 6:33 ` Dan Nicolaescu
2009-08-27 6:27 ` Kenichi Handa
0 siblings, 2 replies; 17+ messages in thread
From: Kenichi Handa @ 2009-08-26 6:01 UTC (permalink / raw)
To: Dan Nicolaescu; +Cc: emacs-devel
In article <200908251847.n7PIlb4C012720@godzilla.ics.uci.edu>, Dan Nicolaescu <dann@ics.uci.edu> writes:
> the -nw -Q session has fewer charsets loaded, the -Q version loads the
> charset for BIG5 (and others) at startup even though it's running in a
> LANG=C environment.
Some CJK charsets are necessary to create the default
fontset. I think it's possible to avoid loading unnecessary
charsets for creating the default fontset. I'll work on it.
But I think the main slowness of the -Q version comes from fontsets. A
fontset is a char-table whose elements are vectors, and thus
the just-installed optimization doesn't work.
By the way, I did this test with (1) Emacs-22, (2) trunk
(without gc optimization), and (3) trunk (with gc
optimization).
% time LANG=C emacs -Q -batch --eval '(dotimes (i 100) (garbage-collect))'
The result is:
(1) user 0m1.460s
(2) user 0m1.772s
(3) user 0m1.644s
So, (3) is still 13% slower than (1), but I think it's a
necessary cost for having full Unicode support. And
usually the slowness of GC doesn't directly affect the total
execution time.
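The percentage can be checked from the three user times above. The helper name `slowdown_percent` is hypothetical; the inputs are the measured times. It gives about 12.6% for (3) against (1), which rounds to the 13% quoted, and about 21.4% for (2) against (1).

```c
#include <assert.h>

/* Relative slowdown of SLOW over BASE, in percent.  */
double
slowdown_percent (double base, double slow)
{
  assert (base > 0.0);
  return (slow - base) / base * 100.0;
}
```

For example, `slowdown_percent (1.460, 1.644)` compares Emacs-22 with the trunk build that has the GC optimization.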
> It seems that the charset data structure is too GC intensive, maybe it's
> time to rethink it to make it less so.
For charset data, I agree that it doesn't have to be a
char-table (for encoding) nor a vector (for decoding). I'll
consider replacing them with some C structure.
---
Kenichi Handa
handa@m17n.org
* Re: profiling emacs-23.1 vs emacs-22.3
2009-08-26 6:01 ` Kenichi Handa
@ 2009-08-26 6:33 ` Dan Nicolaescu
2009-08-26 8:06 ` Kenichi Handa
2009-08-27 6:27 ` Kenichi Handa
1 sibling, 1 reply; 17+ messages in thread
From: Dan Nicolaescu @ 2009-08-26 6:33 UTC (permalink / raw)
To: Kenichi Handa; +Cc: emacs-devel
Kenichi Handa <handa@m17n.org> writes:
> In article <200908251847.n7PIlb4C012720@godzilla.ics.uci.edu>, Dan Nicolaescu <dann@ics.uci.edu> writes:
>
> > the -nw -Q session has fewer charsets loaded, the -Q version loads the
> > charset for BIG5 (and others) at startup even though it's running in a
> > LANG=C environment.
>
> Some CJK charsets are necessary to create the default
> fontset. I think it's possible to avoid loading unnecessary
> charsets for creating the default fontset. I'll work on it.
Can the data in the default fontset be in the dumped image in pure
memory? Not having to create it on every startup, and not having to GC
it, should be a win.
> But I think the main slowness of the -Q version comes from fontsets. A
> fontset is a char-table whose elements are vectors, and thus
> the just-installed optimization doesn't work.
>
> By the way, I did this test with (1) Emacs-22, (2) trunk
> (without gc optimization), and (3) trunk (with gc
> optimization).
>
> % time LANG=C emacs -Q -batch --eval '(dotimes (i 100) (garbage-collect))'
>
> The result is:
>
> (1) user 0m1.460s
> (2) user 0m1.772s
> (3) user 0m1.644s
>
> So, (3) is still 13% slower than (1), but I think it's a
> necessary cost for having full Unicode support. And
> usually the slowness of GC doesn't directly affect the total
> execution time.
It does; see the profiling data that I sent. The problem with GC is
that it will flush all the caches if there's too much memory to walk
over. And the slowdown in GC speed is directly related to the big
increase in the number of GCable objects.
If you want to see a bit of this, please try this simple
instrumentation (it should apply to both 22.3 and CVS trunk):
Index: alloc.c
===================================================================
RCS file: /cvsroot/emacs/emacs/src/alloc.c,v
retrieving revision 1.449
diff -u -3 -p -u -p -r1.449 alloc.c
--- alloc.c 25 Aug 2009 06:03:09 -0000 1.449
+++ alloc.c 26 Aug 2009 06:28:36 -0000
@@ -4963,6 +4963,7 @@ inhibit_garbage_collection ()
return count;
}
+int call_count_mark_object = 0;
DEFUN ("garbage-collect", Fgarbage_collect, Sgarbage_collect, 0, 0, "",
doc: /* Reclaim storage for Lisp objects no longer needed.
@@ -4986,6 +4987,9 @@ returns nil, because real GC can't be do
Lisp_Object total[8];
int count = SPECPDL_INDEX ();
EMACS_TIME t1, t2, t3;
+ int bef = call_count_mark_object;
+
+ /* fprintf (stderr, "bef gc mark_object count = %d\n", call_count_mark_object); */
if (abort_on_gc)
abort ();
@@ -5278,6 +5282,8 @@ returns nil, because real GC can't be do
EMACS_USECS (t3) * 1.0e-6);
gcs_done++;
+ fprintf (stderr, "after gc mark_object delta = %d\n", call_count_mark_object - bef);
+
return Flist (sizeof total / sizeof *total, total);
}
@@ -5410,6 +5416,8 @@ mark_object (arg)
#endif
int cdr_count = 0;
+ call_count_mark_object++;
+
loop:
if (PURE_POINTER_P (XPNTR (obj)))
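The counting technique in the patch above can be shown in isolation. This is a minimal sketch with hypothetical names (`call_count`, `instrumented`, `sample_pass`, `count_calls_during`), not part of alloc.c: a global counter is bumped in the hot function, and the caller records the value before a pass and reports the difference afterwards.

```c
#include <assert.h>

long call_count;   /* bumped on every call to the instrumented function */

/* Stand-in for the function being counted (mark_object in the patch).  */
void
instrumented (void)
{
  call_count++;
}

/* Stand-in for one pass of work (one garbage collection in the patch).  */
void
sample_pass (void)
{
  for (int i = 0; i < 3; i++)
    instrumented ();
}

/* Run WORK once and return how many instrumented calls it made.  */
long
count_calls_during (void (*work) (void))
{
  long before;

  assert (work != 0);
  before = call_count;
  work ();
  return call_count - before;
}
```

Reporting the delta rather than the running total gives a per-pass figure, which is what makes the numbers comparable between 22.3 and trunk.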
> > It seems that the charset data structure is too GC intensive, maybe it's
> > time to rethink it to make it less so.
>
> For charset data, I agree that it doesn't have to be a
> char-table (for encoding) nor a vector (for decoding). I'll
> consider replacing them with some C structure.
Great, thanks!
* Re: profiling emacs-23.1 vs emacs-22.3
2009-08-26 6:33 ` Dan Nicolaescu
@ 2009-08-26 8:06 ` Kenichi Handa
2009-08-26 20:46 ` Andreas Schwab
0 siblings, 1 reply; 17+ messages in thread
From: Kenichi Handa @ 2009-08-26 8:06 UTC (permalink / raw)
To: Dan Nicolaescu; +Cc: emacs-devel
In article <200908260633.n7Q6XRXi018187@godzilla.ics.uci.edu>, Dan Nicolaescu <dann@ics.uci.edu> writes:
> Some CJK charsets are necessary to create the default
> fontset. I think it's possible to avoid loading unnecessary
> charsets for creating the default fontset. I'll work on it.
> Can the data in the default fontset be in the dumped image in pure
> memory? To not have to create it every time on start up and GC it
> should be a win.
Is it possible to put the whole char-table in pure space,
then modify some part at running time?
> So, (3) is still 13% slower than (1), but I think it's a
> necessary cost for having full Unicode support. And
> usually the slowness of GC doesn't directly affect the total
> execution time.
> It does, see the profiling data that I sent. The problem with GC is
> that it will flush all the caches if there's too much memory to walk
> over. And the slowdown in GC speed is directly related to the big
> increase in the number of GCable objects.
If most of the execution time is spent in GC, shouldn't such a
task run with a bigger gc-cons-threshold?
But, I tried your test case of indent-region with different
gc-cons-threshold (gct), and the result shows that GC is not
the main culprit of the slowness.
emacs-22 (gct=400000):    16 to 17 sec
emacs-22 (gct=10000000):  16 to 17 sec
trunk    (gct=400000):    20 to 21 sec
trunk    (gct=10000000):  19 to 20 sec
---
Kenichi Handa
handa@m17n.org
* Re: profiling emacs-23.1 vs emacs-22.3
2009-08-26 8:06 ` Kenichi Handa
@ 2009-08-26 20:46 ` Andreas Schwab
2009-08-27 2:02 ` Kenichi Handa
0 siblings, 1 reply; 17+ messages in thread
From: Andreas Schwab @ 2009-08-26 20:46 UTC (permalink / raw)
To: Kenichi Handa; +Cc: Dan Nicolaescu, emacs-devel
Kenichi Handa <handa@m17n.org> writes:
> Is it possible to put the whole char-table in pure space,
> then modify some part at running time?
Only by copying it. If you try to modify it you'll get a pure write
error.
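As a toy model of that constraint (not the Emacs implementation): `toy_pure_table`, `toy_set`, and `toy_copy_writable` are hypothetical names, and the pure write error is modeled as a -1 return value instead of a signaled error.

```c
#include <assert.h>
#include <string.h>

#define TOY_SLOTS 128

struct toy_pure_table {
  int read_only;            /* 1 = lives in the read-only ("pure") area */
  int slots[TOY_SLOTS];
};

/* Store VAL at IDX.  Returns 0 on success, -1 if the table is
   read-only (the stand-in for a pure write error).  */
int
toy_set (struct toy_pure_table *t, int idx, int val)
{
  assert (idx >= 0 && idx < TOY_SLOTS);
  if (t->read_only)
    return -1;
  t->slots[idx] = val;
  return 0;
}

/* Copy SRC into DST and make the copy writable.  */
void
toy_copy_writable (struct toy_pure_table *dst,
                   const struct toy_pure_table *src)
{
  memcpy (dst, src, sizeof *dst);
  dst->read_only = 0;
}
```

The copy here is flat because the toy table has no nested parts; a real char-table has sub-tables, so a writable copy would have to duplicate every level that might be modified.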
Andreas.
--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
* Re: profiling emacs-23.1 vs emacs-22.3
2009-08-26 20:46 ` Andreas Schwab
@ 2009-08-27 2:02 ` Kenichi Handa
0 siblings, 0 replies; 17+ messages in thread
From: Kenichi Handa @ 2009-08-27 2:02 UTC (permalink / raw)
To: Andreas Schwab; +Cc: dann, emacs-devel
In article <m2fxbecngn.fsf@igel.home>, Andreas Schwab <schwab@linux-m68k.org> writes:
> Kenichi Handa <handa@m17n.org> writes:
> > Is it possible to put the whole char-table in pure space,
> > then modify some part at running time?
> Only by copying it. If you try to modify it you'll get a pure write
> error.
If it means that we must deep-copy the whole char-table to
modify only some part of it, I'd like to find some other
way.
---
Kenichi Handa
handa@m17n.org
* Re: profiling emacs-23.1 vs emacs-22.3
2009-08-26 6:01 ` Kenichi Handa
2009-08-26 6:33 ` Dan Nicolaescu
@ 2009-08-27 6:27 ` Kenichi Handa
1 sibling, 0 replies; 17+ messages in thread
From: Kenichi Handa @ 2009-08-27 6:27 UTC (permalink / raw)
To: Kenichi Handa; +Cc: dann, emacs-devel
In article <E1MgBZt-0006U4-9O@etlken>, Kenichi Handa <handa@m17n.org> writes:
> Some CJK charsets are necessary to create the default
> fontset. I think it's possible to avoid loading unnecessary
> charsets for creating the default fontset. I'll work on it.
I've just installed a fix for fontset.el.
---
Kenichi Handa
handa@m17n.org