* profiling emacs-23.1 vs emacs-22.3 @ 2009-08-03 20:29 Dan Nicolaescu 2009-08-04 17:10 ` Leo ` (3 more replies) 0 siblings, 4 replies; 17+ messages in thread From: Dan Nicolaescu @ 2009-08-03 20:29 UTC (permalink / raw) To: emacs-devel I run a small test. Take src/config.h run C-x r t on it to insert a space in front of each line. Then build both emacs-23.1 and 22.3 with profiling enabled (-O2 -g -pg). Then run: emacs -Q -nw config.h M-: (indent-region (point-min)(point-max)) RET C-x C-c The 22.3 gprof results: Flat profile: Each sample counts as 0.01 seconds. % cumulative self self total time seconds seconds calls ms/call ms/call name 41.29 7.13 7.13 4653244 0.00 0.00 forw_comment 16.10 9.91 2.78 32473 0.09 0.31 scan_sexps_forward 9.09 11.48 1.57 15132 0.10 0.15 re_search_2 5.62 12.45 0.97 17286 0.06 0.15 scan_lists 5.59 13.41 0.96 33462255 0.00 0.00 lookup_char_property 3.36 13.99 0.58 18478958 0.00 0.00 next_interval 2.14 14.37 0.37 129733 0.00 0.00 mark_object 1.80 14.68 0.31 137020 0.00 0.00 skip_chars 1.74 14.97 0.30 14459993 0.00 0.00 previous_interval 1.62 15.26 0.28 57446793 0.00 0.00 Fcdr 1.36 15.49 0.23 2156082 0.00 0.00 update_syntax_table 1.27 15.71 0.22 3315615 0.00 0.00 re_match_2_internal 1.19 15.91 0.20 23687203 0.00 0.00 Fassq 0.87 16.07 0.15 13478 0.01 0.06 Fprevious_single_property_change 0.75 16.20 0.13 33462255 0.00 0.00 textget 0.75 16.32 0.13 228439 0.00 0.00 Fbyte_code 0.52 16.41 0.09 1978403 0.00 0.00 update_interval 0.46 16.50 0.08 1101056 0.00 0.00 find_interval 0.29 16.55 0.05 3613429 0.00 0.00 buf_bytepos_to_charpos 0.26 16.59 0.04 10137050 0.00 0.00 Fcar 0.26 16.64 0.04 1154883 0.00 0.00 balance_an_interval 0.23 16.68 0.04 579275 0.00 0.00 unbind_to The 23.1 gprof results: Flat profile: Each sample counts as 0.01 seconds. % cumulative self self total time seconds seconds calls s/call s/call name 29.60 7.03 7.03 406111658 0.00 0.00 char_table_ref 26.72 13.37 6.34 4653244 0.00 0.00 forw_comment 11.80 16.17 2.80 32121 0.00 0.00 scan_sexps_forward 5.98 17.59 1.42 23271 0.00 0.00 re_search_2 5.27 18.84 1.25 17289 0.00 0.00 scan_lists 4.13 19.82 0.98 33439981 0.00 0.00 lookup_char_property 1.94 20.28 0.46 18488739 0.00 0.00 next_interval 1.73 20.69 0.41 10038210 0.00 0.00 mark_object 1.26 20.99 0.30 2156799 0.00 0.00 update_syntax_table 1.20 21.27 0.29 sub_char_table_ref 1.10 21.53 0.26 14455134 0.00 0.00 previous_interval 0.88 21.74 0.21 57391428 0.00 0.00 Fcdr 0.88 21.95 0.21 3318263 0.00 0.00 re_match_2_internal 0.72 22.12 0.17 23671456 0.00 0.00 Fassq 0.67 22.28 0.16 196373 0.00 0.00 Fbyte_code 0.59 22.42 0.14 13478 0.00 0.00 Fprevious_single_property_change 0.51 22.54 0.12 38 0.00 0.02 Fgarbage_collect 0.38 22.63 0.09 33439981 0.00 0.00 textget 0.38 22.72 0.09 10115583 0.00 0.00 Fcar [snip] ----------------------------------------------- 0.05 0.22 612/32121 back_comment [28] 2.75 11.33 31509/32121 Fparse_partial_sexp [6] [5] 60.5 2.80 11.55 32121 scan_sexps_forward [5] 5.17 3.76 3796695/4653244 forw_comment [7] 1.62 0.00 93379139/406111658 char_table_ref [8] 0.15 0.86 1092890/2156799 update_syntax_table [11] 0.00 0.00 5096/10115583 Fcar [46] 0.00 0.00 6552/57391428 Fcdr [33] 0.00 0.00 147/952550 Fcons [105] It can be seen that 23.1 is quite a bit slower, and that it has a lot of extra calls to char_table_ref. Are those calls necessary? ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: profiling emacs-23.1 vs emacs-22.3 2009-08-03 20:29 profiling emacs-23.1 vs emacs-22.3 Dan Nicolaescu @ 2009-08-04 17:10 ` Leo 2009-08-04 19:50 ` Chong Yidong ` (2 subsequent siblings) 3 siblings, 0 replies; 17+ messages in thread From: Leo @ 2009-08-04 17:10 UTC (permalink / raw) To: emacs-devel On 2009-08-03 21:29 +0100, Dan Nicolaescu wrote: > It can be seen that 23.1 is quite a bit slower, and that it has a lot > of extra calls to char_table_ref. Are those calls necessary? Thank you for the quantitative results. From experience, there is slowdown in Emacs 23. -- Leo's Emacs uptime: 17 hours, 6 minutes, 51 seconds ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: profiling emacs-23.1 vs emacs-22.3 2009-08-03 20:29 profiling emacs-23.1 vs emacs-22.3 Dan Nicolaescu 2009-08-04 17:10 ` Leo @ 2009-08-04 19:50 ` Chong Yidong 2009-08-04 19:56 ` Dan Nicolaescu 2009-08-05 7:47 ` Dan Nicolaescu 2009-08-24 6:52 ` Kenichi Handa 3 siblings, 1 reply; 17+ messages in thread From: Chong Yidong @ 2009-08-04 19:50 UTC (permalink / raw) To: Dan Nicolaescu; +Cc: emacs-devel Dan Nicolaescu <dann@ics.uci.edu> writes: > I run a small test. > Take src/config.h run C-x r t on it to insert a space in front of each > line. > Then build both emacs-23.1 and 22.3 with profiling enabled (-O2 -g -pg). > Then run: > emacs -Q -nw config.h > M-: (indent-region (point-min)(point-max)) RET > C-x C-c > > It can be seen that 23.1 is quite a bit slower, and that it has a lot of > extra calls to char_table_ref. Is there any difference in CVS trunk? ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: profiling emacs-23.1 vs emacs-22.3 2009-08-04 19:50 ` Chong Yidong @ 2009-08-04 19:56 ` Dan Nicolaescu 0 siblings, 0 replies; 17+ messages in thread From: Dan Nicolaescu @ 2009-08-04 19:56 UTC (permalink / raw) To: Chong Yidong; +Cc: emacs-devel Chong Yidong <cyd@stupidchicken.com> writes: > Dan Nicolaescu <dann@ics.uci.edu> writes: > > > I run a small test. > > Take src/config.h run C-x r t on it to insert a space in front of each > > line. > > Then build both emacs-23.1 and 22.3 with profiling enabled (-O2 -g -pg). > > Then run: > > emacs -Q -nw config.h > > M-: (indent-region (point-min)(point-max)) RET > > C-x C-c > > > > It can be seen that 23.1 is quite a bit slower, and that it has a lot of > > extra calls to char_table_ref. > > Is there any difference in CVS trunk? No ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: profiling emacs-23.1 vs emacs-22.3 2009-08-03 20:29 profiling emacs-23.1 vs emacs-22.3 Dan Nicolaescu 2009-08-04 17:10 ` Leo 2009-08-04 19:50 ` Chong Yidong @ 2009-08-05 7:47 ` Dan Nicolaescu 2009-08-24 6:52 ` Kenichi Handa 3 siblings, 0 replies; 17+ messages in thread From: Dan Nicolaescu @ 2009-08-05 7:47 UTC (permalink / raw) To: emacs-devel Here's another data point. This time from running a process (a recursive grep for setq in the lisp subdirectory): emacs --batch --eval '(progn (call-process "grep" nil (get-buffer-create "tmp") t "--color=never" "-r" "setq" "/tmp/Emacs-CVS/emacs/lisp/") (set-buffer "tmp") (write-file "/tmp/23"))' emacs-22.3: Each sample counts as 0.01 seconds. % cumulative self self total time seconds seconds calls ms/call ms/call name 20.59 0.07 0.07 111 0.63 0.63 decode_eol 14.71 0.12 0.05 5018 0.01 0.01 re_search_2 14.71 0.17 0.05 138 0.36 1.12 decode_coding 11.76 0.21 0.04 1 40.00 40.00 find_safe_codings 8.82 0.24 0.03 224 0.13 0.13 detect_coding_mask 5.88 0.26 0.02 1 20.00 20.00 unencodable_char_position 2.94 0.27 0.01 1401155 0.00 0.00 Faref 2.94 0.28 0.01 1396772 0.00 0.00 translate_char 2.94 0.29 0.01 6955 0.00 0.00 Fstring_equal 2.94 0.30 0.01 3180 0.00 0.00 allocate_vector 2.94 0.31 0.01 2062 0.00 0.00 mark_object 2.94 0.32 0.01 509 0.02 0.02 analyze_first 2.94 0.33 0.01 392 0.03 0.03 buf_charpos_to_bytepos 2.94 0.34 0.01 5 2.00 2.51 Faccessible_keymaps 0.00 0.34 0.00 84102 0.00 0.00 readchar 0.00 0.34 0.00 22523 0.00 0.00 re_match_2_internal [snip] CVS HEAD: Flat profile: Each sample counts as 0.01 seconds. % cumulative self self total time seconds seconds calls ms/call ms/call name 18.18 0.06 0.06 2097964 0.00 0.00 mark_object 15.15 0.11 0.05 966 0.05 0.05 consume_chars 9.09 0.14 0.03 388 0.08 0.08 encode_coding_raw_text 9.09 0.17 0.03 219 0.14 0.15 produce_chars 6.06 0.19 0.02 140 0.14 0.21 detect_coding 6.06 0.21 0.02 98 0.20 0.20 decode_coding_iso_2022 6.06 0.23 0.02 1 20.00 20.00 Funencodable_char_position 3.03 0.24 0.01 377383 0.00 0.00 read_hex 3.03 0.25 0.01 219810 0.00 0.00 char_table_ref 3.03 0.26 0.01 8391 0.00 0.00 mem_insert 3.03 0.27 0.01 7036 0.00 0.00 Fcar 3.03 0.28 0.01 1472 0.01 0.01 emacs_read 3.03 0.29 0.01 467 0.02 0.02 Ftext_properties_at 3.03 0.30 0.01 201 0.05 0.05 gap_left 3.03 0.31 0.01 64 0.16 0.43 load_charset_map 3.03 0.32 0.01 4 2.50 2.50 Fkill_buffer 3.03 0.33 0.01 4 2.50 2.50 detect_coding_iso_2022 0.00 0.33 0.00 221420 0.00 0.00 char_table_set 0.00 0.33 0.00 106914 0.00 0.00 readchar 0.00 0.33 0.00 82427 0.00 0.00 Faref [snip] so we have a lot more calls to mark_object. They do not matter much from the performance point of view in this particular case, but it should be interesting to find out why we get them now. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: profiling emacs-23.1 vs emacs-22.3 2009-08-03 20:29 profiling emacs-23.1 vs emacs-22.3 Dan Nicolaescu ` (2 preceding siblings ...) 2009-08-05 7:47 ` Dan Nicolaescu @ 2009-08-24 6:52 ` Kenichi Handa [not found] ` <200908240807.n7O87ubg024643@godzilla.ics.uci.edu> 3 siblings, 1 reply; 17+ messages in thread From: Kenichi Handa @ 2009-08-24 6:52 UTC (permalink / raw) To: Dan Nicolaescu; +Cc: emacs-devel In article <200908032029.n73KTi9h017528@godzilla.ics.uci.edu>, Dan Nicolaescu <dann@ics.uci.edu> writes: > It can be seen that 23.1 is quite a bit slower, and that it has a lot of > extra calls to char_table_ref. > Are those calls necessary? I found that the syntax of C is mostly defined in the parent of CURRENT_SYNTAX_TABLE, and thus, in the call of SYNTAX_ENTRY (C), the optimization for ASCII in this code (in lisp.h) doesn't work. /* Almost equivalent to Faref (CT, IDX) with optimization for ASCII characters. Do not check validity of CT. */ #define CHAR_TABLE_REF(CT, IDX) \ ((ASCII_CHAR_P (IDX) \ && SUB_CHAR_TABLE_P (XCHAR_TABLE (CT)->ascii) \ && !NILP (XSUB_CHAR_TABLE (XCHAR_TABLE (CT)->ascii)->contents[IDX])) \ ? XSUB_CHAR_TABLE (XCHAR_TABLE (CT)->ascii)->contents[IDX] \ : char_table_ref ((CT), (IDX))) Could you try the attached patch? If it improves the performance, I'll commit it. --- Kenichi Handa handa@m17n.org --- lisp.h.~1.661.~ 2009-08-21 15:03:39.000000000 +0900 +++ lisp.h 2009-08-24 15:47:02.000000000 +0900 @@ -793,13 +793,37 @@ #define CHAR_TABLE_EXTRA_SLOTS(CT) \ (((CT)->size & PSEUDOVECTOR_SIZE_MASK) - CHAR_TABLE_STANDARD_SLOTS) +#ifdef __GNUC__ + +#define CHAR_TABLE_REF_ASCII(CT, IDX) \ + ({struct Lisp_Char_Table *_tbl = NULL; \ + Lisp_Object _val; \ + do { \ + _tbl = _tbl ? XCHAR_TABLE (_tbl->parent) : XCHAR_TABLE (CT); \ + _val = (! SUB_CHAR_TABLE_P (_tbl->ascii) ? _tbl->ascii \ + : XSUB_CHAR_TABLE (_tbl->ascii)->contents[IDX]); \ + if (NILP (_val)) \ + _val = _tbl->defalt; \ + } while (NILP (_val) && ! NILP (_tbl->parent)); \ + _val; }) + +#else /* not __GNUC__ */ + +#define CHAR_TABLE_REF_ASCII(CT, IDX) \ + (! NILP (XCHAR_TABLE (CT)->ascii) \ + ? (! SUB_CHAR_TABLE_P (XCHAR_TABLE (CT)->ascii) \ + ? XCHAR_TABLE (CT)->ascii \ + : ! NILP (XSUB_CHAR_TABLE (XCHAR_TABLE (CT)->ascii)->contents[IDX]) \ + ? XSUB_CHAR_TABLE (XCHAR_TABLE (CT)->ascii)->contents[IDX] \ + : char_table_ref ((CT), (IDX))) \ + : char_table_ref ((CT), (IDX))) + +#endif /* not __GNUC__ */ + /* Almost equivalent to Faref (CT, IDX) with optimization for ASCII characters. Do not check validity of CT. */ -#define CHAR_TABLE_REF(CT, IDX) \ - ((ASCII_CHAR_P (IDX) \ - && SUB_CHAR_TABLE_P (XCHAR_TABLE (CT)->ascii) \ - && !NILP (XSUB_CHAR_TABLE (XCHAR_TABLE (CT)->ascii)->contents[IDX])) \ - ? XSUB_CHAR_TABLE (XCHAR_TABLE (CT)->ascii)->contents[IDX] \ +#define CHAR_TABLE_REF(CT, IDX) \ + (ASCII_CHAR_P (IDX) ? CHAR_TABLE_REF_ASCII ((CT), (IDX)) \ : char_table_ref ((CT), (IDX))) /* Almost equivalent to Faref (CT, IDX). However, if the result is ^ permalink raw reply [flat|nested] 17+ messages in thread
[parent not found: <200908240807.n7O87ubg024643@godzilla.ics.uci.edu>]
* Re: profiling emacs-23.1 vs emacs-22.3 [not found] ` <200908240807.n7O87ubg024643@godzilla.ics.uci.edu> @ 2009-08-24 11:39 ` Kenichi Handa 2009-08-24 18:26 ` Dan Nicolaescu 2009-08-24 22:18 ` Alan Mackenzie 0 siblings, 2 replies; 17+ messages in thread From: Kenichi Handa @ 2009-08-24 11:39 UTC (permalink / raw) To: Dan Nicolaescu; +Cc: emacs-devel In article <200908240807.n7O87ubg024643@godzilla.ics.uci.edu>, Dan Nicolaescu <dann@ics.uci.edu> writes: > Could you try the attached patch? If it improves the > It does improve performance: How much in real time? [...] > It's still slower than 22.3 though. > One big difference is then time/number of calls to mark_objects > 129733 vs 18834514, so 145 times more calls to mark_object. > Do you know where do those come from? > The number of Fgarbage_collect calls does not increase that much: > from 37 (for 22.3) to 43 (for 23.1). It seems that c-indent-region has been changed a lot. I'm not sure, but perhaps that is the reason. How do the other people think? --- Kenichi Handa handa@m17n.org ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: profiling emacs-23.1 vs emacs-22.3 2009-08-24 11:39 ` Kenichi Handa @ 2009-08-24 18:26 ` Dan Nicolaescu 2009-08-25 6:07 ` Kenichi Handa 2009-08-24 22:18 ` Alan Mackenzie 1 sibling, 1 reply; 17+ messages in thread From: Dan Nicolaescu @ 2009-08-24 18:26 UTC (permalink / raw) To: Kenichi Handa; +Cc: emacs-devel Kenichi Handa <handa@m17n.org> writes: > In article <200908240807.n7O87ubg024643@godzilla.ics.uci.edu>, Dan Nicolaescu <dann@ics.uci.edu> writes: > > > Could you try the attached patch? If it improves the > > > It does improve performance: > > How much in real time? from 22 seconds to 19 (vs about 16 for 22.3 ) > [...] > > It's still slower than 22.3 though. > > > One big difference is then time/number of calls to mark_objects > > 129733 vs 18834514, so 145 times more calls to mark_object. > > Do you know where do those come from? > > > The number of Fgarbage_collect calls does not increase that much: > > from 37 (for 22.3) to 43 (for 23.1). > > It seems that c-indent-region has been changed a lot. I'm > not sure, but perhaps that is the reason. How do the other > people think? I set the load path to cc-mode from 22.3, it did not make any significant difference in the amount of mark_object and Fgarbage_collect calls. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: profiling emacs-23.1 vs emacs-22.3 2009-08-24 18:26 ` Dan Nicolaescu @ 2009-08-25 6:07 ` Kenichi Handa 2009-08-25 18:47 ` Dan Nicolaescu 0 siblings, 1 reply; 17+ messages in thread From: Kenichi Handa @ 2009-08-25 6:07 UTC (permalink / raw) To: Dan Nicolaescu; +Cc: emacs-devel In article <200908241826.n7OIQRhS001092@godzilla.ics.uci.edu>, Dan Nicolaescu <dann@ics.uci.edu> writes: > Kenichi Handa <handa@m17n.org> writes: > In article <200908240807.n7O87ubg024643@godzilla.ics.uci.edu>, Dan Nicolaescu <dann@ics.uci.edu> writes: > > > Could you try the attached patch? If it improves the > > > It does improve performance: > > How much in real time? > from 22 seconds to 19 (vs about 16 for 22.3 ) I see. Then it is worth installing that patch. I've just done it. > [...] > > It's still slower than 22.3 though. > > > One big difference is then time/number of calls to mark_objects > > 129733 vs 18834514, so 145 times more calls to mark_object. > > Do you know where do those come from? > > > The number of Fgarbage_collect calls does not increase that much: > > from 37 (for 22.3) to 43 (for 23.1). > > It seems that c-indent-region has been changed a lot. I'm > not sure, but perhaps that is the reason. How do the other > people think? > I set the load path to cc-mode from 22.3, it did not make any significant > difference in the amount of mark_object and Fgarbage_collect calls. Ok, then I suspect that the slowness is because of newly introduced char-tables. I've just installed the attached change too to improve the performance of object marking in GC. Please try again with the latest code. --- Kenichi Handa handa@m17n.org 2009-08-25 Kenichi Handa <handa@m17n.org> * alloc.c (mark_char_table): New function. (mark_object): Use mark_char_table for a char-table. --- alloc.c.~1.448.~ 2009-08-17 21:17:19.000000000 +0900 +++ alloc.c 2009-08-25 15:01:28.000000000 +0900 @@ -5371,6 +5371,34 @@ return 1; } +/* Like mark_vectorlike but optimized for char-tables (and + sub-char-tables) assuming that the contents are mostly integers or + symbols. */ + +static void +mark_char_table (ptr) + struct Lisp_Vector *ptr; +{ + register EMACS_INT size = ptr->size & PSEUDOVECTOR_SIZE_MASK; + register int i; + + VECTOR_MARK (ptr); + for (i = 0; i < size; i++) + { + Lisp_Object val = ptr->contents[i]; + + if (INTEGERP (val) || SYMBOLP (val) && XSYMBOL (val)->gcmarkbit) + continue; + if (SUB_CHAR_TABLE_P (val)) + { + if (! VECTOR_MARKED_P (XVECTOR (val))) + mark_char_table (XVECTOR (val)); + } + else + mark_object (val); + } +} + void mark_object (arg) Lisp_Object arg; @@ -5533,6 +5561,11 @@ VECTOR_MARK (XVECTOR (h->key_and_value)); } } + else if (CHAR_TABLE_P (obj)) + { + if (! VECTOR_MARKED_P (XVECTOR (obj))) + mark_char_table (XVECTOR (obj)); + } else mark_vectorlike (XVECTOR (obj)); break; ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: profiling emacs-23.1 vs emacs-22.3 2009-08-25 6:07 ` Kenichi Handa @ 2009-08-25 18:47 ` Dan Nicolaescu 2009-08-26 6:01 ` Kenichi Handa 0 siblings, 1 reply; 17+ messages in thread From: Dan Nicolaescu @ 2009-08-25 18:47 UTC (permalink / raw) To: Kenichi Handa; +Cc: emacs-devel Kenichi Handa <handa@m17n.org> writes: > In article <200908241826.n7OIQRhS001092@godzilla.ics.uci.edu>, Dan Nicolaescu <dann@ics.uci.edu> writes: > > > Kenichi Handa <handa@m17n.org> writes: > > In article <200908240807.n7O87ubg024643@godzilla.ics.uci.edu>, Dan Nicolaescu <dann@ics.uci.edu> writes: > > > > > Could you try the attached patch? If it improves the > > > > > It does improve performance: > > > > How much in real time? > > > from 22 seconds to 19 (vs about 16 for 22.3 ) > > I see. Then it is worth installing that patch. I've just > done it. Thanks. > > [...] > > > It's still slower than 22.3 though. > > > > > One big difference is then time/number of calls to mark_objects > > > 129733 vs 18834514, so 145 times more calls to mark_object. > > > Do you know where do those come from? > > > > > The number of Fgarbage_collect calls does not increase that much: > > > from 37 (for 22.3) to 43 (for 23.1). > > > > It seems that c-indent-region has been changed a lot. I'm > > not sure, but perhaps that is the reason. How do the other > > people think? > > > I set the load path to cc-mode from 22.3, it did not make any significant > > difference in the amount of mark_object and Fgarbage_collect calls. > > Ok, then I suspect that the slowness is because of newly > introduced char-tables. I've just installed the attached > change too to improve the performance of object marking in > GC. > > Please try again with the latest code. It's better, but still not too good: % cumulative self self total time seconds seconds calls s/call s/call name 43.03 8.55 8.55 4864529 0.00 0.00 forw_comment 17.16 11.96 3.41 32860 0.00 0.00 scan_sexps_forward 7.90 13.53 1.57 17306 0.00 0.00 scan_lists 7.20 14.96 1.43 50443 0.00 0.00 re_search_2 4.43 15.84 0.88 33382219 0.00 0.00 lookup_char_property 2.42 16.32 0.48 18504890 0.00 0.00 next_interval 2.21 16.76 0.44 8177518 0.00 0.00 mark_object 1.86 17.13 0.37 14382565 0.00 0.00 previous_interval 1.66 17.46 0.33 3329754 0.00 0.00 re_match_2_internal 1.26 17.71 0.25 2156399 0.00 0.00 update_syntax_table 1.16 17.94 0.23 13472 0.00 0.00 Fprevious_single_property_change 1.01 18.14 0.20 23703332 0.00 0.00 Fassq 1.01 18.34 0.20 57233041 0.00 0.00 Fcdr 0.81 18.50 0.16 201989 0.00 0.00 Fbyte_code 0.70 18.64 0.14 43 0.00 0.02 Fgarbage_collect 0.45 18.73 0.09 3943360 0.00 0.00 mark_vectorlike 0.45 18.82 0.09 1976191 0.00 0.00 update_interval 0.40 18.90 0.08 103511 0.00 0.00 skip_chars 0.40 18.98 0.08 3111 0.00 0.00 mark_char_table 0.35 19.05 0.07 33382219 0.00 0.00 textget 0.25 19.10 0.05 1076559 0.00 0.00 find_interval 0.25 19.15 0.05 893966 0.00 0.00 re_iswctype 0.25 19.20 0.05 1294 0.00 0.00 Fsetcar 0.20 19.24 0.04 41380 0.00 0.00 re_compile_pattern 0.15 19.27 0.03 3613707 0.00 0.00 buf_bytepos_to_charpos 0.15 19.30 0.03 589363 0.00 0.00 set_point_both 0.15 19.33 0.03 329641 0.00 0.00 validate_interval_range 0.15 19.36 0.03 15287 0.00 0.00 adjust_for_invis_intang 0.15 19.39 0.03 interval_start_pos 0.10 19.41 0.02 987278 0.00 0.00 Fcons compare to doing the same thing in an emacs -Q -nw session: % cumulative self self total time seconds seconds calls s/call s/call name 46.55 9.10 9.10 4864529 0.00 0.00 forw_comment 15.40 12.11 3.01 32860 0.00 0.00 scan_sexps_forward 7.31 13.54 1.43 17306 0.00 0.00 scan_lists 7.16 14.94 1.40 27938 0.00 0.00 re_search_2 4.40 15.80 0.86 33368915 0.00 0.00 lookup_char_property 2.35 16.26 0.46 18498165 0.00 0.00 next_interval 1.74 16.60 0.34 13472 0.00 0.00 Fprevious_single_property_change 1.64 16.92 0.32 3319319 0.00 0.00 re_match_2_internal 1.53 17.22 0.30 5830028 0.00 0.00 mark_object 1.53 17.52 0.30 2155779 0.00 0.00 update_syntax_table 1.48 17.81 0.29 14382565 0.00 0.00 previous_interval 1.28 18.06 0.25 23687804 0.00 0.00 Fassq 0.72 18.20 0.14 57187184 0.00 0.00 Fcdr 0.66 18.33 0.13 198744 0.00 0.00 Fbyte_code 0.51 18.43 0.10 33368915 0.00 0.00 textget 0.36 18.50 0.07 15287 0.00 0.00 adjust_for_invis_intang 0.36 18.57 0.07 2744 0.00 0.00 Frassq 0.36 18.64 0.07 38 0.00 0.01 Fgarbage_collect 0.31 18.70 0.06 103511 0.00 0.00 skip_chars 0.26 18.75 0.05 10044672 0.00 0.00 Fcar 0.26 18.80 0.05 1976191 0.00 0.00 update_interval 0.26 18.85 0.05 1862 0.00 0.00 mark_char_table 0.20 18.89 0.04 1066193 0.00 0.00 find_interval 0.20 18.93 0.04 73761 0.00 0.00 Flocal_variable_p 0.20 18.97 0.04 20489 0.00 0.00 re_compile_pattern 0.20 19.01 0.04 9351 0.00 0.00 back_comment the -nw -Q session has fewer charsets loaded, the -Q version loads the charset for BIG5 (and others) at startup even though it's running in a LANG=C environment. It seems that the charset data structure is too GC intensive, maybe it's time to rethink it to make it less so. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: profiling emacs-23.1 vs emacs-22.3 2009-08-25 18:47 ` Dan Nicolaescu @ 2009-08-26 6:01 ` Kenichi Handa 2009-08-26 6:33 ` Dan Nicolaescu 2009-08-27 6:27 ` Kenichi Handa 0 siblings, 2 replies; 17+ messages in thread From: Kenichi Handa @ 2009-08-26 6:01 UTC (permalink / raw) To: Dan Nicolaescu; +Cc: emacs-devel In article <200908251847.n7PIlb4C012720@godzilla.ics.uci.edu>, Dan Nicolaescu <dann@ics.uci.edu> writes: > the -nw -Q session has fewer charsets loaded, the -Q version loads the > charset for BIG5 (and others) at startup even though it's running in a > LANG=C environment. Some CJK charsets are necessary to create the default fontset. I think it's possible to avoid loading unnecessary charsets for creating the default fontset. I'll work on it. But, I think the main sloness of -Q version is fontsets. A fontset is a char-table whose elements are vectors, and thus the just-installed optimization doesn't work. By the way, I did this test with (1) Emacs-22, (2) trunk (without gc optimization), and (3) trunk (with gc optimization). % time LANG=C emacs -Q -batch --eval '(dotimes (i 100) (garbage-collect))' The result is: (1) user 0m1.460s (2) user 0m1.772s (3) user 0m1.644s So, (3) is still 13% slower than (1), but I think it's a necessary cost for having the full Unicode support. And usually the slowness of GC doesn't direclty affect the total execution time. > It seems that the charset data structure is too GC intensive, maybe it's > time to rethink it to make it less so. For charset data, I agree that it doesn't have to be a char-table (for encoding) nor a vector (for decoding). I'll consider replacing them with some C structure. --- Kenichi Handa handa@m17n.org ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: profiling emacs-23.1 vs emacs-22.3 2009-08-26 6:01 ` Kenichi Handa @ 2009-08-26 6:33 ` Dan Nicolaescu 2009-08-26 8:06 ` Kenichi Handa 2009-08-27 6:27 ` Kenichi Handa 1 sibling, 1 reply; 17+ messages in thread From: Dan Nicolaescu @ 2009-08-26 6:33 UTC (permalink / raw) To: Kenichi Handa; +Cc: emacs-devel Kenichi Handa <handa@m17n.org> writes: > In article <200908251847.n7PIlb4C012720@godzilla.ics.uci.edu>, Dan Nicolaescu <dann@ics.uci.edu> writes: > > > the -nw -Q session has fewer charsets loaded, the -Q version loads the > > charset for BIG5 (and others) at startup even though it's running in a > > LANG=C environment. > > Some CJK charsets are necessary to create the default > fontset. I think it's possible to avoid loading unnecessary > charsets for creating the default fontset. I'll work on it. Can the data in the default fontset be in the dumped image in pure memory? To not have to create it every time on start up and GC it should be a win. > But, I think the main sloness of -Q version is fontsets. A > fontset is a char-table whose elements are vectors, and thus > the just-installed optimization doesn't work. > > By the way, I did this test with (1) Emacs-22, (2) trunk > (without gc optimization), and (3) trunk (with gc > optimization). > > % time LANG=C emacs -Q -batch --eval '(dotimes (i 100) (garbage-collect))' > > The result is: > > (1) user 0m1.460s > (2) user 0m1.772s > (3) user 0m1.644s > > So, (3) is still 13% slower than (1), but I think it's a > necessary cost for having the full Unicode support. And > usually the slowness of GC doesn't direclty affect the total > execution time. It does, see the profiling data that I sent. The problem with GC is that it will flush all the caches if there's too much memory to walk over. And the slowdown in GC speed is directly related to the big increase in the number of GCable objects. If you want to see a bit of this, please try this simple instrumentation (is should apply to both 22.3 and CVS trunk): Index: alloc.c =================================================================== RCS file: /cvsroot/emacs/emacs/src/alloc.c,v retrieving revision 1.449 diff -u -3 -p -u -p -r1.449 alloc.c --- alloc.c 25 Aug 2009 06:03:09 -0000 1.449 +++ alloc.c 26 Aug 2009 06:28:36 -0000 @@ -4963,6 +4963,7 @@ inhibit_garbage_collection () return count; } +int call_count_mark_object = 0; DEFUN ("garbage-collect", Fgarbage_collect, Sgarbage_collect, 0, 0, "", doc: /* Reclaim storage for Lisp objects no longer needed. @@ -4986,6 +4987,9 @@ returns nil, because real GC can't be do Lisp_Object total[8]; int count = SPECPDL_INDEX (); EMACS_TIME t1, t2, t3; + int bef = call_count_mark_object; + + /* fprintf (stderr, "bef gc mark_object count = %d\n", call_count_mark_object); */ if (abort_on_gc) abort (); @@ -5278,6 +5282,8 @@ returns nil, because real GC can't be do EMACS_USECS (t3) * 1.0e-6); gcs_done++; + fprintf (stderr, "after gc mark_object delta = %d\n", call_count_mark_object - bef); + return Flist (sizeof total / sizeof *total, total); } @@ -5410,6 +5416,8 @@ mark_object (arg) #endif int cdr_count = 0; + call_count_mark_object++; + loop: if (PURE_POINTER_P (XPNTR (obj))) > > It seems that the charset data structure is too GC intensive, maybe it's > > time to rethink it to make it less so. > > For charset data, I agree that it doesn't have to be a > char-table (for encoding) nor a vector (for decoding). I'll > consider replacing them with some C structure. Great, thanks! ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: profiling emacs-23.1 vs emacs-22.3 2009-08-26 6:33 ` Dan Nicolaescu @ 2009-08-26 8:06 ` Kenichi Handa 2009-08-26 20:46 ` Andreas Schwab 0 siblings, 1 reply; 17+ messages in thread From: Kenichi Handa @ 2009-08-26 8:06 UTC (permalink / raw) To: Dan Nicolaescu; +Cc: emacs-devel In article <200908260633.n7Q6XRXi018187@godzilla.ics.uci.edu>, Dan Nicolaescu <dann@ics.uci.edu> writes: > Some CJK charsets are necessary to create the default > fontset. I think it's possible to avoid loading unnecessary > charsets for creating the default fontset. I'll work on it. > Can the data in the default fontset be in the dumped image in pure > memory? To not have to create it every time on start up and GC it > should be a win. Is it possible to put the whole char-table in pure space, then modify some part at running time? > So, (3) is still 13% slower than (1), but I think it's a > necessary cost for having the full Unicode support. And > usually the slowness of GC doesn't direclty affect the total > execution time. > It does, see the profiling data that I sent. The problem with GC is > that it will flush all the caches if there's too much memory to walk > over. And the slowdown in GC speed is directly related to the big > increase in the number of GCable objects. If most of the execution is spent by GC, shouldn't such a task run with bigger gc-cons-threshold? But, I tried your test case of indent-region with different gc-cons-threshold (gct), and the result shows that GC is not the main culprit of the slowness. emacs-22 (gct=400000): 16 to 17 sec emacs-22 (gct=10000000):16 to 17 sec trunk (gct=400000): 20 to 21 sec trunk (gct=10000000): 19 to 20 sec --- Kenichi Handa handa@m17n.org ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: profiling emacs-23.1 vs emacs-22.3 2009-08-26 8:06 ` Kenichi Handa @ 2009-08-26 20:46 ` Andreas Schwab 2009-08-27 2:02 ` Kenichi Handa 0 siblings, 1 reply; 17+ messages in thread From: Andreas Schwab @ 2009-08-26 20:46 UTC (permalink / raw) To: Kenichi Handa; +Cc: Dan Nicolaescu, emacs-devel Kenichi Handa <handa@m17n.org> writes: > Is it possible to put the whole char-table in pure space, > then modify some part at running time? Only by copying it. If you try to modify it you'll get a pure write error. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: profiling emacs-23.1 vs emacs-22.3 2009-08-26 20:46 ` Andreas Schwab @ 2009-08-27 2:02 ` Kenichi Handa 0 siblings, 0 replies; 17+ messages in thread From: Kenichi Handa @ 2009-08-27 2:02 UTC (permalink / raw) To: Andreas Schwab; +Cc: dann, emacs-devel In article <m2fxbecngn.fsf@igel.home>, Andreas Schwab <schwab@linux-m68k.org> writes: > Kenichi Handa <handa@m17n.org> writes: > > Is it possible to put the whole char-table in pure space, > > then modify some part at running time? > Only by copying it. If you try to modify it you'll get a pure write > error. If it means that we must deep-copy the whole char-table to modify only some part of it, I'd like to find some other way. --- Kenichi Handa handa@m17n.org ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: profiling emacs-23.1 vs emacs-22.3 2009-08-26 6:01 ` Kenichi Handa 2009-08-26 6:33 ` Dan Nicolaescu @ 2009-08-27 6:27 ` Kenichi Handa 1 sibling, 0 replies; 17+ messages in thread From: Kenichi Handa @ 2009-08-27 6:27 UTC (permalink / raw) To: Kenichi Handa; +Cc: dann, emacs-devel In article <E1MgBZt-0006U4-9O@etlken>, Kenichi Handa <handa@m17n.org> writes: > Some CJK charsets are necessary to create the default > fontset. I think it's possible to avoid loading unnecessary > charsets for creating the default fontset. I'll work on it. I've just installed a fix for fontset.el. --- Kenichi Handa handa@m17n.org ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: profiling emacs-23.1 vs emacs-22.3 2009-08-24 11:39 ` Kenichi Handa 2009-08-24 18:26 ` Dan Nicolaescu @ 2009-08-24 22:18 ` Alan Mackenzie 1 sibling, 0 replies; 17+ messages in thread From: Alan Mackenzie @ 2009-08-24 22:18 UTC (permalink / raw) To: Kenichi Handa; +Cc: Dan Nicolaescu, emacs-devel On Mon, Aug 24, 2009 at 08:39:47PM +0900, Kenichi Handa wrote: > In article <200908240807.n7O87ubg024643@godzilla.ics.uci.edu>, Dan Nicolaescu <dann@ics.uci.edu> writes: > It seems that c-indent-region has been changed a lot. I'm not sure, > but perhaps that is the reason [for a slowdown]. How do the other > people think? The change was a bug fix. It shouldn't have slowed down indentation, at least not by more than epsilon. > --- > Kenichi Handa > handa@m17n.org -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2009-08-27 6:27 UTC | newest] Thread overview: 17+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-08-03 20:29 profiling emacs-23.1 vs emacs-22.3 Dan Nicolaescu 2009-08-04 17:10 ` Leo 2009-08-04 19:50 ` Chong Yidong 2009-08-04 19:56 ` Dan Nicolaescu 2009-08-05 7:47 ` Dan Nicolaescu 2009-08-24 6:52 ` Kenichi Handa [not found] ` <200908240807.n7O87ubg024643@godzilla.ics.uci.edu> 2009-08-24 11:39 ` Kenichi Handa 2009-08-24 18:26 ` Dan Nicolaescu 2009-08-25 6:07 ` Kenichi Handa 2009-08-25 18:47 ` Dan Nicolaescu 2009-08-26 6:01 ` Kenichi Handa 2009-08-26 6:33 ` Dan Nicolaescu 2009-08-26 8:06 ` Kenichi Handa 2009-08-26 20:46 ` Andreas Schwab 2009-08-27 2:02 ` Kenichi Handa 2009-08-27 6:27 ` Kenichi Handa 2009-08-24 22:18 ` Alan Mackenzie
Code repositories for project(s) associated with this public inbox https://git.savannah.gnu.org/cgit/emacs.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).