* bug#3603: 23.0.94; takes much time to save large non-ASCII buffers
@ 2009-06-18 9:32 YAMAMOTO Mitsuharu
2009-06-18 11:43 ` Kenichi Handa
0 siblings, 1 reply; 3+ messages in thread
From: YAMAMOTO Mitsuharu @ 2009-06-18 9:32 UTC
To: emacs-pretest-bug
Steps to reproduce:
1. emacs -Q
2. C-x ( C-x i .../etc/tutorials/TUTORIAL.ja RET C-x )
3. C-u 20 C-x e
4. C-x C-s SOME-NEW-FILE-NAME RET
Result:
Saving this ~1MB buffer takes a long time (~10 sec.).
Emacs 22 saves it instantly.
The slowness comes from select-safe-coding-system, in
particular from find-coding-systems-region(-internal), which it
calls. The following patch makes saving much faster (a few sec.)
than the current version.
Index: src/coding.c
===================================================================
RCS file: /sources/emacs/emacs/src/coding.c,v
retrieving revision 1.434
diff -c -p -r1.434 coding.c
*** src/coding.c 17 Jun 2009 00:42:07 -0000 1.434
--- src/coding.c 18 Jun 2009 06:05:04 -0000
*************** DEFUN ("find-coding-systems-region-inter
*** 8638,8644 ****
EMACS_INT start_byte, end_byte;
const unsigned char *p, *pbeg, *pend;
int c;
! Lisp_Object tail, elt;
if (STRINGP (start))
{
--- 8638,8644 ----
EMACS_INT start_byte, end_byte;
const unsigned char *p, *pbeg, *pend;
int c;
! Lisp_Object tail, elt, chars_checked;
if (STRINGP (start))
{
*************** DEFUN ("find-coding-systems-region-inter
*** 8696,8701 ****
--- 8696,8702 ----
while (p < pend && ASCII_BYTE_P (*p)) p++;
while (p < pend && ASCII_BYTE_P (*(pend - 1))) pend--;
+ chars_checked = Fmake_char_table (Qnil, Qnil);
while (p < pend)
{
if (ASCII_BYTE_P (*p))
*************** DEFUN ("find-coding-systems-region-inter
*** 8703,8708 ****
--- 8704,8711 ----
else
{
c = STRING_CHAR_ADVANCE (p);
+ if (!NILP (char_table_ref (chars_checked, c)))
+ continue;
charset_map_loaded = 0;
for (tail = coding_attrs_list; CONSP (tail);)
*************** DEFUN ("find-coding-systems-region-inter
*** 8734,8739 ****
--- 8737,8743 ----
p = pbeg + p_offset;
pend = pbeg + pend_offset;
}
+ char_table_set (chars_checked, c, Qt);
}
}
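The idea behind the patch can be illustrated outside Emacs. The following is a minimal standalone sketch, not the Emacs implementation: it assumes the text is already decoded into an array of Unicode code points, it uses a plain bitset where the patch uses a char-table, and the names check_distinct_chars, check_fn and count_check are hypothetical. The expensive per-character work runs once per distinct non-ASCII character rather than once per occurrence.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: characters are Unicode code points.  Emacs itself
   allows a wider character range than this.  */
#define MAX_CHAR 0x10FFFFu

/* Hypothetical stand-in for the expensive per-character work (in the
   patch, the loop over coding_attrs_list).  */
typedef void (*check_fn) (uint32_t c, void *ctx);

/* Call CHECK once for each distinct non-ASCII character in
   TEXT[0..LEN).  Return the number of CHECK calls, or (size_t) -1
   if the bitset cannot be allocated.  */
size_t
check_distinct_chars (const uint32_t *text, size_t len,
                      check_fn check, void *ctx)
{
  /* One bit per character, all initially clear.  */
  unsigned char *seen = calloc ((MAX_CHAR >> 3) + 1, 1);
  size_t calls = 0;

  if (!seen)
    return (size_t) -1;

  for (size_t i = 0; i < len; i++)
    {
      uint32_t c = text[i];

      assert (c <= MAX_CHAR);
      if (c < 0x80)
        continue;               /* ASCII is skipped, as in the patch.  */
      if (seen[c >> 3] & (1u << (c & 7)))
        continue;               /* Already checked: skip the work.  */

      check (c, ctx);
      calls++;
      seen[c >> 3] |= (unsigned char) (1u << (c & 7));
    }

  free (seen);
  return calls;
}

/* Example checker that only counts how often it is called.  */
void
count_check (uint32_t c, void *ctx)
{
  (void) c;
  ++*(size_t *) ctx;
}
```

For a buffer made of many copies of the same file, as in the steps to reproduce, the number of distinct characters stays constant while the buffer grows, which would explain why skipping repeats helps so much.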
Some notes:
1. It's still much slower than Emacs 22. I guess we would need to
rewrite select-safe-coding-system to make its performance
comparable to Emacs 22, but perhaps we should avoid such
changes at this moment.
2. If the "if (charset_map_loaded) ..." clause in
Ffind_coding_systems_region_internal is intended for the
relocation caused by GC, then maybe `chars_checked' above (and
also `coding_attrs_list') should be GCPROed.
YAMAMOTO Mitsuharu
mituharu@math.s.chiba-u.ac.jp
In GNU Emacs 23.0.94.1 (powerpc-apple-darwin9.7.0, X toolkit)
of 2009-06-18 on yamamoto-mitsuharu-no-power-mac-g5.local
Windowing system distributor `The X.Org Foundation', version 11.0.10402000
configured using `configure '--without-gif' '--without-jpeg' '--without-tiff''
Important settings:
value of $LC_ALL: nil
value of $LC_COLLATE: nil
value of $LC_CTYPE: nil
value of $LC_MESSAGES: nil
value of $LC_MONETARY: nil
value of $LC_NUMERIC: nil
value of $LC_TIME: nil
value of $LANG: ja_JP.UTF-8
value of $XMODIFIERS: nil
locale-coding-system: utf-8-unix
default-enable-multibyte-characters: t
Major mode: Lisp Interaction
Minor modes in effect:
tooltip-mode: t
tool-bar-mode: t
mouse-wheel-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
global-auto-composition-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
line-number-mode: t
transient-mark-mode: t
From: Kenichi Handa @ 2009-06-18 11:43 UTC
To: YAMAMOTO Mitsuharu, 3603
In article <wleithrioa.wl%mituharu@math.s.chiba-u.ac.jp>, YAMAMOTO Mitsuharu <mituharu@math.s.chiba-u.ac.jp> writes:
> Steps to reproduce:
> 1. emacs -Q
> 2. C-x ( C-x i .../etc/tutorials/TUTORIAL.ja RET C-x )
> 3. C-u 20 C-x e
> 4. C-x C-s SOME-NEW-FILE-NAME RET
> Result:
> It takes much time (~10 sec.) to save this ~1MB buffer.
> Emacs 22 can save it instantly.
I observed it too.
> The slowness comes from that of select-safe-coding-system, in
> particular, find-coding-systems-region(-internal) in it. The
> following patch makes it much faster (a few sec.) than the current
> version.
It seems that your patch is correct. Actually, Emacs 22
used a similar method, but I forgot to implement that part
when I rewrote find-coding-systems-region-internal. :-(
[...]
> 1. It's still much slower than Emacs 22. I guess we need to rewrite
> select-safe-coding-system if we try to make its performance
> comparable with Emacs 22. But perhaps we should avoid such
> changes at this moment.
One possible strategy is to check first whether the default
coding system(s) used for encoding (usually
buffer-file-coding-system) can encode the text.
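A rough standalone sketch of that strategy follows. It is only an illustration under stated assumptions: a coding system is modeled as a name plus a "can this character be encoded" predicate, the text is an array of Unicode code points, and the slow path simply returns the first candidate that works, whereas Emacs collects every safe coding system. All names here (coding_system, select_coding_system, the can_encode_* predicates) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a coding system.  */
typedef struct
{
  const char *name;
  bool (*can_encode) (uint32_t c);
} coding_system;

static bool can_encode_ascii (uint32_t c) { return c < 0x80; }
static bool can_encode_latin1 (uint32_t c) { return c <= 0xFF; }
static bool can_encode_unicode (uint32_t c) { return c <= 0x10FFFF; }

/* Return true if CS can encode every character of TEXT[0..LEN).  */
static bool
encodes_all (const coding_system *cs, const uint32_t *text, size_t len)
{
  for (size_t i = 0; i < len; i++)
    if (!cs->can_encode (text[i]))
      return false;
  return true;
}

/* Fast path: try PREFERRED (may be NULL) on its own.  Only if it
   fails, fall back to scanning all N_ALL candidates in ALL.
   Return NULL if no candidate can encode the text.  */
const coding_system *
select_coding_system (const coding_system *preferred,
                      const coding_system *all, size_t n_all,
                      const uint32_t *text, size_t len)
{
  assert (all != NULL || n_all == 0);

  if (preferred && encodes_all (preferred, text, len))
    return preferred;

  for (size_t i = 0; i < n_all; i++)
    if (encodes_all (&all[i], text, len))
      return &all[i];

  return NULL;
}
```

In the common case where the default coding system can encode the whole buffer, the cost is a single pass with one predicate instead of one pass per candidate.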
> 2. If the "if (charset_map_loaded) ..." clause in
> Ffind_coding_systems_region_internal is intended for the
> relocation caused by GC, then maybe `chars_checked' above (and
> also `coding_attrs_list') should be GCPROed.
It was. But since we modified load_charset_map_from_file to
disable file-name-handlers a while ago, that check is no longer
needed. I just forgot to delete all those checks.
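For readers unfamiliar with this kind of adjustment, here is a small standalone sketch of the general pattern visible in the patch context (p = pbeg + p_offset; pend = pbeg + pend_offset;). It does not use Emacs internals: the buffer type and the relocate function are hypothetical stand-ins for text storage that may move while it is being scanned. Raw pointers into the storage are converted to offsets before the operation that may move it and rebuilt afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical text storage whose address may change.  */
typedef struct
{
  unsigned char *data;
  size_t len;
} buffer;

/* Hypothetical operation that moves the storage to a new address.
   Return 0 on success; on failure, leave B unchanged and return -1.  */
static int
relocate (buffer *b)
{
  unsigned char *moved = malloc (b->len ? b->len : 1);

  if (!moved)
    return -1;
  memcpy (moved, b->data, b->len);
  free (b->data);
  b->data = moved;
  return 0;
}

/* Count the bytes >= 0x80 in B, relocating the storage after each
   such byte to show that the scan survives the move.  */
size_t
count_non_ascii (buffer *b)
{
  assert (b != NULL && b->data != NULL);

  const unsigned char *pbeg = b->data;
  const unsigned char *p = pbeg;
  const unsigned char *pend = pbeg + b->len;
  size_t count = 0;

  while (p < pend)
    {
      if (*p++ >= 0x80)
        {
          /* Save positions as offsets while the pointers are valid.  */
          ptrdiff_t p_offset = p - pbeg;
          ptrdiff_t pend_offset = pend - pbeg;

          count++;
          if (relocate (b) == 0)
            {
              /* The old pointers are dangling now; rebuild them.  */
              pbeg = b->data;
              p = pbeg + p_offset;
              pend = pbeg + pend_offset;
            }
        }
    }
  return count;
}
```

When nothing inside the loop can move the storage, which is the situation described above, the offset bookkeeping is dead code and can be removed.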
---
Kenichi Handa
handa@m17n.org
From: YAMAMOTO Mitsuharu @ 2009-06-19 8:46 UTC
To: Kenichi Handa; +Cc: 3603
>>>>> On Thu, 18 Jun 2009 20:43:28 +0900, Kenichi Handa <handa@m17n.org> said:
>> The slowness comes from that of select-safe-coding-system, in
>> particular, find-coding-systems-region(-internal) in it. The
>> following patch makes it much faster (a few sec.) than the current
>> version.
> It seems that your patch is correct. Actually, Emacs 22 used the
> similar method, but I forgot to implement that part when I re-wrote
> find-coding-systems-region-internal. :-(
I've installed the patch (after renaming the variable to match the
name used in Emacs 22).
>> 2. If the "if (charset_map_loaded) ..." clause in
>> Ffind_coding_systems_region_internal is intended for the relocation
>> caused by GC, then maybe `chars_checked' above (and also
>> `coding_attrs_list') should be GCPROed.
> It was. But, as we modified load_charset_map_from_file to disable
> file-name-handlers a while ago, we don't need that check anymore. I
> just forgot to delete all those checks.
Thanks for the explanation. Actually, I couldn't find any part that
could cause GC, and I wondered why there was an adjustment for relocation.
YAMAMOTO Mitsuharu
mituharu@math.s.chiba-u.ac.jp