unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
* bug#2435: Bug 2435
       [not found] ` <E1LePPk-0002zE-RW@etlken>
@ 2009-03-03 16:40   ` Chong Yidong
  2009-03-04  2:16     ` Kenichi Handa
  0 siblings, 1 reply; 17+ messages in thread
From: Chong Yidong @ 2009-03-03 16:40 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: 2435

Kenichi Handa <handa@m17n.org> writes:

>> I get a crash with the following recipe, and I think it's related to the
>> new composition code:
>> 
>> emacs -Q
>> M-x customize-group RET whitespace RET
>> C-v C-v
>
> I can't reproduce the crash with that.

It doesn't happen all the time---about once every three or four
attempts.  Quite strange.

> It seems that the regular expression
> tibetan-composable-pattern (which starts with "[\340\275..."
> and the byte length is surely 88) is being compiled.
> Parhaps the display routine is going to display #xF20
> (tibetan character).  But, in my case, regex_compile never
> calls analyse_first at line 2853.
>
>   2846			if (many_times_ok)
>   2847			  {
>   2848			    boolean simple = skip_one_char (laststart) == b;
>   2849			    unsigned int startoffset = 0;
>   2850			    re_opcode_t ofj =
>   2851			      /* Check if the loop can match the empty string.  */
>   2852			      (simple || !analyse_first (laststart, b, NULL, 0))
>   2853			      ? on_failure_jump : on_failure_jump_loop;
>
> Here, "simple" is always set to 1, so analyse_first is not
> called.

When I get the crash, simple is set to 0.

(gdb) p b
$4 = (unsigned char *) 0x8b927b7 ""
(gdb) p laststart
$5 = (unsigned char *) 0x8b92786 "\a\201\f"

The return value of `skip_one_char (laststart)' is NULL.

> Could you please try:
>     ESC : (re-search-forward tibetan-composable-pattern) RET
> in some buffer and checks if analyse_first is called?

It's called, not for the search itself but in the error handler:

#0  analyse_first (
    p=0x8860bc0 "\t\002\037Previous command was not a
    yank\n\001\005eval-\0164", pend=0x8860be4 "\005eval-\0164",
    fastmap=0x83e3984 "", multibyte=0)
    at regex.c:4022
#1  0x081aa0b5 in re_compile_fastmap (bufp=0x83e3960) at regex.c:4339
#2  0x081aa2c2 in re_search_2 (bufp=0x83e3960, str1=0x0, size1=0, 
    str2=0x89fa230 "Search failed:
    \"[\340\275\200-\340\275\251\340\275\252][\340\276\220-\340\276\271\340\276\272\340\276\273\340\276\274]*[\340\275\260\366\220\202\216\340\275\261\340\275\262-\340\275\275\340\276\200\340\276\201\340\276\204]*[\340\275\276\340\276\202\340\276\203\340\276\206-\340\276\213\340\274\231\340\274\265\340\274\267]*\"",
    size2=105, startpos=0, range=105, regs=0x0, 
    stop=105) at regex.c:4487
#3  0x081aa18c in re_search (bufp=0x83e3960, 
    string=0x89fa230 "Search failed:
    \"[\340\275\200-\340\275\251\340\275\252][\340\276\220-\340\276\271\340\276\272\340\276\273\340\276\274]*[\340\275\260\366\220\202\216\340\275\261\340\275\262-\340\275\275\340\276\200\340\276\201\340\276\204]*[\340\275\276\340\276\202\340\276\203\340\276\206-\340\276\213\340\274\231\340\274\265\340\274\267]*\"",
    size=105, startpos=0, range=105, regs=0x0)
    at regex.c:4393
#4  0x081972f1 in fast_string_match (regexp=139244403, string=138652283)
    at search.c:505
#5  0x081d3fb6 in skip_debugger (conditions=138346117, data=141320317)
    at eval.c:1862
#6  0x081d4097 in maybe_call_debugger (conditions=138346117,
    sig=138556777, 
    data=141320309) at eval.c:1889
#7  0x081d41e0 in find_handler_clause (handlers=138402201, 
    conditions=138346117, sig=138556777, data=141320309) at eval.c:1961
#8  0x081d3c34 in Fsignal (error_symbol=138556777, data=141320309)
    at eval.c:1700






^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#2435: Bug 2435
  2009-03-03 16:40   ` bug#2435: Bug 2435 Chong Yidong
@ 2009-03-04  2:16     ` Kenichi Handa
  2009-03-04  4:41       ` Chong Yidong
  0 siblings, 1 reply; 17+ messages in thread
From: Kenichi Handa @ 2009-03-04  2:16 UTC (permalink / raw)
  To: Chong Yidong; +Cc: 2435

In article <87zlg2a7b9.fsf@cyd.mit.edu>, Chong Yidong <cyd@stupidchicken.com> writes:

>>> emacs -Q
>>> M-x customize-group RET whitespace RET
>>> C-v C-v
> >
> > I can't reproduce the crash with that.

> It doesn't happen all the time---about once every three or four
> attempts.  Quite strange.

I tried more than 10 times without crash.

> >   2846			if (many_times_ok)
> >   2847			  {
> >   2848			    boolean simple = skip_one_char (laststart) == b;
> >   2849			    unsigned int startoffset = 0;
> >   2850			    re_opcode_t ofj =
> >   2851			      /* Check if the loop can match the empty string.  */
> >   2852			      (simple || !analyse_first (laststart, b, NULL, 0))
> >   2853			      ? on_failure_jump : on_failure_jump_loop;
> >
> > Here, "simple" is always set to 1, so analyse_first is not
> > called.

> When I get the crash, simple is set to 0.

> (gdb) p b
> $4 = (unsigned char *) 0x8b927b7 ""
> (gdb) p laststart
> $5 = (unsigned char *) 0x8b92786 "\a\201\f"

That is different in my case.  When the execution reaches
the above code (three or four times while displaying that
Tibetan char), laststart is always "\004\200".  Here the
first byte \004 means `charset' OP, and that is reasonable
because we are now handling '*' after "[...]".  

But '\a' (== 7 == stop_memory) is very strange.  Please show
me these values when simple is set to 0.

(gdb) p bufp->buffer
$58 = (unsigned char *) 0xa905aa0 "\004\200"
(gdb) p laststart
$59 = (unsigned char *) 0xa905ab2 "\004\200"
(gdb) p bufp->buffer[0]@(b - bufp->buffer)
$60 = "\004\200\000\000\002\000@\017\000i\017\000j\017\000j\017\000\004\200\000\000\004\000\220\017\000\271\017\000\272\017\000\272\017\000\273\017\000\273\017\000\274\017\000\274\017"
(gdb) p laststart[0]@(b-laststart)
$61 = "\004\200\000\000\004\000\220\017\000\271\017\000\272\017\000\272\017\000\273\017\000\273\017\000\274\017\000\274\017"

Here,
  "\220\017\000" => 0xF90
  "\271\017\000" => 0xFB9
  "\272\017\000" => 0xFBA
  "\273\017\000" => 0xFBB
  "\274\017\000" => 0xFBC

So, we are now processing the '*' just after the second
[...] of tibetan-composable-pattern.

---
Kenichi Handa
handa@m17n.org






^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#2435: Bug 2435
  2009-03-04  2:16     ` Kenichi Handa
@ 2009-03-04  4:41       ` Chong Yidong
  2009-03-04  7:47         ` Kenichi Handa
  0 siblings, 1 reply; 17+ messages in thread
From: Chong Yidong @ 2009-03-04  4:41 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: 2435

Kenichi Handa <handa@m17n.org> writes:

>> It doesn't happen all the time---about once every three or four
>> attempts.  Quite strange.
>
> I tried more than 10 times without crash.

Here are my specs (latest CVS, no modifications):

In GNU Emacs 23.0.91.29 (i686-pc-linux-gnu, GTK+ Version 2.14.4) of 2009-03-03 on furry
Windowing system distributor `The X.Org Foundation', version 11.0.10502000
configured using `configure  'CC=gcc' 'CFLAGS=-g''

LANG is en_US.UTF-8

I can reproduce it with `emacs -Q'.

Do you at least see the redisplay problem reported by the OP?

>> When I get the crash, simple is set to 0.
>
>> (gdb) p b
>> $4 = (unsigned char *) 0x8b927b7 ""
>> (gdb) p laststart
>> $5 = (unsigned char *) 0x8b92786 "\a\201\f"
>
> That is different in my case.  When the execution reaches
> the above code (three or four times while displaying that
> Tibetan char), laststart is always "\004\200".  Here the
> first byte \004 means `charset' OP, and that is reasonable
> because we are now handling '*' after "[...]".  
>
> But '\a' (== 7 == stop_memory) is very strange.  Please show
> me these values when simple is set to 0.

(gdb) f 2
#2  0x081a1798 in regex_compile (
    pattern=0x8356085
    "[\340\275\200-\340\275\251\340\275\252][\340\276\220-\340\276\271\340\276\272\340\276\273\340\276\274]*[\340\275\260\366\220\202\216\340\275\261\340\275\262-\340\275\275\340\276\200\340\276\201\340\276\204]*[\340\275\276\340\276\202\340\276\203\340\276\206-\340\276\213\340\274\231\340\274\265\340\274\267]*",
    size=88, syntax=3408388, bufp=0x83e3210) at regex.c:2853
2853                          ? on_failure_jump : on_failure_jump_loop;
(gdb) p bufp->buffer
$8 = (unsigned char *) 0x8b931d0 "\0169"
(gdb) p laststart
$10 = (unsigned char *) 0x8b93206 "\a\201\f"
(gdb) p bufp->buffer[0]@(b-bufp->buffer)
$11 = "\0169\000\002\002.Z\016.\000\006\001\016\006\000\002\001~\r!\000\002\002.~\004\b\000\000\000\000\000\000\377\003\022\r\000\004\b\000\000\000\000\000\000\377\003\r\360\377\002\001~\a\201\f\000\000\a\000p\017\000p\017\000\216\000\031\216\000\031q\017\000q\017\000r\017\000}\017\000\200\017\000\200\017\000\201\017\000\201\017\000\204\017\000\204\017"
(gdb) p laststart[0]@(b-laststart)
$12 = "\a\201\f\000\000\a\000p\017\000p\017\000\216\000\031\216\000\031q\017\000q\017\000r\017\000}\017\000\200\017\000\200\017\000\201\017\000\201\017\000\204\017\000\204\017"







^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#2435: Bug 2435
  2009-03-04  4:41       ` Chong Yidong
@ 2009-03-04  7:47         ` Kenichi Handa
  2009-03-05  4:09           ` Chong Yidong
  2009-03-05  4:42           ` Chong Yidong
  0 siblings, 2 replies; 17+ messages in thread
From: Kenichi Handa @ 2009-03-04  7:47 UTC (permalink / raw)
  To: Chong Yidong; +Cc: 2435

In article <87ab81293z.fsf@cyd.mit.edu>, Chong Yidong <cyd@stupidchicken.com> writes:

> Here are my specs (latest CVS, no modifications):

> In GNU Emacs 23.0.91.29 (i686-pc-linux-gnu, GTK+ Version 2.14.4) of 2009-03-03 on furry
> Windowing system distributor `The X.Org Foundation', version 11.0.10502000
> configured using `configure  'CC=gcc' 'CFLAGS=-g''

> LANG is en_US.UTF-8

> I can reproduce it with `emacs -Q'.

Mine are the same:

In GNU Emacs 23.0.91.3 (i686-pc-linux-gnu, GTK+ Version 2.14.4) of 2009-02-27 on etlken
Windowing system distributor `The X.Org Foundation', version 11.0.10502000
configured using `configure  'CFLAGS=-g''

and can't reproduce it with 'LANG=en_US.UTF-8 emacs -Q'.

> Do you at least see the redisplay problem reported by the OP?

No.  There are 5 non-ASCII characters on the line of
"Whitespace Hspace Regexp:".  The first one is shown as
"\240", the second one by an empty box, the others are by
proper fonts, and Emacs doesn't stop.

> (gdb) f 2
> #2  0x081a1798 in regex_compile (
>     pattern=0x8356085
>     "[\340\275\200-\340\275\251\340\275\252][\340\276\220-\340\276\271\340\276\272\340\276\273\340\276\274]*[\340\275\260\366\220\202\216\340\275\261\340\275\262-\340\275\275\340\276\200\340\276\201\340\276\204]*[\340\275\276\340\276\202\340\276\203\340\276\206-\340\276\213\340\274\231\340\274\265\340\274\267]*",
>     size=88, syntax=3408388, bufp=0x83e3210) at regex.c:2853
> 2853                          ? on_failure_jump : on_failure_jump_loop;
> (gdb) p bufp->buffer
> $8 = (unsigned char *) 0x8b931d0 "\0169"
> (gdb) p laststart
> $10 = (unsigned char *) 0x8b93206 "\a\201\f"
> (gdb) p bufp->buffer[0]@(b-bufp->buffer)
> $11 = "\0169\000\002\002.Z\016.\000\006\001\016\006\000\002\001~\r!\000\002\002.~\004\b\000\000\000\000\000\000\377\003\022\r\000\004\b\000\000\000\000\000\000\377\003\r\360\377\002\001~\a\201\f\000\000\a\000p\017\000p\017\000\216\000\031\216\000\031q\017\000q\017\000r\017\000}\017\000\200\017\000\200\017\000\201\017\000\201\017\000\204\017\000\204\017"
> (gdb) p laststart[0]@(b-laststart)
> $12 = "\a\201\f\000\000\a\000p\017\000p\017\000\216\000\031\216\000\031q\017\000q\017\000r\017\000}\017\000\200\017\000\200\017\000\201\017\000\201\017\000\204\017\000\204\017"

It seems that `pattern' is correct, but `bufp->buffer' is
the compiled code for some of jkr-compr related regexp.
Could you please find why that happens?

This is the disassembled code of the head of bufp->buffer.

on_failer_jump #x39
exactn 2 ".Z"
on_failer_jump #x2E
start_memory 1
on_failer_jump #x06
exactn 1 "~"
jump #x21
exactn 2 ".~"
charset "0-9"
on_failure_jump_smart 0x08
charset "0-9"
jump -0x10
exactn 1 "~"
stop_memory \201 <-- should not happen

---
Kenichi Handa
handa@m17n.org






^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#2435: Bug 2435
  2009-03-04  7:47         ` Kenichi Handa
@ 2009-03-05  4:09           ` Chong Yidong
  2009-03-05  6:39             ` Kenichi Handa
  2009-03-05  4:42           ` Chong Yidong
  1 sibling, 1 reply; 17+ messages in thread
From: Chong Yidong @ 2009-03-05  4:09 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: 2435

Kenichi Handa <handa@m17n.org> writes:

> It seems that `pattern' is correct, but `bufp->buffer' is
> the compiled code for some of jkr-compr related regexp.
> Could you please find why that happens?

I think the problem is that regex_compile can call load_charset, which
can call regex_compile.  I think regex_compile is not designed to be
called recursively, leading to memory corruption.  Here is a backtrace
(I inserted some debugging code to detect when regex_compile is called
recursively):

#0  abort () at emacs.c:432
#1  0x08196965 in compile_pattern (pattern=139288251, regp=0x0, 
    translate=138358041, posix=0, multibyte=0) at search.c:262
#2  0x0819727d in fast_string_match (regexp=139288251, string=138555267)
    at search.c:509
#3  0x0817eb9f in Ffind_file_name_handler (filename=138555267, 
    operation=138416225) at fileio.c:380
#4  0x0817f5b3 in Fexpand_file_name (name=138555267, 
    default_directory=141645459) at fileio.c:859
#5  0x081fb92e in openp (path=138676461, str=138555267,
#suffixes=138926685, 
    storeptr=0x0, predicate=138358041) at lread.c:1425
#6  0x080b95f6 in load_charset_map_from_file (charset=0x84b4acc, 
    mapfile=138555267, control_flag=1) at charset.c:515
#7  0x080b9bb1 in load_charset (charset=0x84b4acc, control_flag=1)
    at charset.c:652
#8  0x080bdce8 in maybe_unify_char (c=1638542, val=138788937) at charset.c:1679
#9  0x080eb5fa in string_char (
    p=0x83560ac "\340\275\261\340\275\262-\340\275\275\340\276\200\340\276\201\340\276\204]*[\340\275\276\340\276\202\340\276\203\340\276\206-\340\276\213\340\274\231\340\274\265\340\274\267]*", advanced=0x0, len=0xbfc70d28)
    at character.c:236
#10 0x081a2ac9 in regex_compile (
    pattern=0x8356085 "[\340\275\200-\340\275\251\340\275\252][\340\276\220-\340\276\271\340\276\272\340\276\273\340\276\274]*[\340\275\260\366\220\202\216\340\275\261\340\275\262-\340\275\275\340\276\200\340\276\201\340\276\204]*[\340\275\276\340\276\202\340\276\203\340\276\206-\340\276\213\340\274\231\340\274\265\340\274\267]*", size=88, syntax=3408388, bufp=0x83e3210) at regex.c:2982
#11 0x081b3dca in re_compile_pattern (
    pattern=0x8356085 "[\340\275\200-\340\275\251\340\275\252][\340\276\220-\340\276\271\340\276\272\340\276\273\340\276\274]*[\340\275\260\366\220\202\216\340\275\261\340\275\262-\340\275\275\340\276\200\340\276\201\340\276\204]*[\340\275\276\340\276\202\340\276\203\340\276\206-\340\276\213\340\274\231\340\274\265\340\274\267]*", length=88, bufp=0x83e3210) at regex.c:6518
#12 0x08196714 in compile_pattern_1 (cp=0x83e3200, pattern=137244011, 
    translate=138358041, regp=0x0, posix=0) at search.c:160
#13 0x08196996 in compile_pattern (pattern=137244011, regp=0x0, 
    translate=138358041, posix=0, multibyte=1) at search.c:265






^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#2435: Bug 2435
  2009-03-04  7:47         ` Kenichi Handa
  2009-03-05  4:09           ` Chong Yidong
@ 2009-03-05  4:42           ` Chong Yidong
  2009-03-05 11:23             ` Kenichi Handa
  1 sibling, 1 reply; 17+ messages in thread
From: Chong Yidong @ 2009-03-05  4:42 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: 2435

The following patch tries to avoid the problem by updating the regexp
list before using the regexp cell.  It seems to fix the problem; or do
have a different suggestion?

*** trunk/src/search.c.~1.237.~	2009-02-12 16:45:29.000000000 -0500
--- trunk/src/search.c	2009-03-04 23:38:35.000000000 -0500
***************
*** 134,140 ****
    char *val;
    reg_syntax_t old;
  
!   cp->regexp = Qnil;
    cp->buf.translate = (! NILP (translate) ? translate : make_number (0));
    cp->posix = posix;
    cp->buf.multibyte = STRING_MULTIBYTE (pattern);
--- 134,140 ----
    char *val;
    reg_syntax_t old;
  
!   cp->regexp = Qt;
    cp->buf.translate = (! NILP (translate) ? translate : make_number (0));
    cp->posix = posix;
    cp->buf.multibyte = STRING_MULTIBYTE (pattern);
***************
*** 239,245 ****
  	 nil should never appear before a non-nil entry.  */
        if (NILP (cp->regexp))
  	goto compile_it;
!       if (SCHARS (cp->regexp) == SCHARS (pattern)
  	  && STRING_MULTIBYTE (cp->regexp) == STRING_MULTIBYTE (pattern)
  	  && !NILP (Fstring_equal (cp->regexp, pattern))
  	  && EQ (cp->buf.translate, (! NILP (translate) ? translate : make_number (0)))
--- 239,246 ----
  	 nil should never appear before a non-nil entry.  */
        if (NILP (cp->regexp))
  	goto compile_it;
!       if (STRINGP (cp->regexp)
! 	  && SCHARS (cp->regexp) == SCHARS (pattern)
  	  && STRING_MULTIBYTE (cp->regexp) == STRING_MULTIBYTE (pattern)
  	  && !NILP (Fstring_equal (cp->regexp, pattern))
  	  && EQ (cp->buf.translate, (! NILP (translate) ? translate : make_number (0)))
***************
*** 248,273 ****
  	      || EQ (cp->syntax_table, current_buffer->syntax_table))
  	  && !NILP (Fequal (cp->whitespace_regexp, Vsearch_spaces_regexp))
  	  && cp->buf.charset_unibyte == charset_unibyte)
! 	break;
  
        /* If we're at the end of the cache, compile into the nil cell
  	 we found, or the last (least recently used) cell with a
! 	 string value.  */
        if (cp->next == 0)
  	{
  	compile_it:
  	  compile_pattern_1 (cp, pattern, translate, regp, posix);
  	  break;
  	}
      }
  
-   /* When we get here, cp (aka *cpp) contains the compiled pattern,
-      either because we found it in the cache or because we just compiled it.
-      Move it to the front of the queue to mark it as most recently used.  */
-   *cpp = cp->next;
-   cp->next = searchbuf_head;
-   searchbuf_head = cp;
- 
    /* Advise the searching functions about the space we have allocated
       for register data.  */
    if (regp)
--- 249,279 ----
  	      || EQ (cp->syntax_table, current_buffer->syntax_table))
  	  && !NILP (Fequal (cp->whitespace_regexp, Vsearch_spaces_regexp))
  	  && cp->buf.charset_unibyte == charset_unibyte)
! 	{
! 	  /* We found a compiled pattern.  Move it to the front of the
! 	     queue to mark it as most recently used.  */
! 	  *cpp = cp->next;
! 	  cp->next = searchbuf_head;
! 	  searchbuf_head = cp;
! 	  break;
! 	}
  
        /* If we're at the end of the cache, compile into the nil cell
  	 we found, or the last (least recently used) cell with a
! 	 string value.  We must update the queue before calling
! 	 compile_pattern_1, because compile_pattern_1 can end up
! 	 calling compile_pattern recursively (via load_charset).  */
        if (cp->next == 0)
  	{
  	compile_it:
+ 	  *cpp = cp->next;
+ 	  cp->next = searchbuf_head;
+ 	  searchbuf_head = cp;
  	  compile_pattern_1 (cp, pattern, translate, regp, posix);
  	  break;
  	}
      }
  
    /* Advise the searching functions about the space we have allocated
       for register data.  */
    if (regp)







^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#2435: Bug 2435
  2009-03-05  4:09           ` Chong Yidong
@ 2009-03-05  6:39             ` Kenichi Handa
  0 siblings, 0 replies; 17+ messages in thread
From: Kenichi Handa @ 2009-03-05  6:39 UTC (permalink / raw)
  To: Chong Yidong; +Cc: 2435

In article <878wnk7gqb.fsf@cyd.mit.edu>, Chong Yidong <cyd@stupidchicken.com> writes:

> Kenichi Handa <handa@m17n.org> writes:
> > It seems that `pattern' is correct, but `bufp->buffer' is
> > the compiled code for some of jkr-compr related regexp.
> > Could you please find why that happens?

> I think the problem is that regex_compile can call load_charset, which
> can call regex_compile.  I think regex_compile is not designed to be
> called recursively, leading to memory corruption.

Ah!  That is a likely cause of the bug.  I'll find a way to
avoid loading charset at that moment.

---
Kenichi Handa
handa@m17n.org








^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#2435: Bug 2435
  2009-03-05  4:42           ` Chong Yidong
@ 2009-03-05 11:23             ` Kenichi Handa
  2009-03-05 16:46               ` Stefan Monnier
  0 siblings, 1 reply; 17+ messages in thread
From: Kenichi Handa @ 2009-03-05 11:23 UTC (permalink / raw)
  To: Chong Yidong; +Cc: 2435

In article <877i34shq2.fsf@cyd.mit.edu>, Chong Yidong <cyd@stupidchicken.com> writes:

> The following patch tries to avoid the problem by updating the regexp
> list before using the regexp cell.  It seems to fix the problem; or do
> have a different suggestion?

It will fix the current problem of bufp contents being
changed by recursive call, but we still have a danger of
Lisp code being called in re_compile_pattern, which may lead
to relocation of string data pointed by the arg "pattern".
So, I think we must avoid any Lisp calls within
re_compile_pattern.

---
Kenichi Handa
handa@m17n.org






^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#2435: Bug 2435
  2009-03-05 11:23             ` Kenichi Handa
@ 2009-03-05 16:46               ` Stefan Monnier
  2009-03-06  3:38                 ` Kenichi Handa
  0 siblings, 1 reply; 17+ messages in thread
From: Stefan Monnier @ 2009-03-05 16:46 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: Chong Yidong, 2435

> It will fix the current problem of bufp contents being
> changed by recursive call, but we still have a danger of
> Lisp code being called in re_compile_pattern, which may lead
> to relocation of string data pointed by the arg "pattern".
> So, I think we must avoid any Lisp calls within
> re_compile_pattern.

Indeed, tho it's not clear how we could do that.  One way that occurs to
me is: if we need to `load' or do some such dangerous operation in
re_compile_pattern, signal an error so we exit re_compile_pattern.
Then catch this error in the calling code where we can do the
corresponding operation safely and call re_compile_pattern again.
Kind of ugly, but for autoload-style operations it seems acceptable.


        Stefan






^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#2435: Bug 2435
  2009-03-05 16:46               ` Stefan Monnier
@ 2009-03-06  3:38                 ` Kenichi Handa
  2009-03-06  4:11                   ` Chong Yidong
  0 siblings, 1 reply; 17+ messages in thread
From: Kenichi Handa @ 2009-03-06  3:38 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: cyd, 2435

In article <jwvbpsfyl5d.fsf-monnier+emacsbugreports@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes:

> > It will fix the current problem of bufp contents being
> > changed by recursive call, but we still have a danger of
> > Lisp code being called in re_compile_pattern, which may lead
> > to relocation of string data pointed by the arg "pattern".
> > So, I think we must avoid any Lisp calls within
> > re_compile_pattern.

> Indeed, tho it's not clear how we could do that.  One way that occurs to
> me is: if we need to `load' or do some such dangerous operation in
> re_compile_pattern, signal an error so we exit re_compile_pattern.
> Then catch this error in the calling code where we can do the
> corresponding operation safely and call re_compile_pattern again.
> Kind of ugly, but for autoload-style operations it seems acceptable.

We should take care of re_search_2 and re_match_2_internal
too.

If the problem is only the call of openp in
load_charset_map_from_file, and building various Lisp object
is ok, we can change load_charset_map_from_file to open a
charset map by itself without using openp.

---
Kenichi Handa
handa@m17n.org






^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#2435: Bug 2435
  2009-03-06  3:38                 ` Kenichi Handa
@ 2009-03-06  4:11                   ` Chong Yidong
  2009-03-06  4:48                     ` Kenichi Handa
  0 siblings, 1 reply; 17+ messages in thread
From: Chong Yidong @ 2009-03-06  4:11 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: 2435

Kenichi Handa <handa@m17n.org> writes:

> We should take care of re_search_2 and re_match_2_internal
> too.
>
> If the problem is only the call of openp in
> load_charset_map_from_file, and building various Lisp object
> is ok, we can change load_charset_map_from_file to open a
> charset map by itself without using openp.

Not using openp would be troublesome.  How about binding
file-name-handler-alist to nil inside load_charset_map_from_file?  That
prevents find-file-name-handler from performing the regexp compilation
and string search.






^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#2435: Bug 2435
  2009-03-06  4:11                   ` Chong Yidong
@ 2009-03-06  4:48                     ` Kenichi Handa
  2009-03-06  5:07                       ` Stefan Monnier
  0 siblings, 1 reply; 17+ messages in thread
From: Kenichi Handa @ 2009-03-06  4:48 UTC (permalink / raw)
  To: Chong Yidong; +Cc: 2435

In article <877i33cmto.fsf@cyd.mit.edu>, Chong Yidong <cyd@stupidchicken.com> writes:

> Kenichi Handa <handa@m17n.org> writes:
> > We should take care of re_search_2 and re_match_2_internal
> > too.
> >
> > If the problem is only the call of openp in
> > load_charset_map_from_file, and building various Lisp object
> > is ok, we can change load_charset_map_from_file to open a
> > charset map by itself without using openp.

> Not using openp would be troublesome.  How about binding
> file-name-handler-alist to nil inside load_charset_map_from_file?  That
> prevents find-file-name-handler from performing the regexp compilation
> and string search.

That's a good idea if "building various Lisp object is ok"
is true.  

Stefan, what do you think?

---
Kenichi Handa
handa@m17n.org


*** charset.c.~1.169.~	2009-01-30 21:29:36.000000000 +0900
--- charset.c	2009-03-06 13:44:24.000000000 +0900
***************
*** 494,499 ****
--- 494,509 ----
  
  extern void add_to_log P_ ((char *, Lisp_Object, Lisp_Object));
  
+ extern Lisp_Object Vfile_name_handler_alist;
+ 
+ static Lisp_Object
+ load_charset_map_from_file_restore (arg)
+      Lisp_Object arg;
+ {
+   Vfile_name_handler_alist = arg;
+   return Qnil;
+ }
+ 
  static void
  load_charset_map_from_file (charset, mapfile, control_flag)
       struct charset *charset;
***************
*** 508,518 ****
--- 518,533 ----
    Lisp_Object suffixes;
    struct charset_map_entries *head, *entries;
    int n_entries;
+   int count = SPECPDL_INDEX ();
  
    suffixes = Fcons (build_string (".map"),
  		    Fcons (build_string (".TXT"), Qnil));
  
+   record_unwind_protect (load_charset_map_from_file_restore,
+ 			 Vfile_name_handler_alist);
+   Vfile_name_handler_alist = Qnil;
    fd = openp (Vcharset_map_path, mapfile, suffixes, NULL, Qnil);
+   unbind_to (count, Qnil);
    if (fd < 0
        || ! (fp = fdopen (fd, "r")))
      {






^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#2435: Bug 2435
  2009-03-06  4:48                     ` Kenichi Handa
@ 2009-03-06  5:07                       ` Stefan Monnier
  2009-03-06  5:21                         ` Kenichi Handa
  0 siblings, 1 reply; 17+ messages in thread
From: Stefan Monnier @ 2009-03-06  5:07 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: Chong Yidong, 2435

>> > We should take care of re_search_2 and re_match_2_internal
>> > too.
>> >
>> > If the problem is only the call of openp in
>> > load_charset_map_from_file, and building various Lisp object
>> > is ok, we can change load_charset_map_from_file to open a
>> > charset map by itself without using openp.

>> Not using openp would be troublesome.  How about binding
>> file-name-handler-alist to nil inside load_charset_map_from_file?  That
>> prevents find-file-name-handler from performing the regexp compilation
>> and string search.

> That's a good idea if "building various Lisp object is ok"
> is true.  

AFAIK building objects is perfectly fine, yes.
I wonder why you use record_unwind_protect rather than specbind, tho.
Also if we do that, we should add a note in the docstring of
charset-map-path that it does not obey file-name-handler-alist.


        Stefan






^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#2435: Bug 2435
  2009-03-06  5:07                       ` Stefan Monnier
@ 2009-03-06  5:21                         ` Kenichi Handa
  2009-03-07  4:00                           ` Chong Yidong
  2009-03-07 23:07                           ` Stefan Monnier
  0 siblings, 2 replies; 17+ messages in thread
From: Kenichi Handa @ 2009-03-06  5:21 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: cyd, 2435

In article <jwv3adrutp3.fsf-monnier+emacsbugreports@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes:

> AFAIK building objects is perfectly fine, yes.
> I wonder why you use record_unwind_protect rather than specbind, tho.

Don't we need record_unwind_protect for the case that quit
is signaled in emacs_open that is called from openp?

> Also if we do that, we should add a note in the docstring of
> charset-map-path that it does not obey file-name-handler-alist.

Ok.

---
Kenichi Handa
handa@m17n.org






^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#2435: Bug 2435
  2009-03-06  5:21                         ` Kenichi Handa
@ 2009-03-07  4:00                           ` Chong Yidong
  2009-03-07 23:07                           ` Stefan Monnier
  1 sibling, 0 replies; 17+ messages in thread
From: Chong Yidong @ 2009-03-07  4:00 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: 2435

Kenichi Handa <handa@m17n.org> writes:

>> AFAIK building objects is perfectly fine, yes.
>> I wonder why you use record_unwind_protect rather than specbind, tho.
>
> Don't we need record_unwind_protect for the case that quit
> is signaled in emacs_open that is called from openp?

Please go ahead and check in the patch, thanks.






^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#2435: Bug 2435
  2009-03-06  5:21                         ` Kenichi Handa
  2009-03-07  4:00                           ` Chong Yidong
@ 2009-03-07 23:07                           ` Stefan Monnier
  2009-03-09  1:12                             ` Kenichi Handa
  1 sibling, 1 reply; 17+ messages in thread
From: Stefan Monnier @ 2009-03-07 23:07 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: cyd, 2435

>> AFAIK building objects is perfectly fine, yes.
>> I wonder why you use record_unwind_protect rather than specbind, tho.
> Don't we need record_unwind_protect for the case that quit
> is signaled in emacs_open that is called from openp?

specbind does the same thing (except that it's specially designed for
the job at hand of temporarily modifying a variable).

>> Also if we do that, we should add a note in the docstring of
>> charset-map-path that it does not obey file-name-handler-alist.
> Ok.

Thanks,


        Stefan






^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#2435: Bug 2435
  2009-03-07 23:07                           ` Stefan Monnier
@ 2009-03-09  1:12                             ` Kenichi Handa
  0 siblings, 0 replies; 17+ messages in thread
From: Kenichi Handa @ 2009-03-09  1:12 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: cyd, 2435

In article <jwvocwdszkq.fsf-monnier+emacsbugreports@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes:

>>> AFAIK building objects is perfectly fine, yes.
>>> I wonder why you use record_unwind_protect rather than specbind, tho.
> > Don't we need record_unwind_protect for the case that quit
> > is signaled in emacs_open that is called from openp?

> specbind does the same thing (except that it's specially designed for
> the job at hand of temporarily modifying a variable).

Ah! I see.  Ok, I've just installed this change.

2009-03-09  Kenichi Handa  <handa@m17n.org>

	* charset.c (Qfile_name_handler_alist): Extern it.
	(load_charset_map_from_file): Temporarily bind
	`file-name-handler-alist' to nil while calling openp.

Index: charset.c
===================================================================
RCS file: /cvsroot/emacs/emacs/src/charset.c,v
retrieving revision 1.169
retrieving revision 1.170
diff -u -r1.169 -r1.170
--- charset.c	4 Feb 2009 01:55:07 -0000	1.169
+++ charset.c	9 Mar 2009 01:09:23 -0000	1.170
@@ -477,6 +477,7 @@
   return n;
 }
 
+extern Lisp_Object Qfile_name_handler_alist;
 
 /* Return a mapping vector for CHARSET loaded from MAPFILE.
    Each line of MAPFILE has this form
@@ -490,7 +491,10 @@
    The returned vector has this form:
 	[ CODE1 CHAR1 CODE2 CHAR2 .... ]
    where CODE1 is a code-point or a cons of code-points specifying a
-   range.  */
+   range.
+
+   Note that this funciton uses `openp' to open MAPFILE but ignores
+   `file-name-handler-alist to avoid running any Lisp codes.  */
 
 extern void add_to_log P_ ((char *, Lisp_Object, Lisp_Object));
 
@@ -508,11 +512,14 @@
   Lisp_Object suffixes;
   struct charset_map_entries *head, *entries;
   int n_entries;
+  int count = SPECPDL_INDEX ();
 
   suffixes = Fcons (build_string (".map"),
 		    Fcons (build_string (".TXT"), Qnil));
 
+  specbind (Qfile_name_handler_alist, Qnil);
   fd = openp (Vcharset_map_path, mapfile, suffixes, NULL, Qnil);
+  unbind_to (count, Qnil);
   if (fd < 0
       || ! (fp = fdopen (fd, "r")))
     {

---
Kenichi Handa
handa@m17n.org






^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2009-03-09  1:12 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <87fxhvcmvf.fsf@cyd.mit.edu>
     [not found] ` <E1LePPk-0002zE-RW@etlken>
2009-03-03 16:40   ` bug#2435: Bug 2435 Chong Yidong
2009-03-04  2:16     ` Kenichi Handa
2009-03-04  4:41       ` Chong Yidong
2009-03-04  7:47         ` Kenichi Handa
2009-03-05  4:09           ` Chong Yidong
2009-03-05  6:39             ` Kenichi Handa
2009-03-05  4:42           ` Chong Yidong
2009-03-05 11:23             ` Kenichi Handa
2009-03-05 16:46               ` Stefan Monnier
2009-03-06  3:38                 ` Kenichi Handa
2009-03-06  4:11                   ` Chong Yidong
2009-03-06  4:48                     ` Kenichi Handa
2009-03-06  5:07                       ` Stefan Monnier
2009-03-06  5:21                         ` Kenichi Handa
2009-03-07  4:00                           ` Chong Yidong
2009-03-07 23:07                           ` Stefan Monnier
2009-03-09  1:12                             ` Kenichi Handa

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).