* bug#2435: Bug 2435 [not found] ` <E1LePPk-0002zE-RW@etlken> @ 2009-03-03 16:40 ` Chong Yidong 2009-03-04 2:16 ` Kenichi Handa 0 siblings, 1 reply; 17+ messages in thread From: Chong Yidong @ 2009-03-03 16:40 UTC (permalink / raw) To: Kenichi Handa; +Cc: 2435 Kenichi Handa <handa@m17n.org> writes: >> I get a crash with the following recipe, and I think it's related to the >> new composition code: >> >> emacs -Q >> M-x customize-group RET whitespace RET >> C-v C-v > > I can't reproduce the crash with that. It doesn't happen all the time---about once every three or four attempts. Quite strange. > It seems that the regular expression > tibetan-composable-pattern (which starts with "[\340\275..." > and the byte length is surely 88) is being compiled. > Parhaps the display routine is going to display #xF20 > (tibetan character). But, in my case, regex_compile never > calls analyse_first at line 2853. > > 2846 if (many_times_ok) > 2847 { > 2848 boolean simple = skip_one_char (laststart) == b; > 2849 unsigned int startoffset = 0; > 2850 re_opcode_t ofj = > 2851 /* Check if the loop can match the empty string. */ > 2852 (simple || !analyse_first (laststart, b, NULL, 0)) > 2853 ? on_failure_jump : on_failure_jump_loop; > > Here, "simple" is always set to 1, so analyse_first is not > called. When I get the crash, simple is set to 0. (gdb) p b $4 = (unsigned char *) 0x8b927b7 "" (gdb) p laststart $5 = (unsigned char *) 0x8b92786 "\a\201\f" The return value of `skip_one_char (laststart)' is NULL. > Could you please try: > ESC : (re-search-forward tibetan-composable-pattern) RET > in some buffer and checks if analyse_first is called? It's called, not for the search itself but in the error handler: #0 analyse_first ( p=0x8860bc0 "\t\002\037Previous command was not a yank\n\001\005eval-\0164", pend=0x8860be4 "\005eval-\0164", fastmap=0x83e3984 "", multibyte=0) at regex.c:4022 #1 0x081aa0b5 in re_compile_fastmap (bufp=0x83e3960) at regex.c:4339 #2 0x081aa2c2 in re_search_2 (bufp=0x83e3960, str1=0x0, size1=0, str2=0x89fa230 "Search failed: \"[\340\275\200-\340\275\251\340\275\252][\340\276\220-\340\276\271\340\276\272\340\276\273\340\276\274]*[\340\275\260\366\220\202\216\340\275\261\340\275\262-\340\275\275\340\276\200\340\276\201\340\276\204]*[\340\275\276\340\276\202\340\276\203\340\276\206-\340\276\213\340\274\231\340\274\265\340\274\267]*\"", size2=105, startpos=0, range=105, regs=0x0, stop=105) at regex.c:4487 #3 0x081aa18c in re_search (bufp=0x83e3960, string=0x89fa230 "Search failed: \"[\340\275\200-\340\275\251\340\275\252][\340\276\220-\340\276\271\340\276\272\340\276\273\340\276\274]*[\340\275\260\366\220\202\216\340\275\261\340\275\262-\340\275\275\340\276\200\340\276\201\340\276\204]*[\340\275\276\340\276\202\340\276\203\340\276\206-\340\276\213\340\274\231\340\274\265\340\274\267]*\"", size=105, startpos=0, range=105, regs=0x0) at regex.c:4393 #4 0x081972f1 in fast_string_match (regexp=139244403, string=138652283) at search.c:505 #5 0x081d3fb6 in skip_debugger (conditions=138346117, data=141320317) at eval.c:1862 #6 0x081d4097 in maybe_call_debugger (conditions=138346117, sig=138556777, data=141320309) at eval.c:1889 #7 0x081d41e0 in find_handler_clause (handlers=138402201, conditions=138346117, sig=138556777, data=141320309) at eval.c:1961 #8 0x081d3c34 in Fsignal (error_symbol=138556777, data=141320309) at eval.c:1700 ^ permalink raw reply [flat|nested] 17+ messages in thread
* bug#2435: Bug 2435 2009-03-03 16:40 ` bug#2435: Bug 2435 Chong Yidong @ 2009-03-04 2:16 ` Kenichi Handa 2009-03-04 4:41 ` Chong Yidong 0 siblings, 1 reply; 17+ messages in thread From: Kenichi Handa @ 2009-03-04 2:16 UTC (permalink / raw) To: Chong Yidong; +Cc: 2435 In article <87zlg2a7b9.fsf@cyd.mit.edu>, Chong Yidong <cyd@stupidchicken.com> writes: >>> emacs -Q >>> M-x customize-group RET whitespace RET >>> C-v C-v > > > > I can't reproduce the crash with that. > It doesn't happen all the time---about once every three or four > attempts. Quite strange. I tried more than 10 times without crash. > > 2846 if (many_times_ok) > > 2847 { > > 2848 boolean simple = skip_one_char (laststart) == b; > > 2849 unsigned int startoffset = 0; > > 2850 re_opcode_t ofj = > > 2851 /* Check if the loop can match the empty string. */ > > 2852 (simple || !analyse_first (laststart, b, NULL, 0)) > > 2853 ? on_failure_jump : on_failure_jump_loop; > > > > Here, "simple" is always set to 1, so analyse_first is not > > called. > When I get the crash, simple is set to 0. > (gdb) p b > $4 = (unsigned char *) 0x8b927b7 "" > (gdb) p laststart > $5 = (unsigned char *) 0x8b92786 "\a\201\f" That is different in my case. When the execution reaches the above code (three or four times while displaying that Tibetan char), laststart is always "\004\200". Here the first byte \004 means `charset' OP, and that is reasonable because we are now handling '*' after "[...]". But '\a' (== 7 == stop_memory) is very strange. Please show me these values when simple is set to 0. (gdb) p bufp->buffer $58 = (unsigned char *) 0xa905aa0 "\004\200" (gdb) p laststart $59 = (unsigned char *) 0xa905ab2 "\004\200" (gdb) p bufp->buffer[0]@(b - bufp->buffer) $60 = "\004\200\000\000\002\000@\017\000i\017\000j\017\000j\017\000\004\200\000\000\004\000\220\017\000\271\017\000\272\017\000\272\017\000\273\017\000\273\017\000\274\017\000\274\017" (gdb) p laststart[0]@(b-laststart) $61 = "\004\200\000\000\004\000\220\017\000\271\017\000\272\017\000\272\017\000\273\017\000\273\017\000\274\017\000\274\017" Here, "\220\017\000" => 0xF90 "\271\017\000" => 0xFB9 "\272\017\000" => 0xFBA "\273\017\000" => 0xFBB "\274\017\000" => 0xFBC So, we are now processing the '*' just after the second [...] of tibetan-composable-pattern. --- Kenichi Handa handa@m17n.org ^ permalink raw reply [flat|nested] 17+ messages in thread
* bug#2435: Bug 2435 2009-03-04 2:16 ` Kenichi Handa @ 2009-03-04 4:41 ` Chong Yidong 2009-03-04 7:47 ` Kenichi Handa 0 siblings, 1 reply; 17+ messages in thread From: Chong Yidong @ 2009-03-04 4:41 UTC (permalink / raw) To: Kenichi Handa; +Cc: 2435 Kenichi Handa <handa@m17n.org> writes: >> It doesn't happen all the time---about once every three or four >> attempts. Quite strange. > > I tried more than 10 times without crash. Here are my specs (latest CVS, no modifications): In GNU Emacs 23.0.91.29 (i686-pc-linux-gnu, GTK+ Version 2.14.4) of 2009-03-03 on furry Windowing system distributor `The X.Org Foundation', version 11.0.10502000 configured using `configure 'CC=gcc' 'CFLAGS=-g'' LANG is en_US.UTF-8 I can reproduce it with `emacs -Q'. Do you at least see the redisplay problem reported by the OP? >> When I get the crash, simple is set to 0. > >> (gdb) p b >> $4 = (unsigned char *) 0x8b927b7 "" >> (gdb) p laststart >> $5 = (unsigned char *) 0x8b92786 "\a\201\f" > > That is different in my case. When the execution reaches > the above code (three or four times while displaying that > Tibetan char), laststart is always "\004\200". Here the > first byte \004 means `charset' OP, and that is reasonable > because we are now handling '*' after "[...]". > > But '\a' (== 7 == stop_memory) is very strange. Please show > me these values when simple is set to 0. (gdb) f 2 #2 0x081a1798 in regex_compile ( pattern=0x8356085 "[\340\275\200-\340\275\251\340\275\252][\340\276\220-\340\276\271\340\276\272\340\276\273\340\276\274]*[\340\275\260\366\220\202\216\340\275\261\340\275\262-\340\275\275\340\276\200\340\276\201\340\276\204]*[\340\275\276\340\276\202\340\276\203\340\276\206-\340\276\213\340\274\231\340\274\265\340\274\267]*", size=88, syntax=3408388, bufp=0x83e3210) at regex.c:2853 2853 ? on_failure_jump : on_failure_jump_loop; (gdb) p bufp->buffer $8 = (unsigned char *) 0x8b931d0 "\0169" (gdb) p laststart $10 = (unsigned char *) 0x8b93206 "\a\201\f" (gdb) p bufp->buffer[0]@(b-bufp->buffer) $11 = "\0169\000\002\002.Z\016.\000\006\001\016\006\000\002\001~\r!\000\002\002.~\004\b\000\000\000\000\000\000\377\003\022\r\000\004\b\000\000\000\000\000\000\377\003\r\360\377\002\001~\a\201\f\000\000\a\000p\017\000p\017\000\216\000\031\216\000\031q\017\000q\017\000r\017\000}\017\000\200\017\000\200\017\000\201\017\000\201\017\000\204\017\000\204\017" (gdb) p laststart[0]@(b-laststart) $12 = "\a\201\f\000\000\a\000p\017\000p\017\000\216\000\031\216\000\031q\017\000q\017\000r\017\000}\017\000\200\017\000\200\017\000\201\017\000\201\017\000\204\017\000\204\017" ^ permalink raw reply [flat|nested] 17+ messages in thread
* bug#2435: Bug 2435 2009-03-04 4:41 ` Chong Yidong @ 2009-03-04 7:47 ` Kenichi Handa 2009-03-05 4:09 ` Chong Yidong 2009-03-05 4:42 ` Chong Yidong 0 siblings, 2 replies; 17+ messages in thread From: Kenichi Handa @ 2009-03-04 7:47 UTC (permalink / raw) To: Chong Yidong; +Cc: 2435 In article <87ab81293z.fsf@cyd.mit.edu>, Chong Yidong <cyd@stupidchicken.com> writes: > Here are my specs (latest CVS, no modifications): > In GNU Emacs 23.0.91.29 (i686-pc-linux-gnu, GTK+ Version 2.14.4) of 2009-03-03 on furry > Windowing system distributor `The X.Org Foundation', version 11.0.10502000 > configured using `configure 'CC=gcc' 'CFLAGS=-g'' > LANG is en_US.UTF-8 > I can reproduce it with `emacs -Q'. Mine are the same: In GNU Emacs 23.0.91.3 (i686-pc-linux-gnu, GTK+ Version 2.14.4) of 2009-02-27 on etlken Windowing system distributor `The X.Org Foundation', version 11.0.10502000 configured using `configure 'CFLAGS=-g'' and can't reproduce it with 'LANG=en_US.UTF-8 emacs -Q'. > Do you at least see the redisplay problem reported by the OP? No. There are 5 non-ASCII characters on the line of "Whitespace Hspace Regexp:". The first one is shown as "\240", the second one by an empty box, the others are by proper fonts, and Emacs doesn't stop. > (gdb) f 2 > #2 0x081a1798 in regex_compile ( > pattern=0x8356085 > "[\340\275\200-\340\275\251\340\275\252][\340\276\220-\340\276\271\340\276\272\340\276\273\340\276\274]*[\340\275\260\366\220\202\216\340\275\261\340\275\262-\340\275\275\340\276\200\340\276\201\340\276\204]*[\340\275\276\340\276\202\340\276\203\340\276\206-\340\276\213\340\274\231\340\274\265\340\274\267]*", > size=88, syntax=3408388, bufp=0x83e3210) at regex.c:2853 > 2853 ? on_failure_jump : on_failure_jump_loop; > (gdb) p bufp->buffer > $8 = (unsigned char *) 0x8b931d0 "\0169" > (gdb) p laststart > $10 = (unsigned char *) 0x8b93206 "\a\201\f" > (gdb) p bufp->buffer[0]@(b-bufp->buffer) > $11 = "\0169\000\002\002.Z\016.\000\006\001\016\006\000\002\001~\r!\000\002\002.~\004\b\000\000\000\000\000\000\377\003\022\r\000\004\b\000\000\000\000\000\000\377\003\r\360\377\002\001~\a\201\f\000\000\a\000p\017\000p\017\000\216\000\031\216\000\031q\017\000q\017\000r\017\000}\017\000\200\017\000\200\017\000\201\017\000\201\017\000\204\017\000\204\017" > (gdb) p laststart[0]@(b-laststart) > $12 = "\a\201\f\000\000\a\000p\017\000p\017\000\216\000\031\216\000\031q\017\000q\017\000r\017\000}\017\000\200\017\000\200\017\000\201\017\000\201\017\000\204\017\000\204\017" It seems that `pattern' is correct, but `bufp->buffer' is the compiled code for some of jkr-compr related regexp. Could you please find why that happens? This is the disassembled code of the head of bufp->buffer. on_failer_jump #x39 exactn 2 ".Z" on_failer_jump #x2E start_memory 1 on_failer_jump #x06 exactn 1 "~" jump #x21 exactn 2 ".~" charset "0-9" on_failure_jump_smart 0x08 charset "0-9" jump -0x10 exactn 1 "~" stop_memory \201 <-- should not happen --- Kenichi Handa handa@m17n.org ^ permalink raw reply [flat|nested] 17+ messages in thread
* bug#2435: Bug 2435 2009-03-04 7:47 ` Kenichi Handa @ 2009-03-05 4:09 ` Chong Yidong 2009-03-05 6:39 ` Kenichi Handa 2009-03-05 4:42 ` Chong Yidong 1 sibling, 1 reply; 17+ messages in thread From: Chong Yidong @ 2009-03-05 4:09 UTC (permalink / raw) To: Kenichi Handa; +Cc: 2435 Kenichi Handa <handa@m17n.org> writes: > It seems that `pattern' is correct, but `bufp->buffer' is > the compiled code for some of jkr-compr related regexp. > Could you please find why that happens? I think the problem is that regex_compile can call load_charset, which can call regex_compile. I think regex_compile is not designed to be called recursively, leading to memory corruption. Here is a backtrace (I inserted some debugging code to detect when regex_compile is called recursively): #0 abort () at emacs.c:432 #1 0x08196965 in compile_pattern (pattern=139288251, regp=0x0, translate=138358041, posix=0, multibyte=0) at search.c:262 #2 0x0819727d in fast_string_match (regexp=139288251, string=138555267) at search.c:509 #3 0x0817eb9f in Ffind_file_name_handler (filename=138555267, operation=138416225) at fileio.c:380 #4 0x0817f5b3 in Fexpand_file_name (name=138555267, default_directory=141645459) at fileio.c:859 #5 0x081fb92e in openp (path=138676461, str=138555267, #suffixes=138926685, storeptr=0x0, predicate=138358041) at lread.c:1425 #6 0x080b95f6 in load_charset_map_from_file (charset=0x84b4acc, mapfile=138555267, control_flag=1) at charset.c:515 #7 0x080b9bb1 in load_charset (charset=0x84b4acc, control_flag=1) at charset.c:652 #8 0x080bdce8 in maybe_unify_char (c=1638542, val=138788937) at charset.c:1679 #9 0x080eb5fa in string_char ( p=0x83560ac "\340\275\261\340\275\262-\340\275\275\340\276\200\340\276\201\340\276\204]*[\340\275\276\340\276\202\340\276\203\340\276\206-\340\276\213\340\274\231\340\274\265\340\274\267]*", advanced=0x0, len=0xbfc70d28) at character.c:236 #10 0x081a2ac9 in regex_compile ( pattern=0x8356085 "[\340\275\200-\340\275\251\340\275\252][\340\276\220-\340\276\271\340\276\272\340\276\273\340\276\274]*[\340\275\260\366\220\202\216\340\275\261\340\275\262-\340\275\275\340\276\200\340\276\201\340\276\204]*[\340\275\276\340\276\202\340\276\203\340\276\206-\340\276\213\340\274\231\340\274\265\340\274\267]*", size=88, syntax=3408388, bufp=0x83e3210) at regex.c:2982 #11 0x081b3dca in re_compile_pattern ( pattern=0x8356085 "[\340\275\200-\340\275\251\340\275\252][\340\276\220-\340\276\271\340\276\272\340\276\273\340\276\274]*[\340\275\260\366\220\202\216\340\275\261\340\275\262-\340\275\275\340\276\200\340\276\201\340\276\204]*[\340\275\276\340\276\202\340\276\203\340\276\206-\340\276\213\340\274\231\340\274\265\340\274\267]*", length=88, bufp=0x83e3210) at regex.c:6518 #12 0x08196714 in compile_pattern_1 (cp=0x83e3200, pattern=137244011, translate=138358041, regp=0x0, posix=0) at search.c:160 #13 0x08196996 in compile_pattern (pattern=137244011, regp=0x0, translate=138358041, posix=0, multibyte=1) at search.c:265 ^ permalink raw reply [flat|nested] 17+ messages in thread
* bug#2435: Bug 2435 2009-03-05 4:09 ` Chong Yidong @ 2009-03-05 6:39 ` Kenichi Handa 0 siblings, 0 replies; 17+ messages in thread From: Kenichi Handa @ 2009-03-05 6:39 UTC (permalink / raw) To: Chong Yidong; +Cc: 2435 In article <878wnk7gqb.fsf@cyd.mit.edu>, Chong Yidong <cyd@stupidchicken.com> writes: > Kenichi Handa <handa@m17n.org> writes: > > It seems that `pattern' is correct, but `bufp->buffer' is > > the compiled code for some of jkr-compr related regexp. > > Could you please find why that happens? > I think the problem is that regex_compile can call load_charset, which > can call regex_compile. I think regex_compile is not designed to be > called recursively, leading to memory corruption. Ah! That is a likely cause of the bug. I'll find a way to avoid loading charset at that moment. --- Kenichi Handa handa@m17n.org ^ permalink raw reply [flat|nested] 17+ messages in thread
* bug#2435: Bug 2435 2009-03-04 7:47 ` Kenichi Handa 2009-03-05 4:09 ` Chong Yidong @ 2009-03-05 4:42 ` Chong Yidong 2009-03-05 11:23 ` Kenichi Handa 1 sibling, 1 reply; 17+ messages in thread From: Chong Yidong @ 2009-03-05 4:42 UTC (permalink / raw) To: Kenichi Handa; +Cc: 2435 The following patch tries to avoid the problem by updating the regexp list before using the regexp cell. It seems to fix the problem; or do have a different suggestion? *** trunk/src/search.c.~1.237.~ 2009-02-12 16:45:29.000000000 -0500 --- trunk/src/search.c 2009-03-04 23:38:35.000000000 -0500 *************** *** 134,140 **** char *val; reg_syntax_t old; ! cp->regexp = Qnil; cp->buf.translate = (! NILP (translate) ? translate : make_number (0)); cp->posix = posix; cp->buf.multibyte = STRING_MULTIBYTE (pattern); --- 134,140 ---- char *val; reg_syntax_t old; ! cp->regexp = Qt; cp->buf.translate = (! NILP (translate) ? translate : make_number (0)); cp->posix = posix; cp->buf.multibyte = STRING_MULTIBYTE (pattern); *************** *** 239,245 **** nil should never appear before a non-nil entry. */ if (NILP (cp->regexp)) goto compile_it; ! if (SCHARS (cp->regexp) == SCHARS (pattern) && STRING_MULTIBYTE (cp->regexp) == STRING_MULTIBYTE (pattern) && !NILP (Fstring_equal (cp->regexp, pattern)) && EQ (cp->buf.translate, (! NILP (translate) ? translate : make_number (0))) --- 239,246 ---- nil should never appear before a non-nil entry. */ if (NILP (cp->regexp)) goto compile_it; ! if (STRINGP (cp->regexp) ! && SCHARS (cp->regexp) == SCHARS (pattern) && STRING_MULTIBYTE (cp->regexp) == STRING_MULTIBYTE (pattern) && !NILP (Fstring_equal (cp->regexp, pattern)) && EQ (cp->buf.translate, (! NILP (translate) ? translate : make_number (0))) *************** *** 248,273 **** || EQ (cp->syntax_table, current_buffer->syntax_table)) && !NILP (Fequal (cp->whitespace_regexp, Vsearch_spaces_regexp)) && cp->buf.charset_unibyte == charset_unibyte) ! break; /* If we're at the end of the cache, compile into the nil cell we found, or the last (least recently used) cell with a ! string value. */ if (cp->next == 0) { compile_it: compile_pattern_1 (cp, pattern, translate, regp, posix); break; } } - /* When we get here, cp (aka *cpp) contains the compiled pattern, - either because we found it in the cache or because we just compiled it. - Move it to the front of the queue to mark it as most recently used. */ - *cpp = cp->next; - cp->next = searchbuf_head; - searchbuf_head = cp; - /* Advise the searching functions about the space we have allocated for register data. */ if (regp) --- 249,279 ---- || EQ (cp->syntax_table, current_buffer->syntax_table)) && !NILP (Fequal (cp->whitespace_regexp, Vsearch_spaces_regexp)) && cp->buf.charset_unibyte == charset_unibyte) ! { ! /* We found a compiled pattern. Move it to the front of the ! queue to mark it as most recently used. */ ! *cpp = cp->next; ! cp->next = searchbuf_head; ! searchbuf_head = cp; ! break; ! } /* If we're at the end of the cache, compile into the nil cell we found, or the last (least recently used) cell with a ! string value. We must update the queue before calling ! compile_pattern_1, because compile_pattern_1 can end up ! calling compile_pattern recursively (via load_charset). */ if (cp->next == 0) { compile_it: + *cpp = cp->next; + cp->next = searchbuf_head; + searchbuf_head = cp; compile_pattern_1 (cp, pattern, translate, regp, posix); break; } } /* Advise the searching functions about the space we have allocated for register data. */ if (regp) ^ permalink raw reply [flat|nested] 17+ messages in thread
* bug#2435: Bug 2435 2009-03-05 4:42 ` Chong Yidong @ 2009-03-05 11:23 ` Kenichi Handa 2009-03-05 16:46 ` Stefan Monnier 0 siblings, 1 reply; 17+ messages in thread From: Kenichi Handa @ 2009-03-05 11:23 UTC (permalink / raw) To: Chong Yidong; +Cc: 2435 In article <877i34shq2.fsf@cyd.mit.edu>, Chong Yidong <cyd@stupidchicken.com> writes: > The following patch tries to avoid the problem by updating the regexp > list before using the regexp cell. It seems to fix the problem; or do > have a different suggestion? It will fix the current problem of bufp contents being changed by recursive call, but we still have a danger of Lisp code being called in re_compile_pattern, which may lead to relocation of string data pointed by the arg "pattern". So, I think we must avoid any Lisp calls within re_compile_pattern. --- Kenichi Handa handa@m17n.org ^ permalink raw reply [flat|nested] 17+ messages in thread
* bug#2435: Bug 2435 2009-03-05 11:23 ` Kenichi Handa @ 2009-03-05 16:46 ` Stefan Monnier 2009-03-06 3:38 ` Kenichi Handa 0 siblings, 1 reply; 17+ messages in thread From: Stefan Monnier @ 2009-03-05 16:46 UTC (permalink / raw) To: Kenichi Handa; +Cc: Chong Yidong, 2435 > It will fix the current problem of bufp contents being > changed by recursive call, but we still have a danger of > Lisp code being called in re_compile_pattern, which may lead > to relocation of string data pointed by the arg "pattern". > So, I think we must avoid any Lisp calls within > re_compile_pattern. Indeed, tho it's not clear how we could do that. One way that occurs to me is: if we need to `load' or do some such dangerous operation in re_compile_pattern, signal an error so we exit re_compile_pattern. Then catch this error in the calling code where we can do the corresponding operation safely and call re_compile_pattern again. Kind of ugly, but for autoload-style operations it seems acceptable. Stefan ^ permalink raw reply [flat|nested] 17+ messages in thread
* bug#2435: Bug 2435 2009-03-05 16:46 ` Stefan Monnier @ 2009-03-06 3:38 ` Kenichi Handa 2009-03-06 4:11 ` Chong Yidong 0 siblings, 1 reply; 17+ messages in thread From: Kenichi Handa @ 2009-03-06 3:38 UTC (permalink / raw) To: Stefan Monnier; +Cc: cyd, 2435 In article <jwvbpsfyl5d.fsf-monnier+emacsbugreports@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes: > > It will fix the current problem of bufp contents being > > changed by recursive call, but we still have a danger of > > Lisp code being called in re_compile_pattern, which may lead > > to relocation of string data pointed by the arg "pattern". > > So, I think we must avoid any Lisp calls within > > re_compile_pattern. > Indeed, tho it's not clear how we could do that. One way that occurs to > me is: if we need to `load' or do some such dangerous operation in > re_compile_pattern, signal an error so we exit re_compile_pattern. > Then catch this error in the calling code where we can do the > corresponding operation safely and call re_compile_pattern again. > Kind of ugly, but for autoload-style operations it seems acceptable. We should take care of re_search_2 and re_match_2_internal too. If the problem is only the call of openp in load_charset_map_from_file, and building various Lisp object is ok, we can change load_charset_map_from_file to open a charset map by itself without using openp. --- Kenichi Handa handa@m17n.org ^ permalink raw reply [flat|nested] 17+ messages in thread
* bug#2435: Bug 2435 2009-03-06 3:38 ` Kenichi Handa @ 2009-03-06 4:11 ` Chong Yidong 2009-03-06 4:48 ` Kenichi Handa 0 siblings, 1 reply; 17+ messages in thread From: Chong Yidong @ 2009-03-06 4:11 UTC (permalink / raw) To: Kenichi Handa; +Cc: 2435 Kenichi Handa <handa@m17n.org> writes: > We should take care of re_search_2 and re_match_2_internal > too. > > If the problem is only the call of openp in > load_charset_map_from_file, and building various Lisp object > is ok, we can change load_charset_map_from_file to open a > charset map by itself without using openp. Not using openp would be troublesome. How about binding file-name-handler-alist to nil inside load_charset_map_from_file? That prevents find-file-name-handler from performing the regexp compilation and string search. ^ permalink raw reply [flat|nested] 17+ messages in thread
* bug#2435: Bug 2435 2009-03-06 4:11 ` Chong Yidong @ 2009-03-06 4:48 ` Kenichi Handa 2009-03-06 5:07 ` Stefan Monnier 0 siblings, 1 reply; 17+ messages in thread From: Kenichi Handa @ 2009-03-06 4:48 UTC (permalink / raw) To: Chong Yidong; +Cc: 2435 In article <877i33cmto.fsf@cyd.mit.edu>, Chong Yidong <cyd@stupidchicken.com> writes: > Kenichi Handa <handa@m17n.org> writes: > > We should take care of re_search_2 and re_match_2_internal > > too. > > > > If the problem is only the call of openp in > > load_charset_map_from_file, and building various Lisp object > > is ok, we can change load_charset_map_from_file to open a > > charset map by itself without using openp. > Not using openp would be troublesome. How about binding > file-name-handler-alist to nil inside load_charset_map_from_file? That > prevents find-file-name-handler from performing the regexp compilation > and string search. That's a good idea if "building various Lisp object is ok" is true. Stefan, what do you think? --- Kenichi Handa handa@m17n.org *** charset.c.~1.169.~ 2009-01-30 21:29:36.000000000 +0900 --- charset.c 2009-03-06 13:44:24.000000000 +0900 *************** *** 494,499 **** --- 494,509 ---- extern void add_to_log P_ ((char *, Lisp_Object, Lisp_Object)); + extern Lisp_Object Vfile_name_handler_alist; + + static Lisp_Object + load_charset_map_from_file_restore (arg) + Lisp_Object arg; + { + Vfile_name_handler_alist = arg; + return Qnil; + } + static void load_charset_map_from_file (charset, mapfile, control_flag) struct charset *charset; *************** *** 508,518 **** --- 518,533 ---- Lisp_Object suffixes; struct charset_map_entries *head, *entries; int n_entries; + int count = SPECPDL_INDEX (); suffixes = Fcons (build_string (".map"), Fcons (build_string (".TXT"), Qnil)); + record_unwind_protect (load_charset_map_from_file_restore, + Vfile_name_handler_alist); + Vfile_name_handler_alist = Qnil; fd = openp (Vcharset_map_path, mapfile, suffixes, NULL, Qnil); + unbind_to (count, Qnil); if (fd < 0 || ! (fp = fdopen (fd, "r"))) { ^ permalink raw reply [flat|nested] 17+ messages in thread
* bug#2435: Bug 2435 2009-03-06 4:48 ` Kenichi Handa @ 2009-03-06 5:07 ` Stefan Monnier 2009-03-06 5:21 ` Kenichi Handa 0 siblings, 1 reply; 17+ messages in thread From: Stefan Monnier @ 2009-03-06 5:07 UTC (permalink / raw) To: Kenichi Handa; +Cc: Chong Yidong, 2435 >> > We should take care of re_search_2 and re_match_2_internal >> > too. >> > >> > If the problem is only the call of openp in >> > load_charset_map_from_file, and building various Lisp object >> > is ok, we can change load_charset_map_from_file to open a >> > charset map by itself without using openp. >> Not using openp would be troublesome. How about binding >> file-name-handler-alist to nil inside load_charset_map_from_file? That >> prevents find-file-name-handler from performing the regexp compilation >> and string search. > That's a good idea if "building various Lisp object is ok" > is true. AFAIK building objects is perfectly fine, yes. I wonder why you use record_unwind_protect rather than specbind, tho. Also if we do that, we should add a note in the docstring of charset-map-path that it does not obey file-name-handler-alist. Stefan ^ permalink raw reply [flat|nested] 17+ messages in thread
* bug#2435: Bug 2435 2009-03-06 5:07 ` Stefan Monnier @ 2009-03-06 5:21 ` Kenichi Handa 2009-03-07 4:00 ` Chong Yidong 2009-03-07 23:07 ` Stefan Monnier 0 siblings, 2 replies; 17+ messages in thread From: Kenichi Handa @ 2009-03-06 5:21 UTC (permalink / raw) To: Stefan Monnier; +Cc: cyd, 2435 In article <jwv3adrutp3.fsf-monnier+emacsbugreports@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes: > AFAIK building objects is perfectly fine, yes. > I wonder why you use record_unwind_protect rather than specbind, tho. Don't we need record_unwind_protect for the case that quit is signaled in emacs_open that is called from openp? > Also if we do that, we should add a note in the docstring of > charset-map-path that it does not obey file-name-handler-alist. Ok. --- Kenichi Handa handa@m17n.org ^ permalink raw reply [flat|nested] 17+ messages in thread
* bug#2435: Bug 2435 2009-03-06 5:21 ` Kenichi Handa @ 2009-03-07 4:00 ` Chong Yidong 2009-03-07 23:07 ` Stefan Monnier 1 sibling, 0 replies; 17+ messages in thread From: Chong Yidong @ 2009-03-07 4:00 UTC (permalink / raw) To: Kenichi Handa; +Cc: 2435 Kenichi Handa <handa@m17n.org> writes: >> AFAIK building objects is perfectly fine, yes. >> I wonder why you use record_unwind_protect rather than specbind, tho. > > Don't we need record_unwind_protect for the case that quit > is signaled in emacs_open that is called from openp? Please go ahead and check in the patch, thanks. ^ permalink raw reply [flat|nested] 17+ messages in thread
* bug#2435: Bug 2435 2009-03-06 5:21 ` Kenichi Handa 2009-03-07 4:00 ` Chong Yidong @ 2009-03-07 23:07 ` Stefan Monnier 2009-03-09 1:12 ` Kenichi Handa 1 sibling, 1 reply; 17+ messages in thread From: Stefan Monnier @ 2009-03-07 23:07 UTC (permalink / raw) To: Kenichi Handa; +Cc: cyd, 2435 >> AFAIK building objects is perfectly fine, yes. >> I wonder why you use record_unwind_protect rather than specbind, tho. > Don't we need record_unwind_protect for the case that quit > is signaled in emacs_open that is called from openp? specbind does the same thing (except that it's specially designed for the job at hand of temporarily modifying a variable). >> Also if we do that, we should add a note in the docstring of >> charset-map-path that it does not obey file-name-handler-alist. > Ok. Thanks, Stefan ^ permalink raw reply [flat|nested] 17+ messages in thread
* bug#2435: Bug 2435 2009-03-07 23:07 ` Stefan Monnier @ 2009-03-09 1:12 ` Kenichi Handa 0 siblings, 0 replies; 17+ messages in thread From: Kenichi Handa @ 2009-03-09 1:12 UTC (permalink / raw) To: Stefan Monnier; +Cc: cyd, 2435 In article <jwvocwdszkq.fsf-monnier+emacsbugreports@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes: >>> AFAIK building objects is perfectly fine, yes. >>> I wonder why you use record_unwind_protect rather than specbind, tho. > > Don't we need record_unwind_protect for the case that quit > > is signaled in emacs_open that is called from openp? > specbind does the same thing (except that it's specially designed for > the job at hand of temporarily modifying a variable). Ah! I see. Ok, I've just installed this change. 2009-03-09 Kenichi Handa <handa@m17n.org> * charset.c (Qfile_name_handler_alist): Extern it. (load_charset_map_from_file): Temporarily bind `file-name-handler-alist' to nil while calling openp. Index: charset.c =================================================================== RCS file: /cvsroot/emacs/emacs/src/charset.c,v retrieving revision 1.169 retrieving revision 1.170 diff -u -r1.169 -r1.170 --- charset.c 4 Feb 2009 01:55:07 -0000 1.169 +++ charset.c 9 Mar 2009 01:09:23 -0000 1.170 @@ -477,6 +477,7 @@ return n; } +extern Lisp_Object Qfile_name_handler_alist; /* Return a mapping vector for CHARSET loaded from MAPFILE. Each line of MAPFILE has this form @@ -490,7 +491,10 @@ The returned vector has this form: [ CODE1 CHAR1 CODE2 CHAR2 .... ] where CODE1 is a code-point or a cons of code-points specifying a - range. */ + range. + + Note that this funciton uses `openp' to open MAPFILE but ignores + `file-name-handler-alist to avoid running any Lisp codes. */ extern void add_to_log P_ ((char *, Lisp_Object, Lisp_Object)); @@ -508,11 +512,14 @@ Lisp_Object suffixes; struct charset_map_entries *head, *entries; int n_entries; + int count = SPECPDL_INDEX (); suffixes = Fcons (build_string (".map"), Fcons (build_string (".TXT"), Qnil)); + specbind (Qfile_name_handler_alist, Qnil); fd = openp (Vcharset_map_path, mapfile, suffixes, NULL, Qnil); + unbind_to (count, Qnil); if (fd < 0 || ! (fp = fdopen (fd, "r"))) { --- Kenichi Handa handa@m17n.org ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2009-03-09 1:12 UTC | newest] Thread overview: 17+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- [not found] <87fxhvcmvf.fsf@cyd.mit.edu> [not found] ` <E1LePPk-0002zE-RW@etlken> 2009-03-03 16:40 ` bug#2435: Bug 2435 Chong Yidong 2009-03-04 2:16 ` Kenichi Handa 2009-03-04 4:41 ` Chong Yidong 2009-03-04 7:47 ` Kenichi Handa 2009-03-05 4:09 ` Chong Yidong 2009-03-05 6:39 ` Kenichi Handa 2009-03-05 4:42 ` Chong Yidong 2009-03-05 11:23 ` Kenichi Handa 2009-03-05 16:46 ` Stefan Monnier 2009-03-06 3:38 ` Kenichi Handa 2009-03-06 4:11 ` Chong Yidong 2009-03-06 4:48 ` Kenichi Handa 2009-03-06 5:07 ` Stefan Monnier 2009-03-06 5:21 ` Kenichi Handa 2009-03-07 4:00 ` Chong Yidong 2009-03-07 23:07 ` Stefan Monnier 2009-03-09 1:12 ` Kenichi Handa
Code repositories for project(s) associated with this public inbox https://git.savannah.gnu.org/cgit/emacs.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).