* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping @ 2012-05-19 16:10 Juanma Barranquero 2012-05-19 16:27 ` Eli Zaretskii 0 siblings, 1 reply; 39+ messages in thread From: Juanma Barranquero @ 2012-05-19 16:10 UTC (permalink / raw) To: 11519 Package: emacs Version: 24.1.50 With current trunk (revno:108308) bootstrapping on Windows throws an error while building custom-deps: make -w cus-load.el-CMD make[2]: Entering directory `C:/emacs/debug/lisp' echo ;;; cus-load.el --- automatically extracted custom dependencies> cus-load.el-CMD echo ;;>> cus-load.el-CMD echo ;;; Code:>> cus-load.el-CMD echo.\f>> cus-load.el-CMD echo ;; Local Variables:>> cus-load.el-CMD echo ;; version-control: never>> cus-load.el-CMD echo ;; no-byte-compile: t>> cus-load.el-CMD echo ;; no-update-autoloads: t>> cus-load.el-CMD echo ;; End:>> cus-load.el-CMD make[2]: Leaving directory `C:/emacs/debug/lisp' mv cus-load.el-CMD C:/emacs/debug/lisp/cus-load.el Directories: calc calendar emacs-lisp emulation erc eshell gnus international language mail mh-e net nxml org play progmodes textmodes url vc cedet ce "../src/oo/i386/emacs.exe" -batch --no-site-file --no-site-lisp -l cus-dep --eval "(setq find-file-hook nil)" \ -f custom-make-dependencies C:/Devel/emacs/repo/debug/lisp calc calendar emacs-lisp emulation erc eshell gnus international language Directory C:/emacs/debug/lisp Directory calc Directory calendar Directory emacs-lisp Directory emulation Directory erc Directory eshell Directory gnus Directory international Directory language Wrong type argument: characterp, 4195283 make[1]: [custom-deps] Error -1 (ignored) rm "../src/oo/i386/emacs.exe" After bootstraping, cd lisp make custom-deps works as expected. The error appeared somewhere in the past ten days or so (I have a bootstrap log from 2012-05-08 which does not show the problem). Juanma ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-19 16:10 bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping Juanma Barranquero @ 2012-05-19 16:27 ` Eli Zaretskii 2012-05-19 21:40 ` Juanma Barranquero 0 siblings, 1 reply; 39+ messages in thread From: Eli Zaretskii @ 2012-05-19 16:27 UTC (permalink / raw) To: Juanma Barranquero; +Cc: 11519 > From: Juanma Barranquero <lekktu@gmail.com> > Date: Sat, 19 May 2012 18:10:07 +0200 > > Directory language > Wrong type argument: characterp, 4195283 > make[1]: [custom-deps] Error -1 (ignored) Can you try finding out which file in lisp/language causes that, and on which line? ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-19 16:27 ` Eli Zaretskii @ 2012-05-19 21:40 ` Juanma Barranquero 2012-05-20 17:27 ` Eli Zaretskii 0 siblings, 1 reply; 39+ messages in thread From: Juanma Barranquero @ 2012-05-19 21:40 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 11519 > Can you try finding out which file in lisp/language causes that, and > on which line? Yes, but it is slow; it only happens on bootstraps. Juanma ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-19 21:40 ` Juanma Barranquero @ 2012-05-20 17:27 ` Eli Zaretskii 2012-05-20 19:00 ` Juanma Barranquero 2012-05-21 1:49 ` Stefan Monnier 0 siblings, 2 replies; 39+ messages in thread From: Eli Zaretskii @ 2012-05-20 17:27 UTC (permalink / raw) To: Juanma Barranquero; +Cc: 11519 > From: Juanma Barranquero <lekktu@gmail.com> > Date: Sat, 19 May 2012 23:40:06 +0200 > Cc: 11519@debbugs.gnu.org > > > Can you try finding out which file in lisp/language causes that, and > > on which line? > > Yes, but it is slow; it only happens on bootstraps. If you started looking, I think you can stop. It's a %$#@! heisenbug. If I add a single (message %s file) to custom-make-dependencies, it displays ethio-util.el as the last file name before the error. But adding another message makes the problem disappear, and the bootstrap runs flawlessly to completion. Likewise, running just "make custom-deps" with the emacs.exe built by the first stage of the bootstrap doesn't produce any errors. I will try a few more tricks; ideas for how to debug this, or where to look, are welcome. If you say that this started happening only lately, perhaps bisecting might give some insights. Thanks. ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-20 17:27 ` Eli Zaretskii @ 2012-05-20 19:00 ` Juanma Barranquero 2012-05-21 1:50 ` Stefan Monnier 2012-05-21 1:49 ` Stefan Monnier 1 sibling, 1 reply; 39+ messages in thread From: Juanma Barranquero @ 2012-05-20 19:00 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 11519 On Sun, May 20, 2012 at 7:27 PM, Eli Zaretskii <eliz@gnu.org> wrote: > If you started looking, I think you can stop. It's a %$#@! > heisenbug. I didn't start, I was far from home today. > If you say that this started happening only lately, perhaps bisecting > might give some insights. The bug disappears if you revert this change: revno: 108157 committer: Glenn Morris <rgm@gnu.org> branch nick: trunk timestamp: Mon 2012-05-07 21:50:17 -0400 message: Remove no-byte-compile setting from some lisp/language files. Same comments as per r108075, 2012-04-30, for lisp/term: Not that compiling these will bring any noticeable speed benefit, but there's really no reason not to compile them. The extra disk space and build time is negligible, and it might reveal use of obsolete functions, bugs, etc. Juanma ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-20 19:00 ` Juanma Barranquero @ 2012-05-21 1:50 ` Stefan Monnier 2012-05-21 2:51 ` Eli Zaretskii 0 siblings, 1 reply; 39+ messages in thread From: Stefan Monnier @ 2012-05-21 1:50 UTC (permalink / raw) To: Juanma Barranquero; +Cc: 11519 > The bug disappears if you revert this change: [...] > Remove no-byte-compile setting from some lisp/language files. My gut tells me this is just a trigger but not the cause of the bug :-( Stefan ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-21 1:50 ` Stefan Monnier @ 2012-05-21 2:51 ` Eli Zaretskii 2012-05-21 7:59 ` Andreas Schwab 0 siblings, 1 reply; 39+ messages in thread From: Eli Zaretskii @ 2012-05-21 2:51 UTC (permalink / raw) To: Stefan Monnier; +Cc: lekktu, 11519 > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: Eli Zaretskii <eliz@gnu.org>, 11519@debbugs.gnu.org > Date: Sun, 20 May 2012 21:50:04 -0400 > > > The bug disappears if you revert this change: > [...] > > Remove no-byte-compile setting from some lisp/language files. > > My gut tells me this is just a trigger but not the cause of the bug :-( What could be the cause, given that this commit triggers it? ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-21 2:51 ` Eli Zaretskii @ 2012-05-21 7:59 ` Andreas Schwab 2012-05-21 17:51 ` Eli Zaretskii 0 siblings, 1 reply; 39+ messages in thread From: Andreas Schwab @ 2012-05-21 7:59 UTC (permalink / raw) To: Eli Zaretskii; +Cc: lekktu, 11519 Eli Zaretskii <eliz@gnu.org> writes: >> From: Stefan Monnier <monnier@iro.umontreal.ca> >> Cc: Eli Zaretskii <eliz@gnu.org>, 11519@debbugs.gnu.org >> Date: Sun, 20 May 2012 21:50:04 -0400 >> >> > The bug disappears if you revert this change: >> [...] >> > Remove no-byte-compile setting from some lisp/language files. >> >> My gut tells me this is just a trigger but not the cause of the bug :-( > > What could be the cause, given that this commit triggers it? Perhaps some constant sharing that the byte compiler introduces? Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-21 7:59 ` Andreas Schwab @ 2012-05-21 17:51 ` Eli Zaretskii 2012-05-21 20:39 ` Stefan Monnier 2012-05-22 14:38 ` Kenichi Handa 0 siblings, 2 replies; 39+ messages in thread From: Eli Zaretskii @ 2012-05-21 17:51 UTC (permalink / raw) To: Andreas Schwab; +Cc: lekktu, 11519 > From: Andreas Schwab <schwab@linux-m68k.org> > Cc: Stefan Monnier <monnier@iro.umontreal.ca>, lekktu@gmail.com, 11519@debbugs.gnu.org > Date: Mon, 21 May 2012 09:59:50 +0200 > > Eli Zaretskii <eliz@gnu.org> writes: > > >> From: Stefan Monnier <monnier@iro.umontreal.ca> > >> Cc: Eli Zaretskii <eliz@gnu.org>, 11519@debbugs.gnu.org > >> Date: Sun, 20 May 2012 21:50:04 -0400 > >> > >> > The bug disappears if you revert this change: > >> [...] > >> > Remove no-byte-compile setting from some lisp/language files. > >> > >> My gut tells me this is just a trigger but not the cause of the bug :-( > > > > What could be the cause, given that this commit triggers it? > > Perhaps some constant sharing that the byte compiler introduces? Maybe. But based on the evidence below, and also on the fact that the problem is so elusive and behaves like a classical heisenbug, my first suspicion is elsewhere. After trying a few tricks, I succeeded to catch the bug in GDB. (The full backtrace is near the end of this message.) Here's what I found. First, this really happens while cus-dep.el scans ethio-util.el. The Lisp call which throws the error is this: (re-search-forward (concat "(provide[ \t\n]+\\('\\|(quote[ \t\n]\\)[ \t\n]*" (regexp-quote name) "[ \t\n)]") nil t)) What I see on the C level is this: (gdb) bt #0 xsignal2 (error_symbol=56891898, arg1=56963794, arg2=16781132) at eval.c:1740 #1 0x01024a51 in wrong_type_argument (predicate=56891898, value=16781132) at data.c:111 #2 0x0102b4e8 in Faref (array=61843973, idx=16781132) at data.c:2090 #3 0x0129e66e in char_table_translate (table=61843973, ch=4195283) at chartab.c:680 #4 0x01142f6a in re_search_2 (bufp=0x1933c00, str1=0x10757948 "\002╘⌐\002\303¿\002\312\212\002\307╖\002\333¼\002\334ק\002\341\214\217ß\002\341\214\216ß\002\303▒\002╘╗\002\341\214\215ß\002\303╗\002\341\214\214ß\002\341\214ßכ\002\305⌐\002\307ע\002\305┐\002\307ח\002\305¿\002\305\262\002\303צ\002\306\263\002\341\214\212ß\002\341\214ßי\002\303\220", size1=51024, str2=0x1077fb17 <Address 0x1077fb17 out of bounds>, size2=0, startpos=23749, range=27240, regs=0x19351f0, stop=51024) at regex.c:4422 #5 0x010fbd70 in search_buffer (string=272417249, pos=1, pos_byte=1, lim=48448, lim_byte=51025, n=1, RE=1, trt=61843973, inverse_trt=61841925, posix=0) at search.c:1205 #6 0x010fb578 in search_command (string=272417249, bound=56838170, noerror=56838194, count=56838170, direction=1, RE=1, posix=0) at search.c:997 #7 0x010fef92 in Fre_search_forward (regexp=272417249, bound=56838170, noerror=56838194, count=56838170) at search.c:2162 (gdb) fr 7 #7 0x010fef92 in Fre_search_forward (regexp=272417249, bound=56838170, noerror=56838194, count=56838170) at search.c:2162 2162 return search_command (regexp, bound, noerror, count, 1, 1, 0); (gdb) p regexp $9 = 272417249 (gdb) xstring $10 = (struct Lisp_String *) 0x103cc1e0 "(provide[ \n]+\\('\\|(quote[ \n]\\)[ \n]*ethio-util[ \n)]" (gdb) fr 3 #3 0x0129e66e in char_table_translate (table=61843973, ch=4195283) at chartab.c:680 680 value = Faref (table, make_number (ch)); (gdb) p table $11 = 61843973 (gdb) xtype Lisp_Vectorlike PVEC_CHAR_TABLE (gdb) xchartable $12 = (struct Lisp_Char_Table *) 0x3afaa00 Purpose: "case-table" 3 extra slots (gdb) p current_buffer->text->beg $13 = ( unsigned char *) 0x10744948 ";;; ethio-util.el --- utilities for Ethiopic -*- coding: utf-8-emacs; -*-\n\n;; Copyright (C) 1997-1998, 2002-2012 Free Software Foundation, Inc.\n;; Copyright (C) 1997, 1998, 1999, 2000, 2001, 2002, 20"... (gdb) p current_buffer->pt $14 = 1 So we are searching from buffer beginning, and the buffer text looks perfectly OK. But look what "text" is regex.c trying to search: #4 0x01142f6a in re_search_2 (bufp=0x1933c00, str1=0x10757948 "\002╘⌐\002\303¿\002\312\212\002\307╖\002\333¼\002\334ק\002\341\214\217ß\002\341\214\216ß\002\303▒\002╘╗\002\341\214\215ß\002\303╗\002\341\214\214ß\002\341\214ßכ\002\305⌐\002\307ע\002\305┐\002\307ח\002\305¿\002\305\262\002\303צ\002\306\263\002\341\214\212ß\002\341\214ßי\002\303\220", size1=51024, str2=0x1077fb17 <Address 0x1077fb17 out of bounds>, size2=0, startpos=23749, range=27240, regs=0x19351f0, stop=51024) at regex.c:4422 Total garbage. So I think that what happened is that something, probably the translation through a char-table, caused allocation of a large chunk of memory, which in turn relocated the text of the current buffer behind regex.c's back, which still uses C pointers to the old location of the buffer text. Here's how search.c calls re_search_2: p1 = BEGV_ADDR; s1 = GPT_BYTE - BEGV_BYTE; p2 = GAP_END_ADDR; s2 = ZV_BYTE - GPT_BYTE; if (s1 < 0) { p2 = p1; s2 = ZV_BYTE - BEGV_BYTE; s1 = 0; } if (s2 < 0) { s1 = ZV_BYTE - BEGV_BYTE; s2 = 0; } re_match_object = Qnil; while (n < 0) { EMACS_INT val; val = re_search_2 (bufp, (char *) p1, s1, (char *) p2, s2, pos_byte - BEGV_BYTE, lim_byte - pos_byte, (NILP (Vinhibit_changing_match_data) ? &search_regs : &search_regs_1), /* Don't allow match past current point */ pos_byte - BEGV_BYTE); We pass s1 and s2, which are pointers to buffer text, so if buffer text is relocated, we are screwed. Does this explanation sound reasonable? If so, any ideas how to fix this? Here's the full backtrace: #0 xsignal2 (error_symbol=56891898, arg1=56963794, arg2=16781132) at eval.c:1740 #1 0x01024a51 in wrong_type_argument (predicate=56891898, value=16781132) at data.c:111 #2 0x0102b4e8 in Faref (array=61843973, idx=16781132) at data.c:2090 #3 0x0129e66e in char_table_translate (table=61843973, ch=4195283) at chartab.c:680 #4 0x01142f6a in re_search_2 (bufp=0x1933c00, str1=0x10757948 "\002╘⌐\002\303¿\002\312\212\002\307╖\002\333¼\002\334ק\002 341\214\\217ß\002\341\214\216ß\002\303▒\002╘╗\002\341\214\215ß\002\303╗\002\341\214\214ß\002\341\214ßכ\002\305⌐\002\307ע\002\305┐\002\307ח\002\305¿\002\305\262\002\303צ\002\306\263\002\341\214\212ß\002\341\214ßי\002\303\220", size1=51024, str2=0x1077fb17 <Address 0x1077fb17 out of bounds>, size2=0, startpos=23749, range=27240, regs=0x19351f0, stop=51024) at regex.c:4422 #5 0x010fbd70 in search_buffer (string=272417249, pos=1, pos_byte=1, lim=48448, lim_byte=51025, n=1, RE=1, trt=61843973, inverse_trt=61841925, posix=0) at search.c:1205 #6 0x010fb578 in search_command (string=272417249, bound=56838170, noerror=56838194, count=56838170, direction=1, RE=1, posix=0) at search.c:997 #7 0x010fef92 in Fre_search_forward (regexp=272417249, bound=56838170, noerror=56838194, count=56838170) at search.c:2162 #8 0x01036e84 in Ffuncall (nargs=4, args=0x82df40) at eval.c:2946 #9 0x011236db in exec_byte_code (bytestr=66621569, vector=62818821, maxdepth=32, args_template=56838170, nargs=0, args=0x0) at bytecode.c:785 #10 0x01037c4e in funcall_lambda (fun=61992805, nargs=0, arg_vector=0x82e1b4) at eval.c:3166 #11 0x0103712c in Ffuncall (nargs=1, args=0x82e1b0) at eval.c:2984 #12 0x01034c3c in eval_sub (form=59285366) at eval.c:2255 #13 0x01030431 in Fprogn (args=59285382) at eval.c:364 #14 0x010302b1 in Fif (args=59285334) at eval.c:315 #15 0x01034986 in eval_sub (form=59285310) at eval.c:2231 #16 0x01030431 in Fprogn (args=59285390) at eval.c:364 #17 0x01030389 in Fcond (args=59285398) at eval.c:342 #18 0x01034986 in eval_sub (form=59284782) at eval.c:2231 #19 0x01030431 in Fprogn (args=59309078) at eval.c:364 #20 0x01031bb3 in FletX (args=59286038) at eval.c:983 #21 0x01034986 in eval_sub (form=59285950) at eval.c:2231 #22 0x01030431 in Fprogn (args=59309158) at eval.c:364 #23 0x010320db in Fwhile (args=59285942) at eval.c:1075 #24 0x01034986 in eval_sub (form=59285934) at eval.c:2231 #25 0x01030431 in Fprogn (args=59309166) at eval.c:364 #26 0x01032044 in Flet (args=59285606) at eval.c:1053 #27 0x01034986 in eval_sub (form=59245286) at eval.c:2231 #28 0x01030431 in Fprogn (args=62798582) at eval.c:364 #29 0x01034986 in eval_sub (form=62798590) at eval.c:2231 #30 0x01030291 in Fif (args=62798606) at eval.c:314 #31 0x01034986 in eval_sub (form=62798614) at eval.c:2231 #32 0x0103530e in eval_sub (form=59245270) at eval.c:2344 #33 0x01030431 in Fprogn (args=59309182) at eval.c:364 #34 0x01032044 in Flet (args=59245262) at eval.c:1053 #35 0x01034986 in eval_sub (form=59245198) at eval.c:2231 #36 0x01030431 in Fprogn (args=59308166) at eval.c:364 #37 0x01037b1a in funcall_lambda (fun=59308190, nargs=1, arg_vector=0x82f1c0) at eval.c:3159 #38 0x01037461 in apply_lambda (fun=59308198, args=59276566) at eval.c:3043 #39 0x0103533d in eval_sub (form=59276590) at eval.c:2347 #40 0x01030431 in Fprogn (args=59276558) at eval.c:364 #41 0x01037b1a in funcall_lambda (fun=59275998, nargs=0, arg_vector=0x82f420) at eval.c:3159 #42 0x01037461 in apply_lambda (fun=59275990, args=56838170) at eval.c:3043 #43 0x0103533d in eval_sub (form=59187782) at eval.c:2347 #44 0x0103263c in Funwind_protect (args=59187766) at eval.c:1304 #45 0x01034986 in eval_sub (form=59187790) at eval.c:2231 #46 0x01030431 in Fprogn (args=59181374) at eval.c:364 #47 0x01032044 in Flet (args=59187798) at eval.c:1053 #48 0x01034986 in eval_sub (form=59187830) at eval.c:2231 #49 0x01030431 in Fprogn (args=59181366) at eval.c:364 #50 0x010302b1 in Fif (args=59066374) at eval.c:315 #51 0x01034986 in eval_sub (form=59066382) at eval.c:2231 #52 0x01030431 in Fprogn (args=59210614) at eval.c:364 #53 0x01037b1a in funcall_lambda (fun=59210590, nargs=0, arg_vector=0x82fb30) at eval.c:3159 #54 0x01037461 in apply_lambda (fun=59210582, args=56838170) at eval.c:3043 #55 0x0103533d in eval_sub (form=58576574) at eval.c:2347 #56 0x0103450f in Feval (form=58576574, lexical=56838170) at eval.c:2137 #57 0x01005123 in top_level_2 () at keyboard.c:1169 #58 0x01032ab9 in internal_condition_case (bfun=0x1005107 <top_level_2>, handlers=56891826, hfun=0x1004cba <cmd_error>) at eval.c:1448 #59 0x01005155 in top_level_1 (ignore=56838170) at keyboard.c:1177 #60 0x0103247c in internal_catch (tag=56889874, func=0x1005125 <top_level_1>, arg=56838170) at eval.c:1205 #61 0x01005090 in command_loop () at keyboard.c:1132 #62 0x01004678 in recursive_edit_1 () at keyboard.c:759 #63 0x0100499a in Frecursive_edit () at keyboard.c:823 #64 0x010027c9 in main (argc=39, argv=0xa34058) at emacs.c:1711 Lisp Backtrace: "re-search-forward" (0x82df44) "custom-make-dependencies" (0x82e1b4) "funcall" (0x82e1b0) "if" (0x82e430) "cond" (0x82e5c0) "let*" (0x82e790) "while" (0x82e930) "let" (0x82eb40) "progn" (0x82ec90) "if" (0x82ede0) "when" (0x82eef0) "let" (0x82f0f0) "command-line-1" (0x82f1c0) "command-line" (0x82f420) "unwind-protect" (0x82f6d0) "let" (0x82f8d0) "if" (0x82fa60) "normal-top-level" (0x82fb30) ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-21 17:51 ` Eli Zaretskii @ 2012-05-21 20:39 ` Stefan Monnier 2012-05-22 19:00 ` Eli Zaretskii 2012-05-22 14:38 ` Kenichi Handa 1 sibling, 1 reply; 39+ messages in thread From: Stefan Monnier @ 2012-05-21 20:39 UTC (permalink / raw) To: Eli Zaretskii; +Cc: lekktu, Andreas Schwab, 11519 > So I think that what happened is that something, probably the > translation through a char-table, caused allocation of a large chunk > of memory, which in turn relocated the text of the current buffer > behind regex.c's back, which still uses C pointers to the old location > of the buffer text. Here's how search.c calls re_search_2: > p1 = BEGV_ADDR; > s1 = GPT_BYTE - BEGV_BYTE; > p2 = GAP_END_ADDR; > s2 = ZV_BYTE - GPT_BYTE; > if (s1 < 0) > { > p2 = p1; > s2 = ZV_BYTE - BEGV_BYTE; > s1 = 0; > } > if (s2 < 0) > { > s1 = ZV_BYTE - BEGV_BYTE; > s2 = 0; > } > re_match_object = Qnil; > while (n < 0) > { > EMACS_INT val; > val = re_search_2 (bufp, (char *) p1, s1, (char *) p2, s2, > pos_byte - BEGV_BYTE, lim_byte - pos_byte, > (NILP (Vinhibit_changing_match_data) > ? &search_regs : &search_regs_1), > /* Don't allow match past current point */ > pos_byte - BEGV_BYTE); > We pass s1 and s2, which are pointers to buffer text, so if buffer > text is relocated, we are screwed. > Does this explanation sound reasonable? If so, any ideas how to fix > this? I suggest you let-bind some witness variable is re_search_2 and then in the buffer-relocation code, you test this var and abort if it's non-nil. That should let us catch the offender red-handed, after which we will know better how to fix the problem. Stefan ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-21 20:39 ` Stefan Monnier @ 2012-05-22 19:00 ` Eli Zaretskii 2012-05-22 19:19 ` Stefan Monnier 0 siblings, 1 reply; 39+ messages in thread From: Eli Zaretskii @ 2012-05-22 19:00 UTC (permalink / raw) To: Stefan Monnier, Kenichi Handa; +Cc: lekktu, schwab, 11519 > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: Andreas Schwab <schwab@linux-m68k.org>, lekktu@gmail.com, 11519@debbugs.gnu.org > Date: Mon, 21 May 2012 16:39:56 -0400 > > I suggest you let-bind some witness variable is re_search_2 and then in > the buffer-relocation code, you test this var and abort if it's non-nil. > That should let us catch the offender red-handed, after which we will > know better how to fix the problem. I did the equivalent of the above, but without changing the source (to minimize the chances that the bug will disappear). Is the evidence below conclusive enough? Breakpoint 3, search_buffer (string=272417249, pos=1, pos_byte=1, lim=48448, lim_byte=51025, n=1, RE=1, trt=61843973, inverse_trt=61841925, posix=0) at search.c:1206 1206 val = re_search_2 (bufp, (char *) p1, s1, (char *) p2, s2, $1771 = 272417249 $1772 = (struct Lisp_String *) 0x103cc1e0 "(provide[ \n]+\\('\\|(quote[ \n]\\)[ \n]*ethio-util[ \n)]" (gdb) p current_buffer->text->beg $1773 = ( unsigned char *) 0x10757948 ";;; ethio-util.el --- utilities for Ethiopic -*- coding: utf-8-emacs; -*-\n\n;; Copyright (C) 1997-1998, 2002-2012 Free Software Foundation, Inc.\n;; Copyright (C) 1997, 1998, 1999, 2000, 2001, 2002, 20"... (gdb) p *p1@20 $1774 = ";;; ethio-util.el --" (gdb) p p1 $1775 = ( unsigned char *) 0x10757948 ";;; ethio-util.el --- utilities for Ethiopic -*- coding: utf-8-emacs; -*-\n\n;; Copyright (C) 1997-1998, 2002-2012 Free Software Foundation, Inc.\n;; Copyright (C) 1997, 1998, 1999, 2000, 2001, 2002, 20"... So at this point, before we call re_search_2, p1 and current_buffer->text->beg point to the same memory. Now: (gdb) watch current_buffer->text->beg Hardware watchpoint 4: current_buffer->text->beg (gdb) c Continuing. Hardware watchpoint 4: current_buffer->text->beg Old value = (unsigned char *) 0x10757948 ";;; ethio-util.el --- utilities for Ethiopic -*- coding: utf-8-emacs; -*-\n\n;; Copyright (C) 1997-1998, 2002-2012 Free Soft ware Foundation, Inc.\n;; Copyright (C) 1997, 1998, 1999, 2000, 2001, 2002, 20". .. New value = (unsigned char *) 0x10826948 ";;; ethio-util.el --- utilities for Ethiopic -*- coding: utf-8-emacs; -*-\n\n;; Copyright (C) 1997-1998, 2002-2012 Free Soft ware Foundation, Inc.\n;; Copyright (C) 1997, 1998, 1999, 2000, 2001, 2002, 20"... r_alloc_sbrk (size=847872) at ralloc.c:808 808 for (b = last_bloc; b != NIL_BLOC; b = b->prev) Note that the address of buffer text has changed from 0x10757948 to 0x10826948. And the culprit is ... (gdb) bt #0 r_alloc_sbrk (size=847872) at ralloc.c:808 #1 0x012e9dc0 in get_contiguous_space (size=847872, position=0x10748000) at gmalloc.c:447 #2 0x012ea662 in _malloc_internal_nolock (size=786436) at gmalloc.c:821 #3 0x012eaa4c in _malloc_internal (size=786436) at gmalloc.c:904 #4 0x012eaa99 in e_malloc (size=786436) at gmalloc.c:927 #5 0x0103a3b2 in emacs_blocked_malloc (size=786436, ptr=0x0) at alloc.c:1308 #6 0x012eaa99 in e_malloc (size=786436) at gmalloc.c:927 #7 0x010397fb in xmalloc (size=786436) at alloc.c:727 #8 0x0120da2d in load_charset_map_from_file (charset=0x1944970, mapfile=57455953, control_flag=1) at charset.c:501 #9 0x0120e480 in load_charset (charset=0x1944970, control_flag=1) at charset.c:646 #10 0x01214cc9 in maybe_unify_char (c=1704385, val=57027682) at charset.c:1644 #11 0x0128b4a0 in string_char ( p=0x1075d630 " 2.÷áחג 3.÷áחד 4.÷áחה 5.÷áח\200\")\n (cond\n ((= arg ?1)\n (insert \"÷áח\201\"))\n ((= arg ?2)\n (insert \"÷áחג\"))\n ((= arg ?3)\n (insert \"÷áחד\"))\n ((= arg ?4)\n (insert \"÷áחה\"))\n ((= arg ?5"..., advanced=0x0, len=0x82dcec) at character.c:200 #12 0x01142f65 in re_search_2 (bufp=0x1933c08, str1=0x10757948 ";;; ethio-util.el --- utilities for Ethiopic -*- coding: utf-8-emacs; -*-\n\n;; Copyright (C) 1997-1998, 2002-2012 Free Software Foundation, Inc.\n;; Copyright (C) 1997, 1998, 1999, 2000, 2001, 2002, 20"..., size1=51024, str2=0x1077fb17 "", size2=0, startpos=23749, range=27244, regs=0x19351f8, stop=51024) at regex.c:4421 #13 0x010fbd78 in search_buffer (string=272417249, pos=1, pos_byte=1, lim=48448, lim_byte=51025, n=1, RE=1, trt=61843973, inverse_trt=61841925, posix=0) at search.c:1207 #14 0x010fb578 in search_command (string=272417249, bound=56838170, noerror=56838194, count=56838170, direction=1, RE=1, posix=0) at search.c:997 #15 0x010fefa4 in Fre_search_forward (regexp=272417249, bound=56838170, noerror=56838194, count=56838170) at search.c:2165 The fragment of regex.c that triggers this is as follows: if (RE_TRANSLATE_P (translate)) { if (multibyte) while (range > lim) { int buf_charlen; >>>>>>>>>>>>>>>>> buf_ch = STRING_CHAR_AND_LENGTH (d, buf_charlen); buf_ch = RE_TRANSLATE (translate, buf_ch); if (fastmap[CHAR_LEADING_CODE (buf_ch)]) break; range -= buf_charlen; d += buf_charlen; } The marked line calls string_char, which calls maybe_unify_char, which calls load_charset, which causes memory allocation and relocation of buffer text. The very next call to RE_TRANSLATE, which calls char_table_translate, throws an error because buf_ch is garbage. If you agree with the diagnosis, then how about the change below? It fixes the problem for me. (Or is there a better way?) If accepted, I will add the necessary commentary to this code and a prototype for the new function. In any case, I suggest to install the fix on the emacs-24 branch, because this issue is a disaster waiting to happen. === modified file 'src/ralloc.c' --- src/ralloc.c 2012-04-16 01:18:13 +0000 +++ src/ralloc.c 2012-05-22 18:39:25 +0000 @@ -1143,6 +1143,12 @@ r_alloc_reset_variable (POINTER *old, PO bloc->variable = new; } +void +r_alloc_inhibit_buffer_relocation (int inhibit) +{ + use_relocatable_buffers = (inhibit ? 0 : 1); +} + \f /*********************************************************************** Initialization === modified file 'src/search.c' --- src/search.c 2012-05-17 00:03:49 +0000 +++ src/search.c 2012-05-22 18:41:23 +0000 @@ -1158,12 +1158,19 @@ search_buffer (Lisp_Object string, EMACS while (n < 0) { EMACS_INT val; + +#ifdef REL_ALLOC + r_alloc_inhibit_buffer_relocation (1); +#endif val = re_search_2 (bufp, (char *) p1, s1, (char *) p2, s2, pos_byte - BEGV_BYTE, lim_byte - pos_byte, (NILP (Vinhibit_changing_match_data) ? &search_regs : &search_regs_1), /* Don't allow match past current point */ pos_byte - BEGV_BYTE); +#ifdef REL_ALLOC + r_alloc_inhibit_buffer_relocation (0); +#endif if (val == -2) { matcher_overflow (); @@ -1202,11 +1209,18 @@ search_buffer (Lisp_Object string, EMACS while (n > 0) { EMACS_INT val; + +#ifdef REL_ALLOC + r_alloc_inhibit_buffer_relocation (1); +#endif val = re_search_2 (bufp, (char *) p1, s1, (char *) p2, s2, pos_byte - BEGV_BYTE, lim_byte - pos_byte, (NILP (Vinhibit_changing_match_data) ? &search_regs : &search_regs_1), lim_byte - BEGV_BYTE); +#ifdef REL_ALLOC + r_alloc_inhibit_buffer_relocation (0); +#endif if (val == -2) { matcher_overflow (); ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-22 19:00 ` Eli Zaretskii @ 2012-05-22 19:19 ` Stefan Monnier 2012-05-22 19:47 ` Eli Zaretskii 2012-05-23 14:10 ` Kenichi Handa 0 siblings, 2 replies; 39+ messages in thread From: Stefan Monnier @ 2012-05-22 19:19 UTC (permalink / raw) To: Eli Zaretskii; +Cc: schwab, 11519, lekktu > Note that the address of buffer text has changed from 0x10757948 to > 0x10826948. And the culprit is ... > #7 0x010397fb in xmalloc (size=786436) at alloc.c:727 > #8 0x0120da2d in load_charset_map_from_file (charset=0x1944970, > mapfile=57455953, control_flag=1) at charset.c:501 Huh! Indeed, I always assumed that relocation would be something that can only happen during GC, not in a mere xmalloc. > The marked line calls string_char, which calls maybe_unify_char, which > calls load_charset, which causes memory allocation and relocation of > buffer text. This brings up back to the issue of calling maybe_unify_char in STRING_CHAR_AND_LENGTH. One more strike against it. Handa, could you prepare a patch that removes this? > If you agree with the diagnosis, then how about the change below? Might be an acceptable workaround for the emacs-24 branch, yes (tho I'd replace "inhibit ? 0 : 1" with "!inhibit"). But is it really new in Emacs-24? It seems the same problem is already present in Emacs-23, so it's probably not so urgent to fix it for 24.1. > It fixes the problem for me. (Or is there a better way?) If accepted, I > will add the necessary commentary to this code and a prototype for the > new function. In any case, I suggest to install the fix on the > emacs-24 branch, because this issue is a disaster waiting to happen. I wonder: why do we use REL_ALLOC? Stefan ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-22 19:19 ` Stefan Monnier @ 2012-05-22 19:47 ` Eli Zaretskii 2012-05-23 0:47 ` Stefan Monnier 2012-05-23 14:10 ` Kenichi Handa 1 sibling, 1 reply; 39+ messages in thread From: Eli Zaretskii @ 2012-05-22 19:47 UTC (permalink / raw) To: Stefan Monnier; +Cc: schwab, 11519, lekktu > From: Stefan Monnier <monnier@IRO.UMontreal.CA> > Cc: Kenichi Handa <handa@gnu.org>, schwab@linux-m68k.org, lekktu@gmail.com, > 11519@debbugs.gnu.org > Date: Tue, 22 May 2012 15:19:12 -0400 > > > Note that the address of buffer text has changed from 0x10757948 to > > 0x10826948. And the culprit is ... > > > #7 0x010397fb in xmalloc (size=786436) at alloc.c:727 > > #8 0x0120da2d in load_charset_map_from_file (charset=0x1944970, > > mapfile=57455953, control_flag=1) at charset.c:501 > > Huh! Indeed, I always assumed that relocation would be something that > can only happen during GC, not in a mere xmalloc. It happens in xmalloc/xrealloc when ralloc.c is used. > > If you agree with the diagnosis, then how about the change below? > > Might be an acceptable workaround for the emacs-24 branch, yes (tho I'd > replace "inhibit ? 0 : 1" with "!inhibit"). But is it really new in > Emacs-24? It seems the same problem is already present in Emacs-23, so > it's probably not so urgent to fix it for 24.1. The problem is indeed not new, but so what? It is real, and it just happened to us in real life, albeit on the trunk. Who knows how many other problems which we dismiss as not reproducible could have been caused by this (especially when exotic character sets were involved)? > I wonder: why do we use REL_ALLOC? AFAIK, we do that only on platforms that don't support mmap for allocating buffer text. The relocatable allocator makes a more efficient use of memory, especially when it is almost full: it can potentially produce a large block from several small ones by moving buffer text around to "compact" their storage. ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-22 19:47 ` Eli Zaretskii @ 2012-05-23 0:47 ` Stefan Monnier 2012-05-23 2:59 ` Eli Zaretskii 0 siblings, 1 reply; 39+ messages in thread From: Stefan Monnier @ 2012-05-23 0:47 UTC (permalink / raw) To: Eli Zaretskii; +Cc: schwab, 11519, lekktu >> > If you agree with the diagnosis, then how about the change below? >> Might be an acceptable workaround for the emacs-24 branch, yes (tho I'd >> replace "inhibit ? 0 : 1" with "!inhibit"). But is it really new in >> Emacs-24? It seems the same problem is already present in Emacs-23, so >> it's probably not so urgent to fix it for 24.1. > The problem is indeed not new, but so what? It is real, and it just > happened to us in real life, albeit on the trunk. Who knows how many > other problems which we dismiss as not reproducible could have been > caused by this (especially when exotic character sets were involved)? I assume that the problem can show up in many other places than re_search, so yes, it's a real problem that we need to fix for real, but your workaround will only fix the problem we happened to bump into, so adding this fix to the emacs-24 branch may be completely useless if this bug never shows up at that place there. >> I wonder: why do we use REL_ALLOC? > AFAIK, we do that only on platforms that don't support mmap for > allocating buffer text. So, IIUC the only reason to use it is so that we can more often return memory to the OS even for the non-mmap case? Is that because returning memory can only be done via sbrk style memory management? I wonder how effective it is in practice. Stefan ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-23 0:47 ` Stefan Monnier @ 2012-05-23 2:59 ` Eli Zaretskii 2012-05-23 14:16 ` Stefan Monnier 0 siblings, 1 reply; 39+ messages in thread From: Eli Zaretskii @ 2012-05-23 2:59 UTC (permalink / raw) To: Stefan Monnier, Richard Stallman; +Cc: schwab, 11519, lekktu > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: handa@gnu.org, schwab@linux-m68k.org, lekktu@gmail.com, 11519@debbugs.gnu.org > Date: Tue, 22 May 2012 20:47:54 -0400 > > I assume that the problem can show up in many other places than > re_search, so yes, it's a real problem that we need to fix for real, but > your workaround will only fix the problem we happened to bump into Which other places use C pointers to buffer text and call functions that can allocate memory? If you know about such places, please point them out. I don't think we have them, but that's me. > adding this fix to the emacs-24 branch may be completely useless if this > bug never shows up at that place there. That "if" is profoundly false, as I can now easily craft a use case where it _will_ show up. Anyway, are you against committing this to the release branch? I'd be very sad if you were, having invested so much time in hunting this bug, but I guess I'll survive. > >> I wonder: why do we use REL_ALLOC? > > AFAIK, we do that only on platforms that don't support mmap for > > allocating buffer text. > > So, IIUC the only reason to use it is so that we can more often return > memory to the OS even for the non-mmap case? Is that because returning > memory can only be done via sbrk style memory management? I don't think this is only about _returning_ memory. It is first and foremost about not _asking_ for more memory when we can come up with it by reshuffling buffer text. > I wonder how effective it is in practice. I have no idea. ralloc.c was in place (and used by all platforms) long before I became involved with Emacs. Perhaps Richard can tell. ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-23 2:59 ` Eli Zaretskii @ 2012-05-23 14:16 ` Stefan Monnier 2012-05-23 15:23 ` Ken Brown ` (2 more replies) 0 siblings, 3 replies; 39+ messages in thread From: Stefan Monnier @ 2012-05-23 14:16 UTC (permalink / raw) To: Eli Zaretskii; +Cc: schwab, Richard Stallman, 11519, lekktu > Which other places use C pointers to buffer text and call functions > that can allocate memory? IIUC any place that uses STRING_CHAR_AND_LENGTH on buffer text is vulnerable to the problem. > Anyway, are you against committing this to the release branch? I'd be > very sad if you were, having invested so much time in hunting this > bug, but I guess I'll survive. I'm not dead set against it, and I'm glad we found the culprit so we can fix it: fixing it on the release branch is not that important, since this bug has been with us since Emacs-23.1, AFAICT. If you really want to install your workaround on the emacs-24 branch, go for it but let's try to find a real fix for the trunk. >> >> I wonder: why do we use REL_ALLOC? >> > AFAIK, we do that only on platforms that don't support mmap for >> > allocating buffer text. >> So, IIUC the only reason to use it is so that we can more often return >> memory to the OS even for the non-mmap case? Is that because returning >> memory can only be done via sbrk style memory management? > I don't think this is only about _returning_ memory. It is first and > foremost about not _asking_ for more memory when we can come up with > it by reshuffling buffer text. So you're saying it's use for fragmentation reasons? But on other platforms where we use mmap, we do suffer from this fragmentation, and yet it doesn't seem to be a real source of problem. That's why I think the only real reason is because memory can only be returned via sbrk-style memory management (i.e. only free memory at the end of the heap can be returned). Is that right? I guess my question turns into "why do we use gmalloc.c instead of a malloc library that uses mmap (or some other mechanism that lets it return large free chunks to the OS)"? AFAIK, Windows is pretty much the only system where we use gmalloc.c and ralloc.c nowadays. Does anyone remember why we don't use the system malloc under Windows (and Cygwin)? Stefan ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-23 14:16 ` Stefan Monnier @ 2012-05-23 15:23 ` Ken Brown 2012-05-23 16:52 ` Eli Zaretskii 2012-05-23 17:34 ` Eli Zaretskii 2 siblings, 0 replies; 39+ messages in thread From: Ken Brown @ 2012-05-23 15:23 UTC (permalink / raw) To: Stefan Monnier; +Cc: lekktu, schwab, Richard Stallman, 11519 On 5/23/2012 10:16 AM, Stefan Monnier wrote: >> Which other places use C pointers to buffer text and call functions >> that can allocate memory? > > IIUC any place that uses STRING_CHAR_AND_LENGTH on buffer text is > vulnerable to the problem. > >> Anyway, are you against committing this to the release branch? I'd be >> very sad if you were, having invested so much time in hunting this >> bug, but I guess I'll survive. > > I'm not dead set against it, and I'm glad we found the culprit so we can > fix it: fixing it on the release branch is not that important, since this > bug has been with us since Emacs-23.1, AFAICT. > > If you really want to install your workaround on the emacs-24 branch, go > for it but let's try to find a real fix for the trunk. > >>>>> I wonder: why do we use REL_ALLOC? >>>> AFAIK, we do that only on platforms that don't support mmap for >>>> allocating buffer text. >>> So, IIUC the only reason to use it is so that we can more often return >>> memory to the OS even for the non-mmap case? Is that because returning >>> memory can only be done via sbrk style memory management? >> I don't think this is only about _returning_ memory. It is first and >> foremost about not _asking_ for more memory when we can come up with >> it by reshuffling buffer text. > > So you're saying it's use for fragmentation reasons? > But on other platforms where we use mmap, we do suffer from this > fragmentation, and yet it doesn't seem to be a real source of problem. > That's why I think the only real reason is because memory can only be > returned via sbrk-style memory management (i.e. only free memory at the > end of the heap can be returned). Is that right? > > I guess my question turns into "why do we use gmalloc.c instead of > a malloc library that uses mmap (or some other mechanism that lets it > return large free chunks to the OS)"? > > AFAIK, Windows is pretty much the only system where we use gmalloc.c and > ralloc.c nowadays. Does anyone remember why we don't use the system > malloc under Windows (and Cygwin)? Cygwin uses gmalloc.c but not ralloc.c; it uses mmap for buffers. There are two reasons for using gmalloc.c on Cygwin. The first, which may or may not be important, is that Cygwin's malloc doesn't support __after_morecore_hook, malloc_get_state, or malloc_set_state. The second has to do with the way emacs is dumped on Cygwin. See the comment starting at line 302 in gmalloc.c. I would love to find a better way to deal with this and be able to use the system malloc on Cygwin. Ken ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-23 14:16 ` Stefan Monnier 2012-05-23 15:23 ` Ken Brown @ 2012-05-23 16:52 ` Eli Zaretskii 2012-05-23 20:07 ` Stefan Monnier 2012-05-23 17:34 ` Eli Zaretskii 2 siblings, 1 reply; 39+ messages in thread From: Eli Zaretskii @ 2012-05-23 16:52 UTC (permalink / raw) To: Stefan Monnier; +Cc: schwab, rms, 11519, lekktu > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: Richard Stallman <rms@gnu.org>, handa@gnu.org, schwab@linux-m68k.org, lekktu@gmail.com, 11519@debbugs.gnu.org > Date: Wed, 23 May 2012 10:16:17 -0400 > > > Which other places use C pointers to buffer text and call functions > > that can allocate memory? > > IIUC any place that uses STRING_CHAR_AND_LENGTH on buffer text is > vulnerable to the problem. That's not true. As long as you access buffer text through character position, you are safe. The only situation where you are vulnerable is if you store a C pointer to buffer text, e.g., like this: char *text = BEGV_ADDR; or char *text = BYTE_POS_ADDR (current_buffer->pt); then invoke some function that can allocate or reallocate memory, and _then_ access buffer text through that pointer. If you find such a code anywhere, then that's a bug similar to this one. > If you really want to install your workaround on the emacs-24 branch, go > for it but let's try to find a real fix for the trunk. What kind of real fix are you looking for? I agree with Handa-san: being able to suppress relocation in select places is a good feature. Why shouldn't it be the fix in this case, and what better fix can we invent when we use an essentially externally maintained code (AFAIR, regex will at some point be re-sync'ed with gnulib) that cannot be expected to change its code radically so as not to access buffer text through C pointers? > >> >> I wonder: why do we use REL_ALLOC? > >> > AFAIK, we do that only on platforms that don't support mmap for > >> > allocating buffer text. > >> So, IIUC the only reason to use it is so that we can more often return > >> memory to the OS even for the non-mmap case? Is that because returning > >> memory can only be done via sbrk style memory management? > > I don't think this is only about _returning_ memory. It is first and > > foremost about not _asking_ for more memory when we can come up with > > it by reshuffling buffer text. > > So you're saying it's use for fragmentation reasons? Yes. > But on other platforms where we use mmap, we do suffer from this > fragmentation, and yet it doesn't seem to be a real source of problem. I don't know enough about mmap to answer that. I vaguely recollect that mmap avoids such fragmentation as an inherent feature, but I may be wrong. > That's why I think the only real reason is because memory can only be > returned via sbrk-style memory management (i.e. only free memory at the > end of the heap can be returned). Is that right? Yes, AFAIK. > I guess my question turns into "why do we use gmalloc.c instead of > a malloc library that uses mmap (or some other mechanism that lets it > return large free chunks to the OS)"? Use of gmalloc is a different issue. We were talking about ralloc.c. You could use one, but not the other. > AFAIK, Windows is pretty much the only system where we use gmalloc.c and > ralloc.c nowadays. My reading of configure is that we use it on more than just Windows (and MS-DOS). Basically, any platform that uses gmalloc.c (which is the default, turned off only for GNU/Linux and Darwin) also uses ralloc.c. > Does anyone remember why we don't use the system malloc under > Windows (and Cygwin)? I find it hard to believe that going through system malloc on MS-Windows will let us use buffers as large as 1.5 GB (on a 32-bit machine). To achieve this today, we reserve a 2GB contiguous chunk of address space at startup, and then commit and uncommit parts of it as needed (see w32heap.c). ralloc.c has an important part in this arrangement. ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-23 16:52 ` Eli Zaretskii @ 2012-05-23 20:07 ` Stefan Monnier 2012-05-24 16:22 ` Eli Zaretskii 0 siblings, 1 reply; 39+ messages in thread From: Stefan Monnier @ 2012-05-23 20:07 UTC (permalink / raw) To: Eli Zaretskii; +Cc: schwab, rms, 11519, lekktu >> > Which other places use C pointers to buffer text and call functions >> > that can allocate memory? >> IIUC any place that uses STRING_CHAR_AND_LENGTH on buffer text is >> vulnerable to the problem. > That's not true. As long as you access buffer text through character > position, you are safe. Right, some of those uses might be safe, indeed. Of course it's not only STRING_CHAR_AND_LENGTH but STRING_CHAR_ADVANCE as well, together with FETCH_* macros which use those, etc... Grepping for those macros shows they're used at *many* places, and I'd be amazed if re_search is the only place where we don't go through the BYTE_POS_ADDR rigmarole. Let's see ...hmmm... yup, set-buffer-multibyte is another example, find_charsets_in_text yet another, and I'm not even trying hard. Just grep for "STRING_CHAR_" and see for yourself. >> If you really want to install your workaround on the emacs-24 branch, go >> for it but let's try to find a real fix for the trunk. > What kind of real fix are you looking for? One that lets us write code without having to worry about such corner cases. E.g. changing STRING_CHAR_ADVANCE so it can't cause relocation. Not using ralloc.c any more would be another good option. Otherwise, changing our macros so they do the BYTE_POS_ADDR internally, discouraging the use of direct pointers into the buffer's content. > Why shouldn't it be the fix in this case, and what better fix can we > invent when we use an essentially externally maintained code (AFAIR, > regex will at some point be re-sync'ed with gnulib) that cannot be > expected to change its code radically so as not to access buffer text > through C pointers? To me, it's clearly a workaround. It's an OK workaround if/when we use a "blackbox" (i.e. externally maintained) regexp engine and keep using ralloc.c. But better would be to eliminate the problem altogether. >> But on other platforms where we use mmap, we do suffer from this >> fragmentation, and yet it doesn't seem to be a real source of problem. > I don't know enough about mmap to answer that. I vaguely recollect > that mmap avoids such fragmentation as an inherent feature, but I may > be wrong. No, fragmentation is a property of the address space, so without relocation you can't avoid it. >> I guess my question turns into "why do we use gmalloc.c instead of >> a malloc library that uses mmap (or some other mechanism that lets it >> return large free chunks to the OS)"? > Use of gmalloc is a different issue. We were talking about ralloc.c. > You could use one, but not the other. Well, still we use ralloc because we don't use mmap, so the question to me is: why don't we use mmap (either via a malloc that does, or directly via USE_MMAP_FOR_BUFFERS) and get rid of ralloc.c? >> AFAIK, Windows is pretty much the only system where we use gmalloc.c and >> ralloc.c nowadays. > My reading of configure is that we use it on more than just Windows > (and MS-DOS). Basically, any platform that uses gmalloc.c (which is > the default, turned off only for GNU/Linux and Darwin) also uses > ralloc.c. To me "all minus GNU/Linux, Mac OS X, and Cygwin (which apparently uses gmalloc but not ralloc)" is pretty close to "just Windows" nowadays. >> Does anyone remember why we don't use the system malloc under >> Windows (and Cygwin)? > I find it hard to believe that going through system malloc on > MS-Windows will let us use buffers as large as 1.5 GB (on a 32-bit > machine). To achieve this today, we reserve a 2GB contiguous chunk of > address space at startup, and then commit and uncommit parts of it as > needed (see w32heap.c). ralloc.c has an important part in this > arrangement. You mean that Windows's system malloc library has a memory that's too fragmented to be able to allocate a single 1.5G chunk? Why? [ I know next to nothing about the w32 API and plead guilty of POSIX-only thinking, so please bear with me. ] Stefan ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-23 20:07 ` Stefan Monnier @ 2012-05-24 16:22 ` Eli Zaretskii 2012-05-28 2:15 ` Stefan Monnier 0 siblings, 1 reply; 39+ messages in thread From: Eli Zaretskii @ 2012-05-24 16:22 UTC (permalink / raw) To: Stefan Monnier; +Cc: schwab, rms, 11519, lekktu > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: rms@gnu.org, handa@gnu.org, schwab@linux-m68k.org, lekktu@gmail.com, 11519@debbugs.gnu.org > Date: Wed, 23 May 2012 16:07:05 -0400 > > >> > Which other places use C pointers to buffer text and call functions > >> > that can allocate memory? > >> IIUC any place that uses STRING_CHAR_AND_LENGTH on buffer text is > >> vulnerable to the problem. > > That's not true. As long as you access buffer text through character > > position, you are safe. > > Right, some of those uses might be safe, indeed. Of course it's not > only STRING_CHAR_AND_LENGTH but STRING_CHAR_ADVANCE as well, together > with FETCH_* macros which use those, etc... No, FETCH_* macros are safe, because they accept buffer positions, fetch only one character at a time, and call STRING_CHAR_* _after_ they access the buffer. > Grepping for those macros shows they're used at *many* places, and I'd > be amazed if re_search is the only place where we don't go through the > BYTE_POS_ADDR rigmarole. > > Let's see ...hmmm... yup, set-buffer-multibyte is another example, > find_charsets_in_text yet another, and I'm not even trying hard. > Just grep for "STRING_CHAR_" and see for yourself. I didn't mean STRING_CHAR_*. I agree that they should be fixed not to have such unexpected side effect. They should be read-only operations. As a temporary band-aid for Emacs 24.1, I suggest the change below. What I meant is the code besides STRING_CHAR_* macros. I don't think you will find code that manipulates C pointers to buffer text and calls functions that can allocate memory. > >> But on other platforms where we use mmap, we do suffer from this > >> fragmentation, and yet it doesn't seem to be a real source of problem. > > I don't know enough about mmap to answer that. I vaguely recollect > > that mmap avoids such fragmentation as an inherent feature, but I may > > be wrong. > > No, fragmentation is a property of the address space, so without > relocation you can't avoid it. I asked Gerd Möllmann, who wrote the mmap-based buffer allocation code, about this. That code originally resided on ralloc.c and was meant to replace the sbrk-based implementation. So I would expect that the issue of relocation was considered back then, and I hope Gerd will remember why the mmap-based code was considered good enough to replace ralloc.c. > > I find it hard to believe that going through system malloc on > > MS-Windows will let us use buffers as large as 1.5 GB (on a 32-bit > > machine). To achieve this today, we reserve a 2GB contiguous chunk of > > address space at startup, and then commit and uncommit parts of it as > > needed (see w32heap.c). ralloc.c has an important part in this > > arrangement. > > You mean that Windows's system malloc library has a memory that's too > fragmented to be able to allocate a single 1.5G chunk? Why? This has nothing to do with Windows APIs, so you are well equipped to reason about this ;-) You said "malloc", so I took an issue with the MS C runtime implementation of malloc. Since all the other implementations suffer from fragmentation, there's no reason to believe that the MS implementation avoids that danger. A general-purpose function that cannot move buffers behind the scenes cannot possibly avoid that. Doing better was the original reason for writing ralloc.c. If you meant to bypass malloc and use the Windows memory-allocation APIs, such as VirtualAlloc, directly, then that's what we already do in w32heap.c, which implements an emulation of sbrk that is good enough for Emacs. The fact that gmalloc and ralloc are used on top of that are simply to avoid reinventing the wheel. We could easily turn off buffer relocation in ralloc.c for good, by fixing the value of use_relocatable_buffers at zero. But I'm worried that this would cause Emacs on Windows run out of memory (or act as if it were) faster. For example, in an Emacs session that runs for 2 weeks and has a 200MB working set, I just visited a 1.3GB file, went to its middle and typed "C-u 30000 d" to insert 30K characters. Emacs had no problems enlarging the buffer, although it has only 1.9GB of reserved memory space on that machine, so if it needed to allocate another 1.3GB+30KB buffer (due to fragmentation, which is a certainty after such a long time), it would have failed, I presume. Yet another alternative is to emulate mmap on Windows using the equivalent Windows API. But that requires a research comparing mmap features we need and use on Posix platforms with the features offered by Windows, to make sure this is at all feasible. Such a research would need to be done by someone who knows enough about mmap and is willing to do the job. Do we have such a person on board? And then there's the implementation and testing. Doesn't sound like an efficient use of our time and energy. Are there other alternatives? Here's the band-aid I propose for emacs-24, to make the STRING_CHAR_* macros safe: === modified file 'src/charset.c' --- src/charset.c 2012-01-19 07:21:25 +0000 +++ src/charset.c 2012-05-24 16:14:05 +0000 @@ -1641,6 +1641,12 @@ maybe_unify_char (int c, Lisp_Object val return c; CHECK_CHARSET_GET_CHARSET (val, charset); +#ifdef REL_ALLOC + /* The call to load_charset below can allocate memory, whcih screws + callers of this function through STRING_CHAR_* macros that hold C + pointers to buffer text, if REL_ALLOC is used. */ + r_alloc_inhibit_buffer_relocation (1); +#endif load_charset (charset, 1); if (! inhibit_load_charset_map) { @@ -1656,6 +1662,9 @@ maybe_unify_char (int c, Lisp_Object val if (unified > 0) c = unified; } +#ifdef REL_ALLOC + r_alloc_inhibit_buffer_relocation (0); +#endif return c; } === modified file 'src/ralloc.c' --- src/ralloc.c 2012-05-23 17:32:28 +0000 +++ src/ralloc.c 2012-05-24 16:16:14 +0000 @@ -1204,7 +1204,12 @@ r_alloc_reset_variable (POINTER *old, PO void r_alloc_inhibit_buffer_relocation (int inhibit) { - use_relocatable_buffers = !inhibit; + if (use_relocatable_buffers < 0) + use_relocatable_buffers = 0; + if (inhibit) + use_relocatable_buffers++; + else if (use_relocatable_buffers > 0) + use_relocatable_buffers--; } \f ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-24 16:22 ` Eli Zaretskii @ 2012-05-28 2:15 ` Stefan Monnier 2012-05-28 16:53 ` Eli Zaretskii 0 siblings, 1 reply; 39+ messages in thread From: Stefan Monnier @ 2012-05-28 2:15 UTC (permalink / raw) To: Eli Zaretskii; +Cc: schwab, rms, 11519, lekktu > I didn't mean STRING_CHAR_*. I agree that they should be fixed not to > have such unexpected side effect. They should be read-only operations. > As a temporary band-aid for Emacs 24.1, I suggest the change below. Looks fine (should make the regex.c patch unnecessary). > You said "malloc", so I took an issue with the MS C runtime > implementation of malloc. Since all the other implementations suffer > from fragmentation, there's no reason to believe that the MS > implementation avoids that danger. The problem is inherent to the malloc API, pretty much, yes. > We could easily turn off buffer relocation in ralloc.c for good, by > fixing the value of use_relocatable_buffers at zero. But I'm worried > that this would cause Emacs on Windows run out of memory (or act as if > it were) faster. AFAIK, Emacs under GNU/Linux and Mac OS X uses non-relocatable buffers, and they don't seem to suffer more from fragmentation problems than Emacs under Windows. But yes, unless we use an mmap-style allocation, it would use more actual memory. > For example, in an Emacs session that runs for 2 > weeks and has a 200MB working set, I just visited a 1.3GB file, went > to its middle and typed "C-u 30000 d" to insert 30K characters. Emacs For sure editing such large file in a 32bit address space might prove problematic without relocation, (and even with buffer relocation, some non-buffer allocation might end up fragmenting the address space too much) but luckily few people do that (you need to compile with --wide-int to be able to do that). > Yet another alternative is to emulate mmap on Windows using the > equivalent Windows API. But that requires a research comparing mmap > features we need and use on Posix platforms with the features offered > by Windows, to make sure this is at all feasible. That would be nice. > Such a research > would need to be done by someone who knows enough about mmap and is > willing to do the job. Do we have such a person on board? I don't volunteer. Stefan ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-28 2:15 ` Stefan Monnier @ 2012-05-28 16:53 ` Eli Zaretskii 2012-05-28 19:44 ` Stefan Monnier 0 siblings, 1 reply; 39+ messages in thread From: Eli Zaretskii @ 2012-05-28 16:53 UTC (permalink / raw) To: Stefan Monnier; +Cc: schwab, rms, 11519, lekktu > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: rms@gnu.org, handa@gnu.org, schwab@linux-m68k.org, lekktu@gmail.com, 11519@debbugs.gnu.org > Date: Sun, 27 May 2012 22:15:00 -0400 > > > I didn't mean STRING_CHAR_*. I agree that they should be fixed not to > > have such unexpected side effect. They should be read-only operations. > > As a temporary band-aid for Emacs 24.1, I suggest the change below. > > Looks fine Installed as revision 108020 on the emacs-24 branch. > (should make the regex.c patch unnecessary). I wouldn't recommend removing the calls to r_alloc_inhibit_buffer_relocation around calls to regex, as that is an external function we don't control. But if you feel strongly about that, I will remove them. ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-28 16:53 ` Eli Zaretskii @ 2012-05-28 19:44 ` Stefan Monnier 2012-05-28 20:47 ` Eli Zaretskii 0 siblings, 1 reply; 39+ messages in thread From: Stefan Monnier @ 2012-05-28 19:44 UTC (permalink / raw) To: Eli Zaretskii; +Cc: schwab, rms, 11519, lekktu >> (should make the regex.c patch unnecessary). > I wouldn't recommend removing the calls to > r_alloc_inhibit_buffer_relocation around calls to regex, as that is an > external function we don't control. But if you feel strongly about > that, I will remove them. I don't feel strongly about that. But this code is all ours, it's not an external function (it used to be half-external, and then I hacked on it half-extensively, after which the new version was sync'd back from Emacs to gnulib, after which the gnulib side dropped it, and AFAIK Emacs is the last remaining development line of this code, except maybe for XEmacs's version). Stefan ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-28 19:44 ` Stefan Monnier @ 2012-05-28 20:47 ` Eli Zaretskii 2012-05-29 1:23 ` Stefan Monnier 0 siblings, 1 reply; 39+ messages in thread From: Eli Zaretskii @ 2012-05-28 20:47 UTC (permalink / raw) To: Stefan Monnier; +Cc: schwab, rms, 11519, lekktu > From: Stefan Monnier <monnier@IRO.UMontreal.CA> > Cc: rms@gnu.org, handa@gnu.org, schwab@linux-m68k.org, lekktu@gmail.com, > 11519@debbugs.gnu.org > Date: Mon, 28 May 2012 15:44:18 -0400 > > > I wouldn't recommend removing the calls to > > r_alloc_inhibit_buffer_relocation around calls to regex, as that is an > > external function we don't control. But if you feel strongly about > > that, I will remove them. > > I don't feel strongly about that. But this code is all ours, it's not > an external function (it used to be half-external, and then I hacked on > it half-extensively, after which the new version was sync'd back from > Emacs to gnulib, after which the gnulib side dropped it, and AFAIK Emacs > is the last remaining development line of this code, except maybe for > XEmacs's version). If you are confident that regex.c doesn't call malloc in any other place, and that chances for it to do so in the future are low enough, then I'm okay with removing the calls to r_alloc_inhibit_buffer_relocation. ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-28 20:47 ` Eli Zaretskii @ 2012-05-29 1:23 ` Stefan Monnier 2012-05-29 16:02 ` Eli Zaretskii 0 siblings, 1 reply; 39+ messages in thread From: Stefan Monnier @ 2012-05-29 1:23 UTC (permalink / raw) To: Eli Zaretskii; +Cc: schwab, rms, 11519, lekktu > If you are confident that regex.c doesn't call malloc in any other > place, and that chances for it to do so in the future are low enough, I'm pretty confident, yes. Stefan ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-29 1:23 ` Stefan Monnier @ 2012-05-29 16:02 ` Eli Zaretskii 2012-06-02 20:44 ` Juanma Barranquero 0 siblings, 1 reply; 39+ messages in thread From: Eli Zaretskii @ 2012-05-29 16:02 UTC (permalink / raw) To: Stefan Monnier; +Cc: schwab, rms, 11519, lekktu > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: rms@gnu.org, handa@gnu.org, schwab@linux-m68k.org, lekktu@gmail.com, 11519@debbugs.gnu.org > Date: Mon, 28 May 2012 21:23:13 -0400 > > > If you are confident that regex.c doesn't call malloc in any other > > place, and that chances for it to do so in the future are low enough, > > I'm pretty confident, yes. Removed in revision 108021 on the emacs-24 branch. ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-29 16:02 ` Eli Zaretskii @ 2012-06-02 20:44 ` Juanma Barranquero 2012-06-03 4:18 ` Eli Zaretskii 0 siblings, 1 reply; 39+ messages in thread From: Juanma Barranquero @ 2012-06-02 20:44 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 11519 > Removed in revision 108021 on the emacs-24 branch. This bug fix hasn't yet been ported back to the trunk, has it? Juanma ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-06-02 20:44 ` Juanma Barranquero @ 2012-06-03 4:18 ` Eli Zaretskii 2013-12-28 8:41 ` Glenn Morris 0 siblings, 1 reply; 39+ messages in thread From: Eli Zaretskii @ 2012-06-03 4:18 UTC (permalink / raw) To: Juanma Barranquero; +Cc: 11519 > From: Juanma Barranquero <lekktu@gmail.com> > Date: Sat, 2 Jun 2012 22:44:39 +0200 > Cc: 11519@debbugs.gnu.org > > > Removed in revision 108021 on the emacs-24 branch. > > This bug fix hasn't yet been ported back to the trunk, has it? ?? Of course, it was. The function maybe_unify_char inhibits relocation of buffer text. Why, did you have the problem again? ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-06-03 4:18 ` Eli Zaretskii @ 2013-12-28 8:41 ` Glenn Morris 2013-12-28 9:48 ` Eli Zaretskii 0 siblings, 1 reply; 39+ messages in thread From: Glenn Morris @ 2013-12-28 8:41 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Juanma Barranquero, 11519 Is this report still relevant? ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2013-12-28 8:41 ` Glenn Morris @ 2013-12-28 9:48 ` Eli Zaretskii 0 siblings, 0 replies; 39+ messages in thread From: Eli Zaretskii @ 2013-12-28 9:48 UTC (permalink / raw) To: Glenn Morris; +Cc: lekktu, 11519-done > From: Glenn Morris <rgm@gnu.org> > Cc: Juanma Barranquero <lekktu@gmail.com>, 11519@debbugs.gnu.org > Date: Sat, 28 Dec 2013 03:41:32 -0500 > > > Is this report still relevant? No. Closing. ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-23 14:16 ` Stefan Monnier 2012-05-23 15:23 ` Ken Brown 2012-05-23 16:52 ` Eli Zaretskii @ 2012-05-23 17:34 ` Eli Zaretskii 2 siblings, 0 replies; 39+ messages in thread From: Eli Zaretskii @ 2012-05-23 17:34 UTC (permalink / raw) To: Stefan Monnier; +Cc: schwab, rms, 11519, lekktu > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: Richard Stallman <rms@gnu.org>, handa@gnu.org, schwab@linux-m68k.org, lekktu@gmail.com, 11519@debbugs.gnu.org > Date: Wed, 23 May 2012 10:16:17 -0400 > > If you really want to install your workaround on the emacs-24 branch, go > for it Done (revision 108012 on the emacs-24 branch). ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-22 19:19 ` Stefan Monnier 2012-05-22 19:47 ` Eli Zaretskii @ 2012-05-23 14:10 ` Kenichi Handa 2012-05-23 15:27 ` Stefan Monnier 1 sibling, 1 reply; 39+ messages in thread From: Kenichi Handa @ 2012-05-23 14:10 UTC (permalink / raw) To: Stefan Monnier; +Cc: lekktu, schwab, 11519 In article <jwv8vgkt4lc.fsf-monnier+emacs@gnu.org>, Stefan Monnier <monnier@IRO.UMontreal.CA> writes: > This brings up back to the issue of calling maybe_unify_char in > STRING_CHAR_AND_LENGTH. One more strike against it. Handa, could you > prepare a patch that removes this? Ok, I'll work on it. But, it's not for 24.1, right? > > If you agree with the diagnosis, then how about the change below? > Might be an acceptable workaround for the emacs-24 branch, yes (tho I'd > replace "inhibit ? 0 : 1" with "!inhibit"). I think it is not just a workaround. If we can suppress buffer/string relocation temporarily, we can utilize that functionality in several other places. --- Kenichi Handa handa@m17n.org ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-23 14:10 ` Kenichi Handa @ 2012-05-23 15:27 ` Stefan Monnier 2012-05-23 17:02 ` Eli Zaretskii 0 siblings, 1 reply; 39+ messages in thread From: Stefan Monnier @ 2012-05-23 15:27 UTC (permalink / raw) To: Kenichi Handa; +Cc: lekktu, schwab, 11519 >> This brings up back to the issue of calling maybe_unify_char in >> STRING_CHAR_AND_LENGTH. One more strike against it. Handa, could you >> prepare a patch that removes this? > Ok, I'll work on it. Thanks. > But, it's not for 24.1, right? Right, it's for 24.2, so it's not super urgent. >> > If you agree with the diagnosis, then how about the change below? >> Might be an acceptable workaround for the emacs-24 branch, yes (tho I'd >> replace "inhibit ? 0 : 1" with "!inhibit"). > I think it is not just a workaround. If we can suppress buffer/string > relocation temporarily, we can utilize that functionality in several > other places. We could suppress relocation all the time. After all, we don't use relalloc on GNU systems and under Mac OS X. Stefan ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-23 15:27 ` Stefan Monnier @ 2012-05-23 17:02 ` Eli Zaretskii 0 siblings, 0 replies; 39+ messages in thread From: Eli Zaretskii @ 2012-05-23 17:02 UTC (permalink / raw) To: Stefan Monnier; +Cc: schwab, 11519, lekktu > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: eliz@gnu.org, schwab@linux-m68k.org, lekktu@gmail.com, 11519@debbugs.gnu.org > Date: Wed, 23 May 2012 11:27:16 -0400 > > We could suppress relocation all the time. I suspect that will increase the probability of out-of-memory conditions due to fragmentation, and system-wide memory pressure in general, on those platforms that do use ralloc.c today. ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-21 17:51 ` Eli Zaretskii 2012-05-21 20:39 ` Stefan Monnier @ 2012-05-22 14:38 ` Kenichi Handa 2012-05-22 19:02 ` Eli Zaretskii 1 sibling, 1 reply; 39+ messages in thread From: Kenichi Handa @ 2012-05-22 14:38 UTC (permalink / raw) To: Eli Zaretskii; +Cc: lekktu, schwab, 11519 In article <83vcjpxw18.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes: > So I think that what happened is that something, probably the > translation through a char-table, caused allocation of a large chunk > of memory, which in turn relocated the text of the current buffer > behind regex.c's back, which still uses C pointers to the old location > of the buffer text. The function char_table_translate may allocate some amount of memory. But that's only when the char-table is one of a "unicode character property table" loaded via the function uniprop_table. When char_table_translate is called from re_search_2 (via the macro RE_TRANSLATE), the char-table to lookup is one of a case-related table (equiv? canon?) which is not a unicode property table. The function that acutally allocates memory via. char_table_translate is uniprop_table_uncompress. --- Kenichi Handa handa@m17n.org ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-22 14:38 ` Kenichi Handa @ 2012-05-22 19:02 ` Eli Zaretskii 0 siblings, 0 replies; 39+ messages in thread From: Eli Zaretskii @ 2012-05-22 19:02 UTC (permalink / raw) To: Kenichi Handa; +Cc: lekktu, schwab, 11519 > From: Kenichi Handa <handa@gnu.org> > Cc: schwab@linux-m68k.org, lekktu@gmail.com, 11519@debbugs.gnu.org > Date: Tue, 22 May 2012 23:38:34 +0900 > > In article <83vcjpxw18.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes: > > > So I think that what happened is that something, probably the > > translation through a char-table, caused allocation of a large chunk > > of memory, which in turn relocated the text of the current buffer > > behind regex.c's back, which still uses C pointers to the old location > > of the buffer text. > > The function char_table_translate may allocate some amount > of memory. But that's only when the char-table is one of a > "unicode character property table" loaded via the function > uniprop_table. When char_table_translate is called from > re_search_2 (via the macro RE_TRANSLATE), the char-table to > lookup is one of a case-related table (equiv? canon?) which > is not a unicode property table. You are right: it was not char_table_translate, it was load_charset_map_from_file, invoked through STRING_CHAR_AND_LENGTH. ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-20 17:27 ` Eli Zaretskii 2012-05-20 19:00 ` Juanma Barranquero @ 2012-05-21 1:49 ` Stefan Monnier 2012-05-21 2:50 ` Eli Zaretskii 1 sibling, 1 reply; 39+ messages in thread From: Stefan Monnier @ 2012-05-21 1:49 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Juanma Barranquero, 11519 > If you started looking, I think you can stop. It's a %$#@! > heisenbug. If I add a single (message %s file) to > custom-make-dependencies, it displays ethio-util.el as the last file > name before the error. But adding another message makes the problem > disappear, and the bootstrap runs flawlessly to completion. Likewise, > running just "make custom-deps" with the emacs.exe built by the first > stage of the bootstrap doesn't produce any errors. I guess the best options is to change the makefile so it sets debug-on-error (and/or starts Emacs under a debugger) in the area of interest and hope that the problem will still show up. Stefan ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-21 1:49 ` Stefan Monnier @ 2012-05-21 2:50 ` Eli Zaretskii 2012-05-21 3:21 ` Stefan Monnier 0 siblings, 1 reply; 39+ messages in thread From: Eli Zaretskii @ 2012-05-21 2:50 UTC (permalink / raw) To: Stefan Monnier; +Cc: lekktu, 11519 > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: Juanma Barranquero <lekktu@gmail.com>, 11519@debbugs.gnu.org > Date: Sun, 20 May 2012 21:49:02 -0400 > > > If you started looking, I think you can stop. It's a %$#@! > > heisenbug. If I add a single (message %s file) to > > custom-make-dependencies, it displays ethio-util.el as the last file > > name before the error. But adding another message makes the problem > > disappear, and the bootstrap runs flawlessly to completion. Likewise, > > running just "make custom-deps" with the emacs.exe built by the first > > stage of the bootstrap doesn't produce any errors. > > I guess the best options is to change the makefile so it sets > debug-on-error (and/or starts Emacs under a debugger) in the area of > interest and hope that the problem will still show up. I already tried the former (the problem went away). I'm guessing the latter will do the same. I will try to think about some trick, but frankly it's very frustrating, because any small change to cus-dep.el or the setup makes the problem go away. ^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping 2012-05-21 2:50 ` Eli Zaretskii @ 2012-05-21 3:21 ` Stefan Monnier 0 siblings, 0 replies; 39+ messages in thread From: Stefan Monnier @ 2012-05-21 3:21 UTC (permalink / raw) To: Eli Zaretskii; +Cc: lekktu, 11519 > I already tried the former (the problem went away). I'm guessing the > latter will do the same. I will try to think about some trick, but > frankly it's very frustrating, because any small change to cus-dep.el > or the setup makes the problem go away. How 'bout changing the C code so a characterp error aborts? Stefan ^ permalink raw reply [flat|nested] 39+ messages in thread
end of thread, other threads:[~2013-12-28 9:48 UTC | newest] Thread overview: 39+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-05-19 16:10 bug#11519: "Wrong type argument: characterp" building custom-deps while boostrapping Juanma Barranquero 2012-05-19 16:27 ` Eli Zaretskii 2012-05-19 21:40 ` Juanma Barranquero 2012-05-20 17:27 ` Eli Zaretskii 2012-05-20 19:00 ` Juanma Barranquero 2012-05-21 1:50 ` Stefan Monnier 2012-05-21 2:51 ` Eli Zaretskii 2012-05-21 7:59 ` Andreas Schwab 2012-05-21 17:51 ` Eli Zaretskii 2012-05-21 20:39 ` Stefan Monnier 2012-05-22 19:00 ` Eli Zaretskii 2012-05-22 19:19 ` Stefan Monnier 2012-05-22 19:47 ` Eli Zaretskii 2012-05-23 0:47 ` Stefan Monnier 2012-05-23 2:59 ` Eli Zaretskii 2012-05-23 14:16 ` Stefan Monnier 2012-05-23 15:23 ` Ken Brown 2012-05-23 16:52 ` Eli Zaretskii 2012-05-23 20:07 ` Stefan Monnier 2012-05-24 16:22 ` Eli Zaretskii 2012-05-28 2:15 ` Stefan Monnier 2012-05-28 16:53 ` Eli Zaretskii 2012-05-28 19:44 ` Stefan Monnier 2012-05-28 20:47 ` Eli Zaretskii 2012-05-29 1:23 ` Stefan Monnier 2012-05-29 16:02 ` Eli Zaretskii 2012-06-02 20:44 ` Juanma Barranquero 2012-06-03 4:18 ` Eli Zaretskii 2013-12-28 8:41 ` Glenn Morris 2013-12-28 9:48 ` Eli Zaretskii 2012-05-23 17:34 ` Eli Zaretskii 2012-05-23 14:10 ` Kenichi Handa 2012-05-23 15:27 ` Stefan Monnier 2012-05-23 17:02 ` Eli Zaretskii 2012-05-22 14:38 ` Kenichi Handa 2012-05-22 19:02 ` Eli Zaretskii 2012-05-21 1:49 ` Stefan Monnier 2012-05-21 2:50 ` Eli Zaretskii 2012-05-21 3:21 ` Stefan Monnier
Code repositories for project(s) associated with this public inbox https://git.savannah.gnu.org/cgit/emacs.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).