* Emacs crashes
@ 2006-03-13 20:23 Nick Roberts
  2006-03-13 20:47 ` Chong Yidong
                   ` (2 more replies)
  0 siblings, 3 replies; 43+ messages in thread
From: Nick Roberts @ 2006-03-13 20:23 UTC

I've had Emacs crash/hang in three different ways in recent days.  It would
appear to be less stable than it was two years ago:

1) It hangs with some kind of mutex lock which I don't understand with a brief
   backtrace of three functions in libc, I think.  The only thing I can do,
   after attaching with GDB, is kill it.

2) A garbage collection related crash where mark_object is called recursively
   literally thousands of times,

3) A crash that is caused by recent changes to the tool bar (I think).  I
   attach the backtrace to this one below (xbacktrace produces no output).
   It appears to go wrong in produce_image_glyph where img=0x0 because
   it->f->output_data.x->display_info->used = 73, is less than
   it->image_id = 87.

-- 
Nick                                           http://www.inet.net.nz/~nickrob

#0  0x005657a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x005a5df6 in kill () from /lib/tls/libc.so.6
#2  0x0810f587 in fatal_error_signal (sig=11) at emacs.c:430
#3  <signal handler called>
#4  0x08106d26 in prepare_image_for_display (f=0xa1312e0, img=0x0)
    at image.c:1203
#5  0x0808cbad in produce_image_glyph (it=0xfee4d570) at xdisp.c:19505
#6  0x0809030d in x_produce_glyphs (it=0xfee4d570) at xdisp.c:20585
#7  0x080755c4 in display_tool_bar_line (it=0xfee4d570, height=38)
    at xdisp.c:9470
#8  0x08075cf6 in redisplay_tool_bar (f=0xa1312e0) at xdisp.c:9678
#9  0x0807d992 in redisplay_window (window=169022564, just_this_one_p=0)
    at xdisp.c:13160
#10 0x080792f0 in redisplay_window_0 (window=169022564) at xdisp.c:11523
#11 0x0818c065 in internal_condition_case_1 (
    bfun=0x80792c4 <redisplay_window_0>, arg=169022564, handlers=137856245,
    hfun=0x80792a3 <redisplay_window_error>) at eval.c:1521
#12 0x08079290 in redisplay_windows (window=169022564) at xdisp.c:11502
#13 0x08078731 in redisplay_internal (preserve_echo_area=0)
    at xdisp.c:11062
#14 0x08076d29 in redisplay () at xdisp.c:10292
#15 0x08115328 in read_char (commandflag=1, nmaps=3, maps=0xfee4e4f0,
    prev_event=137869513, used_mouse_menu=0xfee4e5ec) at keyboard.c:2549
#16 0x0811e7fc in read_key_sequence (keybuf=0xfee4e750, bufsize=30,
    prompt=137869513, dont_downcase_last=0, can_return_switch_frame=1,
    fix_current_buffer=1) at keyboard.c:8874
#17 0x08112d25 in command_loop_1 () at keyboard.c:1536
#18 0x0818bf37 in internal_condition_case (bfun=0x8112a27 <command_loop_1>,
    handlers=137914153, hfun=0x811256f <cmd_error>) at eval.c:1473
#19 0x081128a0 in command_loop_2 () at keyboard.c:1328
#20 0x0818b9b6 in internal_catch (tag=137910385,
    func=0x8112882 <command_loop_2>, arg=137869513) at eval.c:1211
#21 0x08112854 in command_loop () at keyboard.c:1307
#22 0x081122ee in recursive_edit_1 () at keyboard.c:1000
#23 0x0811242f in Frecursive_edit () at keyboard.c:1061
#24 0x08110d0c in main (argc=1, argv=0xfee4ed94) at emacs.c:1789

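The failure described in item 3 (an image id of 87 looked up in a cache that only holds 73 entries, yielding img=0x0) can be modelled in a few lines. The following is only a sketch under stated assumptions: `struct image_cache`, `image_from_id` and `glyph_width_for_image` are hypothetical stand-ins invented for illustration, not the actual Emacs structures or functions. It shows a lookup that reports an out-of-range id as NULL and a caller that checks for NULL instead of dereferencing it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real Emacs types. */
struct image { int width, height; };

struct image_cache {
    struct image **images;  /* slots 0 .. used-1 are valid */
    int used;               /* number of valid slots */
};

/* Return the cached image for ID, or NULL when ID is outside the cache. */
static struct image *
image_from_id(const struct image_cache *c, int id)
{
    if (id < 0 || id >= c->used)
        return NULL;
    return c->images[id];
}

/* Return the width to reserve for image ID, or -1 when no such image
   exists.  The caller decides what to draw in that case; the point is
   only that a NULL image is never dereferenced. */
static int
glyph_width_for_image(const struct image_cache *c, int id)
{
    struct image *img = image_from_id(c, id);
    if (img == NULL)
        return -1;
    return img->width;
}
```

With a cache holding a single 24x24 image, `glyph_width_for_image(&cache, 0)` gives 24 while `glyph_width_for_image(&cache, 87)` gives -1 rather than faulting. Such a check would only hide the symptom; why the id outlives the cache entry is the question the backtrace raises.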
* Re: Emacs crashes
  2006-03-13 20:23 Emacs crashes Nick Roberts
@ 2006-03-13 20:47 ` Chong Yidong
  2006-03-13 22:06 ` Kim F. Storm
  2006-03-14 4:33 ` Eli Zaretskii
  2 siblings, 0 replies; 43+ messages in thread
From: Chong Yidong @ 2006-03-13 20:47 UTC
  Cc: emacs-devel

Nick Roberts <nickrob@snap.net.nz> writes:

> I've had Emacs crash/hang in three different ways in recent days.  It would
> appear to be less stable than it was two years ago:
>
> 1) It hangs with some kind of mutex lock which I don't understand with a brief
>    backtrace of three functions in libc, I think.  The only thing I can do,
>    after attaching with GDB, is kill it.
>
> 2) A garbage collection related crash where mark_object is called recursively
>    literally thousands of times,

Kim Storm reported some similar crashes around the beginning of March.

Unfortunately, the only big change that landed in the src/ directory
around that time was my x_catch_errors change to avoid using
record_unwind_protect.  I've gone over those changes several times,
but no luck: I don't see how they can possibly lead to garbage
collection bugs.

The only possibility I can think of is the change to struct
specbinding and specpdl_ptr to make them non-volatile, which is
supposedly OK since record_unwind_protect can no longer be called in a
signal handler.  Could that lead to problems elsewhere in Emacs?

(The only other big checkin during that period was Luc's
load-file-rep-suffixes change, but that's even less likely to be the
cause.)

* Re: Emacs crashes
  2006-03-13 20:23 Emacs crashes Nick Roberts
  2006-03-13 20:47 ` Chong Yidong
@ 2006-03-13 22:06 ` Kim F. Storm
  2006-03-14 0:39   ` Kenichi Handa
                    ` (4 more replies)
  2006-03-14 4:33 ` Eli Zaretskii
  2 siblings, 5 replies; 43+ messages in thread
From: Kim F. Storm @ 2006-03-13 22:06 UTC
  Cc: Nick Roberts

Nick Roberts <nickrob@snap.net.nz> writes:

> I've had Emacs crash/hang in three different ways in recent days.

I can second that.  I had another crash today, so it has crashed on me
four times in the last week.

I suspect the recent changes to the handling (unwind etc) of x errors,
but I have no proof, as there is no similarity to the crashes (except
that it has now crashed twice in malloc_consolidate (libc internal)
called from emacs_blocked_malloc called from XtVaGetValues in
x_set_toolkit_scroll_bar_thumb).

I didn't have time to dig further into the crash - and I have no way
to determine what kind of corruption was causing the crash in
malloc_consolidate.

> It would
> appear to be less stable than it was two years ago:

It *is* *much* less stable than it was a week ago!

>
> 1) It hangs with some kind of mutex lock which I don't understand with a brief
>    backtrace of three functions in libc, I think.  The only thing I can do,
>    after attaching with GDB, is kill it.
>
> 2) A garbage collection related crash where mark_object is called recursively
>    literally thousands of times,
>
> 3) A crash that is caused by recent changes to the tool bar (I think).  I
>    attach the backtrace to this one below (xbacktrace produces no output).
>    It appears to go wrong in produce_image_glyph where img=0x0 because
>    it->f->output_data.x->display_info->used = 73, is less than
>    it->image_id = 87.

I haven't seen any of these -- so there are now 6 different crashes.

Looks like a completely random memory corruption.

-- 
Kim F. Storm <storm@cua.dk> http://www.cua.dk

* Re: Emacs crashes
  2006-03-13 22:06 ` Kim F. Storm
@ 2006-03-14 0:39   ` Kenichi Handa
  2006-03-14 16:09     ` Richard Stallman
  2006-03-14 1:02   ` Juanma Barranquero
                    ` (3 subsequent siblings)
  4 siblings, 1 reply; 43+ messages in thread
From: Kenichi Handa @ 2006-03-14 0:39 UTC
  Cc: nickrob, emacs-devel

In article <m3pskqrod2.fsf@kfs-l.imdomain.dk>, storm@cua.dk (Kim F. Storm) writes:

> I haven't seen any of these -- so there are now 6 different crashes.

> Looks like a completely random memory corruption.

Yesterday, Emacs crashed on me this way:

Program received signal SIGSEGV, Segmentation fault.
0x40359560 in mallopt () from /lib/libc.so.6
(gdb) bt 10
#0  0x40359560 in mallopt () from /lib/libc.so.6
#1  0x4035a339 in mallopt () from /lib/libc.so.6
#2  0x08138341 in emacs_blocked_malloc (size=1078031196, ptr=0x8137cad)
    at alloc.c:1217
#3  0x40357f55 in malloc () from /lib/libc.so.6
#4  0x08137cad in xmalloc (size=140991128) at alloc.c:740
#5  0x0809c8a8 in coding_allocate_composition_data (coding=0x89c41c8,
    char_offset=1078031196) at coding.c:1708
#6  0x080a5d17 in decode_coding_string (str=151149291, coding=0x10, nocopy=0)
    at coding.c:6294
#7  0x08182865 in read_process_output (proc=144453228, channel=46)
    at process.c:5040
(gdb) up 5
#5  0x0809c8a8 in coding_allocate_composition_data (coding=0x89c41c8,
    char_offset=1078031196) at coding.c:1708
1708        = (struct composition_data *) xmalloc (sizeof *cmp_data);

But, this Emacs was compiled before these changes:

2006-03-10  Kim F. Storm  <storm@cua.dk>

        * alloc.c (USE_POSIX_MEMALIGN): Fix last change.

2006-03-09  Stefan Monnier  <monnier@iro.umontreal.ca>

        * alloc.c (USE_POSIX_MEMALIGN): New macro.
        (ABLOCKS_BASE, lisp_align_malloc, lisp_align_free): Use it.

---
Kenichi Handa
handa@m17n.org

* Re: Emacs crashes
  2006-03-14 0:39   ` Kenichi Handa
@ 2006-03-14 16:09     ` Richard Stallman
  2006-03-15 3:24       ` Giorgos Keramidas
  0 siblings, 1 reply; 43+ messages in thread
From: Richard Stallman @ 2006-03-14 16:09 UTC
  Cc: nickrob, emacs-devel, storm

    But, this Emacs was compiled before these changes:

    2006-03-10  Kim F. Storm  <storm@cua.dk>

            * alloc.c (USE_POSIX_MEMALIGN): Fix last change.

    2006-03-09  Stefan Monnier  <monnier@iro.umontreal.ca>

            * alloc.c (USE_POSIX_MEMALIGN): New macro.
            (ABLOCKS_BASE, lisp_align_malloc, lisp_align_free): Use it.

Could you put a note in src/ChangeLog that you get crashes
with a version compiled before that point?

* Re: Emacs crashes
  2006-03-14 16:09     ` Richard Stallman
@ 2006-03-15 3:24       ` Giorgos Keramidas
  2006-03-15 20:23         ` Richard Stallman
  0 siblings, 1 reply; 43+ messages in thread
From: Giorgos Keramidas @ 2006-03-15 3:24 UTC
  Cc: nickrob, emacs-devel, storm, Kenichi Handa

On 2006-03-14 11:09, Richard Stallman <rms@gnu.org> wrote:
>     But, this Emacs was compiled before these changes:
>
>     2006-03-10  Kim F. Storm  <storm@cua.dk>
>
>             * alloc.c (USE_POSIX_MEMALIGN): Fix last change.
>
>     2006-03-09  Stefan Monnier  <monnier@iro.umontreal.ca>
>
>             * alloc.c (USE_POSIX_MEMALIGN): New macro.
>             (ABLOCKS_BASE, lisp_align_malloc, lisp_align_free): Use it.
>
>
> Could you put a note in src/ChangeLog that you get crashes
> with a version compiled before that point?

I think these changes are very likely to be 100% correct, as not having
them fails to bootstrap too many times.

Their intent was to only use posix_memalign() when the system's malloc()
is used.  Otherwise we risk having a memory area allocated by
posix_memalign() and then freed with Emacs' internal gmalloc or vice
versa.

With these changes, on the other hand, failures during bootstrapping on
FreeBSD/amd64 have stopped immediately here :-)

* Re: Emacs crashes
  2006-03-15 3:24       ` Giorgos Keramidas
@ 2006-03-15 20:23         ` Richard Stallman
  0 siblings, 0 replies; 43+ messages in thread
From: Richard Stallman @ 2006-03-15 20:23 UTC
  Cc: nickrob, storm, handa, emacs-devel

    I think these changes are very likely to be 100% correct, as not having
    them fails to bootstrap too many times.

That does not prove they are correct--it only proves they fixed a real
problem.  It remains possible that they also cause a bug in some
less-frequent case.

The real reason we know these changes are not responsible for these
crashes is that the crashes happened before these changes were made.

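The allocator-pairing concern raised in this sub-thread (memory obtained from posix_memalign() must be released by the matching system free(), never by a different malloc implementation) can be illustrated with a small wrapper. This is a minimal sketch, assuming a POSIX system; `aligned_block_alloc`, `aligned_block_free` and `BLOCK_ALIGN` are hypothetical names chosen for the example and do not reflect how alloc.c is actually organized.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical alignment for the example; posix_memalign requires a
   power of two that is also a multiple of sizeof (void *). */
#define BLOCK_ALIGN 1024

/* Allocate SIZE bytes aligned to BLOCK_ALIGN with the system allocator.
   Returns NULL on failure (posix_memalign returns nonzero). */
static void *
aligned_block_alloc(size_t size)
{
    void *p = NULL;
    if (posix_memalign(&p, BLOCK_ALIGN, size) != 0)
        return NULL;
    return p;
}

/* Release a block obtained from aligned_block_alloc.  Keeping the
   allocate/free pair behind one interface means a block can never be
   handed to a different allocator's free routine by accident. */
static void
aligned_block_free(void *p)
{
    free(p);
}
```

A caller would write `void *b = aligned_block_alloc(4096);`, use the block, and later call `aligned_block_free(b)`. The design point is simply that both halves live in the same place, so a build that replaces malloc can switch both together.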
* Re: Emacs crashes
  2006-03-13 22:06 ` Kim F. Storm
  2006-03-14 0:39   ` Kenichi Handa
@ 2006-03-14 1:02   ` Juanma Barranquero
  2006-03-14 9:36     ` David Kastrup
  2006-03-14 1:37   ` Nick Roberts
                    ` (2 subsequent siblings)
  4 siblings, 1 reply; 43+ messages in thread
From: Juanma Barranquero @ 2006-03-14 1:02 UTC

On 3/13/06, Kim F. Storm <storm@cua.dk> wrote:
> Nick Roberts <nickrob@snap.net.nz> writes:

> > It would appear to be less stable than it was two years ago:

> It *is* *much* less stable than it was a week ago!

Impossible! Absurd! We're in a feature freeze, after all...

--
                    /L/e/k/t/u

* Re: Emacs crashes
  2006-03-14 1:02   ` Juanma Barranquero
@ 2006-03-14 9:36     ` David Kastrup
  2006-03-14 11:59       ` Juanma Barranquero
  0 siblings, 1 reply; 43+ messages in thread
From: David Kastrup @ 2006-03-14 9:36 UTC
  Cc: emacs-devel

"Juanma Barranquero" <lekktu@gmail.com> writes:

> On 3/13/06, Kim F. Storm <storm@cua.dk> wrote:
>> Nick Roberts <nickrob@snap.net.nz> writes:
>
>> > It would appear to be less stable than it was two years ago:
>
>> It *is* *much* less stable than it was a week ago!
>
> Impossible! Absurd! We're in a feature freeze, after all...

New bugs can be introduced by fixing old ones.

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum

* Re: Emacs crashes
  2006-03-14 9:36     ` David Kastrup
@ 2006-03-14 11:59       ` Juanma Barranquero
  2006-03-14 17:45         ` Richard Stallman
  0 siblings, 1 reply; 43+ messages in thread
From: Juanma Barranquero @ 2006-03-14 11:59 UTC
  Cc: emacs-devel

On 3/14/06, David Kastrup <dak@gnu.org> wrote:
> "Juanma Barranquero" <lekktu@gmail.com> writes:

> New bugs can be introduced by fixing old ones.

And that's probably the case right now. I don't doubt it.

Still, I find it difficult to take the freeze seriously. There must be
something wrong with me, or perhaps in the air.

--
                    /L/e/k/t/u

* Re: Emacs crashes
  2006-03-14 11:59       ` Juanma Barranquero
@ 2006-03-14 17:45         ` Richard Stallman
  2006-03-15 8:58           ` Juanma Barranquero
  0 siblings, 1 reply; 43+ messages in thread
From: Richard Stallman @ 2006-03-14 17:45 UTC
  Cc: emacs-devel

    Still I find difficult to take the freeze seriously. There must be
    something wrong with me, or perhaps in the air.

If that is what you think, would you please not say it here?
My life is already very frustrating, and you are making it worse.

If you want to blow off steam, please do it privately, not on
this list.

* Re: Emacs crashes
  2006-03-14 17:45         ` Richard Stallman
@ 2006-03-15 8:58           ` Juanma Barranquero
  2006-03-17 16:32             ` Richard Stallman
  0 siblings, 1 reply; 43+ messages in thread
From: Juanma Barranquero @ 2006-03-15 8:58 UTC

On 3/14/06, Richard Stallman <rms@gnu.org> wrote:

> If that is what you think, would you please not say it here?

No, that is not what I think really. I don't think there's anything
wrong with me regarding the freeze. I honestly think there's something
wrong with "the freeze".

> My life is already very frustrating, and you are making it worse.

I'm really sorry. That was not my intention.

> If you want to blow off steam, please do it privately, not on
> this list.

No, I don't have any steam to blow off.

--
                    /L/e/k/t/u

* Re: Emacs crashes
  2006-03-15 8:58           ` Juanma Barranquero
@ 2006-03-17 16:32             ` Richard Stallman
  2006-03-17 16:41               ` Juanma Barranquero
  0 siblings, 1 reply; 43+ messages in thread
From: Richard Stallman @ 2006-03-17 16:32 UTC
  Cc: emacs-devel

    I honestly think there's something
    wrong with "the freeze".

You've stated your opinion.  Would you please not say that again here?
Saying it again just adds more stress and anger to my life.

* Re: Emacs crashes
  2006-03-17 16:32             ` Richard Stallman
@ 2006-03-17 16:41               ` Juanma Barranquero
  0 siblings, 0 replies; 43+ messages in thread
From: Juanma Barranquero @ 2006-03-17 16:41 UTC

On 3/17/06, Richard Stallman <rms@gnu.org> wrote:

> You've stated your opinion.  Would you please not say that again here?

I'll try to remember it.

--
                    /L/e/k/t/u

* Re: Emacs crashes
  2006-03-13 22:06 ` Kim F. Storm
  2006-03-14 0:39   ` Kenichi Handa
  2006-03-14 1:02   ` Juanma Barranquero
@ 2006-03-14 1:37   ` Nick Roberts
  2006-03-14 16:07   ` Chong Yidong
  2006-03-14 16:09   ` Richard Stallman
  4 siblings, 0 replies; 43+ messages in thread
From: Nick Roberts @ 2006-03-14 1:37 UTC
  Cc: emacs-devel

 > > I've had Emacs crash/hang in three different ways in recent days.
 >
 > I can second that.  I had another crash today, so it has crashed on me
 > four times in the last week.
 >
 > I suspect the recent changes to the handling (unwind etc) of x errors,
 > but I have no proof, as there is no similarity to the crashes (except
 > that it has now crashed twice in malloc_consolidate (libc internal)
 > called from emacs_blocked_malloc called from XtVaGetValues in
 > x_set_toolkit_scroll_bar_thumb).

Well, mine are different to the ones I reported in February, which always
included:

...
#4  0x080e89d1 in x_catch_errors_unwind (dummy=137858041) at xterm.c:7543
#5  0x0818dc6e in unbind_to (count=44, value=137858041) at eval.c:3233

I attach the bottom part of the backtraces for the garbage collection related
crashes below.

-- 
Nick                                           http://www.inet.net.nz/~nickrob

...
#1283 0x0817417e in mark_object (arg=146162697) at alloc.c:5575
#1284 0x0817417e in mark_object (arg=146839721) at alloc.c:5575
#1285 0x08174136 in mark_object (arg=137870564) at alloc.c:5562
#1286 0x08173422 in Fgarbage_collect () at alloc.c:5022
#1287 0x0818de36 in Ffuncall (nargs=2, args=0xfefe2990) at eval.c:2839
#1288 0x0818dcc4 in call1 (fn=137906777, arg1=144985555) at eval.c:2690
#1289 0x08114e0e in show_help_echo (help=144985555, window=137869513,
    object=137869513, pos=-8, ok_to_overwrite_keystroke_echo=0)
    at keyboard.c:2309
#1290 0x08116524 in read_char (commandflag=1, nmaps=4, maps=0xfefe2c30,
    prev_event=137869513, used_mouse_menu=0xfefe2d2c) at keyboard.c:3155
#1291 0x0811e7fc in read_key_sequence (keybuf=0xfefe2e90, bufsize=30,
    prompt=137869513, dont_downcase_last=0, can_return_switch_frame=1,
    fix_current_buffer=1) at keyboard.c:8874
#1292 0x08112d25 in command_loop_1 () at keyboard.c:1536
#1293 0x0818bf37 in internal_condition_case (bfun=0x8112a27 <command_loop_1>,
    handlers=137914153, hfun=0x811256f <cmd_error>) at eval.c:1473
#1294 0x081128a0 in command_loop_2 () at keyboard.c:1328
#1295 0x0818b9b6 in internal_catch (tag=137910385,
    func=0x8112882 <command_loop_2>, arg=137869513) at eval.c:1211
#1296 0x08112854 in command_loop () at keyboard.c:1307
#1297 0x081122ee in recursive_edit_1 () at keyboard.c:1000
#1298 0x0811242f in Frecursive_edit () at keyboard.c:1061
#1299 0x08110d0c in main (argc=1, argv=0xfefe34d4) at emacs.c:1789

and

...
#19 0x0817417e in mark_object (arg=172548329) at alloc.c:5575
#20 0x08174136 in mark_object (arg=137870564) at alloc.c:5562
#21 0x08173422 in Fgarbage_collect () at alloc.c:5022
#22 0x0818cfc9 in Feval (form=177331709) at eval.c:2138
#23 0x0818c065 in internal_condition_case_1 (bfun=0x818ced9 <Feval>,
    arg=177331709, handlers=137914153,
    hfun=0x811bd6f <menu_item_eval_property_1>) at eval.c:1521
#24 0x0811bdf6 in menu_item_eval_property (sexpr=177331709)
    at keyboard.c:7198
#25 0x08125106 in get_keyelt (object=138087793, autoload=1) at keymap.c:821
#26 0x08124c2d in access_keymap (map=137857181, idx=137902585, t_ok=2,
    noinherit=0, autoload=1) at keymap.c:651
#27 0x0811cab6 in tool_bar_items (reuse=169488892, nitems=0xfef6fae4)
    at keyboard.c:7660
#28 0x0807501b in update_tool_bar (f=0x9125da8, save_match_data=0)
    at xdisp.c:9247
#29 0x08074a38 in prepare_menu_bars () at xdisp.c:8952
#30 0x08077bdc in redisplay_internal (preserve_echo_area=0)
    at xdisp.c:10703
#31 0x08076d29 in redisplay () at xdisp.c:10292
#32 0x08115328 in read_char (commandflag=1, nmaps=4, maps=0xfef70370,
    prev_event=137869513, used_mouse_menu=0xfef7046c) at keyboard.c:2549
#33 0x0811e7fc in read_key_sequence (keybuf=0xfef705d0, bufsize=30,
    prompt=137869513, dont_downcase_last=0, can_return_switch_frame=1,
    fix_current_buffer=1) at keyboard.c:8874
#34 0x08112d25 in command_loop_1 () at keyboard.c:1536
#35 0x0818bf37 in internal_condition_case (bfun=0x8112a27 <command_loop_1>,
    handlers=137914153, hfun=0x811256f <cmd_error>) at eval.c:1473
#36 0x081128a0 in command_loop_2 () at keyboard.c:1328
#37 0x0818b9b6 in internal_catch (tag=137910385,
    func=0x8112882 <command_loop_2>, arg=137869513) at eval.c:1211
#38 0x08112854 in command_loop () at keyboard.c:1307
#39 0x081122ee in recursive_edit_1 () at keyboard.c:1000
#40 0x0811242f in Frecursive_edit () at keyboard.c:1061
#41 0x08110d0c in main (argc=1, argv=0xfef70c14) at emacs.c:1789

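Backtraces with over a thousand nested mark_object frames raise the question of how deep a recursive marker can legitimately go. The sketch below is a generic model under stated assumptions, not the code in alloc.c: `struct cell` and `mark_cell` are hypothetical, and the marker recurses on one field (car) while looping over the other (cdr). It shows that the recursion depth follows the nesting depth of the data, and that the mark bit is the only thing stopping a cycle; a structure corrupted so that marks are lost or pointers run wild could therefore recurse without bound.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cons-like cell for illustration only. */
struct cell {
    struct cell *car, *cdr;
    int marked;
};

/* Mark everything reachable from C.  DEPTH is the recursion level of
   this call (pass 1 at the top).  Returns the deepest level at which a
   cell was marked, or 0 if nothing new was marked. */
static int
mark_cell(struct cell *c, int depth)
{
    int deepest = 0;
    /* Iterate along cdr; the mark bit stops us on shared or cyclic data. */
    while (c != NULL && !c->marked) {
        c->marked = 1;
        if (depth > deepest)
            deepest = depth;
        /* Recurse only for car, so nesting depth drives stack depth. */
        int d = mark_cell(c->car, depth + 1);
        if (d > deepest)
            deepest = d;
        c = c->cdr;
    }
    return deepest;
}
```

For 1000 cells linked through cdr, `mark_cell(head, 1)` stays at depth 1; linked through car instead, the same 1000 cells need 1000 nested calls. Whether the frames in the backtraces above come from genuinely deep data or from corruption cannot be told from this model.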
* Re: Emacs crashes
  2006-03-13 22:06 ` Kim F. Storm
                    ` (2 preceding siblings ...)
  2006-03-14 1:37   ` Nick Roberts
@ 2006-03-14 16:07   ` Chong Yidong
  2006-03-14 16:15     ` Kim F. Storm
  2006-03-14 16:09   ` Richard Stallman
  4 siblings, 1 reply; 43+ messages in thread
From: Chong Yidong @ 2006-03-14 16:07 UTC
  Cc: Nick Roberts, emacs-devel

storm@cua.dk (Kim F. Storm) writes:

> Nick Roberts <nickrob@snap.net.nz> writes:
>
>> I've had Emacs crash/hang in three different ways in recent days.
>
> I can second that.  I had another crash today, so it has crashed on me
> four times in the last week.
>
> I suspect the recent changes to the handling (unwind etc) of x errors,
> but I have no proof, as there is no similarity to the crashes (except
> that it has now crashed twice in malloc_consolidate (libc internal)
> called from emacs_blocked_malloc called from XtVaGetValues in
> x_set_toolkit_scroll_bar_thumb).

You could revert those changes (if you like, I can send you a single
patch reverting just my changes), run Emacs for a while, and see if
the crashes still happen.

If it is really the cause, I could change x_catch_errors so that
it doesn't use malloc; maybe that will help.

* Re: Emacs crashes
  2006-03-14 16:07   ` Chong Yidong
@ 2006-03-14 16:15     ` Kim F. Storm
  0 siblings, 0 replies; 43+ messages in thread
From: Kim F. Storm @ 2006-03-14 16:15 UTC
  Cc: Nick Roberts, emacs-devel

Chong Yidong <cyd@stupidchicken.com> writes:

> storm@cua.dk (Kim F. Storm) writes:
>
>> Nick Roberts <nickrob@snap.net.nz> writes:
>>
>>> I've had Emacs crash/hang in three different ways in recent days.
>>
>> I can second that.  I had another crash today, so it has crashed on me
>> four times in the last week.
>>
>> I suspect the recent changes to the handling (unwind etc) of x errors,
>> but I have no proof, as there is no similarity to the crashes (except
>> that it has now crashed twice in malloc_consolidate (libc internal)
>> called from emacs_blocked_malloc called from XtVaGetValues in
>> x_set_toolkit_scroll_bar_thumb).
>
> You could revert those changes (if you like, I can send you a single
> patch reverting just my changes), run Emacs for a while, and see if
> the crashes still happen.
>
> If it is really the cause, I could change the x_catch_errors so that
> it doesn't use malloc; maybe that will help.

Except for those changes (and there have been quite a lot on top of
each other!), I only see the following (related) change which may be
at play here:

2006-02-26  Stefan Monnier  <monnier@iro.umontreal.ca>

        * lisp.h (struct specbinding, specpdl_ptr): Remove the volatile
        qualifier which was trying to avoid the bug that was fixed by
        yesterday's changes to xterm.c.

-- 
Kim F. Storm <storm@cua.dk> http://www.cua.dk

* Re: Emacs crashes
  2006-03-13 22:06 ` Kim F. Storm
                    ` (3 preceding siblings ...)
  2006-03-14 16:07   ` Chong Yidong
@ 2006-03-14 16:09   ` Richard Stallman
  2006-03-14 20:47     ` Kim F. Storm
  2006-03-15 15:41     ` Kim F. Storm
  4 siblings, 2 replies; 43+ messages in thread
From: Richard Stallman @ 2006-03-14 16:09 UTC
  Cc: nickrob, emacs-devel

Below are all the C-level changes in the past 14 days that are not
specific to Windows or MacOS, and that are before the point at which
Handa reports his Emacs was compiled.  There are not very many of
them.

So that means people could try reverting one of these changes and see
if the crashes stop.  If we try each of them, and record which ones
have been tried, we should find which one it is.

If you try reverting one of these changes and still get a crash,
please put a note into src/ChangeLog saying "Checked DATE YOURNAME"
on a line just after the header line.

It would also be useful if people make a checkout of the March 1 sources
and edit with them for a while, to verify that they indeed do not crash.

2006-03-09  Kenichi Handa  <handa@m17n.org>

        * coding.c (DECODE_EMACS_MULE_COMPOSITION_CHAR): Fix decoding
        ASCII component of a composition.

2006-03-08  Luc Teirlinck  <teirllm@auburn.edu>

        * window.c: Declare preserve_y as a static global variable.
        (window_scroll_pixel_based): No longer declare preserve_y; it is
        global now.
        (syms_of_window): Set preserve_y to -1.

2006-03-06  Chong Yidong  <cyd@stupidchicken.com>

        * xdisp.c (handle_invisible_prop): Don't update it->position with
        a buffer position if we're in a display string.

2006-03-05  Andreas Schwab  <schwab@suse.de>

        * xselect.c (x_catch_errors_unwind): Fix missing return value.

2006-03-02  Kim F. Storm  <storm@cua.dk>

        * frame.h (struct frame): New member n_tool_bar_rows.

        * xdisp.c: Minimize the unpleasant visual impact of the requirement
        that non-toolkit tool-bars must occupy an integral number of screen
        lines, by distributing the rows evenly over the tool-bar screen area.
        (Vtool_bar_border): New variable.
        (syms_of_xdisp): DEFVAR_LISP it.
        (display_tool_bar_line): Add HEIGHT arg for desired row height.
        Make tool-bar row the desired height.  Use default face for border
        below tool-bar.
        (tool_bar_lines_needed): Add N_ROWS arg.  Use it to return number
        of actual tool-bar rows.
        (redisplay_tool_bar): Calculate f->n_tool_bar_rows initially.
        Adjust the height of the tool-bar rows to fill tool-bar screen area.
        (redisplay_tool_bar): Calculate f->n_tool_bar_rows when tool-bar
        area is resized.

2006-03-01  Luc Teirlinck  <teirllm@auburn.edu>

        * search.c (Fregexp_quote): Do not precede a literal `]' with two
        backslashes to try to make clear that it has a literal meaning; it
        does not do that.  (It could close a character alternative
        containing a backslash.)

* Re: Emacs crashes
  2006-03-14 16:09   ` Richard Stallman
@ 2006-03-14 20:47     ` Kim F. Storm
  2006-03-14 21:35       ` Chong Yidong
                        ` (3 more replies)
  2006-03-15 15:41     ` Kim F. Storm
  1 sibling, 4 replies; 43+ messages in thread
From: Kim F. Storm @ 2006-03-14 20:47 UTC
  Cc: nickrob, emacs-devel

Richard Stallman <rms@gnu.org> writes:

> Below are all the C-level changes in the past 14 days that are not
> specific to Windows or MacOS, and that are before the point at which
> Handa reports his Emacs was compiled.

I had a crash on Mar 6, two on Mar 8, and another on Mar 12.

I don't know if it is incidental, but just before the first crash, I had
put (server-start) into my .emacs and used it via emacsclient in
connection with some of those crashes.

The other change I made is that before Mar 6, I hadn't used an
up-to-date CVS emacs very intensively for some weeks, but on that day,
and the days after I updated and used it quite intensively...

So I don't know exactly when the crashes started ... but it may be
something done before Mar 1.

> There are not very many of
> them.

True, but if you go back another week, there are some quite fundamental
changes in the way X errors are handled -- which I think is more likely
to be the cause of these problems...

>
> So that means people could try reverting one of these changes and see
> if the crashes stop.  If we try each of them, and record which ones
> have been tried, we should find which one it is.
>
> If you try reverting one of these changes and still get a crash,
> please put a note into src/ChangeLog saying "Checked DATE YOURNAME"
> on a line just after the header line.
>
> It would also be useful if people make a checkout of the March 1 sources
> and edit with them for a while, to verify that they indeed do not crash.
>

I will do that.

-- 
Kim F. Storm <storm@cua.dk> http://www.cua.dk

* Re: Emacs crashes
  2006-03-14 20:47     ` Kim F. Storm
@ 2006-03-14 21:35       ` Chong Yidong
  2006-03-15 20:21         ` Richard Stallman
  2006-03-14 22:38       ` Kim F. Storm
                        ` (2 subsequent siblings)
  3 siblings, 1 reply; 43+ messages in thread
From: Chong Yidong @ 2006-03-14 21:35 UTC
  Cc: nickrob, rms, emacs-devel

storm@cua.dk (Kim F. Storm) writes:

> True, but if you go back another week, there are some quite fundamental
> changes in the way X errors are handled -- which I think is more likely
> to be the cause of these problems...
>>
>> It would also be useful if people make a checkout of the March 1 sources
>> and edit with them for a while, to verify that they indeed do not crash.
>>
>
> I will do that.

If it seems likely that the X error handler changes are at fault, I
can revert them.  I have a pretty good idea how to fix the crashes
they were originally meant to address in a different, less intrusive
way (the idea is to make those functions that call x_catch_errors in a
signal handler instead call XSetErrorHandler to install a temporary
"ignore all errors" handler.)

Unfortunately, I have not been able to experience these crashes
myself.

* Re: Emacs crashes
  2006-03-14 21:35       ` Chong Yidong
@ 2006-03-15 20:21         ` Richard Stallman
  0 siblings, 0 replies; 43+ messages in thread
From: Richard Stallman @ 2006-03-15 20:21 UTC
  Cc: nickrob, emacs-devel, storm

    If it seems likely that the X error handler changes are at fault, I
    can revert them.  I have a pretty good idea how to fix the crashes
    they were originally meant to address in a different, less intrusive
    way (the idea is to make those functions that call x_catch_errors in a
    signal handler instead call XSetErrorHandler to install a temporary
    "ignore all errors" handler.)

It is worth a try, to see if this solves the problem.

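The alternative proposed here (temporarily installing an "ignore all errors" handler instead of going through x_catch_errors) follows a save/install/restore shape that can be shown without Xlib. The sketch below is a self-contained model, not X code: `set_error_handler`, `report_error` and `call_ignoring_errors` are hypothetical names, with `set_error_handler` merely mirroring the fact that Xlib's XSetErrorHandler returns the previously installed handler. The point of the pattern is that it needs no allocation and no unwind-protect entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of an error-handler slot; not the Xlib API. */
typedef int (*error_handler)(int error_code);

static int default_handler_calls;

/* Stand-in for the normal handler: just count the errors it sees. */
static int
default_handler(int error_code)
{
    (void) error_code;
    default_handler_calls++;
    return 0;
}

/* The temporary handler: swallow every error. */
static int
ignore_all_errors(int error_code)
{
    (void) error_code;
    return 0;
}

static error_handler current_handler = default_handler;

/* Install H and return the handler it replaces. */
static error_handler
set_error_handler(error_handler h)
{
    error_handler old = current_handler;
    current_handler = h;
    return old;
}

/* Deliver an error to whichever handler is installed. */
static void
report_error(int code)
{
    current_handler(code);
}

/* Run OP with errors ignored, then put the previous handler back. */
static void
call_ignoring_errors(void (*op)(void))
{
    error_handler old = set_error_handler(ignore_all_errors);
    op();
    set_error_handler(old);
}

/* An operation that always raises an error, for demonstration. */
static void
failing_operation(void)
{
    report_error(3);
}
```

Calling `call_ignoring_errors(failing_operation)` leaves `default_handler_calls` at 0, and a later plain `failing_operation()` reaches the restored default handler again. With real X the errors arrive asynchronously, so an actual implementation would also have to flush pending requests before restoring the handler; that part is outside this model.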
* Re: Emacs crashes
  2006-03-14 20:47     ` Kim F. Storm
  2006-03-14 21:35       ` Chong Yidong
@ 2006-03-14 22:38       ` Kim F. Storm
  2006-03-15 9:22         ` Nick Roberts
  2006-03-15 3:21       ` Giorgos Keramidas
  2006-03-15 20:21       ` Richard Stallman
  3 siblings, 1 reply; 43+ messages in thread
From: Kim F. Storm @ 2006-03-14 22:38 UTC
  Cc: emacs-devel

storm@cua.dk (Kim F. Storm) writes:

>> It would also be useful if people make a checkout of the March 1 sources
>> and edit with them for a while, to verify that they indeed do not crash.
>>
>
> I will do that.

I have checked out versions from Mar 1, Mar 7 and today, and tried
various tasks with each of them for a while without any of them
crashing.

Since the crashes only happen occasionally, it is a bit hard to say
anything based on this short period of time.

I will try to use the Mar 1 version for a few days.

If it is a problem with the X error handler, can someone tell me what
I can possibly try to increase the chance of triggering an x error
that may trigger a crash?

-- 
Kim F. Storm <storm@cua.dk> http://www.cua.dk

* Re: Emacs crashes
  2006-03-14 22:38       ` Kim F. Storm
@ 2006-03-15 9:22         ` Nick Roberts
  2006-03-15 9:28           ` David Kastrup
  2006-03-15 11:35           ` Jan D.
  0 siblings, 2 replies; 43+ messages in thread
From: Nick Roberts @ 2006-03-15 9:22 UTC
  Cc: emacs-devel

 > If it is a problem with the X error handler, can someone tell me what
 > I can possibly try to increase the chance of triggering an x error
 > that may trigger a crash?

It's probably not X error related, but I can get Emacs (built Mar 13) to
crash every time by:

M-x tool-bar-mode
C-x d <RET>

I guess if others can't there might be something wrong with my build
(it doesn't crash without the tool bar).

-- 
Nick                                           http://www.inet.net.nz/~nickrob

* Re: Emacs crashes
  2006-03-15 9:22         ` Nick Roberts
@ 2006-03-15 9:28           ` David Kastrup
  2006-03-15 11:35           ` Jan D.
  1 sibling, 0 replies; 43+ messages in thread
From: David Kastrup @ 2006-03-15 9:28 UTC
  Cc: emacs-devel, Kim F. Storm

Nick Roberts <nickrob@snap.net.nz> writes:

>  > If it is a problem with the X error handler, can someone tell me what
>  > I can possibly try to increase the chance of triggering an x error
>  > that may trigger a crash?
>
> Its probably not x error related but I can get Emacs (built Mar 13) to crash
> every time by:
>
> M-x tool-bar-mode
> C-x d <RET>
>
> I guess if others can't there might be something wrong with my build
> (it doesn't crash without the tool bar).

Emacs starts with the tool bar on by default.  Do you indeed turn it
_off_ and then Emacs crashes?  Or have you omitted relevant details
from your setup?

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum

* Re: Emacs crashes 2006-03-15 9:22 ` Nick Roberts 2006-03-15 9:28 ` David Kastrup @ 2006-03-15 11:35 ` Jan D. 1 sibling, 0 replies; 43+ messages in thread From: Jan D. @ 2006-03-15 11:35 UTC (permalink / raw) Cc: emacs-devel, Kim F. Storm > > If it is a problem with the X error handler, can someone tell me what > > I can possibly try to increase the chance of triggering an x error > > that may trigger a crash? > > Its probably not x error related but I can get Emacs (built Mar 13) to crash > every time by: > > M-x tool-bar-mode > C-x d <RET> > > I guess if others can't there might be something wrong with my build > (it doesn't crash without the tool bar). I can't. Tested with and without tool bar (not clear what you have) on x86 and AMD-64. Which toolkit did you build with? Jan D. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: Emacs crashes 2006-03-14 20:47 ` Kim F. Storm 2006-03-14 21:35 ` Chong Yidong 2006-03-14 22:38 ` Kim F. Storm @ 2006-03-15 3:21 ` Giorgos Keramidas 2006-03-15 20:21 ` Richard Stallman 3 siblings, 0 replies; 43+ messages in thread From: Giorgos Keramidas @ 2006-03-15 3:21 UTC (permalink / raw) Cc: nickrob, rms, emacs-devel On 2006-03-14 21:47, "Kim F. Storm" <storm@cua.dk> wrote: >Richard Stallman <rms@gnu.org> writes: >> Below are all the C-level changes in the past 14 days that are not >> specific to Windows or MacOS, and that are before the point at which >> Handa reports his Emacs was compiled. > > I had a crash on Mar 6, two on Mar 8, and another on Mar 12. > > I don't know if it incidental, but just before the first crash, I had > put (server-start) into my .emacs and used it via emacsclient in > connection with some of those crashes. FWIW, I've never had a crash after the posix_memalign() fix committed by Stefan, but I'm using only part of the full feature set of current Emacs builds. My current snapshot is compiled with: --prefix=/opt/emacs --without-x and I haven't used (server-start) for a week or so. Of course this doesn't prove anything. Just that the crash hasn't happened with this particular setup... yet. - Giorgos ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: Emacs crashes 2006-03-14 20:47 ` Kim F. Storm ` (2 preceding siblings ...) 2006-03-15 3:21 ` Giorgos Keramidas @ 2006-03-15 20:21 ` Richard Stallman 3 siblings, 0 replies; 43+ messages in thread From: Richard Stallman @ 2006-03-15 20:21 UTC (permalink / raw) Cc: nickrob, emacs-devel I don't know if it incidental, but just before the first crash, I had put (server-start) into my .emacs and used it via emacsclient in connection with some of those crashes. Could you (or someone) try building an Emacs from mid-Feb and put in (server-start) and see if crashes happen? ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: Emacs crashes 2006-03-14 16:09 ` Richard Stallman 2006-03-14 20:47 ` Kim F. Storm @ 2006-03-15 15:41 ` Kim F. Storm 2006-03-15 17:05 ` Luc Teirlinck ` (2 more replies) 1 sibling, 3 replies; 43+ messages in thread From: Kim F. Storm @ 2006-03-15 15:41 UTC (permalink / raw) Cc: nickrob, emacs-devel Richard Stallman <rms@gnu.org> writes: > Below are all the C-level changes in the past 14 days that are not > specific to Windows or MacOS, and that are before the point at which > Handa reports his Emacs was compiled. There are not very many of > them. Well, I think I have located a probable cause of the crashes in function extend_face_to_end_of_line, and the way it is used in redisplay_tool_bar. My recent changes to the tool-bar display have revealed this error. I don't know if it can explain other crashes we have seen. I will install a fix later today. My humble apologies to Chong Yidong for pointing at his X error patches! They are probably working just fine. Sorry!!! -- Kim F. Storm <storm@cua.dk> http://www.cua.dk ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: Emacs crashes 2006-03-15 15:41 ` Kim F. Storm @ 2006-03-15 17:05 ` Luc Teirlinck 2006-03-15 17:21 ` Chong Yidong 2006-03-15 19:03 ` Kim F. Storm 2 siblings, 0 replies; 43+ messages in thread From: Luc Teirlinck @ 2006-03-15 17:05 UTC (permalink / raw) Cc: nickrob, rms, emacs-devel Kim Storm wrote: Well, I think I have located a probable cause of the crashes in function extend_face_to_end_of_line, and the way it is used in redisplay_tool_bar. My recent changes to the tool-bar display have revealed this error. I don't know if it can explain other crashes we have seen. It might explain why I (and other people) never saw _any_ of these crashes: I do not use the toolbar. Sincerely, Luc. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: Emacs crashes 2006-03-15 15:41 ` Kim F. Storm 2006-03-15 17:05 ` Luc Teirlinck @ 2006-03-15 17:21 ` Chong Yidong 2006-03-15 19:03 ` Kim F. Storm 2 siblings, 0 replies; 43+ messages in thread From: Chong Yidong @ 2006-03-15 17:21 UTC (permalink / raw) Cc: nickrob, rms, emacs-devel storm@cua.dk (Kim F. Storm) writes: > My recent changes to the tool-bar display has revealed this error. > I don't know if it can explain other crashes we have seen. > > I will install a fix later today. > > > My humble apologies to Chong Yidong for pointing at his X error patches! > They are probably working just fine. Sorry!!! No need to --- according to the changelog, they were the only major changes in the time period. I'm guessing you already had your tool-bar changes in your tree; that was why you started getting crashes in the beginning of March, earlier than everyone else. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: Emacs crashes 2006-03-15 15:41 ` Kim F. Storm 2006-03-15 17:05 ` Luc Teirlinck 2006-03-15 17:21 ` Chong Yidong @ 2006-03-15 19:03 ` Kim F. Storm 2006-03-15 21:40 ` Nick Roberts 2 siblings, 1 reply; 43+ messages in thread From: Kim F. Storm @ 2006-03-15 19:03 UTC (permalink / raw) Cc: emacs-devel storm@cua.dk (Kim F. Storm) writes: > I will install a fix later today. Done. I also fixed an old bug where the tool-bar window was twice the necessary height, but no icons were shown if the tool-bar row is exactly the same width as the window. -- Kim F. Storm <storm@cua.dk> http://www.cua.dk ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: Emacs crashes 2006-03-15 19:03 ` Kim F. Storm @ 2006-03-15 21:40 ` Nick Roberts 0 siblings, 0 replies; 43+ messages in thread From: Nick Roberts @ 2006-03-15 21:40 UTC (permalink / raw) > > I will install a fix later today. > > Done. > > > I also fixed an old bug where the tool-bar window was twice the > necessary height, but no icons were shown if the tool-bar row is > exactly the same width as the window. [Kim, I've not mailed you directly because everything I send to you gets rejected as spam] I can't reproduce the crash now. However, I also couldn't reproduce it after an intermediate build without your changes. And the backtrace for one crash was in PRODUCE_GLYPHS in display_tool_bar_line, which is _before_ extend_face_to_end_of_line. I also found out that the crash didn't occur with -Q (which is presumably why no-one else saw it). I had presumed it was a display problem, but just commenting out one line: (setq gud-pdb-command-name "/usr/lib/python2.3/pdb.py") or even shortening the string length, stopped the crash, so I guess it was memory related. Nick ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: Emacs crashes 2006-03-13 20:23 Emacs crashes Nick Roberts 2006-03-13 20:47 ` Chong Yidong 2006-03-13 22:06 ` Kim F. Storm @ 2006-03-14 4:33 ` Eli Zaretskii 2006-03-14 20:45 ` Nick Roberts 2 siblings, 1 reply; 43+ messages in thread From: Eli Zaretskii @ 2006-03-14 4:33 UTC (permalink / raw) Cc: emacs-devel > From: Nick Roberts <nickrob@snap.net.nz> > Date: Tue, 14 Mar 2006 09:23:23 +1300 > > 1) It hangs with some kind of mutex lock which I don't understand with a brief > backtrace of three functions in libc, I think. The only thing I can do, > after attaching with GDB, is kill it. Please show at least that short backtrace. > 2) A garbage collection related crash where mark_object is called recursively > literally thousands of times, The fact that there are thousands of recursive calls to mark_object is not in itself a sign of a problem. It is normal for the mark phase to be deeply recursive. etc/DEBUG has some text on how to debug crashes during GC. Could you try to use those techniques and see what data structure is corrupted? ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: Emacs crashes 2006-03-14 4:33 ` Eli Zaretskii @ 2006-03-14 20:45 ` Nick Roberts 2006-03-15 4:43 ` Eli Zaretskii 2006-03-15 20:21 ` Richard Stallman 0 siblings, 2 replies; 43+ messages in thread From: Nick Roberts @ 2006-03-14 20:45 UTC (permalink / raw) Cc: emacs-devel > > 1) It hangs with some kind of mutex lock which I don't understand with a > > brief backtrace of three functions in libc, I think. The only thing I > > can do, after attaching with GDB, is kill it. > > Please show at least that short backtrace. It happened twice but I've lost the details, sorry. > > 2) A garbage collection related crash where mark_object is called > > recursively literally thousands of times, > > The fact that there are thousands of recursive calls to mark_object is > not in itself a sign of a problem. It is normal for the mark phase to > be deeply recursive. OK, I didn't know that. Perhaps I should look at the bottom of the backtrace (i.e. low frame numbers) instead of the top. > etc/DEBUG has some text on how to debug crashes during GC. Could you > try to use those techniques and see what data structure is corrupted? I don't have a live process to debug but I think I can get it to crash again. Anyway, this is what I have found using the core file: (gdb) p last_marked_index $1 = 482 (gdb) p last_marked[482] $2 = 173755437 (gdb) xtype Lisp_Cons (gdb) xcons $3 = (struct Lisp_Cons *) 0xa5b4c28 { car = 0x83bc641, u = { cdr = 0x837b8c9, chain = 0x837b8c9 } } (gdb) p last_marked[481] $4 = 167781611 (gdb) xtype Lisp_String (gdb) xcons $5 = (struct Lisp_Cons *) 0xa0024e8 { car = 0x4, u = { cdr = 0xffffffff, chain = 0xffffffff } } These last addresses look suspect. I don't know what to do next. Am I right to assume that 481 is the index of the very last marked object, 480 the one before, etc., and that 482 is the index of the oldest marked object in the array held in a circular fashion? 
Incidentally with gdb-ui, if you display a watch expression in the speedbar and press 'p' on a component (with a live process), Emacs will print the s-expression in the GUD buffer. I've just extended it to work for arrays so you can quickly look at the s-expression of any element of last_marked, although I don't know if the others are of interest. -- Nick http://www.inet.net.nz/~nickrob ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: Emacs crashes 2006-03-14 20:45 ` Nick Roberts @ 2006-03-15 4:43 ` Eli Zaretskii 2006-03-15 7:49 ` Nick Roberts 2006-03-15 20:21 ` Richard Stallman 1 sibling, 1 reply; 43+ messages in thread From: Eli Zaretskii @ 2006-03-15 4:43 UTC (permalink / raw) Cc: emacs-devel > From: Nick Roberts <nickrob@snap.net.nz> > Date: Wed, 15 Mar 2006 09:45:22 +1300 > Cc: emacs-devel@gnu.org > > > The fact that there are thousands of recursive calls to mark_object is > > not in itself a sign of a problem. It is normal for the mark phase to > > be deeply recursive. > > OK, I didn't know that. Perhaps I should look at the bottom of the backtrace > (i.e low frame nos) instead of the top. Actually, it's the other way around: you need to look at the frames that call mark_object and its subroutines, and try to correlate those frames with the contents of last_marked[] array. Through these two pieces of evidence, you should reconstruct the Lisp data structure that is being marked (recursively) at the point of crash. Once the offending data structure is identified, i.e. you know the name of the Lisp variable/function/whatever that was corrupted, the next step is to try to figure out how it gets corrupted. > (gdb) p last_marked_index > $1 = 482 > (gdb) p last_marked[482] > $2 = 173755437 > (gdb) xtype > Lisp_Cons > (gdb) xcons > $3 = (struct Lisp_Cons *) 0xa5b4c28 > { > car = 0x83bc641, > u = { > cdr = 0x837b8c9, > chain = 0x837b8c9 > } > } > (gdb) p last_marked[481] > $4 = 167781611 > (gdb) xtype > Lisp_String > (gdb) xcons > $5 = (struct Lisp_Cons *) 0xa0024e8 > { > car = 0x4, > u = { > cdr = 0xffffffff, > chain = 0xffffffff > } > } > > These last addresses looks suspect Yes. > I don't know what to do next. You need to go back in time ;-). Print previous values in last_marked[] and correlate them with the backtrace. 
In each frame of the backtrace, you will see what kind of Lisp primitive data type is being marked, but since some subroutines of mark_object have loops, you won't see all the components being marked in the backtrace, so last_marked[] will fill in the blanks. For each Lisp type you find in last_marked[], try to establish its type and name, and, if it's a string, the value. The name and the string value are the most important parts, since you can then grep the sources to find out what data structure it could belong to. Continue doing this until you find a symbol that is a global or buffer-local variable you can identify in the sources. > Am I right to assume that 481 is the index of the very last marked > object, 480 the one before etc. And that 482 is the index of the > oldest marked object in the array held in a circular fashion? Yes. You need to go from 481 backwards and examine the objects one by one. > Incidentally with gdb-ui, if you display a watch expression in the speedbar > and press 'p' on a component (with a live process), Emacs will print the > s-expression in the GUD buffer. Beware: these features invoke code inside the crashed Emacs version. Even if you have a live process, if it crashed, it is unsafe to invoke `pr' and its ilk in that session, because it will most probably get a SIGSEGV a second time. You _must_ use only the simple commands xtype, xcons, xsymbol, xstring, etc. One other thing: since you are in the middle of the mark stage of GC, some objects, notably the strings in last_marked[] array, have their mark bit set and are relocated. I think xstring doesn't know how to cope with that, so you might need to look at lisp.h and reconstruct the C pointers to the relevant C data structure manually, instead of using xstring. (This particular piece of experience is from long ago, so perhaps this problem is no longer with us with the current sources. 
Just don't be intimidated if some xstring says it cannot show the value, even though xtype said it's a string; try walking the C data structures manually.) ^ permalink raw reply [flat|nested] 43+ messages in thread
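[Editorial note] The backwards walk described above ("go from 481 backwards and examine the objects one by one") can be sketched as a plain ring-buffer traversal. This is only an illustration in Python, not Emacs code: the function name newest_first is hypothetical, and the array size of 500 is an assumption about last_marked[] that the thread itself does not state. It also assumes the array has already wrapped at least once, so that every slot holds a valid entry.

```python
def newest_first(next_index, size):
    """Yield ring-buffer slot indices from the most recently written
    slot back to the oldest one.

    next_index is the slot that will be overwritten next (the value of
    last_marked_index in the thread); size is the length of the array
    (assumed here, not taken from the thread).
    """
    for step in range(1, size + 1):
        # Python's % always returns a non-negative result, so this
        # wraps correctly from slot 0 back to slot size - 1.
        yield (next_index - step) % size


if __name__ == "__main__":
    # With last_marked_index = 482, the first slots to inspect in gdb
    # would be 481, 480, 479, ... and the last one would be 482 itself.
    order = list(newest_first(482, 500))
    for slot in order[:5]:
        print("p last_marked[%d]" % slot)
```

Each printed line is a gdb command to type by hand, followed by xtype and the matching xcons, xsymbol or xstring, as recommended above.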
* Re: Emacs crashes 2006-03-15 4:43 ` Eli Zaretskii @ 2006-03-15 7:49 ` Nick Roberts 2006-03-15 19:49 ` Eli Zaretskii 0 siblings, 1 reply; 43+ messages in thread From: Nick Roberts @ 2006-03-15 7:49 UTC (permalink / raw) Cc: emacs-devel > > (gdb) p last_marked[481] > > $4 = 167781611 > > (gdb) xtype > > Lisp_String > > (gdb) xcons > > $5 = (struct Lisp_Cons *) 0xa0024e8 > > { > > car = 0x4, > > u = { > > cdr = 0xffffffff, > > chain = 0xffffffff > > } > > } > > > > These last addresses looks suspect > > Yes. Sorry, that was a mistake, I should have typed xstring instead of xcons. (gdb) p* (struct Lisp_String *) 0xa0024e8 $15 = { size = 4, size_byte = -1, intervals = 0x10, data = 0xa66c79c "\301\b!\207" } which is what the variable ptr points to and it crashes out on the line: MARK_INTERVAL_TREE (ptr->intervals); > > I don't know what to do next. > > You need to go back in time ;-). Print previous values in > last_marked[] and correlate them with the backtrace. In each frame of > the backtrace, you will see what kind of Lisp primitive data type is > being marked, but since some subroutines of mark_object have loops, > you won't see all the components being marked in the backtrace, so > last_marked[] will fill in the blanks. > > For each Lisp type you find in last_marked[], try to establish its > type and name, and, if it's a string, the value. The name and the > string value are the most important parts, since you can then grep the > sources to find out what data structure it could belong to. Continue > doing this until you find a symbol that is a global or buffer-local > variable you can identify in the sources. Here are some values below but I can't see a connection between them. I guess I should try to work out what created (struct Lisp_String *) 0xa0024e8. 
Nick

(gdb) p last_marked[482]
$24 = 173755437
(gdb) xtyp
Lisp_Cons
(gdb) p last_marked[481]
$1 = 167781611
(gdb) xtyp
Lisp_String
(gdb) xstring
$2 = (struct Lisp_String *) 0xa0024e8 "\301\b!\207"
(gdb) p last_marked[480]
$3 = 138964225
(gdb) xtyp
Lisp_Symbol
(gdb) xsym
$4 = (struct Lisp_Symbol *) 0x8486d00 "rev"
(gdb) p last_marked[479]
$5 = 174656941
(gdb) xtyp
Lisp_Cons
(gdb) xcons
$6 = (struct Lisp_Cons *) 0xa690da8
{
  car = 0x8486d01,
  u = {
    cdr = 0x837b8c9,
    chain = 0x837b8c9
  }
}
(gdb) p last_marked[478]
$11 = 140320329
(gdb) xtyp
Lisp_Symbol
(gdb) xsym
$12 = (struct Lisp_Symbol *) 0x85d1e48 "backend"
(gdb) p last_marked[477]
$13 = 174656909
(gdb) xtyp
Lisp_Cons
(gdb) xcons
$14 = (struct Lisp_Cons *) 0xa690d88
{
  car = 0x85d1e49,
  u = {
    cdr = 0xa690dad,
    chain = 0xa690dad
  }
}
(gdb) p last_marked[476]
$21 = 175717180
(gdb) xtyp
Lisp_Vectorlike
PVEC_COMPILED
(gdb) p last_marked[475]
$22 = 137869537
(gdb) xtyp
Lisp_Symbol
(gdb) xsym
$23 = (struct Lisp_Symbol *) 0x837b8e0 "unbound"
(gdb) p last_marked[482]
$24 = 173755437
(gdb) xtyp
Lisp_Cons
(gdb) p last_marked[474]
$25 = 172548329
(gdb) xtyp
Lisp_Symbol
(gdb) xsym
$26 = (struct Lisp_Symbol *) 0xa48e0e8 "vc-default-show-log-entry"
(gdb) p last_marked[473]
$1 = 160558849
(gdb) xtyp
Lisp_Symbol
(gdb) xsym
$2 = (struct Lisp_Symbol *) 0x991ef00 "ediff-skip-merge-regions-that-differ-from-default"
(gdb) p last_marked[472]
$3 = 137869513
(gdb) xtyp
Lisp_Symbol
(gdb) xsym
$4 = (struct Lisp_Symbol *) 0x837b8c8 "nil"
(gdb) p last_marked[471]
$5 = 137869513
(gdb) xtyp
Lisp_Symbol
(gdb) xsym
$6 = (struct Lisp_Symbol *) 0x837b8c8 "nil"
(gdb) p last_marked[470]
$7 = 137869513
(gdb) xtyp
Lisp_Symbol
(gdb) xsym
$8 = (struct Lisp_Symbol *) 0x837b8c8 "nil"
(gdb) p last_marked[469]
$9 = 376392
(gdb) xtyp
Lisp_Int
(gdb) xint
$10 = 47049
(gdb) p last_marked[468]
$13 = 148534611
(gdb) xtyp
Lisp_String
(gdb) xstring
$14 = (struct Lisp_String *) 0x8da7550 "/home/nickrob/emacs/lisp/mail/sendmail.elc"

^ permalink raw reply [flat|nested] 43+ messages in thread
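[Editorial note] The raw values and the xtype/x* results in the session above are consistent with a simple scheme on this 32-bit build: the low three bits of each value select the type, and the remaining bits are either the pointer or, for integers, the value shifted left by three. The Python sketch below only restates that observation. The tag table is inferred from the numbers printed in this thread, not copied from lisp.h, and the names TAG_NAMES and decode are hypothetical.

```python
TAG_BITS = 3
TAG_MASK = (1 << TAG_BITS) - 1

# Inferred from the gdb session in this thread (raw value -> xtype output).
# Tags not seen in the session are deliberately left out.
TAG_NAMES = {
    0: "Lisp_Int",
    1: "Lisp_Symbol",
    3: "Lisp_String",
    4: "Lisp_Vectorlike",
    5: "Lisp_Cons",
}


def decode(obj):
    """Split a raw last_marked[] value into (type name, payload).

    The payload is the integer value for Lisp_Int and the untagged
    pointer for every other type.
    """
    tag = obj & TAG_MASK
    name = TAG_NAMES.get(tag, "unknown tag %d" % tag)
    if name == "Lisp_Int":
        return name, obj >> TAG_BITS
    return name, obj & ~TAG_MASK


if __name__ == "__main__":
    # A few of the raw values quoted in the session above.
    for raw in (173755437, 167781611, 138964225, 376392):
        name, payload = decode(raw)
        print("%d -> %s %#x" % (raw, name, payload))
```

Decoding by hand this way only reads numbers out of the core file, so it avoids running any code inside the crashed process.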
* Re: Emacs crashes 2006-03-15 7:49 ` Nick Roberts @ 2006-03-15 19:49 ` Eli Zaretskii 2006-03-15 21:40 ` Nick Roberts 0 siblings, 1 reply; 43+ messages in thread From: Eli Zaretskii @ 2006-03-15 19:49 UTC (permalink / raw) Cc: emacs-devel > From: Nick Roberts <nickrob@snap.net.nz> > Date: Wed, 15 Mar 2006 20:49:55 +1300 > Cc: emacs-devel@gnu.org > > (gdb) p* (struct Lisp_String *) 0xa0024e8 > $15 = { > size = 4, > size_byte = -1, > intervals = 0x10, > data = 0xa66c79c "\301\b!\207" > } > > which is what the variable ptr points to and it crashes out on the line: > > MARK_INTERVAL_TREE (ptr->intervals); I think ptr->intervals is the reason for the crash, because MARK_INTERVAL_TREE dereferences it, and 0x10 is too small to be a valid address. > Here are some values below but I can't see a connection between them. A simple list of values recorded in last_marked[] won't do. You need to correlate it with the innermost frames you see in the backtrace, and from that correlation figure out the name of the Lisp data structure that is being marked. The connection between the values recorded in last_marked[] will be revealed if you look at the code, because, e.g., when GC finds a cons, it recursively marks its car and its cdr. By looking at the code, you should be able to find this and other similar connections between the values, like A being a property of B etc. > I guess I should try to work out what created (struct Lisp_String *) > 0xa0024e8. Not who created it, but what higher-level Lisp data is it part of. Btw, looking at the value of that string, namely > (gdb) p* (struct Lisp_String *) 0xa0024e8 > $15 = { > size = 4, > size_byte = -1, > intervals = 0x10, > data = 0xa66c79c "\301\b!\207" > } It sounds like its data is some bytecode. ^ permalink raw reply [flat|nested] 43+ messages in thread
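[Editorial note] The remark that the string data looks like bytecode can be checked roughly by turning the four bytes into numbers. The Python sketch below does that. The opcode ranges in describe are an assumption based on recollection of Emacs's bytecode.c (varref at octal 010, call at 040, return at 0207, constant from 0300 upwards) and should be verified against the sources before relying on them; the function name describe is hypothetical.

```python
# The four bytes gdb printed as "\301\b!\207" (octal escapes).
DATA = b"\301\b!\207"


def describe(byte):
    """Give a rough, assumed name for one byte-code opcode.

    The ranges below are recalled from bytecode.c, not taken from this
    thread; treat them as a guess to be checked against the sources.
    """
    if 0o010 <= byte <= 0o015:
        return "varref %d" % (byte - 0o010)
    if 0o040 <= byte <= 0o045:
        return "call %d" % (byte - 0o040)
    if byte == 0o207:
        return "return"
    if byte >= 0o300:
        return "constant %d" % (byte - 0o300)
    return "unrecognized (%#o)" % byte


if __name__ == "__main__":
    for b in DATA:
        print("%3d  %#05o  %s" % (b, b, describe(b)))
```

If the assumed opcode table is right, the sequence reads as a small compiled function body that pushes a constant, pushes a variable, calls with one argument and returns, which fits the PVEC_COMPILED object seen a few slots earlier in last_marked[].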
* Re: Emacs crashes 2006-03-15 19:49 ` Eli Zaretskii @ 2006-03-15 21:40 ` Nick Roberts 2006-03-16 20:18 ` Richard Stallman 0 siblings, 1 reply; 43+ messages in thread From: Nick Roberts @ 2006-03-15 21:40 UTC (permalink / raw) Cc: emacs-devel > > Here are some values below but I can't see a connection between them. > > A simple list of values recorded in last_marked[] won't do. You need > to correlate it with the innermost frames you see in the backtrace, > and from that correlation figure out the name of the Lisp data > structure that is being marked. The connection between the values > recorded in last_marked[] will be revealed if you look at the code, > because, e.g., when GC finds a cons, it recursively marks its car and > its cdr. By looking at the code, you should be able to find this and > other similar connections between the values, like A being a property > of B etc. OK, this tells me more than I could find in DEBUG. > > I guess I should try to work out what created (struct Lisp_String *) > > 0xa0024e8. > > Not who created it, but what higher-level Lisp data is it part of. > Btw, looking at the value of that string, namely > > > (gdb) p* (struct Lisp_String *) 0xa0024e8 > > $15 = { > > size = 4, > > size_byte = -1, > > intervals = 0x10, > > data = 0xa66c79c "\301\b!\207" > > } > > It sounds like its data is some bytecode. I've rebuilt Emacs now but this information will be useful for later. -- Nick http://www.inet.net.nz/~nickrob ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: Emacs crashes 2006-03-15 21:40 ` Nick Roberts @ 2006-03-16 20:18 ` Richard Stallman 2006-03-16 21:25 ` Nick Roberts 0 siblings, 1 reply; 43+ messages in thread From: Richard Stallman @ 2006-03-16 20:18 UTC (permalink / raw) Cc: eliz, emacs-devel > A simple list of values recorded in last_marked[] won't do. You need > to correlate it with the innermost frames you see in the backtrace, > and from that correlation figure out the name of the Lisp data > structure that is being marked. The connection between the values > recorded in last_marked[] will be revealed if you look at the code, > because, e.g., when GC finds a cons, it recursively marks its car and > its cdr. By looking at the code, you should be able to find this and > other similar connections between the values, like A being a property > of B etc. OK, this tells me more than I could find in DEBUG. Could you add some of that to DEBUG? ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: Emacs crashes 2006-03-16 20:18 ` Richard Stallman @ 2006-03-16 21:25 ` Nick Roberts 2006-03-18 14:31 ` Eli Zaretskii 0 siblings, 1 reply; 43+ messages in thread From: Nick Roberts @ 2006-03-16 21:25 UTC (permalink / raw) Cc: eliz, emacs-devel > > A simple list of values recorded in last_marked[] won't do. You need > > to correlate it with the innermost frames you see in the backtrace, > > and from that correlation figure out the name of the Lisp data > > structure that is being marked. The connection between the values > > recorded in last_marked[] will be revealed if you look at the code, > > because, e.g., when GC finds a cons, it recursively marks its car and > > its cdr. By looking at the code, you should be able to find this and > > other similar connections between the values, like A being a property > > of B etc. > > OK, this tells me more than I could find in DEBUG. > > Could you add some of that to DEBUG? It would be better coming from Eli, but I will do this if he doesn't have the time. -- Nick http://www.inet.net.nz/~nickrob ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: Emacs crashes 2006-03-16 21:25 ` Nick Roberts @ 2006-03-18 14:31 ` Eli Zaretskii 0 siblings, 0 replies; 43+ messages in thread From: Eli Zaretskii @ 2006-03-18 14:31 UTC (permalink / raw) Cc: emacs-devel > From: Nick Roberts <nickrob@snap.net.nz> > Date: Fri, 17 Mar 2006 10:25:44 +1300 > Cc: eliz@gnu.org, emacs-devel@gnu.org > > > > A simple list of values recorded in last_marked[] won't do. You need > > > to correlate it with the innermost frames you see in the backtrace, > > > and from that correlation figure out the name of the Lisp data > > > structure that is being marked. The connection between the values > > > recorded in last_marked[] will be revealed if you look at the code, > > > because, e.g., when GC finds a cons, it recursively marks its car and > > > its cdr. By looking at the code, you should be able to find this and > > > other similar connections between the values, like A being a property > > > of B etc. > > > > OK, this tells me more than I could find in DEBUG. > > > > Could you add some of that to DEBUG? > > It would be better coming from Eli, but I will do this if he doesn't have the > time. Done. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: Emacs crashes 2006-03-14 20:45 ` Nick Roberts 2006-03-15 4:43 ` Eli Zaretskii @ 2006-03-15 20:21 ` Richard Stallman 2006-03-16 20:18 ` Richard Stallman 1 sibling, 1 reply; 43+ messages in thread From: Richard Stallman @ 2006-03-15 20:21 UTC (permalink / raw) Cc: eliz, emacs-devel > The fact that there are thousands of recursive calls to mark_object is > not in itself a sign of a problem. It is normal for the mark phase to > be deeply recursive. OK, I didn't know that. Perhaps I should look at the bottom of the backtrace (i.e low frame nos) instead of the top. That would be useful if you want to see what Emacs was doing when it garbage collected, so as to see what recent previous activity might have been responsible for the clobberage. However, for finding out what data was clobbered, you need to look at the innermost frames. Finding out what data was clobbered is often useful because often the clobberage is not entirely random. It may, for instance, be an overrun problem affecting the data immediately before in memory. $5 = (struct Lisp_Cons *) 0xa0024e8 { car = 0x4, u = { cdr = 0xffffffff, chain = 0xffffffff } } These last addresses looks suspect I don't know what to do next. It seems definitely invalid. So we know that the code that clobbers can store -1. That may be useful. Is it always -1? However, it seems clear that all the other data near this one are cons cells too. And cons cell slots are only used as cons cells. An overrun on a nearby cons cell seems rather implausible as a source of error. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: Emacs crashes 2006-03-15 20:21 ` Richard Stallman @ 2006-03-16 20:18 ` Richard Stallman 0 siblings, 0 replies; 43+ messages in thread From: Richard Stallman @ 2006-03-16 20:18 UTC (permalink / raw) So we know that the code that clobbers can store -1. That may be useful. Is it always -1? I see I was mistaken in reaching that conclusion, since it wasn't a real cons cell. ^ permalink raw reply [flat|nested] 43+ messages in thread
end of thread, other threads:[~2006-03-18 14:31 UTC | newest]

Thread overview: 43+ messages
2006-03-13 20:23 Emacs crashes Nick Roberts
2006-03-13 20:47 ` Chong Yidong
2006-03-13 22:06 ` Kim F. Storm
2006-03-14 0:39 ` Kenichi Handa
2006-03-14 16:09 ` Richard Stallman
2006-03-15 3:24 ` Giorgos Keramidas
2006-03-15 20:23 ` Richard Stallman
2006-03-14 1:02 ` Juanma Barranquero
2006-03-14 9:36 ` David Kastrup
2006-03-14 11:59 ` Juanma Barranquero
2006-03-14 17:45 ` Richard Stallman
2006-03-15 8:58 ` Juanma Barranquero
2006-03-17 16:32 ` Richard Stallman
2006-03-17 16:41 ` Juanma Barranquero
2006-03-14 1:37 ` Nick Roberts
2006-03-14 16:07 ` Chong Yidong
2006-03-14 16:15 ` Kim F. Storm
2006-03-14 16:09 ` Richard Stallman
2006-03-14 20:47 ` Kim F. Storm
2006-03-14 21:35 ` Chong Yidong
2006-03-15 20:21 ` Richard Stallman
2006-03-14 22:38 ` Kim F. Storm
2006-03-15 9:22 ` Nick Roberts
2006-03-15 9:28 ` David Kastrup
2006-03-15 11:35 ` Jan D.
2006-03-15 3:21 ` Giorgos Keramidas
2006-03-15 20:21 ` Richard Stallman
2006-03-15 15:41 ` Kim F. Storm
2006-03-15 17:05 ` Luc Teirlinck
2006-03-15 17:21 ` Chong Yidong
2006-03-15 19:03 ` Kim F. Storm
2006-03-15 21:40 ` Nick Roberts
2006-03-14 4:33 ` Eli Zaretskii
2006-03-14 20:45 ` Nick Roberts
2006-03-15 4:43 ` Eli Zaretskii
2006-03-15 7:49 ` Nick Roberts
2006-03-15 19:49 ` Eli Zaretskii
2006-03-15 21:40 ` Nick Roberts
2006-03-16 20:18 ` Richard Stallman
2006-03-16 21:25 ` Nick Roberts
2006-03-18 14:31 ` Eli Zaretskii
2006-03-15 20:21 ` Richard Stallman
2006-03-16 20:18 ` Richard Stallman