unofficial mirror of emacs-devel@gnu.org 
* crash: x_error_quitter
@ 2007-05-10 18:05 sds
  2007-05-11 18:48 ` Richard Stallman
  2007-05-11 20:02 ` Chong Yidong
  0 siblings, 2 replies; 23+ messages in thread
From: sds @ 2007-05-10 18:05 UTC (permalink / raw)
  To: emacs-devel

GNU Emacs 22.1.50.5 (i686-pc-linux-gnu, X toolkit, Xaw3d scroll bars)
 of 2007-05-08 on nyc-qws-005

despite all my attempts emacs just died....
this happened when emacsclient tried to create a new frame.


Breakpoint 3, x_error_quitter (display=0x8593da8, error=0xbfa4d328)
    at xterm.c:7859
7859      if (error->error_code == BadName)
(gdb)
(gdb)
(gdb)
(gdb)
(gdb) where
#0  x_error_quitter (display=0x8593da8, error=0xbfa4d328) at xterm.c:7859
#1  0x080c9a9d in x_error_handler (display=0x8593da8, error=0xbfa4d328)
    at xterm.c:7824
#2  0xb7c599ea in _XError () from /usr/lib/libX11.so.6
#3  0xb7c5c141 in _XEventsQueued () from /usr/lib/libX11.so.6
#4  0xb7c47c02 in XPending () from /usr/lib/libX11.so.6
#5  0x080c97ef in XTread_socket (sd=0, expected=1, hold_quit=0xbfa4ef70)
    at xterm.c:7068
#6  0x080f3ead in read_avail_input (expected=1) at keyboard.c:6843
#7  0x080f409a in handle_async_input () at keyboard.c:6989
#8  0x080597da in change_frame_size_1 (f=0x8678ca8, newheight=90, newwidth=80,
    pretend=1, delay=0, safe=0) at dispnew.c:6372
#9  0x080d706c in Fx_create_frame (parms=151565093) at xfns.c:3368
#10 0x08154131 in Ffuncall (nargs=2, args=0xbfa4f2f8) at eval.c:2997
#11 0x0817ebdd in Fbyte_code (bytestr=136233899, vector=136233916, maxdepth=40)
    at bytecode.c:679
#12 0x08153bda in funcall_lambda (fun=136233852, nargs=1,
    arg_vector=0xbfa4f424) at eval.c:3184
#13 0x08153fe1 in Ffuncall (nargs=2, args=0xbfa4f420) at eval.c:3054
#14 0x0817ebdd in Fbyte_code (bytestr=136476683, vector=136476700, maxdepth=24)
    at bytecode.c:679
#15 0x08153bda in funcall_lambda (fun=136476636, nargs=1,
    arg_vector=0xbfa4f4e0) at eval.c:3184
#16 0x08153ddb in apply_lambda (fun=136476636, args=138166029, eval_flag=1)
    at eval.c:3108
#17 0x081534b2 in Feval (form=138166037) at eval.c:2370
#18 0x08153a3f in Fprogn (args=138166005) at eval.c:447
#19 0x08153cb4 in funcall_lambda (fun=138166048, nargs=0,
    arg_vector=0xbfa4f664) at eval.c:3177
#20 0x08153fe1 in Ffuncall (nargs=1, args=0xbfa4f660) at eval.c:3054
#21 0x08155499 in call0 (fn=138166053) at eval.c:2761
#22 0x0809472a in Fdisplay_buffer (buffer=140525452,
    not_this_window=137472201, frame=137472201) at window.c:3694
#23 0x0810e335 in Fpop_to_buffer (buffer=140525452, other_window=137472201,
    norecord=137472201) at buffer.c:1775
#24 0x0815416a in Ffuncall (nargs=2, args=0xbfa4f7d4) at eval.c:3003
#25 0x08155969 in Fapply (nargs=2, args=0xbfa4f7d4) at eval.c:2430
#26 0x08154305 in Ffuncall (nargs=3, args=0xbfa4f7d0) at eval.c:2978
#27 0x0817ebdd in Fbyte_code (bytestr=140985083, vector=140985940, maxdepth=24)
    at bytecode.c:679
#28 0x08153bda in funcall_lambda (fun=140986068, nargs=1,
    arg_vector=0xbfa4f8f4) at eval.c:3184
#29 0x08153fe1 in Ffuncall (nargs=2, args=0xbfa4f8f0) at eval.c:3054
#30 0x0817ebdd in Fbyte_code (bytestr=142561251, vector=142565676, maxdepth=56)
    at bytecode.c:679
#31 0x08153bda in funcall_lambda (fun=142565932, nargs=1,
    arg_vector=0xbfa4fa24) at eval.c:3184
#32 0x08153fe1 in Ffuncall (nargs=2, args=0xbfa4fa20) at eval.c:3054
#33 0x0817ebdd in Fbyte_code (bytestr=142557851, vector=142562796, maxdepth=80)
    at bytecode.c:679
#34 0x08153695 in Feval (form=142536141) at eval.c:2334
#35 0x08152cda in internal_catch (tag=142547657, func=0x81532d0 <Feval>,
    arg=142536141) at eval.c:1222
#36 0x0817de5b in Fbyte_code (bytestr=142557819, vector=142563228, maxdepth=16)
    at bytecode.c:854
#37 0x08153bda in funcall_lambda (fun=142563348, nargs=2,
    arg_vector=0xbfa4fd44) at eval.c:3184
#38 0x08153fe1 in Ffuncall (nargs=3, args=0xbfa4fd40) at eval.c:3054
#39 0x08155874 in Fapply (nargs=2, args=0xbfa4fd90) at eval.c:2485
#40 0x081559b4 in apply1 (fn=142547369, arg=150348069) at eval.c:2749
#41 0x0818136d in read_process_output_call (fun_and_args=150348005)
    at process.c:4961
#42 0x081529f8 in internal_condition_case_1 (
    bfun=0x8181350 <read_process_output_call>, arg=150348005,
    handlers=137472201, hfun=0x8181300 <read_process_output_error_handler>)
    at eval.c:1529
#43 0x08180c13 in read_process_output (proc=156762508, channel=Variable "channel" is not available.
)
    at process.c:5191
#44 0x08184dbb in wait_reading_process_output (time_limit=30, microsecs=0,
    read_kbd=-1, do_display=1, wait_for_cell=137472201, wait_proc=0x0,
    just_wait_proc=0) at process.c:4795
#45 0x08053f60 in sit_for (timeout=240, reading=1, do_display=1)
    at dispnew.c:6579
#46 0x080f919b in read_char (commandflag=1, nmaps=2, maps=0xbfa51520,
    prev_event=137472201, used_mouse_menu=0xbfa515c8, end_time=0x0)
    at keyboard.c:2904
#47 0x080faf46 in read_key_sequence (keybuf=0xbfa51674, bufsize=30,
    prompt=137472201, dont_downcase_last=0, can_return_switch_frame=1,
    fix_current_buffer=1) at keyboard.c:9135
#48 0x080fca33 in command_loop_1 () at keyboard.c:1618
#49 0x08152c22 in internal_condition_case (bfun=0x80fc8a0 <command_loop_1>,
    handlers=137517657, hfun=0x80f7390 <cmd_error>) at eval.c:1481
#50 0x080f67d3 in command_loop_2 () at keyboard.c:1329
#51 0x08152cda in internal_catch (tag=137510841,
    func=0x80f67b0 <command_loop_2>, arg=137472201) at eval.c:1222
#52 0x080f71cc in command_loop () at keyboard.c:1308
#53 0x080f756b in recursive_edit_1 () at keyboard.c:1006
#54 0x080f7656 in Frecursive_edit () at keyboard.c:1067
#55 0x080ed4f5 in main (argc=2, argv=0xbfa51d74) at emacs.c:1762

Lisp Backtrace:
"x-create-frame" (0x908b325)
"x-create-frame-with-faces" (0x831a8c9)
"make-frame" (0x831a8c9)
0x83c3f25 Lisp type 5
"pop-to-buffer" (0x8603f8c)
"apply" (0x8331969)
0x86746d4 PVEC_COMPILED
"server-switch-buffer" (0x8603f8c)
"byte-code" (0x87f429b)
"server-process-filter" (0x958018c)
(gdb) p display
$1 = (Display *) 0x8593da8
(gdb) p *display
$2 = {
  ext_data = 0x85a37e8,
  private1 = 0x85983f8,
  fd = 7,
  private2 = 0,
  proto_major_version = 11,
  proto_minor_version = 0,
  vendor = 0x8594318 "The X.Org Foundation",
  private3 = 46137344,
  private4 = 2097151,
  private5 = 1142347,
  private6 = 0,
  resource_alloc = 0xb7c58fc0 <_XAllocID>,
  byte_order = 0,
  bitmap_unit = 32,
  bitmap_pad = 32,
  bitmap_bit_order = 0,
  nformats = 7,
  pixmap_format = 0x8598520,
  private8 = 11,
  release = 70000000,
  private9 = 0x0,
  private10 = 0x0,
  qlen = 0,
  last_request_read = 21403417,
  request = 21403429,
  private11 = 0xb7d17b2c "",
  private12 = 0x85943f0 "\022",
  private13 = 0x85943f0 "\022",
  private14 = 0x85983f0 "",
  max_request_size = 65535,
  db = 0x859fec0,
  private15 = 0,
  display_name = 0x8594398 ":0.0",
  default_screen = 0,
  nscreens = 1,
  screens = 0x8598598,
  motion_buffer = 256,
  private16 = 0,
  min_keycode = 8,
  max_keycode = 255,
  private17 = 0x0,
  private18 = 0x0,
  private19 = 0,
  xdefaults = 0x8598690 "*Box.background:\t#ede9e3\n*Box.foreground:\t#000000\n*Button.activeBackground:\t#ffffff\n*Button.activeForeground:\t#000000\n*Button.background:\t#ede9e3\n*Button.foreground:\t#000000\n*Button.highlightBackgroun"...
}
(gdb) p error
$3 = (XErrorEvent *) 0xbfa4d328
(gdb) p *error
$4 = {
  type = 0,
  display = 0x8593da8,
  resourceid = 47279684,
  serial = 21403417,
  error_code = 7 '\a',
  request_code = 55 '7',
  minor_code = 0 '\0'
}
(gdb) list
1762      Frecursive_edit ();
1763      /* NOTREACHED */
1764      return 0;
1765    }
1766    ^L
1767    /* Sort the args so we can find the most important ones
1768       at the beginning of argv.  */
1769
1770    /* First, here's a table of all the standard options.  */
1771
(gdb) up
#1  0x080c9a9d in x_error_handler (display=0x8593da8, error=0xbfa4d328)
    at xterm.c:7824
7824        x_error_quitter (display, error);
(gdb) list
7819         XErrorEvent *error;
7820    {
7821      if (x_error_message)
7822        x_error_catcher (display, error);
7823      else
7824        x_error_quitter (display, error);
7825      return 0;
7826    }
7827
7828    /* This is the usual handler for X protocol errors.
(gdb) p x_error_message
$5 = (struct x_error_message_stack *) 0x0
(gdb) return 0
Make x_error_handler return now? (y or n) y
#0  0xb7c599ea in _XError ()
   from /usr/lib/libX11.so.6
(gdb) c
Continuing.

Breakpoint 3, x_error_quitter (display=0x8593da8, error=0xbfa4d328)
    at xterm.c:7859
7859      if (error->error_code == BadName)
(gdb) n
7865      XGetErrorText (display, error->error_code, buf, sizeof (buf));
(gdb) n
7866      sprintf (buf1, "X protocol error: %s on protocol request %d",
(gdb) n
7868      x_connection_closed (display, buf1);
(gdb) list
7863         original error handler.  */
7864
7865      XGetErrorText (display, error->error_code, buf, sizeof (buf));
7866      sprintf (buf1, "X protocol error: %s on protocol request %d",
7867               buf, error->request_code);
7868      x_connection_closed (display, buf1);
7869    }
7870
7871
7872    /* This is the handler for X IO errors, always.
(gdb) p buf1
$6 = "X protocol error: BadFont (invalid Font parameter) on protocol request 55\000\000\200\000\300\000\000\000\000\000\000\000\200\377?\236\330\211\235\330\211\235\330\376?\000\000\000\000\000\212\235\330\376?", '\0' <repeats 20 times>, " @\000\000\177\003 @\000\000\311\0013\206\r\bs\000\000\000\364\215g\b{\000\000\000\200\037\000\000\377\377", '\0' <repeats 41 times>, "\200\000\300", '\0' <repeats 13 times>, "\200\377?\000\000\000\000\000\000\236\330\211\235\330\211\235\330\376?", '\0' <repeats 11 times>....
(gdb) return
Make x_error_quitter return now? (y or n) y
#0  x_error_handler (
    display=0x8593da8, error=0xbfa4d328) at xterm.c:7826
7826    }
(gdb) n
0x080c9a9f      7826    }
(gdb) n
0xb7c599ea in _XError () from /usr/lib/libX11.so.6
(gdb)
Single stepping until exit from function _XError,
which has no line number information.
0xb7c5c141 in _XEventsQueued () from /usr/lib/libX11.so.6
(gdb) n
Single stepping until exit from function _XEventsQueued,
which has no line number information.

Breakpoint 3, x_error_quitter (display=0x8593da8, error=0xbfa4d328)
    at xterm.c:7859
7859      if (error->error_code == BadName)
(gdb) n
7865      XGetErrorText (display, error->error_code, buf, sizeof (buf));
(gdb)
7866      sprintf (buf1, "X protocol error: %s on protocol request %d",
(gdb)
7868      x_connection_closed (display, buf1);
(gdb) p buf1
$7 = "X protocol error: BadFont (invalid Font parameter) on protocol request 55\000\000\200\000\300\000\000\000\000\000\000\000\200\377?\236\330\211\235\330\211\235\330\376?\000\000\000\000\000\212\235\330\376?", '\0' <repeats 20 times>, " @\000\000\177\003 @\000\000\311\0013\206\r\bs\000\000\000\364\215g\b{\000\000\000\200\037\000\000\377\377", '\0' <repeats 41 times>, "\200\000\300", '\0' <repeats 13 times>, "\200\377?\000\000\000\000\000\000\236\330\211\235\330\211\235\330\376?", '\0' <repeats 11 times>....
(gdb) return
Make x_error_quitter return now? (y or n) y
#0  x_error_handler (
    display=0x8593da8, error=0xbfa4d328) at xterm.c:7826
7826    }
(gdb) list
7821      if (x_error_message)
7822        x_error_catcher (display, error);
7823      else
7824        x_error_quitter (display, error);
7825      return 0;
7826    }
7827
7828    /* This is the usual handler for X protocol errors.
7829       It kills all frames on the display that we got the error for.
7830       If that was the only one, it prints an error message and kills Emacs.  */
(gdb) n
0x080c9a9f      7826    }
(gdb)
0xb7c599ea in _XError () from /usr/lib/libX11.so.6
(gdb) c
Continuing.

Breakpoint 3, x_error_quitter (display=0x8593da8, error=0xbfa4e528)
    at xterm.c:7859
7859      if (error->error_code == BadName)
(gdb) return 0
Make x_error_quitter return now? (y or n) y
#0  x_error_handler (
    display=0x8593da8, error=0xbfa4e528) at xterm.c:7826
7826    }
(gdb) c
Continuing.

Breakpoint 3, x_error_quitter (display=0x8593da8, error=0xbfa4e528)
    at xterm.c:7859
7859      if (error->error_code == BadName)
(gdb) p error->error_code
$8 = 13 '\r'
(gdb) list
7854      char buf[256], buf1[356];
7855
7856      /* Ignore BadName errors.  They can happen because of fonts
7857         or colors that are not defined.  */
7858
7859      if (error->error_code == BadName)
7860        return;
7861
7862      /* Note that there is no real way portable across R3/R4 to get the
7863         original error handler.  */
(gdb) return
Make x_error_quitter return now? (y or n) y
#0  x_error_handler (
    display=0x8593da8, error=0xbfa4e528) at xterm.c:7826
7826    }
(gdb) c
Continuing.

Breakpoint 3, x_error_quitter (display=0x8593da8, error=0xbfa4e528)
    at xterm.c:7859
7859      if (error->error_code == BadName)
(gdb) c
Continuing.
X protocol error: BadGC (invalid GC parameter) on protocol request 56

Program exited with code 0106.
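
For readers decoding the numbers in the session above: the error and
request codes are fixed by the core X11 protocol.  Code 7 is BadFont,
13 is BadGC and 15 is BadName; request 55 is CreateGC and 56 is
ChangeGC.  The following is only a minimal sketch of such a lookup.
The names toy_x_error_name and toy_x_request_name are hypothetical
helpers, not part of Emacs or Xlib, and only the two requests seen in
this report are listed.

```c
#include <stddef.h>

/* Core X11 protocol error codes, indexed by their numeric value.
   Index 0 is unused; the protocol starts at BadRequest = 1.  */
static const char *const toy_error_names[] = {
  NULL, "BadRequest", "BadValue", "BadWindow", "BadPixmap", "BadAtom",
  "BadCursor", "BadFont", "BadMatch", "BadDrawable", "BadAccess",
  "BadAlloc", "BadColor", "BadGC", "BadIDChoice", "BadName",
  "BadLength", "BadImplementation"
};

/* Hypothetical helper: map an X error code to its symbolic name.  */
const char *
toy_x_error_name (unsigned int code)
{
  size_t n = sizeof toy_error_names / sizeof toy_error_names[0];
  if (code == 0 || code >= n)
    return "unknown";
  return toy_error_names[code];
}

/* Hypothetical helper: only the two request codes from this report.  */
const char *
toy_x_request_name (unsigned int code)
{
  switch (code)
    {
    case 55: return "CreateGC";
    case 56: return "ChangeGC";
    default: return "unknown";
    }
}
```

Read this way, the first error in the session is a BadFont raised by a
CreateGC request, and the final one is a BadGC raised by a ChangeGC
request.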

-- 
Sam Steingold (http://www.podval.org/~sds) on Fedora Core release 5 (Bordeaux)
http://jihadwatch.org http://ffii.org http://israelunderattack.slide.com
http://openvotingconsortium.org http://iris.org.il
Lisp: Serious empowerment.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: crash: x_error_quitter
  2007-05-10 18:05 crash: x_error_quitter sds
@ 2007-05-11 18:48 ` Richard Stallman
  2007-05-11 18:56   ` Sam Steingold
  2007-05-11 20:02 ` Chong Yidong
  1 sibling, 1 reply; 23+ messages in thread
From: Richard Stallman @ 2007-05-11 18:48 UTC (permalink / raw)
  To: sds; +Cc: emacs-devel

Can you reproduce it?


* Re: crash: x_error_quitter
  2007-05-11 18:48 ` Richard Stallman
@ 2007-05-11 18:56   ` Sam Steingold
  2007-05-12 16:47     ` Richard Stallman
  0 siblings, 1 reply; 23+ messages in thread
From: Sam Steingold @ 2007-05-11 18:56 UTC (permalink / raw)
  To: rms; +Cc: emacs-devel

Richard Stallman wrote:
> Can you reproduce it?

no, not reliably.
this just happens every now and then, a couple of times a month.


* Re: crash: x_error_quitter
  2007-05-10 18:05 crash: x_error_quitter sds
  2007-05-11 18:48 ` Richard Stallman
@ 2007-05-11 20:02 ` Chong Yidong
  2007-05-12  6:51   ` Jan Djärv
  2007-05-12 16:47   ` Richard Stallman
  1 sibling, 2 replies; 23+ messages in thread
From: Chong Yidong @ 2007-05-11 20:02 UTC (permalink / raw)
  To: emacs-devel

sds@janestcapital.com writes:

> GNU Emacs 22.1.50.5 (i686-pc-linux-gnu, X toolkit, Xaw3d scroll bars)
>  of 2007-05-08 on nyc-qws-005
>
> despite all my attempts emacs just died....
> this happened when emacsclient tried to create a new frame.

This is Yet Another Unnecessary X Protocol Error Crash, probably the
5th or 6th one to date.  As I've said many times, it's not necessary
for Emacs to crash on X protocol errors, since they are not fatal
errors, and can occur on misconfigured X servers.  The only reason to
crash is to annoy the user.


* Re: crash: x_error_quitter
  2007-05-11 20:02 ` Chong Yidong
@ 2007-05-12  6:51   ` Jan Djärv
  2007-05-12 16:47   ` Richard Stallman
  1 sibling, 0 replies; 23+ messages in thread
From: Jan Djärv @ 2007-05-12  6:51 UTC (permalink / raw)
  To: Chong Yidong; +Cc: emacs-devel



Chong Yidong skrev:
> sds@janestcapital.com writes:
> 
>> GNU Emacs 22.1.50.5 (i686-pc-linux-gnu, X toolkit, Xaw3d scroll bars)
>>  of 2007-05-08 on nyc-qws-005
>>
>> despite all my attempts emacs just died....
>> this happened when emacsclient tried to create a new frame.
> 
> This is Yet Another Unnecessary X Protocol Error Crash, probably the
> 5th or 6th one to date.  As I've said many times, it's not necessary
> for Emacs to crash on X protocol errors, since they are not fatal
> errors, and can occur on misconfigured X servers.  The only reason to
> crash is to annoy the user.

We generally try to not crash.  If it still crashes it means that some code 
does not handle errors, and we need to find that code.  It can also be in the 
lucid part of the code, and then it may be harder to find.

That said, there certainly are X protocol errors Emacs should crash on, such 
as BadAlloc.

	Jan D.


* Re: crash: x_error_quitter
  2007-05-11 18:56   ` Sam Steingold
@ 2007-05-12 16:47     ` Richard Stallman
  0 siblings, 0 replies; 23+ messages in thread
From: Richard Stallman @ 2007-05-12 16:47 UTC (permalink / raw)
  To: Sam Steingold; +Cc: emacs-devel

    this just happens every now and then, a couple of times a month.

The only way I know of to debug this sort of thing is to do
(x-synchronize t).  Then when it crashes, it will crash inside the
primitive that is actually responsible for the problem, rather than
in whatever unrelated operation happens to come later.

If you do this all the time, then the next crash will give useful
debugging info.

Unfortunately, it will also run somewhat slower.  I don't know how
much slower.  You can try it and see if it is painful.
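
To illustrate why synchronous mode changes where the crash shows up,
here is a toy model in C.  It does not use Xlib; every name in it
(toy_conn, toy_request, toy_poll_events, toy_run, and the caller names
"make_gc", "draw_text" and "read_socket") is hypothetical.  It assumes
only that requests are normally buffered and that an error is reported
when the client next reads from the connection, whereas in synchronous
mode each request is checked before the call returns.

```c
#include <stddef.h>

/* Toy connection: remembers whether an error is waiting to be
   reported and in which caller it finally surfaced.  */
struct toy_conn
{
  int synchronous;          /* nonzero: check after every request */
  int error_pending;        /* a bad request has not been reported yet */
  const char *reported_in;  /* caller that was active when it surfaced */
};

/* Report any pending error, blaming whoever is running right now.  */
static void
toy_flush (struct toy_conn *c, const char *where)
{
  if (c->error_pending)
    {
      c->reported_in = where;
      c->error_pending = 0;
    }
}

/* Issue a request; in synchronous mode the error surfaces at once.  */
void
toy_request (struct toy_conn *c, const char *caller, int is_bad)
{
  if (is_bad)
    c->error_pending = 1;
  if (c->synchronous)
    toy_flush (c, caller);
}

/* Reading events is where buffered errors normally come out.  */
void
toy_poll_events (struct toy_conn *c, const char *caller)
{
  toy_flush (c, caller);
}

/* Run one bad request, one good request, then poll for events.
   Returns the name of the caller in which the error was reported.  */
const char *
toy_run (int synchronous)
{
  struct toy_conn c = { synchronous, 0, "(no error)" };
  toy_request (&c, "make_gc", 1);
  toy_request (&c, "draw_text", 0);
  toy_poll_events (&c, "read_socket");
  return c.reported_in;
}
```

In this model toy_run (0) blames "read_socket", much as the backtrace
in the original report ends in XPending called from XTread_socket,
while toy_run (1) blames "make_gc", the caller that issued the bad
request.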


* Re: crash: x_error_quitter
  2007-05-11 20:02 ` Chong Yidong
  2007-05-12  6:51   ` Jan Djärv
@ 2007-05-12 16:47   ` Richard Stallman
  2007-05-12 23:31     ` Chong Yidong
  1 sibling, 1 reply; 23+ messages in thread
From: Richard Stallman @ 2007-05-12 16:47 UTC (permalink / raw)
  To: Chong Yidong; +Cc: emacs-devel

    This is Yet Another Unnecessary X Protocol Error Crash, probably the
    5th or 6th one to date.  As I've said many times, it's not necessary
    for Emacs to crash on X protocol errors, since they are not fatal
    errors, and can occur on misconfigured X servers.  The only reason to
    crash is to annoy the user.

The reason we make this crash is so we can find the causes and fix them.


* Re: crash: x_error_quitter
  2007-05-12 16:47   ` Richard Stallman
@ 2007-05-12 23:31     ` Chong Yidong
  2007-05-12 23:36       ` Chong Yidong
  2007-05-13 13:00       ` Jan Djärv
  0 siblings, 2 replies; 23+ messages in thread
From: Chong Yidong @ 2007-05-12 23:31 UTC (permalink / raw)
  To: rms; +Cc: emacs-devel

Richard Stallman <rms@gnu.org> writes:

>     This is Yet Another Unnecessary X Protocol Error Crash, probably the
>     5th or 6th one to date.  As I've said many times, it's not necessary
>     for Emacs to crash on X protocol errors, since they are not fatal
>     errors, and can occur on misconfigured X servers.  The only reason to
>     crash is to annoy the user.
>
> The reason we make this crash is so we can find the causes and fix them.

Except I don't think we've found a single legitimate bug this way.
Every time this comes up, we just end up adding yet another call to
x_catch_error.  Since X "protocol errors" are really more like warning
messages (as opposed to X "fatal errors"), I don't envision any other
situation cropping up.

But hey, CPU cycles are cheap these days, so what does it matter if we
plaster unnecessary x_catch_error calls all over the code.
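
For context, the dispatch shown in the gdb listing of x_error_handler
(call the catcher if x_error_message is non-null, otherwise call
x_error_quitter) can be modelled roughly as below.  This is a
simplified sketch, not the code in xterm.c: the toy_* names and the
fixed-size message buffer are assumptions made for illustration.

```c
#include <stdio.h>
#include <stddef.h>

/* One entry on the stack of active error catchers.  */
struct toy_error_catcher
{
  char message[128];               /* text of the last caught error */
  struct toy_error_catcher *prev;  /* enclosing catcher, if any */
};

/* Innermost active catcher; NULL means errors are treated as fatal.  */
static struct toy_error_catcher *toy_catchers;

/* Counts errors that would have reached the "quitter" path.  */
static int toy_fatal_count;

/* Start catching errors: push a fresh catcher.  */
void
toy_catch_errors (struct toy_error_catcher *c)
{
  c->message[0] = '\0';
  c->prev = toy_catchers;
  toy_catchers = c;
}

/* Stop catching errors: pop the innermost catcher.  */
void
toy_uncatch_errors (void)
{
  if (toy_catchers)
    toy_catchers = toy_catchers->prev;
}

/* Nonzero if the innermost catcher has recorded an error.  */
int
toy_had_error (void)
{
  return toy_catchers != NULL && toy_catchers->message[0] != '\0';
}

/* Same shape as the dispatch in the listing: record the error if a
   catcher is active, otherwise take the fatal path.  */
int
toy_error_handler (const char *text)
{
  if (toy_catchers)
    snprintf (toy_catchers->message, sizeof toy_catchers->message,
              "%s", text);
  else
    toy_fatal_count++;  /* stand-in for calling x_error_quitter */
  return 0;
}
```

Wrapping a call in toy_catch_errors / toy_uncatch_errors turns an
otherwise fatal error into a recorded message, which is the effect of
each added catch call being argued about in this thread.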


* Re: crash: x_error_quitter
  2007-05-12 23:31     ` Chong Yidong
@ 2007-05-12 23:36       ` Chong Yidong
  2007-05-13 13:29         ` Jan Djärv
  2007-05-13 13:41         ` Chong Yidong
  2007-05-13 13:00       ` Jan Djärv
  1 sibling, 2 replies; 23+ messages in thread
From: Chong Yidong @ 2007-05-12 23:36 UTC (permalink / raw)
  To: rms; +Cc: emacs-devel

Chong Yidong <cyd@stupidchicken.com> writes:

> Richard Stallman <rms@gnu.org> writes:
>
>>     This is Yet Another Unnecessary X Protocol Error Crash, probably the
>>     5th or 6th one to date.  As I've said many times, it's not necessary
>>     for Emacs to crash on X protocol errors, since they are not fatal
>>     errors, and can occur on misconfigured X servers.  The only reason to
>>     crash is to annoy the user.
>>
>> The reason we make this crash is so we can find the causes and fix them.
>
> Except I don't think we've found a single legitimate bug this way.
> Every time this comes up, we just end up adding yet another call to
> x_catch_error.  Since X "protocol errors" are really more like warning
> messages (as opposed to X "fatal errors"), I don't envision any other
> situation cropping up.

Anyway, I have indeed added yet another call to x_catch_error to the
branch and the trunk.

I also hereby dub Yet Another Uncaught X Error Crash === YAUXEC
(apologies to Stefan).


* Re: crash: x_error_quitter
  2007-05-12 23:31     ` Chong Yidong
  2007-05-12 23:36       ` Chong Yidong
@ 2007-05-13 13:00       ` Jan Djärv
  2007-05-13 13:52         ` Chong Yidong
  1 sibling, 1 reply; 23+ messages in thread
From: Jan Djärv @ 2007-05-13 13:00 UTC (permalink / raw)
  To: Chong Yidong; +Cc: rms, emacs-devel



Chong Yidong skrev:
> Richard Stallman <rms@gnu.org> writes:
> 
>>     This is Yet Another Unnecessary X Protocol Error Crash, probably the
>>     5th or 6th one to date.  As I've said many times, it's not necessary
>>     for Emacs to crash on X protocol errors, since they are not fatal
>>     errors, and can occur on misconfigured X servers.  The only reason to
>>     crash is to annoy the user.
>>
>> The reason we make this crash is so we can find the causes and fix them.
> 
> Except I don't think we've found a single legitimate bug this way.

Yes we have.

> Every time this comes up, we just end up adding yet another call to
> x_catch_error.  Since X "protocol errors" are really more like warning
> messages (as opposed to X "fatal errors"), I don't envision any other
> situation cropping up.
> 

Depends on the error.  Some are really fatal.

> But hey, CPU cycles are cheap these days, so what does it matter if we
> plaster unnecessary x_catch_error calls all over the code.

It can hide a real error that causes the protocol error in the first place.

	Jan D.


* Re: crash: x_error_quitter
  2007-05-12 23:36       ` Chong Yidong
@ 2007-05-13 13:29         ` Jan Djärv
  2007-05-13 13:41         ` Chong Yidong
  1 sibling, 0 replies; 23+ messages in thread
From: Jan Djärv @ 2007-05-13 13:29 UTC (permalink / raw)
  To: Chong Yidong; +Cc: rms, emacs-devel

Chong Yidong skrev:

> Anyway, I have indeed added yet another call to x_catch_error to the
> branch and the trunk.
> 
> I also hereby dub Yet Another Uncaught X Error Crash === YAUXEC
> (apologies to Stefan).

Well, it makes all versions with X except Gtk+ crash immediately on startup.
Please revert that change.  It is not safe to do what it does in the signal
handler.

	Jan D.


* Re: crash: x_error_quitter
  2007-05-12 23:36       ` Chong Yidong
  2007-05-13 13:29         ` Jan Djärv
@ 2007-05-13 13:41         ` Chong Yidong
  1 sibling, 0 replies; 23+ messages in thread
From: Chong Yidong @ 2007-05-13 13:41 UTC (permalink / raw)
  To: rms; +Cc: emacs-devel

Chong Yidong <cyd@stupidchicken.com> writes:

> Anyway, I have indeed added yet another call to x_catch_error to the
> branch and the trunk.

I reverted this change; it wasn't well thought out.


* Re: crash: x_error_quitter
  2007-05-13 13:00       ` Jan Djärv
@ 2007-05-13 13:52         ` Chong Yidong
  2007-05-13 15:33           ` Jan Djärv
  2007-05-14  8:09           ` Richard Stallman
  0 siblings, 2 replies; 23+ messages in thread
From: Chong Yidong @ 2007-05-13 13:52 UTC (permalink / raw)
  To: Jan Djärv; +Cc: rms, emacs-devel

Jan Djärv <jan.h.d@swipnet.se> writes:

>>> The reason we make this crash is so we can find the causes and fix them.
>>
>> Except I don't think we've found a single legitimate bug this way.
>
> Yes we have.
>
>> Every time this comes up, we just end up adding yet another call to
>> x_catch_error.  Since X "protocol errors" are really more like warning
>> messages (as opposed to X "fatal errors"), I don't envision any other
>> situation cropping up.
>
> Depends on the error.  Some are really fatal.
>
>> But hey, CPU cycles are cheap these days, so what does it matter if we
>> plaster unnecessary x_catch_error calls all over the code.
>
> It can hide a real error that causes the protocol error in the first place.

What's wrong with spitting out an error message to the terminal and
trying to continue, as I believe other X programs do?


* Re: crash: x_error_quitter
  2007-05-13 13:52         ` Chong Yidong
@ 2007-05-13 15:33           ` Jan Djärv
  2007-05-14  8:09           ` Richard Stallman
  1 sibling, 0 replies; 23+ messages in thread
From: Jan Djärv @ 2007-05-13 15:33 UTC (permalink / raw)
  To: Chong Yidong; +Cc: rms, emacs-devel

Chong Yidong skrev:
> Jan Djärv <jan.h.d@swipnet.se> writes:
>> It can hide a real error that causes the protocol error in the first place.
> 
> What's wrong with spitting out an error message to the terminal and
> trying to continue, as I believe other X programs do?
> 

Most programs don't install error handlers, so they print something and then die.

As I said, it can hide a real error.  If a font is invalid it is probably some
user error, i.e. he specified a nonexistent font.  But if for example a
window or GC is invalid it is much more likely to be a serious bug in Emacs
internal data structures, and continuing after such an error could lead to all
sorts of nastiness.
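
One possible reading of this distinction, written as a per-error
policy, is sketched below.  It is an illustration of the argument
only, not what Emacs does: the enum, the function toy_classify and
the choice to treat unknown codes as fatal are all assumptions.  The
numeric values are the core X11 protocol error codes.

```c
/* What to do when a given X protocol error arrives.  */
enum toy_action
{
  TOY_IGNORE,  /* harmless, e.g. an undefined font or color name */
  TOY_WARN,    /* probably a user error; report it and continue */
  TOY_FATAL    /* likely corrupt internal state; do not continue */
};

/* Hypothetical policy keyed on the X protocol error code.  */
enum toy_action
toy_classify (int error_code)
{
  switch (error_code)
    {
    case 15:               /* BadName: already ignored by x_error_quitter */
      return TOY_IGNORE;
    case 7:                /* BadFont: e.g. a nonexistent font was given */
      return TOY_WARN;
    case 3:                /* BadWindow */
    case 13:               /* BadGC */
    case 11:               /* BadAlloc */
      return TOY_FATAL;
    default:               /* unknown or unlisted: be conservative */
      return TOY_FATAL;
    }
}
```

Under such a policy the BadFont in the original report would be
reported and survived, while the later BadGC would still stop Emacs.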

	Jan D.


* Re: crash: x_error_quitter
  2007-05-13 13:52         ` Chong Yidong
  2007-05-13 15:33           ` Jan Djärv
@ 2007-05-14  8:09           ` Richard Stallman
  2007-05-14  8:40             ` David Kastrup
  1 sibling, 1 reply; 23+ messages in thread
From: Richard Stallman @ 2007-05-14  8:09 UTC (permalink / raw)
  To: Chong Yidong; +Cc: jan.h.d, emacs-devel

    What's wrong with spitting out an error message to the terminal and
    trying to continue, as I believe other X programs do?

Users are not likely to report it or debug it.


* Re: crash: x_error_quitter
  2007-05-14  8:09           ` Richard Stallman
@ 2007-05-14  8:40             ` David Kastrup
  2007-05-15  9:47               ` Richard Stallman
  0 siblings, 1 reply; 23+ messages in thread
From: David Kastrup @ 2007-05-14  8:40 UTC (permalink / raw)
  To: rms; +Cc: Chong Yidong, jan.h.d, emacs-devel

Richard Stallman <rms@gnu.org> writes:

>     What's wrong with spitting out an error message to the terminal and
>     trying to continue, as I believe other X programs do?
>
> Users are not likely to report it or debug it.

Frankly, users are not likely to report or debug a crash, anyway.  And
then there is the problem that post-mortem debugging of abort calls or
failed assertions is typically completely unreliable (at least without
-fno-crossjumping and likely some other options) because gcc knows
that abort can't return and will, for that reason, just jump to any
old abort call, and without bothering to keep the stack frame in a
useful state.  Which means that the traceback is likely completely
wrong, and local variables, in particular those that have been stored
in registers, are also completely unusable.

So we are not likely to get useful bug reports, anyway, unless the
person compiled Emacs herself and made judicious use of options as
suggested in etc/DEBUG.

-- 
David Kastrup


* Re: crash: x_error_quitter
  2007-05-14  8:40             ` David Kastrup
@ 2007-05-15  9:47               ` Richard Stallman
  2007-05-15  9:59                 ` David Kastrup
  0 siblings, 1 reply; 23+ messages in thread
From: Richard Stallman @ 2007-05-15  9:47 UTC (permalink / raw)
  To: David Kastrup; +Cc: cyd, jan.h.d, emacs-devel

      Which means that the traceback is likely completely
    wrong, and local variables, in particular those that have been stored
    in registers, are also completely unusable.

I think that is incorrect.  The frames on the stack give all the
necessary information, once (x-synchronize t) has been used.


* Re: crash: x_error_quitter
  2007-05-15  9:47               ` Richard Stallman
@ 2007-05-15  9:59                 ` David Kastrup
  2007-05-16  1:39                   ` Richard Stallman
  0 siblings, 1 reply; 23+ messages in thread
From: David Kastrup @ 2007-05-15  9:59 UTC (permalink / raw)
  To: rms; +Cc: cyd, jan.h.d, emacs-devel

Richard Stallman <rms@gnu.org> writes:

>       Which means that the traceback is likely completely
>     wrong, and local variables, in particular those that have been stored
>     in registers, are also completely unusable.
>
> I think that is incorrect.  The frames on the stack give all the
> necessary information, once (x-synchronize t) has been used.

It would be helpful if you actually quoted the _relevant_ parts of
what you are commenting on.

I have spent several days debugging a failed assertion with such a
misleading stack frame due to not using -fno-crossjumping.  The
respective advice in etc/DEBUG is a result of that.

x_connection_closed is declared NO_RETURN, so gcc will not bother
about leaving the stack in a recognizable state when calling it.
Whether or not that affects the calls of x_error_quitter may depend on
whether gcc consequently figures out it being NO_RETURN, too.  With
gcc's current optimization strategies, that is quite plausible.
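
A small example of the kind of code where this can matter, assuming
GCC or a compatible compiler (the noreturn attribute syntax below is a
GNU extension).  The function names are hypothetical stand-ins, not
the Emacs functions.

```c
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for a function declared NO_RETURN.  */
__attribute__ ((noreturn)) static void
toy_connection_closed (const char *why)
{
  fprintf (stderr, "fatal: %s\n", why);
  abort ();
}

/* Two distinct call sites of the same non-returning function.  An
   optimizing compiler may share code between them, so a debugger
   looking at the return address cannot always tell which of the two
   checks actually failed.  */
int
toy_validate (int window_ok, int gc_ok)
{
  if (!window_ok)
    toy_connection_closed ("invalid window");  /* call site A */
  if (!gc_ok)
    toy_connection_closed ("invalid GC");      /* call site B */
  return 0;
}
```

Whether the two call sites really get merged depends on the compiler
version and flags; building with -fno-crossjumping, as etc/DEBUG
suggests, asks GCC not to perform that particular transformation.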

-- 
David Kastrup


* Re: crash: x_error_quitter
  2007-05-15  9:59                 ` David Kastrup
@ 2007-05-16  1:39                   ` Richard Stallman
  2007-05-16  6:58                     ` David Kastrup
  0 siblings, 1 reply; 23+ messages in thread
From: Richard Stallman @ 2007-05-16  1:39 UTC (permalink / raw)
  To: David Kastrup; +Cc: cyd, jan.h.d, emacs-devel

    >       Which means that the traceback is likely completely
    >     wrong, and local variables, in particular those that have been stored
    >     in registers, are also completely unusable.
    >
    > I think that is incorrect.  The frames on the stack give all the
    > necessary information, once (x-synchronize t) has been used.

    It would be helpful if you actually quoted the _relevant_ parts of
    what you are commenting on.

I think I did so.

    x_connection_closed is declared NO_RETURN, so gcc will not bother
    about leaving the stack in a recognizable state when calling it.
    Whether or not that affects the calls of x_error_quitter may depend on
    whether gcc consequently figures out it being NO_RETURN, too.  With
    gcc's current optimization strategies, that is quite plausible.

The crucial information for debugging this problem is which X library
functions were called, and their arguments, and where the call came
from.  I don't think that calling a NO_RETURN subroutine will spoil
outer-level stack frames like this.  I think they will be visible.

But if there is a problem, then we should get rid of NO_RETURN for
this function, because there is no point in optimizing it.


* Re: crash: x_error_quitter
  2007-05-16  1:39                   ` Richard Stallman
@ 2007-05-16  6:58                     ` David Kastrup
  2007-05-16 14:32                       ` Richard Stallman
  2007-05-16 18:04                       ` Eli Zaretskii
  0 siblings, 2 replies; 23+ messages in thread
From: David Kastrup @ 2007-05-16  6:58 UTC (permalink / raw)
  To: rms; +Cc: cyd, jan.h.d, emacs-devel

Richard Stallman <rms@gnu.org> writes:

>     >       Which means that the traceback is likely completely
>     >     wrong, and local variables, in particular those that have been stored
>     >     in registers, are also completely unusable.
>     >
>     > I think that is incorrect.  The frames on the stack give all the
>     > necessary information, once (x-synchronize t) has been used.
>
>     It would be helpful if you actually quoted the _relevant_ parts of
>     what you are commenting on.
>
> I think I did so.
>
>     x_connection_closed is declared NO_RETURN, so gcc will not bother
>     about leaving the stack in a recognizable state when calling it.
>     Whether or not that affects the calls of x_error_quitter may depend on
>     whether gcc consequently figures out it being NO_RETURN, too.  With
>     gcc's current optimization strategies, that is quite plausible.
>
> The crucial information for debugging this problem is which X
> library functions were called, and their arguments, and where the
> call came from.  I don't think that calling a NO_RETURN subroutine
> will spoil outer-level stack frames like this.  I think they will be
> visible.

The stack frames will usually not get affected.  But the return
address might be completely wrong, and that will affect gdb's idea of
what function the stack frame belongs to, and what variables are
stored in what registers at that point of time.

That's because gcc will, instead of calling a NO_RETURN function, just
jump to an arbitrary point elsewhere where it is called.  So at least
the information about the immediate caller of a NO_RETURN function
(one that gcc knows as being NO_RETURN, not necessarily one explicitly
declared as such) is not fit for debugging.
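
A minimal sketch of the situation described above, not taken from the
Emacs sources: `fatal` is a hypothetical stand-in for a function such
as x_connection_closed, and `check_range` is a hypothetical caller
with two call sites that pass identical arguments.  Because nothing
after either call can execute, an optimizing gcc may keep a single
copy of the call and jump to it from the other branch, in which case
the return address seen by gdb would no longer say which branch was
taken.  Whether that happens depends on the gcc version and options.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a function such as x_connection_closed:
   it never returns to its caller.  */
static void fatal (const char *msg) __attribute__ ((noreturn));

static void
fatal (const char *msg)
{
  fprintf (stderr, "fatal: %s\n", msg);
  abort ();
}

/* Two call sites with identical arguments.  With cross-jumping
   enabled, the compiler is allowed to merge them into one call,
   so a backtrace taken inside fatal may point at either branch.
   Compiling with -fno-crossjumping (or -O0) keeps them separate.  */
int
check_range (int value, int low, int high)
{
  if (value < low)
    fatal ("value out of range");
  if (value > high)
    fatal ("value out of range");
  return value - low;
}
```

Calling check_range (5, 0, 10) takes neither branch and returns the
offset from the lower bound; only an out-of-range value reaches fatal.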

Personally, I think that saving a few bytes by using a (potentially
conditional) short jump instead of a call is not worth the cost in
debuggability unless one specified an option like
--fsavememory=lethallyaggressive or so, but one would have to bring
this issue up with the gcc developers.

> But if there is a problem, then we should get rid of NO_RETURN for
> this function, because there is no point in optimizing it.

I think that if we are compiling with -g, we should also enable the
-fno-crossjumping option.  Drawback is that this will cause slightly
different code to be produced.  Advantage is that it makes actually
usable stack frames for post-mortem analysis much more likely.

-- 
David Kastrup

* Re: crash: x_error_quitter
  2007-05-16  6:58                     ` David Kastrup
@ 2007-05-16 14:32                       ` Richard Stallman
  2007-05-16 18:04                       ` Eli Zaretskii
  1 sibling, 0 replies; 23+ messages in thread
From: Richard Stallman @ 2007-05-16 14:32 UTC (permalink / raw)
  To: David Kastrup; +Cc: cyd, jan.h.d, emacs-devel

    The stack frames will usually not get affected.  But the return
    address might be completely wrong, and that will affect gdb's idea of
    what function the stack frame belongs to, and what variables are
    stored in what registers at that point of time.

It probably won't matter, because we aren't going to try to debug the
Xlib functions themselves.  We want to know which Xlib function Emacs
called, and how.

However, we should turn off the NO_RETURN for this.
(It seems to be on x_connection_closed.)

    I think that if we are compiling with -g, we should also enable the
    -fno-crossjumping option.

I always use -O0 -fno-inline, for maximum debuggability.
But that shouldn't matter for this bug.  The backtrace
should give the info we need, presuming (x-synchronize t)
was done.

* Re: crash: x_error_quitter
  2007-05-16  6:58                     ` David Kastrup
  2007-05-16 14:32                       ` Richard Stallman
@ 2007-05-16 18:04                       ` Eli Zaretskii
  2007-05-16 21:42                         ` David Kastrup
  1 sibling, 1 reply; 23+ messages in thread
From: Eli Zaretskii @ 2007-05-16 18:04 UTC (permalink / raw)
  To: David Kastrup; +Cc: emacs-devel

> From: David Kastrup <dak@gnu.org>
> Date: Wed, 16 May 2007 08:58:42 +0200
> Cc: cyd@stupidchicken.com, jan.h.d@swipnet.se, emacs-devel@gnu.org
> 
> I think that if we are compiling with -g, we should also enable the
> -fno-crossjumping option.  Drawback is that this will cause slightly
> different code to be produced.

Why not use -fno-crossjumping even if we compile without -g?  I can
hardly believe this will cause any visible slowdown, as Emacs does not
have tight inner loops anywhere.

* Re: crash: x_error_quitter
  2007-05-16 18:04                       ` Eli Zaretskii
@ 2007-05-16 21:42                         ` David Kastrup
  0 siblings, 0 replies; 23+ messages in thread
From: David Kastrup @ 2007-05-16 21:42 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: David Kastrup <dak@gnu.org>
>> Date: Wed, 16 May 2007 08:58:42 +0200
>> Cc: cyd@stupidchicken.com, jan.h.d@swipnet.se, emacs-devel@gnu.org
>> 
>> I think that if we are compiling with -g, we should also enable the
>> -fno-crossjumping option.  Drawback is that this will cause slightly
>> different code to be produced.
>
> Why not use -fno-crossjumping even if we compile without -g?  I can
> hardly believe this will cause any visible slowdown, as Emacs does not
> have tight inner loops anywhere.

We are not even talking tight inner loops here.  We are talking calls
of functions that never return.  Those can't happen too often (unless
there is some exception mechanism involved, or garbage compaction of
the stack, a tactic that some Scheme runtimes use for implementing
continuations), or the stack will overflow.
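
To illustrate the exception-mechanism case mentioned above, here is a
small sketch using setjmp/longjmp.  The names `throw_error`,
`handler`, and `count_caught` are hypothetical and chosen only for
this example.  The function that never returns unwinds the stack back
to the setjmp point, so it can be called any number of times without
the stack growing.

```c
#include <setjmp.h>

/* Jump target for the hypothetical error handler.  */
static jmp_buf handler;

/* Never returns normally: control resumes at the setjmp point.  */
static void throw_error (int code) __attribute__ ((noreturn));

static void
throw_error (int code)
{
  longjmp (handler, code);
}

/* Call the non-returning function ATTEMPTS times and count how often
   control came back through the handler.  The locals are volatile so
   their values are well defined after longjmp.  */
int
count_caught (int attempts)
{
  volatile int caught = 0;
  volatile int i;

  for (i = 0; i < attempts; i++)
    {
      if (setjmp (handler) == 0)
        throw_error (i + 1);    /* nonzero code, so setjmp returns != 0 */
      else
        caught++;
    }
  return caught;
}
```

Each iteration re-arms the handler with setjmp before calling
throw_error, so every call is caught exactly once.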

This is just a mechanism for squeezing out a few bytes.

Using -fno-crossjumping always (when supported by the compiler, of
course) would probably save us some dozen man hours of erroneous
post-mortem debugging.  The main disadvantage is that it makes the
compilation look ugly, as if Emacs were some special application
needing special compiler options.  In my opinion, crossjumping should
likely only be enabled for -O5 or greater.  Something like that.

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum

end of thread, other threads:[~2007-05-16 21:42 UTC | newest]

Thread overview: 23+ messages
2007-05-10 18:05 crash: x_error_quitter sds
2007-05-11 18:48 ` Richard Stallman
2007-05-11 18:56   ` Sam Steingold
2007-05-12 16:47     ` Richard Stallman
2007-05-11 20:02 ` Chong Yidong
2007-05-12  6:51   ` Jan Djärv
2007-05-12 16:47   ` Richard Stallman
2007-05-12 23:31     ` Chong Yidong
2007-05-12 23:36       ` Chong Yidong
2007-05-13 13:29         ` Jan Djärv
2007-05-13 13:41         ` Chong Yidong
2007-05-13 13:00       ` Jan Djärv
2007-05-13 13:52         ` Chong Yidong
2007-05-13 15:33           ` Jan Djärv
2007-05-14  8:09           ` Richard Stallman
2007-05-14  8:40             ` David Kastrup
2007-05-15  9:47               ` Richard Stallman
2007-05-15  9:59                 ` David Kastrup
2007-05-16  1:39                   ` Richard Stallman
2007-05-16  6:58                     ` David Kastrup
2007-05-16 14:32                       ` Richard Stallman
2007-05-16 18:04                       ` Eli Zaretskii
2007-05-16 21:42                         ` David Kastrup
