Re: SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu)

unofficial mirror of emacs-devel@gnu.org 

* Re: SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu)
       [not found] <v9oe1hg44k.fsf@marauder.physik.uni-ulm.de>
@ 2006-02-13  4:40 ` Richard M. Stallman
  2006-02-13 14:04   ` Reiner Steib
  0 siblings, 1 reply; 25+ messages in thread
From: Richard M. Stallman @ 2006-02-13  4:40 UTC (permalink / raw)
  Cc: emacs-devel

    > Please describe exactly what actions triggered the bug
    > and the precise symptoms of the bug:

    `<f1> k [Ctrl <tool-bar> <help>]'  (But it's not reproducible.)

    Program received signal SIGSEGV, Segmentation fault.
    [Switching to Thread 182940272320 (LWP 4018)]
    x_catch_errors_unwind (dummy=9829361)
	at [...]/emacs/src/xterm.c:7530

The first question is, what was the immediate cause of the crash?
Could you please see what instruction crashed, and what data it was
looking at?

Maybe it was examining x_error_message->dpy.
What is the value of x_error_message?


* Re: SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu)
  2006-02-13  4:40 ` Richard M. Stallman
@ 2006-02-13 14:04   ` Reiner Steib
  2006-02-13 17:05     ` Stefan Monnier
  0 siblings, 1 reply; 25+ messages in thread
From: Reiner Steib @ 2006-02-13 14:04 UTC (permalink / raw)
  Cc: Stefan Monnier, emacs-devel

On Mon, Feb 13 2006, Richard M. Stallman wrote:

>     > Please describe exactly what actions triggered the bug
>     > and the precise symptoms of the bug:
>
>     `<f1> k [Ctrl <tool-bar> <help>]'  (But it's not reproducible.)

Today I got another crash at the same point (using the same build as
before):

Program received signal SIGSEGV, Segmentation fault.
x_catch_errors_unwind (dummy=9829361)
    at [...]/emacs/src/xterm.c:7530
7530      Display *dpy = x_error_message->dpy;
(gdb) bt full
#0  x_catch_errors_unwind (dummy=9829361)
    at [...]/emacs/src/xterm.c:7530
        dpy = Variable "dpy" is not available.
(gdb) xbacktrace 
"face-attr-match-p"
"face-spec-match-p"
"frame-set-background-mode"
"x-create-frame-with-faces"
"make-frame"
"make-frame-command"
"call-interactively"
(gdb) 

>     Program received signal SIGSEGV, Segmentation fault.
>     [Switching to Thread 182940272320 (LWP 4018)]
>     x_catch_errors_unwind (dummy=9829361)
> 	at [...]/emacs/src/xterm.c:7530
>
> The first question is, what was the immediate cause of the crash?
> Could you please see what instruction crashed, and what data it was
> looking at?
>
> Maybe it was examining x_error_message->dpy.
> What is the value of x_error_message?

In both gdb sessions I get:

(gdb) p x_error_message
$1 = (struct x_error_message_stack *) 0x0
(gdb) pr
0
(gdb) p x_error_message->dpy
Cannot access memory at address 0xc8

(If this is not what you asked for, could you please tell me which gdb
command I should use?)

As I didn't have any crash since November [1] and now two crashes at
the same spot with the 2006-02-08 build, I suspect that one of these
changes (from Stefan) might trigger it:

,----
| revision 1.893
| date: 2006-01-23 22:08:13 +0000;  author: monnier;  state: Exp;  lines: +2 -1
| (x_catch_errors_unwind): Yet another int/Lisp_Object mixup.
| ----------------------------
| revision 1.892
| date: 2006-01-23 02:44:02 +0000;  author: monnier;  state: Exp;  lines: +36 -20
| Avoid allocating Lisp data from code that can be run from a signal handler.
| (x_error_message): New var to replace x_error_message_string.
| (x_error_catcher, x_catch_errors, x_catch_errors_unwind)
| (x_check_errors, x_had_errors_p, x_clear_errors, x_error_handler)
| (syms_of_xterm): Use it instead of x_error_message_string.
`----

Should I build again?  Should I enable additional checks
(ENABLE_CHECKING?)?

Bye, Reiner.

[1] http://thread.gmane.org/gmane.emacs.pretest.bugs/10235
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/


* Re: SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu)
  2006-02-13 14:04   ` Reiner Steib
@ 2006-02-13 17:05     ` Stefan Monnier
  2006-02-14  0:40       ` Richard M. Stallman
  2006-02-17 14:27       ` Reiner Steib
  0 siblings, 2 replies; 25+ messages in thread
From: Stefan Monnier @ 2006-02-13 17:05 UTC (permalink / raw)
  Cc: emacs-devel

> As I didn't have any crash since November [1] and now two crashes at
> the same spot with the 2006-02-08 build, I suspect that one of these
> changes (from Stefan) might trigger it:

> ,----
> | revision 1.893
> | date: 2006-01-23 22:08:13 +0000;  author: monnier;  state: Exp;  lines: +2 -1
> | (x_catch_errors_unwind): Yet another int/Lisp_Object mixup.
> | ----------------------------
> | revision 1.892
> | date: 2006-01-23 02:44:02 +0000;  author: monnier;  state: Exp;  lines: +36 -20
> | Avoid allocating Lisp data from code that can be run from a signal handler.
> | (x_error_message): New var to replace x_error_message_string.
> | (x_error_catcher, x_catch_errors, x_catch_errors_unwind)
> | (x_check_errors, x_had_errors_p, x_clear_errors, x_error_handler)
> | (syms_of_xterm): Use it instead of x_error_message_string.
> `----

Indeed, it looks like a likely culprit :-(

> Should I build again?  Should I enable additional checks
> (ENABLE_CHECKING?)?

Yes, please try ENABLE_CHECKING.


        Stefan


* Re: SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu)
  2006-02-13 17:05     ` Stefan Monnier
@ 2006-02-14  0:40       ` Richard M. Stallman
  2006-02-17 14:27       ` Reiner Steib
  1 sibling, 0 replies; 25+ messages in thread
From: Richard M. Stallman @ 2006-02-14  0:40 UTC (permalink / raw)
  Cc: Reiner.Steib, emacs-devel

He gave this crucial piece of information:

    In both gdb sessions I get:

    (gdb) p x_error_message
    $1 = (struct x_error_message_stack *) 0x0

From that, can you see a bug in your code?

    (gdb) xbacktrace 
    "face-attr-match-p"

That ought to localize it pretty well too.


* Re: SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu)
@ 2006-02-17  8:04 John W. Eaton
  0 siblings, 0 replies; 25+ messages in thread
From: John W. Eaton @ 2006-02-17  8:04 UTC (permalink / raw)
  Cc: John W. Eaton

On Mon, 13 Feb 2006, Richard M. Stallman wrote:

| He gave this crucial piece of information:
| 
|     In both gdb sessions I get:
| 
|     (gdb) p x_error_message
|     $1 = (struct x_error_message_stack *) 0x0
| 
| From that, can you see a bug in your code?
| 
|     (gdb) xbacktrace 
|     "face-attr-match-p"
| 
| That ought to localize it pretty well too.

I'm also encountering a crash due to x_error_message == 0, here:

  (gdb) p x_error_message
  $1 = (struct x_error_message_stack *) 0x0
  (gdb) list
  7538
  7539    static Lisp_Object
  7540    x_catch_errors_unwind (dummy)
  7541         Lisp_Object dummy;
  7542    {
  7543      Display *dpy = x_error_message->dpy;
  7544      struct x_error_message_stack *tmp;
  7545
  7546      /* The display may have been closed before this function is called.
  7547         Check if it is still open before calling XSync.  */
  (gdb) xbacktrace
  "assoc-default"
  "set-auto-mode"
  "normal-mode"
  "after-find-file"
  "find-file-noselect-1"
  "find-file-noselect"
  "find-file-other-window"
  "command-line-1"
  "command-line"
  "normal-top-level"
  (gdb) where
  #0  x_catch_errors_unwind (dummy=13215333)
      at /scratch/jwe/src/emacs/src/xterm.c:7543
  #1  0x000000000050db2d in unbind_to (count=<value optimized out>,
      value=9546737) at /scratch/jwe/src/emacs/src/eval.c:3233
  #2  0x000000000053c778 in Fbyte_code (bytestr=70, vector=7317884, maxdepth=40)
      at /scratch/jwe/src/emacs/src/bytecode.c:716
  #3  0x000000000050eeab in funcall_lambda (fun=7317572, nargs=1,
      arg_vector=0x7ffffff35988) at /scratch/jwe/src/emacs/src/eval.c:3066
  #4  0x000000000050f456 in Ffuncall (nargs=<value optimized out>, args=0x6fa840)
      at /scratch/jwe/src/emacs/src/eval.c:2934
  #5  0x000000000053c904 in Fbyte_code (bytestr=10163713, vector=7319660,
      maxdepth=64) at /scratch/jwe/src/emacs/src/bytecode.c:694
  #6  0x000000000050eeab in funcall_lambda (fun=7318844, nargs=4,
      arg_vector=0x7ffffff35b48) at /scratch/jwe/src/emacs/src/eval.c:3066
  #7  0x000000000050f456 in Ffuncall (nargs=<value optimized out>, args=0x6fad38)
      at /scratch/jwe/src/emacs/src/eval.c:2934
  #8  0x000000000053c904 in Fbyte_code (bytestr=10412017, vector=7308220,
      maxdepth=48) at /scratch/jwe/src/emacs/src/bytecode.c:694
  #9  0x000000000050eeab in funcall_lambda (fun=7308036, nargs=1,
      arg_vector=0x7ffffff35cf8) at /scratch/jwe/src/emacs/src/eval.c:3066
  #10 0x000000000050f456 in Ffuncall (nargs=<value optimized out>, args=0x6f8300)
      at /scratch/jwe/src/emacs/src/eval.c:2934
  #11 0x000000000053c904 in Fbyte_code (bytestr=10081201, vector=787, maxdepth=0)
      at /scratch/jwe/src/emacs/src/bytecode.c:694
  #12 0x000000000050eeab in funcall_lambda (fun=8449212, nargs=1,
      arg_vector=0x7ffffff35ed8) at /scratch/jwe/src/emacs/src/eval.c:3066
  #13 0x000000000050f456 in Ffuncall (nargs=<value optimized out>, args=0x80ecb8)
      at /scratch/jwe/src/emacs/src/eval.c:2934
  #14 0x000000000053c904 in Fbyte_code (bytestr=9752577, vector=8425668,
      maxdepth=56) at /scratch/jwe/src/emacs/src/bytecode.c:694
  #15 0x000000000050eeab in funcall_lambda (fun=8424020, nargs=0,
      arg_vector=0x7ffffff36098) at /scratch/jwe/src/emacs/src/eval.c:3066
  #16 0x000000000050f456 in Ffuncall (nargs=<value optimized out>, args=0x808a50)
      at /scratch/jwe/src/emacs/src/eval.c:2934
  #17 0x000000000053c904 in Fbyte_code (bytestr=9699873, vector=8415588,
      maxdepth=48) at /scratch/jwe/src/emacs/src/bytecode.c:694
  #18 0x000000000050eeab in funcall_lambda (fun=8415364, nargs=0,
      arg_vector=0x7ffffff361d0) at /scratch/jwe/src/emacs/src/eval.c:3066
  #19 0x000000000050f14c in apply_lambda (fun=8415364, args=9546737, eval_flag=1)
      at /scratch/jwe/src/emacs/src/eval.c:2988
  #20 0x000000000050e810 in Feval (form=<value optimized out>)
      at /scratch/jwe/src/emacs/src/eval.c:2277
  #21 0x000000000050d587 in internal_condition_case (
      bfun=0x4a4700 <top_level_2>, handlers=9640161, hfun=0x4aa810 <cmd_error>)
      at /scratch/jwe/src/emacs/src/eval.c:1465
  #22 0x00000000004a473a in top_level_1 ()
      at /scratch/jwe/src/emacs/src/keyboard.c:1345
  #23 0x000000000050d437 in internal_catch (tag=<value optimized out>,
      func=0x4a4710 <top_level_1>, arg=9546737)
      at /scratch/jwe/src/emacs/src/eval.c:1211
  #24 0x00000000004a44db in command_loop ()
      at /scratch/jwe/src/emacs/src/keyboard.c:1302
  #25 0x00000000004a4591 in recursive_edit_1 ()
      at /scratch/jwe/src/emacs/src/keyboard.c:1000
  #26 0x00000000004a4693 in Frecursive_edit ()
      at /scratch/jwe/src/emacs/src/keyboard.c:1061
  #27 0x00000000004a372f in main (argc=2011, argv=0x7ffffff36a68)
      at /scratch/jwe/src/emacs/src/emacs.c:1789

I'm using the CVS Emacs sources, checked out Feb 16.  I'm generating
this error by running

  emacs -q long list of files to open ...

followed by grabbing the window with the mouse and moving it around
the screen rapidly while Emacs is loading the files.  I can usually
but not always cause Emacs to crash by doing this.

I was unable to reproduce the problem if I set a breakpoint in
x_catch_errors_unwind, so I made the following changes to xterm.c, to
trace the sequence of calls to x_catch_errors and
x_catch_errors_unwind, which it seems should always be paired.

Index: xterm.c
===================================================================
RCS file: /sources/emacs/emacs/src/xterm.c,v
retrieving revision 1.897
diff -u -r1.897 xterm.c
--- xterm.c	14 Feb 2006 10:01:23 -0000	1.897
+++ xterm.c	17 Feb 2006 07:48:50 -0000
@@ -7508,6 +7508,9 @@
 void x_check_errors ();
 static Lisp_Object x_catch_errors_unwind ();
 
+static char xxxbuf[1000];
+static int xxxidx = 0;
+
 int
 x_catch_errors (dpy)
      Display *dpy;
@@ -7531,6 +7534,9 @@
 
   record_unwind_protect (x_catch_errors_unwind, dummy);
 
+  xxxbuf[xxxidx++] = 's';
+  xxxbuf[xxxidx] = 0;
+
   return count;
 }
 
@@ -7540,6 +7546,12 @@
 x_catch_errors_unwind (dummy)
      Lisp_Object dummy;
 {
+  xxxbuf[xxxidx++] = 'C';
+  xxxbuf[xxxidx] = 0;
+
+  if (! x_error_message)
+    fprintf (stderr, "%s\n", xxxbuf);
+
   Display *dpy = x_error_message->dpy;
   struct x_error_message_stack *tmp;


With these changes, I get a message like this:

  sCsCsCsCsCsCsCsCsCsCsCsCsCsCsCsCsCsCsCsCsCsCsCsCsCsCsCsCsCsCsCsCsCsCsCsCsCsCsCsCsCsCsCC

just before the crash (the message is only printed if x_error_message
is 0).  The length varies, but it is always a sequence of sC
(indicating a call to x_catch_errors followed by a call to
x_catch_errors_unwind) ending with CC, indicating that
x_catch_errors_unwind is being called twice, without an intervening
call to x_catch_errors.  Looking at the way these functions are used
in xterm.c, it is not at all obvious to me how that can happen.

I compiled Emacs with gcc 4.0.3 on an amd64 Debian system.  Here is
the output from gcc -v:

  Using built-in specs.
  Target: x86_64-linux-gnu
  Configured with: ../src/configure -v --enable-languages=c,c++,java,f95,objc,ada,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --program-suffix=-4.0 --enable-__cxa_atexit --enable-clocale=gnu --enable-libstdcxx-debug --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.4.2-gcj-4.0-1.4.2.0/jre --enable-mpfr --disable-werror --enable-checking=release x86_64-linux-gnu
  Thread model: posix
  gcc version 4.0.3 20060128 (prerelease) (Debian 4.0.2-8)

Here is the information from report-emacs-bug:

  In GNU Emacs 22.0.50.9 (x86_64-unknown-linux-gnu)
   of 2006-02-17 on segfault
  X server distributor `The X.Org Foundation', version 11.0.60900000
  configured using `configure '--prefix=/usr/local/cvs-emacs''

  Important settings:
    value of $LC_ALL: nil
    value of $LC_COLLATE: nil
    value of $LC_CTYPE: nil
    value of $LC_MESSAGES: nil
    value of $LC_MONETARY: nil
    value of $LC_NUMERIC: nil
    value of $LC_TIME: nil
    value of $LANG: nil
    locale-coding-system: nil
    default-enable-multibyte-characters: t

  Major mode: Lisp Interaction

  Minor modes in effect:
    tooltip-mode: t
    auto-compression-mode: t
    tool-bar-mode: t
    mouse-wheel-mode: t
    menu-bar-mode: t
    file-name-shadow-mode: t
    global-font-lock-mode: t
    font-lock-mode: t
    blink-cursor-mode: t
    unify-8859-on-encoding-mode: t
    utf-translate-cjk-mode: t
    line-number-mode: t

  Recent input:
  <help-echo> <help-echo> <help-echo> <help-echo> <escape> 
  x r e p o r t <tab> <return>

  Recent messages:
  (/usr/local/cvs-emacs/bin/emacs -q)
  For information about the GNU Project and its goals, type C-h C-p.
  Loading emacsbug...
  Loading regexp-opt...done
  Loading emacsbug...done

I'd be happy to try any other debugging.  I'm not subscribed to the
list.

Thanks,

jwe


* Re: SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu)
  2006-02-13 17:05     ` Stefan Monnier
  2006-02-14  0:40       ` Richard M. Stallman
@ 2006-02-17 14:27       ` Reiner Steib
  2006-02-17 15:20         ` Reproducible crashes: dropping an URL (was: SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu)) Reiner Steib
  1 sibling, 1 reply; 25+ messages in thread
From: Reiner Steib @ 2006-02-17 14:27 UTC (permalink / raw)
  Cc: Nick Roberts, emacs-devel

On Mon, Feb 13 2006, Stefan Monnier wrote:

>> Should I build again?  Should I enable additional checks
>> (ENABLE_CHECKING?)?
>
> Yes, please try ENABLE_CHECKING.

I updated and recompiled with ENABLE_CHECKING some days ago.  Now I
got another crash:

--8<---------------cut here---------------start------------->8---
Emacs fatal error: [...]/emacs/src/alloc.c:3212: assertion failed: !handling_signal

Breakpoint 1, abort ()
    at [...]/emacs/src/emacs.c:463
463     {
(gdb) bt full
#0  abort () at [...]/emacs/src/emacs.c:463
No locals.
#1  0x00000000005a2304 in die (msg=Variable "msg" is not available.
)   
    at [...]/emacs/src/alloc.c:6193
No locals.
#2  0x00000000005a6334 in Fmake_symbol (name=62320131)
    at [...]/emacs/src/alloc.c:3236
        val = Variable "val" is not available.
(gdb) xbacktrace
"read-event"
"byte-code"
"mouse-show-mark"
"mouse-drag-track"
"mouse-drag-region"
"call-interactively"
"widget-button-click"
"call-interactively"
(gdb) p x_error_message
$1 = (struct x_error_message_stack *) 0x0
(gdb) 
--8<---------------cut here---------------end--------------->8---

If I can provide more information, please tell me which gdb commands I
should use.

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/


* Re: SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu)
  2006-02-20 14:59             ` SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu) (was: Reproducible crashes: dropping an URL) Reiner Steib
@ 2006-02-20 15:04               ` Stefan Monnier
  2006-02-20 20:05                 ` Reiner Steib
  2006-02-21  5:30                 ` Richard M. Stallman
  0 siblings, 2 replies; 25+ messages in thread
From: Stefan Monnier @ 2006-02-20 15:04 UTC (permalink / raw)


> I recompiled with this change on 2006-02-17.  I can't reproduce the
> Drag/drop crash anymore.  

Good.

> Today I got another "x_error_message == 0x0" crash:

> --8<---------------cut here---------------start------------->8---
> Program received signal SIGSEGV, Segmentation fault.
> 0x00000000004fe8b0 in x_catch_errors_unwind (dummy=20426868)
>     at [...]/emacs/src/xterm.c:7543
> warning: Source file is more recent than executable.

> 7543      Display *dpy = x_error_message->dpy;
> (gdb) bt full
> #0  0x00000000004fe8b0 in x_catch_errors_unwind (dummy=20426868)
>     at [...]/emacs/src/xterm.c:7543
>         dpy = Variable "dpy" is not available.
> (gdb) xbacktrace 
> (gdb) p x_error_message
> $1 = (struct x_error_message_stack *) 0x0
> (gdb) 
> --8<---------------cut here---------------end--------------->8---

What's the C backtrace?


        Stefan


* Re: SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu)
  2006-02-20 15:04               ` SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu) Stefan Monnier
@ 2006-02-20 20:05                 ` Reiner Steib
  2006-02-21  4:39                   ` Chong Yidong
  2006-02-21  5:30                 ` Richard M. Stallman
  1 sibling, 1 reply; 25+ messages in thread
From: Reiner Steib @ 2006-02-20 20:05 UTC (permalink / raw)
  Cc: emacs-devel

On Mon, Feb 20 2006, Stefan Monnier wrote:

>> --8<---------------cut here---------------start------------->8---
>> Program received signal SIGSEGV, Segmentation fault.
>> 0x00000000004fe8b0 in x_catch_errors_unwind (dummy=20426868)
>>     at [...]/emacs/src/xterm.c:7543
>> warning: Source file is more recent than executable.
>
>> 7543      Display *dpy = x_error_message->dpy;
>> (gdb) bt full
>> #0  0x00000000004fe8b0 in x_catch_errors_unwind (dummy=20426868)
>>     at [...]/emacs/src/xterm.c:7543
>>         dpy = Variable "dpy" is not available.
>> (gdb) xbacktrace 
>> (gdb) p x_error_message
>> $1 = (struct x_error_message_stack *) 0x0
>> (gdb) 
>> --8<---------------cut here---------------end--------------->8---
>
> What's the C backtrace?

Sorry, I thought `bt full' (as suggested by `report-emacs-bug') would
print the full C backtrace.

--8<---------------cut here---------------start------------->8---
(gdb) backtrace 
#0  0x00000000004fe8b0 in x_catch_errors_unwind (dummy=20426868)
    at [...]/emacs/src/xterm.c:7543
#1  0x00000000005c24f7 in unbind_to (count=5, value=10808305)
    at [...]/emacs/src/eval.c:3233
#2  0x000000000048a34d in display_mode_line (w=Variable "w" is not available.
)
    at [...]/emacs/src/xdisp.c:16231
#3  0x000000000048a818 in display_mode_lines (w=0x246f610)
    at [...]/emacs/src/xdisp.c:16177
#4  0x00000000004993d1 in redisplay_window (window=38204948, just_this_one_p=0)
    at [...]/emacs/src/xdisp.c:13019
#5  0x000000000049b6bd in redisplay_window_0 (window=Variable "window" is not available.
)
    at [...]/emacs/src/xdisp.c:11450
#6  0x00000000005c3268 in internal_condition_case_1 (
    bfun=0x49b690 <redisplay_window_0>, arg=38204948, handlers=10790613, 
    hfun=0x463270 <redisplay_window_error>)
    at [...]/emacs/src/eval.c:1506
#7  0x000000000047ce6f in redisplay_windows (window=38204948)
    at [...]/emacs/src/xdisp.c:11429
#8  0x000000000047cddb in redisplay_windows (window=50459380)
    at [...]/emacs/src/xdisp.c:11423
#9  0x00000000004901d5 in redisplay_internal (preserve_echo_area=Variable "preserve_echo_area" is not available.
)
    at [...]/emacs/src/xdisp.c:10989
#10 0x0000000000542436 in read_char (commandflag=1, nmaps=7, 
    maps=0x7fbfffd490, prev_event=10808305, used_mouse_menu=0x7fbfffd534)
    at [...]/emacs/src/keyboard.c:2549
#11 0x0000000000546e7b in read_key_sequence (keybuf=0x7fbfffd6d0, bufsize=30, 
    prompt=10808305, dont_downcase_last=0, can_return_switch_frame=1, 
    fix_current_buffer=1)
    at [...]/emacs/src/keyboard.c:8874
#12 0x000000000054a6b2 in command_loop_1 ()
    at [...]/emacs/src/keyboard.c:1536
#13 0x00000000005c3581 in internal_condition_case (
    bfun=0x54a3f0 <command_loop_1>, handlers=10901729, 
    hfun=0x540f10 <cmd_error>)
    at [...]/emacs/src/eval.c:1465
#14 0x000000000053fd3a in command_loop_2 ()
    at [...]/emacs/src/keyboard.c:1328
#15 0x00000000005c36d0 in internal_catch (tag=Variable "tag" is not available.
)
    at [...]/emacs/src/eval.c:1211
#16 0x0000000000540938 in command_loop ()
    at [...]/emacs/src/keyboard.c:1307
#17 0x00000000005409d1 in recursive_edit_1 ()
    at [...]/emacs/src/keyboard.c:1000
#18 0x0000000000540b70 in Frecursive_edit ()
    at [...]/emacs/src/keyboard.c:1061
#19 0x0000000000530796 in main (argc=5, argv=0x7fbfffdf58)
    at [...]/emacs/src/emacs.c:1789
--8<---------------cut here---------------end--------------->8---

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/


* Re: SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu)
  2006-02-20 20:05                 ` Reiner Steib
@ 2006-02-21  4:39                   ` Chong Yidong
  2006-02-22  5:23                     ` Richard M. Stallman
  0 siblings, 1 reply; 25+ messages in thread
From: Chong Yidong @ 2006-02-21  4:39 UTC (permalink / raw)
  Cc: emacs-devel

If these crashes are caused by the changes made to avoid allocating
Lisp strings in the X error handler, maybe we can just go back to the
old system, since we now block input in the allocation functions.


* Re: SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu)
  2006-02-20 15:04               ` SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu) Stefan Monnier
  2006-02-20 20:05                 ` Reiner Steib
@ 2006-02-21  5:30                 ` Richard M. Stallman
  1 sibling, 0 replies; 25+ messages in thread
From: Richard M. Stallman @ 2006-02-21  5:30 UTC (permalink / raw)
  Cc: emacs-devel

I think I fixed the x_error_message problem yesterday
with a change in xterm.c.


* Re: SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu)
@ 2006-02-21 21:33 John W. Eaton
  2006-02-23  2:39 ` Richard Stallman
  2006-02-24  9:55 ` Richard Stallman
  0 siblings, 2 replies; 25+ messages in thread
From: John W. Eaton @ 2006-02-21 21:33 UTC (permalink / raw)
  Cc: rms

On Tue, 21 Feb 2006, Richard M. Stallman wrote:

| I think I fixed the x_error_message problem yesterday
| with a change in xterm.c.

Do you mean the following change?

  2006-02-19  Richard M. Stallman  <rms@gnu.org>

	  * xterm.c (x_catch_errors): Use xmalloc.

  Index: xterm.c
  ===================================================================
  RCS file: /sources/emacs/emacs/src/xterm.c,v
  retrieving revision 1.897
  retrieving revision 1.898
  diff -u -r1.897 -r1.898
  --- xterm.c     14 Feb 2006 10:01:23 -0000      1.897
  +++ xterm.c     20 Feb 2006 01:18:43 -0000      1.898
  @@ -7513,7 +7513,7 @@
	Display *dpy;
   {
     int count = SPECPDL_INDEX ();
  -  struct x_error_message_stack *data = malloc (sizeof (*data));
  +  struct x_error_message_stack *data = xmalloc (sizeof (*data));
     Lisp_Object dummy;
   #ifdef ENABLE_CHECKING
     dummy = make_number ((EMACS_INT)dpy + (EMACS_INT)x_error_message);


Unfortunately, this doesn't seem to solve the problem for me.  I still
have crashes in x_catch_errors_unwind in xterm.c (called from unbind_to
in eval.c) due to x_error_message == 0.  I'm appending the backtrace
from gdb.  This is with the current CVS sources updated around noon EST
2/21/2006.

I supplied some additional info here:

  http://lists.gnu.org/archive/html/emacs-devel/2006-02/msg00794.html

I don't think malloc is failing to allocate memory here, but something
is causing x_catch_errors_unwind to be called without a matching call
to x_catch_errors.

I'd be happy to do any additional debugging.  Just let me know what
additional info would help.

Thanks,

jwe


Here is the backtrace from gdb:

Program received signal SIGSEGV, Segmentation fault.
x_catch_errors_unwind (dummy=9571313) at /scratch/jwe/src/emacs/src/xterm.c:7543
7543      Display *dpy = x_error_message->dpy;
(gdb) where
#0  x_catch_errors_unwind (dummy=9571313) at /scratch/jwe/src/emacs/src/xterm.c:7543
#1  0x000000000050e60d in unbind_to (count=<value optimized out>, value=17530837)
    at /scratch/jwe/src/emacs/src/eval.c:3233
#2  0x000000000050ff36 in Ffuncall (nargs=<value optimized out>, args=0xfb4020)
    at /scratch/jwe/src/emacs/src/eval.c:2934
#3  0x000000000051115a in call1 (fn=<value optimized out>, arg1=<value optimized out>)
    at /scratch/jwe/src/emacs/src/eval.c:2664
#4  0x000000000051adb4 in mapcar1 (leni=44, vals=0x7fffff8f3fd0, fn=16465956, seq=<value optimized out>)
    at /scratch/jwe/src/emacs/src/fns.c:3138
#5  0x000000000051b165 in Fmapcar (function=16465956, sequence=16318565)
    at /scratch/jwe/src/emacs/src/fns.c:3206
#6  0x000000000051002b in Ffuncall (nargs=<value optimized out>, args=<value optimized out>)
    at /scratch/jwe/src/emacs/src/eval.c:2882
#7  0x000000000053d3e4 in Fbyte_code (bytestr=16581345, vector=16515636, maxdepth=72)
    at /scratch/jwe/src/emacs/src/bytecode.c:694
#8  0x000000000050f46a in Feval (form=<value optimized out>) at /scratch/jwe/src/emacs/src/eval.c:2225
#9  0x0000000000511a81 in internal_lisp_condition_case (var=10166625, bodyform=13169253,
    handlers=<value optimized out>) at /scratch/jwe/src/emacs/src/eval.c:1412
#10 0x000000000053c426 in Fbyte_code (bytestr=32, vector=16659748, maxdepth=24)
    at /scratch/jwe/src/emacs/src/bytecode.c:884
#11 0x000000000050f98b in funcall_lambda (fun=16672196, nargs=1, arg_vector=0x7fffff8f46e8)
    at /scratch/jwe/src/emacs/src/eval.c:3066
#12 0x000000000050ff36 in Ffuncall (nargs=<value optimized out>, args=0xfe65c0)
    at /scratch/jwe/src/emacs/src/eval.c:2934
#13 0x000000000053d3e4 in Fbyte_code (bytestr=16815377, vector=16825188, maxdepth=24)
    at /scratch/jwe/src/emacs/src/bytecode.c:694
#14 0x000000000050f98b in funcall_lambda (fun=16825620, nargs=0, arg_vector=0x7fffff8f4888)
    at /scratch/jwe/src/emacs/src/eval.c:3066
#15 0x000000000050ff36 in Ffuncall (nargs=<value optimized out>, args=0x100bd10)
    at /scratch/jwe/src/emacs/src/eval.c:2934
#16 0x000000000053d3e4 in Fbyte_code (bytestr=10159937, vector=7350844, maxdepth=24)
    at /scratch/jwe/src/emacs/src/bytecode.c:694
#17 0x000000000050f98b in funcall_lambda (fun=7350668, nargs=2, arg_vector=0x7fffff8f4a28)
    at /scratch/jwe/src/emacs/src/eval.c:3066
#18 0x000000000050ff36 in Ffuncall (nargs=<value optimized out>, args=0x702988)
    at /scratch/jwe/src/emacs/src/eval.c:2934
#19 0x000000000053d3e4 in Fbyte_code (bytestr=10460385, vector=7349476, maxdepth=40)
    at /scratch/jwe/src/emacs/src/bytecode.c:694
#20 0x000000000050f98b in funcall_lambda (fun=7349060, nargs=0, arg_vector=0x7fffff8f4b60)
    at /scratch/jwe/src/emacs/src/eval.c:3066
#21 0x000000000050fc2c in apply_lambda (fun=7349060, args=9571313, eval_flag=1)
    at /scratch/jwe/src/emacs/src/eval.c:2988
#22 0x000000000050f2f0 in Feval (form=<value optimized out>) at /scratch/jwe/src/emacs/src/eval.c:2277
#23 0x0000000000511a81 in internal_lisp_condition_case (var=10166625, bodyform=7336397,
    handlers=<value optimized out>) at /scratch/jwe/src/emacs/src/eval.c:1412
#24 0x000000000053c426 in Fbyte_code (bytestr=24, vector=7336220, maxdepth=32)
    at /scratch/jwe/src/emacs/src/bytecode.c:884
#25 0x000000000050f98b in funcall_lambda (fun=7336004, nargs=1, arg_vector=0x7fffff8f4fa8)
    at /scratch/jwe/src/emacs/src/eval.c:3066
#26 0x000000000050ff36 in Ffuncall (nargs=<value optimized out>, args=0x6ff040)
    at /scratch/jwe/src/emacs/src/eval.c:2934
#27 0x000000000053d3e4 in Fbyte_code (bytestr=208, vector=7334468, maxdepth=40)
    at /scratch/jwe/src/emacs/src/bytecode.c:694
#28 0x000000000050f98b in funcall_lambda (fun=7333996, nargs=2, arg_vector=0x7fffff8f5158)
    at /scratch/jwe/src/emacs/src/eval.c:3066
#29 0x000000000050ff36 in Ffuncall (nargs=<value optimized out>, args=0x6fe868)
    at /scratch/jwe/src/emacs/src/eval.c:2934
#30 0x000000000053d3e4 in Fbyte_code (bytestr=10446913, vector=7329828, maxdepth=32)
    at /scratch/jwe/src/emacs/src/bytecode.c:694
#31 0x000000000050f98b in funcall_lambda (fun=7329460, nargs=6, arg_vector=0x7fffff8f52f8)
    at /scratch/jwe/src/emacs/src/eval.c:3066
#32 0x000000000050ff36 in Ffuncall (nargs=<value optimized out>, args=0x6fd6b0)
    at /scratch/jwe/src/emacs/src/eval.c:2934
#33 0x000000000053d3e4 in Fbyte_code (bytestr=10287953, vector=12733232, maxdepth=64)
    at /scratch/jwe/src/emacs/src/bytecode.c:694
#34 0x000000000050f98b in funcall_lambda (fun=7325932, nargs=4, arg_vector=0x7fffff8f54b8)
    at /scratch/jwe/src/emacs/src/eval.c:3066
#35 0x000000000050ff36 in Ffuncall (nargs=<value optimized out>, args=0x6fc8e8)
    at /scratch/jwe/src/emacs/src/eval.c:2934
#36 0x000000000053d3e4 in Fbyte_code (bytestr=10437041, vector=7315308, maxdepth=48)
    at /scratch/jwe/src/emacs/src/bytecode.c:694
#37 0x000000000050f98b in funcall_lambda (fun=7315124, nargs=1, arg_vector=0x7fffff8f5668)
    at /scratch/jwe/src/emacs/src/eval.c:3066
#38 0x000000000050ff36 in Ffuncall (nargs=<value optimized out>, args=0x6f9eb0)
    at /scratch/jwe/src/emacs/src/eval.c:2934
#39 0x000000000053d3e4 in Fbyte_code (bytestr=10107153, vector=67, maxdepth=0)
    at /scratch/jwe/src/emacs/src/bytecode.c:694
#40 0x000000000050f98b in funcall_lambda (fun=8456516, nargs=1, arg_vector=0x7fffff8f5848)
    at /scratch/jwe/src/emacs/src/eval.c:3066
#41 0x000000000050ff36 in Ffuncall (nargs=<value optimized out>, args=0x810940)
    at /scratch/jwe/src/emacs/src/eval.c:2934
#42 0x000000000053d3e4 in Fbyte_code (bytestr=9778481, vector=8432972, maxdepth=56)
    at /scratch/jwe/src/emacs/src/bytecode.c:694
#43 0x000000000050f98b in funcall_lambda (fun=8431324, nargs=0, arg_vector=0x7fffff8f5a08)
    at /scratch/jwe/src/emacs/src/eval.c:3066
#44 0x000000000050ff36 in Ffuncall (nargs=<value optimized out>, args=0x80a6d8)
    at /scratch/jwe/src/emacs/src/eval.c:2934
#45 0x000000000053d3e4 in Fbyte_code (bytestr=9724593, vector=8422892, maxdepth=48)
    at /scratch/jwe/src/emacs/src/bytecode.c:694
#46 0x000000000050f98b in funcall_lambda (fun=8422668, nargs=0, arg_vector=0x7fffff8f5b40)
    at /scratch/jwe/src/emacs/src/eval.c:3066
#47 0x000000000050fc2c in apply_lambda (fun=8422668, args=9571313, eval_flag=1)
    at /scratch/jwe/src/emacs/src/eval.c:2988
#48 0x000000000050f2f0 in Feval (form=<value optimized out>) at /scratch/jwe/src/emacs/src/eval.c:2277
#49 0x000000000050e067 in internal_condition_case (bfun=0x4a5160 <top_level_2>, handlers=9664785,
    hfun=0x4ab270 <cmd_error>) at /scratch/jwe/src/emacs/src/eval.c:1465
#50 0x00000000004a519a in top_level_1 () at /scratch/jwe/src/emacs/src/keyboard.c:1345
#51 0x000000000050df17 in internal_catch (tag=<value optimized out>, func=0x4a5170 <top_level_1>, arg=9571313)
    at /scratch/jwe/src/emacs/src/eval.c:1211
#52 0x00000000004a4f3b in command_loop () at /scratch/jwe/src/emacs/src/keyboard.c:1302
#53 0x00000000004a4ff1 in recursive_edit_1 () at /scratch/jwe/src/emacs/src/keyboard.c:1000
#54 0x00000000004a50f3 in Frecursive_edit () at /scratch/jwe/src/emacs/src/keyboard.c:1061
#55 0x00000000004a418f in main (argc=388, argv=0x7fffff8f63d8) at /scratch/jwe/src/emacs/src/emacs.c:1789

Lisp Backtrace:
0xfb4024 PVEC_COMPILED
"mapcar"
"byte-code"
"c-init-language-vars-for"
"c++-mode"
"set-auto-mode-0"
"set-auto-mode"
"normal-mode"
"after-find-file"
"find-file-noselect-1"
"find-file-noselect"
"find-file-other-window"
"command-line-1"
"command-line"
"normal-top-level"
(gdb) list
7538
7539    static Lisp_Object
7540    x_catch_errors_unwind (dummy)
7541         Lisp_Object dummy;
7542    {
7543      Display *dpy = x_error_message->dpy;
7544      struct x_error_message_stack *tmp;
7545
7546      /* The display may have been closed before this function is called.
7547         Check if it is still open before calling XSync.  */
(gdb) p x_error_message
$1 = (struct x_error_message_stack *) 0x0


* Re: SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu)
  2006-02-21  4:39                   ` Chong Yidong
@ 2006-02-22  5:23                     ` Richard M. Stallman
  0 siblings, 0 replies; 25+ messages in thread
From: Richard M. Stallman @ 2006-02-22  5:23 UTC (permalink / raw)
  Cc: monnier, emacs-devel

    If these crashes are caused by the changes made to avoid allocating
    lisp strings in the x error handler, maybe we can just go back to the
    old system, since we now block input in the allocation functions.

Perhaps so, but first, let's see if my small fix fixed it.


* Re: SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu)
  2006-02-21 21:33 John W. Eaton
@ 2006-02-23  2:39 ` Richard Stallman
  2006-02-24  9:55 ` Richard Stallman
  1 sibling, 0 replies; 25+ messages in thread
From: Richard Stallman @ 2006-02-23  2:39 UTC (permalink / raw)
  Cc: emacs-devel

    Unfortunately, this doesn't seem to solve the problem for me.  I still
    have crashes in unbind_to in xterm.c due to x_error_message == 0.
    I'm appending the backtrace from gdb.

In that case, let's do go back to the previous code and see if things
work better--as Yidong recently suggested.


* Re: SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu)
  2006-02-21 21:33 John W. Eaton
  2006-02-23  2:39 ` Richard Stallman
@ 2006-02-24  9:55 ` Richard Stallman
  2006-02-25  7:45   ` John W. Eaton
  1 sibling, 1 reply; 25+ messages in thread
From: Richard Stallman @ 2006-02-24  9:55 UTC (permalink / raw)
  Cc: emacs-devel

    I don't think malloc is failing to allocate memory here, but something
    is causing x_catch_errors_unwind to be called without a matching call
    to x_catch_errors.

It does look that way.  But x_catch_errors_unwind was called from the
specpdl, and nothing ever puts it on the specpdl except x_catch_errors.
So something very very strange is happening.

I just looked at every call to x_catch_errors, and none of them seems
to be able to exit without a subsequent call to x_uncatch_errors which
should unwind it.

Can you examine the innermost specpdl bindings and see what
variables they bind?  Also, please examine a few slots
just beyond the specpdl_ptr, slots which were unwound recently.
What variables or unwind functions do they use?


* Re: SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu)
  2006-02-24  9:55 ` Richard Stallman
@ 2006-02-25  7:45   ` John W. Eaton
  2006-02-25 15:20     ` Chong Yidong
  0 siblings, 1 reply; 25+ messages in thread
From: John W. Eaton @ 2006-02-25  7:45 UTC (permalink / raw)
  Cc: emacs-devel, John W. Eaton

On 24-Feb-2006, Richard Stallman wrote:

|     I don't think malloc is failing to allocate memory here, but something
|     is causing x_catch_errors_unwind to be called without a matching call
|     to x_catch_errors.
| 
| It does look that way.  But x_catch_errors_unwind was called from the
| specpdl, and nothing ever puts it on the specpdl except x_catch_errors.
| So something very very strange is happening.
| 
| I just looked at every call to x_catch_errors, and none of them seems
| to be able to exit without a subsequent call to x_uncatch_errors which
| should unwind it.
| 
| Can you examine the innermost specpdl bindings and see what
| variables they bind?  Also, please examine a few slots
| just beyond the specpdl_ptr, slots which were unwound recently.
| What variables or unwind functions do they use?

Do you mean something like the following?

  Program received signal SIGSEGV, Segmentation fault.
  x_catch_errors_unwind (dummy=9571361) at /scratch/jwe/src/emacs/src/xterm.c:7543
  7543      Display *dpy = x_error_message->dpy;
  (gdb) p *(specpdl_ptr-5)
  $1 = {
    symbol = 25233173,
    old_value = 9571409,
    func = 0,
    unused = 14128919392575862
  }
  (gdb) p *(specpdl_ptr-4)
  $2 = {
    symbol = 10047041,
    old_value = 9571361,
    func = 0,
    unused = 127979076609647
  }
  (gdb) p *(specpdl_ptr-3)
  $3 = {
    symbol = 15310353,
    old_value = 9571361,
    func = 0,
    unused = 7089075026933016948
  }
  (gdb) p *(specpdl_ptr-2)
  $4 = {
    symbol = 15310401,
    old_value = 9571361,
    func = 0,
    unused = 7521905712077829995
  }
  (gdb) p *(specpdl_ptr-1)
  $5 = {
    symbol = 9682081,
    old_value = 9572340,
    func = 0,
    unused = 7018969065866813815
  }
  (gdb) p *(specpdl_ptr-0)
  $6 = {
    symbol = 9571313,
    old_value = 9571313,
    func = 0x4807c0 <x_catch_errors_unwind>,
    unused = 7074422071709478245
  }
  (gdb) p *(specpdl_ptr+1)
  $7 = {
    symbol = 10046849,
    old_value = 9571313,
    func = 0,
    unused = 7305790112002241125
  }
  (gdb) p *(specpdl_ptr+2)
  $8 = {
    symbol = 10046897,
    old_value = 25232549,
    func = 0,
    unused = 7305790112002241125
  }
  (gdb) p *(specpdl_ptr+3)
  $9 = {
    symbol = 10436065,
    old_value = 9571313,
    func = 0,
    unused = 20
  }
  (gdb) p *(specpdl_ptr+4)
  $10 = {
    symbol = 9990977,
    old_value = 24405347,
    func = 0,
    unused = 13
  }
  (gdb) p *(specpdl_ptr+5)
  $11 = {
    symbol = 9714577,
    old_value = 9571361,
    func = 0,
    unused = 1701734764
  }

If not, then will you please tell me precisely how you would like for
me to do this?  I'm not very familiar with Emacs internals.

In any case, I don't see anything useful there.  Maybe you will.

However, as I was looking at the following loop unbind_to in eval.c,
it occurred to me that one way the x_catch_errors_unwind function
could be called twice in succession would be if specpdl_ptr is
incremented by the addition of additional bindings while the loop is
running (by some other code that is misbehaving while manipulating the
specpdl array).  In that case, it seems that the entry for
x_catch_errors_unwind would remain on the stack, to be executed
again.  I'm not sure how to determine whether that is what is
happening, or if it is, how to determine where specpdl_ptr is being
changed without being reset correctly.

  while (specpdl_ptr != specpdl + count)
    {
      /* Copy the binding, and decrement specpdl_ptr, before we do
	 the work to unbind it.  We decrement first
	 so that an error in unbinding won't try to unbind
	 the same entry again, and we copy the binding first
	 in case more bindings are made during some of the code we run.  */

      struct specbinding this_binding;
      this_binding = *--specpdl_ptr;

      if (this_binding.func != 0)
	(*this_binding.func) (this_binding.old_value);
      /* If the symbol is a list, it is really (SYMBOL WHERE
	 . CURRENT-BUFFER) where WHERE is either nil, a buffer, or a
	 frame.  If WHERE is a buffer or frame, this indicates we
	 bound a variable that had a buffer-local or frame-local
	 binding.  WHERE nil means that the variable had the default
	 value when it was bound.  CURRENT-BUFFER is the buffer that
         was current when the variable was bound.  */
      else if (CONSP (this_binding.symbol))
	{
	  Lisp_Object symbol, where;

	  symbol = XCAR (this_binding.symbol);
	  where = XCAR (XCDR (this_binding.symbol));

	  if (NILP (where))
	    Fset_default (symbol, this_binding.old_value);
	  else if (BUFFERP (where))
	    set_internal (symbol, this_binding.old_value, XBUFFER (where), 1);
	  else
	    set_internal (symbol, this_binding.old_value, NULL, 1);
	}
      else
	{
	  /* If variable has a trivial value (no forwarding), we can
	     just set it.  No need to check for constant symbols here,
	     since that was already done by specbind.  */
	  if (!MISCP (SYMBOL_VALUE (this_binding.symbol)))
	    SET_SYMBOL_VALUE (this_binding.symbol, this_binding.old_value);
	  else
	    set_internal (this_binding.symbol, this_binding.old_value, 0, 1);
	}
    }

jwe


* Re: SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu)
  2006-02-25  7:45   ` John W. Eaton
@ 2006-02-25 15:20     ` Chong Yidong
  2006-02-25 15:36       ` Chong Yidong
  0 siblings, 1 reply; 25+ messages in thread
From: Chong Yidong @ 2006-02-25 15:20 UTC (permalink / raw)
  Cc: rms, emacs-devel

> However, as I was looking at the following loop unbind_to in eval.c,
> it occurred to me that one way the x_catch_errors_unwind function
> could be called twice in succession would be if specpdl_ptr is
> incremented by the addition of additional bindings while the loop is
> running (by some other code that is misbehaving while manipulating the
> specpdl array).

Yes, that's the problem.  Putting that loop inside a BLOCK_INPUT
prevents the crash.


* Re: SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu)
  2006-02-25 15:20     ` Chong Yidong
@ 2006-02-25 15:36       ` Chong Yidong
  2006-02-25 17:47         ` John W. Eaton
  2006-02-26 12:11         ` Richard Stallman
  0 siblings, 2 replies; 25+ messages in thread
From: Chong Yidong @ 2006-02-25 15:36 UTC (permalink / raw)
  Cc: rms, emacs-devel

Chong Yidong <cyd@stupidchicken.com> writes:

>> However, as I was looking at the following loop unbind_to in eval.c,
>> it occurred to me that one way the x_catch_errors_unwind function
>> could be called twice in succession would be if specpdl_ptr is
>> incremented by the addition of additional bindings while the loop is
>> running (by some other code that is misbehaving while manipulating the
>> specpdl array).

There isn't any misbehaving code, btw.  Since unbind_to is currently
run without BLOCK_INPUT, it can be interrupted by a signal handler.
The signal handler can call the x error handler, which calls
record_unwind_protect, which screws things up.

One solution is to somehow re-engineer the x error handler not to use
record_unwind_protect.  The other is to block inputs at the point in
unbind_to where specpdl_ptr is being modified, like this:

*** emacs/src/eval.c.~1.261.~	2006-02-09 23:33:56.000000000 -0500
--- emacs/src/eval.c	2006-02-25 10:26:26.000000000 -0500
***************
*** 3214,3233 ****
  {
    Lisp_Object quitf = Vquit_flag;
    struct gcpro gcpro1, gcpro2;
  
    GCPRO2 (value, quitf);
    Vquit_flag = Qnil;
  
!   while (specpdl_ptr != specpdl + count)
      {
        /* Copy the binding, and decrement specpdl_ptr, before we do
  	 the work to unbind it.  We decrement first
  	 so that an error in unbinding won't try to unbind
  	 the same entry again, and we copy the binding first
  	 in case more bindings are made during some of the code we run.  */
  
-       struct specbinding this_binding;
        this_binding = *--specpdl_ptr;
  
        if (this_binding.func != 0)
  	(*this_binding.func) (this_binding.old_value);
--- 3214,3238 ----
  {
    Lisp_Object quitf = Vquit_flag;
    struct gcpro gcpro1, gcpro2;
+   struct specbinding this_binding;
  
    GCPRO2 (value, quitf);
    Vquit_flag = Qnil;
  
!   while (1)
      {
+       BLOCK_INPUT;
+       if (specpdl_ptr == specpdl + count)
+ 	break;
+ 
        /* Copy the binding, and decrement specpdl_ptr, before we do
  	 the work to unbind it.  We decrement first
  	 so that an error in unbinding won't try to unbind
  	 the same entry again, and we copy the binding first
  	 in case more bindings are made during some of the code we run.  */
  
        this_binding = *--specpdl_ptr;
+       UNBLOCK_INPUT;
  
        if (this_binding.func != 0)
  	(*this_binding.func) (this_binding.old_value);
***************
*** 3263,3268 ****
--- 3268,3274 ----
  	    set_internal (this_binding.symbol, this_binding.old_value, 0, 1);
  	}
      }
+   UNBLOCK_INPUT;
  
    if (NILP (Vquit_flag) && !NILP (quitf))
      Vquit_flag = quitf;
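
For readers unfamiliar with the idea behind the patch above, here is a
minimal sketch of "blocking input" around a critical section.  It is
only an illustration: block_input, unblock_input, deliver_event and
the counters are hypothetical stand-ins, not the real BLOCK_INPUT and
UNBLOCK_INPUT macros, and the "signal" is simulated by an ordinary
function call.

```c
#include <assert.h>

/* Hypothetical stand-ins, not the real Emacs macros or variables.  */
int input_blocked;    /* nesting depth of critical sections */
int input_pending;    /* an event arrived while input was blocked */
int events_handled;   /* number of events the handler has processed */

void handle_event (void) { events_handled++; }

/* Called at the point where an asynchronous input signal would arrive.  */
void deliver_event (void)
{
  if (input_blocked > 0)
    input_pending = 1;        /* defer: we are inside a critical section */
  else
    handle_event ();
}

void block_input (void) { input_blocked++; }

void unblock_input (void)
{
  assert (input_blocked > 0);
  if (--input_blocked == 0 && input_pending)
    {
      input_pending = 0;
      handle_event ();        /* run the deferred work once it is safe */
    }
}

/* Returns how many events were handled while input was still blocked.  */
int handled_inside_critical_section (void)
{
  int inside;

  input_blocked = input_pending = events_handled = 0;
  block_input ();
  deliver_event ();           /* simulated signal in the critical section */
  inside = events_handled;
  unblock_input ();
  return inside;
}
```

In this toy model the event is handled only after unblock_input
returns the depth to zero, so the handler never sees a half-updated
stack pointer.  It protects only the sections that are explicitly
bracketed.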


* Re: SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu)
  2006-02-25 15:36       ` Chong Yidong
@ 2006-02-25 17:47         ` John W. Eaton
  2006-02-25 23:29           ` Chong Yidong
  2006-02-26 16:00           ` Richard Stallman
  2006-02-26 12:11         ` Richard Stallman
  1 sibling, 2 replies; 25+ messages in thread
From: John W. Eaton @ 2006-02-25 17:47 UTC (permalink / raw)
  Cc: emacs-devel, rms, John W. Eaton

On 25-Feb-2006, Chong Yidong wrote:

| Chong Yidong <cyd@stupidchicken.com> writes:
| 
| >> However, as I was looking at the following loop unbind_to in eval.c,
| >> it occurred to me that one way the x_catch_errors_unwind function
| >> could be called twice in succession would be if specpdl_ptr is
| >> incremented by the addition of additional bindings while the loop is
| >> running (by some other code that is misbehaving while manipulating the
| >> specpdl array).
| 
| There isn't any misbehaving code, btw.  Since unbind_to is currently
| run without BLOCK_INPUT, it can be interrupted by a signal handler.
| The signal handler can call the x error handler, which calls
| record_unwind_protect, which screws things up.

| One solution is to somehow re-engineer the x error handler not to use
| record_unwind_protect.

I think this might be the best bet.

| The other is to block inputs at the point in
| unbind_to where specpdl_ptr is being modified, like this:

This does not solve the problem for me.  It seems to be harder to
generate the crash, but I am still hitting the x_error_message == 0
segfault.

Again, the way I'm triggering the bug is to run Emacs under gdb.  The
last line of my .gdbinit file is

  set args -q file1 file2 ...

where the list of files was generated with

  find ~/src/octave -name '*.cc'

There are 386 files in the list.  The rest of the .gdbinit file is
extracted from the .gdbinit file in the Emacs src directory that I
checked out from savannah.  I had to throw out a few things that did not
apply because I'm installing emacs (--prefix=/usr/local/cvs-emacs)
before running it.  While Emacs is processing the list of files, I
grab the title bar of the window with the mouse pointer and rapidly
move the Emacs window around the screen.

Before applying your latest patch, I could generate a crash in maybe 7
or 8 out of 10 tries.  With the patch, it is down to around 1 or 2 out
of 10 tries, but it is still crashing.

Just to be sure I haven't screwed something up, I updated from the
public CVS archive for Emacs and made sure that I had no local
modifications, then applied your patch and ran configure and make
bootstrap, then generated a crash with the method explained above.

Even surrounding the entire body of unbind_to with a
BLOCK_INPUT/UNBLOCK_INPUT pair did not avoid the crash, though it took
nearly 20 attempts to trigger it.  Of course, even if this had worked,
I don't think it could be a solution because it would prevent any user
input from happening inside the cleanup portion of an unwind-protect
form.

BTW, as a wishlist item, it would be nice if either the manual or the
source included an explanation of the origin of names like specpdl,
staticpro, gcpro, etc.  It took some time before I understood what
these mean, even after looking at the declarations for these
variables/macros/functions (and I'm still not sure of how the term
specpdl was derived).  OTOH, perhaps I just missed them, or am
unusually slow to catch on.

Thanks,

jwe


* Re: SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu)
  2006-02-25 17:47         ` John W. Eaton
@ 2006-02-25 23:29           ` Chong Yidong
  2006-02-26  3:37             ` John W. Eaton
  2006-02-27  8:58             ` Richard Stallman
  2006-02-26 16:00           ` Richard Stallman
  1 sibling, 2 replies; 25+ messages in thread
From: Chong Yidong @ 2006-02-25 23:29 UTC (permalink / raw)
  Cc: rms, emacs-devel

"John W. Eaton" <jwe@bevo.che.wisc.edu> writes:

> | One solution is to somehow re-engineer the x error handler not to use
> | record_unwind_protect.
>
> I think this might be the best bet.
>
> | The other is to block inputs at the point in
> | unbind_to where specpdl_ptr is being modified, like this:
>
> This does not solve the problem for me.  It seems to be harder to
> generate the crash, but I am still hitting the x_error_message == 0
> segfault.

I just checked in some changes to make the x error handler avoid using
record_unwind_protect.  I am unable to make Emacs crash now -- can you
confirm this?

(Maybe this issue is related to the other problem, reported on
emacs-devel recently, of nil becoming bound to "(#<save_value...".  I
added an "eassert (!handling_signal)" to record_unwind_protect.)


* Re: SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu)
  2006-02-25 23:29           ` Chong Yidong
@ 2006-02-26  3:37             ` John W. Eaton
  2006-02-27  8:58             ` Richard Stallman
  1 sibling, 0 replies; 25+ messages in thread
From: John W. Eaton @ 2006-02-26  3:37 UTC (permalink / raw)
  Cc: emacs-devel, rms, John W. Eaton

On 25-Feb-2006, Chong Yidong wrote:

| "John W. Eaton" <jwe@bevo.che.wisc.edu> writes:
| 
| > | One solution is to somehow re-engineer the x error handler not to use
| > | record_unwind_protect.
| >
| > I think this might be the best bet.
| >
| > | The other is to block inputs at the point in
| > | unbind_to where specpdl_ptr is being modified, like this:
| >
| > This does not solve the problem for me.  It seems to be harder to
| > generate the crash, but I am still hitting the x_error_message == 0
| > segfault.
| 
| I just checked in some changes to make the x error handler avoid using
| record_unwind_protect.  I am unable to make Emacs crash now -- can you
| confirm this?

I updated and was unable to make Emacs crash in 20 attempts so it seems
likely that the problem is fixed.

Thanks!

jwe


* Re: SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu)
  2006-02-25 15:36       ` Chong Yidong
  2006-02-25 17:47         ` John W. Eaton
@ 2006-02-26 12:11         ` Richard Stallman
  1 sibling, 0 replies; 25+ messages in thread
From: Richard Stallman @ 2006-02-26 12:11 UTC (permalink / raw)
  Cc: emacs-devel, jwe

Thanks for figuring this out.
Please install your patch.

We may as well not revert to the old code, now that the
new code is working.


* Re: SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu)
  2006-02-25 17:47         ` John W. Eaton
  2006-02-25 23:29           ` Chong Yidong
@ 2006-02-26 16:00           ` Richard Stallman
  1 sibling, 0 replies; 25+ messages in thread
From: Richard Stallman @ 2006-02-26 16:00 UTC (permalink / raw)
  Cc: cyd, emacs-devel, jwe

    Even surrounding the entire body of unbind_to with a
    BLOCK_INPUT/UNBLOCK_INPUT pair did not avoid the crash, though it took
    nearly 20 attempts to trigger it.

There must be another place that is sensitive to a similar bug.
Here's one:

record_unwind_protect (function, arg)
     Lisp_Object (*function) P_ ((Lisp_Object));
     Lisp_Object arg;
{
  if (specpdl_ptr == specpdl + specpdl_size)
    grow_specpdl ();
  specpdl_ptr->func = function;
  specpdl_ptr->symbol = Qnil;
  specpdl_ptr->old_value = arg;
  specpdl_ptr++;

I think specbind is another.
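
To make the hazard in a push like the one above concrete, here is a
small deterministic simulation.  It is a sketch under assumptions, not
Emacs code: the toy stack, catch_depth, error_handler and the other
names are hypothetical, and the "signal" is an ordinary call placed
after the first field store, before the pointer is incremented.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*unwind_fn) (int);

struct binding { unwind_fn func; int arg; };

enum { STACK_SIZE = 16 };
struct binding stack[STACK_SIZE];
struct binding *stack_ptr = stack;  /* plays the role of specpdl_ptr */
int catch_depth;                    /* plays the role of the x_error_message stack */
int unmatched_unwinds;              /* unwinds that found nothing to pop */

void catch_unwind (int arg)
{
  (void) arg;
  if (catch_depth == 0)
    unmatched_unwinds++;   /* the real code would dereference a null pointer */
  else
    catch_depth--;
}

void user_unwind (int arg) { (void) arg; }

void push (unwind_fn func, int arg)
{
  assert (stack_ptr < stack + STACK_SIZE);
  stack_ptr->func = func;
  stack_ptr->arg = arg;
  stack_ptr++;
}

void unwind_to (struct binding *base)
{
  while (stack_ptr != base)
    {
      struct binding b = *--stack_ptr;  /* copy and decrement first */
      if (b.func != NULL)
        b.func (b.arg);
    }
}

/* A well-behaved handler: pushes its unwind entry and pops it again.  */
void error_handler (void)
{
  struct binding *base = stack_ptr;

  catch_depth++;
  push (catch_unwind, 0);
  unwind_to (base);
}

/* Like push, but with a simulated signal after the first field store.  */
void interrupted_push (unwind_fn func, int arg)
{
  stack_ptr->func = func;
  error_handler ();        /* overwrites the slot we are still filling in */
  stack_ptr->arg = arg;
  stack_ptr++;
}

/* Returns the number of unwind calls that had no matching catch.  */
int simulate (void)
{
  stack_ptr = stack;
  catch_depth = 0;
  unmatched_unwinds = 0;
  interrupted_push (user_unwind, 1);
  unwind_to (stack);
  return unmatched_unwinds;
}
```

In this model the handler itself is balanced, yet the interrupted slot
ends up holding the handler's unwind function instead of the one the
caller meant to record, so the later unwind runs it a second time with
nothing left to pop.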


* Re: SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu)
  2006-02-25 23:29           ` Chong Yidong
  2006-02-26  3:37             ` John W. Eaton
@ 2006-02-27  8:58             ` Richard Stallman
  2006-02-28  0:51               ` Chong Yidong
  1 sibling, 1 reply; 25+ messages in thread
From: Richard Stallman @ 2006-02-27  8:58 UTC (permalink / raw)
  Cc: emacs-devel, jwe

    I just checked in some changes to make the x error handler avoid using
    record_unwind_protect.  I am unable to make Emacs crash now -- can you
    confirm this?

But if it does not use record_unwind_protect, how does it ensure
that the x error handler gets turned off if an error happens?
Have you checked every use of x_catch_errors to make sure that no
Lisp errors can occur before the matching call to x_uncatch_errors?


* Re: SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu)
  2006-02-27  8:58             ` Richard Stallman
@ 2006-02-28  0:51               ` Chong Yidong
  2006-03-05  0:59                 ` Richard Stallman
  0 siblings, 1 reply; 25+ messages in thread
From: Chong Yidong @ 2006-02-28  0:51 UTC (permalink / raw)
  Cc: emacs-devel, jwe

Richard Stallman <rms@gnu.org> writes:

>     I just checked in some changes to make the x error handler avoid using
>     record_unwind_protect.  I am unable to make Emacs crash now -- can you
>     confirm this?
>
> But if it does not use record_unwind_protect, how does it ensure
> that the x error handler gets turned off if an error happens?
> Have you checked every use of x_catch_errors to make sure that no
> Lisp errors can occur before the matching call to x_uncatch_errors?

I missed two places in xselect.c where code protected in
x_catch_errors can signal Lisp errors (x_reply_selection_request and
x_get_foreign_selection), in corner cases.  Since these two functions
cannot be called from a signal handler, I'll put their
x_uncatch_errors call into a record_unwind_protect.

I have checked several more times, and there is no other such
occurrence---most uses of x_catch_errors just wrap one or two Xlib
calls.


* Re: SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu)
  2006-02-28  0:51               ` Chong Yidong
@ 2006-03-05  0:59                 ` Richard Stallman
  0 siblings, 0 replies; 25+ messages in thread
From: Richard Stallman @ 2006-03-05  0:59 UTC (permalink / raw)
  Cc: jwe, emacs-devel

    I have checked several more times, and there is no other such
    occurrence---most uses of x_catch_errors just wrap one or two Xlib
    calls.

Just to avoid hidden bugs, I will arrange for Fsignal to close off
all pending calls to x_catch_errors.
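
One way such a safety net could look, as a rough sketch only: the
catch records form a linked stack, and the error path pops whatever is
left.  All names here (catch_frame, catch_errors, uncatch_all, and so
on) are hypothetical and are not taken from xterm.c or eval.c.

```c
#include <assert.h>
#include <stdlib.h>

/* One record per active catch; the real record would also hold the
   display and the error string.  */
struct catch_frame { struct catch_frame *prev; };

struct catch_frame *catch_top;   /* innermost active catch, or NULL */

void catch_errors (void)
{
  struct catch_frame *frame = malloc (sizeof *frame);

  if (frame == NULL)
    abort ();
  frame->prev = catch_top;
  catch_top = frame;
}

void uncatch_errors (void)
{
  struct catch_frame *frame = catch_top;

  assert (frame != NULL);        /* must match an earlier catch_errors */
  catch_top = frame->prev;
  free (frame);
}

int pending_catches (void)
{
  int n = 0;

  for (struct catch_frame *f = catch_top; f != NULL; f = f->prev)
    n++;
  return n;
}

/* What the error-signaling path would call: drop every pending catch.  */
void uncatch_all (void)
{
  while (catch_top != NULL)
    uncatch_errors ();
}

/* Two catches are opened, then an error skips their matching uncatch
   calls; returns how many catches are still pending afterwards.  */
int pending_after_error (void)
{
  catch_errors ();
  catch_errors ();
  uncatch_all ();                /* simulated error path */
  return pending_catches ();
}
```

The point of the sketch is that cleanup no longer depends on an entry
in the unwind stack, so a catch that is skipped by a nonlocal exit
cannot be left dangling.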

Aside from that, I think this issue is finished.  Thanks.


end of thread, other threads:[~2006-03-05  0:59 UTC | newest]

Thread overview: 25+ messages
2006-02-17  8:04 SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu) John W. Eaton
  -- strict thread matches above, loose matches on Subject: below --
2006-02-21 21:33 John W. Eaton
2006-02-23  2:39 ` Richard Stallman
2006-02-24  9:55 ` Richard Stallman
2006-02-25  7:45   ` John W. Eaton
2006-02-25 15:20     ` Chong Yidong
2006-02-25 15:36       ` Chong Yidong
2006-02-25 17:47         ` John W. Eaton
2006-02-25 23:29           ` Chong Yidong
2006-02-26  3:37             ` John W. Eaton
2006-02-27  8:58             ` Richard Stallman
2006-02-28  0:51               ` Chong Yidong
2006-03-05  0:59                 ` Richard Stallman
2006-02-26 16:00           ` Richard Stallman
2006-02-26 12:11         ` Richard Stallman
     [not found] <v9oe1hg44k.fsf@marauder.physik.uni-ulm.de>
2006-02-13  4:40 ` Richard M. Stallman
2006-02-13 14:04   ` Reiner Steib
2006-02-13 17:05     ` Stefan Monnier
2006-02-14  0:40       ` Richard M. Stallman
2006-02-17 14:27       ` Reiner Steib
2006-02-17 15:20         ` Reproducible crashes: dropping an URL (was: SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu)) Reiner Steib
2006-02-17 16:01           ` Reproducible crashes: dropping an URL Stefan Monnier
2006-02-20 14:59             ` SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu) (was: Reproducible crashes: dropping an URL) Reiner Steib
2006-02-20 15:04               ` SEGV in x_catch_errors_unwind (x86_64-unknown-linux-gnu) Stefan Monnier
2006-02-20 20:05                 ` Reiner Steib
2006-02-21  4:39                   ` Chong Yidong
2006-02-22  5:23                     ` Richard M. Stallman
2006-02-21  5:30                 ` Richard M. Stallman
