unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#22790: 24.5; Infinite loop involving malloc called from signal handler
@ 2016-02-24 13:08 Andreas Gustafsson
  2016-02-24 17:51 ` Eli Zaretskii
  0 siblings, 1 reply; 14+ messages in thread
From: Andreas Gustafsson @ 2016-02-24 13:08 UTC
  To: 22790

I'm using emacs and the VM package to read mail under NetBSD 6.1.5 on a
daily basis, often working with mailboxes many tens of megabytes in
size.  Once every few days of doing this, emacs becomes unresponsive,
consuming 100% CPU.  I rebuilt emacs with debug symbols, and the last
time this happened, I caught the following backtrace:

(gdb) where
#0  0x00007f7ff6c08445 in ?? () from /usr/lib/libpthread.so.1
#1  0x00007f7ff6c08848 in ?? () from /usr/lib/libpthread.so.1
#2  0x00000000005c5486 in _malloc_internal (size=65536) at gmalloc.c:929
#3  0x00000000005c54fc in malloc (size=65536) at gmalloc.c:953
#4  0x00007f7ff60ed28c in __smakebuf () from /usr/lib/libc.so.12
#5  0x00007f7ff60ed125 in __swsetup () from /usr/lib/libc.so.12
#6  0x00007f7ff60cde92 in __vfprintf_unlocked () from /usr/lib/libc.so.12
#7  0x00007f7ff60d1258 in vfprintf () from /usr/lib/libc.so.12
#8  0x00007f7ff60cc266 in printf () from /usr/lib/libc.so.12
#9  0x00000000004db715 in handle_interrupt (in_signal_handler=true) at keyboard.c:10364
#10 0x00000000004db63e in handle_interrupt_signal (sig=2) at keyboard.c:10288
#11 0x00000000004e8b63 in deliver_process_signal (sig=2, handler=0x4db5f1 <handle_interrupt_signal>) at sysdep.c:1570
#12 0x00000000004db65a in deliver_interrupt_signal (sig=2) at keyboard.c:10295
#13 <signal handler called>
#14 0x00007f7ff6c083e2 in ?? () from /usr/lib/libpthread.so.1
#15 0x00007f7ff6c08445 in ?? () from /usr/lib/libpthread.so.1
#16 0x00007f7ff6c08848 in ?? () from /usr/lib/libpthread.so.1
#17 0x00000000005c626f in _free_internal (ptr=0x33e0000) at gmalloc.c:1268
#18 0x00000000005c62ca in free (ptr=0x33e0000) at gmalloc.c:1283
#19 0x0000000000535046 in xfree (block=0x33e0000) at alloc.c:735
#20 0x000000000055ba6d in unbind_to (count=2, value=11946034) at eval.c:3304
#21 0x000000000055715d in unwind_to_catch (catch=0xfbf600, value=45248038) at eval.c:1161
#22 0x000000000055806a in Fsignal (error_symbol=12016098, data=11946034) at eval.c:1557
#23 0x00000000004db9c7 in handle_interrupt (in_signal_handler=true) at keyboard.c:10421
#24 0x00000000004db63e in handle_interrupt_signal (sig=2) at keyboard.c:10288
#25 0x00000000004e8b63 in deliver_process_signal (sig=2, handler=0x4db5f1 <handle_interrupt_signal>) at sysdep.c:1570
#26 0x00000000004db65a in deliver_interrupt_signal (sig=2) at keyboard.c:10295
#27 <signal handler called>
#28 0x00007f7ff60ec172 in memmove () from /usr/lib/libc.so.12
#29 0x00000000005c7c7a in r_alloc_sbrk (size=8527872) at ralloc.c:821
#30 0x00000000005c42b7 in align (size=8527872) at gmalloc.c:423
#31 0x00000000005c46c9 in morecore_nolock (size=8527872) at gmalloc.c:624
#32 0x00000000005c4fc9 in _malloc_internal_nolock (size=8394752) at gmalloc.c:863
#33 0x00000000005c6571 in _realloc_internal_nolock (ptr=0x2be0000, size=8394752) at gmalloc.c:1424
#34 0x00000000005c66fe in _realloc_internal (ptr=0x2be0000, size=8394752) at gmalloc.c:1480
#35 0x00000000005c6773 in realloc (ptr=0x2be0000, size=8394752) at gmalloc.c:1495
#36 0x0000000000534fec in xrealloc (block=0x2be0000, size=8394752) at alloc.c:717
#37 0x0000000000473021 in coding_alloc_by_realloc (coding=0x7f7fffff9a70, bytes=6144) at coding.c:1061
#38 0x000000000047337c in alloc_destination (coding=0x7f7fffff9a70, nbytes=6144, dst=0x33df400 "") at coding.c:1102
#39 0x00000000004881f2 in encode_coding_raw_text (coding=0x7f7fffff9a70) at coding.c:5429
#40 0x0000000000490639 in encode_coding (coding=0x7f7fffff9a70) at coding.c:7802
#41 0x0000000000492467 in encode_coding_object (coding=0x7f7fffff9a70, src_object=40695813, from=8388609, from_byte=8407959, to=16777217, to_byte=16796567, dst_object=11946082) at coding.c:8371
#42 0x000000000050dcef in e_write (desc=6, string=11946034, start=8388609, end=52984520, coding=0x7f7fffff9a70) at fileio.c:5256
#43 0x000000000050d8e9 in a_write (desc=6, string=11946034, pos=1, nchars=52984519, annot=0x7f7fffff9da8, coding=0x7f7fffff9a70) at fileio.c:5172
#44 0x000000000050ce5f in write_region (start=4, end=211938080, filename=33678465, append=11946034, visit=11946082, lockname=33724897, mustbenew=11946034, desc=6) at fileio.c:4870
#45 0x000000000050c758 in Fwrite_region (start=11946034, end=11946034, filename=33678465, append=11946034, visit=11946082, lockname=33724897, mustbenew=11946034) at fileio.c:4679
#46 0x000000000055ab18 in Ffuncall (nargs=7, args=0x7f7fffffa070) at eval.c:2837
#47 0x0000000000599506 in exec_byte_code (bytestr=8987609, vector=8987645, maxdepth=76, args_template=0, nargs=0, args=0x7f7fffffa5b0) at bytecode.c:916
#48 0x000000000055b0c7 in funcall_lambda (fun=8987565, nargs=0, arg_vector=0x7f7fffffa5b0) at eval.c:2978
#49 0x000000000055abb1 in Ffuncall (nargs=1, args=0x7f7fffffa5a8) at eval.c:2860
#50 0x0000000000599506 in exec_byte_code (bytestr=8987417, vector=8987453, maxdepth=12, args_template=0, nargs=0, args=0x7f7fffffaab0) at bytecode.c:916
#51 0x000000000055b0c7 in funcall_lambda (fun=8987373, nargs=0, arg_vector=0x7f7fffffaab0) at eval.c:2978
#52 0x000000000055abb1 in Ffuncall (nargs=1, args=0x7f7fffffaaa8) at eval.c:2860
#53 0x0000000000599506 in exec_byte_code (bytestr=8986145, vector=8986181, maxdepth=40, args_template=0, nargs=0, args=0x7f7fffffafc0) at bytecode.c:916
#54 0x000000000055b0c7 in funcall_lambda (fun=8986093, nargs=0, arg_vector=0x7f7fffffafc0) at eval.c:2978
#55 0x000000000055abb1 in Ffuncall (nargs=1, args=0x7f7fffffafb8) at eval.c:2860
#56 0x0000000000599506 in exec_byte_code (bytestr=8984849, vector=8984885, maxdepth=20, args_template=1024, nargs=0, args=0x7f7fffffb4a8) at bytecode.c:916
#57 0x000000000055b0c7 in funcall_lambda (fun=8984797, nargs=0, arg_vector=0x7f7fffffb4a8) at eval.c:2978
#58 0x000000000055abb1 in Ffuncall (nargs=1, args=0x7f7fffffb4a0) at eval.c:2860
#59 0x0000000000599506 in exec_byte_code (bytestr=8989777, vector=8989813, maxdepth=8, args_template=1028, nargs=1, args=0x7f7fffffba20) at bytecode.c:916
#60 0x000000000055b0c7 in funcall_lambda (fun=8989733, nargs=1, arg_vector=0x7f7fffffba18) at eval.c:2978
#61 0x000000000055abb1 in Ffuncall (nargs=2, args=0x7f7fffffba10) at eval.c:2860
#62 0x0000000000599506 in exec_byte_code (bytestr=8723753, vector=8723789, maxdepth=140, args_template=6156, nargs=5, args=0x7f7fffffbfc0) at bytecode.c:916
#63 0x000000000055b0c7 in funcall_lambda (fun=8723709, nargs=5, arg_vector=0x7f7fffffbf98) at eval.c:2978
#64 0x000000000055abb1 in Ffuncall (nargs=6, args=0x7f7fffffbf90) at eval.c:2860
#65 0x0000000000599506 in exec_byte_code (bytestr=8989113, vector=8989149, maxdepth=64, args_template=2048, nargs=2, args=0x7f7fffffc4c8) at bytecode.c:916
#66 0x000000000055b0c7 in funcall_lambda (fun=8989061, nargs=2, arg_vector=0x7f7fffffc4b8) at eval.c:2978
#67 0x000000000055abb1 in Ffuncall (nargs=3, args=0x7f7fffffc4b0) at eval.c:2860
#68 0x0000000000599506 in exec_byte_code (bytestr=9007353, vector=9007389, maxdepth=64, args_template=1024, nargs=1, args=0x7f7fffffca10) at bytecode.c:916
#69 0x000000000055b0c7 in funcall_lambda (fun=9007301, nargs=1, arg_vector=0x7f7fffffca08) at eval.c:2978
#70 0x000000000055abb1 in Ffuncall (nargs=2, args=0x7f7fffffca00) at eval.c:2860
#71 0x0000000000599506 in exec_byte_code (bytestr=9008337, vector=9008373, maxdepth=16, args_template=1024, nargs=1, args=0x7f7fffffcf40) at bytecode.c:916
#72 0x000000000055b0c7 in funcall_lambda (fun=9008285, nargs=1, arg_vector=0x7f7fffffcf38) at eval.c:2978
#73 0x000000000055abb1 in Ffuncall (nargs=2, args=0x7f7fffffcf30) at eval.c:2860
#74 0x00000000005552cf in Fcall_interactively (function=16250050, record_flag=11946034, keys=11998845) at callint.c:836
#75 0x000000000055aa05 in Ffuncall (nargs=4, args=0x7f7fffffd238) at eval.c:2818
#76 0x0000000000599506 in exec_byte_code (bytestr=9460233, vector=9460269, maxdepth=52, args_template=4100, nargs=1, args=0x7f7fffffd750) at bytecode.c:916
#77 0x000000000055b0c7 in funcall_lambda (fun=9460189, nargs=1, arg_vector=0x7f7fffffd748) at eval.c:2978
#78 0x000000000055abb1 in Ffuncall (nargs=2, args=0x7f7fffffd740) at eval.c:2860
#79 0x000000000055a35f in call1 (fn=12009122, arg1=16250050) at eval.c:2610
#80 0x00000000004cb8a9 in command_loop_1 () at keyboard.c:1560
#81 0x0000000000557882 in internal_condition_case (bfun=0x4cb1f1 <command_loop_1>, handlers=12016002, hfun=0x4cab3b <cmd_error>) at eval.c:1348
#82 0x00000000004caf5d in command_loop_2 (ignore=11946034) at keyboard.c:1178
#83 0x00000000005570b5 in internal_catch (tag=12008098, func=0x4caf37 <command_loop_2>, arg=11946034) at eval.c:1112
#84 0x00000000004caf0f in command_loop () at keyboard.c:1157
#85 0x00000000004ca737 in recursive_edit_1 () at keyboard.c:778
#86 0x00000000004ca8a4 in Frecursive_edit () at keyboard.c:849
#87 0x00000000004c8aa4 in main (argc=4, argv=0x7f7fffffdb80) at emacs.c:1642
(gdb) 

For obvious reasons, the information below is not from the same emacs
process, but it is from the same binary.

In GNU Emacs 24.5.1 (x86_64--netbsd)
 of 2016-02-14 on guava.gson.org
Configured using:
 `configure --srcdir=/usr/pkgsrc/editors/emacs24/work/emacs-24.5
 --localstatedir=/var --without-dbus --without-gnutls --without-rsvg
 --without-x --without-xpm --without-jpeg --without-tiff --without-gif
 --without-png --prefix=/usr/pkg --build=x86_64--netbsd
 --host=x86_64--netbsd --infodir=/usr/pkg/info --mandir=/usr/pkg/man
 'CFLAGS=-g -I/usr/include -I/usr/pkg/include' 'CPPFLAGS=-DTERMINFO
 -I/usr/include -I/usr/pkg/include' 'LDFLAGS=-L/usr/lib -Wl,-R/usr/lib
 -L/usr/pkg/lib -Wl,-R/usr/pkg/lib''

Important settings:
  locale-coding-system: nil

Major mode: Debugger

Minor modes in effect:
  tooltip-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent messages:
Note: c-basic-offset adjusted to 2 for buffer keyboard.c.
Mark set [2 times]
Note: c-basic-offset adjusted to 2 for buffer sysdep.c.
Mark set
Mark saved where search started [2 times]
Mark set
scroll-up-command: End of buffer
Command: next 1
Undo!
Mark saved where search started

Load-path shadows:
/u/gson/lisp/tempo hides /usr/pkg/share/emacs/24.5/lisp/tempo
/usr/pkg/share/emacs/site-lisp/ispell/ispell hides /usr/pkg/share/emacs/24.5/lisp/textmodes/ispell

Features:
(shadow sort gnus-util mail-extr warnings emacsbug message format-spec
rfc822 mml mml-sec mm-decode mm-bodies mm-encode mail-parse rfc2231
mailabbrev gmm-utils mailheader sendmail rfc2047 rfc2045 ietf-drums
mm-util help-fns mail-prsvr mail-utils cc-langs cc-mode cc-fonts
easymenu cc-guess cc-menus cc-cmds cc-styles cc-align cc-engine misearch
multi-isearch gdb-mi cl-loaddefs cl-lib bindat json gud tool-bar
easy-mmode comint ansi-color ring xterm time-date guess-offset cc-vars
cc-defs regexp-opt tooltip electric uniquify ediff-hook vc-hooks
lisp-float-type tabulated-list newcomment lisp-mode prog-mode register
page menu-bar rfn-eshadow timer select mouse jit-lock font-lock syntax
facemenu font-core frame cham georgian utf-8-lang misc-lang vietnamese
tibetan thai tai-viet lao korean japanese hebrew greek romanian slovak
czech european ethiopic indian cyrillic chinese case-table epa-hook
jka-cmpr-hook help simple abbrev minibuffer nadvice loaddefs button
faces cus-face macroexp files text-properties overlay sha1 md5 base64
format env code-pages mule custom widget hashtable-print-readable
backquote make-network-process multi-tty emacs)

Memory information:
((conses 16 138264 9026)
 (symbols 48 20788 0)
 (miscs 40 86 618)
 (strings 32 21808 4914)
 (string-bytes 1 693311)
 (vectors 16 11410)
 (vector-slots 8 384549 3962)
 (floats 8 69 395)
 (intervals 56 2366 101)
 (buffers 960 19))






* bug#22790: 24.5; Infinite loop involving malloc called from signal handler
  2016-02-24 13:08 bug#22790: 24.5; Infinite loop involving malloc called from signal handler Andreas Gustafsson
@ 2016-02-24 17:51 ` Eli Zaretskii
  2016-02-24 18:17   ` Andreas Gustafsson
  2016-02-29 14:44   ` Andreas Gustafsson
  0 siblings, 2 replies; 14+ messages in thread
From: Eli Zaretskii @ 2016-02-24 17:51 UTC
  To: Andreas Gustafsson; +Cc: 22790

> From: Andreas Gustafsson <gson@gson.org>
> Date: Wed, 24 Feb 2016 15:08:21 +0200
> 
> I'm using emacs and the VM package to read mail under NetBSD 6.1.5 on a
> daily basis, often working with mailboxes many tens of megabytes in
> size.  Once every few days of doing this, emacs becomes unresponsive,
> consuming 100% CPU.  I rebuilt emacs with debug symbols, and the last
> time this happened, I caught the following backtrace:

Thanks.

Is the SIGINT interrupt that shows in the backtrace the result of your
typing C-c, or did it come from something else?

In any case, when this happens next, please use the procedure
described in etc/DEBUG for locating the place where Emacs loops, and
post that information.  Backtraces generated from an infloop
interrupted in a random place tend to be random and don't provide
enough information for finding out the reasons for the loop.






* bug#22790: 24.5; Infinite loop involving malloc called from signal handler
  2016-02-24 17:51 ` Eli Zaretskii
@ 2016-02-24 18:17   ` Andreas Gustafsson
  2016-02-29 14:44   ` Andreas Gustafsson
  1 sibling, 0 replies; 14+ messages in thread
From: Andreas Gustafsson @ 2016-02-24 18:17 UTC
  To: Eli Zaretskii; +Cc: 22790

Eli Zaretskii wrote:
> Is the SIGINT interrupt that shows in the backtrace the result of your
> typing C-c, or did it come from something else?

I'm not sure.  I don't recall typing C-c, but the process was sitting
there for a few hours waiting to be debugged while I attended to some
other work, and I may have done it accidentally during that time.

Also, when an emacs operation takes too long, typing C-g (sic) has
become something of a subconscious reflex after 30 years of use, so
I can't say for sure whether I have done that or not, and will likely
not be able to say the next time it happens, either.

> In any case, when this happens next, please use the procedure
> described in etc/DEBUG for locating the place where Emacs loops, and
> post that information.  Backtraces generated from an infloop
> interrupted in a random place tend to be random and don't provide
> enough information for finding out the reasons for the loop.

To be clear, whether or not emacs was in an infinite loop when it
received the SIGINT, it is in an infinite loop within libpthread now:

(gdb) define s
Type commands for definition of "s".
End with a line saying just "end".
>stepi
>x/i $pc
>end
(gdb) s
0x00007f7ff6c08440 in ?? () from /usr/lib/libpthread.so.1
=> 0x7f7ff6c08440:      callq  0x7f7ff6c083e0
(gdb) s
0x00007f7ff6c083e0 in ?? () from /usr/lib/libpthread.so.1
=> 0x7f7ff6c083e0:      pause
(gdb) s
0x00007f7ff6c083e2 in ?? () from /usr/lib/libpthread.so.1
=> 0x7f7ff6c083e2:      retq
(gdb) s
0x00007f7ff6c08445 in ?? () from /usr/lib/libpthread.so.1
=> 0x7f7ff6c08445:      sub    $0x1,%ebp
(gdb) s
0x00007f7ff6c08448 in ?? () from /usr/lib/libpthread.so.1
=> 0x7f7ff6c08448:      jne    0x7f7ff6c08440
(gdb) s
0x00007f7ff6c08440 in ?? () from /usr/lib/libpthread.so.1
=> 0x7f7ff6c08440:      callq  0x7f7ff6c083e0
(gdb) 

-- 
Andreas Gustafsson, gson@gson.org






* bug#22790: 24.5; Infinite loop involving malloc called from signal handler
  2016-02-24 17:51 ` Eli Zaretskii
  2016-02-24 18:17   ` Andreas Gustafsson
@ 2016-02-29 14:44   ` Andreas Gustafsson
  2016-03-04  9:42     ` Eli Zaretskii
  1 sibling, 1 reply; 14+ messages in thread
From: Andreas Gustafsson @ 2016-02-29 14:44 UTC
  To: 22790

The lockup happened again.  There's still a SIGINT handler involved,
but at least there is only one of them this time and not two recursive
ones.

The full backtrace and some additional gdb output are included below,
but I would think this two-line excerpt should be sufficient to
identify the bug (or at least _a_ bug, if there is more than one):

  #9  0x00007f7ff60cc266 in printf () from /usr/lib/libc.so.12
  #10 0x00000000004db715 in handle_interrupt (in_signal_handler=true) at keyboard.c:10364

That is, printf() is not an async-signal-safe function, so emacs is
invoking undefined behavior by calling it from a signal handler.
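
For reference, here is a minimal sketch of the usual way to avoid this
class of bug, assuming a POSIX system.  The handler only sets a
volatile sig_atomic_t flag and emits the prompt with write(2), which is
on the POSIX async-signal-safe list, whereas printf() may allocate a
stdio buffer through malloc().  The names on_sigint,
install_sigint_handler, consume_interrupt and demo_roundtrip are
hypothetical and are not taken from the Emacs sources; this is an
illustration, not a proposed patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Set by the handler, read by normal code.  volatile sig_atomic_t is
   the object type the C standard allows a handler to write safely. */
static volatile sig_atomic_t interrupt_pending = 0;

/* Hypothetical handler: it touches only the flag and write(2). */
static void on_sigint(int sig)
{
    static const char msg[] = "Auto-save? (y or n) ";
    int saved_errno = errno;      /* write() may clobber errno */
    ssize_t n;

    (void) sig;
    interrupt_pending = 1;
    /* write(2) is async-signal-safe; printf() is not. */
    n = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void) n;
    errno = saved_errno;
}

/* Install the handler for SIGINT; returns 0 on success, -1 on error. */
static int install_sigint_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGINT, &sa, NULL);
}

/* Returns 1 if an interrupt arrived since the last call, else 0.
   Meant to be polled from normal (non-handler) code. */
static int consume_interrupt(void)
{
    if (!interrupt_pending)
        return 0;
    interrupt_pending = 0;
    return 1;
}

/* Small self-check: deliver SIGINT to ourselves and observe the flag. */
static void demo_roundtrip(void)
{
    int rc = install_sigint_handler();
    int seen;

    assert(rc == 0);
    raise(SIGINT);                /* handler runs before raise() returns */
    seen = consume_interrupt();
    assert(seen == 1);
}
```

Any work that needs malloc(), stdio or Lisp evaluation would then be
done by the code that polls consume_interrupt(), outside the handler.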

> In any case, when this happens next, please use the procedure
> described in etc/DEBUG for locating the place where Emacs loops, and
> post that information.

As you can see from the gdb transcript below, the "step" function
didn't work, but "stepi" shows it looping within libpthread.

> Backtraces generated from an infloop
> interrupted in a random place tend to be random and don't provide
> enough information for finding out the reasons for the loop.

Even if you consider the backtrace to be suspect, code inspection
should suffice to show that the line

          printf ("Auto-save? (y or n) ");

in src/keyboard.c can be executed from a signal handler.
-- 
Andreas Gustafsson, gson@gson.org

(gdb) where
#0  0x00007f7ff6c083e2 in ?? () from /usr/lib/libpthread.so.1
#1  0x00007f7ff6c08445 in ?? () from /usr/lib/libpthread.so.1
#2  0x00007f7ff6c08848 in ?? () from /usr/lib/libpthread.so.1
#3  0x00000000005c5486 in _malloc_internal (size=65536) at gmalloc.c:929
#4  0x00000000005c54fc in malloc (size=65536) at gmalloc.c:953
#5  0x00007f7ff60ed28c in __smakebuf () from /usr/lib/libc.so.12
#6  0x00007f7ff60ed125 in __swsetup () from /usr/lib/libc.so.12
#7  0x00007f7ff60cde92 in __vfprintf_unlocked () from /usr/lib/libc.so.12
#8  0x00007f7ff60d1258 in vfprintf () from /usr/lib/libc.so.12
#9  0x00007f7ff60cc266 in printf () from /usr/lib/libc.so.12
#10 0x00000000004db715 in handle_interrupt (in_signal_handler=true) at keyboard.c:10364
#11 0x00000000004db63e in handle_interrupt_signal (sig=2) at keyboard.c:10288
#12 0x00000000004e8b63 in deliver_process_signal (sig=2, handler=0x4db5f1 <handle_interrupt_signal>) at sysdep.c:1570
#13 0x00000000004db65a in deliver_interrupt_signal (sig=2) at keyboard.c:10295
#14 <signal handler called>
#15 0x00007f7ff6c083e2 in ?? () from /usr/lib/libpthread.so.1
#16 0x00007f7ff6c08445 in ?? () from /usr/lib/libpthread.so.1
#17 0x00007f7ff6c08848 in ?? () from /usr/lib/libpthread.so.1
#18 0x00000000005c5486 in _malloc_internal (size=1000) at gmalloc.c:929
#19 0x00000000005c54fc in malloc (size=1000) at gmalloc.c:953
#20 0x0000000000534f0d in xmalloc (size=1000) at alloc.c:677
#21 0x000000000057968f in Fprinc (object=8564569, printcharfun=11946034) at print.c:656
#22 0x000000000057a544 in print_error_message (data=41076294, stream=11944965, context=0x0, caller=11946034) at print.c:919
#23 0x000000000057a238 in Ferror_message_string (obj=41076294) at print.c:844
#24 0x000000000050e40e in auto_save_error (error_val=41076294) at fileio.c:5425
#25 0x000000000055787a in internal_condition_case (bfun=0x50e477 <auto_save_1>, handlers=11946082, hfun=0x50e3bf <auto_save_error>) at eval.c:1345
#26 0x000000000050eb76 in Fdo_auto_save (no_message=11946082, current_only=11946034) at fileio.c:5672
#27 0x00000000004cde3c in read_char (commandflag=1, map=41075894, prev_event=11946034, used_mouse_menu=0x7f7fffff9c0f, end_time=0x0) at keyboard.c:2751
#28 0x00000000004d932a in read_key_sequence (keybuf=0x7f7fffff9df0, bufsize=30, prompt=11946034, dont_downcase_last=false, can_return_switch_frame=true, fix_current_buffer=true, prevent_redisplay=false) at keyboard.c:9089
#29 0x00000000004cb5b0 in command_loop_1 () at keyboard.c:1453
#30 0x0000000000557882 in internal_condition_case (bfun=0x4cb1f1 <command_loop_1>, handlers=12016002, hfun=0x4cab3b <cmd_error>) at eval.c:1348
#31 0x00000000004caf5d in command_loop_2 (ignore=11946034) at keyboard.c:1178
#32 0x00000000005570b5 in internal_catch (tag=12108690, func=0x4caf37 <command_loop_2>, arg=11946034) at eval.c:1112
#33 0x00000000004caec0 in command_loop () at keyboard.c:1149
#34 0x00000000004ca737 in recursive_edit_1 () at keyboard.c:778
#35 0x00000000005017dd in read_minibuf (map=40555366, initial=37407873, prompt=18302785, expflag=false, histvar=12034962, histpos=0, defalt=11946034, allow_props=false, inherit_input_method=false) at minibuf.c:674
#36 0x0000000000501ffd in Fread_from_minibuffer (prompt=18302785, initial_contents=37407873, keymap=40555366, read=11946034, hist=12034962, default_value=11946034, inherit_input_method=11946034) at minibuf.c:957
#37 0x000000000055ab18 in Ffuncall (nargs=8, args=0x7f7fffffa398) at eval.c:2837
#38 0x0000000000599506 in exec_byte_code (bytestr=9425233, vector=9425269, maxdepth=72, args_template=8200, nargs=8, args=0x7f7fffffa918) at bytecode.c:916
#39 0x000000000055b0c7 in funcall_lambda (fun=9425189, nargs=8, arg_vector=0x7f7fffffa8d8) at eval.c:2978
#40 0x000000000055abb1 in Ffuncall (nargs=9, args=0x7f7fffffa8d0) at eval.c:2860
#41 0x0000000000503624 in Fcompleting_read (prompt=18302785, collection=12147074, predicate=12031842, require_match=11946034, initial_input=37407873, hist=12034962, def=11946034, inherit_input_method=11946034) at minibuf.c:1674
#42 0x000000000055ab77 in Ffuncall (nargs=8, args=0x7f7fffffaa70) at eval.c:2844
#43 0x0000000000599506 in exec_byte_code (bytestr=9416857, vector=9416893, maxdepth=92, args_template=6148, nargs=6, args=0x7f7fffffaff0) at bytecode.c:916
#44 0x000000000055b0c7 in funcall_lambda (fun=9416813, nargs=6, arg_vector=0x7f7fffffafc0) at eval.c:2978
#45 0x000000000055abb1 in Ffuncall (nargs=7, args=0x7f7fffffafb8) at eval.c:2860
#46 0x0000000000599506 in exec_byte_code (bytestr=9416657, vector=9416693, maxdepth=52, args_template=6148, nargs=6, args=0x7f7fffffb4f0) at bytecode.c:916
#47 0x000000000055b0c7 in funcall_lambda (fun=9416613, nargs=6, arg_vector=0x7f7fffffb4c0) at eval.c:2978
#48 0x000000000055abb1 in Ffuncall (nargs=7, args=0x7f7fffffb4b8) at eval.c:2860
#49 0x0000000000599506 in exec_byte_code (bytestr=13771137, vector=15498901, maxdepth=28, args_template=11946034, nargs=0, args=0x0) at bytecode.c:916
#50 0x000000000059899d in Fbyte_code (bytestr=13771137, vector=15498901, maxdepth=28) at bytecode.c:482
#51 0x00000000005595a6 in eval_sub (form=13294870) at eval.c:2187
#52 0x000000000055771f in internal_lisp_condition_case (var=11946034, bodyform=13294870, handlers=13294294) at eval.c:1317
#53 0x000000000059a671 in exec_byte_code (bytestr=13770785, vector=15499053, maxdepth=12, args_template=11946034, nargs=0, args=0x0) at bytecode.c:1162
#54 0x000000000055b3c0 in funcall_lambda (fun=15499117, nargs=6, arg_vector=0xec7f2d) at eval.c:3044
#55 0x000000000055abb1 in Ffuncall (nargs=7, args=0x7f7fffffbff8) at eval.c:2860
#56 0x0000000000599506 in exec_byte_code (bytestr=13774209, vector=15563853, maxdepth=28, args_template=11946034, nargs=0, args=0x0) at bytecode.c:916
#57 0x000000000055b3c0 in funcall_lambda (fun=15499165, nargs=3, arg_vector=0xed7c4d) at eval.c:3044
#58 0x000000000055aea5 in apply_lambda (fun=15499165, args=19767910, count=13) at eval.c:2919
#59 0x0000000000559777 in eval_sub (form=19767894) at eval.c:2226
#60 0x0000000000555e28 in Fprogn (body=19767958) at eval.c:462
#61 0x0000000000555dcd in Fcond (args=19767974) at eval.c:440
#62 0x0000000000559273 in eval_sub (form=19767382) at eval.c:2131
#63 0x0000000000555e28 in Fprogn (body=19767990) at eval.c:462
#64 0x0000000000556d09 in Flet (args=19767366) at eval.c:970
#65 0x0000000000559273 in eval_sub (form=19770054) at eval.c:2131
#66 0x0000000000555e28 in Fprogn (body=19768006) at eval.c:462
#67 0x0000000000556d09 in Flet (args=19770006) at eval.c:970
#68 0x0000000000559273 in eval_sub (form=19769894) at eval.c:2131
#69 0x000000000055939f in eval_sub (form=19769878) at eval.c:2147
#70 0x0000000000558dac in Feval (form=19769878, lexical=11946034) at eval.c:1996
#71 0x0000000000553732 in Fcall_interactively (function=18304242, record_flag=11946034, keys=11998845) at callint.c:345
#72 0x000000000055aa05 in Ffuncall (nargs=4, args=0x7f7fffffd248) at eval.c:2818
#73 0x0000000000599506 in exec_byte_code (bytestr=9460233, vector=9460269, maxdepth=52, args_template=4100, nargs=1, args=0x7f7fffffd760) at bytecode.c:916
#74 0x000000000055b0c7 in funcall_lambda (fun=9460189, nargs=1, arg_vector=0x7f7fffffd758) at eval.c:2978
#75 0x000000000055abb1 in Ffuncall (nargs=2, args=0x7f7fffffd750) at eval.c:2860
#76 0x000000000055a35f in call1 (fn=12009122, arg1=18304242) at eval.c:2610
#77 0x00000000004cb8a9 in command_loop_1 () at keyboard.c:1560
#78 0x0000000000557882 in internal_condition_case (bfun=0x4cb1f1 <command_loop_1>, handlers=12016002, hfun=0x4cab3b <cmd_error>) at eval.c:1348
#79 0x00000000004caf5d in command_loop_2 (ignore=11946034) at keyboard.c:1178
#80 0x00000000005570b5 in internal_catch (tag=12008098, func=0x4caf37 <command_loop_2>, arg=11946034) at eval.c:1112
#81 0x00000000004caf0f in command_loop () at keyboard.c:1157
#82 0x00000000004ca737 in recursive_edit_1 () at keyboard.c:778
#83 0x00000000004ca8a4 in Frecursive_edit () at keyboard.c:849
#84 0x00000000004c8aa4 in main (argc=4, argv=0x7f7fffffdb90) at emacs.c:1642
(gdb) step
Cannot find bounds of current function
(gdb) define s
Type commands for definition of "s".
End with a line saying just "end".
>stepi
>x/i $pc
>end
(gdb) s
0x00007f7ff6c08445 in ?? () from /usr/lib/libpthread.so.1
=> 0x7f7ff6c08445:      sub    $0x1,%ebp
(gdb) 
0x00007f7ff6c08448 in ?? () from /usr/lib/libpthread.so.1
=> 0x7f7ff6c08448:      jne    0x7f7ff6c08440
(gdb) 
0x00007f7ff6c08440 in ?? () from /usr/lib/libpthread.so.1
=> 0x7f7ff6c08440:      callq  0x7f7ff6c083e0
(gdb) 
0x00007f7ff6c083e0 in ?? () from /usr/lib/libpthread.so.1
=> 0x7f7ff6c083e0:      pause  
(gdb) 
0x00007f7ff6c083e2 in ?? () from /usr/lib/libpthread.so.1
=> 0x7f7ff6c083e2:      retq   
(gdb) 
0x00007f7ff6c08445 in ?? () from /usr/lib/libpthread.so.1
=> 0x7f7ff6c08445:      sub    $0x1,%ebp
(gdb) 
0x00007f7ff6c08448 in ?? () from /usr/lib/libpthread.so.1
=> 0x7f7ff6c08448:      jne    0x7f7ff6c08440
(gdb) 
0x00007f7ff6c08440 in ?? () from /usr/lib/libpthread.so.1
=> 0x7f7ff6c08440:      callq  0x7f7ff6c083e0
(gdb) 
0x00007f7ff6c083e0 in ?? () from /usr/lib/libpthread.so.1
=> 0x7f7ff6c083e0:      pause  
(gdb) 
0x00007f7ff6c083e2 in ?? () from /usr/lib/libpthread.so.1
=> 0x7f7ff6c083e2:      retq   
(gdb) 
0x00007f7ff6c08445 in ?? () from /usr/lib/libpthread.so.1
=> 0x7f7ff6c08445:      sub    $0x1,%ebp
(gdb) info threads
  Id   Target Id         Frame 
* 1    LWP 1             0x00007f7ff6c08445 in ?? () from /usr/lib/libpthread.so.1
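
The transcript above shows a single thread spinning in a short
pause/decrement/branch loop reached from _malloc_internal, with another
_malloc_internal frame further down the same stack.  One plausible
reading (an assumption, not confirmed in this thread) is that a lock is
being re-acquired by the thread that already holds it.  The toy model
below uses a C11 atomic_flag spinlock, not NetBSD's libpthread, and all
names in it are hypothetical; it only illustrates why such a re-acquire
cannot succeed however long it spins.

```c
#include <assert.h>
#include <stdatomic.h>

/* Toy stand-in for an allocator's internal lock (hypothetical). */
static atomic_flag heap_lock = ATOMIC_FLAG_INIT;

/* Spin at most max_spins times; return 1 if the lock was taken,
   0 if it was still held after the last attempt. */
static int heap_lock_acquire(int max_spins)
{
    while (max_spins-- > 0) {
        if (!atomic_flag_test_and_set_explicit(&heap_lock,
                                               memory_order_acquire))
            return 1;
    }
    return 0;
}

static void heap_lock_release(void)
{
    atomic_flag_clear_explicit(&heap_lock, memory_order_release);
}

/* Models the suspected sequence on one thread:
   1. normal code takes the lock (as if inside malloc);
   2. a signal handler runs on the same thread and tries to take the
      same lock again (as if printf called malloc).
   Returns 1 if the handler-side attempt got the lock, else 0. */
static int handler_reacquire_succeeds(int max_spins)
{
    int outer = heap_lock_acquire(1);
    int inner;

    assert(outer == 1);           /* lock starts out free */
    (void) outer;
    inner = heap_lock_acquire(max_spins);
    heap_lock_release();          /* undo the outer acquisition */
    return inner;
}
```

The bound on the spin count exists only so the sketch terminates.  With
a single thread nothing else can release the lock, so an unbounded
version of the same loop would never exit.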






* bug#22790: 24.5; Infinite loop involving malloc called from signal handler
  2016-02-29 14:44   ` Andreas Gustafsson
@ 2016-03-04  9:42     ` Eli Zaretskii
  2016-03-04 14:23       ` Andreas Gustafsson
  0 siblings, 1 reply; 14+ messages in thread
From: Eli Zaretskii @ 2016-03-04  9:42 UTC
  To: Andreas Gustafsson; +Cc: 22790

> Date: Mon, 29 Feb 2016 16:44:30 +0200
> CC: Eli Zaretskii <eliz@gnu.org>
> From: Andreas Gustafsson <gson@gson.org>
> 
> The lockup happened again.  There's still a SIGINT handler involved,
> but at least there is only one of them this time and not two recursive
> ones.
> 
> The full backtrace and some additional gdb output are included below,
> but I would think this two-line excerpt should be sufficient to
> identify the bug (or at least _a_ bug, if there is more than one):
> 
>   #9  0x00007f7ff60cc266 in printf () from /usr/lib/libc.so.12
>   #10 0x00000000004db715 in handle_interrupt (in_signal_handler=true) at keyboard.c:10364
> 
> That is, printf() is not an async-signal-safe function, so emacs is
> invoking undefined behavior by calling it from a signal handler.

Is this a GUI session or a text-mode terminal (a.k.a. "TTY") session?
If the former, handle_interrupt is not called from a SIGINT handler.

In any case, this code is run as part of the so-called "emergency
escape", when you type C-g more than once while Emacs is busy doing
something that cannot be interrupted.  In that situation, we are way
past the point where invoking undefined behavior is of any concern,
because the only thing we can do then is auto-save and commit
suicide.  The printf call you see on the stack is asking the user
whether to auto-save, and the next question is whether to abort.

> > In any case, when this happens next, please use the procedure
> > described in etc/DEBUG for locating the place where Emacs loops, and
> > post that information.
> 
> As you can see from the gdb transcript below, the "step" function
> didn't work, but "stepi" shows it looping within libpthread.

You need to use "finish", not "step" or "stepi".  I don't think
the loop can reasonably be inside libpthread, so you should try
getting back to the Emacs application code and out of calls to library
functions.  Typing "finish" repeatedly until you are in some Emacs
code is the way to achieve that.  But this should be done without
typing C-g first, because otherwise you might be forcibly taken out of
the loop, and there's no easy way to return there.

And I still don't understand why the SIGINT handler is in the
picture.  Did you type C-g when this lockup happened?

> Even if you consider the backtrace to be suspect, code inspection
> should suffice to show that the line
> 
>           printf ("Auto-save? (y or n) ");
> 
> in src/keyboard.c can be executed from a signal handler.

Indeed, it can.  But I don't think this is the reason for the problem
you are describing.  That code cannot be entered unless you type C-g
twice or more in a TTY session while Emacs is already in some
un-interruptible loop or system call.  It is that loop or system call
that we need to identify in order to fix this problem.

> (gdb) where
> #0  0x00007f7ff6c083e2 in ?? () from /usr/lib/libpthread.so.1
> #1  0x00007f7ff6c08445 in ?? () from /usr/lib/libpthread.so.1
> #2  0x00007f7ff6c08848 in ?? () from /usr/lib/libpthread.so.1
> #3  0x00000000005c5486 in _malloc_internal (size=65536) at gmalloc.c:929
> #4  0x00000000005c54fc in malloc (size=65536) at gmalloc.c:953
> #5  0x00007f7ff60ed28c in __smakebuf () from /usr/lib/libc.so.12
> #6  0x00007f7ff60ed125 in __swsetup () from /usr/lib/libc.so.12
> #7  0x00007f7ff60cde92 in __vfprintf_unlocked () from /usr/lib/libc.so.12
> #8  0x00007f7ff60d1258 in vfprintf () from /usr/lib/libc.so.12
> #9  0x00007f7ff60cc266 in printf () from /usr/lib/libc.so.12
> #10 0x00000000004db715 in handle_interrupt (in_signal_handler=true) at keyboard.c:10364
> #11 0x00000000004db63e in handle_interrupt_signal (sig=2) at keyboard.c:10288
> #12 0x00000000004e8b63 in deliver_process_signal (sig=2, handler=0x4db5f1 <handle_interrupt_signal>) at sysdep.c:1570
> #13 0x00000000004db65a in deliver_interrupt_signal (sig=2) at keyboard.c:10295
> #14 <signal handler called>
> #15 0x00007f7ff6c083e2 in ?? () from /usr/lib/libpthread.so.1
> #16 0x00007f7ff6c08445 in ?? () from /usr/lib/libpthread.so.1
> #17 0x00007f7ff6c08848 in ?? () from /usr/lib/libpthread.so.1
> #18 0x00000000005c5486 in _malloc_internal (size=1000) at gmalloc.c:929
> #19 0x00000000005c54fc in malloc (size=1000) at gmalloc.c:953
> #20 0x0000000000534f0d in xmalloc (size=1000) at alloc.c:677
> #21 0x000000000057968f in Fprinc (object=8564569, printcharfun=11946034) at print.c:656
> #22 0x000000000057a544 in print_error_message (data=41076294, stream=11944965, context=0x0, caller=11946034) at print.c:919
> #23 0x000000000057a238 in Ferror_message_string (obj=41076294) at print.c:844
> #24 0x000000000050e40e in auto_save_error (error_val=41076294) at fileio.c:5425
> #25 0x000000000055787a in internal_condition_case (bfun=0x50e477 <auto_save_1>, handlers=11946082, hfun=0x50e3bf <auto_save_error>) at eval.c:1345
> #26 0x000000000050eb76 in Fdo_auto_save (no_message=11946082, current_only=11946034) at fileio.c:5672
> #27 0x00000000004cde3c in read_char (commandflag=1, map=41075894, prev_event=11946034, used_mouse_menu=0x7f7fffff9c0f, end_time=0x0) at keyboard.c:2751
> #28 0x00000000004d932a in read_key_sequence (keybuf=0x7f7fffff9df0, bufsize=30, prompt=11946034, dont_downcase_last=false, can_return_switch_frame=true, fix_current_buffer=true, prevent_redisplay=false) at keyboard.c:9089
> #29 0x00000000004cb5b0 in command_loop_1 () at keyboard.c:1453
> #30 0x0000000000557882 in internal_condition_case (bfun=0x4cb1f1 <command_loop_1>, handlers=12016002, hfun=0x4cab3b <cmd_error>) at eval.c:1348
> #31 0x00000000004caf5d in command_loop_2 (ignore=11946034) at keyboard.c:1178
> #32 0x00000000005570b5 in internal_catch (tag=12108690, func=0x4caf37 <command_loop_2>, arg=11946034) at eval.c:1112
> #33 0x00000000004caec0 in command_loop () at keyboard.c:1149
> #34 0x00000000004ca737 in recursive_edit_1 () at keyboard.c:778
> #35 0x00000000005017dd in read_minibuf (map=40555366, initial=37407873, prompt=18302785, expflag=false, histvar=12034962, histpos=0, defalt=11946034, allow_props=false, inherit_input_method=false) at minibuf.c:674
> #36 0x0000000000501ffd in Fread_from_minibuffer (prompt=18302785, initial_contents=37407873, keymap=40555366, read=11946034, hist=12034962, default_value=11946034, inherit_input_method=11946034) at minibuf.c:957
> #37 0x000000000055ab18 in Ffuncall (nargs=8, args=0x7f7fffffa398) at eval.c:2837
> #38 0x0000000000599506 in exec_byte_code (bytestr=9425233, vector=9425269, maxdepth=72, args_template=8200, nargs=8, args=0x7f7fffffa918) at bytecode.c:916

This tells the following story:

 . Emacs was running some byte code
 . that byte code tried to read from the minibuffer, probably after
   asking some question or prompting for some input
 . as part of that prompt, Emacs attempted to auto-save modified
   buffers
 . the auto-save attempt signaled an error
 . Emacs wanted to display the error message, and called malloc
 . then somehow SIGINT was delivered

Does this match what you were doing?  Any reason why auto-saving could
fail (some filesystem that could be off-line, for example)?  And where
did that SIGINT come from?

Thanks.






* bug#22790: 24.5; Infinite loop involving malloc called from signal handler
  2016-03-04  9:42     ` Eli Zaretskii
@ 2016-03-04 14:23       ` Andreas Gustafsson
  2016-03-04 15:16         ` Eli Zaretskii
  2016-03-13  9:21         ` Daniel Colascione
  0 siblings, 2 replies; 14+ messages in thread
From: Andreas Gustafsson @ 2016-03-04 14:23 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 22790

Eli Zaretskii wrote:
> Is this a GUI session or a text-mode terminal (a.k.a. "TTY") session?

This is a TTY session.

> In any case, this code is run as part of the so-called "emergency
> escape", when you type C-g more than once while Emacs is busy doing
> something that cannot be interrupted.  In that situation, we are way
> past the point where invoking undefined behavior is of any concern,
> because the only thing we can do then is auto-save and commit
> suicide.

Not necessarily - there is also the option of continuing what emacs
was doing, which is what I would have done, by answering both the
"Auto-save?" and "Abort (and dump core)?" prompts with "no", if those
prompts had actually appeared.  But they didn't, because emacs entered
an infinite loop trying to print them.

> You need to use "finish", not "step" or "stepi".

I will try that the next time the lockup happens, but I'm quite sure
it won't do anything useful.

> I don't think the loop can reasonably be inside libpthread,

Why not?  We are talking undefined behavior here, after all.  If you
find looping in libpthread surprising, just wait until the nasal
demons appear :)  It could be something as simple as trying to acquire
a spinlock that was already held when the signal occurred.
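
A minimal sketch of that last possibility, for illustration only.  It is
not code from Emacs, gmalloc or libpthread; the names demo_lock,
demo_handler and demo_signal_while_locked are hypothetical.  The
interrupted code holds a non-recursive spinlock when the signal arrives,
and the handler then wants the same lock.  The handler here tries only
once so that the demo terminates; a real spin loop would never exit,
because the only code that can release the lock is suspended underneath
the handler.

```c
#include <signal.h>
#include <stdatomic.h>

/* Hypothetical stand-in for an allocator's internal lock. */
static atomic_flag demo_lock = ATOMIC_FLAG_INIT;

/* Set by the handler when it finds the lock already taken. */
static volatile sig_atomic_t handler_saw_lock_held = 0;

static void demo_handler(int sig)
{
    (void)sig;
    /* A real spinlock would do: while (test_and_set) ;
       We try once so that the demonstration can finish. */
    if (atomic_flag_test_and_set(&demo_lock))
        handler_saw_lock_held = 1;      /* a spin loop would hang here */
    else
        atomic_flag_clear(&demo_lock);  /* lock was free; give it back */
}

/* Returns 1 if the handler ran while the interrupted code held the lock. */
int demo_signal_while_locked(void)
{
    handler_saw_lock_held = 0;
    signal(SIGINT, demo_handler);

    while (atomic_flag_test_and_set(&demo_lock))
        ;                               /* acquire the lock */
    raise(SIGINT);                      /* signal lands inside the critical section */
    atomic_flag_clear(&demo_lock);      /* release; unreachable if the handler spins */

    signal(SIGINT, SIG_DFL);
    return handler_saw_lock_held;
}
```

Calling demo_signal_while_locked() from a test program returns 1, which
is the condition under which a spinning handler could not make progress.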

> so you should try
> getting back to the Emacs application code and out of calls to library
> functions.  Typing "finish" repeatedly until you are in some Emacs
> code is the way to achieve that.

It won't be able to do that since it's in an infinite loop inside
libpthread.

> But this should be done without
> typing C-g first, because otherwise you might be forcibly taken out of
> the loop, and there's no easy way to return there.
> 
> And I still don't understand why the SIGINT handler is in the
> picture.  Did you type C-g when this lockup happened?

As I said, I probably did type C-g, and may well have done so more
than once, because I have been conditioned to do that reflexively by
decades of emacs use.  I'm not realistically going to be able to
unlearn that reflex.  The only way I'd be able to stop sending emacs
SIGINTs would be to hack the code to disable C-g as the interrupt
character in the tty settings.

> > Even if you consider the backtrace to be suspect, code inspection
> > should suffice to show that the line
> > 
> >           printf ("Auto-save? (y or n) ");
> > 
> > in src/keyboard.c can be executed from a signal handler.
> 
> Indeed, it can.  But I don't think this is the reason for the problem
> you are describing.  That code cannot be entered unless you type C-g
> twice or more in a TTY session while Emacs is already in some
> un-interruptible loop or system call.  It is that loop or system call
> that we need to identify in order to fix this problem.

As far as I'm concerned, this bug report is specifically about what
happens after the SIGINT, not what happened before it.  For some
reason, emacs was unresponsive, which probably caused me to
reflexively hit control-g.  The lack of responsiveness is not
necessarily a bug - emacs may just be slow due to the size of the
files being edited, or paging, or whatever.  But entering an infinite
loop when I hit control-g, causing the loss of unsaved data and giving
me no option to continue, is definitely a bug.

>  . Emacs was running some byte code
>  . that byte code tried to read from the minibuffer, probably after
>    asking some question or prompting for some input
>  . as part of that prompt, Emacs attempted to auto-save modified
>    buffers
>  . the auto-save attempt signaled an error
>  . Emacs wanted to display the error message, and called malloc
>  . then somehow SIGINT was delivered
> 
> Does this match what you were doing?

Presumably; I don't recall the specifics.

> Any reason why auto-saving could
> fail (some filesystem that could be off-line, for example)?

The only mounted filesystem was a local hard disk, and there were no
disk errors in the system log.  Could the auto-save have failed due to
interruption by yet another (earlier) control-g?

> And where did that SIGINT come from?

Presumably from deep inside the subconscious parts of my brain :)
-- 
Andreas Gustafsson, gson@gson.org






* bug#22790: 24.5; Infinite loop involving malloc called from signal handler
  2016-03-04 14:23       ` Andreas Gustafsson
@ 2016-03-04 15:16         ` Eli Zaretskii
  2016-03-05 10:17           ` Andreas Gustafsson
  2016-07-07 16:24           ` Andreas Gustafsson
  2016-03-13  9:21         ` Daniel Colascione
  1 sibling, 2 replies; 14+ messages in thread
From: Eli Zaretskii @ 2016-03-04 15:16 UTC (permalink / raw)
  To: Andreas Gustafsson; +Cc: 22790

> Date: Fri, 4 Mar 2016 16:23:17 +0200
> Cc: 22790@debbugs.gnu.org
> From: Andreas Gustafsson <gson@gson.org>
> 
> > In any case, this code is run as part of the so-called "emergency
> > escape", when you type C-g more than once while Emacs is busy doing
> > something that cannot be interrupted.  In that situation, we are way
> > past the point where invoking undefined behavior is of any concern,
> > because the only thing we can do then is auto-save and commit
> > suicide.
> 
> Not necessarily - there is also the option of continuing what emacs
> was doing, which is what I would have done, by answering both the
> "Auto-save?" and "Abort (and dump core)?" prompts with "no", if those
> prompts had actually appeared.  But they didn't, because emacs entered
> an infinite loop trying to print them.

Is it really a fact that the loop _followed_ the C-g, not the other
way around?  If there was no loop before the C-g, then why would you
type C-g?

> > You need to use "finish", not "step" or "stepi".
> 
> I will try that the next time the lockup happens, but I'm quite sure
> it won't do anything useful.
> 
> > I don't think the loop can reasonably be inside libpthread,
> 
> Why not?

Because it's much more likely that Emacs has bugs that lead to
infloops than that libpthread has such bugs.

> We are talking undefined behavior here, after all.

If libpthread was so prone to undefined behavior, it would have been
either fixed a long time ago or thrown away in favor of a more robust
implementation.

I'm not saying such a bug in libpthread is impossible, just that it's
much less likely than a bug in Emacs.

> If you find looping in libpthread surprising, just wait until the
> nasal demons appear :) It could be something as simple as trying to
> acquire a spinlock that was already held when the signal occurred.

It could be anything.  But I generally find that the probability of a
bug in an application is much higher than in a library everyone uses.

> > so you should try
> > getting back to the Emacs application code and out of calls to library
> > functions.  Typing "finish" repeatedly until you are in some Emacs
> > code is the way to achieve that.
> 
> It won't be able to do that since it's in an infinite loop inside
> libpthread.

If you can show that, it would be evidence that the loop is indeed
inside that library.  Did you actually try that?  If not, please do,
it's important to know where Emacs loops.

> > > Even if you consider the backtrace to be suspect, code inspection
> > > should suffice to show that the line
> > > 
> > >           printf ("Auto-save? (y or n) ");
> > > 
> > > in src/keyboard.c can be executed from a signal handler.
> > 
> > Indeed, it can.  But I don't think this is the reason for the problem
> > you are describing.  That code cannot be entered unless you type C-g
> > twice or more in a TTY session while Emacs is already in some
> > un-interruptible loop or system call.  It is that loop or system call
> > that we need to identify in order to fix this problem.
> 
> As far as I'm concerned, this bug report is specifically about what
> happens after the SIGINT, not what happened before it.

I was under the impression that the loop happens regardless; apologies
if I misunderstood.  But if you think it is caused by the emergency
exit procedure, how about commenting out those printf's and running
with the modified version for a while?  If the loops don't happen,
then that will be further evidence in favor of your hypothesis about the
reasons for this.

In any case, if those printf's are the culprit, they are no longer
there in the current sources of what will soon become Emacs 25.1.
They were replaced with direct calls to 'write'.  So if we are sure
there's no other problem that causes these loops, we can close this
bug.
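
For illustration, a prompt can be emitted from a signal handler with
nothing but write(2), which POSIX lists as async-signal-safe.  The
sketch below is not the actual Emacs 25 change; write_prompt is a
hypothetical helper.  It allocates no stdio buffer, so it never reaches
malloc, and it saves and restores errno so the interrupted code does not
see it change.

```c
#include <errno.h>
#include <unistd.h>

/* Write the NUL-terminated string MSG to FD using only write(2).
   Returns the number of bytes written, or -1 on error.
   Hypothetical helper, intended to be callable from a signal handler. */
ssize_t write_prompt(int fd, const char *msg)
{
    int saved_errno = errno;    /* do not disturb the interrupted code */
    size_t len = 0, done = 0;
    ssize_t result;

    /* Compute the length by hand; no library call is needed. */
    while (msg[len] != '\0')
        len++;

    /* Retry on short writes and on interruption by another signal. */
    while (done < len) {
        ssize_t n = write(fd, msg + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            break;
        }
        done += (size_t)n;
    }

    result = (done == len) ? (ssize_t)done : -1;
    errno = saved_errno;
    return result;
}
```

A handler would call it as write_prompt (STDOUT_FILENO, "Auto-save? (y or n) ").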

> Could the auto-save have failed due to interruption by yet another
> (earlier) control-g?

Not according to my reading of the code, but I could be wrong, and
it's hard to test this in real usage.






* bug#22790: 24.5; Infinite loop involving malloc called from signal handler
  2016-03-04 15:16         ` Eli Zaretskii
@ 2016-03-05 10:17           ` Andreas Gustafsson
  2016-03-05 11:26             ` Eli Zaretskii
  2016-07-07 16:24           ` Andreas Gustafsson
  1 sibling, 1 reply; 14+ messages in thread
From: Andreas Gustafsson @ 2016-03-05 10:17 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 22790

Eli Zaretskii wrote:
> > > I don't think the loop can reasonably be inside libpthread,
> > 
> > Why not?
> 
> Because it's much more likely that Emacs has bugs that lead to
> infloops than that libpthread has such bugs.
> 
> > We are talking undefined behavior here, after all.
> 
> If libpthread was so prone to undefined behavior, it would have been
> either fixed a long time ago or thrown away in favor of a more robust
> implementation.
> 
> I'm not saying such a bug in libpthread is impossible, just that it's
> much less likely than a bug in Emacs.

I never said there was a bug in libpthread.  The bug is in emacs,
which is incorrectly calling printf() from a signal handler.  Looping
is correct behavior on libpthread's part when you do that, as is any
other behavior - that's what "undefined behavior" means.

> I was under the impression that the loop happens regardless; apologies
> if I misunderstood.  But if you think it is caused by the emergency
> exit procedure, how about commenting out those printf's and running
> with the modified version for a while?  If the loops don't happen,
> then that will be further evidence in favor of your hypothesis about the
> reasons for this.
>
> In any case, if those printf's are the culprit, they are no longer
> there in the current sources of what will soon become Emacs 25.1.
> They were replaced with direct calls to 'write'.  So if we are sure
> there's no other problem that causes these loops, we can close this
> bug.

OK, to me this means that the bug has already been fixed in emacs 25.
I might as well back-port that fix to emacs 24 then, instead of just
commenting out the calls, and this bug report can be closed if no
further emacs 24 releases are planned.  If I'm still experiencing
problems after back-porting the fix or upgrading to emacs 25, I will
file separate bug reports about those.
-- 
Andreas Gustafsson, gson@gson.org






* bug#22790: 24.5; Infinite loop involving malloc called from signal handler
  2016-03-05 10:17           ` Andreas Gustafsson
@ 2016-03-05 11:26             ` Eli Zaretskii
  0 siblings, 0 replies; 14+ messages in thread
From: Eli Zaretskii @ 2016-03-05 11:26 UTC (permalink / raw)
  To: Andreas Gustafsson; +Cc: 22790

> Date: Sat, 5 Mar 2016 12:17:30 +0200
> Cc: 22790@debbugs.gnu.org
> From: Andreas Gustafsson <gson@gson.org>
> 
> > In any case, if those printf's are the culprit, they are no longer
> > there in the current sources of what will soon become Emacs 25.1.
> > They were replaced with direct calls to 'write'.  So if we are sure
> > there's no other problem that causes these loops, we can close this
> > bug.
> 
> OK, to me this means that the bug has already been fixed in emacs 25.
> I might as well back-port that fix to emacs 24 then, instead of just
> commenting out the calls, and this bug report can be closed if no
> further emacs 24 releases are planned.  If I'm still experiencing
> problems after back-porting the fix or upgrading to emacs 25, I will
> file separate bug reports about those.

Thanks.  No further Emacs 24 releases are planned at this time.






* bug#22790: 24.5; Infinite loop involving malloc called from signal handler
  2016-03-04 14:23       ` Andreas Gustafsson
  2016-03-04 15:16         ` Eli Zaretskii
@ 2016-03-13  9:21         ` Daniel Colascione
  2016-03-13 20:41           ` Philipp Stephani
  1 sibling, 1 reply; 14+ messages in thread
From: Daniel Colascione @ 2016-03-13  9:21 UTC (permalink / raw)
  To: Andreas Gustafsson, Eli Zaretskii; +Cc: 22790



On 03/04/2016 06:23 AM, Andreas Gustafsson wrote:
> Eli Zaretskii wrote:
>> Is this a GUI session or a text-mode terminal (a.k.a. "TTY") session?
> 
> This is a TTY session.
> 
>> In any case, this code is run as part of the so-called "emergency
>> escape", when you type C-g more than once while Emacs is busy doing
>> something that cannot be interrupted.  In that situation, we are way
>> past the point where invoking undefined behavior is of any concern,
>> because the only thing we can do then is auto-save and commit
>> suicide.
> 
> Not necessarily - there is also the option of continuing what emacs
> was doing, which is what I would have done, by answering both the
> "Auto-save?" and "Abort (and dump core)?" prompts with "no", if those
> prompts had actually appeared.  But they didn't, because emacs entered
> an infinite loop trying to print them.
> 
>> You need to use "finish", not "step" or "stepi".
> 
> I will try that the next time the lockup happens, but I'm quite sure
> it won't do anything useful.
> 
>> I don't think the loop can reasonably be inside libpthread,
> 
> Why not?  We are talking undefined behavior here, after all.  If you
> find looping in libpthread surprising, just wait until the nasal
> demons appear :)  It could be something as simple as trying to acquire
> a spinlock that was already held when the signal occurred.

The Emacs maintainership has decided that undefined behavior in signal
handlers is perfectly okay. I've patched this dangerous code out of my
Emacs, and I suggest you do too. It's not just the printf calls that are
unsafe --- it's everything that happens there.  But there is little
interest in making Emacs robust in this way.



* bug#22790: 24.5; Infinite loop involving malloc called from signal handler
  2016-03-13  9:21         ` Daniel Colascione
@ 2016-03-13 20:41           ` Philipp Stephani
  0 siblings, 0 replies; 14+ messages in thread
From: Philipp Stephani @ 2016-03-13 20:41 UTC (permalink / raw)
  To: Daniel Colascione, Andreas Gustafsson, Eli Zaretskii; +Cc: 22790


On Sun, 13 March 2016 at 10:22, Daniel Colascione <dancol@dancol.org>
wrote:

> On 03/04/2016 06:23 AM, Andreas Gustafsson wrote:
> > Eli Zaretskii wrote:
> >> Is this a GUI session or a text-mode terminal (a.k.a. "TTY") session?
> >
> > This is a TTY session.
> >
> >> In any case, this code is run as part of the so-called "emergency
> >> escape", when you type C-g more than once while Emacs is busy doing
> >> something that cannot be interrupted.  In that situation, we are way
> >> past the point where invoking undefined behavior is of any concern,
> >> because the only thing we can do then is auto-save and commit
> >> suicide.
> >
> > Not necessarily - there is also the option of continuing what emacs
> > was doing, which is what I would have done, by answering both the
> > "Auto-save?" and "Abort (and dump core)?" prompts with "no", if those
> > prompts had actually appeared.  But they didn't, because emacs entered
> > an infinite loop trying to print them.
> >
> >> You need to use "finish", not "step" or "stepi".
> >
> > I will try that the next time the lockup happens, but I'm quite sure
> > it won't do anything useful.
> >
> >> I don't think the loop can reasonably be inside libpthread,
> >
> > Why not?  We are talking undefined behavior here, after all.  If you
> > find looping in libpthread surprising, just wait until the nasal
> > demons appear :)  It could be something as simple as trying to acquire
> > a spinlock that was already held when the signal occurred.
>
> The Emacs maintainership has decided that undefined behavior in signal
> handlers is perfectly okay. I've patched this dangerous code out of my
> Emacs, and I suggest you do too.



Could you maybe share the patch here? Thanks.


* bug#22790: 24.5; Infinite loop involving malloc called from signal handler
  2016-03-04 15:16         ` Eli Zaretskii
  2016-03-05 10:17           ` Andreas Gustafsson
@ 2016-07-07 16:24           ` Andreas Gustafsson
  2016-07-07 16:53             ` Eli Zaretskii
  1 sibling, 1 reply; 14+ messages in thread
From: Andreas Gustafsson @ 2016-07-07 16:24 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Andreas Gustafsson, 22790

In March, Eli Zaretskii wrote:
> In any case, if those printf's are the culprit, they are no longer
> there in the current sources of what will soon become Emacs 25.1.
> They were replaced with direct calls to 'write'.  So if we are sure
> there's no other problem that causes these loops, we can close this
> bug.

Eliminating the printf calls has mostly fixed the problem for me, but
not completely.  I just had my emacs 24 (which I have patched to
eliminate the printf calls) go into a loop again, following a
control-G and the usual "Auto-save?" and "Abort (and dump core)?"
dialog, but now with the following backtrace:

  (gdb) where
  #0  pthread__mutex_spin (ptm=ptm@entry=0xaea860 <_malloc_mutex>, owner=<optimized out>) at /bracket/prod/7.0gson1/src/lib/libpthread/pthread_mutex.c:204
  #1  0x00007f7ff680b73b in pthread__mutex_lock_slow (ptm=0xaea860 <_malloc_mutex>) at /bracket/prod/7.0gson1/src/lib/libpthread/pthread_mutex.c:288
  #2  0x00000000005c5307 in _free_internal (ptr=0x2a71000) at gmalloc.c:1268
  #3  0x00000000005c5362 in free (ptr=0x2a71000) at gmalloc.c:1283
  #4  0x0000000000533e4e in xfree (block=0x2a71000) at alloc.c:735
  #5  0x000000000055abd2 in unbind_to (count=4, value=11933746) at eval.c:3304
  #6  0x00000000005562e4 in unwind_to_catch (catch=0xfc4500, value=33964454) at eval.c:1161
  #7  0x0000000000557203 in Fsignal (error_symbol=12003810, data=11933746) at eval.c:1557
  #8  0x00000000004daf23 in handle_interrupt (in_signal_handler=true) at keyboard.c:10440
  #9  0x00000000004dac4a in handle_interrupt_signal (sig=2) at keyboard.c:10288
  #10 0x00000000004e8092 in deliver_process_signal (sig=2, handler=0x4dabfd <handle_interrupt_signal>) at sysdep.c:1570
  #11 0x00000000004dac66 in deliver_interrupt_signal (sig=2) at keyboard.c:10295
  #12 0x00007f7ff5c9f3f0 in _opendir (name=<optimized out>) at /bracket/prod/7.0gson1/src/lib/libc/gen/opendir.c:72
  #13 0x00007fff00000002 in ?? ()
  #14 0x0000000000000000 in ?? ()

Calling free() from a signal handler is of course incorrect for the
same reasons calling printf() is.
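
One common way to stay out of this trouble, sketched here as an
illustration rather than as a patch to Emacs, is to have the handler do
nothing except set a flag, and to do the unwinding and freeing later from
the normal main loop.  The names quit_requested, note_interrupt,
install_deferred_quit and poll_quit are hypothetical.  The sketch uses
ISO C signal() for brevity; a real program would more likely use
sigaction().

```c
#include <signal.h>
#include <stdlib.h>

/* The only state the handler touches. */
static volatile sig_atomic_t quit_requested = 0;

/* Handler: record the request and return.  No malloc, free or stdio. */
static void note_interrupt(int sig)
{
    (void)sig;
    quit_requested = 1;
}

void install_deferred_quit(void)
{
    signal(SIGINT, note_interrupt);
}

/* Called from the main loop, outside any signal handler, where free()
   is safe.  Releases *SCRATCH if a quit was requested.
   Returns 1 if a quit was processed, 0 otherwise. */
int poll_quit(void **scratch)
{
    if (!quit_requested)
        return 0;
    quit_requested = 0;
    free(*scratch);
    *scratch = NULL;
    return 1;
}
```

The cost of this design is that a quit only takes effect when the main
loop next polls, so it cannot by itself break out of code that never
polls.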

I have not yet checked if emacs 25 has the same bug.
-- 
Andreas Gustafsson, gson@gson.org






* bug#22790: 24.5; Infinite loop involving malloc called from signal handler
  2016-07-07 16:24           ` Andreas Gustafsson
@ 2016-07-07 16:53             ` Eli Zaretskii
  2016-12-07 21:06               ` Glenn Morris
  0 siblings, 1 reply; 14+ messages in thread
From: Eli Zaretskii @ 2016-07-07 16:53 UTC (permalink / raw)
  To: Andreas Gustafsson; +Cc: gson, 22790

> Date: Thu, 7 Jul 2016 19:24:09 +0300
> Cc: Andreas Gustafsson <gson@gson.org>,
>     22790@debbugs.gnu.org
> From: Andreas Gustafsson <gson@gson.org>
> 
> Calling free() from a signal handler is of course incorrect for the
> same reasons calling printf() is.
> 
> I have not yet checked if emacs 25 has the same bug.

Please do.  Emacs 24 is no longer maintained.

Thanks.






* bug#22790: 24.5; Infinite loop involving malloc called from signal handler
  2016-07-07 16:53             ` Eli Zaretskii
@ 2016-12-07 21:06               ` Glenn Morris
  0 siblings, 0 replies; 14+ messages in thread
From: Glenn Morris @ 2016-12-07 21:06 UTC (permalink / raw)
  To: 22790-done

Eli Zaretskii wrote:

>> I have not yet checked if emacs 25 has the same bug.
>
> Please do.  Emacs 24 is no longer maintained.

No further response in months.
I suggest opening a new report if the issue persists in Emacs 25.





Thread overview: 14+ messages
2016-02-24 13:08 bug#22790: 24.5; Infinite loop involving malloc called from signal handler Andreas Gustafsson
2016-02-24 17:51 ` Eli Zaretskii
2016-02-24 18:17   ` Andreas Gustafsson
2016-02-29 14:44   ` Andreas Gustafsson
2016-03-04  9:42     ` Eli Zaretskii
2016-03-04 14:23       ` Andreas Gustafsson
2016-03-04 15:16         ` Eli Zaretskii
2016-03-05 10:17           ` Andreas Gustafsson
2016-03-05 11:26             ` Eli Zaretskii
2016-07-07 16:24           ` Andreas Gustafsson
2016-07-07 16:53             ` Eli Zaretskii
2016-12-07 21:06               ` Glenn Morris
2016-03-13  9:21         ` Daniel Colascione
2016-03-13 20:41           ` Philipp Stephani
