bug#21965: 24.5; Emacs freezes when canceling at open file


* bug#21965: 24.5; Emacs freezes when canceling at open file
@ 2015-11-20 19:18 Maneesh Yadav
  2015-11-20 21:37 ` John Wiegley
  2015-11-20 22:01 ` Eli Zaretskii
  0 siblings, 2 replies; 34+ messages in thread
From: Maneesh Yadav @ 2015-11-20 19:18 UTC (permalink / raw)
  To: 21965

I have been experiencing frequent 'freezing' after pressing 'Ctrl-g-g' (I
think I have defaulted to pressing it twice to make sure I've cancelled
out of a partial command; seeing "quit" makes me feel happier than
"C-x C-g is undefined").

This doesn't seem completely repeatable (but I did just do it on my
first try just now): I start 'emacs -Q' and press Ctrl-x Ctrl-f to see a
prompt in the minibuffer for a file, then press Ctrl-g Ctrl-g, and then
emacs freezes (it does not respond to Ctrl-g or any other keystrokes).
I don't think I'm naively invoking a process control key binding
(checked with 'stty -a').

Ugh, I am behind on my debug skills; I couldn't find out how to
compile my macports version with debug symbols.  After I kill it with
"pkill -USR2 emacs", I get this output (is this useful?):

0   emacs                               0x000000010009ebe9 emacs_backtrace + 87
1   emacs                               0x0000000100084ffa terminate_due_to_signal + 97
2   emacs                               0x000000010009d77b init_baud_rate + 0
3   emacs                               0x000000010008e5d3 handle_interrupt + 590
4   emacs                               0x000000010009e808 deliver_process_signal + 53
5   libsystem_platform.dylib            0x00007fff8c44df1a _sigtramp + 26
6   ???                                 0x0646666406466664 0x0 + 452161392385353316
7   libglib-2.0.0.dylib                 0x0000000100888ba1 g_mutex_lock + 26
8   libglib-2.0.0.dylib                 0x000000010084c284 g_main_context_acquire + 42
9   emacs                               0x00000001001447cc xg_select + 135
10  emacs                               0x000000010012dbe2 wait_reading_process_output + 2074
11  emacs                               0x00000001000075b5 sit_for + 260
12  emacs                               0x000000010008be56 read_char + 5024
13  emacs                               0x000000010008918c read_key_sequence + 1526
14  emacs                               0x0000000100088973 command_loop_1 + 3983
15  emacs                               0x00000001000f251e internal_condition_case + 251
16  emacs                               0x00000001000966c9 command_loop_2 + 53
17  emacs                               0x00000001000f1f48 internal_catch + 243
18  emacs                               0x00000001000871ea recursive_edit_1 + 206
19  emacs                               0x000000010008738d Frecursive_edit + 236
20  emacs                               0x000000010008648e main + 4658
21  libdyld.dylib                       0x00007fff8707a5c9 start + 1
22  ???                                 0x0000000000000001 0x0 + 1



In GNU Emacs 24.5.1 (x86_64-apple-darwin14.4.0)
 of 2015-08-25 on Maneeshs-MacBook-Air.local
Configured using:
 `configure --prefix=/opt/local --without-x --without-dbus
 --without-gconf --without-libotf --without-m17n-flt --without-gpm
 --without-gnutls --with-xml2 --infodir /opt/local/share/info/emacs
 'CFLAGS=-pipe -Os -arch x86_64' CPPFLAGS=-I/opt/local/include
 'LDFLAGS=-L/opt/local/lib -Wl,-headerpad_max_install_names -Wl,-no_pie
 -arch x86_64''

Important settings:
  value of $LC_ALL: en_US.UTF-8
  value of $LC_CTYPE: en_US.UTF-8
  value of $LANG: us
  locale-coding-system: utf-8-unix

Major mode: Lisp Interaction

Minor modes in effect:
  tooltip-mode: t
  electric-indent-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.
Making completion list... [2 times]

Load-path shadows:
None found.

Features:
(shadow sort gnus-util mail-extr emacsbug message format-spec rfc822 mml
mml-sec mm-decode mm-bodies mm-encode mail-parse rfc2231 mailabbrev
gmm-utils mailheader sendmail regexp-opt rfc2047 rfc2045 ietf-drums
mm-util help-fns mail-prsvr mail-utils help-mode easymenu xterm
time-date tooltip electric uniquify ediff-hook vc-hooks lisp-float-type
tabulated-list newcomment lisp-mode prog-mode register page menu-bar
rfn-eshadow timer select mouse jit-lock font-lock syntax facemenu
font-core frame cham georgian utf-8-lang misc-lang vietnamese tibetan
thai tai-viet lao korean japanese hebrew greek romanian slovak czech
european ethiopic indian cyrillic chinese case-table epa-hook
jka-cmpr-hook help simple abbrev minibuffer nadvice loaddefs button
faces cus-face macroexp files text-properties overlay sha1 md5 base64
format env code-pages mule custom widget hashtable-print-readable
backquote make-network-process gfilenotify multi-tty emacs)

Memory information:
((conses 16 74606 6954)
 (symbols 48 16694 0)
 (miscs 40 32 141)
 (strings 32 9404 4608)
 (string-bytes 1 257827)
 (vectors 16 7133)
 (vector-slots 8 335922 32343)
 (floats 8 52 663)
 (intervals 56 176 13)
 (buffers 960 12))






* bug#21965: 24.5; Emacs freezes when canceling at open file
  2015-11-20 19:18 bug#21965: 24.5; Emacs freezes when canceling at open file Maneesh Yadav
@ 2015-11-20 21:37 ` John Wiegley
  2015-11-20 21:47   ` Maneesh Yadav
  2015-11-20 22:01 ` Eli Zaretskii
  1 sibling, 1 reply; 34+ messages in thread
From: John Wiegley @ 2015-11-20 21:37 UTC (permalink / raw)
  To: Maneesh Yadav; +Cc: 21965

>>>>> Maneesh Yadav <maneeshkyadav@gmail.com> writes:

> I was experiencing frequent 'freezing' after pressing 'Ctrl-g-g' (I think I
> have defaulted to pressing twice to make sure I've cancelled out of partial
> command, seeing "quit" makes me feel happier than "C-x C-g is undefined").

Only twice? I think my default right now is 10-20 times, just for that extra
satisfying feel.

> This doesn't seem completely repeatable (but I did just do it on my first
> try just now): I start 'emacs -Q' and Ctrl-x Ctrl-f to see a prompt in the
> minibuffer for a file and then Ctrl-g Ctrl-g then emacs freezes (does not
> respond to Ctrl-g or any keystrokes...I don't think I'm naively invoking an
> process control key binding (checked with stty-a).

I've experienced deadlocks in the past as well, although recently there have
been none. What else are you doing when this happens? Are you using tramp,
running any background processes, etc.?

John






* bug#21965: 24.5; Emacs freezes when canceling at open file
  2015-11-20 21:37 ` John Wiegley
@ 2015-11-20 21:47   ` Maneesh Yadav
  2015-11-20 21:55     ` John Wiegley
  0 siblings, 1 reply; 34+ messages in thread
From: Maneesh Yadav @ 2015-11-20 21:47 UTC (permalink / raw)
  To: John Wiegley; +Cc: 21965

Hi John,
I'm running in a mac environment with nothing really going on in the
background (a web browser, mail, another shell or two open at a
prompt).  No tramp or fancy filesystem stuff or anything (even without
-Q I think my startup is fairly modest).  I should correct my
replication path: I believe I am pressing Ctrl-g-g...but I've also
replicated the bug by mashing Ctrl-g when the minibuffer is waiting
for input.  It is not clear that the minibuffer is the only context in
which this happens.

Thanks so much for your fast response, I hope this is a real bug.  I
held off on reporting it for a while since I wanted to be sure it
wasn't something idiotic I am doing.

> This doesn't seem completely repeatable (but I did just do it on my first
> try just now): I start 'emacs -Q' and Ctrl-x Ctrl-f to see a prompt in the
> minibuffer for a file and then Ctrl-g-g then emacs freezes (does not
> respond to Ctrl-g or any keystrokes...I don't think I'm naively invoking an
> process control key binding (checked with stty-a).







* bug#21965: 24.5; Emacs freezes when canceling at open file
  2015-11-20 21:47   ` Maneesh Yadav
@ 2015-11-20 21:55     ` John Wiegley
  2015-11-20 22:07       ` Maneesh Yadav
  0 siblings, 1 reply; 34+ messages in thread
From: John Wiegley @ 2015-11-20 21:55 UTC (permalink / raw)
  To: Maneesh Yadav; +Cc: 21965

>>>>> Maneesh Yadav <maneeshkyadav@gmail.com> writes:

> Thanks so much for your fast response, I hope this is real bug. I held off
> on reporting it for awhile since I wanted to be sure it wasn't something
> idiotic I am doing.

I'm pretty sure it's a real bug, since I've encountered it before too. I was
forced to "kill -9 PID" to restart Emacs. It used to happen to me at least
once a day, and always after having left Emacs idle for a while.

I wonder, does it also happen if you try the 'Mac port' version of Carbon
Emacs, by Yamamoto Mitsuharu? See:

    ftp://ftp.math.s.chiba-u.ac.jp/emacs/

I think that may have been how I resolved it here, and I haven't switched back
to Cocoa yet.

John


* bug#21965: 24.5; Emacs freezes when canceling at open file
  2015-11-20 19:18 bug#21965: 24.5; Emacs freezes when canceling at open file Maneesh Yadav
  2015-11-20 21:37 ` John Wiegley
@ 2015-11-20 22:01 ` Eli Zaretskii
  1 sibling, 0 replies; 34+ messages in thread
From: Eli Zaretskii @ 2015-11-20 22:01 UTC (permalink / raw)
  To: Maneesh Yadav; +Cc: 21965

> Date: Fri, 20 Nov 2015 11:18:41 -0800
> From: Maneesh Yadav <maneeshkyadav@gmail.com>
> 
> This doesn't seem completely repeatable (but I did just do it on my
> first try just now): I start 'emacs -Q' and Ctrl-x Ctrl-f to see a
> prompt in the minibuffer for a file and then Ctrl-g Ctrl-g then emacs
> freezes (does not respond to Ctrl-g or any keystrokes...I don't think
> I'm naively invoking an process control key binding (checked with stty-a).

Not reproducible here (but I'm not on Darwin).






* bug#21965: 24.5; Emacs freezes when canceling at open file
  2015-11-20 21:55     ` John Wiegley
@ 2015-11-20 22:07       ` Maneesh Yadav
  2015-11-20 22:45         ` Maneesh Yadav
  0 siblings, 1 reply; 34+ messages in thread
From: Maneesh Yadav @ 2015-11-20 22:07 UTC (permalink / raw)
  To: John Wiegley; +Cc: 21965

Indeed, I run into it frequently enough that it causes pain.

When I replicated the bug for this report, I did it right after emacs
started (and I was in the scratch buffer), FWIW; so idling (in this
case) wasn't necessary.  I'll try one of the other mac emacs versions.


I've tried it a few times just now and haven't been able to make it
occur again (though it happens on its own often enough that I know
I'll see it again soon).







* bug#21965: 24.5; Emacs freezes when canceling at open file
  2015-11-20 22:07       ` Maneesh Yadav
@ 2015-11-20 22:45         ` Maneesh Yadav
  2015-11-20 23:26           ` John Wiegley
  2015-11-21  7:29           ` Eli Zaretskii
  0 siblings, 2 replies; 34+ messages in thread
From: Maneesh Yadav @ 2015-11-20 22:45 UTC (permalink / raw)
  To: John Wiegley; +Cc: 21965

OK, I just did it again.  I had to mash Ctrl-g a little (still not
sure if it is Ctrl-g-g or Ctrl-g Ctrl-g that triggers it), but I got a
similar backtrace.  Just confirming I could replicate it 'on command'
(sort of).

emacs -Q
Auto-save? (y or n) n
Abort (and dump core)? (y or n) y
Fatal error 6: Abort trap

Backtrace:
0   emacs                               0x000000010009ebe9 emacs_backtrace + 87
1   emacs                               0x0000000100084ffa terminate_due_to_signal + 97
2   emacs                               0x000000010009d77b init_baud_rate + 0
3   emacs                               0x000000010008e5d3 handle_interrupt + 590
4   emacs                               0x000000010009e808 deliver_process_signal + 53
5   libsystem_platform.dylib            0x00007fff8c44df1a _sigtramp + 26
6   ???                                 0x0000000000000000 0x0 + 0
7   libglib-2.0.0.dylib                 0x0000000100888ba1 g_mutex_lock + 26
8   libglib-2.0.0.dylib                 0x000000010084c284 g_main_context_acquire + 42
9   emacs                               0x00000001001447cc xg_select + 135
10  emacs                               0x000000010012dbe2 wait_reading_process_output + 2074
11  emacs                               0x00000001000075b5 sit_for + 260
12  emacs                               0x000000010008be56 read_char + 5024
13  emacs                               0x000000010008918c read_key_sequence + 1526
14  emacs                               0x0000000100088973 command_loop_1 + 3983
15  emacs                               0x00000001000f251e internal_condition_case + 251
16  emacs                               0x00000001000966c9 command_loop_2 + 53
17  emacs                               0x00000001000f1f48 internal_catch + 243
18  emacs                               0x00000001000871ea recursive_edit_1 + 206
19  emacs                               0x000000010008738d Frecursive_edit + 236
20  emacs                               0x000000010008648e main + 4658
21  libdyld.dylib                       0x00007fff8707a5c9 start + 1







* bug#21965: 24.5; Emacs freezes when canceling at open file
  2015-11-20 22:45         ` Maneesh Yadav
@ 2015-11-20 23:26           ` John Wiegley
  2015-11-20 23:32             ` Maneesh Yadav
  2015-11-21  7:29           ` Eli Zaretskii
  1 sibling, 1 reply; 34+ messages in thread
From: John Wiegley @ 2015-11-20 23:26 UTC (permalink / raw)
  To: Maneesh Yadav; +Cc: 21965

>>>>> Maneesh Yadav <maneeshkyadav@gmail.com> writes:

> Ok I just did it again had to mash Ctrl-g a little (still not sure if it is
> Ctrl-g-g or Ctrl-g Ctrl-g that triggers it) but a similar backtrace. Just
> confirming I could replicate it 'on command' (sort of).

Ok, the next step will be to build your Emacs with debugging, and see if that
adds more to the trace. We may need to start adding some print statements, to
find out if `wait_reading_process_output' is really the blocking call.

John






* bug#21965: 24.5; Emacs freezes when canceling at open file
  2015-11-20 23:26           ` John Wiegley
@ 2015-11-20 23:32             ` Maneesh Yadav
  2015-11-20 23:54               ` John Wiegley
  0 siblings, 1 reply; 34+ messages in thread
From: Maneesh Yadav @ 2015-11-20 23:32 UTC (permalink / raw)
  To: John Wiegley; +Cc: 21965

Apologies for the debug newbieness...this shows that the debug symbols
are in there, correct?

maneeshyadav$ otool -Iv /opt/local/bin/emacs
/opt/local/bin/emacs:
Indirect symbols for (__TEXT,__stubs) 256 entries
address            index name
0x0000000100151d60  6480 _htmlReadMemory
0x0000000100151d66  6610 _xmlCheckVersion
0x0000000100151d6c  6611 _xmlCleanupParser
0x0000000100151d72  6612 _xmlDocGetRootElement
0x0000000100151d78  6613 _xmlFreeDoc
0x0000000100151d7e  6614 _xmlReadMemory
0x0000000100151d84  6588 _tgetent
0x0000000100151d8a  6589 _tgetflag
0x0000000100151d90  6590 _tgetnum
0x0000000100151d96  6591 _tgetstr
0x0000000100151d9c  6592 _tgoto
0x0000000100151da2  6596 _tparm
0x0000000100151da8  6597 _tputs
0x0000000100151dae  6352 __NSGetEnviron
0x0000000100151db4  6353 ___assert_rtn







* bug#21965: 24.5; Emacs freezes when canceling at open file
  2015-11-20 23:32             ` Maneesh Yadav
@ 2015-11-20 23:54               ` John Wiegley
  2015-11-21  1:46                 ` Maneesh Yadav
  0 siblings, 1 reply; 34+ messages in thread
From: John Wiegley @ 2015-11-20 23:54 UTC (permalink / raw)
  To: Maneesh Yadav; +Cc: 21965

>>>>> Maneesh Yadav <maneeshkyadav@gmail.com> writes:

> Apologies for the debug newbieness...this shows that the debug symbols are
> in there, correct?

Only the symbol table is there (i.e., it hasn't been stripped), not the debug
info (i.e., -g) that correlates TEXT addresses with file and line numbers.

John
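
To illustrate the distinction: the "symbol + offset" frames in the traces
earlier in this thread need only the symbol table, which is why they show
up even in a build without -g. Below is a minimal sketch in C, assuming a
platform that provides <execinfo.h> (glibc and macOS do). The helper name
capture_frames is made up for this example and is not part of Emacs.

```c
#include <execinfo.h>
#include <unistd.h>

/* Hypothetical helper: record up to MAX return addresses from the
   current call stack into BUF and print them to stderr.  Returns the
   number of frames captured.  */
int
capture_frames (void **buf, int max)
{
  int n = backtrace (buf, max);

  /* Addresses are resolved to "symbol + offset" from the symbol table
     alone; file names and line numbers would need debug info (-g) and
     a debugger or similar tool.  */
  backtrace_symbols_fd (buf, n, STDERR_FILENO);
  return n;
}
```

Calling capture_frames (frames, 32) with a local `void *frames[32]` prints
one line per frame in roughly the same shape as the traces above, with no
source locations.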






* bug#21965: 24.5; Emacs freezes when canceling at open file
  2015-11-20 23:54               ` John Wiegley
@ 2015-11-21  1:46                 ` Maneesh Yadav
  0 siblings, 0 replies; 34+ messages in thread
From: Maneesh Yadav @ 2015-11-21  1:46 UTC (permalink / raw)
  To: John Wiegley; +Cc: 21965

Ugh, this is not a part of emacs, but for the sake of anyone else who
has to do the same in the future: does anyone have hints as to
building the debug version of macports emacs?

I did:
port dir emacs
(backed up original portfile)
added compiler flags -O0 and -g

*not quite sure where or how to run dsym...

Assuming I can edit the portfile properly, I should be able to
build the debug emacs binary.  If anyone already has a clever way
of doing this, please let me know; otherwise I will trudge through
macports.



* bug#21965: 24.5; Emacs freezes when canceling at open file
  2015-11-20 22:45         ` Maneesh Yadav
  2015-11-20 23:26           ` John Wiegley
@ 2015-11-21  7:29           ` Eli Zaretskii
  2015-11-22  5:11             ` John Wiegley
  1 sibling, 1 reply; 34+ messages in thread
From: Eli Zaretskii @ 2015-11-21  7:29 UTC (permalink / raw)
  To: Maneesh Yadav; +Cc: jwiegley, 21965

> Date: Fri, 20 Nov 2015 14:45:23 -0800
> From: Maneesh Yadav <maneeshkyadav@gmail.com>
> Cc: 21965@debbugs.gnu.org
> 
> Ok I just did it again had to mash Ctrl-g a little (still not sure if
> it is Ctrl-g-g or Ctrl-g Ctrl-g that triggers it) but a similar
> backtrace.  Just confirming I could replicate it 'on command' (sort
> of).
> 
> emacs -Q
> Auto-save? (y or n) n
> Abort (and dump core)? (y or n) y
> Fatal error 6: Abort trap
> 
> Backtrace:
> 0   emacs                               0x000000010009ebe9 emacs_backtrace + 87
> 1   emacs                               0x0000000100084ffa terminate_due_to_signal + 97
> 2   emacs                               0x000000010009d77b init_baud_rate + 0
> 3   emacs                               0x000000010008e5d3 handle_interrupt + 590
> 4   emacs                               0x000000010009e808 deliver_process_signal + 53
> 5   libsystem_platform.dylib            0x00007fff8c44df1a _sigtramp + 26
> 6   ???                                 0x0000000000000000 0x0 + 0
> 7   libglib-2.0.0.dylib                 0x0000000100888ba1 g_mutex_lock + 26
> 8   libglib-2.0.0.dylib                 0x000000010084c284 g_main_context_acquire + 42
> 9   emacs                               0x00000001001447cc xg_select + 135
> 10  emacs                               0x000000010012dbe2 wait_reading_process_output + 2074
> 11  emacs                               0x00000001000075b5 sit_for + 260
> 12  emacs                               0x000000010008be56 read_char + 5024
> 13  emacs                               0x000000010008918c read_key_sequence + 1526

This backtrace simply says that Emacs called 'abort'.  And it did so
because the user told it so.

We need to know where it hangs or infloops prior to that.

Thanks.






* bug#21965: 24.5; Emacs freezes when canceling at open file
  2015-11-21  7:29           ` Eli Zaretskii
@ 2015-11-22  5:11             ` John Wiegley
  2015-11-22  5:15               ` Maneesh Yadav
  0 siblings, 1 reply; 34+ messages in thread
From: John Wiegley @ 2015-11-22  5:11 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Maneesh Yadav, 21965

>>>>> Eli Zaretskii <eliz@gnu.org> writes:

>> 7   libglib-2.0.0.dylib                 0x0000000100888ba1 g_mutex_lock + 26
>> 8   libglib-2.0.0.dylib                 0x000000010084c284 g_main_context_acquire + 42
>> 9   emacs                               0x00000001001447cc xg_select + 135
>> 10  emacs                               0x000000010012dbe2 wait_reading_process_output + 2074
>> 11  emacs                               0x00000001000075b5 sit_for + 260
>> 12  emacs                               0x000000010008be56 read_char + 5024
>> 13  emacs                               0x000000010008918c read_key_sequence + 1526

> This backtrace simply says that Emacs called 'abort'.  And it did so
> because the user told it so.

It might also be saying that Emacs deadlocked trying to obtain a mutex.

John
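
For readers unfamiliar with what "deadlocked trying to obtain a mutex"
looks like in practice, here is a minimal sketch in C with POSIX threads.
It is not Emacs or glib code, and the names context_lock and
probe_held_lock are invented for illustration. One thread holds a lock
while a second tries to take it; the second uses pthread_mutex_trylock so
that the demo reports EBUSY instead of hanging the way pthread_mutex_lock
would.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for a lock guarding some shared context.  */
static pthread_mutex_t context_lock = PTHREAD_MUTEX_INITIALIZER;

static void *
try_lock_from_other_thread (void *arg)
{
  int *result = arg;

  /* pthread_mutex_lock would block here for as long as the owner keeps
     the lock; trylock returns EBUSY instead, so this cannot hang.  */
  *result = pthread_mutex_trylock (&context_lock);
  if (*result == 0)
    pthread_mutex_unlock (&context_lock);
  return NULL;
}

/* Returns what a second thread sees when it tries to take the lock
   while the calling thread is holding it, or -1 if no thread could
   be created.  */
int
probe_held_lock (void)
{
  pthread_t other;
  int result = -1;

  pthread_mutex_lock (&context_lock);
  if (pthread_create (&other, NULL, try_lock_from_other_thread, &result) != 0)
    {
      pthread_mutex_unlock (&context_lock);
      return -1;
    }
  pthread_join (other, NULL);
  pthread_mutex_unlock (&context_lock);
  return result;
}
```

If the holder never unlocked, a plain pthread_mutex_lock in the second
thread would wait indefinitely, which is one way a process can end up
unresponsive to all input.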






* bug#21965: 24.5; Emacs freezes when canceling at open file
  2015-11-22  5:11             ` John Wiegley
@ 2015-11-22  5:15               ` Maneesh Yadav
  2015-11-23 21:29                 ` Maneesh Yadav
  0 siblings, 1 reply; 34+ messages in thread
From: Maneesh Yadav @ 2015-11-22  5:15 UTC (permalink / raw)
  To: John Wiegley; +Cc: 21965

The trace is what I saw after I sent pkill -USR2 emacs.  I don't
quite know how to read it (other than the vague references to
mutexes...no idea if that is actually relevant).  Still trying to
figure out how to get macports to compile a debug emacs...hopefully I
will figure it out this weekend.







* bug#21965: 24.5; Emacs freezes when canceling at open file
  2015-11-22  5:15               ` Maneesh Yadav
@ 2015-11-23 21:29                 ` Maneesh Yadav
  2015-11-23 22:17                   ` John Wiegley
  0 siblings, 1 reply; 34+ messages in thread
From: Maneesh Yadav @ 2015-11-23 21:29 UTC (permalink / raw)
  To: John Wiegley; +Cc: 21965

Flog me if I am not doing this right.  It seems that +debug on macports
is the easy way to make debug builds (an old thread seemed to suggest
that macports rejected this idea...but I guess it was eventually
accepted).  So I installed emacs +debug and reproduced the crash,
attached to emacs via lldb, and got this backtrace (which looks a lot
like the previous one; can I provide better info somehow?):

(lldb) process attach --name emacs
Process 23166 stopped
* thread #1: tid = 0x4d18b, 0x00007fff8a861166 libsystem_kernel.dylib`__psynch_mutexwait + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
    frame #0: 0x00007fff8a861166 libsystem_kernel.dylib`__psynch_mutexwait + 10
libsystem_kernel.dylib`__psynch_mutexwait:
->  0x7fff8a861166 <+10>: jae    0x7fff8a861170            ; <+20>
    0x7fff8a861168 <+12>: movq   %rax, %rdi
    0x7fff8a86116b <+15>: jmp    0x7fff8a85cc53            ; cerror_nocancel
    0x7fff8a861170 <+20>: retq

Executable module set to "/opt/local/bin/emacs".
Architecture set to: x86_64h-apple-macosx.
(lldb) thread backtrace
* thread #1: tid = 0x4d18b, 0x00007fff8a861166 libsystem_kernel.dylib`__psynch_mutexwait + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x00007fff8a861166 libsystem_kernel.dylib`__psynch_mutexwait + 10
    frame #1: 0x00007fff853b5696 libsystem_pthread.dylib`_pthread_mutex_lock + 480
    frame #2: 0x0000000100a17ba1 libglib-2.0.0.dylib`g_mutex_lock + 26
    frame #3: 0x00000001009db284 libglib-2.0.0.dylib`g_main_context_acquire + 42
    frame #4: 0x000000010024fc47 emacs`xg_select + 231
    frame #5: 0x0000000100225c3d emacs`wait_reading_process_output + 3757
    frame #6: 0x0000000100008cb6 emacs`sit_for + 582
    frame #7: 0x0000000100108f00 emacs`read_char + 4496
    frame #8: 0x0000000100104edd emacs`read_key_sequence + 1757
    frame #9: 0x0000000100103cec emacs`command_loop_1 + 1212
    frame #10: 0x00000001001bf04e emacs`internal_condition_case + 382
    frame #11: 0x000000010011ce09 emacs`command_loop_2 + 41
    frame #12: 0x00000001001be696 emacs`internal_catch + 342
    frame #13: 0x0000000100102ddb emacs`command_loop + 187
    frame #14: 0x0000000100102c9f emacs`recursive_edit_1 + 127
    frame #15: 0x0000000100102f87 emacs`Frecursive_edit + 327
    frame #16: 0x0000000100100fd3 emacs`main + 4387
    frame #17: 0x00007fff8707a5c9 libdyld.dylib`start + 1
    frame #18: 0x00007fff8707a5c9 libdyld.dylib`start + 1







* bug#21965: 24.5; Emacs freezes when canceling at open file
  2015-11-23 21:29                 ` Maneesh Yadav
@ 2015-11-23 22:17                   ` John Wiegley
  2015-11-24  0:30                     ` Maneesh Yadav
  2015-11-24  3:34                     ` Eli Zaretskii
  0 siblings, 2 replies; 34+ messages in thread
From: John Wiegley @ 2015-11-23 22:17 UTC (permalink / raw)
  To: Maneesh Yadav; +Cc: 21965

>>>>> Maneesh Yadav <maneeshkyadav@gmail.com> writes:

> Flog me if I am not doing this right. Seems that +debug on macports is the
> easy to make debug compiles (an old thread seemed to suggest that macports
> rejected this idea...but I guess it was eventually accepted). So installed
> emacs +debug and reproduced the crash, attached to emacs via lldb and got
> this backtrace (which looks a lot like the previous, can I provide better
> info somehow?):

We're still missing file and line numbers for the Emacs code, which is odd.
But not terribly important, since the lockup is happening inside glib, it
appears.

>     frame #3: 0x00000001009db284
>     libglib-2.0.0.dylib`g_main_context_acquire + 42

So, here's that function, more or less:

    gboolean 
    g_main_context_acquire (GMainContext *context)
    {
      gboolean result = FALSE;
      GThread *self = G_THREAD_SELF;
    
      if (context == NULL)
        context = g_main_context_default ();
      
      LOCK_CONTEXT (context);
      /* ... */
    }

We're blocked waiting on the context. The question then being: who else has
that context? Is it another Emacs thread?

Eli, does this ring any bells?

John





^ permalink raw reply	[flat|nested] 34+ messages in thread

* bug#21965: 24.5; Emacs freezes when canceling at open file
  2015-11-23 22:17                   ` John Wiegley
@ 2015-11-24  0:30                     ` Maneesh Yadav
  2015-11-24  3:39                       ` Eli Zaretskii
  2015-11-24  3:34                     ` Eli Zaretskii
  1 sibling, 1 reply; 34+ messages in thread
From: Maneesh Yadav @ 2015-11-24  0:30 UTC (permalink / raw)
  To: John Wiegley; +Cc: 21965

I don't understand the apple framework/glib event handling structure
and I doubt this is terribly informative, but for the sake of
completeness the output of 'thread list' is pasted below:

(lldb) thread list
Process 23166 stopped
* thread #1: tid = 0x4d18b, 0x00007fff8a861166 libsystem_kernel.dylib`__psynch_mutexwait + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  thread #2: tid = 0x4d18c, 0x00007fff8a8613fa libsystem_kernel.dylib`__select + 10, name = 'gmain'
(lldb)







^ permalink raw reply	[flat|nested] 34+ messages in thread

* bug#21965: 24.5; Emacs freezes when canceling at open file
  2015-11-23 22:17                   ` John Wiegley
  2015-11-24  0:30                     ` Maneesh Yadav
@ 2015-11-24  3:34                     ` Eli Zaretskii
  2015-11-24  3:39                       ` John Wiegley
  1 sibling, 1 reply; 34+ messages in thread
From: Eli Zaretskii @ 2015-11-24  3:34 UTC (permalink / raw)
  To: John Wiegley; +Cc: maneeshkyadav, 21965

> From: John Wiegley <jwiegley@gmail.com>
> Cc: Eli Zaretskii <eliz@gnu.org>,  21965@debbugs.gnu.org
> Date: Mon, 23 Nov 2015 14:17:17 -0800
> 
> We're still missing file and line numbers for the Emacs code, which is odd.
> But not terribly important, since the lockup is happening inside glib, it
> appears.
> 
> >     frame #3: 0x00000001009db284
> >     libglib-2.0.0.dylib`g_main_context_acquire + 42
> 
> So, here's that function, more or less:
> 
>     gboolean 
>     g_main_context_acquire (GMainContext *context)
>     {
>       gboolean result = FALSE;
>       GThread *self = G_THREAD_SELF;
>     
>       if (context == NULL)
>         context = g_main_context_default ();
>       
>       LOCK_CONTEXT (context);
>       /* ... */
>     }
> 
> We're blocked waiting on the context. The question then being: who else has
> that context? Is it another Emacs thread?
> 
> Eli, does this ring any bells?

No.  And I'm not even convinced that's where we are blocked.  It could
be that this is part of a loop that Emacs is waiting in.  To prove
that we are blocked there, one needs to attach the debugger many times
and see that the debugger finds Emacs at _exactly_ the same
instruction.  Or, after attaching, step Emacs and see that it cannot
move even a single instruction.

If this is really what happens, and Emacs cannot acquire a mutex, that
would mean someone is holding that mutex, and the question is who that
someone is.





^ permalink raw reply	[flat|nested] 34+ messages in thread

* bug#21965: 24.5; Emacs freezes when canceling at open file
  2015-11-24  0:30                     ` Maneesh Yadav
@ 2015-11-24  3:39                       ` Eli Zaretskii
  0 siblings, 0 replies; 34+ messages in thread
From: Eli Zaretskii @ 2015-11-24  3:39 UTC (permalink / raw)
  To: Maneesh Yadav; +Cc: jwiegley, 21965

> Date: Mon, 23 Nov 2015 16:30:59 -0800
> From: Maneesh Yadav <maneeshkyadav@gmail.com>
> Cc: Eli Zaretskii <eliz@gnu.org>, 21965@debbugs.gnu.org
> 
> I don't understand the apple framework/glib event handling structure
> and I doubt this is terribly informative, but for the sake of
> completeness the output of 'thread list' is pasted below:
> 
> (lldb) thread list
> 
> Process 23166 stopped
> 
> * thread #1: tid = 0x4d18b, 0x00007fff8a861166
> libsystem_kernel.dylib`__psynch_mutexwait + 10, queue =
> 'com.apple.main-thread', stop reason = signal SIGSTOP
> 
>   thread #2: tid = 0x4d18c, 0x00007fff8a8613fa
> libsystem_kernel.dylib`__select + 10, name = 'gmain'
> 
> (lldb)

Where did that STOP signal come from?  Could that be the debugger
itself?





^ permalink raw reply	[flat|nested] 34+ messages in thread

* bug#21965: 24.5; Emacs freezes when canceling at open file
  2015-11-24  3:34                     ` Eli Zaretskii
@ 2015-11-24  3:39                       ` John Wiegley
  2015-11-24 22:51                         ` Maneesh Yadav
  0 siblings, 1 reply; 34+ messages in thread
From: John Wiegley @ 2015-11-24  3:39 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: maneeshkyadav, 21965

>>>>> Eli Zaretskii <eliz@gnu.org> writes:

> No. And I'm not even convinced that's where we are blocked. It could be that
> this is part of a loop that Emacs is waiting in. To prove that we are
> blocked there, one needs to attach the debugger many times and see that the
> debugger finds Emacs at _exactly_ the same instruction. Or, after attaching,
> step Emacs and see that it cannot move even a single instruction.

Fair enough.  The docs for g_main_context_acquire do say that it should return
immediately, if no other thread is holding the lock.

John





^ permalink raw reply	[flat|nested] 34+ messages in thread

* bug#21965: 24.5; Emacs freezes when canceling at open file
  2015-11-24  3:39                       ` John Wiegley
@ 2015-11-24 22:51                         ` Maneesh Yadav
  2015-11-24 22:58                           ` Maneesh Yadav
  0 siblings, 1 reply; 34+ messages in thread
From: Maneesh Yadav @ 2015-11-24 22:51 UTC (permalink / raw)
  To: John Wiegley; +Cc: 21965

FWIW I just triggered the condition 3 times in a row (I seem to be
getting better at triggering it) and attached with lldb each time;
the output is below (the backtraces look the same as before as well).
Looks like the same instruction?


#1

Process 25176 stopped
* thread #1: tid = 0x7369a, 0x00007fff8a861166 libsystem_kernel.dylib`__psynch_mutexwait + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
    frame #0: 0x00007fff8a861166 libsystem_kernel.dylib`__psynch_mutexwait + 10
libsystem_kernel.dylib`__psynch_mutexwait:
->  0x7fff8a861166 <+10>: jae    0x7fff8a861170            ; <+20>
    0x7fff8a861168 <+12>: movq   %rax, %rdi
    0x7fff8a86116b <+15>: jmp    0x7fff8a85cc53            ; cerror_nocancel
    0x7fff8a861170 <+20>: retq

(lldb) thread backtrace
* thread #1: tid = 0x7369a, 0x00007fff8a861166 libsystem_kernel.dylib`__psynch_mutexwait + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x00007fff8a861166 libsystem_kernel.dylib`__psynch_mutexwait + 10
    frame #1: 0x00007fff853b5696 libsystem_pthread.dylib`_pthread_mutex_lock + 480
    frame #2: 0x0000000100a17ba1 libglib-2.0.0.dylib`g_mutex_lock + 26
    frame #3: 0x00000001009db284 libglib-2.0.0.dylib`g_main_context_acquire + 42
    frame #4: 0x000000010024fc47 emacs`xg_select + 231
...

#2

Process 25238 stopped
* thread #1: tid = 0x742be, 0x00007fff8a861166 libsystem_kernel.dylib`__psynch_mutexwait + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
    frame #0: 0x00007fff8a861166 libsystem_kernel.dylib`__psynch_mutexwait + 10
libsystem_kernel.dylib`__psynch_mutexwait:
->  0x7fff8a861166 <+10>: jae    0x7fff8a861170            ; <+20>
    0x7fff8a861168 <+12>: movq   %rax, %rdi
    0x7fff8a86116b <+15>: jmp    0x7fff8a85cc53            ; cerror_nocancel
    0x7fff8a861170 <+20>: retq

#3

Process 25251 stopped
* thread #1: tid = 0x746f0, 0x00007fff8a861166 libsystem_kernel.dylib`__psynch_mutexwait + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
    frame #0: 0x00007fff8a861166 libsystem_kernel.dylib`__psynch_mutexwait + 10
libsystem_kernel.dylib`__psynch_mutexwait:
->  0x7fff8a861166 <+10>: jae    0x7fff8a861170            ; <+20>
    0x7fff8a861168 <+12>: movq   %rax, %rdi
    0x7fff8a86116b <+15>: jmp    0x7fff8a85cc53            ; cerror_nocancel
    0x7fff8a861170 <+20>: retq







^ permalink raw reply	[flat|nested] 34+ messages in thread

* bug#21965: 24.5; Emacs freezes when canceling at open file
  2015-11-24 22:51                         ` Maneesh Yadav
@ 2015-11-24 22:58                           ` Maneesh Yadav
  2015-11-25  1:02                             ` John Wiegley
  0 siblings, 1 reply; 34+ messages in thread
From: Maneesh Yadav @ 2015-11-24 22:58 UTC (permalink / raw)
  To: John Wiegley; +Cc: 21965

After lldb "thread continue" the process runs, but emacs remains
unresponsive.

"thread step-in" does advance the program counter (output below)...but
I'm not really sure what that implies.


(lldb) thread step-in
Process 25251 stopped
* thread #1: tid = 0x746f0, 0x00007fff8a861168 libsystem_kernel.dylib`__psynch_mutexwait + 12, queue = 'com.apple.main-thread', stop reason = instruction step into
    frame #0: 0x00007fff8a861168 libsystem_kernel.dylib`__psynch_mutexwait + 12
libsystem_kernel.dylib`__psynch_mutexwait:
->  0x7fff8a861168 <+12>: movq   %rax, %rdi
    0x7fff8a86116b <+15>: jmp    0x7fff8a85cc53            ; cerror_nocancel
    0x7fff8a861170 <+20>: retq
    0x7fff8a861171 <+21>: nop

(lldb) thread step-in
Process 25251 stopped
* thread #1: tid = 0x746f0, 0x00007fff8a86116b libsystem_kernel.dylib`__psynch_mutexwait + 15, queue = 'com.apple.main-thread', stop reason = instruction step into
    frame #0: 0x00007fff8a86116b libsystem_kernel.dylib`__psynch_mutexwait + 15
libsystem_kernel.dylib`__psynch_mutexwait:
->  0x7fff8a86116b <+15>: jmp    0x7fff8a85cc53            ; cerror_nocancel
    0x7fff8a861170 <+20>: retq
    0x7fff8a861171 <+21>: nop
    0x7fff8a861172 <+22>: nop

(lldb) thread step-in
Process 25251 stopped
* thread #1: tid = 0x746f0, 0x00007fff8a85cc53 libsystem_kernel.dylib`cerror_nocancel, queue = 'com.apple.main-thread', stop reason = instruction step into
    frame #0: 0x00007fff8a85cc53 libsystem_kernel.dylib`cerror_nocancel
libsystem_kernel.dylib`cerror_nocancel:
->  0x7fff8a85cc53 <+0>:  movl   %edi, -0x14ad19d9(%rip)   ; errno
    0x7fff8a85cc59 <+6>:  movq   %gs:0x8, %rax
    0x7fff8a85cc62 <+15>: testq  %rax, %rax
    0x7fff8a85cc65 <+18>: je     0x7fff8a85cc69            ; <+22>






^ permalink raw reply	[flat|nested] 34+ messages in thread

* bug#21965: 24.5; Emacs freezes when canceling at open file
  2015-11-24 22:58                           ` Maneesh Yadav
@ 2015-11-25  1:02                             ` John Wiegley
  2015-11-25  1:15                               ` Maneesh Yadav
  0 siblings, 1 reply; 34+ messages in thread
From: John Wiegley @ 2015-11-25  1:02 UTC (permalink / raw)
  To: Maneesh Yadav; +Cc: 21965

>>>>> Maneesh Yadav <maneeshkyadav@gmail.com> writes:

> lldb "thread continue" runs after, but emacs remains unresponsive "thread
> step-in" does increment the instruction counter (output below)...but not
> really sure what that implies.

Maneesh,

Can you show me the full backtrace of all threads when it deadlocks? I just
realized that xg_select is called from wait_reading_process_output, which I
believe means it's callable from multiple threads at once.

The behavior of g_main_context_acquire is *documented* to never block, but
rather to return FALSE if another thread has the context; if the behavior has
been changed to block on OS X -- and the thread with the context is calling
pselect() and waiting to return -- this would match your experience.
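
A minimal sketch of the distinction drawn above, written with plain pthreads rather than glib. The names demo_context, demo_acquire, demo_release and demo_acquire_from_other_thread are hypothetical, and this is not glib's code. It only illustrates how an acquire call can report failure instead of waiting for ownership while still taking a short internal lock; that internal lock is the one place such a call could block if another thread never released it.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a main context.  The mutex guards the
   ownership fields and is meant to be held only briefly.  */
struct demo_context
{
  pthread_mutex_t mutex;
  bool owned;
  pthread_t owner;
  unsigned owner_count;
};

static void
demo_context_init (struct demo_context *ctx)
{
  pthread_mutex_init (&ctx->mutex, NULL);
  ctx->owned = false;
  ctx->owner_count = 0;
}

/* Try to become the owner.  Returns false right away when another
   thread owns the context.  The only blocking point is the internal
   mutex lock at the top.  */
static bool
demo_acquire (struct demo_context *ctx)
{
  bool result = false;
  pthread_t self = pthread_self ();

  pthread_mutex_lock (&ctx->mutex);
  if (!ctx->owned)
    {
      ctx->owned = true;
      ctx->owner = self;
      ctx->owner_count = 0;
    }
  if (pthread_equal (ctx->owner, self))
    {
      ctx->owner_count++;
      result = true;
    }
  pthread_mutex_unlock (&ctx->mutex);
  return result;
}

/* Drop one level of ownership held by the calling thread.  */
static void
demo_release (struct demo_context *ctx)
{
  pthread_mutex_lock (&ctx->mutex);
  if (ctx->owned && pthread_equal (ctx->owner, pthread_self ())
      && --ctx->owner_count == 0)
    ctx->owned = false;
  pthread_mutex_unlock (&ctx->mutex);
}

/* Thread body: try to acquire, release again on success, and report
   the outcome through the return value.  */
static void *
demo_other_thread (void *arg)
{
  struct demo_context *ctx = arg;
  bool ok = demo_acquire (ctx);

  if (ok)
    demo_release (ctx);
  return ok ? ctx : NULL;
}

/* Run demo_acquire on a second thread and return what it saw.  */
static bool
demo_acquire_from_other_thread (struct demo_context *ctx)
{
  pthread_t thread;
  void *ret = NULL;

  pthread_create (&thread, NULL, demo_other_thread, ctx);
  pthread_join (thread, &ret);
  return ret != NULL;
}
```

While the first thread owns the context, demo_acquire_from_other_thread returns false without waiting; once the owner has released, it returns true.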

John

^ permalink raw reply	[flat|nested] 34+ messages in thread

* bug#21965: 24.5; Emacs freezes when canceling at open file
  2015-11-25  1:02                             ` John Wiegley
@ 2015-11-25  1:15                               ` Maneesh Yadav
  2015-11-25  1:38                                 ` John Wiegley
  0 siblings, 1 reply; 34+ messages in thread
From: Maneesh Yadav @ 2015-11-25  1:15 UTC (permalink / raw)
  To: John Wiegley; +Cc: 21965

A disconcerting finding: I could not replicate the bug while briefly
commandeering a colleague's machine (a mac which I installed macports
emacs onto).

I still am uncomfortable with my comprehension of the lldb output but
here is 'backtrace all' after triggering the condition

(lldb) thread backtrace all
* thread #1: tid = 0x7d73b, 0x00007fff8a861166 libsystem_kernel.dylib`__psynch_mutexwait + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x00007fff8a861166 libsystem_kernel.dylib`__psynch_mutexwait + 10
    frame #1: 0x00007fff853b5696 libsystem_pthread.dylib`_pthread_mutex_lock + 480
    frame #2: 0x0000000100a17ba1 libglib-2.0.0.dylib`g_mutex_lock + 26
    frame #3: 0x00000001009db284 libglib-2.0.0.dylib`g_main_context_acquire + 42
    frame #4: 0x000000010024fc47 emacs`xg_select + 231
    frame #5: 0x0000000100225c3d emacs`wait_reading_process_output + 3757
    frame #6: 0x0000000100008cb6 emacs`sit_for + 582
    frame #7: 0x0000000100108f00 emacs`read_char + 4496
    frame #8: 0x0000000100104edd emacs`read_key_sequence + 1757
    frame #9: 0x0000000100103cec emacs`command_loop_1 + 1212
    frame #10: 0x00000001001bf04e emacs`internal_condition_case + 382
    frame #11: 0x000000010011ce09 emacs`command_loop_2 + 41
    frame #12: 0x00000001001be696 emacs`internal_catch + 342
    frame #13: 0x0000000100102ddb emacs`command_loop + 187
    frame #14: 0x0000000100102c9f emacs`recursive_edit_1 + 127
    frame #15: 0x0000000100102f87 emacs`Frecursive_edit + 327
    frame #16: 0x0000000100100fd3 emacs`main + 4387
    frame #17: 0x00007fff8707a5c9 libdyld.dylib`start + 1
    frame #18: 0x00007fff8707a5c9 libdyld.dylib`start + 1

  thread #2: tid = 0x7d73c, 0x00007fff8a8613fa libsystem_kernel.dylib`__select + 10, name = 'gmain'
    frame #0: 0x00007fff8a8613fa libsystem_kernel.dylib`__select + 10
    frame #1: 0x00000001009e8aef libglib-2.0.0.dylib`g_poll + 399
    frame #2: 0x00000001009dd667 libglib-2.0.0.dylib`g_main_context_iterate + 326
    frame #3: 0x00000001009dd716 libglib-2.0.0.dylib`g_main_context_iteration + 55
    frame #4: 0x00000001009de809 libglib-2.0.0.dylib`glib_worker_main + 53
    frame #5: 0x00000001009fdcdb libglib-2.0.0.dylib`g_thread_proxy + 90
    frame #6: 0x00007fff853b805a libsystem_pthread.dylib`_pthread_body + 131
    frame #7: 0x00007fff853b7fd7 libsystem_pthread.dylib`_pthread_start + 176
    frame #8: 0x00007fff853b53ed libsystem_pthread.dylib`thread_start + 13






^ permalink raw reply	[flat|nested] 34+ messages in thread

* bug#21965: 24.5; Emacs freezes when canceling at open file
  2015-11-25  1:15                               ` Maneesh Yadav
@ 2015-11-25  1:38                                 ` John Wiegley
  2015-11-25  1:46                                   ` Maneesh Yadav
  0 siblings, 1 reply; 34+ messages in thread
From: John Wiegley @ 2015-11-25  1:38 UTC (permalink / raw)
  To: Maneesh Yadav; +Cc: 21965

>>>>> Maneesh Yadav <maneeshkyadav@gmail.com> writes:

> I still am uncomfortable with my comprehension of the lldb output but here
> is 'backtrace all' after triggering the condition

Ok! Now we know what the deadlock situation is:

Thread #2:

>     frame #3: 0x00000001009dd716 libglib-2.0.0.dylib`g_main_context_iteration + 55
...
>     frame #0: 0x00007fff8a8613fa libsystem_kernel.dylib`__select + 10

Thread #1:

>     frame #3: 0x00000001009db284 libglib-2.0.0.dylib`g_main_context_acquire + 42

It turns out that both g_main_context_acquire and g_main_context_iteration
(when called with NULL) call LOCK_CONTEXT on the "default context".

Now, I *think* the context should be different between these two threads: one
should be the default context, and one should be the worker context. But it
_looks_ like Thread #1 is being locked out by Thread #2.

In fact, reading the glib code, if the call to g_once_init_enter returns FALSE
within g_get_worker_context, then the worker context will be NULL! Which seems
like a subtle bug waiting to happen, and might be what's biting us.

To go deeper, we may need to build a separate copy of glib and start putting
some print statements in to find out why there is lock contention. Would you
be up for that? I'd like to know if this is happening in g_get_worker_context.

John

^ permalink raw reply	[flat|nested] 34+ messages in thread

* bug#21965: 24.5; Emacs freezes when canceling at open file
  2015-11-25  1:38                                 ` John Wiegley
@ 2015-11-25  1:46                                   ` Maneesh Yadav
  2015-11-25  1:50                                     ` John Wiegley
  0 siblings, 1 reply; 34+ messages in thread
From: Maneesh Yadav @ 2015-11-25  1:46 UTC (permalink / raw)
  To: John Wiegley; +Cc: 21965

For sure...let me make sure I can insert print statements into glib in
the context of my macports install (and get a little better
understanding of the glib event loop). Will write back once that is
up.







^ permalink raw reply	[flat|nested] 34+ messages in thread

* bug#21965: 24.5; Emacs freezes when canceling at open file
  2015-11-25  1:46                                   ` Maneesh Yadav
@ 2015-11-25  1:50                                     ` John Wiegley
  2015-11-25 18:49                                       ` Maneesh Yadav
  0 siblings, 1 reply; 34+ messages in thread
From: John Wiegley @ 2015-11-25  1:50 UTC (permalink / raw)
  To: Maneesh Yadav; +Cc: 21965

>>>>> Maneesh Yadav <maneeshkyadav@gmail.com> writes:

>> To go deeper, we may need to build a separate copy of glib and start
>> putting some print statements in to find out why there is lock contention.
>> Would you be up for that? I'd like to know if this is happening in
>> g_get_worker_context.

I've read further, and since "static gsize initialised;" must initialize to
zero, it's hard for me to see how this code could be wrong just from reading
it.
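
That reasoning can be sketched with pthread_once standing in for glib's g_once_init_enter / g_once_init_leave pair. The substitution is an assumption made to keep the example self-contained, and demo_worker_context, demo_init_worker_context and demo_get_worker_context are hypothetical names, not glib code. With a one-time initializer of this shape, a caller that does not run the initializer itself still returns only after the initializer has finished, so it should not be able to observe a NULL context.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the worker context object.  */
struct demo_context
{
  int init_runs;   /* how many times the initializer has run */
};

static struct demo_context demo_storage;
static struct demo_context *demo_worker_context;   /* NULL until set up */
static pthread_once_t demo_once = PTHREAD_ONCE_INIT;

/* Runs exactly once, however many threads race to get here.  */
static void
demo_init_worker_context (void)
{
  demo_storage.init_runs++;
  demo_worker_context = &demo_storage;
}

/* pthread_once returns only after the initializer has completed, so
   the pointer handed back here has already been assigned.  */
static struct demo_context *
demo_get_worker_context (void)
{
  pthread_once (&demo_once, demo_init_worker_context);
  return demo_worker_context;
}
```

Repeated calls return the same non-NULL pointer, and init_runs stays at 1.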

I'd like to find every line of code in glib that calls LOCK_CONTEXT or
UNLOCK_CONTEXT, and print out:

    Function, file, line, lock or unlock, pointer value of context

That should help us narrow it down.
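
A rough sketch of such tracing macros, using pthreads and a stand-in structure so it compiles on its own. DEMO_LOCK_CONTEXT, DEMO_UNLOCK_CONTEXT, demo_trace and struct demo_context are hypothetical names, not glib's. Wrapping the body in do { ... } while (0) is an assumption about how to keep the macros usable as ordinary statements, so the trailing semicolons at call sites and unbraced if/else bodies can stay as they are.

```c
#include <pthread.h>
#include <stdio.h>

/* Hypothetical stand-in for the structure whose mutex is traced.  */
struct demo_context
{
  pthread_mutex_t mutex;
};

/* Print function, file, line, operation and context pointer.
   stderr is used because it is unbuffered by default.  */
static void
demo_trace (const char *op, const void *ctx,
            const char *file, int line, const char *func)
{
  fprintf (stderr, "DEBUG: about to %s %p: %s, %d, %s\n",
           op, ctx, file, line, func);
}

/* do/while (0) makes each macro expand to exactly one statement.  */
#define DEMO_LOCK_CONTEXT(ctx)                                        \
  do                                                                  \
    {                                                                 \
      demo_trace ("LOCK", (ctx), __FILE__, __LINE__, __func__);       \
      pthread_mutex_lock (&(ctx)->mutex);                             \
    }                                                                 \
  while (0)

#define DEMO_UNLOCK_CONTEXT(ctx)                                      \
  do                                                                  \
    {                                                                 \
      demo_trace ("UNLOCK", (ctx), __FILE__, __LINE__, __func__);     \
      pthread_mutex_unlock (&(ctx)->mutex);                           \
    }                                                                 \
  while (0)
```

Matching up LOCK and UNLOCK lines that carry the same pointer value then shows which context is left locked.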

John





^ permalink raw reply	[flat|nested] 34+ messages in thread

* bug#21965: 24.5; Emacs freezes when canceling at open file
  2015-11-25  1:50                                     ` John Wiegley
@ 2015-11-25 18:49                                       ` Maneesh Yadav
  2015-11-25 18:59                                         ` John Wiegley
  2016-02-18 21:46                                         ` Maneesh Yadav
  0 siblings, 2 replies; 34+ messages in thread
From: Maneesh Yadav @ 2015-11-25 18:49 UTC (permalink / raw)
  To: John Wiegley; +Cc: 21965

Weird.

I patched glib2 this way: I just overrode the macros and removed the
semicolons on the macro invocations, which seemed to be the best way to
deal with if statements that weren't wrapped in curly braces.  I just
realized my strings don't distinguish UNLOCK from LOCK; not a big deal
since the line numbers are there, and it's fixed for next time.  In each
pair below, the first definition is the original and the second is my
override:

#define LOCK_CONTEXT(context) g_mutex_lock (&context->mutex)

#define LOCK_CONTEXT(context) \
  {printf("MANEESH GLIB DEBUG: About to LOCK: %s, %d, %s\n", \
          __FILE__, __LINE__, __FUNCTION__); \
   g_mutex_lock (&context->mutex);}

#define UNLOCK_CONTEXT(context) g_mutex_unlock (&context->mutex)

#define UNLOCK_CONTEXT(context) \
  {printf("MANEESH GLIB DEBUG: About to LOCK: %s, %d, %s\n", \
          __FILE__, __LINE__, __FUNCTION__); \
   g_mutex_unlock (&context->mutex);}


Grabbing the output (emacs -Q > test.out) shows the stall mid print:

MANEESH GLIB DEBUG: About to LOCK: gmain.c, 3208, g_main_context_acquire
MANEESH GLIB DEBUG: About to LOCK: gmain.c, 3222, g_main_context_acquire
MANEESH GLIB DEBUG: About to LOCK: gmain.c, 3801, g_main_context_iterate
MANEESH GLIB DEBUG: About to LOCK: gmain.c, 3812, g_main_context_iterate
MANEESH GLIB DEBUG: About to LOCK: gmain.c, 3376, g_main_context_prepare
MANEESH GLIB DEBUG: About to LOCK: gmain.c, 3501, g_main_context_prepare
MANEESH GLIB DEBUG
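
One possible reading of a last line that stops after "MANEESH GLIB DEBUG", offered as an assumption rather than something established here: when stdout is redirected to a file, C stdio normally buffers it fully, so the file can end wherever the last buffer flush happened rather than where the process stopped. A minimal sketch of a trace helper that flushes after every line; demo_trace_line is a hypothetical name.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* printf-style trace: writes the formatted text plus a newline to
   STREAM and flushes, so a complete line reaches the file even if
   the process hangs right afterwards.  Returns the number of
   characters written including the newline, or -1 on error.  */
static int
demo_trace_line (FILE *stream, const char *fmt, ...)
{
  va_list ap;
  int n;

  assert (stream != NULL);
  va_start (ap, fmt);
  n = vfprintf (stream, fmt, ap);
  va_end (ap);
  if (n < 0 || fputc ('\n', stream) == EOF || fflush (stream) != 0)
    return -1;
  return n + 1;
}
```

Writing the trace to stderr, which is unbuffered by default, is another way to get the same effect.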



gmain.c 3501 region:

  if (timeout)
    {
      *timeout = context->timeout;
      if (*timeout != 0)
        context->time_is_fresh = FALSE;
    }

  UNLOCK_CONTEXT (context)

  return n_poll;

Nothing terribly different from the lldb backtrace (for completeness):


(lldb) thread backtrace all
* thread #1: tid = 0x9bb3c, 0x00007fff8a861166 libsystem_kernel.dylib`__psynch_mutexwait + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x00007fff8a861166 libsystem_kernel.dylib`__psynch_mutexwait + 10
    frame #1: 0x00007fff853b5696 libsystem_pthread.dylib`_pthread_mutex_lock + 480
    frame #2: 0x0000000100a17b78 libglib-2.0.0.dylib`g_mutex_lock + 26
    frame #3: 0x00000001009da551 libglib-2.0.0.dylib`g_main_context_acquire + 78
    frame #4: 0x000000010024fc47 emacs`xg_select + 231
    frame #5: 0x0000000100225c3d emacs`wait_reading_process_output + 3757
    frame #6: 0x0000000100008cb6 emacs`sit_for + 582
    frame #7: 0x0000000100108f00 emacs`read_char + 4496
    frame #8: 0x0000000100104edd emacs`read_key_sequence + 1757
    frame #9: 0x0000000100103cec emacs`command_loop_1 + 1212
    frame #10: 0x00000001001bf04e emacs`internal_condition_case + 382
    frame #11: 0x000000010011ce09 emacs`command_loop_2 + 41
    frame #12: 0x00000001001be696 emacs`internal_catch + 342
    frame #13: 0x0000000100102ddb emacs`command_loop + 187
    frame #14: 0x0000000100102c9f emacs`recursive_edit_1 + 127
    frame #15: 0x0000000100102f87 emacs`Frecursive_edit + 327
    frame #16: 0x0000000100100fd3 emacs`main + 4387
    frame #17: 0x00007fff8707a5c9 libdyld.dylib`start + 1
    frame #18: 0x00007fff8707a5c9 libdyld.dylib`start + 1

  thread #2: tid = 0x9bb48, 0x00007fff8a8613fa libsystem_kernel.dylib`__select + 10, name = 'gmain'
    frame #0: 0x00007fff8a8613fa libsystem_kernel.dylib`__select + 10
    frame #1: 0x00000001009e8bed libglib-2.0.0.dylib`g_poll + 399
    frame #2: 0x00000001009dd303 libglib-2.0.0.dylib`g_main_context_iterate + 627
    frame #3: 0x00000001009dd40e libglib-2.0.0.dylib`g_main_context_iteration + 104
    frame #4: 0x00000001009de7c6 libglib-2.0.0.dylib`glib_worker_main + 53
    frame #5: 0x00000001009fde09 libglib-2.0.0.dylib`g_thread_proxy + 90
    frame #6: 0x00007fff853b805a libsystem_pthread.dylib`_pthread_body + 131
    frame #7: 0x00007fff853b7fd7 libsystem_pthread.dylib`_pthread_start + 176
    frame #8: 0x00007fff853b53ed libsystem_pthread.dylib`thread_start + 13


Everything is terrible.






^ permalink raw reply	[flat|nested] 34+ messages in thread

* bug#21965: 24.5; Emacs freezes when canceling at open file
  2015-11-25 18:49                                       ` Maneesh Yadav
@ 2015-11-25 18:59                                         ` John Wiegley
  2016-02-18 21:46                                         ` Maneesh Yadav
  1 sibling, 0 replies; 34+ messages in thread
From: John Wiegley @ 2015-11-25 18:59 UTC (permalink / raw)
  To: Maneesh Yadav; +Cc: 21965

>>>>> Maneesh Yadav <maneeshkyadav@gmail.com> writes:

> I patched glib2 this way (just overriding the macros (and removing
> semicolons on macro invocations...seemed to be the best way to deal
> with if statements that didn't wrap in curly braces...just realized my
> strings don't reflect "UN"LOCK/LOCK....not a big deal since line
> numbers are there...fixed for next time):

Nice, this is much closer. I just need you to add a %p formatting string, and
then print the value of the "context":

 #define LOCK_CONTEXT(context) {printf("MANEESH GLIB DEBUG: About to LOCK %p: " \
   "%s, %d, %s\n", context, __FILE__, __LINE__, __FUNCTION__); \
   g_mutex_lock (&context->mutex);}
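
On the point about removing semicolons at the macro invocations: a common C idiom is to wrap a multi-statement macro in do { ... } while (0), so that it expands to a single statement and call sites can keep their semicolons even inside unbraced if/else. The following is only a minimal sketch of that idiom, not the patch used in this thread. DemoContext, demo_lock, LOCK_CONTEXT_DBG and lock_if are hypothetical stand-ins, and demo_lock replaces g_mutex_lock so the example builds without glib.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for GMainContext; it only counts lock calls. */
typedef struct { int lock_calls; } DemoContext;

/* Hypothetical stand-in for g_mutex_lock (&context->mutex). */
static void demo_lock (DemoContext *context)
{
  assert (context != NULL);
  context->lock_calls++;
}

/* do { ... } while (0) makes the macro expand to exactly one statement,
   so "LOCK_CONTEXT_DBG (c);" is safe in an unbraced if/else. */
#define LOCK_CONTEXT_DBG(context)                                    \
  do {                                                               \
    fprintf (stderr, "GLIB DEBUG: About to LOCK %p: %s, %d, %s\n",   \
             (void *) (context), __FILE__, __LINE__, __func__);      \
    demo_lock (context);                                             \
  } while (0)

/* Unbraced if/else with the semicolon kept at the call site.  With a
   bare { ... } macro body, the "else" below would be a syntax error. */
static int lock_if (DemoContext *context, int wanted)
{
  if (wanted)
    LOCK_CONTEXT_DBG (context);
  else
    return 0;
  return context->lock_calls;
}
```

With this shape, the existing "LOCK_CONTEXT (context);" call sites in gmain.c would presumably not need to be edited at all, only the macro definitions.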

John






* bug#21965: 24.5; Emacs freezes when canceling at open file
  2015-11-25 18:49                                       ` Maneesh Yadav
  2015-11-25 18:59                                         ` John Wiegley
@ 2016-02-18 21:46                                         ` Maneesh Yadav
  2016-02-20  2:40                                           ` John Wiegley
  2020-08-31  2:11                                           ` Stefan Kangas
  1 sibling, 2 replies; 34+ messages in thread
From: Maneesh Yadav @ 2016-02-18 21:46 UTC (permalink / raw)
  To: John Wiegley; +Cc: 21965

Apologies to all that I haven't been able to follow up on this more
thoroughly; part of the problem was getting the crash to replicate.
I could trigger it within a few minutes when I originally posted, but
once I had all the right debug statements in, it got harder and
harder.  I decided to just revert to normal use and wait for it to
happen.  It just happened again, so I'm putting all the debugging info
I have here for reference, and I will try to trace through glib to
figure out what is going on.

I overrode the following macros in gmain.c (and had to add some curly
braces); each original definition is shown in a comment above its
replacement:

/* was: #define LOCK_CONTEXT(context) g_mutex_lock (&context->mutex) */
#define LOCK_CONTEXT(context) {fprintf(stderr, "MANEESH GLIB DEBUG: " \
  "About to LOCK: %p, %s, %d, %s, %p\n", context, __FILE__, __LINE__, \
  __FUNCTION__, g_thread_self()); g_mutex_lock (&context->mutex);}

/* was: #define UNLOCK_CONTEXT(context) g_mutex_unlock (&context->mutex) */
#define UNLOCK_CONTEXT(context) {fprintf(stderr, "MANEESH GLIB DEBUG: " \
  "About to UNLOCK: %p, %s, %d, %s, %p\n", context, __FILE__, __LINE__, \
  __FUNCTION__, g_thread_self()); g_mutex_unlock (&context->mutex);}
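
A note on why writing to stderr helps here, offered as a likely explanation rather than a confirmed one: stdout redirected to a file is normally fully buffered, which would account for the earlier printf-based run ending in the middle of a line ("MANEESH GLIB DEBUG") when the process hung, whereas stderr is not fully buffered by default. If a trace ever has to go to a buffered stream, flushing after every record gives the same effect. The sketch below assumes nothing from glib; format_lock_event and log_lock_event are hypothetical helper names.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: format one trace record into buf.  Returns the
   number of characters written, as reported by snprintf. */
static int format_lock_event (char *buf, size_t size, const char *action,
                              const void *context, const char *file,
                              int line, const char *func)
{
  return snprintf (buf, size, "GLIB DEBUG: About to %s: %p, %s, %d, %s\n",
                   action, (void *) context, file, line, func);
}

/* Hypothetical helper: write the record and flush at once, so the last
   line before a hang reaches its destination even when the stream is
   fully buffered (for example stdout redirected to a file). */
static void log_lock_event (FILE *out, const char *action,
                            const void *context, const char *file,
                            int line, const char *func)
{
  char buf[256];
  int n = format_lock_event (buf, sizeof buf, action, context,
                             file, line, func);
  assert (n > 0);
  fputs (buf, out);
  fflush (out);
}
```

A call such as log_lock_event (stdout, "LOCK", context, __FILE__, __LINE__, __func__) could then replace the bare printf inside the macro.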


At the time of crash, an abbreviated summary of stderr:
...
MANEESH GLIB DEBUG: About to UNLOCK: 0x100f00e20, gmain.c, 4128,
g_main_context_poll, 0x102001c00

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3208,
g_main_context_acquire, 0x100da5800

MANEESH GLIB DEBUG: About to UNLOCK: 0x100d8a3a0, gmain.c, 3222,
g_main_context_acquire, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3538,
g_main_context_query, 0x100da5800

MANEESH GLIB DEBUG: About to UNLOCK: 0x100d8a3a0, gmain.c, 3583,
g_main_context_query, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3859,
g_main_context_pending, 0x100da5800

MANEESH GLIB DEBUG: About to UNLOCK: 0x100d8a3a0, gmain.c, 3782,
g_main_context_iterate, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3208,
g_main_context_acquire, 0x100da5800

MANEESH GLIB DEBUG: About to UNLOCK: 0x100d8a3a0, gmain.c, 3222,
g_main_context_acquire, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3801,
g_main_context_iterate, 0x100da5800

MANEESH GLIB DEBUG: About to UNLOCK: 0x100d8a3a0, gmain.c, 3812,
g_main_context_iterate, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3376,
g_main_context_prepare, 0x100da5800

MANEESH GLIB DEBUG: About to UNLOCK: 0x100d8a3a0, gmain.c, 3501,
g_main_context_prepare, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3538,
g_main_context_query, 0x100da5800

MANEESH GLIB DEBUG: About to UNLOCK: 0x100d8a3a0, gmain.c, 3583,
g_main_context_query, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 4124,
g_main_context_poll, 0x100da5800

MANEESH GLIB DEBUG: About to UNLOCK: 0x100d8a3a0, gmain.c, 4128,
g_main_context_poll, 0x100da5800

...

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3242,
g_main_context_release, 0x100da5800

MANEESH GLIB DEBUG: About to UNLOCK: 0x100d8a3a0, gmain.c, 3265,
g_main_context_release, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3208,
g_main_context_acquire, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3208,
g_main_context_acquire, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3208,
g_main_context_acquire, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3208,
g_main_context_acquire, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3208,
g_main_context_acquire, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3208,
g_main_context_acquire, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3208,
g_main_context_acquire, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3208,
g_main_context_acquire, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3208,
g_main_context_acquire, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3208,
g_main_context_acquire, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3208,
g_main_context_acquire, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3208,
g_main_context_acquire, 0x100da5800

MANEESH GLIB DEBUG: About to LOCK: 0x100d8a3a0, gmain.c, 3208,
g_main_context_acquire, 0x100da5800
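
To summarize a long trace like the one above, one option is to tally "About to LOCK" and "About to UNLOCK" records per context pointer. The following is a rough sketch under stated assumptions: it relies only on the record format shown in this message, ContextTally, tally_for and tally_line are hypothetical names, and the wrapped continuation lines (function name and thread pointer) are simply ignored. Because each record is printed before the lock call, a surplus of LOCK records indicates lock attempts, not necessarily locks that were actually held.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_CONTEXTS 16

/* Per-context tally of trace records seen so far. */
typedef struct {
  unsigned long long context;   /* pointer value as printed by %p */
  int locks;
  int unlocks;
} ContextTally;

/* Find or create the slot for a context; NULL when the table is full. */
static ContextTally *tally_for (ContextTally *table, int *count,
                                unsigned long long context)
{
  for (int i = 0; i < *count; i++)
    if (table[i].context == context)
      return &table[i];
  if (*count == MAX_CONTEXTS)
    return NULL;
  table[*count] = (ContextTally) { context, 0, 0 };
  return &table[(*count)++];
}

/* Parse one line of the trace.  Returns 1 if it was counted, 0 if it
   is a continuation line or does not match the expected format. */
static int tally_line (ContextTally *table, int *count, const char *line)
{
  char action[8];
  unsigned long long context;
  const char *marker = strstr (line, "About to ");

  if (marker == NULL)
    return 0;
  if (sscanf (marker, "About to %7[A-Z]: %llx", action, &context) != 2)
    return 0;

  int is_lock = strcmp (action, "LOCK") == 0;
  if (!is_lock && strcmp (action, "UNLOCK") != 0)
    return 0;

  ContextTally *t = tally_for (table, count, context);
  if (t == NULL)
    return 0;
  if (is_lock)
    t->locks++;
  else
    t->unlocks++;
  return 1;
}
```

Feeding each line of the captured stderr to tally_line and then printing the table would show at a glance which context has more LOCK than UNLOCK records.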



lldb traces:

(lldb) attach emacs

Process 79773 stopped

* thread #1: tid = 0x6c3b4f, 0x00007fff8deb9166
libsystem_kernel.dylib`__psynch_mutexwait + 10, queue =
'com.apple.main-thread', stop reason = signal SIGSTOP

    frame #0: 0x00007fff8deb9166 libsystem_kernel.dylib`__psynch_mutexwait + 10

libsystem_kernel.dylib`__psynch_mutexwait:

->  0x7fff8deb9166 <+10>: jae    0x7fff8deb9170            ; <+20>

    0x7fff8deb9168 <+12>: movq   %rax, %rdi

    0x7fff8deb916b <+15>: jmp    0x7fff8deb4c53            ; cerror_nocancel

    0x7fff8deb9170 <+20>: retq


Executable module set to "/opt/local/bin/emacs".

Architecture set to: x86_64h-apple-macosx.

(lldb) thread backtrace all

* thread #1: tid = 0x6c3b4f, 0x00007fff8deb9166
libsystem_kernel.dylib`__psynch_mutexwait + 10, queue =
'com.apple.main-thread', stop reason = signal SIGSTOP

  * frame #0: 0x00007fff8deb9166 libsystem_kernel.dylib`__psynch_mutexwait + 10

    frame #1: 0x00007fff88a0d696
libsystem_pthread.dylib`_pthread_mutex_lock + 480

    frame #2: 0x0000000100a17b48 libglib-2.0.0.dylib`g_mutex_lock + 26

    frame #3: 0x00000001009d9b53
libglib-2.0.0.dylib`g_main_context_acquire + 109

    frame #4: 0x000000010024fc47 emacs`xg_select + 231

    frame #5: 0x0000000100225c3d emacs`wait_reading_process_output + 3757

    frame #6: 0x0000000100008cb6 emacs`sit_for + 582

    frame #7: 0x0000000100108f00 emacs`read_char + 4496

    frame #8: 0x0000000100104edd emacs`read_key_sequence + 1757

    frame #9: 0x0000000100103cec emacs`command_loop_1 + 1212

    frame #10: 0x00000001001bf04e emacs`internal_condition_case + 382

    frame #11: 0x000000010011ce09 emacs`command_loop_2 + 41

    frame #12: 0x00000001001be696 emacs`internal_catch + 342

    frame #13: 0x0000000100102ddb emacs`command_loop + 187

    frame #14: 0x0000000100102c9f emacs`recursive_edit_1 + 127

    frame #15: 0x0000000100102f87 emacs`Frecursive_edit + 327

    frame #16: 0x0000000100100fd3 emacs`main + 4387

    frame #17: 0x00007fff8a6d25c9 libdyld.dylib`start + 1

    frame #18: 0x00007fff8a6d25c9 libdyld.dylib`start + 1

 thread #2: tid = 0x6c3b6b, 0x00007fff8deb93fa
libsystem_kernel.dylib`__select + 10, name = 'gmain'

    frame #0: 0x00007fff8deb93fa libsystem_kernel.dylib`__select + 10

    frame #1: 0x00000001009e8bbd libglib-2.0.0.dylib`g_poll + 399

    frame #2: 0x00000001009dd07c
libglib-2.0.0.dylib`g_main_context_iterate + 845

    frame #3: 0x00000001009dd1b1
libglib-2.0.0.dylib`g_main_context_iteration + 127

    frame #4: 0x00000001009de796 libglib-2.0.0.dylib`glib_worker_main + 53

    frame #5: 0x00000001009fddd9 libglib-2.0.0.dylib`g_thread_proxy + 90

    frame #6: 0x00007fff88a1005a libsystem_pthread.dylib`_pthread_body + 131

    frame #7: 0x00007fff88a0ffd7 libsystem_pthread.dylib`_pthread_start + 176

    frame #8: 0x00007fff88a0d3ed libsystem_pthread.dylib`thread_start + 13


Inkscape is the only other binary linked to glib that is running, I think:

Maneeshs-MacBook-Air:~ maneeshyadav$ ps

  PID TTY           TIME CMD

49116 ttys000    0:00.14 -bash

79773 ttys000    2:16.74 emacs

49772 ttys001    0:01.80 -bash

63245 ttys002    0:00.56 -bash

65082 ttys002    0:00.01 /bin/bash
/Applications/ChemAxon/MarvinBeans/bin/msketch

65087 ttys002   18:31.79
/Library/Java/JavaVirtualMachines/jdk1.8.0_60.jdk/Contents/Home/bin/java
chemaxon.marvin.Sketch

81099 ttys002    1:06.60 inkscape

On Wed, Nov 25, 2015 at 10:49 AM, Maneesh Yadav <maneeshkyadav@gmail.com> wrote:
> Weird.
>
> I patched glib2 this way (just overriding the macros (and removing
> semicolons on macro invocations...seemed to be the best way to deal
> with if statements that didn't wrap in curly braces...just realized my
> strings don't reflect "UN"LOCK/LOCK....not a big deal since line
> numbers are there...fixed for next time):
>
> #define LOCK_CONTEXT(context) g_mutex_lock (&context->mutex)
>
> #define LOCK_CONTEXT(context) {printf("MANEESH GLIB DEBUG: About to
> LOCK: %s, %d, %s\n", __FILE__, __LINE__, __FUNCTION__); g_mutex_lock
> (&context->mutex);}
>
> #define UNLOCK_CONTEXT(context) g_mutex_unlock (&context->mutex)
>
> #define UNLOCK_CONTEXT(context) {printf("MANEESH GLIB DEBUG: About to
> LOCK: %s, %d, %s\n", __FILE__, __LINE__, __FUNCTION__); g_mutex_unlock
> (&context->mutex);}
>
>
> Grabbing the output (emacs -Q > test.out) shows the stall mid print:
>
> MANEESH GLIB DEBUG: About to LOCK: gmain.c, 3208, g_main_context_acquire
>
> MANEESH GLIB DEBUG: About to LOCK: gmain.c, 3222, g_main_context_acquire
>
> MANEESH GLIB DEBUG: About to LOCK: gmain.c, 3801, g_main_context_iterate
>
> MANEESH GLIB DEBUG: About to LOCK: gmain.c, 3812, g_main_context_iterate
>
> MANEESH GLIB DEBUG: About to LOCK: gmain.c, 3376, g_main_context_prepare
>
> MANEESH GLIB DEBUG: About to LOCK: gmain.c, 3501, g_main_context_prepare
>
> MANEESH GLIB DEBUG
>
>
>
> gmain.c 3501 region:
>
> if (timeout)
>
>     {
>
>       *timeout = context->timeout;
>
>       if (*timeout != 0)
>
>         context->time_is_fresh = FALSE;
>
>     }
>
>
>
>   UNLOCK_CONTEXT (context)
>
>
>   return n_poll;
>
>
> Nothing terribly different from the lldb backtrace (for completeness):
>
>
> (lldb) thread backtrace all
>
> * thread #1: tid = 0x9bb3c, 0x00007fff8a861166
> libsystem_kernel.dylib`__psynch_mutexwait + 10, queue =
> 'com.apple.main-thread', stop reason = signal SIGSTOP
>
>   * frame #0: 0x00007fff8a861166 libsystem_kernel.dylib`__psynch_mutexwait + 10
>
>     frame #1: 0x00007fff853b5696
> libsystem_pthread.dylib`_pthread_mutex_lock + 480
>
>     frame #2: 0x0000000100a17b78 libglib-2.0.0.dylib`g_mutex_lock + 26
>
>     frame #3: 0x00000001009da551 libglib-2.0.0.dylib`g_main_context_acquire + 78
>
>     frame #4: 0x000000010024fc47 emacs`xg_select + 231
>
>     frame #5: 0x0000000100225c3d emacs`wait_reading_process_output + 3757
>
>     frame #6: 0x0000000100008cb6 emacs`sit_for + 582
>
>     frame #7: 0x0000000100108f00 emacs`read_char + 4496
>
>     frame #8: 0x0000000100104edd emacs`read_key_sequence + 1757
>
>     frame #9: 0x0000000100103cec emacs`command_loop_1 + 1212
>
>     frame #10: 0x00000001001bf04e emacs`internal_condition_case + 382
>
>     frame #11: 0x000000010011ce09 emacs`command_loop_2 + 41
>
>     frame #12: 0x00000001001be696 emacs`internal_catch + 342
>
>     frame #13: 0x0000000100102ddb emacs`command_loop + 187
>
>     frame #14: 0x0000000100102c9f emacs`recursive_edit_1 + 127
>
>     frame #15: 0x0000000100102f87 emacs`Frecursive_edit + 327
>
>     frame #16: 0x0000000100100fd3 emacs`main + 4387
>
>     frame #17: 0x00007fff8707a5c9 libdyld.dylib`start + 1
>
>     frame #18: 0x00007fff8707a5c9 libdyld.dylib`start + 1
>
>
>   thread #2: tid = 0x9bb48, 0x00007fff8a8613fa
> libsystem_kernel.dylib`__select + 10, name = 'gmain'
>
>     frame #0: 0x00007fff8a8613fa libsystem_kernel.dylib`__select + 10
>
>     frame #1: 0x00000001009e8bed libglib-2.0.0.dylib`g_poll + 399
>
>     frame #2: 0x00000001009dd303
> libglib-2.0.0.dylib`g_main_context_iterate + 627
>
>     frame #3: 0x00000001009dd40e
> libglib-2.0.0.dylib`g_main_context_iteration + 104
>
>     frame #4: 0x00000001009de7c6 libglib-2.0.0.dylib`glib_worker_main + 53
>
>     frame #5: 0x00000001009fde09 libglib-2.0.0.dylib`g_thread_proxy + 90
>
>     frame #6: 0x00007fff853b805a libsystem_pthread.dylib`_pthread_body + 131
>
>     frame #7: 0x00007fff853b7fd7 libsystem_pthread.dylib`_pthread_start + 176
>
>     frame #8: 0x00007fff853b53ed libsystem_pthread.dylib`thread_start + 13
>
>
> Everything is terrible.
>
> On Tue, Nov 24, 2015 at 5:50 PM, John Wiegley <jwiegley@gmail.com> wrote:
>>>>>>> Maneesh Yadav <maneeshkyadav@gmail.com> writes:
>>
>>>> To go deeper, we may need to build a separate copy of glib and start
>>>> putting some print statements in to find out why there is lock contention.
>>>> Would you be up for that? I'd like to know if this is happening in
>>>> g_get_worker_context.
>>
>> I've read further, and since "static gsize initialised;" must initialize to
>> zero, it's hard for me to see how this code could be wrong just from reading it.
>>
>> I'd like to find every line of code in glib that calls LOCK_CONTEXT or
>> UNLOCK_CONTEXT, and print out:
>>
>>     Function, file, line, lock or unlock, pointer value of context
>>
>> That should help us narrow it down.
>>
>> John






* bug#21965: 24.5; Emacs freezes when canceling at open file
  2016-02-18 21:46                                         ` Maneesh Yadav
@ 2016-02-20  2:40                                           ` John Wiegley
  2020-08-31  2:11                                           ` Stefan Kangas
  1 sibling, 0 replies; 34+ messages in thread
From: John Wiegley @ 2016-02-20  2:40 UTC (permalink / raw)
  To: Maneesh Yadav; +Cc: 21965

>>>>> Maneesh Yadav <maneeshkyadav@gmail.com> writes:

> Apologies to all I haven't been able to follow up on this more thoroughly,
> part of the problem was trying to get the crash to replicate. I could do it
> in a few minutes while I was originally posting, then as I was getting all
> the right debug statements in it got harder and harder. I decided to just
> revert to normal use and wait for it to happen. Just happened again and I've
> put all the debugging info I can here and will try to trace through glib and
> figure out what is going on, just putting everything here for reference.

That sure looks like a lot of locks for the same context. Now, I wonder what
could allow that code path to recur without intervening unlocks, as there were
before the hang?

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2






* bug#21965: 24.5; Emacs freezes when canceling at open file
  2016-02-18 21:46                                         ` Maneesh Yadav
  2016-02-20  2:40                                           ` John Wiegley
@ 2020-08-31  2:11                                           ` Stefan Kangas
  2020-08-31  2:25                                             ` Maneesh Yadav
  1 sibling, 1 reply; 34+ messages in thread
From: Stefan Kangas @ 2020-08-31  2:11 UTC (permalink / raw)
  To: Maneesh Yadav; +Cc: John Wiegley, 21965

Maneesh Yadav <maneeshkyadav@gmail.com> writes:

> Apologies to all I haven't been able to follow up on this more
> thoroughly, part of the problem was trying to get the crash to
> replicate.  I could do it in a few minutes while I was originally
> posting, then as I was getting all the right debug statements in it
> got harder and harder.  I decided to just revert to normal use and
> wait for it to happen.  Just happened again and I've put all the
> debugging info I can here and will try to trace through glib and
> figure out what is going on, just putting everything here for
> reference.

(That was five years ago.)

Were you ever able to get any further in debugging this?  Are you still
seeing this on a recent version of Emacs?

Thanks in advance.

Best regards,
Stefan Kangas






* bug#21965: 24.5; Emacs freezes when canceling at open file
  2020-08-31  2:11                                           ` Stefan Kangas
@ 2020-08-31  2:25                                             ` Maneesh Yadav
  2020-08-31 13:54                                               ` Stefan Kangas
  0 siblings, 1 reply; 34+ messages in thread
From: Maneesh Yadav @ 2020-08-31  2:25 UTC (permalink / raw)
  To: Stefan Kangas; +Cc: John Wiegley, 21965

Indeed. Fwiw haven't run into it since that era.

On Sun, Aug 30, 2020, 7:11 PM Stefan Kangas <stefan@marxist.se> wrote:

> Maneesh Yadav <maneeshkyadav@gmail.com> writes:
>
> > Apologies to all I haven't been able to follow up on this more
> > thoroughly, part of the problem was trying to get the crash to
> > replicate.  I could do it in a few minutes while I was originally
> > posting, then as I was getting all the right debug statements in it
> > got harder and harder.  I decided to just revert to normal use and
> > wait for it to happen.  Just happened again and I've put all the
> > debugging info I can here and will try to trace through glib and
> > figure out what is going on, just putting everything here for
> > reference.
>
> (That was five years ago.)
>
> Were you ever able to get any further in debugging this?  Are you still
> seeing this on a recent version of Emacs?
>
> Thanks in advance.
>
> Best regards,
> Stefan Kangas
>


* bug#21965: 24.5; Emacs freezes when canceling at open file
  2020-08-31  2:25                                             ` Maneesh Yadav
@ 2020-08-31 13:54                                               ` Stefan Kangas
  0 siblings, 0 replies; 34+ messages in thread
From: Stefan Kangas @ 2020-08-31 13:54 UTC (permalink / raw)
  To: Maneesh Yadav; +Cc: John Wiegley, 21965-done

Maneesh Yadav <maneeshkyadav@gmail.com> writes:

> Indeed. Fwiw haven't run into it since that era.

Thanks.  I'm therefore closing this bug report.

If you see something like this again, please open a new bug.






end of thread, other threads:[~2020-08-31 13:54 UTC | newest]

Thread overview: 34+ messages
2015-11-20 19:18 bug#21965: 24.5; Emacs freezes when canceling at open file Maneesh Yadav
2015-11-20 21:37 ` John Wiegley
2015-11-20 21:47   ` Maneesh Yadav
2015-11-20 21:55     ` John Wiegley
2015-11-20 22:07       ` Maneesh Yadav
2015-11-20 22:45         ` Maneesh Yadav
2015-11-20 23:26           ` John Wiegley
2015-11-20 23:32             ` Maneesh Yadav
2015-11-20 23:54               ` John Wiegley
2015-11-21  1:46                 ` Maneesh Yadav
2015-11-21  7:29           ` Eli Zaretskii
2015-11-22  5:11             ` John Wiegley
2015-11-22  5:15               ` Maneesh Yadav
2015-11-23 21:29                 ` Maneesh Yadav
2015-11-23 22:17                   ` John Wiegley
2015-11-24  0:30                     ` Maneesh Yadav
2015-11-24  3:39                       ` Eli Zaretskii
2015-11-24  3:34                     ` Eli Zaretskii
2015-11-24  3:39                       ` John Wiegley
2015-11-24 22:51                         ` Maneesh Yadav
2015-11-24 22:58                           ` Maneesh Yadav
2015-11-25  1:02                             ` John Wiegley
2015-11-25  1:15                               ` Maneesh Yadav
2015-11-25  1:38                                 ` John Wiegley
2015-11-25  1:46                                   ` Maneesh Yadav
2015-11-25  1:50                                     ` John Wiegley
2015-11-25 18:49                                       ` Maneesh Yadav
2015-11-25 18:59                                         ` John Wiegley
2016-02-18 21:46                                         ` Maneesh Yadav
2016-02-20  2:40                                           ` John Wiegley
2020-08-31  2:11                                           ` Stefan Kangas
2020-08-31  2:25                                             ` Maneesh Yadav
2020-08-31 13:54                                               ` Stefan Kangas
2015-11-20 22:01 ` Eli Zaretskii
