unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#60144: 30.0.50; PGTK Emacs crashes after signal
@ 2022-12-17  3:39 Karl Otness via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-12-18  2:08 ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 11+ messages in thread
From: Karl Otness via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-12-17  3:39 UTC (permalink / raw)
  To: 60144

Hello, I have been having issues with unpredictable crashes running
Emacs master with PGTK on Wayland. This looks somewhat similar to
bug#59452.

Like that bug, it seems to be caused by an Emacs signal happening in a
GTK callback. Execution reaches get_char_property_and_overlay
(textprop.c:644), which signals; the resulting longjmp exits the
GLib/GObject signal handling (g_signal_emit) midway, leading to memory
corruption and a segfault.

Backtraces below. The segfault happens after continuing. Seems like
after continuing it reenters g_signal_emit and follows a corrupted
pointer in a linked list of signals to dispatch.
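
A minimal sketch of this failure mode, in plain C. It is not GLib or
Emacs code: emit, emission_depth and the handlers are hypothetical
stand-ins. It only shows that a longjmp out of a handler skips the
cleanup the emitter expected to run after the handler returned.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical stand-in for a library's signal emitter (not GLib).  */
static int emission_depth;   /* bookkeeping the emitter must restore */
static jmp_buf toplevel;     /* stand-in for a handler far up the stack */

typedef void (*handler_fn) (void);

/* Run HANDLER with the bookkeeping set.  The decrement is the cleanup
   that a non-local exit from HANDLER skips.  */
static void
emit (handler_fn handler)
{
  assert (handler != 0);
  emission_depth++;
  handler ();
  emission_depth--;
}

/* Handler that returns normally.  */
static void
well_behaved_handler (void)
{
}

/* Handler that "signals": it jumps past emit's cleanup.  */
static void
signalling_handler (void)
{
  longjmp (toplevel, 1);
}

/* Return the emitter's depth after running HANDLER from the top level.
   0 means the state is consistent; anything else means it leaked.  */
static int
depth_after (handler_fn handler)
{
  emission_depth = 0;
  if (setjmp (toplevel) == 0)
    emit (handler);
  return emission_depth;
}
```

Calling depth_after (well_behaved_handler) leaves the counter at 0,
while depth_after (signalling_handler) leaves it at 1. A real emitter
keeps far more state than one counter, so the damage there can be
correspondingly worse.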

Unfortunately I don't have a good recipe for reliably reproducing it.
I've only seen it happen in buffers with eglot enabled (so far C++
buffers) when clicking around, typing, messing with the eglot menu,
etc.

This is for an Emacs from recent master.
Version: 30.0.50
Commit: 1568123196cd8b57ed64e284b7deb058026be713

Configured using:
 'configure --prefix=/usr --sysconfdir=/etc --libexecdir=/usr/lib
 --localstatedir=/var --with-pgtk --with-native-compilation
 --without-sound --with-harfbuzz --without-m17n-flt --without-xft
 --with-libotf --with-cairo --with-modules --without-gconf
 --without-gsettings --with-gameuser=:games --without-imagemagick
 --with-dumping=pdumper --with-sqlite3 --with-json --with-tree-sitter
 '--program-transform-name=s/^ctags$/ctags.emacs/' 'CFLAGS=-g -ggdb -O3
 -pipe -fno-plt -fstack-protector-all -fstack-clash-protection
 -fcf-protection=full -fPIE -D_FORTIFY_SOURCE=3 -march=native
 -mtune=native' 'LDFLAGS=-pie
 -Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now,-z,noexecstack''

Configured features:
ACL CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GPM HARFBUZZ JPEG JSON LCMS2
LIBOTF LIBSYSTEMD LIBXML2 MODULES NATIVE_COMP NOTIFY INOTIFY PDUMPER
PGTK PNG RSVG SECCOMP SQLITE3 THREADS TIFF TOOLKIT_SCROLL_BARS
TREE_SITTER WEBP XIM GTK3 ZLIB

Let me know if there's anything else I can gather that might be
helpful.

Thanks,
Karl

Here's the backtrace for the signal out of the event handling. From
GDB with a breakpoint on Fsignal and a condition
'$_any_caller_is("g_signal_emit", 20)'

> #0  Fsignal (error_symbol=error_symbol@entry=0x2f40, data=0x55bed53eeac3) at eval.c:1681
> #1  0x000055bece1213bf in xsignal (data=<optimized out>, error_symbol=0x2f40) at emacs/src/lisp.h:4558
> #2  xsignal1 (error_symbol=error_symbol@entry=0x2f40, arg=arg@entry=0x82) at eval.c:1878
> #3  0x000055bece1253e3 in get_char_property_and_overlay (position=0x82, prop=0x5a90, object=0x7f105451a265, overlay=0x0) at textprop.c:644
> #4  0x000055bece156110 in string_buffer_position_lim (string=string@entry=0x55bed52e8b24, from=from@entry=32, to=to@entry=1032, back_p=back_p@entry=false) at xdisp.c:6246
> #5  0x000055bece1561fa in string_buffer_position (string=0x55bed52e8b24, around_charpos=32) at xdisp.c:6284
> #6  0x000055bece1aaddb in note_mouse_highlight (f=f@entry=0x55bed1a839e8, x=<optimized out>, y=<optimized out>) at xdisp.c:35339
> #7  0x000055bece4039ac in note_mouse_movement (event=0x55bed1ce6030, frame=0x55bed1a839e8) at pgtkterm.c:5821
> #8  motion_notify_event (widget=widget@entry=0x55bed2024130, event=0x55bed1ce6030, user_data=<optimized out>) at pgtkterm.c:5905
> #9  0x00007f105c684fd8 in _gtk_marshal_BOOLEAN__BOXED (closure=0x55bed1ef9f40, return_value=0x7ffebc5ae480, n_param_values=<optimized out>, param_values=0x7ffebc5ae4e0, invocation_hint=<optimized out>, marshal_data=<optimized out>)
>     at gtk/gtkmarshalers.c:84
> #10 0x00007f105c095210 in g_closure_invoke (closure=0x55bed1ef9f40, return_value=0x7ffebc5ae480, n_param_values=2, param_values=0x7ffebc5ae4e0, invocation_hint=0x7ffebc5ae460) at ../glib/gobject/gclosure.c:832
> #11 0x00007f105c0c2ea8 in signal_emit_unlocked_R.isra.0
>     (node=<optimized out>, detail=detail@entry=0, instance=instance@entry=0x55bed2024130, emission_return=emission_return@entry=0x7ffebc5ae5f0, instance_and_params=instance_and_params@entry=0x7ffebc5ae4e0)
>     at ../glib/gobject/gsignal.c:3796
> #12 0x00007f105c0b2980 in g_signal_emit_valist (instance=<optimized out>, signal_id=<optimized out>, detail=<optimized out>, var_args=var_args@entry=0x7ffebc5ae6a0) at ../glib/gobject/gsignal.c:3559
> #13 0x00007f105c0b3204 in g_signal_emit (instance=instance@entry=0x55bed2024130, signal_id=<optimized out>, detail=detail@entry=0) at ../glib/gobject/gsignal.c:3606
> #14 0x00007f105c9447f5 in gtk_widget_event_internal.part.0.lto_priv.0 (widget=0x55bed2024130, event=0x55bed1ce6030) at ../gtk/gtk/gtkwidget.c:7812
> #15 0x00007f105c7e20db in propagate_event_up (topmost=<optimized out>, event=<optimized out>, widget=0x55bed2024130) at ../gtk/gtk/gtkmain.c:2588
> #16 propagate_event (widget=widget@entry=0x55bed2024130, event=event@entry=0x55bed1ce6030, captured=captured@entry=0, topmost=topmost@entry=0x0) at ../gtk/gtk/gtkmain.c:2691
> #17 0x00007f105c7e2212 in gtk_propagate_event (widget=widget@entry=0x55bed2024130, event=event@entry=0x55bed1ce6030) at ../gtk/gtk/gtkmain.c:2725
> #18 0x00007f105c7e2fbb in gtk_main_do_event (event=<optimized out>) at ../gtk/gtk/gtkmain.c:1921
> #19 gtk_main_do_event (event=<optimized out>) at ../gtk/gtk/gtkmain.c:1691
> #20 0x00007f105c542cd3 in _gdk_event_emit (event=0x55bed1ce6030) at ../gtk/gdk/gdkevents.c:73
> #21 _gdk_event_emit (event=0x55bed1ce6030) at ../gtk/gdk/gdkevents.c:67
> #22 0x00007f105c576d48 in gdk_event_source_dispatch (base=<optimized out>, callback=<optimized out>, data=<optimized out>) at ../gtk/gdk/wayland/gdkeventsource.c:124
> #23 0x00007f105bf9787b in g_main_dispatch (context=0x55bed0cc5940) at ../glib/glib/gmain.c:3444
> #24 g_main_context_dispatch (context=0x55bed0cc5940) at ../glib/glib/gmain.c:4162
> #25 0x000055bece3feea9 in pgtk_read_socket (terminal=<optimized out>, hold_quit=0x7ffebc5ae9f0) at pgtkterm.c:3839
> #26 pgtk_read_socket (terminal=<optimized out>, hold_quit=0x7ffebc5ae9f0) at pgtkterm.c:3818
> #27 0x000055bece251ae1 in gobble_input () at keyboard.c:7417
> #28 0x000055bece254901 in handle_async_input () at keyboard.c:7648
> #29 process_pending_signals () at keyboard.c:7662
> #30 unblock_input_to (level=0) at keyboard.c:7677
> #31 unblock_input_to (level=<optimized out>) at keyboard.c:7671
> #32 unblock_input () at keyboard.c:7696
> #33 timer_check () at keyboard.c:4742
> #34 0x000055bece254bcd in readable_events (flags=1) at keyboard.c:3524
> #35 0x000055bece25a624 in get_input_pending (flags=1) at keyboard.c:7367
> #36 detect_input_pending_run_timers (do_display=do_display@entry=true) at keyboard.c:10897
> #37 0x000055bece38962f in wait_reading_process_output
>     (time_limit=time_limit@entry=0, nsecs=nsecs@entry=0, read_kbd=read_kbd@entry=-1, do_display=<optimized out>, wait_for_cell=wait_for_cell@entry=0x0, wait_proc=wait_proc@entry=0x0, just_wait_proc=<optimized out>) at process.c:5779
> #38 0x000055bece25271c in kbd_buffer_get_event (end_time=0x0, used_mouse_menu=0x7ffebc5af64b, kbp=<synthetic pointer>) at keyboard.c:4003
> #39 read_event_from_main_queue (used_mouse_menu=0x7ffebc5af64b, local_getcjmp=0x7ffebc5af3c0, end_time=0x0) at keyboard.c:2270
> #40 read_decoded_event_from_main_queue (end_time=0x0, local_getcjmp=0x7ffebc5af3c0, prev_event=0x0, used_mouse_menu=0x7ffebc5af64b) at keyboard.c:2334
> #41 0x000055bece25b904 in read_char (commandflag=1, map=0x55bed51362e3, prev_event=0x0, used_mouse_menu=0x7ffebc5af64b, end_time=0x0) at keyboard.c:2964
> #42 0x000055bece2600b7 in read_key_sequence (keybuf=<optimized out>, prevent_redisplay=false, fix_current_buffer=<optimized out>, can_return_switch_frame=<optimized out>, dont_downcase_last=<optimized out>, prompt=<optimized out>)
>     at keyboard.c:10074
> #43 0x000055bece262141 in command_loop_1 () at keyboard.c:1376
> #44 0x000055bece3055bf in internal_condition_case (bfun=bfun@entry=0x55bece261f70 <command_loop_1>, handlers=handlers@entry=0x90, hfun=hfun@entry=0x55bece248c70 <cmd_error>) at eval.c:1474
> #45 0x000055bece24682f in command_loop_2 (handlers=handlers@entry=0x90) at keyboard.c:1125
> #46 0x000055bece3054e5 in internal_catch (tag=tag@entry=0xfb10, func=func@entry=0x55bece2467f0 <command_loop_2>, arg=arg@entry=0x90) at eval.c:1197
> #47 0x000055bece2467bb in command_loop () at keyboard.c:1103
> #48 0x000055bece24ee1d in recursive_edit_1 () at keyboard.c:712
> #49 0x000055bece24f269 in Frecursive_edit () at keyboard.c:795
> #50 0x000055bece128b15 in main (argc=<optimized out>, argv=0x7ffebc5afc88) at emacs.c:2529

and the stack trace after the longjmp (unwinds all the way to
internal_condition_case):

> #0  0x000055bece305577 in internal_condition_case
>     (bfun=bfun@entry=0x55bece261f70 <command_loop_1>, handlers=handlers@entry=0x90, hfun=hfun@entry=0x55bece248c70 <cmd_error>) at eval.c:1465
> #1  0x000055bece24682f in command_loop_2 (handlers=handlers@entry=0x90) at keyboard.c:1125
> #2  0x000055bece3054e5 in internal_catch
>     (tag=tag@entry=0xfb10, func=func@entry=0x55bece2467f0 <command_loop_2>, arg=arg@entry=0x90) at eval.c:1197
> #3  0x000055bece2467bb in command_loop () at keyboard.c:1103
> #4  0x000055bece24ee1d in recursive_edit_1 () at keyboard.c:712
> #5  0x000055bece24f269 in Frecursive_edit () at keyboard.c:795
> #6  0x000055bece128b15 in main (argc=<optimized out>, argv=0x7ffebc5afc88) at emacs.c:2529






* bug#60144: 30.0.50; PGTK Emacs crashes after signal
  2022-12-17  3:39 bug#60144: 30.0.50; PGTK Emacs crashes after signal Karl Otness via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-12-18  2:08 ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-12-18  5:45   ` Eli Zaretskii
  0 siblings, 1 reply; 11+ messages in thread
From: Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-12-18  2:08 UTC (permalink / raw)
  To: Karl Otness; +Cc: 60144

Karl Otness <karl@karlotness.com> writes:

>> #0  Fsignal (error_symbol=error_symbol@entry=0x2f40, data=0x55bed53eeac3) at eval.c:1681
>> #1  0x000055bece1213bf in xsignal (data=<optimized out>, error_symbol=0x2f40) at emacs/src/lisp.h:4558
>> #2  xsignal1 (error_symbol=error_symbol@entry=0x2f40, arg=arg@entry=0x82) at eval.c:1878
>> #3  0x000055bece1253e3 in get_char_property_and_overlay (position=0x82, prop=0x5a90, object=0x7f105451a265, overlay=0x0) at textprop.c:644
>> #4  0x000055bece156110 in string_buffer_position_lim (string=string@entry=0x55bed52e8b24, from=from@entry=32, to=to@entry=1032, back_p=back_p@entry=false) at xdisp.c:6246
>> #5  0x000055bece1561fa in string_buffer_position (string=0x55bed52e8b24, around_charpos=32) at xdisp.c:6284
>> #6  0x000055bece1aaddb in note_mouse_highlight (f=f@entry=0x55bed1a839e8, x=<optimized out>, y=<optimized out>) at xdisp.c:35339
>> #7  0x000055bece4039ac in note_mouse_movement (event=0x55bed1ce6030, frame=0x55bed1a839e8) at pgtkterm.c:5821
>> #8  motion_notify_event (widget=widget@entry=0x55bed2024130, event=0x55bed1ce6030, user_data=<optimized out>) at pgtkterm.c:5905
>> #9  0x00007f105c684fd8 in _gtk_marshal_BOOLEAN__BOXED (closure=0x55bed1ef9f40, return_value=0x7ffebc5ae480, n_param_values=<optimized out>, param_values=0x7ffebc5ae4e0, invocation_hint=<optimized out>, marshal_data=<optimized out>)
>>     at gtk/gtkmarshalers.c:84
>> #10 0x00007f105c095210 in g_closure_invoke (closure=0x55bed1ef9f40, return_value=0x7ffebc5ae480, n_param_values=2, param_values=0x7ffebc5ae4e0, invocation_hint=0x7ffebc5ae460) at ../glib/gobject/gclosure.c:832
>> #11 0x00007f105c0c2ea8 in signal_emit_unlocked_R.isra.0
>>     (node=<optimized out>, detail=detail@entry=0, instance=instance@entry=0x55bed2024130, emission_return=emission_return@entry=0x7ffebc5ae5f0, instance_and_params=instance_and_params@entry=0x7ffebc5ae4e0)
>>     at ../glib/gobject/gsignal.c:3796
>> #12 0x00007f105c0b2980 in g_signal_emit_valist (instance=<optimized out>, signal_id=<optimized out>, detail=<optimized out>, var_args=var_args@entry=0x7ffebc5ae6a0) at ../glib/gobject/gsignal.c:3559
>> #13 0x00007f105c0b3204 in g_signal_emit (instance=instance@entry=0x55bed2024130, signal_id=<optimized out>, detail=detail@entry=0) at ../glib/gobject/gsignal.c:3606
>> #14 0x00007f105c9447f5 in gtk_widget_event_internal.part.0.lto_priv.0 (widget=0x55bed2024130, event=0x55bed1ce6030) at ../gtk/gtk/gtkwidget.c:7812
>> #15 0x00007f105c7e20db in propagate_event_up (topmost=<optimized out>, event=<optimized out>, widget=0x55bed2024130) at ../gtk/gtk/gtkmain.c:2588
>> #16 propagate_event (widget=widget@entry=0x55bed2024130, event=event@entry=0x55bed1ce6030, captured=captured@entry=0, topmost=topmost@entry=0x0) at ../gtk/gtk/gtkmain.c:2691
>> #17 0x00007f105c7e2212 in gtk_propagate_event (widget=widget@entry=0x55bed2024130, event=event@entry=0x55bed1ce6030) at ../gtk/gtk/gtkmain.c:2725
>> #18 0x00007f105c7e2fbb in gtk_main_do_event (event=<optimized out>) at ../gtk/gtk/gtkmain.c:1921
>> #19 gtk_main_do_event (event=<optimized out>) at ../gtk/gtk/gtkmain.c:1691
>> #20 0x00007f105c542cd3 in _gdk_event_emit (event=0x55bed1ce6030) at ../gtk/gdk/gdkevents.c:73
>> #21 _gdk_event_emit (event=0x55bed1ce6030) at ../gtk/gdk/gdkevents.c:67

Thanks.  This sounds awfully like another bug that was fixed last month.
Would someone please take a look at this? note_mouse_highlight should
never signal.






* bug#60144: 30.0.50; PGTK Emacs crashes after signal
  2022-12-18  2:08 ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-12-18  5:45   ` Eli Zaretskii
  2022-12-18  6:22     ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 11+ messages in thread
From: Eli Zaretskii @ 2022-12-18  5:45 UTC (permalink / raw)
  To: Po Lu; +Cc: 60144, karl

> Cc: 60144@debbugs.gnu.org
> Date: Sun, 18 Dec 2022 10:08:59 +0800
> From:  Po Lu via "Bug reports for GNU Emacs,
>  the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org>
> 
> Karl Otness <karl@karlotness.com> writes:
> 
> >> #0  Fsignal (error_symbol=error_symbol@entry=0x2f40, data=0x55bed53eeac3) at eval.c:1681
> >> #1  0x000055bece1213bf in xsignal (data=<optimized out>, error_symbol=0x2f40) at emacs/src/lisp.h:4558
> >> #2  xsignal1 (error_symbol=error_symbol@entry=0x2f40, arg=arg@entry=0x82) at eval.c:1878
> >> #3  0x000055bece1253e3 in get_char_property_and_overlay (position=0x82, prop=0x5a90, object=0x7f105451a265, overlay=0x0) at textprop.c:644
> >> #4  0x000055bece156110 in string_buffer_position_lim (string=string@entry=0x55bed52e8b24, from=from@entry=32, to=to@entry=1032, back_p=back_p@entry=false) at xdisp.c:6246
> >> #5  0x000055bece1561fa in string_buffer_position (string=0x55bed52e8b24, around_charpos=32) at xdisp.c:6284
> >> #6  0x000055bece1aaddb in note_mouse_highlight (f=f@entry=0x55bed1a839e8, x=<optimized out>, y=<optimized out>) at xdisp.c:35339
> >> #7  0x000055bece4039ac in note_mouse_movement (event=0x55bed1ce6030, frame=0x55bed1a839e8) at pgtkterm.c:5821
> >> #8  motion_notify_event (widget=widget@entry=0x55bed2024130, event=0x55bed1ce6030, user_data=<optimized out>) at pgtkterm.c:5905
> >> #9  0x00007f105c684fd8 in _gtk_marshal_BOOLEAN__BOXED (closure=0x55bed1ef9f40, return_value=0x7ffebc5ae480, n_param_values=<optimized out>, param_values=0x7ffebc5ae4e0, invocation_hint=<optimized out>, marshal_data=<optimized out>)
> >>     at gtk/gtkmarshalers.c:84
> >> #10 0x00007f105c095210 in g_closure_invoke (closure=0x55bed1ef9f40, return_value=0x7ffebc5ae480, n_param_values=2, param_values=0x7ffebc5ae4e0, invocation_hint=0x7ffebc5ae460) at ../glib/gobject/gclosure.c:832
> >> #11 0x00007f105c0c2ea8 in signal_emit_unlocked_R.isra.0
> >>     (node=<optimized out>, detail=detail@entry=0, instance=instance@entry=0x55bed2024130, emission_return=emission_return@entry=0x7ffebc5ae5f0, instance_and_params=instance_and_params@entry=0x7ffebc5ae4e0)
> >>     at ../glib/gobject/gsignal.c:3796
> >> #12 0x00007f105c0b2980 in g_signal_emit_valist (instance=<optimized out>, signal_id=<optimized out>, detail=<optimized out>, var_args=var_args@entry=0x7ffebc5ae6a0) at ../glib/gobject/gsignal.c:3559
> >> #13 0x00007f105c0b3204 in g_signal_emit (instance=instance@entry=0x55bed2024130, signal_id=<optimized out>, detail=detail@entry=0) at ../glib/gobject/gsignal.c:3606
> >> #14 0x00007f105c9447f5 in gtk_widget_event_internal.part.0.lto_priv.0 (widget=0x55bed2024130, event=0x55bed1ce6030) at ../gtk/gtk/gtkwidget.c:7812
> >> #15 0x00007f105c7e20db in propagate_event_up (topmost=<optimized out>, event=<optimized out>, widget=0x55bed2024130) at ../gtk/gtk/gtkmain.c:2588
> >> #16 propagate_event (widget=widget@entry=0x55bed2024130, event=event@entry=0x55bed1ce6030, captured=captured@entry=0, topmost=topmost@entry=0x0) at ../gtk/gtk/gtkmain.c:2691
> >> #17 0x00007f105c7e2212 in gtk_propagate_event (widget=widget@entry=0x55bed2024130, event=event@entry=0x55bed1ce6030) at ../gtk/gtk/gtkmain.c:2725
> >> #18 0x00007f105c7e2fbb in gtk_main_do_event (event=<optimized out>) at ../gtk/gtk/gtkmain.c:1921
> >> #19 gtk_main_do_event (event=<optimized out>) at ../gtk/gtk/gtkmain.c:1691
> >> #20 0x00007f105c542cd3 in _gdk_event_emit (event=0x55bed1ce6030) at ../gtk/gdk/gdkevents.c:73
> >> #21 _gdk_event_emit (event=0x55bed1ce6030) at ../gtk/gdk/gdkevents.c:67
> 
> Thanks.  This sounds awfully like another bug that was fixed last month.
> Would someone please take a look at this? note_mouse_highlight should
> never signal.

You cannot require that from note_mouse_highlight, since it looks up
text and overlay properties, and those can signal an error if the
position is outside the valid/reachable range of buffer positions.

Do you understand why note_mouse_highlight was called in this
scenario?  The backtrace seems strange: why should GTK care about our
mouse highlight?






* bug#60144: 30.0.50; PGTK Emacs crashes after signal
  2022-12-18  5:45   ` Eli Zaretskii
@ 2022-12-18  6:22     ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-12-18  8:39       ` Eli Zaretskii
  0 siblings, 1 reply; 11+ messages in thread
From: Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-12-18  6:22 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 60144, karl

Eli Zaretskii <eliz@gnu.org> writes:

> You cannot require that from note_mouse_highlight, since it looks up
> text and overlay properties, and those can signal an error if the
> position is outside the valid/reachable range of buffer positions.

How about simply wrapping those calls in
internal_catch_all/internal_condition_case?
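
A rough sketch of what such wrapping amounts to, using plain
setjmp/longjmp. All names here (toy_signal, call_protected,
toy_lookup) are hypothetical; the real internal_condition_case in
eval.c is considerably more involved, so this only shows the shape of
the idea.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

/* Innermost active handler, or a null pointer if there is none.  */
static jmp_buf *current_handler;

/* Stand-in for signalling an error: jump to the innermost handler.  */
static void
toy_signal (void)
{
  assert (current_handler != 0);
  longjmp (*current_handler, 1);
}

/* Call FN (ARG).  Return true if it completed, false if it signalled.
   Either way control comes back here, so the caller can return to
   whoever called it in the normal way.  */
static bool
call_protected (void (*fn) (int), int arg)
{
  jmp_buf here;
  jmp_buf *const outer = current_handler;

  if (setjmp (here) != 0)
    {
      /* FN signalled: restore the outer handler and report failure.  */
      current_handler = outer;
      return false;
    }
  current_handler = &here;
  fn (arg);
  current_handler = outer;
  return true;
}

static int lookups_done;

/* Toy property lookup: signals when POS is outside 1..100.  */
static void
toy_lookup (int pos)
{
  if (pos < 1 || pos > 100)
    toy_signal ();
  lookups_done++;
}
```

With this, call_protected (toy_lookup, 50) returns true and
call_protected (toy_lookup, 500) returns false, and in both cases the
caller regains control instead of being jumped over.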

> Do you understand why note_mouse_highlight was called in this
> scenario?  The backtrace seems strange: why should GTK care about our
> mouse highlight?

What happens here is that Emacs is reading input through GTK, either
inside xg_select or the read_socket_hook.  GTK then detects some mouse
motion and calls the motion event handler for the frame's widget, which
in turn calls note_mouse_movement.






* bug#60144: 30.0.50; PGTK Emacs crashes after signal
  2022-12-18  6:22     ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-12-18  8:39       ` Eli Zaretskii
  2022-12-18  9:52         ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 11+ messages in thread
From: Eli Zaretskii @ 2022-12-18  8:39 UTC (permalink / raw)
  To: Po Lu; +Cc: 60144, karl

> From: Po Lu <luangruo@yahoo.com>
> Cc: karl@karlotness.com,  60144@debbugs.gnu.org
> Date: Sun, 18 Dec 2022 14:22:04 +0800
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > You cannot require that from note_mouse_highlight, since it looks up
> > text and overlay properties, and those can signal an error if the
> > position is outside the valid/reachable range of buffer positions.
> 
> How about simply wrapping those calls in
> internal_catch_all/internal_condition_case?

This is too drastic, IMO: it would deprive us of valuable diagnostics
when note_mouse_highlight is called.  Like in this case, for example:
having the code signal an error allowed me to find a real bug.  We
were asking string_buffer_position_lim to check properties and
overlays for positions outside the BEGV..ZV interval, which can never
do anything useful, even if it isn't called in the PGTK context.  I've
now fixed that on the release branch.

In general, note_mouse_highlight should never examine invalid buffer
or string positions.  If it does, it's a bug that needs to be fixed.
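
As an illustration only (this is not the change that was committed,
and the names are hypothetical), keeping a search range inside the
accessible region can be sketched as a clamp of both endpoints to
BEGV..ZV:

```c
#include <assert.h>

/* Hypothetical search range; the struct and names are illustrative.  */
struct range
{
  long from;
  long to;
};

/* Clamp POS to the accessible region [BEGV, ZV].  */
static long
clamp_position (long pos, long begv, long zv)
{
  if (pos < begv)
    return begv;
  if (pos > zv)
    return zv;
  return pos;
}

/* Clamp both ends of [FROM, TO] so that no lookup is attempted at a
   position outside [BEGV, ZV].  */
static struct range
clamp_to_accessible (long from, long to, long begv, long zv)
{
  struct range r;

  assert (begv <= zv);
  r.from = clamp_position (from, begv, zv);
  r.to = clamp_position (to, begv, zv);
  return r;
}
```

For example, with from=32 and to=1032 as in the backtrace, and an
arbitrarily chosen accessible region of 1..500, the clamped range is
32..500.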

> > Do you understand why note_mouse_highlight was called in this
> > scenario?  The backtrace seems strange: why should GTK care about our
> > mouse highlight?
> 
> What happens here is that Emacs is reading input through GTK, either
> inside xg_select or the read_socket_hook.  GTK then detects some mouse
> motion and calls the motion event handler for the frame's widget, which
> in turn calls note_mouse_movement.

Why this fragile architecture of reading input events?  Calling
functions of our Lisp machine from context where those functions
cannot signal an error is very dangerous, and cannot work well in
Emacs.  Why cannot we have the reads through GTK only deliver events
to us, which we enqueue to our own event queue, and then we could
process that queue in the safe context of the Lisp machine, as (AFAIK)
we do on other platforms?
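
A minimal sketch of the arrangement being proposed, assuming a
fixed-size ring buffer. The types and functions are hypothetical and
do not correspond to anything in keyboard.c: the toolkit callback only
records the event, and the main loop dequeues it later.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_event_kind { TOY_MOTION, TOY_LEAVE };

struct toy_event
{
  enum toy_event_kind kind;
  int x;
  int y;
};

#define QUEUE_SIZE 64   /* power of two, so unsigned wraparound is safe */

static struct toy_event queue[QUEUE_SIZE];
static unsigned head;   /* index of the next event to read */
static unsigned tail;   /* index of the next free slot */

/* Called from the toolkit callback: only records the event.  It runs
   no code that could signal, so it always returns to the toolkit.  */
static bool
enqueue_event (struct toy_event ev)
{
  if (tail - head == QUEUE_SIZE)
    return false;       /* queue full: drop the event */
  queue[tail++ % QUEUE_SIZE] = ev;
  return true;
}

/* Called later from the main loop, where a non-local exit does not
   cross any toolkit frames.  Returns false when the queue is empty.  */
static bool
dequeue_event (struct toy_event *out)
{
  assert (out != 0);
  if (head == tail)
    return false;
  *out = queue[head++ % QUEUE_SIZE];
  return true;
}
```

Events come out in the order they went in. What to do when the queue
is full (drop, coalesce, or grow) is a design decision this sketch
does not address.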






* bug#60144: 30.0.50; PGTK Emacs crashes after signal
  2022-12-18  8:39       ` Eli Zaretskii
@ 2022-12-18  9:52         ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-12-18 11:43           ` Eli Zaretskii
  0 siblings, 1 reply; 11+ messages in thread
From: Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-12-18  9:52 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 60144, karl

Eli Zaretskii <eliz@gnu.org> writes:

>> > Do you understand why note_mouse_highlight was called in this
>> > scenario?  The backtrace seems strange: why should GTK care about our
>> > mouse highlight?
>> 
>> What happens here is that Emacs is reading input through GTK, either
>> inside xg_select or the read_socket_hook.  GTK then detects some mouse
>> motion and calls the motion event handler for the frame's widget, which
>> in turn calls note_mouse_movement.
>
> Why this fragile architecture of reading input events?  Calling
> functions of our Lisp machine from context where those functions
> cannot signal an error is very dangerous, and cannot work well in
> Emacs.  Why cannot we have the reads through GTK only deliver events
> to us, which we enqueue to our own event queue, and then we could
> process that queue in the safe context of the Lisp machine, as (AFAIK)
> we do on other platforms?

No, signalling there is equally unsafe on the other platforms, where
note_mouse_highlight is called from the same place(s): read_socket_hook,
event_handler_gdk, et cetera.  Just look at the callers of
x_note_mouse_movement in xterm.c, or [EmacsView mouseMoved:] in
nsterm.m.

But I see you already fixed the bug.  Thanks.






* bug#60144: 30.0.50; PGTK Emacs crashes after signal
  2022-12-18  9:52         ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-12-18 11:43           ` Eli Zaretskii
  2022-12-18 12:12             ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 11+ messages in thread
From: Eli Zaretskii @ 2022-12-18 11:43 UTC (permalink / raw)
  To: Po Lu; +Cc: 60144, karl

> From: Po Lu <luangruo@yahoo.com>
> Cc: karl@karlotness.com,  60144@debbugs.gnu.org
> Date: Sun, 18 Dec 2022 17:52:42 +0800
> 
> > Why this fragile architecture of reading input events?  Calling
> > functions of our Lisp machine from context where those functions
> > cannot signal an error is very dangerous, and cannot work well in
> > Emacs.  Why cannot we have the reads through GTK only deliver events
> > to us, which we enqueue to our own event queue, and then we could
> > process that queue in the safe context of the Lisp machine, as (AFAIK)
> > we do on other platforms?
> 
> No, signalling there is equally unsafe on the other platforms, where
> note_mouse_highlight is called from the same place(s): read_socket_hook,
> event_handler_gdk, et cetera.  Just look at the callers of
> x_note_mouse_movement in xterm.c, or [EmacsView mouseMoved:] in
> nsterm.m.

Sorry, I'm afraid I don't see the danger on other platforms.  Please
explain.  AFAIK, read_socket_hook is called from keyboard.c code which
reads input, and that code has no problem signaling an error.  What am
I missing?






* bug#60144: 30.0.50; PGTK Emacs crashes after signal
  2022-12-18 11:43           ` Eli Zaretskii
@ 2022-12-18 12:12             ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-12-18 12:33               ` Eli Zaretskii
  0 siblings, 1 reply; 11+ messages in thread
From: Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-12-18 12:12 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 60144, karl

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Po Lu <luangruo@yahoo.com>
>> Cc: karl@karlotness.com,  60144@debbugs.gnu.org
>> Date: Sun, 18 Dec 2022 17:52:42 +0800
>> 
>> > Why this fragile architecture of reading input events?  Calling
>> > functions of our Lisp machine from context where those functions
>> > cannot signal an error is very dangerous, and cannot work well in
>> > Emacs.  Why cannot we have the reads through GTK only deliver events
>> > to us, which we enqueue to our own event queue, and then we could
>> > process that queue in the safe context of the Lisp machine, as (AFAIK)
>> > we do on other platforms?
>> 
>> No, signalling there is equally unsafe on the other platforms, where
>> note_mouse_highlight is called from the same place(s): read_socket_hook,
>> event_handler_gdk, et cetera.  Just look at the callers of
>> x_note_mouse_movement in xterm.c, or [EmacsView mouseMoved:] in
>> nsterm.m.
>
> Sorry, I'm afraid I don't see the danger on other platforms.  Please
> explain.  AFAIK, read_socket_hook is called from keyboard.c code which
> reads input, and that code has no problem signaling an error.  What am
> I missing?

That code has problems signalling errors, unless it is okay for
unblock_input to signal.

On the regular X build with GTK, handle_one_xevent is called from
event_handler_gdk, which is called by GDK when it detects an event.
handle_one_xevent can also be called from x_dispatch_event inside a
popup menu, and during drag-and-drop.  On NS, [EmacsView mouseMoved:] is
called by the system from Objective-C.

Out of all of those places, the only place where it is safe to signal is
inside the drag-and-drop event loop.  Signalling out of the rest will
either lead to catastrophic blowups (if it happens inside
event_handler_gdk or [EmacsView mouseMoved:]), or to grabs never being
released and resource leaks inside a menu.






* bug#60144: 30.0.50; PGTK Emacs crashes after signal
  2022-12-18 12:12             ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-12-18 12:33               ` Eli Zaretskii
  2022-12-18 13:45                 ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 11+ messages in thread
From: Eli Zaretskii @ 2022-12-18 12:33 UTC (permalink / raw)
  To: Po Lu; +Cc: 60144, karl

> From: Po Lu <luangruo@yahoo.com>
> Cc: karl@karlotness.com,  60144@debbugs.gnu.org
> Date: Sun, 18 Dec 2022 20:12:53 +0800
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> No, signalling there is equally unsafe on the other platforms, where
> >> note_mouse_highlight is called from the same place(s): read_socket_hook,
> >> event_handler_gdk, et cetera.  Just look at the callers of
> >> x_note_mouse_movement in xterm.c, or [EmacsView mouseMoved:] in
> >> nsterm.m.
> >
> > Sorry, I'm afraid I don't see the danger on other platforms.  Please
> > explain.  AFAIK, read_socket_hook is called from keyboard.c code which
> > reads input, and that code has no problem signaling an error.  What am
> > I missing?
> 
> That code has problems signalling errors, unless it is okay for
> unblock_input to signal.

I don't understand this part.  Why and how is unblock_input part of
the picture?

> On the regular X build with GTK, handle_one_xevent is called from
> event_handler_gdk, which is called by GDK when it detects an event.
> handle_one_xevent can also be called from x_dispatch_event inside a
> popup menu, and during drag-and-drop.  On NS, [EmacsView mouseMoved:] is
> called by the system from Objective-C.

So in the X/GTK build we have the same problem as with PGTK?  If so,
why not change that as well, to work as I described, i.e. enqueue
events to our own event queue, which we will then read and process in
safe context?

AFAIU, w32 already works like that.  Does it not?

(As for NS, I know it does some very dangerous stuff.)

> Out of all of those places, the only place where it is safe to signal is
> inside the drag-and-drop event loop.  Signalling out of the rest will
> either lead to catastrophic blowups (if it happens inside
> event_handler_gdk or [EmacsView mouseMoved:]), or to grabs never being
> released and resource leaks inside a menu.

Yes, understood.  But it just tells me that we need to change the
architecture so that the events delivered by the window-system are not
processed in callbacks we install to be called by the window-system,
they should be processed in our own safe context.






* bug#60144: 30.0.50; PGTK Emacs crashes after signal
  2022-12-18 12:33               ` Eli Zaretskii
@ 2022-12-18 13:45                 ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-12-18 17:34                   ` Eli Zaretskii
  0 siblings, 1 reply; 11+ messages in thread
From: Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-12-18 13:45 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 60144, karl

Eli Zaretskii <eliz@gnu.org> writes:

>> That code has problems signalling errors, unless it is okay for
>> unblock_input to signal.
>
> I don't understand this part.  Why and how is unblock_input part of
> the picture?

Because unblock_input can call process_pending_signals, and in doing so
handle_async_input, which calls gobble_input (and thus the
read_socket_hook.)  As a result, it is not safe for any read_socket_hook
to signal as long as it is not ok for unblock_input to signal as well.
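
The call chain can be modelled with a toy version of the blocking
counter. The toy_ names below are hypothetical and greatly simplified
relative to keyboard.c; the point is only that the deferred hook runs
inside whichever call unblocks input last.

```c
#include <assert.h>
#include <stdbool.h>

static int input_blocked;    /* nesting depth of toy_block_input */
static bool input_pending;   /* input arrived while blocked */
static int hook_calls;       /* number of times the read hook ran */

/* Stand-in for a read_socket_hook.  Whatever it does, including a
   non-local exit, happens on behalf of the caller shown below.  */
static void
toy_read_socket_hook (void)
{
  hook_calls++;
}

static void
toy_block_input (void)
{
  input_blocked++;
}

/* Input arrived: handle it at once only if input is not blocked.  */
static void
toy_input_arrived (void)
{
  if (input_blocked > 0)
    input_pending = true;
  else
    toy_read_socket_hook ();
}

/* Leaving the outermost blocked region runs the deferred hook, so the
   hook executes inside this function's caller.  */
static void
toy_unblock_input (void)
{
  assert (input_blocked > 0);
  if (--input_blocked == 0 && input_pending)
    {
      input_pending = false;
      toy_read_socket_hook ();
    }
}
```

With two nested toy_block_input calls and input arriving in between,
the hook does not run until the second toy_unblock_input. If the hook
could exit non-locally, every such outermost unblock would be a
possible exit point.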

> So in the X/GTK build we have the same problem as with PGTK?  If so,
> why not change that as well, to work as I described, i.e. enqueue
> events to our own event queue, which we will then read and process in
> safe context?
>
> AFAIU, w32 already works like that.  Does it not?

It doesn't, see how w32_note_mouse_movement is called from
w32_read_socket.

> Yes, understood.  But it just tells me that we need to change the
> architecture so that the events delivered by the window-system are not
> processed in callbacks we install to be called by the window-system,
> they should be processed in our own safe context.

The problem is note_mouse_highlight is simply not supposed to signal.
It is a function called directly while handling async input as far back
as Emacs 19, much like expose_frame.  (IIRC back then there was a
slightly different implementation in each of the *term.c files.)

Moving note_mouse_highlight out of handle_one_xevent would lead to other
bugs, since mouse movement must be processed in order wrt other X
events.  For example, if an XI_Motion event arrives and is queued, and
then a subsequent XI_Leave event arrives before that event has a chance
to be processed ``in our own safe context'', note_mouse_highlight will
be called after the mouse has left the frame, leading to stuck mouse
highlight.
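A minimal sketch of that scenario, assuming that only the motion event is
deferred while the leave event is still handled at once.  The struct and
function names are hypothetical and exist only for this illustration; they
are not the real xterm.c code:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not the real Emacs structures.  */
struct toy_frame
{
  bool mouse_inside;   /* is the pointer over the frame? */
  bool highlight_on;   /* is mouse highlight currently shown? */
};

/* Stand-in for note_mouse_highlight: turns the highlight on.  */
static void
toy_note_mouse_highlight (struct toy_frame *f)
{
  f->highlight_on = true;
}

/* Stand-in for leave handling: the pointer left, clear the highlight.  */
static void
toy_handle_leave (struct toy_frame *f)
{
  f->mouse_inside = false;
  f->highlight_on = false;
}

/* Return true if the highlight ends up shown although the pointer has
   already left the frame.  */
bool
highlight_stuck_when_motion_deferred (void)
{
  struct toy_frame f = { .mouse_inside = true, .highlight_on = false };
  bool motion_pending = true;     /* motion arrives and is only queued */

  toy_handle_leave (&f);          /* leave arrives, handled immediately */

  if (motion_pending)             /* later, in the deferred context */
    toy_note_mouse_highlight (&f);

  return f.highlight_on && !f.mouse_inside;
}
```

Under that assumption the function returns true, which is the stuck
highlight described above.  The outcome depends entirely on the leave
event overtaking the deferred motion event.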






* bug#60144: 30.0.50; PGTK Emacs crashes after signal
  2022-12-18 13:45                 ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-12-18 17:34                   ` Eli Zaretskii
  0 siblings, 0 replies; 11+ messages in thread
From: Eli Zaretskii @ 2022-12-18 17:34 UTC (permalink / raw)
  To: Po Lu; +Cc: 60144, karl

> From: Po Lu <luangruo@yahoo.com>
> Cc: karl@karlotness.com,  60144@debbugs.gnu.org
> Date: Sun, 18 Dec 2022 21:45:38 +0800
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> That code has problems signalling errors, unless it is okay for
> >> unblock_input to signal.
> >
> > I don't understand this part.  Why and how is unblock_input part of
> > the picture?
> 
> Because unblock_input can call process_pending_signals, and in doing so
> handle_async_input, which calls gobble_input (and thus the
> read_socket_hook.)  As a result, it is not safe for any read_socket_hook
> to signal as long as it is not ok for unblock_input to signal as well.

But AFAIK it is _always_ safe for unblock_input to signal.  When do
you think it isn't?

> > So in the X/GTK build we have the same problem as with PGTK?  If so,
> > why not change that as well, to work as I described, i.e. enqueue
> > events to our own event queue, which we will then read and process in
> > safe context?
> >
> > AFAIU, w32 already works like that.  Does it not?
> 
> It doesn't, see how w32_note_mouse_movement is called from
> w32_read_socket.

??? w32_read_socket runs in the Lisp (a.k.a. "main") thread.  So it is
safe for any code it calls to signal errors and do anything else
it's safe to do for the Lisp machine.

> > Yes, understood.  But it just tells me that we need to change the
> > architecture so that the events delivered by the window-system are not
> > processed in callbacks we install to be called by the window-system,
> > they should be processed in our own safe context.
> 
> The problem is note_mouse_highlight is simply not supposed to signal.

You cannot request that.  note_mouse_highlight examines properties,
and that can always signal, because properties are Lisp creatures and
can invoke Lisp.

> It is a function called directly while handling async input as far back
> as Emacs 19, much like expose_frame.  (IIRC back then there was a
> slightly different implementation in each of the *term.c files.)

We did a lot of dangerous stuff in Emacs 19, including (oh horror!)
reading input and doing redisplay inside signal handlers.  We
gradually removed all those unsafe parts, and nowadays we only do the
minimum in such contexts.  If unsafe processing of input is still with
us, we should move to safer techniques.  That this unsafe code exists
for such a long time is therefore not a justification for it to
continue existing.

Also, I think this unsafe processing of events only happens with
GTK/PGTK; other X configurations call XTread_socket and
handle_one_xevent from keyboard.c, in a "safe" context.

> Moving note_mouse_highlight out of handle_one_xevent would lead to other
> bugs, since mouse movement must be processed in order wrt other X
> events.

I didn't say we shouldn't process mouse movements.  I said that this
processing should be limited to generating an Emacs input event and
queuing it, with all the pertinent information for further processing.
For example, note_mouse_highlight does just three things:

  . redisplays a portion of the window in a special face
  . changes the way the cursor is drawn
  . shows help-echo

All of that can be done given an input event read by the terminal's
read_socket_hook inside keyboard.c, provided that the information
about the mouse move is stored in the input event.  There's
absolutely no reason to produce the above 3 effects right where we are
fed the raw X or GTK event from the window-system or the toolkit.

> For example, if an XI_Motion event arrives and is queued, and
> then a subsequent XI_Leave event arrives before that event has a chance
> to be processed ``in our own safe context'', note_mouse_highlight will
> be called after the mouse has left the frame, leading to stuck mouse
> highlight.

AFAIU, these two events will both be queued, and will both be
processed, so there will be no "stuck mouse highlight".
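To make the argument concrete, here is a minimal sketch of that design: the
toolkit callback only records events in a FIFO queue, and a later drain
step produces the visible effects in arrival order.  The queue, the event
struct and all function names are hypothetical and are not the actual
keyboard.c input queue:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not the real Emacs event queue.  */
enum toy_kind { TOY_MOTION, TOY_LEAVE };

struct toy_event
{
  enum toy_kind kind;
  int x, y;                       /* pointer position, for motion */
};

#define TOY_QUEUE_CAP 16

struct toy_queue
{
  struct toy_event buf[TOY_QUEUE_CAP];
  size_t head, count;
};

struct toy_frame
{
  bool mouse_inside;
  bool highlight_on;
};

/* Called from the toolkit callback: only records the event, and does
   nothing that could signal.  Returns false if the queue is full.  */
static bool
toy_enqueue (struct toy_queue *q, struct toy_event ev)
{
  if (q->count == TOY_QUEUE_CAP)
    return false;
  q->buf[(q->head + q->count) % TOY_QUEUE_CAP] = ev;
  q->count++;
  return true;
}

/* Called later from the main loop: handles events in arrival order.  */
static void
toy_drain (struct toy_queue *q, struct toy_frame *f)
{
  while (q->count > 0)
    {
      struct toy_event ev = q->buf[q->head];
      q->head = (q->head + 1) % TOY_QUEUE_CAP;
      q->count--;
      switch (ev.kind)
        {
        case TOY_MOTION:
          f->mouse_inside = true;
          f->highlight_on = true;
          break;
        case TOY_LEAVE:
          f->mouse_inside = false;
          f->highlight_on = false;
          break;
        }
    }
}

/* Return true if, with both events queued, the highlight ends up off.  */
bool
highlight_cleared_when_both_queued (void)
{
  struct toy_queue q = { .head = 0, .count = 0 };
  struct toy_frame f = { .mouse_inside = false, .highlight_on = false };

  bool ok = toy_enqueue (&q, (struct toy_event) { TOY_MOTION, 10, 20 });
  ok = toy_enqueue (&q, (struct toy_event) { TOY_LEAVE, 0, 0 }) && ok;
  assert (ok);

  toy_drain (&q, &f);
  return !f.highlight_on && !f.mouse_inside;
}
```

Because the motion and leave events go through the same queue, the leave
is always handled last here and the highlight ends up cleared.  Whether
the real code can route every relevant event through one queue is exactly
the open question in this exchange.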

So I still don't understand what I'm missing that makes you say this
safe processing of window-system events is impossible.

Anyway, this bug report is not the proper place to discuss this.
Please start a discussion on emacs-devel, and let's pick this up
there.






end of thread, other threads:[~2022-12-18 17:34 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
2022-12-17  3:39 bug#60144: 30.0.50; PGTK Emacs crashes after signal Karl Otness via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-12-18  2:08 ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-12-18  5:45   ` Eli Zaretskii
2022-12-18  6:22     ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-12-18  8:39       ` Eli Zaretskii
2022-12-18  9:52         ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-12-18 11:43           ` Eli Zaretskii
2022-12-18 12:12             ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-12-18 12:33               ` Eli Zaretskii
2022-12-18 13:45                 ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-12-18 17:34                   ` Eli Zaretskii
