all messages for Emacs-related lists mirrored at yhetil.org
* igc, macOS avoiding signals
@ 2024-12-28  6:40 Gerd Möllmann
  2024-12-28 12:49 ` Pip Cet via Emacs development discussions.
  0 siblings, 1 reply; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-28  6:40 UTC (permalink / raw)
  To: Pip Cet; +Cc: Emacs Devel

This is about commit

ceec5ace134081b64dbf46c4fb5702ef5209c5fd
Avoid MPS being interrupted by signals

I've been running with this for some days now, and must report that
Emacs feels a _bit_ different here in interactive use, maybe one could
say not as smooth. (macOS, --without-ns, in my fork of Emacs, which is
very recent master++).

After reverting the commit, it's feeling smoother again.

Just saying. FWIW.

Maybe it's a point for making things conditional on OS, don't know.




* Re: igc, macOS avoiding signals
  2024-12-28  6:40 Gerd Möllmann
@ 2024-12-28 12:49 ` Pip Cet via Emacs development discussions.
  2024-12-28 12:55   ` Gerd Möllmann
  0 siblings, 1 reply; 119+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-28 12:49 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Emacs Devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> This is about commit
>
> ceec5ace134081b64dbf46c4fb5702ef5209c5fd
> Avoid MPS being interrupted by signals
>
> I've been running with this for some days now, and must report that
> Emacs feels a _bit_ different here in interactive use, maybe one could
> say not as smooth. (macOS, --without-ns, in my fork of Emacs, which is
> very recent master++).

I think we should quantify that.  Set a watchpoint on
global_igc->signals_pending, check how often we even set that, and how
often we call gc_maybe_quit, particularly if we were previously idle.
Maybe it's sufficient to call it again from the idle handler.

> After reverting the commit, it's feeling smoother again.

Entirely possible.  Let's measure it.

> Maybe it's a point for making things conditional on OS, don't know.

If we establish it's not necessary on macOS, sure.  If it slows things
down that's kind of a hint that it might be necessary, though.

Pip





* Re: igc, macOS avoiding signals
  2024-12-28 12:49 ` Pip Cet via Emacs development discussions.
@ 2024-12-28 12:55   ` Gerd Möllmann
  2024-12-28 13:50     ` Óscar Fuentes
  0 siblings, 1 reply; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-28 12:55 UTC (permalink / raw)
  To: Pip Cet; +Cc: Emacs Devel

Pip Cet <pipcet@protonmail.com> writes:

> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
>
>> This is about commit
>>
>> ceec5ace134081b64dbf46c4fb5702ef5209c5fd
>> Avoid MPS being interrupted by signals
>>
>> I've been running with this for some days now, and must report that
>> Emacs feels a _bit_ different here in interactive use, maybe one could
>> say not as smooth. (macOS, --without-ns, in my fork of Emacs, which is
>> very recent master++).
>
> I think we should quantify that.  Set a watchpoint on
> igc_global->signals_pending, check how often we even set that, and how
> often we call igc_maybe_quit, particularly if we were previously idle.
> Maybe it's sufficient to call it again from the idle handler.
>
>> After reverting the commit, it's feeling smoother again.
>
> Entirely possible.  Let's measure it.

I'm sorry, but I pass. It's too time-consuming for me. Maybe someone
else using macOS can do that. I just wanted to make you aware of this.




* Re: igc, macOS avoiding signals
@ 2024-12-28 13:24 Sean Devlin
  2024-12-28 13:28 ` Gerd Möllmann
                   ` (2 more replies)
  0 siblings, 3 replies; 119+ messages in thread
From: Sean Devlin @ 2024-12-28 13:24 UTC (permalink / raw)
  To: gerd.moellmann, pipcet; +Cc: emacs-devel

> Pip Cet <pipcet@protonmail.com> writes:
> 
> > Gerd Möllmann <gerd.moellmann@gmail.com> writes:
> >
> >> This is about commit
> >>
> >> ceec5ace134081b64dbf46c4fb5702ef5209c5fd
> >> Avoid MPS being interrupted by signals
> >>
> >> I've been running with this for some days now, and must report that
> >> Emacs feels a _bit_ different here in interactive use, maybe one could
> >> say not as smooth. (macOS, --without-ns, in my fork of Emacs, which is
> >> very recent master++).
> >
> > I think we should quantify that.  Set a watchpoint on
> > igc_global->signals_pending, check how often we even set that, and how
> > often we call igc_maybe_quit, particularly if we were previously idle.
> > Maybe it's sufficient to call it again from the idle handler.
> >
> >> After reverting the commit, it's feeling smoother again.
> >
> > Entirely possible.  Let's measure it.
> 
> I'm sorry, but I pass. It's too time-consuming for me. Maybe someone
> else using macOS can do that. I just wanted to make you aware of this.

I can take a stab at this. I’ve been running the scratch/igc branch on
macOS using the NS build.

To be clear, should I include the above commit in my build, or leave
it out?

When I take my measurements, are there particular tasks I should
perform? Over what durations should I measure?

Cheers.



* Re: igc, macOS avoiding signals
  2024-12-28 13:24 igc, macOS avoiding signals Sean Devlin
@ 2024-12-28 13:28 ` Gerd Möllmann
  2024-12-28 14:31   ` Eli Zaretskii
  2024-12-28 15:12 ` Pip Cet via Emacs development discussions.
  2024-12-28 16:29 ` Pip Cet via Emacs development discussions.
  2 siblings, 1 reply; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-28 13:28 UTC (permalink / raw)
  To: Sean Devlin; +Cc: pipcet, emacs-devel

Sean Devlin <spd@toadstyle.org> writes:

>> Pip Cet <pipcet@protonmail.com> writes:
>> 
>> > Gerd Möllmann <gerd.moellmann@gmail.com> writes:
>> >
>> >> This is about commit
>> >>
>> >> ceec5ace134081b64dbf46c4fb5702ef5209c5fd
>> >> Avoid MPS being interrupted by signals
>> >>
>> >> I've been running with this for some days now, and must report that
>> >> Emacs feels a _bit_ different here in interactive use, maybe one could
>> >> say not as smooth. (macOS, --without-ns, in my fork of Emacs, which is
>> >> very recent master++).
>> >
>> > I think we should quantify that.  Set a watchpoint on
>> > igc_global->signals_pending, check how often we even set that, and how
>> > often we call igc_maybe_quit, particularly if we were previously idle.
>> > Maybe it's sufficient to call it again from the idle handler.
>> >
>> >> After reverting the commit, it's feeling smoother again.
>> >
>> > Entirely possible.  Let's measure it.
>> 
>> I'm sorry, but I pass. It's too time-consuming for me. Maybe someone
>> else using macOS can do that. I just wanted to make you aware of this.
>
> I can take a stab at this. I’ve been running the scratch/igc branch on
> macOS using the NS build.
>
> To be clear, I should make sure to include the above commit in my
> build? Or should I not include it?

Just leave it in. I reverted it only in my Emacs.

> When I take my measurements, are there any kinds of tasks I should try
> to perform? What durations of time should I try to measure?
>
> Cheers.

That's something for Pip to say.




* Re: igc, macOS avoiding signals
  2024-12-28 12:55   ` Gerd Möllmann
@ 2024-12-28 13:50     ` Óscar Fuentes
  2024-12-29  8:02       ` Helmut Eller
  0 siblings, 1 reply; 119+ messages in thread
From: Óscar Fuentes @ 2024-12-28 13:50 UTC (permalink / raw)
  To: emacs-devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

>>> After reverting the commit, it's feeling smoother again.
>>
>> Entirely possible.  Let's measure it.
>
> I'm sorry, but I pass. It's too time-consuming for me. Maybe someone
> else using macOS can do that. I just wanted to make you aware of this.

Is it difficult to implement a way to measure some relevant metrics?

Something related to UI responsiveness would be great. For instance,
record the time from each interactive command start to command end (or,
better, until Emacs is idle again, to account for commands accumulating
on the queue). Then we can perform some statistical analysis on that
info.

I'm afraid that if we start discussing personal perceptions we will
devote a lot of time trying to fine-adjust parameters.
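
One way the "statistical analysis" step could look, as a minimal sketch only: given per-command latencies that some instrumentation has already recorded, report nearest-rank percentiles such as the median and the 95th. The function name `latency_percentile` and the sample layout are assumptions made for illustration; nothing like this exists in the Emacs sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* qsort comparator for doubles, ascending.  */
static int
compare_doubles (const void *a, const void *b)
{
  double x = *(const double *) a;
  double y = *(const double *) b;
  return (x > y) - (x < y);
}

/* Hypothetical helper: return the nearest-rank percentile PCT
   (1..100) of the N latency samples in SAMPLES.  Sorts SAMPLES in
   place.  */
static double
latency_percentile (double *samples, size_t n, unsigned pct)
{
  assert (n > 0 && pct >= 1 && pct <= 100);
  qsort (samples, n, sizeof *samples, compare_doubles);
  /* Integer form of ceil (pct * n / 100); always within 1..n.  */
  size_t rank = (pct * n + 99) / 100;
  return samples[rank - 1];
}
```

Comparing the median and a high percentile between builds with and without the commit would say more about perceived smoothness than a mean would, since occasional long stalls barely move the mean.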





* Re: igc, macOS avoiding signals
  2024-12-28 13:28 ` Gerd Möllmann
@ 2024-12-28 14:31   ` Eli Zaretskii
  2024-12-28 14:45     ` Gerd Möllmann
  0 siblings, 1 reply; 119+ messages in thread
From: Eli Zaretskii @ 2024-12-28 14:31 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: spd, pipcet, emacs-devel

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: pipcet@protonmail.com,  emacs-devel@gnu.org
> Date: Sat, 28 Dec 2024 14:28:57 +0100
> 
> Sean Devlin <spd@toadstyle.org> writes:
> 
> > To be clear, I should make sure to include the above commit in my
> > build? Or should I not include it?
> 
> Just leave it in. I reverted it only in my Emacs.

Can you show that MPS runs GC from a non-main thread?  Like, if you
set a breakpoint in ArenaEnter, does it ever break on a non-main
thread?

I've just reviewed all the backtraces we had in our MPS discussions,
and all I see there is MPS being triggered either from igc_on_idle or
from alloc_impl, but every single time it happens on the main thread,
the one where I see the main function that enters recursive-edit.

So maybe Pip is right, and MPS always runs in the main (Lisp) thread,
even on macOS?  Can you catch it on a non-main thread?




* Re: igc, macOS avoiding signals
  2024-12-28 14:31   ` Eli Zaretskii
@ 2024-12-28 14:45     ` Gerd Möllmann
  2024-12-30  7:13       ` Gerd Möllmann
  0 siblings, 1 reply; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-28 14:45 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: spd, pipcet, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> So maybe Pip is right, and MPS always runs in the main (Lisp) thread,
> even on macOS?  Can you catch it on a non-main thread?

It's well possible that I misunderstand what the MPS guide says about it
being concurrent (see my reply to Pip), and that the thread I see here
is something else.

If you don't see an additional thread on Linux, just don't listen to me
and do what you think is TRT. I don't know anything about MPS internals.




* Re: igc, macOS avoiding signals
  2024-12-28 13:24 igc, macOS avoiding signals Sean Devlin
  2024-12-28 13:28 ` Gerd Möllmann
@ 2024-12-28 15:12 ` Pip Cet via Emacs development discussions.
  2024-12-28 17:30   ` Eli Zaretskii
  2024-12-28 16:29 ` Pip Cet via Emacs development discussions.
  2 siblings, 1 reply; 119+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-28 15:12 UTC (permalink / raw)
  To: Sean Devlin; +Cc: gerd.moellmann, emacs-devel

"Sean Devlin" <spd@toadstyle.org> writes:

>> Pip Cet <pipcet@protonmail.com> writes:
>>
>> > Gerd Möllmann <gerd.moellmann@gmail.com> writes:
>> >
>> >> This is about commit
>> >>
>> >> ceec5ace134081b64dbf46c4fb5702ef5209c5fd
>> >> Avoid MPS being interrupted by signals
>> >>
>> >> I've been running with this for some days now, and must report that
>> >> Emacs feels a _bit_ different here in interactive use, maybe one could
>> >> say not as smooth. (macOS, --without-ns, in my fork of Emacs, which is
>> >> very recent master++).
>> >
>> > I think we should quantify that.  Set a watchpoint on
>> > igc_global->signals_pending, check how often we even set that, and how
>> > often we call igc_maybe_quit, particularly if we were previously idle.
>> > Maybe it's sufficient to call it again from the idle handler.
>> >
>> >> After reverting the commit, it's feeling smoother again.
>> >
>> > Entirely possible.  Let's measure it.
>>
>> I'm sorry, but I pass. It's too time-consuming for me. Maybe someone
>> else using macOS can do that. I just wanted to make you aware of this.
>
> I can take a stab at this. I’ve been running the scratch/igc branch on macOS using the NS build.

Thanks!

> To be clear, I should make sure to include the above commit in my build? Or should I not include it?

I think we probably need to put instrumentation in the source code, so
we gain some idea of how long signals are delayed for when we mark them
pending.

(I'm also noticing that gc_maybe_quit isn't called as much as I thought
it would be.  Maybe we need to call it explicitly when Emacs becomes
idle?)

I'll try to come up with a patch.

Pip





* Re: igc, macOS avoiding signals
  2024-12-28 13:24 igc, macOS avoiding signals Sean Devlin
  2024-12-28 13:28 ` Gerd Möllmann
  2024-12-28 15:12 ` Pip Cet via Emacs development discussions.
@ 2024-12-28 16:29 ` Pip Cet via Emacs development discussions.
  2024-12-29  2:21   ` Sean Devlin
  2 siblings, 1 reply; 119+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-28 16:29 UTC (permalink / raw)
  To: Sean Devlin; +Cc: gerd.moellmann, emacs-devel

> I'll try to come up with a patch.

This should provide some data (on stderr) about which signals we delay,
and for how long (the "delayed" messages).  It also includes some
information on additional points at which we can detect whether signals
are pending (the "delaying" messages); it's probably safe to run them at
that point, but the code might need some changes because other signals
(or even the signal in question) might be legitimately blocked when we
reach that point.

If the "delaying" messages indicate acceptable (initial) delays, we
might get away with simply calling gc_maybe_quit more often.  If they
don't, further fixes will be necessary, or we need to find more such
points.

On POSIX systems where we can spare an additional signal, we can run a
separate thread to ask us to retry running signal handlers when the
arena lock might be available again.

Or we could move to a separate thread for slow-path allocations.

diff --git a/src/igc.c b/src/igc.c
index 4d9aec90340..1402b9d10a5 100644
--- a/src/igc.c
+++ b/src/igc.c
@@ -811,6 +811,7 @@ IGC_DEFINE_LIST (igc_thread);
   /* Per-signal flags.  These are `int' to reduce the chance of
    * corruption when accessed non-atomically.  */
   int pending_signals[64];
+  struct timespec pending_since[64];
   /* The real signal mask we want to restore after handling pending
    * signals.  */
   sigset_t signal_mask;
@@ -3833,6 +3834,7 @@ alloc_impl (size_t size, enum igc_obj_type type, mps_ap_t ap)
     case IGC_STATE_INITIAL:
       emacs_abort ();
     }
+  gc_maybe_dont_quit ();
   return p;
 }
 
@@ -4924,8 +4926,9 @@ gc_signal_handler_can_run (int sig)
   if (igc_busy_p ())
     {
       sigset_t sigs;
-      global_igc->signals_pending = 1;
+      clock_gettime (CLOCK_REALTIME, &global_igc->pending_since[sig]);
       global_igc->pending_signals[sig] = 1;
+      global_igc->signals_pending = 1;
       sigemptyset (&sigs);
       sigaddset (&sigs, sig);
       pthread_sigmask (SIG_BLOCK, &sigs, NULL);
@@ -4946,6 +4949,12 @@ gc_maybe_quit (void)
       for (int i = 0; i < ARRAYELTS (global_igc->pending_signals); i++)
 	if (global_igc->pending_signals[i])
 	  {
+	    struct timespec ts;
+	    clock_gettime (CLOCK_REALTIME, &ts);
+	    long long nsec = ts.tv_nsec - global_igc->pending_since[i].tv_nsec;
+	    long long sec = ts.tv_sec - global_igc->pending_since[i].tv_sec;
+	    nsec += 1000000000 * sec;
+	    fprintf (stderr, "delayed %d for %f sec\n", i, nsec * 1.0e-9);
 	    global_igc->pending_signals[i] = 0;
 	    raise (i);
 	  }
@@ -4953,6 +4962,23 @@ gc_maybe_quit (void)
     }
 }
 
+void gc_maybe_dont_quit (void)
+{
+  if (global_igc->signals_pending)
+    {
+      for (int i = 0; i < ARRAYELTS (global_igc->pending_signals); i++)
+	if (global_igc->pending_signals[i])
+	  {
+	    struct timespec ts;
+	    clock_gettime (CLOCK_REALTIME, &ts);
+	    long long nsec = ts.tv_nsec - global_igc->pending_since[i].tv_nsec;
+	    long long sec = ts.tv_sec - global_igc->pending_since[i].tv_sec;
+	    nsec += 1000000000 * sec;
+	    fprintf (stderr, "delaying %d for %f sec\n", i, nsec * 1.0e-9);
+	  }
+    }
+}
+
 DEFUN ("igc--add-extra-dependency", Figc__add_extra_dependency,
        Sigc__add_extra_dependency, 3, 3, 0,
        doc: /* Add an extra DEPENDENCY to object OBJ, associate it with KEY.
diff --git a/src/keyboard.c b/src/keyboard.c
index e875e98fde6..906595f3be9 100644
--- a/src/keyboard.c
+++ b/src/keyboard.c
@@ -8203,6 +8203,7 @@ unblock_input_to (int level)
   interrupt_input_blocked = level;
   if (level == 0)
     {
+      gc_maybe_dont_quit ();
       if (pending_signals && !fatal_error_in_progress)
 	process_pending_signals ();
     }
diff --git a/src/lisp.h b/src/lisp.h
index 48585c2d8a1..fb7f3847a5d 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -47,12 +47,17 @@ #define EMACS_LISP_H
 #ifdef HAVE_MPS
 union gc_header;
 extern void gc_maybe_quit (void);
+extern void gc_maybe_dont_quit (void);
 extern bool gc_signal_handler_can_run (int);
 #else
 INLINE void gc_maybe_quit (void)
 {
 }
 
+INLINE void gc_maybe_dont_quit (void)
+{
+}
+
 INLINE bool gc_signal_handler_can_run (int sig)
 {
   return true;





* Re: igc, macOS avoiding signals
  2024-12-28 15:12 ` Pip Cet via Emacs development discussions.
@ 2024-12-28 17:30   ` Eli Zaretskii
  2024-12-28 18:40     ` Pip Cet via Emacs development discussions.
  0 siblings, 1 reply; 119+ messages in thread
From: Eli Zaretskii @ 2024-12-28 17:30 UTC (permalink / raw)
  To: Pip Cet; +Cc: spd, gerd.moellmann, emacs-devel

> Date: Sat, 28 Dec 2024 15:12:23 +0000
> Cc: gerd.moellmann@gmail.com, emacs-devel@gnu.org
> From:  Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
> 
> I think we probably need to put instrumentation in the source code, so
> we gain some idea of how long signals are delayed for when we mark them
> pending.

What do we expect to learn from this, except the timing of the OS
scheduler?




* Re: igc, macOS avoiding signals
  2024-12-28 17:30   ` Eli Zaretskii
@ 2024-12-28 18:40     ` Pip Cet via Emacs development discussions.
  2024-12-28 18:50       ` Eli Zaretskii
  0 siblings, 1 reply; 119+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-28 18:40 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: spd, gerd.moellmann, emacs-devel

"Eli Zaretskii" <eliz@gnu.org> writes:

>> Date: Sat, 28 Dec 2024 15:12:23 +0000
>> Cc: gerd.moellmann@gmail.com, emacs-devel@gnu.org
>> From:  Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
>>
>> I think we probably need to put instrumentation in the source code, so
>> we gain some idea of how long signals are delayed for when we mark them
>> pending.
>
> What do we expect to learn from this,

It tests the current code, which does this:

When a signal arrives, and we can't handle it because we might have
interrupted MPS, we mark the signal as pending in the igc structure.  At
some point later, we check the igc structure for pending signals,
reraise them, and unmask them.

Gerd's experience suggests that the "some point later" happens too late.
This patch gives us measurements.

It's unrelated to the OS scheduler, AFAICS.

Pip





* Re: igc, macOS avoiding signals
  2024-12-28 18:40     ` Pip Cet via Emacs development discussions.
@ 2024-12-28 18:50       ` Eli Zaretskii
  2024-12-28 19:07         ` Eli Zaretskii
  2024-12-28 19:15         ` Pip Cet via Emacs development discussions.
  0 siblings, 2 replies; 119+ messages in thread
From: Eli Zaretskii @ 2024-12-28 18:50 UTC (permalink / raw)
  To: Pip Cet; +Cc: spd, gerd.moellmann, emacs-devel

> Date: Sat, 28 Dec 2024 18:40:30 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: spd@toadstyle.org, gerd.moellmann@gmail.com, emacs-devel@gnu.org
> 
> "Eli Zaretskii" <eliz@gnu.org> writes:
> 
> >> Date: Sat, 28 Dec 2024 15:12:23 +0000
> >> Cc: gerd.moellmann@gmail.com, emacs-devel@gnu.org
> >> From:  Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
> >>
> >> I think we probably need to put instrumentation in the source code, so
> >> we gain some idea of how long signals are delayed for when we mark them
> >> pending.
> >
> > What do we expect to learn from this,
> 
> It tests the current code, which does this:
> 
> When a signal arrives, and we can't handle it because we might have
> interrupted MPS, we mark the signal as pending in the igc structure.  At
> some point later, we check the igc structure for pending signals,
> reraise them, and unmask them.
> 
> Gerd's experience suggests that the "some point later" happens too late.
> This patch gives us measurements.
> 
> It's unrelated to the OS scheduler, AFAICS.

Ah, okay.  I note that if we'd block signals when calling MPS and
unblock on exit, then these delays couldn't have happened, AFAIU.
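
A minimal sketch of the block-on-entry, unblock-on-exit alternative, assuming the set of signals is the three named elsewhere in this thread (SIGPROF, SIGALRM, SIGCHLD). The wrapper `call_with_signals_blocked` and the stand-in `fake_collector_call` are hypothetical; the real code would wrap the actual MPS entry points.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>

static volatile sig_atomic_t in_critical;   /* set while "inside MPS" */
static volatile sig_atomic_t ran_inside;    /* handler ran while inside */
static volatile sig_atomic_t handler_runs;

static void
on_signal (int sig)
{
  (void) sig;
  handler_runs++;
  if (in_critical)
    ran_inside = 1;
}

static int
install_on_signal (int sig)
{
  struct sigaction sa;
  sa.sa_handler = on_signal;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = 0;
  return sigaction (sig, &sa, NULL);
}

/* Hypothetical stand-in for a call into the collector.  A signal
   arrives in the middle of it.  */
static void
fake_collector_call (void)
{
  in_critical = 1;
  raise (SIGALRM);
  in_critical = 0;
}

/* Run FN with SIGPROF, SIGALRM and SIGCHLD blocked, then restore the
   previous mask.  Returns 0 on success or an errno value.  */
static int
call_with_signals_blocked (void (*fn) (void))
{
  sigset_t block, saved;
  sigemptyset (&block);
  sigaddset (&block, SIGPROF);
  sigaddset (&block, SIGALRM);
  sigaddset (&block, SIGCHLD);
  int err = pthread_sigmask (SIG_BLOCK, &block, &saved);
  if (err != 0)
    return err;
  fn ();
  /* Signals that arrived meanwhile are delivered once unblocked.  */
  return pthread_sigmask (SIG_SETMASK, &saved, NULL);
}
```

With this shape the handler runs as soon as the mask is restored, so no separate replay point is needed; the cost is two mask changes per wrapped call.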




* Re: igc, macOS avoiding signals
  2024-12-28 18:50       ` Eli Zaretskii
@ 2024-12-28 19:07         ` Eli Zaretskii
  2024-12-28 19:20           ` Pip Cet via Emacs development discussions.
  2024-12-28 19:15         ` Pip Cet via Emacs development discussions.
  1 sibling, 1 reply; 119+ messages in thread
From: Eli Zaretskii @ 2024-12-28 19:07 UTC (permalink / raw)
  To: pipcet; +Cc: spd, gerd.moellmann, emacs-devel

> Date: Sat, 28 Dec 2024 20:50:22 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: spd@toadstyle.org, gerd.moellmann@gmail.com, emacs-devel@gnu.org
> 
> > > What do we expect to learn from this,
> > 
> > It tests the current code, which does this:
> > 
> > When a signal arrives, and we can't handle it because we might have
> > interrupted MPS, we mark the signal as pending in the igc structure.  At
> > some point later, we check the igc structure for pending signals,
> > reraise them, and unmask them.
> > 
> > Gerd's experience suggests that the "some point later" happens too late.
> > This patch gives us measurements.
> > 
> > It's unrelated to the OS scheduler, AFAICS.
> 
> Ah, okay.  I note that if we'd block signals when calling MPS and
> unblock on exit, then these delays couldn't have happened, AFAIU.

But OTOH, if this delaying of a signal affects responsiveness, then
all we need to do is exempt SIGSEGV from being delayed, right?  This
signal-delay mechanism was invented for SIGPROF, SIGCHLD, and SIGALRM,
but there's no reason to delay SIGSEGV.

And AFAIU, on macOS there's no SIGSEGV anyway, is that right?  So why
does this delaying affect responsiveness?

Or what am I missing?




* Re: igc, macOS avoiding signals
  2024-12-28 18:50       ` Eli Zaretskii
  2024-12-28 19:07         ` Eli Zaretskii
@ 2024-12-28 19:15         ` Pip Cet via Emacs development discussions.
  2024-12-28 19:30           ` Eli Zaretskii
  1 sibling, 1 reply; 119+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-28 19:15 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: spd, gerd.moellmann, emacs-devel

"Eli Zaretskii" <eliz@gnu.org> writes:

> Ah, okay.  I note that if we'd block signals when calling MPS and
> unblock on exit, then these delays couldn't have happened, AFAIU.

We can't do so for SIGSEGV calling into MPS, so this wouldn't fix all
cases.  Blocking signals around mps_alloc slows down (make-list 1000000
nil) by a factor of about 5 (on current GNU/Linux; possibly
significantly more on other operating systems).

But, yes, if we're willing to give up on unmodified MPS, blocking
signals in the slow path only might work.  We'd need to check
finalizable objects, though, because we need to call into MPS for every
finalizable object.  That's a problem this approach shares with the
allocation thread approach, by the way.  In both cases, registering
objects for finalization doesn't need to happen at allocation time: if
we save a reference to the object somewhere MPS sees it, the objects
won't be collected, so they won't be finalized.

Pip





* Re: igc, macOS avoiding signals
  2024-12-28 19:07         ` Eli Zaretskii
@ 2024-12-28 19:20           ` Pip Cet via Emacs development discussions.
  2024-12-28 19:36             ` Eli Zaretskii
  0 siblings, 1 reply; 119+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-28 19:20 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: spd, gerd.moellmann, emacs-devel

"Eli Zaretskii" <eliz@gnu.org> writes:

>> Date: Sat, 28 Dec 2024 20:50:22 +0200
>> From: Eli Zaretskii <eliz@gnu.org>
>> Cc: spd@toadstyle.org, gerd.moellmann@gmail.com, emacs-devel@gnu.org
>>
>> > > What do we expect to learn from this,
>> >
>> > It tests the current code, which does this:
>> >
>> > When a signal arrives, and we can't handle it because we might have
>> > interrupted MPS, we mark the signal as pending in the igc structure.  At
>> > some point later, we check the igc structure for pending signals,
>> > reraise them, and unmask them.
>> >
>> > Gerd's experience suggests that the "some point later" happens too late.
>> > This patch gives us measurements.
>> >
>> > It's unrelated to the OS scheduler, AFAICS.
>>
>> Ah, okay.  I note that if we'd block signals when calling MPS and
>> unblock on exit, then these delays couldn't have happened, AFAIU.
>
> But OTOH, if this delaying of a signal affects responsiveness, then
> all we need to do is exempt SIGSEGV from being delayed, right?  This
> signal-delay mechanism was invented for SIGPROF, SIGCHLD, and SIGALRM,
> but there's no reason to delay SIGSEGV.

SIGSEGV is never delayed in any proposal I'm aware of.  I don't see how
it could be, to be honest, but maybe I'm missing something there.

> And AFAIU, on macOS there's no SIGSEGV anyway, is that right?  So why
> does this delaying affect responsiveness?

Possibly SIGPOLL.

Pip





* Re: igc, macOS avoiding signals
  2024-12-28 19:15         ` Pip Cet via Emacs development discussions.
@ 2024-12-28 19:30           ` Eli Zaretskii
  0 siblings, 0 replies; 119+ messages in thread
From: Eli Zaretskii @ 2024-12-28 19:30 UTC (permalink / raw)
  To: Pip Cet; +Cc: spd, gerd.moellmann, emacs-devel

> Date: Sat, 28 Dec 2024 19:15:26 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: spd@toadstyle.org, gerd.moellmann@gmail.com, emacs-devel@gnu.org
> 
> "Eli Zaretskii" <eliz@gnu.org> writes:
> 
> > Ah, okay.  I note that if we'd block signals when calling MPS and
> > unblock on exit, then these delays couldn't have happened, AFAIU.
> 
> We can't do so for SIGSEGV calling into MPS, so this wouldn't fix all
> cases.

We don't need to block SIGSEGV (or any other fatal signal).  We only
need to block SIGPROF, SIGALRM and SIGCHLD.

> Blocking signals around mps_alloc slows down (make-list 1000000
> nil) by a factor of about 5 (on current GNU/Linux; possibly
> significantly more on other operating systems).

No, I think only GNU/Linux is affected.

You are saying that sigblock is very expensive?  Can you measure it?
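
One possible way to measure the raw cost in isolation, as a sketch: time a tight loop of block/restore pairs with `pthread_sigmask`. The function name `seconds_per_mask_pair` is hypothetical, and the loop measures only the mask calls themselves; it says nothing about allocation, so it would not by itself reproduce or refute the factor quoted above for `(make-list 1000000 nil)`.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <time.h>

/* Hypothetical helper: average time in seconds of one block/restore
   pair over ITERATIONS rounds, or a negative value on error.  */
static double
seconds_per_mask_pair (long iterations)
{
  sigset_t block, saved;
  struct timespec start, end;

  assert (iterations > 0);
  sigemptyset (&block);
  sigaddset (&block, SIGPROF);
  sigaddset (&block, SIGALRM);
  sigaddset (&block, SIGCHLD);

  if (clock_gettime (CLOCK_MONOTONIC, &start) != 0)
    return -1.0;
  for (long i = 0; i < iterations; i++)
    {
      if (pthread_sigmask (SIG_BLOCK, &block, &saved) != 0
          || pthread_sigmask (SIG_SETMASK, &saved, NULL) != 0)
        return -1.0;
    }
  if (clock_gettime (CLOCK_MONOTONIC, &end) != 0)
    return -1.0;

  double total = (double) (end.tv_sec - start.tv_sec)
                 + (double) (end.tv_nsec - start.tv_nsec) * 1e-9;
  return total / (double) iterations;
}
```

Running it with the same iteration count on GNU/Linux and macOS would give comparable per-pair figures for the two systems.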




* Re: igc, macOS avoiding signals
  2024-12-28 19:20           ` Pip Cet via Emacs development discussions.
@ 2024-12-28 19:36             ` Eli Zaretskii
  2024-12-28 20:54               ` Pip Cet via Emacs development discussions.
  0 siblings, 1 reply; 119+ messages in thread
From: Eli Zaretskii @ 2024-12-28 19:36 UTC (permalink / raw)
  To: Pip Cet; +Cc: spd, gerd.moellmann, emacs-devel

> Date: Sat, 28 Dec 2024 19:20:40 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: spd@toadstyle.org, gerd.moellmann@gmail.com, emacs-devel@gnu.org
> 
> "Eli Zaretskii" <eliz@gnu.org> writes:
> 
> > But OTOH, if this delaying of a signal affects responsiveness, then
> > all we need to do is exempt SIGSEGV from being delayed, right?  This
> > signal-delay mechanism was invented for SIGPROF, SIGCHLD, and SIGALRM,
> > but there's no reason to delay SIGSEGV.
> 
> SIGSEGV is never delayed in any proposal I'm aware of.

The call to gc_signal_handler_can_run is inside
deliver_process_signal.  Are you saying that deliver_process_signal is
not called for SIGSEGV?

> > And AFAIU, on macOS there's no SIGSEGV anyway, is that right?  So why
> > does this delaying affect responsiveness?
> 
> Possibly SIGPOLL.

We don't need to block SIGPOLL, either.  Its handler is safe, the same
as SIGIO.




* Re: igc, macOS avoiding signals
  2024-12-28 19:36             ` Eli Zaretskii
@ 2024-12-28 20:54               ` Pip Cet via Emacs development discussions.
  2024-12-29  5:51                 ` Eli Zaretskii
  0 siblings, 1 reply; 119+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-28 20:54 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: spd, gerd.moellmann, emacs-devel

"Eli Zaretskii" <eliz@gnu.org> writes:

>> Date: Sat, 28 Dec 2024 19:20:40 +0000
>> From: Pip Cet <pipcet@protonmail.com>
>> Cc: spd@toadstyle.org, gerd.moellmann@gmail.com, emacs-devel@gnu.org
>>
>> "Eli Zaretskii" <eliz@gnu.org> writes:
>>
>> > But OTOH, if this delaying of a signal affects responsiveness, then
>> > all we need to do is exempt SIGSEGV from being delayed, right?  This
>> > signal-delay mechanism was invented for SIGPROF, SIGCHLD, and SIGALRM,
>> > but there's no reason to delay SIGSEGV.
>>
>> SIGSEGV is never delayed in any proposal I'm aware of.
>
> The call to gc_signal_handler_can_run is inside
> deliver_process_signal.  Are you saying that deliver_process_signal is
> not called for SIGSEGV?

MPS installs its own SIGSEGV handler which doesn't go through
deliver_process_signal.  Only if it fails, the Emacs handler which does
go through deliver_process_signal is restored for the final SIGSEGV
which will then terminate Emacs.

>> > And AFAIU, on macOS there's no SIGSEGV anyway, is that right?  So why
>> > does this delaying affect responsiveness?
>>
>> Possibly SIGPOLL.
>
> We don't need to block SIGPOLL, either.  Its handler is safe, the same
> as SIGIO.

Thanks!  That's good to know, and that's why we pass the signal number
to gc_signal_handler_can_run.

Pip





* Re: igc, macOS avoiding signals
  2024-12-28 16:29 ` Pip Cet via Emacs development discussions.
@ 2024-12-29  2:21   ` Sean Devlin
  2024-12-29 12:22     ` Pip Cet via Emacs development discussions.
  0 siblings, 1 reply; 119+ messages in thread
From: Sean Devlin @ 2024-12-29  2:21 UTC (permalink / raw)
  To: Pip Cet; +Cc: gerd.moellmann, emacs-devel

Hi Pip,

> On Dec 29, 2024, at 1:29 AM, Pip Cet <pipcet@protonmail.com> wrote:
> 
>> I'll try to come up with a patch.
> 
> This should provide some data (on stderr) about which signals we delay,
> and for how long (the "delayed" messages).  It also includes some
> information on additional points at which we can detect whether signals
> are pending (the "delaying" messages); it's probably safe to run them at
> that point, but the code might need some changes because other signals
> (or even the signal in question) might be legitimately blocked when we
> reach that point.
> 
> If the "delaying" messages indicate acceptable (initial) delays, we
> might get away with simply calling gc_maybe_quit more often.  If they
> don't, further fixes will be necessary, or we need to find more such
> points.
> 
> On POSIX systems where we can spare an additional signal, we can run a
> separate thread to ask us to retry running signal handlers when the
> arena lock might be available again.
> 
> Or we could move to a separate thread for slow-path allocations.
> 

I’ve built Emacs with your patch. After running Emacs -Q for a few minutes, I can confirm I see a few log statements:

delaying 20 for 0.066594 sec
delaying 20 for 0.066612 sec
delayed 20 for 0.066614 sec
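
For readers following along, here is a minimal sketch of how a line of the shape "delayed N for S sec" could be produced. It is not the patch under discussion: `struct delayed_signal`, `note_arrival` and `format_delay` are hypothetical names, and the sketch only assumes that the arrival time is recorded when a signal has to wait and compared with the current time when its handler finally runs.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>

/* Hypothetical bookkeeping for one deferred signal.  */
struct delayed_signal
{
  int signo;
  struct timespec arrived;
};

/* Seconds elapsed between two timestamps.  */
static double
elapsed_seconds (struct timespec from, struct timespec to)
{
  return (double) (to.tv_sec - from.tv_sec)
         + (double) (to.tv_nsec - from.tv_nsec) / 1e9;
}

/* Remember when SIGNO arrived but could not be handled yet.  */
static void
note_arrival (struct delayed_signal *d, int signo)
{
  d->signo = signo;
  clock_gettime (CLOCK_MONOTONIC, &d->arrived);
}

/* Format a "delayed N for S sec" line once the handler finally runs.
   Returns what snprintf returns.  */
static int
format_delay (const struct delayed_signal *d, char *buf, size_t size)
{
  struct timespec now;
  clock_gettime (CLOCK_MONOTONIC, &now);
  return snprintf (buf, size, "delayed %d for %f sec",
                   d->signo, elapsed_seconds (d->arrived, now));
}
```

A monotonic clock is used so that wall-clock adjustments cannot produce negative delays.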

Please let me know if there are any particular tasks you’d like me to try, or if I should just collect the logs in the background during general usage.

Cheers.



* Re: igc, macOS avoiding signals
  2024-12-28 20:54               ` Pip Cet via Emacs development discussions.
@ 2024-12-29  5:51                 ` Eli Zaretskii
  0 siblings, 0 replies; 119+ messages in thread
From: Eli Zaretskii @ 2024-12-29  5:51 UTC (permalink / raw)
  To: Pip Cet; +Cc: spd, gerd.moellmann, emacs-devel

> Date: Sat, 28 Dec 2024 20:54:03 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: spd@toadstyle.org, gerd.moellmann@gmail.com, emacs-devel@gnu.org
> 
> "Eli Zaretskii" <eliz@gnu.org> writes:
> 
> >> Date: Sat, 28 Dec 2024 19:20:40 +0000
> >> From: Pip Cet <pipcet@protonmail.com>
> >> Cc: spd@toadstyle.org, gerd.moellmann@gmail.com, emacs-devel@gnu.org
> >>
> >> "Eli Zaretskii" <eliz@gnu.org> writes:
> >>
> >> > But OTOH, if this delaying of a signal affects responsiveness, then
> >> > all we need to do is exempt SIGSEGV from being delayed, right?  This
> >> > signal-delay mechanism was invented for SIGPROF, SIGCHLD, and SIGALRM,
> >> > but there's no reason to delay SIGSEGV.
> >>
> >> SIGSEGV is never delayed in any proposal I'm aware of.
> >
> > The call to gc_signal_handler_can_run is inside
> > deliver_process_signal.  Are you saying that deliver_process_signal is
> > not called for SIGSEGV?
> 
> MPS installs its own SIGSEGV handler which doesn't go through
> deliver_process_signal.  Only if it fails, the Emacs handler which does
> go through deliver_process_signal is restored for the final SIGSEGV
> which will then terminate Emacs.

Then I ask once again: how can we explain what Gerd reports about
responsiveness if your changes, which only affect
deliver_process_signal, cannot affect MPS?

> >> > And AFAIU, on macOS there's no SIGSEGV anyway, is that right?  So why
> >> > does this delaying affect responsiveness?
> >>
> >> Possibly SIGPOLL.
> >
> > We don't need to block SIGPOLL, either.  Its handler is safe, the same
> > as SIGIO.
> 
> Thanks!  That's good to know, and that's why we pass the signal number
> to gc_signal_handler_can_run.

But currently gc_signal_handler_can_run does nothing with the signal
number except recording that it happened.  IMO, it should return
'true' immediately unless the signal is SIGPROF, SIGCHLD, or SIGALRM.
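
The early return proposed here could look roughly like the sketch below. It is an illustration only, not the actual Emacs function: `signal_must_be_delayed` and `handler_can_run` are hypothetical names, and the `gc_busy` parameter stands in for whatever test the real code uses to decide that MPS must not be interrupted.

```c
#include <signal.h>
#include <stdbool.h>

/* Hypothetical helper: only these three signals are candidates for
   being delayed, as suggested in the message above.  */
static bool
signal_must_be_delayed (int sig)
{
  switch (sig)
    {
    case SIGPROF:
    case SIGCHLD:
    case SIGALRM:
      return true;
    default:
      return false;
    }
}

/* Sketch of the early return.  GC_BUSY is an assumption standing in
   for the real "MPS must not be interrupted now" condition.  */
static bool
handler_can_run (int sig, bool gc_busy)
{
  if (!signal_must_be_delayed (sig))
    return true;        /* E.g. SIGIO: never delayed.  */
  return !gc_busy;
}
```

With this shape, a signal such as SIGIO runs its handler immediately even while the collector is busy, and only the three listed signals ever wait.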




* Re: igc, macOS avoiding signals
  2024-12-28 13:50     ` Óscar Fuentes
@ 2024-12-29  8:02       ` Helmut Eller
  0 siblings, 0 replies; 119+ messages in thread
From: Helmut Eller @ 2024-12-29  8:02 UTC (permalink / raw)
  To: Óscar Fuentes; +Cc: emacs-devel

On Sat, Dec 28 2024, Óscar Fuentes wrote:
[...]
> Something related to UI responsiveness would be great. For instance,
> record the time from each interactive command start to command end (or,
> better, until Emacs is idle again, to account for commands accumulating
> on the queue). Then we can perform some statistical analysis on that
> info.
>
> I'm afraid that if we start discussing personal perceptions we will
> devote a lot of time trying to fine-adjust parameters.

I couldn't agree more. 

What would you think about using one of those tracing frameworks, like
LTTng[*]?  Are those any good?

It would be nice, if those tracing points could be used in normal
everyday sessions.  E.g. to see how memory usage evolved over the last
15 minutes or to see if there was a particularly slow regexp search.
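
Independent of any particular tracing framework, the measurement Óscar describes can be sketched in a few lines of C. Everything here is hypothetical (`struct latency_stats`, `latency_record`, `latency_mean`, `now_seconds`); the sketch only assumes that a timestamp is taken when a command starts and again when Emacs becomes idle, and that the difference is fed into an accumulator.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <time.h>

/* Hypothetical accumulator for per-command latencies, in seconds.  */
struct latency_stats
{
  size_t count;
  double total;
  double max;
};

/* Add one measured latency to the accumulator.  */
static void
latency_record (struct latency_stats *s, double seconds)
{
  s->count++;
  s->total += seconds;
  if (seconds > s->max)
    s->max = seconds;
}

/* Mean latency so far, or 0 if nothing was recorded.  */
static double
latency_mean (const struct latency_stats *s)
{
  return s->count ? s->total / (double) s->count : 0.0;
}

/* Monotonic timestamp in seconds; take one at command start and
   another when idle again, then record the difference.  */
static double
now_seconds (void)
{
  struct timespec ts;
  clock_gettime (CLOCK_MONOTONIC, &ts);
  return (double) ts.tv_sec + (double) ts.tv_nsec / 1e9;
}
```

Mean and maximum are the simplest statistics; a histogram or percentiles would say more about perceived smoothness, at the cost of storing the samples.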

Helmut

[*] https://lttng.org/




* Re: igc, macOS avoiding signals
  2024-12-29  2:21   ` Sean Devlin
@ 2024-12-29 12:22     ` Pip Cet via Emacs development discussions.
  2024-12-29 15:01       ` Gerd Möllmann
  2024-12-30  5:23       ` Sean Devlin
  0 siblings, 2 replies; 119+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-29 12:22 UTC (permalink / raw)
  To: Sean Devlin; +Cc: gerd.moellmann, emacs-devel

"Sean Devlin" <spd@toadstyle.org> writes:

> Hi Pip,
>
>> On Dec 29, 2024, at 1:29 AM, Pip Cet <pipcet@protonmail.com> wrote:
>>
>>> I'll try to come up with a patch.
>>
>> This should provide some data (on stderr) about which signals we delay,
>> and for how long (the "delayed" messages).  It also includes some
>> information on additional points at which we can detect whether signals
>> are pending (the "delaying" messages); it's probably safe to run them at
>> that point, but the code might need some changes because other signals
>> (or even the signal in question) might be legitimately blocked when we
>> reach that point.
>>
>> If the "delaying" messages indicate acceptable (initial) delays, we
>> might get away with simply calling gc_maybe_quit more often.  If they
>> don't, further fixes will be necessary, or we need to find more such
>> points.
>>
>> On POSIX systems where we can spare an additional signal, we can run a
>> separate thread to ask us to retry running signal handlers when the
>> arena lock might be available again.
>>
>> Or we could move to a separate thread for slow-path allocations.
>>
>
> I’ve built Emacs with your patch. After running Emacs -Q for a few minutes, I can confirm I see a few log statements:

Can you try setting igc-step-interval to a small float value, like 0.05
?  As long as it's just a few messages, I don't think it'd cause
significant problems, but maybe enabling the background work would do
something.

> Please let me know if there are any particular tasks you’d like me to try, or if I should just collect the logs in the background during general usage.

Repeatedly hitting "s" in an M-x igc-stats buffer should cause more
messages, but that uses IGC in an atypical fashion, so I'm not sure
that's actually useful data...

Thanks!

Pip





* Re: igc, macOS avoiding signals
  2024-12-29 12:22     ` Pip Cet via Emacs development discussions.
@ 2024-12-29 15:01       ` Gerd Möllmann
  2024-12-29 19:44         ` Pip Cet via Emacs development discussions.
  2024-12-30  5:24         ` Sean Devlin
  2024-12-30  5:23       ` Sean Devlin
  1 sibling, 2 replies; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-29 15:01 UTC (permalink / raw)
  To: Pip Cet; +Cc: Sean Devlin, emacs-devel

Pip Cet <pipcet@protonmail.com> writes:

> Repeatedly hitting "s" in an M-x igc-stats buffer should cause more
> messages, but that uses IGC in an atypical fashion, so I'm not sure
> that's actually useful data...

Maybe not running with -Q but a normal config would help? I'm using all
sorts of packages to generate garbage. Eglot, flymake, Corfu, Jinx,
Vertico, Marginalia ...




* Re: igc, macOS avoiding signals
  2024-12-29 15:01       ` Gerd Möllmann
@ 2024-12-29 19:44         ` Pip Cet via Emacs development discussions.
  2024-12-30  6:16           ` Gerd Möllmann
  2024-12-30  5:24         ` Sean Devlin
  1 sibling, 1 reply; 119+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-29 19:44 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Sean Devlin, emacs-devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Pip Cet <pipcet@protonmail.com> writes:
>
>> Repeatedly hitting "s" in an M-x igc-stats buffer should cause more
>> messages, but that uses IGC in an atypical fashion, so I'm not sure
>> that's actually useful data...
>
> Maybe not running with -Q but a normal config would help? I'm using all
> sorts of packages to generate garbage. Eglot, flymake, Corfu, Jinx,
> Vertico, Marginalia ...

Speaking of running with a "normal" config: something about my
configuration makes buffer_step (the balance_intervals call, in
particular) take forever, to the point the mps build becomes unusable.
The buffer in question, when I caught it, is an M-x shell buffer of size
8 MB, so I don't understand why it's taking so long.

Still investigating, but skipping the buffer_step seems to help.

Pip





* Re: igc, macOS avoiding signals
  2024-12-29 12:22     ` Pip Cet via Emacs development discussions.
  2024-12-29 15:01       ` Gerd Möllmann
@ 2024-12-30  5:23       ` Sean Devlin
  1 sibling, 0 replies; 119+ messages in thread
From: Sean Devlin @ 2024-12-30  5:23 UTC (permalink / raw)
  To: Pip Cet; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 2260 bytes --]



> On Dec 29, 2024, at 9:22 PM, Pip Cet <pipcet@protonmail.com> wrote:
> 
> "Sean Devlin" <spd@toadstyle.org> writes:
> 
>> Hi Pip,
>> 
>>> On Dec 29, 2024, at 1:29 AM, Pip Cet <pipcet@protonmail.com> wrote:
>>> 
>>>> I'll try to come up with a patch.
>>> 
>>> This should provide some data (on stderr) about which signals we delay,
>>> and for how long (the "delayed" messages).  It also includes some
>>> information on additional points at which we can detect whether signals
>>> are pending (the "delaying" messages); it's probably safe to run them at
>>> that point, but the code might need some changes because other signals
>>> (or even the signal in question) might be legitimately blocked when we
>>> reach that point.
>>> 
>>> If the "delaying" messages indicate acceptable (initial) delays, we
>>> might get away with simply calling gc_maybe_quit more often.  If they
>>> don't, further fixes will be necessary, or we need to find more such
>>> points.
>>> 
>>> On POSIX systems where we can spare an additional signal, we can run a
>>> separate thread to ask us to retry running signal handlers when the
>>> arena lock might be available again.
>>> 
>>> Or we could move to a separate thread for slow-path allocations.
>>> 
>> 
>> I’ve built Emacs with your patch. After running Emacs -Q for a few minutes, I can confirm I see a few log statements:
> 
> Can you try setting igc-step-interval to a small float value, like 0.05
> ?  As long as it's just a few messages, I don't think it'd cause
> significant problems, but maybe enabling the background work would do
> something.

Sounds good, will do.

> 
>> Please let me know if there are any particular tasks you’d like me to try, or if I should just collect the logs in the background during general usage.
> 
> Repeatedly hitting "s" in an M-x igc-stats buffer should cause more
> messages, but that uses IGC in an atypical fashion, so I'm not sure
> that's actually useful data...

Well, I can definitely make it log a bunch of messages either by running the profiler or by spawning subprocesses. Is that helpful by itself, or do I also need to generate garbage for collection?

> 
> Thanks!
> 
> Pip


[-- Attachment #2: Type: text/html, Size: 12799 bytes --]


* Re: igc, macOS avoiding signals
  2024-12-29 15:01       ` Gerd Möllmann
  2024-12-29 19:44         ` Pip Cet via Emacs development discussions.
@ 2024-12-30  5:24         ` Sean Devlin
  2024-12-30  6:17           ` Gerd Möllmann
  1 sibling, 1 reply; 119+ messages in thread
From: Sean Devlin @ 2024-12-30  5:24 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Pip Cet, emacs-devel


> On Dec 30, 2024, at 12:01 AM, Gerd Möllmann <gerd.moellmann@gmail.com> wrote:
> 
> Pip Cet <pipcet@protonmail.com> writes:
> 
>> Repeatedly hitting "s" in an M-x igc-stats buffer should cause more
>> messages, but that uses IGC in an atypical fashion, so I'm not sure
>> that's actually useful data...
> 
> Maybe not running with -Q but a normal config would help? I'm using all
> sorts of packages to generate garbage. Eglot, flymake, Corfu, Jinx,
> Vertico, Marginalia …


Sounds good, I’ll use my normal configuration.





* Re: igc, macOS avoiding signals
  2024-12-29 19:44         ` Pip Cet via Emacs development discussions.
@ 2024-12-30  6:16           ` Gerd Möllmann
  2024-12-30 12:51             ` Gerd Möllmann
  0 siblings, 1 reply; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-30  6:16 UTC (permalink / raw)
  To: Pip Cet; +Cc: Sean Devlin, emacs-devel

Pip Cet <pipcet@protonmail.com> writes:

> Speaking of running with a "normal" config: something about my
> configuration makes buffer_step (the balance_intervals call, in
> particular) take forever, to the point the mps build becomes unusable.
> The buffer in question, when I caught it, is an M-x shell buffer of size
> 8 MB, so I don't understand why it's taking so long.
>
> Still investigating, but skipping the buffer_step seems to help.

balance_intervals means text properties. The only candidate I see in
comint/shell is ANSI escapes. That could be turned on/off with M-x
ansi-color-for-comint-mode-xy. Only as a workaround, and maybe to check
if it's that.

What I do in buffer_step in idle time is basically one step of what the
old GC does in sweep_buffers.

My expectation was that balancing a tree couldn't take long, and that
this is not called often enough to be a problem if it were expensive. Both
wrong, as usual.

Not calling balance_intervals is, BTW, not a catastrophic problem. If
one does anything leading to a graft_intervals_into_buffer, which is
called in a lot of places in editfns.c and insdel.c, that balances the
tree. And if not, the tree might become slower for lookup (redisplay),
but it still works.

<rant> It's BTW well possible that I myself put that balancing into
sweep_buffers because of redisplay, I seem to remember that. The
interval tree has always been a source of fun. I hope, some day, some
kind soul will eradicate it like the GCPROs. </rant>

In any case, what's a solution?

Right now I'm tending to put the balance_intervals in an if so that one
can turn it on/off with a Lisp variable. Default would be not to balance,
because I think the problems with degenerated interval trees in
redisplay were rare, and I don't remember problems outside of
redisplay. But that was an awful long time ago, OTOH.
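
As a rough sketch of that switch, with all names hypothetical: `balance_intervals_when_idle` stands in for the Lisp-visible variable, `fake_balance_intervals` for the real balancing work, and `idle_buffer_step` for the idle-time step. None of this is the actual igc code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a Lisp-visible boolean; off by default,
   as proposed above.  */
static bool balance_intervals_when_idle = false;

/* Counts calls so the effect of the switch can be observed.  */
static int balance_calls;

/* Stand-in for the real (possibly slow) balancing work.  */
static void
fake_balance_intervals (void)
{
  balance_calls++;
}

/* One idle-time step for a buffer: balance only if enabled.  */
static void
idle_buffer_step (void)
{
  if (balance_intervals_when_idle)
    fake_balance_intervals ();
}
```

Defaulting to off keeps the idle step cheap for everyone and leaves balancing as an opt-in for anyone who does see degenerate trees.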

That would give us more time to think about a possible strategy to solve
this.

WDYT?




* Re: igc, macOS avoiding signals
  2024-12-30  5:24         ` Sean Devlin
@ 2024-12-30  6:17           ` Gerd Möllmann
  0 siblings, 0 replies; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-30  6:17 UTC (permalink / raw)
  To: Sean Devlin; +Cc: Pip Cet, emacs-devel

Sean Devlin <spd@toadstyle.org> writes:

>> On Dec 30, 2024, at 12:01 AM, Gerd Möllmann <gerd.moellmann@gmail.com> wrote:
>> 
>> Pip Cet <pipcet@protonmail.com> writes:
>> 
>>> Repeatedly hitting "s" in an M-x igc-stats buffer should cause more
>>> messages, but that uses IGC in an atypical fashion, so I'm not sure
>>> that's actually useful data...
>> 
>> Maybe not running with -Q but a normal config would help? I'm using all
>> sorts of packages to generate garbage. Eglot, flymake, Corfu, Jinx,
>> Vertico, Marginalia …
>
>
> Sounds good, I’ll use my normal configuration.

Thanks!




* Re: igc, macOS avoiding signals
  2024-12-28 14:45     ` Gerd Möllmann
@ 2024-12-30  7:13       ` Gerd Möllmann
  2024-12-30  7:23         ` Gerd Möllmann
                           ` (2 more replies)
  0 siblings, 3 replies; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-30  7:13 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: spd, pipcet, emacs-devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Eli Zaretskii <eliz@gnu.org> writes:
>
>> So maybe Pip is right, and MPS always runs in the main (Lisp) thread,
>> even on macOS?  Can you catch it on a non-main thread?
>
> It's well possible that I misunderstand what the MPS guide says about it
> being concurrent (see my reply to Pip), and that the thread I see here
> is something else.
>
> If you don't see an additional thread on Linux, just don't listen to me
> and do what you think is TRT. I don't know anything about MPS internals.

I've investigated this a bit using LLDB. Starting Emacs and attaching to
it, I see 3 threads.

  (lldb) thread list
  Process 55210 stopped
  * thread #1: tid = 0x8b6558, 0x0000000190be51a8 libsystem_kernel.dylib`__pselect + 8, queue 
    thread #2: tid = 0x8b655c, 0x0000000190bdef54 libsystem_kernel.dylib`mach_msg2_trap + 8
    thread #3: tid = 0x8b65df, 0x0000000190be0ba4 libsystem_kernel.dylib`__workq_kernreturn + 8

Thread 1 is Emacs main thread.

  (lldb) thread backtrace
  * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
    frame #0: 0x0000000190be51a8 libsystem_kernel.dylib`__pselect + 8
    frame #1: 0x0000000190be5080 libsystem_kernel.dylib`pselect$DARWIN_EXTSN + 64
      frame #2: 0x0000000100775948 emacs`really_call_select(arg=0x000000016f851990) at thread.c:620:16 [opt]
      frame #3: 0x00000001007758bc emacs`thread_select [inlined] flush_stack_ca
      frame #4: 0x00000001007758ac emacs`thread_select(func=<unavailable>, max_
      frame #5: 0x0000000100744e78 emacs`wait_reading_process_output(time_limit=<unavailable>, n

Thread 2 is MPS' port, for EXC_BAD_ACCESS

  (lldb) thread backtrace
  * thread #2
    frame #0: 0x0000000190bdef54 libsystem_kernel.dylib`mach_msg2_trap + 8
    frame #1: 0x0000000190bf169c libsystem_kernel.dylib`mach_msg2_internal + 232
    frame #2: 0x0000000190be7af8 libsystem_kernel.dylib`mach_msg_overwrite + 480
    frame #3: 0x0000000190bdf29c libsystem_kernel.dylib`mach_msg + 24
      frame #4: 0x000000010080ae20 emacs`protCatchThread [inlined] protCatchOne at protxc.c:207:8 [opt]
      frame #5: 0x000000010080adf0 emacs`protCatchThread(p=<unavailable>) at protxc.c:284:5 [opt]
    frame #6: 0x0000000190c202e4 libsystem_pthread.dylib`_pthread_start + 136

The protxc.c is from MPS.

Thread 3 is something I can't explain.

  (lldb) thread backtrace
  * thread #3
    frame #0: 0x0000000190be0ba4 libsystem_kernel.dylib`__workq_kernreturn + 8

The __workq_kernreturn should indicate a thread that is in the
process of finishing, but I have no idea what that could have been.

Anyway, it definitely seems to be the case that MPS is _not_ running GCs
concurrently, unless it would do things that I find highly unlikely.

I find that a bit, let's say, disappointing, TBH :-(.






* Re: igc, macOS avoiding signals
  2024-12-30  7:13       ` Gerd Möllmann
@ 2024-12-30  7:23         ` Gerd Möllmann
  2024-12-30  7:39         ` Helmut Eller
  2024-12-30 10:46         ` Pip Cet via Emacs development discussions.
  2 siblings, 0 replies; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-30  7:23 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: spd, pipcet, emacs-devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
>
>> Eli Zaretskii <eliz@gnu.org> writes:
>>
>>> So maybe Pip is right, and MPS always runs in the main (Lisp) thread,
>>> even on macOS?  Can you catch it on a non-main thread?
>>
>> It's well possible that I misunderstand what the MPS guide says about it
>> being concurrent (see my reply to Pip), and that the thread I see here
>> is something else.
>>
>> If you don't see an additional thread on Linux, just don't listen to me
>> and do what you think is TRT. I don't know anything about MPS internals.
>
> I've investigated this a bit using LLDB. Starting Emacs and attaching to
> it, I see 3 threads.
>
>   (lldb) thread list
>   Process 55210 stopped
>   * thread #1: tid = 0x8b6558, 0x0000000190be51a8 libsystem_kernel.dylib`__pselect + 8, queue 
>     thread #2: tid = 0x8b655c, 0x0000000190bdef54 libsystem_kernel.dylib`mach_msg2_trap + 8
>     thread #3: tid = 0x8b65df, 0x0000000190be0ba4 libsystem_kernel.dylib`__workq_kernreturn + 8
>
> Thread 1 is Emacs main thread.
>
>   (lldb) thread backtrace
>   * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
>     frame #0: 0x0000000190be51a8 libsystem_kernel.dylib`__pselect + 8
>     frame #1: 0x0000000190be5080 libsystem_kernel.dylib`pselect$DARWIN_EXTSN + 64
>       frame #2: 0x0000000100775948 emacs`really_call_select(arg=0x000000016f851990) at thread.c:620:16 [opt]
>       frame #3: 0x00000001007758bc emacs`thread_select [inlined] flush_stack_ca
>       frame #4: 0x00000001007758ac emacs`thread_select(func=<unavailable>, max_
>       frame #5: 0x0000000100744e78 emacs`wait_reading_process_output(time_limit=<unavailable>, n
>
> Thread 2 is MPS' port, for EXC_BAD_ACCESS
>
>   (lldb) thread backtrace
>   * thread #2
>     frame #0: 0x0000000190bdef54 libsystem_kernel.dylib`mach_msg2_trap + 8
>     frame #1: 0x0000000190bf169c libsystem_kernel.dylib`mach_msg2_internal + 232
>     frame #2: 0x0000000190be7af8 libsystem_kernel.dylib`mach_msg_overwrite + 480
>     frame #3: 0x0000000190bdf29c libsystem_kernel.dylib`mach_msg + 24
>       frame #4: 0x000000010080ae20 emacs`protCatchThread [inlined] protCatchOne at protxc.c:207:8 [opt]
>       frame #5: 0x000000010080adf0 emacs`protCatchThread(p=<unavailable>) at protxc.c:284:5 [opt]
>     frame #6: 0x0000000190c202e4 libsystem_pthread.dylib`_pthread_start + 136
>
> The protxc.c is from MPS.
>
> Thread 3 is something I can't explain.
>
>   (lldb) thread backtrace
>   * thread #3
>     frame #0: 0x0000000190be0ba4 libsystem_kernel.dylib`__workq_kernreturn + 8
>
> The __workq_kernreturn should indicate a thread that is in the
> process of finishing, but I have no idea what that could have been.
>
> Anyway, it definitely seems to be the case that MPS is _not_ running GCs
> concurrently, unless it would do things that I find highly unlikely.
>
> I find that a bit, let's say, disappointing, TBH :-(.

And now, git grep in MPS, reveals

Concurrent collection
.....................

_`.improv.concurrent`: The MPS currently does not collect
concurrently, however the only thing that makes it not-concurrent is a

:-(




* Re: igc, macOS avoiding signals
  2024-12-30  7:13       ` Gerd Möllmann
  2024-12-30  7:23         ` Gerd Möllmann
@ 2024-12-30  7:39         ` Helmut Eller
  2024-12-30  7:51           ` Gerd Möllmann
  2024-12-30 10:53           ` Pip Cet via Emacs development discussions.
  2024-12-30 10:46         ` Pip Cet via Emacs development discussions.
  2 siblings, 2 replies; 119+ messages in thread
From: Helmut Eller @ 2024-12-30  7:39 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, spd, pipcet, emacs-devel

On Mon, Dec 30 2024, Gerd Möllmann wrote:
> Anyway, it definitely seems to be the case that MPS is _not_ running GCs
> concurrently, unless it would do things that I find highly unlikely.
>
> I find that a bit, let's say, disappointing, TBH :-(.

Richard Brooksby thinks[*] that MPS could be concurrent with software
barriers.  Feel like going down that road? :-)

Helmut

[*] https://memory-pool-system.readthedocs.io/en/latest/design/shield.html#concurrent-collection




* Re: igc, macOS avoiding signals
  2024-12-30  7:39         ` Helmut Eller
@ 2024-12-30  7:51           ` Gerd Möllmann
  2024-12-30  8:02             ` Helmut Eller
  2024-12-30 11:11             ` Pip Cet via Emacs development discussions.
  2024-12-30 10:53           ` Pip Cet via Emacs development discussions.
  1 sibling, 2 replies; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-30  7:51 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, spd, pipcet, emacs-devel

Helmut Eller <eller.helmut@gmail.com> writes:

> On Mon, Dec 30 2024, Gerd Möllmann wrote:
>> Anyway, it definitely seems to be the case that MPS is _not_ running GCs
>> concurrently, unless it would do things that I find highly unlikely.
>>
>> I find that a bit, let's say, disappointing, TBH :-(.
>
> Richard Brooksby thinks[*] that MPS could be concurrent with software
> barriers.  Feel like going down that road? :-)
>
> Helmut
>
> [*] https://memory-pool-system.readthedocs.io/en/latest/design/shield.html#concurrent-collection

Yep, found that too, with git grep.

Still grumpy.

I'm afraid modifying MPS is not my thing, but what about using something
more modern like Oilpan (aka cppgc) from V8? It can be used as a lib and is
concurrent for real. That would also be a perfect time to lift Emacs to
C++.




* Re: igc, macOS avoiding signals
  2024-12-30  7:51           ` Gerd Möllmann
@ 2024-12-30  8:02             ` Helmut Eller
  2024-12-30  8:47               ` Gerd Möllmann
  2024-12-30 11:11             ` Pip Cet via Emacs development discussions.
  1 sibling, 1 reply; 119+ messages in thread
From: Helmut Eller @ 2024-12-30  8:02 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, spd, pipcet, emacs-devel

On Mon, Dec 30 2024, Gerd Möllmann wrote:

> I'm afraid modifying MPS is not my thing, but what about using something
> more modern like Oilpan (aka cppgc) from V8? Can be used as a lib, is
> concurrent for real.

Ideally, Emacs would have an abstract GC interface so that different
implementations could be plugged in.

> That would also be a perfect time to lift Emacs to
> C++.

I'd rather see Emacs move to Rust.  Anyway, neither option seems
realistic.

Helmut




* Re: igc, macOS avoiding signals
  2024-12-30  8:02             ` Helmut Eller
@ 2024-12-30  8:47               ` Gerd Möllmann
  2024-12-30  9:29                 ` Helmut Eller
  2024-12-30 11:18                 ` Pip Cet via Emacs development discussions.
  0 siblings, 2 replies; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-30  8:47 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, spd, pipcet, emacs-devel

Helmut Eller <eller.helmut@gmail.com> writes:

> On Mon, Dec 30 2024, Gerd Möllmann wrote:
>
>> I'm afraid modifying MPS is not my thing, but what about using something
>> more modern like Oilpan (aka cppgc) from V8? Can be used as a lib, is
>> concurrent for real.
>
> Ideally, Emacs would have an abstract GC interface so that different
> implementations could be plugged in.

That would indeed be nice to have.

>> That would also be a perfect time to lift Emacs to
>> C++.
>
> I'd rather see Emacs move to Rust.  Anyway, neither option seems
> realistic.

In mainline... (pondering to put a smiley).

But something else: Given what I now believe, I think I want to
understand better (a bit) why everything appears to work just fine on
macOS, with signals. Could you perhaps check if I'm off? MacOS only.

In normal operation, there are only ever 2 threads running. An Emacs
thread is interrupted by a signal and lands in a signal handler, the
MPS port thread keeps running.

In the signal handler, hitting barriers is handled by the MPS port
thread. Consistency of Emacs's state is a problem the signal handler has
to deal with, consistency of MPS' GC data is a problem that hopefully
MPS handles, and it seems to work.

I think I understand that, except when the Emacs thread is interrupted
while in MPS code, which happens for allocation points running out of
memory and mps_arena_step (idle time).

Do you agree so far? If yes, I'd bite the bullet and look at the MPS
code for macOS how that is done, if it's done.




* Re: igc, macOS avoiding signals
  2024-12-30  8:47               ` Gerd Möllmann
@ 2024-12-30  9:29                 ` Helmut Eller
  2024-12-30  9:47                   ` Helmut Eller
  2024-12-30 10:05                   ` Gerd Möllmann
  2024-12-30 11:18                 ` Pip Cet via Emacs development discussions.
  1 sibling, 2 replies; 119+ messages in thread
From: Helmut Eller @ 2024-12-30  9:29 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, spd, pipcet, emacs-devel

On Mon, Dec 30 2024, Gerd Möllmann wrote:

[...]
> But something else: Given what I now believe, I think I want to
> understand better (a bit) why everything appears to work just fine on
> macOS, with signals. Could you perhaps check if I'm off? MacOS only.
>
> In normal operation, there are only ever 2 threads running. An Emacs
> thread is interrupted by a signal and lands in a signal handler, the
> MPS port thread keeps running.
>
> In the signal handler, hitting barriers is handled by the MPS port
> thread. Consistency of Emacs's state is a problem the signal handler has
> to deal with,

Agreed.

> consistency of MPS' GC data is a problem that hopefully
> MPS handles, and it seems to work.

My interpretation of this design document[*] is that MPS's arena lock
protects most MPS entry points.  There are a few (e.g. mps_reserve
and mps_ld_add) that don't claim the arena lock, and for those it's the
burden of the client to call them in a thread-safe way.  For us this
probably means: don't call them in a signal handler.

The main entry point that we want to call in the signal handler is the
SEGFAULT handler (not sure how this works on MacOS).  The fault handler
claims the non-recursive arena lock.  So, in the signal handler we
should not hold the lock while hitting a barrier.
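
Why a non-recursive lock matters can be shown with plain pthreads, independent of MPS. The sketch below is an illustration under assumptions: `relock_result` is a hypothetical name, and an error-checking mutex is used only so that the second claim reports EDEADLK instead of hanging. With a default mutex, a second claim from the same thread could simply block forever, which is the situation a handler would be in if it needed a lock already held by the code it interrupted.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>

/* Lock an error-checking (non-recursive) mutex twice from the same
   thread and return the result of the second attempt.  */
static int
relock_result (void)
{
  pthread_mutexattr_t attr;
  pthread_mutex_t lock;
  int second;

  pthread_mutexattr_init (&attr);
  pthread_mutexattr_settype (&attr, PTHREAD_MUTEX_ERRORCHECK);
  pthread_mutex_init (&lock, &attr);
  pthread_mutexattr_destroy (&attr);

  pthread_mutex_lock (&lock);           /* "Main code" claims the lock.  */
  second = pthread_mutex_lock (&lock);  /* "Handler" claims it again.  */

  pthread_mutex_unlock (&lock);
  pthread_mutex_destroy (&lock);
  return second;
}
```

Calling `relock_result ()` is expected to return EDEADLK, since POSIX requires an error-checking mutex to report a relock attempt by its owner rather than block.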

> I think I understand that, except when the Emacs thread is interrupted
> while in MPS code, which happens for allocation points running out of
> memory and mps_arena_step (idle time).

Hmm, is that sentence incomplete?  I don't quite understand it.

> Do you agree so far? If yes, I'd bite the bullet and look at the MPS
> code for macOS how that is done, if it's done.

Yes, mostly.

Helmut




* Re: igc, macOS avoiding signals
  2024-12-30  9:29                 ` Helmut Eller
@ 2024-12-30  9:47                   ` Helmut Eller
  2024-12-30 11:54                     ` Gerd Möllmann
  2024-12-30 10:05                   ` Gerd Möllmann
  1 sibling, 1 reply; 119+ messages in thread
From: Helmut Eller @ 2024-12-30  9:47 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, spd, pipcet, emacs-devel

On Mon, Dec 30 2024, Helmut Eller wrote:

> My interpretation of this design document[*], is that MPS's arena lock

That one:

https://memory-pool-system.readthedocs.io/en/latest/design/thread-safety.html




* Re: igc, macOS avoiding signals
  2024-12-30  9:29                 ` Helmut Eller
  2024-12-30  9:47                   ` Helmut Eller
@ 2024-12-30 10:05                   ` Gerd Möllmann
  2024-12-30 10:27                     ` Helmut Eller
  1 sibling, 1 reply; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-30 10:05 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, spd, pipcet, emacs-devel

Helmut Eller <eller.helmut@gmail.com> writes:

>> In the signal handler, hitting barriers is handled by the MPS port
>> thread. Consistency of Emacs's state is a problem the signal handler has
>> to deal with,
>
> Agreed.
>
>> consistency of MPS' GC data is a problem that hopefully
>> MPS handles, and it seems to work.
>
> My interpretation of this design document[*] is that MPS's arena lock
> protects most MPS entry points.  There are a few (e.g. mps_reserve
> and mps_ld_add) that don't claim the arena lock, and for those it's the
> burden of the client to call them in a thread-safe way.  For us this
> probably means: don't call them in a signal handler.
>
> The main entry point that we want to call in the signal handler is the
> SEGFAULT handler (not sure how this works on MacOS).  The fault handler
> claims the non-recursive arena lock.  So, in the signal handler we
> should not hold the lock while hitting a barrier.

Okay, that I think I understand then. The "only" difference between
macOS and Linux is that on macOS no SEGV handler is involved. Hitting
the barrier on macOS means that the EXC_BAD_ACCESS port thread, which
was waiting for a Mach message, receives a message from the OS and starts
working.

>> I think I understand that, except when the Emacs thread is interrupted
>> while in MPS code, which happens for allocation points running out of
>> memory and mps_arena_step (idle time).
>
> Hmm, is that sentence incomplete?  I don't quite understand it.

What I meant is that I imagine a signal interrupts the Emacs thread at a
point where we are "in MPS". ArenaEnter/Leave I think I understand, it's
some pthread_mutex_t, I think, from other mails. A problematic "in MPS"
could then be "while the Emacs thread owns the mutex". The places where
I imagine that mutex could be owned are mps_arena_step (I call it Emacs
is idle) and mps_commit (in alloc_impl, when the allocation point used
runs out of memory). Maybe other places, but mainly these two.

And the question on macOS for me would be if the port thread tries to
acquire the same mutex, or how the heck that works. Or IOW, if there is
a problem, why I've never seen it happening in all that time I'm using
igc. I find that difficult to understand. But it may be just a
statistical phenomenon. Maybe filling up an AP's memory is so fast
that the probability of a signal hitting while owning the mutex is close
to zero, or something. 

>
>> Do you agree so far? If yes, I'd bite the bullet and look at the MPS
>> code for macOS how that is done, if it's done.
>
> Yes, mostly.
>
> Helmut

Thanks for your help! I'll post something when I think I have something.



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 10:05                   ` Gerd Möllmann
@ 2024-12-30 10:27                     ` Helmut Eller
  2024-12-30 11:53                       ` Gerd Möllmann
                                         ` (2 more replies)
  0 siblings, 3 replies; 119+ messages in thread
From: Helmut Eller @ 2024-12-30 10:27 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, spd, pipcet, emacs-devel

On Mon, Dec 30 2024, Gerd Möllmann wrote:

>> My interpretation of this design document[*], is that MPS's arena lock
>> protects most of MPS entry points.  There are a few (e.g. mps_reserve
>> and mps_ld_add) that don't claim the arena lock and for those it's the
>> burden of the client to call them in a thread safe way.  For us this
>> probably means: don't call them in a signal handler.
>>
>> The main entry point that we want to call in the signal handler is the
>> SEGFAULT handler (not sure how this works on MacOS).  The fault handler
>> claims the non-recursive arena lock.  So, in the signal handler we
>> should not hold the lock while hitting a barrier.
>
> Okay, that I think I understand then. The "only" difference between
> macOS and Linux is that on macOS no SEGV handler is involved. Hitting
> the barrier on macOS means that the EXC_BAD_ACCESS port thread, which
> was waiting for Mach message, receives a message from the OS and starts
> working.

I guess that both, macOS and Linux version, will end up in ArenaAccess.
Perhaps on macOS in the other thread.

>>> I think I understand that, except when the Emacs thread is interrupted
>>> while in MPS code, which happens for allocation points running out of
>>> memory and mps_arena_step (idle time).
>>
>> Hmm, is that sentence incomplete?  I don't quite understand it.
>
> What I meant is that I imagine a signal interrupts the Emacs thread at a
> point where we are "in MPS". ArenaEnter/Leave I think I understand, it's
> some pthread_mutex_t, I think, from other mails. A problematic "in MPS"
> could then be "while the Emacs thread owns the mutex". The places where
> I imagine that mutex could be owned are mps_arena_step (I call it Emacs
> is idle) and mps_commit (in alloc_impl, when the allocation point used
> runs out of memory). Maybe other places, but mainly these two.

Yes.

> And the question on macOS for me would be if the port thread tries to
> acquire the same mutex, or how the heck that works. Or IOW, if there is
> a problem, why I've never seen it happening in all that time I'm using
> igc.

Maybe you could set a breakpoint in ArenaAccess to find out which thread
removes the barriers.

> I find that difficult to understand. But it may be just a
> statistical phenomenon. Maybe filling up an AP's memory is so fast
> that the probability of a signal hitting while owning the mutex is close
> to zero, or something. 

Very few of Emacs' signal handlers actually touch a barrier.  I've also
not seen any reproducible recipes for the "signal issues" that the igc
branch supposedly has.

Helmut



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30  7:13       ` Gerd Möllmann
  2024-12-30  7:23         ` Gerd Möllmann
  2024-12-30  7:39         ` Helmut Eller
@ 2024-12-30 10:46         ` Pip Cet via Emacs development discussions.
  2024-12-30 12:00           ` Gerd Möllmann
  2024-12-30 12:07           ` Gerd Möllmann
  2 siblings, 2 replies; 119+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-30 10:46 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, spd, emacs-devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
> Anyway, it definitely seems to be the case that MPS is _not_ running GCs
> concurrently, unless it would do things that I find highly unlikely.
>
> I find that a bit, let's say, disappointing, TBH :-(.

Well, I think that MPS is bring-your-own-thread concurrent.  I'm not
sure the current MPS can usefully be concurrent, because of the thread
suspension thing, but we can run it in another thread if we want to.

(If it turns out that using a separate thread isn't an advantage, we
should look at reducing the size of our roots (or protecting them)
rather than giving up on the idea entirely.  As long as other threads
freely move references to and from global roots, detecting unreachable
objects is hard.)

Pip




^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30  7:39         ` Helmut Eller
  2024-12-30  7:51           ` Gerd Möllmann
@ 2024-12-30 10:53           ` Pip Cet via Emacs development discussions.
  1 sibling, 0 replies; 119+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-30 10:53 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Gerd Möllmann, Eli Zaretskii, spd, emacs-devel

"Helmut Eller" <eller.helmut@gmail.com> writes:

> On Mon, Dec 30 2024, Gerd Möllmann wrote:
>> Anyway, it definitely seems to be the case that MPS is _not_ running GCs
>> concurrently, unless it would do things that I find highly unlikely.
>>
>> I find that a bit, let's say, disappointing, TBH :-(.
>
> Richard Brooksby thinks[*] that MPS could be concurrent with software
> barriers.  Feel like going down that road? :-)

I saw that, but it's from 2008, so I'm not sure whether things changed
after that.

Note that for typical Emacs usage, I'd look into making the
stop-the-world phase of GC interruptible rather than nonexistent.  MPS
has a lot of code to deal with failed scans, so we could find one of
those code paths that fails non-catastrophically and fail it.

That we didn't make the old GC interruptible still seems like a mistake
to me, but for igc it means that we'll be able to pass that off as a new
feature.  Kind of like purespace, the old code we compare against always
had a hand tied behind its back.  (Helmut very impressively demonstrated
that for purespace, so this assumes he doesn't get around to
implementing interruptible mark-and-sweep GC before breakfast).

Pip




^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30  7:51           ` Gerd Möllmann
  2024-12-30  8:02             ` Helmut Eller
@ 2024-12-30 11:11             ` Pip Cet via Emacs development discussions.
  2024-12-30 12:13               ` Gerd Möllmann
  1 sibling, 1 reply; 119+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-30 11:11 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Helmut Eller, Eli Zaretskii, spd, emacs-devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Helmut Eller <eller.helmut@gmail.com> writes:
>
>> On Mon, Dec 30 2024, Gerd Möllmann wrote:
>>> Anyway, it definitely seems to be the case that MPS is _not_ running GCs
>>> concurrently, unless it would do things that I find highly unlikely.
>>>
>>> I find that a bit, let's say, disappointing, TBH :-(.
>>
>> Richard Brooksby thinks[*] that MPS could be concurrent with software
>> barriers.  Feel like going down that road? :-)
>>
>> Helmut
>>
>> [*] https://memory-pool-system.readthedocs.io/en/latest/design/shield.html#concurrent-collection
>
> Yep, found that too, with git grep.
>
> Still grumpy.
>
> I'm afraid modifying MPS is not my thing, but what about using something
> more modern like Oilpan (aka cppgc) from V8? Can be used as a lib, is
> concurrent for real. That would also be a perfect time to lift Emacs to
> C++.

Ultimately, I'm still surprised there isn't that much work on precise GC
with compiler support, allowing us to mark the C stack precisely by
emitting DWARF tables indicating precisely which registers or stack
locations references live in at the time of an interruption.
(Identifying them is the easy part.  Emitting only code which allows you
to move such references asynchronously is the hard part).

But if you're doing all that, why not go all the way and implement
resumable exceptions?

(I wasn't going to mention it, but there's always my experiment with the
SpiderMonkey garbage collector.  Lots of bitrot, and I never got things
working quite the way I wanted to (no ambiguous references)).

Pip




^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30  8:47               ` Gerd Möllmann
  2024-12-30  9:29                 ` Helmut Eller
@ 2024-12-30 11:18                 ` Pip Cet via Emacs development discussions.
  2024-12-30 12:23                   ` Gerd Möllmann
  1 sibling, 1 reply; 119+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-30 11:18 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Helmut Eller, Eli Zaretskii, spd, emacs-devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Helmut Eller <eller.helmut@gmail.com> writes:
>
>> On Mon, Dec 30 2024, Gerd Möllmann wrote:
>>
>>> I'm afraid modifying MPS is not my thing, but what about using something
>>> more modern like Oilpan (aka cppgc) from V8? Can be used as a lib, is
>>> concurrent for real.
>>
>> Ideally, Emacs would have an abstract GC interface so that different
>> implementations could be plugged in.
>
> That would indeed be nice to have.
>
>>> That would also be a perfect time to lift Emacs to
>>> C++.
>>
>> I'd rather see Emacs move to Rust.  Anyway, neither option seems
>> realistic.
>
> In mainline... (pondering to put a smiley).

I'll throw Zig in and run away quickly.

> But something else: Given what I now believe, I think I want to
> understand better (a bit) why everything appears to work just fine on
> macOS, with signals. Could you perhaps check if I'm off? MacOS only.

Do we know that?  I think macOS doesn't use signals as heavily as other
platforms do, and I don't know how SIGPROF is handled on that platform,
but I would not be surprised if that or SIGALRM require the signal
checking thing on macOS, too.

The macOS thing is equivalent to blocking signals in the SIGSEGV
handler.  I still think that's what MPS should have done.

> In normal operation, there are only ever 2 threads running. An Emacs
> thread is interrupted by a signal and lands in a signal handler, the
> MPS port thread keeps running.
>
> In the signal handler, hitting barriers is handled by the MPS port
> thread. Consistency of Emacs's state is a problem the signal handler has
> to deal with, consistency of MPS' GC data is a problem that hopefully
> MPS handles, and it seems to work.
>
> I think I understand that, except when the Emacs thread is interrupted
> while in MPS code, which happens for allocation points running out of
> memory and mps_arena_step (idle time).

My assumption is that if the signal handler is allowed to run in that
case, and tries to access MPS-managed memory, we deadlock.  It might not
be a detectable deadlock causing a crash, as it would be on POSIX, but
that makes things worse, not better.
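
To make "blocking signals" around a region concrete, here is a minimal
sketch only.  It blocks SIGALRM and SIGPROF in the calling thread with
pthread_sigmask, runs a callback, and restores the previous mask, so a
signal raised inside the region stays pending until the region is left.
The names with_signals_blocked, critical_section and
demo_deferred_delivery are made up for illustration; this is not what
MPS or igc.c actually does.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stddef.h>

/* Set by the handler so the caller can tell when it actually ran.  */
static volatile sig_atomic_t handler_ran = 0;

static void
on_alarm (int signo)
{
  (void) signo;
  handler_ran = 1;
}

/* Hypothetical helper: run FN (ARG) with SIGALRM and SIGPROF blocked in
   the calling thread, then restore the previous signal mask.
   Returns 0 on success, an error number otherwise.  */
static int
with_signals_blocked (void (*fn) (void *), void *arg)
{
  sigset_t block, saved;

  assert (fn != NULL);
  sigemptyset (&block);
  sigaddset (&block, SIGALRM);
  sigaddset (&block, SIGPROF);

  int err = pthread_sigmask (SIG_BLOCK, &block, &saved);
  if (err != 0)
    return err;
  fn (arg);
  /* Restoring the mask delivers any signal that became pending.  */
  return pthread_sigmask (SIG_SETMASK, &saved, NULL);
}

/* Stand-in for code that must not be interrupted, e.g. while a lock
   is held.  Records whether the handler ran inside the region.  */
static void
critical_section (void *arg)
{
  int *seen_inside = arg;
  raise (SIGALRM);		/* Blocked here, so it stays pending.  */
  *seen_inside = handler_ran;
}

/* Return 1 if the handler was deferred until after the region,
   0 if it ran inside it, -1 on setup errors.  */
int
demo_deferred_delivery (void)
{
  struct sigaction sa = { 0 }, old;
  sa.sa_handler = on_alarm;
  sigemptyset (&sa.sa_mask);
  if (sigaction (SIGALRM, &sa, &old) != 0)
    return -1;

  handler_ran = 0;
  int seen_inside = -1;
  int err = with_signals_blocked (critical_section, &seen_inside);
  sigaction (SIGALRM, &old, NULL);
  if (err != 0)
    return -1;
  return seen_inside == 0 && handler_ran == 1;
}
```

The cost of this approach is two mask changes per protected region,
which is one reason to measure before applying it on every entry.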

Pip




^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 10:27                     ` Helmut Eller
@ 2024-12-30 11:53                       ` Gerd Möllmann
  2024-12-30 14:54                         ` Eli Zaretskii
  2024-12-30 12:32                       ` Pip Cet via Emacs development discussions.
  2024-12-30 12:42                       ` Pip Cet via Emacs development discussions.
  2 siblings, 1 reply; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-30 11:53 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, spd, pipcet, emacs-devel

Helmut Eller <eller.helmut@gmail.com> writes:

> On Mon, Dec 30 2024, Gerd Möllmann wrote:
>
>>> My interpretation of this design document[*], is that MPS's arena lock
>>> protects most of MPS entry points.  There are a few (e.g. mps_reserve
>>> and mps_ld_add) that don't claim the arena lock and for those it's the
>>> burden of the client to call them in a thread safe way.  For us this
>>> probably means: don't call them in a signal handler.
>>>
>>> The main entry point that we want to call in the signal handler is the
>>> SEGFAULT handler (not sure how this works on MacOS).  The fault handler
>>> claims the non-recursive arena lock.  So, in the signal handler we
>>> should not hold the lock while hitting a barrier.
>>
>> Okay, that I think I understand then. The "only" difference between
>> macOS and Linux is that on macOS no SEGV handler is involved. Hitting
>> the barrier on macOS means that the EXC_BAD_ACCESS port thread, which
>> was waiting for Mach message, receives a message from the OS and starts
>> working.
>
> I guess that both, macOS and Linux version, will end up in ArenaAccess.
> Perhaps on macOS in the other thread.
>
>>>> I think I understand that, except when the Emacs thread is interrupted
>>>> while in MPS code, which happens for allocation points running out of
>>>> memory and mps_arena_step (idle time).
>>>
>>> Hmm, is that sentence incomplete?  I don't quite understand it.
>>
>> What I meant is that I imagine a signal interrupts the Emacs thread at a
>> point where we are "in MPS". ArenaEnter/Leave I think I understand, it's
>> some pthread_mutex_t, I think, from other mails. A problematic "in MPS"
>> could then be "while the Emacs thread owns the mutex". The places where
>> I imagine that mutex could be owned are mps_arena_step (I call it Emacs
>> is idle) and mps_commit (in alloc_impl, when the allocation point used
>> runs out of memory). Maybe other places, but mainly these two.
>
> Yes.
>
>> And the question on macOS for me would be if the port thread tries to
>> acquire the same mutex, or how the heck that works. Or IOW, if there is
>> a problem, why I've never seen it happening in all that time I'm using
>> igc.
>
> Maybe you could set a breakpoint in ArenaAccess to find out which thread
> removes the barriers.
>
>> I find that difficult to understand. But it may be just a
>> statistical phenomenon. Maybe filling up an AP's memory is so fast
>> that the probability of a signal hitting while owning the mutex is close
>> to zero, or something.
>
> Very few of Emacs' signal handlers actually touch a barrier.  I've also
> not seen any reproducible recipes for the "signal issues" that the igc
> branch supposedly has.
>
> Helmut

I've got something, using Eglot/clangd.

Executive summary: If a signal interrupts the Emacs thread, and we are
"inside MPS", meaning the Emacs thread owns an arena's mutex, the macOS
port thread can try to acquire the same mutex and won't get it because the
Emacs thread that owns it is stopped by the signal.

It goes like this:

  ** mps_commit

  mps_commit
  -> mps_ap_trip
     -> ArenaEnter

  ** ArenaEnter

  ArenaEnter
  -> ArenaEnterLock non-recursive
    -> LockClaim arena lock
      -> pthread_mutex_lock.
    -> ShieldEnter
      -> shieldQueueReset
         Sets some members of a Shield, .inside = true

  ** MPS port thread

  protCatchThread
  -> mach_msg waiting
  -> protCatchOne thread involved has been stopped by OS
    -> ArenaAccess
      -> arenaClaimRingLock
         -> LockClaimGlobal
           -> LockClaim globalLock (static LockStruct)
      -> RING_FOR uninteresting macrology
      -> ArenaEnter for the 1 arena we have

  ** Global lock (uninteresting here)

  LockInitGlobal
  -> LockInit globalLock
  -> LockInit globalLockRecLock
     -> pthread_mutex_init with no interesting attrs

So, I'd say

- It's possible to freeze on macOS because of hitting a barrier, or
  allocating Lisp objects. I don't believe that I would have overlooked
  something preventing it.

- It's apparently very unlikely to happen, for me at least it
  never happened in, don't know, half a year or so of daily usage.
  Maybe one could make it happen when profiling for long enough.

- Only signal handlers are affected, and what you said about signal
  handlers using Lisp and so on.
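
The lock situation traced above can be sketched in a few lines of C.
This is only an illustration under assumptions: arena_lock stands in
for the arena's pthread mutex, port_thread for the Mach exception
thread, and port_thread_lock_result is a made-up driver.  The real code
path uses a blocking lock claim, which would wait forever; the sketch
uses pthread_mutex_trylock instead so that it terminates and reports
EBUSY.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Stand-in for the non-recursive arena lock (default mutex type).  */
static pthread_mutex_t arena_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for the port thread handling a barrier hit: it needs the
   arena lock.  Stores the trylock result in *ARG.  */
static void *
port_thread (void *arg)
{
  int *result = arg;
  assert (result != NULL);
  *result = pthread_mutex_trylock (&arena_lock);
  if (*result == 0)
    pthread_mutex_unlock (&arena_lock);
  return NULL;
}

/* Run the port thread once and return what its trylock saw.
   If HOLD_LOCK is nonzero, the calling thread owns the arena lock for
   the whole time, like an Emacs thread that was interrupted "inside
   MPS" and is now stuck in a signal handler.
   Returns 0 or EBUSY from the trylock, or -1 on setup errors.  */
int
port_thread_lock_result (int hold_lock)
{
  pthread_t tid;
  int result = -1;

  if (hold_lock && pthread_mutex_lock (&arena_lock) != 0)
    return -1;

  int err = pthread_create (&tid, NULL, port_thread, &result);
  if (err == 0)
    err = pthread_join (tid, NULL);

  if (hold_lock)
    pthread_mutex_unlock (&arena_lock);
  return err == 0 ? result : -1;
}
```

With the lock held by the caller, the second thread sees EBUSY; a
blocking pthread_mutex_lock in its place would never return as long as
the owner cannot make progress.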

Hm.



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30  9:47                   ` Helmut Eller
@ 2024-12-30 11:54                     ` Gerd Möllmann
  0 siblings, 0 replies; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-30 11:54 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, spd, pipcet, emacs-devel

Helmut Eller <eller.helmut@gmail.com> writes:

> On Mon, Dec 30 2024, Helmut Eller wrote:
>
>> My interpretation of this design document[*], is that MPS's arena lock
>
> That one:
>
> https://memory-pool-system.readthedocs.io/en/latest/design/thread-safety.html

Thanks.



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 10:46         ` Pip Cet via Emacs development discussions.
@ 2024-12-30 12:00           ` Gerd Möllmann
  2024-12-30 12:07           ` Gerd Möllmann
  1 sibling, 0 replies; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-30 12:00 UTC (permalink / raw)
  To: Pip Cet; +Cc: Eli Zaretskii, spd, emacs-devel

Pip Cet <pipcet@protonmail.com> writes:

> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
>
>> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
>> Anyway, it definitely seems to be the case that MPS is _not_ running GCs
>> concurrently, unless it would do things that I find highly unlikely.
>>
>> I find that a bit, let's say, disappointing, TBH :-(.
>
> Well, I think that MPS is bring-your-own-thread concurrent.  I'm not
> sure the current MPS can usefully be concurrent, because of the thread
> suspension thing, but we can run it in another thread if we want to.
>
> (If it turns out that using a separate thread isn't an advantage, we
> should look at reducing the size of our roots (or protecting them)
> rather than giving up on the idea entirely.  As long as other threads
> freely move references to and from global roots, detecting unreachable
> objects is hard.)
>
> Pip

If they had said something, maybe I wouldn't have tried to use MPS in
Emacs. Or maybe I would have, who knows.

Anyway, I must say that Emacs with igc feels a lot better and is as
stable. I don't think I'll ever go back :-).



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 10:46         ` Pip Cet via Emacs development discussions.
  2024-12-30 12:00           ` Gerd Möllmann
@ 2024-12-30 12:07           ` Gerd Möllmann
  1 sibling, 0 replies; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-30 12:07 UTC (permalink / raw)
  To: Pip Cet; +Cc: Eli Zaretskii, spd, emacs-devel

Pip Cet <pipcet@protonmail.com> writes:

> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
>
>> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
>> Anyway, it definitely seems to be the case that MPS is _not_ running GCs
>> concurrently, unless it would do things that I find highly unlikely.
>>
>> I find that a bit, let's say, disappointing, TBH :-(.
>
> Well, I think that MPS is bring-your-own-thread concurrent.  I'm not
> sure the current MPS can usefully be concurrent, because of the thread
> suspension thing, but we can run it in another thread if we want to.
>
> (If it turns out that using a separate thread isn't an advantage, we
> should look at reducing the size of our roots (or protecting them)
> rather than giving up on the idea entirely.  As long as other threads
> freely move references to and from global roots, detecting unreachable
> objects is hard.)

(BTW, just remembering. MPS has root protection in its API, but it's
not implemented.)



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 11:11             ` Pip Cet via Emacs development discussions.
@ 2024-12-30 12:13               ` Gerd Möllmann
  0 siblings, 0 replies; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-30 12:13 UTC (permalink / raw)
  To: Pip Cet; +Cc: Helmut Eller, Eli Zaretskii, spd, emacs-devel

Pip Cet <pipcet@protonmail.com> writes:

> (I wasn't going to mention it, but there's always my experiment with the
> SpiderMonkey garbage collector.  Lots of bitrot, and I never got things
> working quite the way I wanted to (no ambiguous references)).

I've noticed that, and also that Daniel C. did something. Maybe others
too. The MPS thing is actually just a coincidence. I wasn't trying to
deliberately ignore all previous effort to improve Emacs' GC. It was
that I wasn't interested in improving Emacs' GC. I was only trying to
learn something about MPS, and Emacs was only a test bed. I guess
sometimes things happen in weird ways :-).




^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 11:18                 ` Pip Cet via Emacs development discussions.
@ 2024-12-30 12:23                   ` Gerd Möllmann
  0 siblings, 0 replies; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-30 12:23 UTC (permalink / raw)
  To: Pip Cet; +Cc: Helmut Eller, Eli Zaretskii, spd, emacs-devel

Pip Cet <pipcet@protonmail.com> writes:

>> In mainline... (pondering to put a smiley).
>
> I'll throw Zig in and run away quickly.

(Zig would be fine, were it not for the future plans of its inventor,
which irritated me enough that I didn't try it in earnest.)

>
>> But something else: Given what I now believe, I think I want to
>> understand better (a bit) why everything appears to work just fine on
>> macOS, with signals. Could you perhaps check if I'm off? MacOS only.
>
> Do we know that?  

Well, that it "appears to..." I know for a fact :-). It never happened for
me. But please see my other mails.

> I think macOS doesn't use signals as heavily as other platforms do,
> and I don't know how SIGPROF is handled on that platform, but I would
> not be surprised if that or SIGALRM require the signal checking thing
> on macOS, too.

SIGPROF and SIGALRM are signals in macOS. Only the hardware faults are
not, like EXC_BAD_ACCESS and the float exceptions. Others I don't
remember ATM.



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 10:27                     ` Helmut Eller
  2024-12-30 11:53                       ` Gerd Möllmann
@ 2024-12-30 12:32                       ` Pip Cet via Emacs development discussions.
  2024-12-30 14:24                         ` Eli Zaretskii
  2024-12-30 14:59                         ` Helmut Eller
  2024-12-30 12:42                       ` Pip Cet via Emacs development discussions.
  2 siblings, 2 replies; 119+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-30 12:32 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Gerd Möllmann, Eli Zaretskii, spd, emacs-devel

"Helmut Eller" <eller.helmut@gmail.com> writes:
>> I find that difficult to understand. But it may be just a
>> statistical phenomenon. Maybe filling up an AP's memory is so fast
>> that the probability of a signal hitting while owning the mutex is close
>> to zero, or something.
>
> Very few of Emacs' signal handlers actually touch a barrier.  I've also

Indeed.  These crashes are rare in typical usage, which doesn't mean we
should delay fixing them until Emacs is "unstable enough".  It already
is, IMHO, because we take that approach too frequently.

> not seen any reproducible recipes for the "signal issues" that the igc
> branch supposedly has.

Removing the SIGPROF protection code should allow Ihor's recipe to crash
again.  And, anyway, there's no reason for the "supposedly": we know MPS
can't possibly deal with the situation because we've seen the code, so
we should fix it rather than ignoring it because it's rare in typical
usage.

If there's any evidence this cannot happen, I haven't seen it.

GC code in general contains many rarely-exercised code paths.  This is
no exception.

Pip




^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 10:27                     ` Helmut Eller
  2024-12-30 11:53                       ` Gerd Möllmann
  2024-12-30 12:32                       ` Pip Cet via Emacs development discussions.
@ 2024-12-30 12:42                       ` Pip Cet via Emacs development discussions.
  2024-12-30 13:40                         ` Gerd Möllmann
  2 siblings, 1 reply; 119+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-30 12:42 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Gerd Möllmann, Eli Zaretskii, spd, emacs-devel

Pip Cet <pipcet@protonmail.com> writes:

> "Helmut Eller" <eller.helmut@gmail.com> writes:
>>> I find that difficult to understand. But it may be just a
>>> statistical phenomenon. Maybe filling up an AP's memory is so fast
>>> that the probability of a signal hitting while owning the mutex is close
>>> to zero, or something.
>>
>> Very few of Emacs' signal handlers actually touch a barrier.  I've also
>
> Indeed.  These crashes are rare in typical usage, which doesn't mean we
> should delay fixing them until Emacs is "unstable enough".  It already
> is, IMHO, because we take that approach too frequently.
>
>> not seen any reproducible recipes for the "signal issues" that the igc
>> branch supposedly has.
>
> Removing the SIGPROF protection code should allow Ihor's recipe to crash
> again.

Confirmed.  Here's the recipe (which, yes, you have already seen):

https://lists.gnu.org/archive/html/emacs-devel/2024-06/msg00560.html

Make igc_busy_p () return false (as we could do if the "supposed" signal
issue weren't real), immediate crash.
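
For readers who haven't looked at the protection code, the general
idea of "check a busy flag in the handler and defer the work" can be
sketched as below.  This is a simplified illustration, not the code in
igc.c: collector_busy is a hypothetical analogue of what igc_busy_p ()
reports, and take_sample, process_pending_signal and
demo_deferred_sample are made-up names.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Nonzero while the (hypothetical) collector owns its lock.  */
static volatile sig_atomic_t collector_busy = 0;
/* Set by the handler when it had to postpone its work.  */
static volatile sig_atomic_t signal_pending = 0;
/* Counts how often the "real" work of the handler was done.  */
static volatile sig_atomic_t samples_taken = 0;

/* Stand-in for work that may touch collector-managed memory.  */
static void
take_sample (void)
{
  samples_taken++;
}

static void
on_sigprof (int signo)
{
  (void) signo;
  if (collector_busy)
    signal_pending = 1;		/* Unsafe now: just remember it.  */
  else
    take_sample ();
}

/* Called from a safe point, after the collector has released its
   lock, to do the work a handler had to postpone.  */
static void
process_pending_signal (void)
{
  if (signal_pending)
    {
      signal_pending = 0;
      take_sample ();
    }
}

/* Return 1 if a signal arriving while busy was deferred to the safe
   point and a later one was handled immediately, 0 if not, -1 on
   setup errors.  */
int
demo_deferred_sample (void)
{
  struct sigaction sa = { 0 }, old;
  sa.sa_handler = on_sigprof;
  sigemptyset (&sa.sa_mask);
  if (sigaction (SIGPROF, &sa, &old) != 0)
    return -1;

  samples_taken = 0;
  signal_pending = 0;

  collector_busy = 1;
  raise (SIGPROF);		/* Handler runs, but only sets the flag.  */
  int during = samples_taken;
  collector_busy = 0;

  process_pending_signal ();	/* Deferred work happens here.  */
  int after = samples_taken;

  raise (SIGPROF);		/* Not busy: handled right away.  */
  int final = samples_taken;

  sigaction (SIGPROF, &old, NULL);
  assert (signal_pending == 0);
  return during == 0 && after == 1 && final == 2;
}
```

If the busy check always reported false, the handler would do its work
even while the lock is held, which is the situation the recipe
exercises.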

Pip




^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30  6:16           ` Gerd Möllmann
@ 2024-12-30 12:51             ` Gerd Möllmann
  2024-12-30 13:09               ` Pip Cet via Emacs development discussions.
  0 siblings, 1 reply; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-30 12:51 UTC (permalink / raw)
  To: Pip Cet; +Cc: Sean Devlin, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 2406 bytes --]

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Pip Cet <pipcet@protonmail.com> writes:
>
>> Speaking of running with a "normal" config: something about my
>> configuration makes buffer_step (the balance_intervals call, in
>> particular) take forever, to the point the mps build becomes unusable.
>> The buffer in question, when I caught it, is an M-x shell buffer of size
>> 8 MB, so I don't understand why it's taking so long.
>>
>> Still investigating, but skipping the buffer_step seems to help.
>
> balance_intervals means text properties. The only candidate I see in
> comint/shell is ANSI escapes. That could be turned on/off with M-x
> ansi-color-for-comint-mode-xy. Only as a workaround, and maybe to check
> if it's that.
>
> What I do in buffer_step in idle time is basically one step of what the
> old GC does in sweep_buffers.
>
> My expectation was that balancing a tree couldn't take long, and that
> this is not called often enough to be a problem if it were expensive. Both
> wrong, as usual.
>
> Not calling balance_intervals is, BTW, not a catastrophic problem. If
> one does anything leading to a graft_intervals_into_buffer, which is
> called in a lot of places in editfns.c and insdel.c, that balances the
> tree. And if not, the tree might become slower for lookup (redisplay),
> but it still works.
>
> <rant> It's BTW well possible that I myself put that balancing into
> sweep_buffers because of redisplay, I seem to remember that. The
> interval tree has always been a source of fun. I hope, some day, some
> kind soul will eradicate it like the GCPROs. </rant>
>
> In any case, what's a solution?
>
> Right now I'm tending to put the balance_intervals in an if so that one
> can turn it on/off with a Lisp variable. Default would be not to balance,
> because I think the problems with degenerate interval trees in
> redisplay were rare, and I don't remember problems outside of
> redisplay. But that was an awful long time ago, OTOH.
>
> That would give us more time to think about a possible strategy to solve
> this.
>
> WDYT?

Patch for that attached. I'm now running with that.

I tried to look at the history of intervals.c to see to which degree
the intervals tree is behaving better than decades ago (I think Stefan
Monnier said it's better), but I couldn't really determine that. Too
much has changed.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Make-balancing-buffer-intervals-optional.patch --]
[-- Type: text/x-patch, Size: 1315 bytes --]

From 0aa8b2f483da11bfd6a6397c56182b5877cb779e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Gerd=20M=C3=B6llmann?= <gerd@gnu.org>
Date: Mon, 30 Dec 2024 13:41:40 +0100
Subject: [PATCH] Make balancing buffer intervals optional

* src/igc.c (buffer_step): Balance intervals only if
igc__balance_intervals is true.
(syms_of_igc): New DEFVAR_BOOL for igc__balance_intervals.
---
 src/igc.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/igc.c b/src/igc.c
index 39158b38f05..964723ce315 100644
--- a/src/igc.c
+++ b/src/igc.c
@@ -3697,7 +3697,8 @@ buffer_step (struct igc_buffer_it *it)
       buffer_it_next (it);
       struct buffer *b = XBUFFER (buf);
       compact_buffer (b);
-      b->text->intervals = balance_intervals (b->text->intervals);
+      if (igc__balance_intervals)
+	b->text->intervals = balance_intervals (b->text->intervals);
       return true;
     }
   return false;
@@ -5100,4 +5101,8 @@ syms_of_igc (void)
 don't do something when idle.  Negative values and values that are not numbers
 are handled as if they were the default value.  */);
   Vigc_step_interval = make_fixnum (0);
+
+  DEFVAR_BOOL ("igc--balance-intervals", igc__balance_intervals,
+     doc: /* Whether to balance buffer intervals when idle.  */);
+  igc__balance_intervals = false;
 }
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 12:51             ` Gerd Möllmann
@ 2024-12-30 13:09               ` Pip Cet via Emacs development discussions.
  2024-12-30 13:28                 ` Gerd Möllmann
  0 siblings, 1 reply; 119+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-30 13:09 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Sean Devlin, emacs-devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
>
>> Pip Cet <pipcet@protonmail.com> writes:
>>
>>> Speaking of running with a "normal" config: something about my
>>> configuration makes buffer_step (the balance_intervals call, in
>>> particular) take forever, to the point the mps build becomes unusable.
>>> The buffer in question, when I caught it, is an M-x shell buffer of size
>>> 8 MB, so I don't understand why it's taking so long.
>>>
>>> Still investigating, but skipping the buffer_step seems to help.
>>
>> balance_intervals means text properties. The only candidate I see in
>> comint/shell is ANSI escapes. That could be turned on/off with M-x
>> ansi-color-for-comint-mode-xy. Only as a workaround, and maybe to check
>> if it's that.
>>
>> What I do in buffer_step in idle time is basically one step of what the
>> old GC does in sweep_buffers.
>>
>> My expectation was that balancing a tree couldn't take long, and that
>> this is not called often enough to be a problem if it were expensive. Both
>> wrong, as usual.
>>
>> Not calling balance_intervals is, BTW, not a catastrophic problem. If
>> one does anything leading to a graft_intervals_into_buffer, which is
>> called in a lot of places in editfns.c and insdel.c, that balances the
>> tree. And if not, the tree might become slower for lookup (redisplay),
>> but it still works.
>>
>> <rant> It's BTW well possible that I myself put that balancing into
>> sweep_buffers because of redisplay, I seem to remember that. The
>> interval tree has always been a source of fun. I hope, some day, some
>> kind soul will eradicate it like the GCPROs. </rant>

(I have a crazy idea for that, too.  Code, too.  But it does away with
the gap buffer, which the regexp code assumes, so we end up creating a
shadow single-string buffer whenever we call into the regexp code, which
is, er, slow.)

>> In any case, what's a solution?
>>
>> Right now I'm tending to put the balance_intervals in an if so that one
>> can turn it on/off with a Lisp variable. Default would be not to balance,
>> because I think the problems with degenerated interval trees in
>> redisplay were rare, and I don't remember problems outside of
>> redisplay. But that was an awful long time ago, OTOH.

I did implement a Lisp variable as well (defaulting it to on because I'm
more conservative than you are :-) ).  I still think it's more likely
it's (also) a bug elsewhere: balancing a tree for an 8 MB buffer should
not take long.

I am currently not calling compact_buffer when the variable is off.
Maybe that's something to look into, too.

In my current session, keypresses (with idle time in between, because I
can't outtype Emacs) become noticeably laggy if I set the variable to t,
but not when it's nil.

Further investigation needed, I think.  Unfortunately, that's going to
require some instrumentation code and then I have to restart my Emacs
session...

Pip




^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 13:09               ` Pip Cet via Emacs development discussions.
@ 2024-12-30 13:28                 ` Gerd Möllmann
  0 siblings, 0 replies; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-30 13:28 UTC (permalink / raw)
  To: Pip Cet; +Cc: Sean Devlin, emacs-devel

Pip Cet <pipcet@protonmail.com> writes:

> I did implement a Lisp variable as well (defaulting it to on because I'm
> more conservative than you are :-) ).  I still think it's more likely
> it's (also) a bug elsewhere: balancing a tree for an 8 MB buffer should
> not take long.

Could be, of course.

>
> I am currently not calling compact_buffer when the variable is off.
> Maybe that's something to look into, too.

I think compact_buffer must be done, otherwise the undo list isn't
truncated. I've attached what I'm now using in my Emacs to another
mail.



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 12:42                       ` Pip Cet via Emacs development discussions.
@ 2024-12-30 13:40                         ` Gerd Möllmann
  2024-12-30 13:53                           ` Pip Cet via Emacs development discussions.
  0 siblings, 1 reply; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-30 13:40 UTC (permalink / raw)
  To: Pip Cet; +Cc: Helmut Eller, Eli Zaretskii, spd, emacs-devel

Pip Cet <pipcet@protonmail.com> writes:

> Pip Cet <pipcet@protonmail.com> writes:
>
>> "Helmut Eller" <eller.helmut@gmail.com> writes:
>>>> I find that difficult to understand. But it may be just a
>>>> statistical phenomenon. Maybe filling up an AP's memory is so fast
>>>> that the probability of a signal hitting while owning the mutex is close
>>>> to zero, or something.
>>>
>>> Very few of Emacs' signal handlers actually touch a barrier.  I've also
>>
>> Indeed.  These crashes are rare in typical usage, which doesn't mean we
>> should delay fixing them until Emacs is "unstable enough".  It already
>> is, IMHO, because we take that approach too frequently.
>>
>>> not seen any reproducible recipes for the "signal issues" that the igc
>>> branch supposedly has.
>>
>> Removing the SIGPROF protection code should allow Ihor's recipe to crash
>> again.
>
> Confirmed.  Here's the recipe (which, yes, you have already seen):
>
> https://lists.gnu.org/archive/html/emacs-devel/2024-06/msg00560.html
>
> Make igc_busy_p () return false (as we could do if the "supposed" signal
> issue weren't real), immediate crash.
>
> Pip

With

  modified   src/profiler.c
  @@ -347,7 +347,7 @@ record_backtrace (struct profiler_log *plog, EMACS_INT count)
   add_sample (struct profiler_log *plog, EMACS_INT count)
   {
   #ifdef HAVE_MPS
  -  if (igc_busy_p ())
  +  if (false)
   #else
     if (EQ (backtrace_top_function (), QAutomatic_GC)) /* bug#60237 */
   #endif

Result:

        1083  89% + command-execute
         106   8% + redisplay_internal (C function)
          13   1% + timer-event-handler
           1   0% + help--append-keystrokes-help
           1   0% + #<byte-code-function A81>
           0   0%   ...

🤷



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 13:40                         ` Gerd Möllmann
@ 2024-12-30 13:53                           ` Pip Cet via Emacs development discussions.
  2024-12-30 14:02                             ` Gerd Möllmann
  0 siblings, 1 reply; 119+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-30 13:53 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Helmut Eller, Eli Zaretskii, spd, emacs-devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Pip Cet <pipcet@protonmail.com> writes:
>
>> Pip Cet <pipcet@protonmail.com> writes:
>>
>>> "Helmut Eller" <eller.helmut@gmail.com> writes:
>>>>> I find that difficult to understand. But it may be just a
>>>>> statistical phenomenon. Maybe filling up an AP's memory is so fast
>>>>> that the probability of a signal hitting while owning the mutex is close
>>>>> to zero, or something.
>>>>
>>>> Very few of Emacs' signal handlers actually touch a barrier.  I've also
>>>
>>> Indeed.  These crashes are rare in typical usage, which doesn't mean we
>>> should delay fixing them until Emacs is "unstable enough".  It already
>>> is, IMHO, because we take that approach too frequently.
>>>
>>>> not seen any reproducible recipes for the "signal issues" that the igc
>>>> branch supposedly has.
>>>
>>> Removing the SIGPROF protection code should allow Ihor's recipe to crash
>>> again.
>>
>> Confirmed.  Here's the recipe (which, yes, you have already seen):
>>
>> https://lists.gnu.org/archive/html/emacs-devel/2024-06/msg00560.html
>>
>> Make igc_busy_p () return false (as we could do if the "supposed" signal
>> issue weren't real), immediate crash.
>>
>> Pip
>
> With
>
>   modified   src/profiler.c
>   @@ -347,7 +347,7 @@ record_backtrace (struct profiler_log *plog, EMACS_INT count)
>    add_sample (struct profiler_log *plog, EMACS_INT count)
>    {
>    #ifdef HAVE_MPS
>   -  if (igc_busy_p ())
>   +  if (false)
>    #else
>      if (EQ (backtrace_top_function (), QAutomatic_GC)) /* bug#60237 */
>    #endif

This is after removing gc_signal_handler_can_run, right?

Even if, in addition, I block signals in the SIGSEGV handler, I see
crashes here, FWIW, but not quite every time that recipe is run.  It
seems to work even less reliably when rr is in use (and then I realized
I was running an optimized build so the trace was useless, sigh).

I'd still like to see at least one crash on macOS, but there's nothing
I'm aware of that would prevent such crashes, only make them (maybe
much) less likely.  For starters, fewer pages so fewer barriers :-)

Pip




^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 13:53                           ` Pip Cet via Emacs development discussions.
@ 2024-12-30 14:02                             ` Gerd Möllmann
  2024-12-30 14:32                               ` Pip Cet via Emacs development discussions.
  0 siblings, 1 reply; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-30 14:02 UTC (permalink / raw)
  To: Pip Cet; +Cc: Helmut Eller, Eli Zaretskii, spd, emacs-devel

Pip Cet <pipcet@protonmail.com> writes:

>> With
>>
>>   modified   src/profiler.c
>>   @@ -347,7 +347,7 @@ record_backtrace (struct profiler_log *plog, EMACS_INT count)
>>    add_sample (struct profiler_log *plog, EMACS_INT count)
>>    {
>>    #ifdef HAVE_MPS
>>   -  if (igc_busy_p ())
>>   +  if (false)
>>    #else
>>      if (EQ (backtrace_top_function (), QAutomatic_GC)) /* bug#60237 */
>>    #endif
>
> This is after removing gc_signal_handler_can_run, right?

Right. And meanwhile it also survived the same parsing 100 times in a
loop with profiling on around it. 

> Even if, in addition, I block signals in the SIGSEGV handler, I see
> crashes here, FWIW, but not quite every time that recipe is run.  It
> seems to work even less reliably when rr is in use (and then I realized
> I was running an optimized build so the trace was useless, sigh).
>
> I'd still like to see at least one crash on macOS, but there's nothing
> I'm aware of that would prevent such crashes, only make them (maybe
> much) less likely.  For starters, fewer pages so fewer barriers :-)
>
> Pip

Me too, but I agree with you. It should eventually freeze, there's
nothing preventing it that I could point to.



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 12:32                       ` Pip Cet via Emacs development discussions.
@ 2024-12-30 14:24                         ` Eli Zaretskii
  2024-12-30 14:59                         ` Helmut Eller
  1 sibling, 0 replies; 119+ messages in thread
From: Eli Zaretskii @ 2024-12-30 14:24 UTC (permalink / raw)
  To: Pip Cet; +Cc: eller.helmut, gerd.moellmann, spd, emacs-devel

> Date: Mon, 30 Dec 2024 12:32:31 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: Gerd Möllmann <gerd.moellmann@gmail.com>, Eli Zaretskii <eliz@gnu.org>, spd@toadstyle.org, emacs-devel@gnu.org
> 
> Removing the SIGPROF protection code should allow Ihor's recipe to crash
> again.

At least some (if not all) of Ihor's crashes were due to SIGCHLD, not
SIGPROF.



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 14:02                             ` Gerd Möllmann
@ 2024-12-30 14:32                               ` Pip Cet via Emacs development discussions.
  2024-12-30 14:52                                 ` Gerd Möllmann
  0 siblings, 1 reply; 119+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-30 14:32 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Helmut Eller, Eli Zaretskii, spd, emacs-devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Pip Cet <pipcet@protonmail.com> writes:
>
>>> With
>>>
>>>   modified   src/profiler.c
>>>   @@ -347,7 +347,7 @@ record_backtrace (struct profiler_log *plog, EMACS_INT count)
>>>    add_sample (struct profiler_log *plog, EMACS_INT count)
>>>    {
>>>    #ifdef HAVE_MPS
>>>   -  if (igc_busy_p ())
>>>   +  if (false)
>>>    #else
>>>      if (EQ (backtrace_top_function (), QAutomatic_GC)) /* bug#60237 */
>>>    #endif
>>
>> This is after removing gc_signal_handler_can_run, right?
>
> Right. And meanwhile it also survived the same parsing 100 times in a
> loop with profiling on around it.

Yes, without the SIGSEGV-alloc-SIGSEGV case, it becomes less likely to
hit a memory barrier.  But the alloc-SIGPROF-SIGSEGV case is real, we've
seen backtraces for it.

>> Even if, in addition, I block signals in the SIGSEGV handler, I see
>> crashes here, FWIW, but not quite every time that recipe is run.  It
>> seems to work even less reliably when rr is in use (and then I realized
>> I was running an optimized build so the trace was useless, sigh).
>>
>> I'd still like to see at least one crash on macOS, but there's nothing
>> I'm aware of that would prevent such crashes, only make them (maybe
>> much) less likely.  For starters, fewer pages so fewer barriers :-)
>>
>> Pip
>
> Me too, but I agree with you. It should eventually freeze, there's
> nothing preventing it that I could point to.

Or produce invalid data, which seems more likely: if we're scanning the
segment, the memory barrier won't be in place, but the contents will be
invalid, pointing to objects which we already decided to move (IOW, if
my double-mapping idea were in place, it'd be easier to catch these
bugs).  A crash is better than derefing random pointers pointing to the
(new) young generation when the object has been moved.

(I'm running into separate rr bugs on both systems I'm testing this on,
so we'll just have to assume that's what's behind those crashes, at least
sometimes).

Does adding mps_message_poll (global_igc->arena) in the signal handler
produce a crash/deadlock for you?  Last thing I'm asking you to try
today, I promise :-)

Pip




^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 14:32                               ` Pip Cet via Emacs development discussions.
@ 2024-12-30 14:52                                 ` Gerd Möllmann
  0 siblings, 0 replies; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-30 14:52 UTC (permalink / raw)
  To: Pip Cet; +Cc: Helmut Eller, Eli Zaretskii, spd, emacs-devel

Pip Cet <pipcet@protonmail.com> writes:

> Does adding mps_message_poll (global_igc->arena) in the signal handler
> produce a crash/deadlock for you?  Last thing I'm asking you to try
> today, I promise :-)

:-). When I do this:

1 file changed, 1 insertion(+)
src/igc.c | 1 +

modified   src/igc.c
@@ -4966,6 +4966,7 @@ igc_alloc_dump (size_t nbytes)
 bool
 igc_busy_p (void)
 {
+  mps_message_poll (global_igc->arena);
   return mps_arena_busy (global_igc->arena);
 }
 

with no other changes, except the one commit reverted, I get an assertion
failure shortly after M-x profiler-start

Fatal error 6: Aborted
Backtrace:
0   emacs                               0x0000000104fceca8 emacs_backtrace + 104
1   emacs                               0x000000010515b31c terminate_due_to_signal + 220
2   emacs                               0x00000001050c090c syms_of_igc + 0
3   emacs                               0x00000001051081a4 LockClaim + 144
4   emacs                               0x0000000105107f78 ArenaEnterLock + 96
5   emacs                               0x00000001050fa380 mps_message_poll + 24
6   emacs                               0x00000001050c008c igc_busy_p + 28
7   emacs                               0x00000001050b8a54 add_sample + 44
8   emacs                               0x0000000104fce6fc deliver_process_signal + 64
9   libsystem_platform.dylib            0x0000000190c56e04 _sigtramp + 56
10  libsystem_c.dylib                   0x0000000190ad9ba4 clock + 52
11  emacs                               0x00000001050f4c10 ArenaPoll + 164
12  emacs                               0x00000001050f5b68 mps_ap_fill + 160
13  emacs                               0x00000001050c0f24 alloc_impl + 128
14  emacs                               0x00000001050bf290 igc_add_marker + 136
15  emacs                               0x0000000104fe21cc set_marker_internal + 576
16  emacs                               0x0000000104fe2700 Fcopy_marker + 180
17  emacs                               0x0000000104f558b0 save_window_save + 748
18  emacs                               0x0000000104f555ac Fcurrent_window_configuration + 296
19  emacs                               0x0000000104fe3bfc Fread_from_minibuffer + 1044
20  minibuffer-1b0f548b-1c53371b.eln    0x0000000106a680f4 F636f6d706c6574696e672d726561642d64656661
...



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 11:53                       ` Gerd Möllmann
@ 2024-12-30 14:54                         ` Eli Zaretskii
  2024-12-30 15:05                           ` Gerd Möllmann
  2024-12-30 15:05                           ` Pip Cet via Emacs development discussions.
  0 siblings, 2 replies; 119+ messages in thread
From: Eli Zaretskii @ 2024-12-30 14:54 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: eller.helmut, spd, pipcet, emacs-devel

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: Eli Zaretskii <eliz@gnu.org>,  spd@toadstyle.org,
>   pipcet@protonmail.com,  emacs-devel@gnu.org
> Date: Mon, 30 Dec 2024 12:53:33 +0100
> 
> Executive summary: If a signal interrupts the Emacs thread, and we are
> "inside MPS", meaning the Emacs thread owns an arena's mutex, the macOS
> port thread can try to acquire the same mutex and won't get it because the
> Emacs thread that owns it is stopped by the signal.

Are you sure the "because the Emacs thread that owns it is stopped by
the signal" part is correct?  AFAIU, since the macOS port thread is a
different thread, it tries to take the arena lock, and is simply stuck
there, waiting for the main thread to release the lock.  IOW, this is
a simple mutex-based synchronization between two threads, that's all:
one thread takes the lock, the other must wait until the first one
releases it before it itself can take the lock.

Moreover, the main thread is not stopped, it runs the signal handler.
Right?

Or what am I missing?



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 12:32                       ` Pip Cet via Emacs development discussions.
  2024-12-30 14:24                         ` Eli Zaretskii
@ 2024-12-30 14:59                         ` Helmut Eller
  2024-12-30 15:15                           ` Eli Zaretskii
                                             ` (2 more replies)
  1 sibling, 3 replies; 119+ messages in thread
From: Helmut Eller @ 2024-12-30 14:59 UTC (permalink / raw)
  To: Pip Cet; +Cc: Gerd Möllmann, Eli Zaretskii, spd, emacs-devel

On Mon, Dec 30 2024, Pip Cet wrote:

> "Helmut Eller" <eller.helmut@gmail.com> writes:
>> Very few of Emacs' signal handlers actually touch a barrier.  I've also
>
> Indeed.  These crashes are rare in typical usage, which doesn't mean we
> should delay fixing them until Emacs is "unstable enough".  It already
> is, IMHO, because we take that approach too frequently.
>
>> not seen any reproducible recipes for the "signal issues" that the igc
>> branch supposedly has.
>
> Removing the SIGPROF protection code should allow Ihor's recipe to crash
> again.

Talking about SIGPROF protection code.  It appears to me now (again)
that, for the SIGPROF handler, this pseudo code

  if (mps_arena_busy (<arena>))
    plog->gc_count = saturated_add (plog->gc_count, count);
  else
    record_backtrace (plog, count);

is safe.  If we don't hold the lock, then mps_arena_busy returns false
and we can access memory.  We are safe even if another thread has
claimed the lock by the time that we reach record_backtrace: the SIGPROF
handler will just block until the lock is released.
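
A minimal, self-contained sketch of that branch, for illustration only.
The names demo_log, demo_arena_busy and demo_add_sample are hypothetical
stand-ins, not the code in src/profiler.c; the busy check is a plain flag
instead of a call to mps_arena_busy, and saturating at INTMAX_MAX is an
assumption:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct profiler_log.  */
struct demo_log
{
  intmax_t gc_count;   /* samples that arrived while the arena was busy */
  intmax_t backtraces; /* samples for which a backtrace would be recorded */
};

/* Stand-in for mps_arena_busy (<arena>); a real handler would ask MPS.  */
static volatile bool demo_arena_busy;

/* Add B to A, clamping at INTMAX_MAX.  Both must be non-negative.  */
static intmax_t
demo_saturated_add (intmax_t a, intmax_t b)
{
  assert (a >= 0 && b >= 0);
  return b > INTMAX_MAX - a ? INTMAX_MAX : a + b;
}

/* The branch from the pseudo code: only count the sample when the
   arena is busy, otherwise do the work that touches managed memory
   (reduced to a counter here).  */
static void
demo_add_sample (struct demo_log *plog, intmax_t count)
{
  if (demo_arena_busy)
    plog->gc_count = demo_saturated_add (plog->gc_count, count);
  else
    plog->backtraces = demo_saturated_add (plog->backtraces, count);
}
```

The point of the sketch is only that the "busy" branch touches nothing
but the log's own counter, so it cannot run into a barrier.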

Does somebody disagree?

Helmut



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 14:54                         ` Eli Zaretskii
@ 2024-12-30 15:05                           ` Gerd Möllmann
  2024-12-30 15:05                           ` Pip Cet via Emacs development discussions.
  1 sibling, 0 replies; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-30 15:05 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eller.helmut, spd, pipcet, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Gerd Möllmann <gerd.moellmann@gmail.com>
>> Cc: Eli Zaretskii <eliz@gnu.org>,  spd@toadstyle.org,
>>   pipcet@protonmail.com,  emacs-devel@gnu.org
>> Date: Mon, 30 Dec 2024 12:53:33 +0100
>> 
>> Executive summary: If a signal interrupts the Emacs thread, and we are
>> "inside MPS", meaning the Emacs thread owns an arena's mutex, the macOS
>> port thread can try to acquire the same mutex and won't get it because the
>> Emacs thread that owns it is stopped by the signal.
>
> Are you sure the "because the Emacs thread that owns it is stopped by
> the signal" part is correct?  

It's not correct, sorry. When it lands in the port thread, the Emacs
thread has been suspended by the OS.

> AFAIU, since the macOS port thread is a different thread, it tries to
> take the arena lock, and is simply stuck there, waiting for the main
> thread to release the lock. IOW, this is a simple mutex-based
> synchronization between two threads, that's all: one thread takes the
> lock, the other must wait until the first one releases it before it
> itself can take the lock.

That's right. The port thread can't get the lock because it's owned by
the Emacs thread. And the Emacs thread is suspended by the OS.

>
> Moreover, the main thread is not stopped, it runs the signal handler.
> Right?

The Emacs thread is suspended by the OS to let the port thread handle the
situation. After the thread is suspended, the OS sends the port a
message. The port thread is waiting for such a message with mach_msg.
The port does its thing, and when it replies to the OS message
accordingly, the OS lets the Emacs thread continue. But that reply is
never sent because the port is stuck on the mutex.

Hope that's more correct.




^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 14:54                         ` Eli Zaretskii
  2024-12-30 15:05                           ` Gerd Möllmann
@ 2024-12-30 15:05                           ` Pip Cet via Emacs development discussions.
  1 sibling, 0 replies; 119+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-30 15:05 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Gerd Möllmann, eller.helmut, spd, emacs-devel

"Eli Zaretskii" <eliz@gnu.org> writes:

>> From: Gerd Möllmann <gerd.moellmann@gmail.com>
>> Cc: Eli Zaretskii <eliz@gnu.org>,  spd@toadstyle.org,
>>   pipcet@protonmail.com,  emacs-devel@gnu.org
>> Date: Mon, 30 Dec 2024 12:53:33 +0100
>>
>> Executive summary: If a signal interrupts the Emacs thread, and we are
>> "inside MPS", meaning the Emacs thread owns an arena's mutex, the macOS
>> port thread can try to acquire the same mutex and won't get it because the
>> Emacs thread that owns it is stopped by the signal.
>
> Are you sure the "because the Emacs thread that owns it is stopped by
> the signal" part is correct?  AFAIU, since the macOS port thread is a
> different thread, it tries to take the arena lock, and is simply stuck
> there, waiting for the main thread to release the lock.  IOW, this is
> a simple mutex-based synchronization between two threads, that's all:
> one thread takes the lock, the other must wait until the first one
> releases it before it itself can take the lock.

My understanding is that the main thread cannot continue running while
it's waiting for a memory barrier to be removed.  What would it use for
the memory values it cannot access?

> Moreover, the main thread is not stopped, it runs the signal handler.

It is stopped, in the signal handler, waiting for the memory barrier to
be removed.

Pip




^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 14:59                         ` Helmut Eller
@ 2024-12-30 15:15                           ` Eli Zaretskii
  2024-12-30 15:24                             ` Helmut Eller
  2024-12-30 15:25                           ` Pip Cet via Emacs development discussions.
  2024-12-30 15:30                           ` Gerd Möllmann
  2 siblings, 1 reply; 119+ messages in thread
From: Eli Zaretskii @ 2024-12-30 15:15 UTC (permalink / raw)
  To: Helmut Eller; +Cc: pipcet, gerd.moellmann, spd, emacs-devel

> From: Helmut Eller <eller.helmut@gmail.com>
> Cc: Gerd Möllmann <gerd.moellmann@gmail.com>,  Eli
>  Zaretskii <eliz@gnu.org>,
>   spd@toadstyle.org,  emacs-devel@gnu.org
> Date: Mon, 30 Dec 2024 15:59:08 +0100
> 
>   if (mps_arena_busy (<arena>))
>     plog->gc_count = saturated_add (plog->gc_count, count);
>   else
>     record_backtrace (plog, count);
> 
> is safe.  If we don't hold the lock, then mps_arena_busy returns false
> and we can access memory.  We are safe even if another thread has
> claimed the lock by the time that we reach record_backtrace: the SIGPROF
> handler will just block until the lock is released.

Which other thread could have claimed the lock?



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 15:15                           ` Eli Zaretskii
@ 2024-12-30 15:24                             ` Helmut Eller
  0 siblings, 0 replies; 119+ messages in thread
From: Helmut Eller @ 2024-12-30 15:24 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: pipcet, gerd.moellmann, spd, emacs-devel

On Mon, Dec 30 2024, Eli Zaretskii wrote:

>> From: Helmut Eller <eller.helmut@gmail.com>
>> Cc: Gerd Möllmann <gerd.moellmann@gmail.com>,  Eli
>>  Zaretskii <eliz@gnu.org>,
>>   spd@toadstyle.org,  emacs-devel@gnu.org
>> Date: Mon, 30 Dec 2024 15:59:08 +0100
>> 
>>   if (mps_arena_busy (<arena>))
>>     plog->gc_count = saturated_add (plog->gc_count, count);
>>   else
>>     record_backtrace (plog, count);
>> 
>> is safe.  If we don't hold the lock, then mps_arena_busy returns false
>> and we can access memory.  We are safe even if another thread has
>> claimed the lock by the time that we reach record_backtrace: the SIGPROF
>> handler will just block until the lock is released.
>
> Which other thread could have claimed the lock?

Any other registered thread that we might have, e.g. the
signal_receiver_thread from Pip's proposal.

Helmut



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 14:59                         ` Helmut Eller
  2024-12-30 15:15                           ` Eli Zaretskii
@ 2024-12-30 15:25                           ` Pip Cet via Emacs development discussions.
  2024-12-30 15:34                             ` Gerd Möllmann
  2024-12-30 19:02                             ` Helmut Eller
  2024-12-30 15:30                           ` Gerd Möllmann
  2 siblings, 2 replies; 119+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-30 15:25 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Gerd Möllmann, Eli Zaretskii, spd, emacs-devel

"Helmut Eller" <eller.helmut@gmail.com> writes:

> On Mon, Dec 30 2024, Pip Cet wrote:
>
>> "Helmut Eller" <eller.helmut@gmail.com> writes:
>>> Very few of Emacs' signal handlers actually touch a barrier.  I've also
>>
>> Indeed.  These crashes are rare in typical usage, which doesn't mean we
>> should delay fixing them until Emacs is "unstable enough".  It already
>> is, IMHO, because we take that approach too frequently.
>>
>>> not seen any reproducible recipes for the "signal issues" that the igc
>>> branch supposedly has.
>>
>> Removing the SIGPROF protection code should allow Ihor's recipe to crash
>> again.
>
> Talking about SIGPROF protection code.  It appears to me now (again)
> that, for the SIGPROF handler, this pseudo code
>
>   if (mps_arena_busy (<arena>))
>     plog->gc_count = saturated_add (plog->gc_count, count);
>   else
>     record_backtrace (plog, count);
>
> is safe.

Yes, it's safe, because it does have protection code.  The question was
the extent to which this protection code is required, and whether we can
find another way to deliver signals which doesn't require it.

IMHO, we now have three solutions that are still in the running (my
order of preference):

1. keep the current code and special-case some signals which are needed
for user responsiveness
2. use the signal serialization thread you proposed
3. use an allocation thread, but keep SIGSEGV on the main thread

The first two can be combined with blocking other signals in the SIGSEGV
handler (which would make all platforms behave the same and avoid
SIGSEGV-handler-SIGSEGV races).  The third requires it.
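
To make "blocking other signals in the handler" concrete, here is a
small sketch using sigaction's sa_mask.  It is not the igc code: SIGUSR1
stands in for SIGSEGV and SIGUSR2 for the "other" signal (say, SIGPROF),
because raising a real SIGSEGV in a demo is awkward, and all function
and variable names are made up for the example:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

static volatile sig_atomic_t in_primary;       /* inside primary_handler? */
static volatile sig_atomic_t other_runs;       /* times other_handler ran */
static volatile sig_atomic_t other_ran_nested; /* did it run inside primary? */

/* Stand-in for the handler of the "other" signal.  */
static void
other_handler (int sig)
{
  (void) sig;
  other_runs++;
  if (in_primary)
    other_ran_nested = 1;
}

/* Stand-in for the SIGSEGV handler.  SIGUSR2 is in its sa_mask, so the
   raise below only marks the signal pending; it is not delivered while
   this handler is still running.  */
static void
primary_handler (int sig)
{
  (void) sig;
  in_primary = 1;
  raise (SIGUSR2);
  in_primary = 0;
}

/* Install both handlers and trigger the primary one once.  */
static void
demo_install_and_raise (void)
{
  struct sigaction other, primary;
  int res;

  memset (&other, 0, sizeof other);
  other.sa_handler = other_handler;
  sigemptyset (&other.sa_mask);
  res = sigaction (SIGUSR2, &other, NULL);
  assert (res == 0);

  memset (&primary, 0, sizeof primary);
  primary.sa_handler = primary_handler;
  sigemptyset (&primary.sa_mask);
  sigaddset (&primary.sa_mask, SIGUSR2); /* block it while we run */
  res = sigaction (SIGUSR1, &primary, NULL);
  assert (res == 0);

  raise (SIGUSR1);
}
```

With the mask in place the second handler runs only after the first one
has returned, which is the serialization the combination above relies on.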

I'd like to take (3) out of the picture for now.  It's working here
(still forwarding SIGSEGV, but that's not the point), and performance
seems okay (better, for some unknown reason; maybe it's just chunk
size), but I couldn't make it behave reproducibly when run under rr, and
it makes a "GC is rare" assumption when it splits MPS objects.  I'd like
to be able to use rr, and "rare GC" assumptions mean that further
improvements to or even fine-tuning of MPS would have to happen
differently.

IOW, MPS needs hacking to make an allocation thread truly viable, mostly
to distinguish fast-path and slow-path allocations.  Not a major change,
I hope, but also out of scope for getting things to work with upstream
MPS, which remains my goal.

I think improving (1) is most likely to do that (but that would require
a shadow signal mask, most likely).

> If we don't hold the lock, then mps_arena_busy returns false
> and we can access memory.  We are safe even if another thread has
> claimed the lock by the time that we reach record_backtrace: the SIGPROF
> handler will just block until the lock is released.
>
> Does somebody disagree?

Not me.

Pip




^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 14:59                         ` Helmut Eller
  2024-12-30 15:15                           ` Eli Zaretskii
  2024-12-30 15:25                           ` Pip Cet via Emacs development discussions.
@ 2024-12-30 15:30                           ` Gerd Möllmann
  2024-12-30 16:57                             ` Helmut Eller
  2 siblings, 1 reply; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-30 15:30 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Pip Cet, Eli Zaretskii, spd, emacs-devel

Helmut Eller <eller.helmut@gmail.com> writes:

> On Mon, Dec 30 2024, Pip Cet wrote:
>
>> "Helmut Eller" <eller.helmut@gmail.com> writes:
>>> Very few of Emacs' signal handlers actually touch a barrier.  I've also
>>
>> Indeed.  These crashes are rare in typical usage, which doesn't mean we
>> should delay fixing them until Emacs is "unstable enough".  It already
>> is, IMHO, because we take that approach too frequently.
>>
>>> not seen any reproducible recipes for the "signal issues" that the igc
>>> branch supposedly has.
>>
>> Removing the SIGPROF protection code should allow Ihor's recipe to crash
>> again.
>
> Talking about SIGPROF protection code.  It appears to me now (again)
> that, for the SIGPROF handler, this pseudo code
>
>   if (mps_arena_busy (<arena>))
>     plog->gc_count = saturated_add (plog->gc_count, count);
>   else
>     record_backtrace (plog, count);
>
> is safe.  If we don't hold the lock, then mps_arena_busy returns false
> and we can access memory.  We are safe even if another thread has
> claimed the lock by the time that we reach record_backtrace: the SIGPROF
> handler will just block until the lock is released.
>
> Does somebody disagree?
>
> Helmut

That's this:

mps_bool_t mps_arena_busy(mps_arena_t arena)
{
  /* Don't call ArenaEnter -- the purpose of this function is to
   * determine if the arena lock is held */
  AVER(TESTT(Arena, arena));
  return ArenaBusy(arena);
}

Bool ArenaBusy(Arena arena)
{
  return LockIsHeld(ArenaGlobals(arena)->lock);
}

Bool (LockIsHeld)(Lock lock)
{
  AVERT(Lock, lock);
  if (pthread_mutex_trylock(&lock->mut) == 0) {
    Bool claimed = lock->claims > 0;
    int res = pthread_mutex_unlock(&lock->mut);
    AVER(res == 0);
    return claimed;
  }
  return TRUE;
}

There might be a small window after pthread_mutex_trylock and being back
in the signal handler. Can anything happen in this window?

If no other Emacs threads are running, and the Emacs thread is in the
signal handler, we can trust the "false" from the mps_arena_busy.

If other threads were running (and I don't think that's currently the case),
the "false" also means that the Emacs thread in the signal handler
can't have the lock.

In the "true" case from mps_arena_busy, we're anyway not doing much.

So I think I agree.
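
For reference, a minimal standalone sketch of the probe pattern discussed
above.  It assumes a plain default-type pthread mutex, and the names
probe_lock, probe_lock_claim etc. are made up for illustration; they are
not MPS API.  It only shows what the "true" and "false" answers mean when
the probe runs in normal (non-signal) context:

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the MPS lock: a mutex plus a claim count.  */
struct probe_lock
{
  pthread_mutex_t mut;
  unsigned long claims;
};

static void
probe_lock_init (struct probe_lock *lock)
{
  pthread_mutex_init (&lock->mut, NULL);
  lock->claims = 0;
}

/* Take the lock and record the claim while the mutex is held.  */
static void
probe_lock_claim (struct probe_lock *lock)
{
  pthread_mutex_lock (&lock->mut);
  lock->claims = 1;
}

/* Drop the claim, then give the mutex back.  */
static void
probe_lock_release (struct probe_lock *lock)
{
  lock->claims = 0;
  pthread_mutex_unlock (&lock->mut);
}

/* Same shape as LockIsHeld: if trylock fails, somebody holds the mutex;
   if it succeeds, the claim count decides.  The answer can already be
   stale by the time the caller acts on it.  */
static bool
probe_lock_is_held (struct probe_lock *lock)
{
  if (pthread_mutex_trylock (&lock->mut) == 0)
    {
      bool claimed = lock->claims > 0;
      pthread_mutex_unlock (&lock->mut);
      return claimed;
    }
  return true;
}
```

With no claim outstanding the probe returns false; between
probe_lock_claim and probe_lock_release it returns true, even when asked
from the thread that holds the lock.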



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 15:25                           ` Pip Cet via Emacs development discussions.
@ 2024-12-30 15:34                             ` Gerd Möllmann
  2024-12-30 19:02                             ` Helmut Eller
  1 sibling, 0 replies; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-30 15:34 UTC (permalink / raw)
  To: Pip Cet; +Cc: Helmut Eller, Eli Zaretskii, spd, emacs-devel

Pip Cet <pipcet@protonmail.com> writes:

>> Talking about SIGPROF protection code.  It appears to me now (again)
>> that, for the SIGPROF handler, this pseudo code
>>
>>   if (mps_arena_busy (<arena>))
>>     plog->gc_count = saturated_add (plog->gc_count, count);
>>   else
>>     record_backtrace (plog, count);
>>
>> is safe.
>
> Yes, it's safe, because it does have protection code.  
> The question was the extent to which this protection code is required,
> and whether we can find another way to deliver signals which doesn't
> require it.
>

What do you mean by protection code? Is that the commit I reverted?



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 15:30                           ` Gerd Möllmann
@ 2024-12-30 16:57                             ` Helmut Eller
  2024-12-30 17:41                               ` Gerd Möllmann
                                                 ` (2 more replies)
  0 siblings, 3 replies; 119+ messages in thread
From: Helmut Eller @ 2024-12-30 16:57 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Pip Cet, Eli Zaretskii, spd, emacs-devel

On Mon, Dec 30 2024, Gerd Möllmann wrote:

> Bool (LockIsHeld)(Lock lock)
> {
>   AVERT(Lock, lock);
>   if (pthread_mutex_trylock(&lock->mut) == 0) {
>     Bool claimed = lock->claims > 0;
>     int res = pthread_mutex_unlock(&lock->mut);
>     AVER(res == 0);
>     return claimed;
>   }
>   return TRUE;
> }
>
> There might be a small window after pthread_mutex_trylock and being back
> in the signal handler. Can anything happen in this window?
>
> If no other Emacs threads are running, and the Emacs thread is in the
> signal handler, we can trust the "false" from the mps_arena_busy.

Theoretically, a signal handler could interrupt the Emacs thread and
lock the mutex without unlocking it.  That would be a very unusual
signal handler.  I hope no other surprises happen in signal handlers.

Helmut



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 16:57                             ` Helmut Eller
@ 2024-12-30 17:41                               ` Gerd Möllmann
  2024-12-30 17:49                               ` Pip Cet via Emacs development discussions.
  2024-12-30 17:49                               ` Eli Zaretskii
  2 siblings, 0 replies; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-30 17:41 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Pip Cet, Eli Zaretskii, spd, emacs-devel

Helmut Eller <eller.helmut@gmail.com> writes:

> On Mon, Dec 30 2024, Gerd Möllmann wrote:
>
>> Bool (LockIsHeld)(Lock lock)
>> {
>>   AVERT(Lock, lock);
>>   if (pthread_mutex_trylock(&lock->mut) == 0) {
>>     Bool claimed = lock->claims > 0;
>>     int res = pthread_mutex_unlock(&lock->mut);
>>     AVER(res == 0);
>>     return claimed;
>>   }
>>   return TRUE;
>> }
>>
>> There might be a small window after pthread_mutex_trylock and being back
>> in the signal handler. Can anything happen in this window?
>>
>> If no other Emacs threads are running, and the Emacs thread is in the
>> signal handler, we can trust the "false" from the mps_arena_busy.
>
> Theoretically, a signal handler could interrupt the Emacs thread and
> lock the mutex without unlocking it.  That would be a very unusual
> signal handler.  I hope no other surprises happen in signal handlers.
>
> Helmut

Right, that one I forgot. A nested signal may interrupt the signal
handler and acquire the lock in its signal handler but not release it.
The effect would be that the original signal handler would see a false
from mps_arena_busy which would not be the truth.

I'd call that a bug in the nested signal handler.



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 16:57                             ` Helmut Eller
  2024-12-30 17:41                               ` Gerd Möllmann
@ 2024-12-30 17:49                               ` Pip Cet via Emacs development discussions.
  2024-12-30 18:33                                 ` Helmut Eller
  2024-12-30 17:49                               ` Eli Zaretskii
  2 siblings, 1 reply; 119+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-30 17:49 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Gerd Möllmann, Eli Zaretskii, spd, emacs-devel

"Helmut Eller" <eller.helmut@gmail.com> writes:

> On Mon, Dec 30 2024, Gerd Möllmann wrote:
>
>> Bool (LockIsHeld)(Lock lock)
>> {
>>   AVERT(Lock, lock);
>>   if (pthread_mutex_trylock(&lock->mut) == 0) {
>>     Bool claimed = lock->claims > 0;
>>     int res = pthread_mutex_unlock(&lock->mut);
>>     AVER(res == 0);
>>     return claimed;
>>   }
>>   return TRUE;
>> }
>>
>> There might be a small window after pthread_mutex_trylock and being back
>> in the signal handler. Can anything happen in this window?
>>
>> If no other Emacs threads are running, and the Emacs thread is in the
>> signal handler, we can trust the "false" from the mps_arena_busy.
>
> Theoretically, a signal handler could interrupt the Emacs thread and
> lock the mutex without unlocking it.

I don't think that's a problem.  Here's why:

We'd have to call the POSIX police.  I believe it's a conscious POSIX
decision not to allow hand-over of locks from one thread/signal handler
(those can't even call _trylock) to another; this is relevant to the
priority inversion scenario (if we had a "background" GC thread running
at a lower priority (whatever that would mean?  E-core?  Different
power-performance prefs?  Throttled?), the main thread would have to
find a way to boost its priority (move it to a P-core, unthrottle,
whatever) if we're actually waiting for it to release the arena lock.
One way would be to take over its lock (easy) and stack (hard) while
retaining thread settings, but POSIX decided we don't want to do that.
Thank you, POSIX (in this case)).

On inhomogeneous systems (almost everything you can buy today, ESP32 to
server CPU), "priority inversion" can happen with just two threads,
since priority is no longer defined by access to a single or several
identical cores.

But anyway, POSIX prohibits it, glibc on GNU/Linux doesn't support it,
I'm not aware of any other systems making that useful, certainly not for
Emacs.

> That would be a very unusual signal handler.  I hope no other surprises happen in signal handlers.

longjmp-based green threads?  (MPS currently assumes a simple linear
stack, gcc can produce split-stack code, getting that combination to
work would be good; I can dig up the patch for enabling it for the old
GC).

Pip




^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 16:57                             ` Helmut Eller
  2024-12-30 17:41                               ` Gerd Möllmann
  2024-12-30 17:49                               ` Pip Cet via Emacs development discussions.
@ 2024-12-30 17:49                               ` Eli Zaretskii
  2024-12-30 18:37                                 ` Gerd Möllmann
  2 siblings, 1 reply; 119+ messages in thread
From: Eli Zaretskii @ 2024-12-30 17:49 UTC (permalink / raw)
  To: Helmut Eller; +Cc: gerd.moellmann, pipcet, spd, emacs-devel

> From: Helmut Eller <eller.helmut@gmail.com>
> Cc: Pip Cet <pipcet@protonmail.com>,  Eli Zaretskii <eliz@gnu.org>,
>   spd@toadstyle.org,  emacs-devel@gnu.org
> Date: Mon, 30 Dec 2024 17:57:02 +0100
> 
> On Mon, Dec 30 2024, Gerd Möllmann wrote:
> 
> > Bool (LockIsHeld)(Lock lock)
> > {
> >   AVERT(Lock, lock);
> >   if (pthread_mutex_trylock(&lock->mut) == 0) {
> >     Bool claimed = lock->claims > 0;
> >     int res = pthread_mutex_unlock(&lock->mut);
> >     AVER(res == 0);
> >     return claimed;
> >   }
> >   return TRUE;
> > }
> >
> > There might be a small window after pthread_mutex_trylock and being back
> > in the signal handler. Can anything happen in this window?
> >
> > If no other Emacs threads are running, and the Emacs thread is in the
> > signal handler, we can trust the "false" from the mps_arena_busy.
> 
> Theoretically, a signal handler could interrupt the Emacs thread and
> lock the mutex without unlocking it.  That would be a very unusual
> signal handler.  I hope no other surprises happen in signal handlers.

We should keep our signal handlers very simple and safe, that's true.
It is not very hard, and much of that work was already done, when we
stopped running complex stuff in SIGIO handler etc.



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 17:49                               ` Pip Cet via Emacs development discussions.
@ 2024-12-30 18:33                                 ` Helmut Eller
  0 siblings, 0 replies; 119+ messages in thread
From: Helmut Eller @ 2024-12-30 18:33 UTC (permalink / raw)
  To: Pip Cet; +Cc: Gerd Möllmann, Eli Zaretskii, spd, emacs-devel

On Mon, Dec 30 2024, Pip Cet wrote:

>> Theoretically, a signal handler could interrupt the Emacs thread and
>> lock the mutex without unlocking it.
>
> I don't think that's a problem.  Here's why:
>
> We'd have to call the POSIX police.  I believe it's a conscious POSIX
> decision not to allow hand-over of locks from one thread/signal handler
> (those can't even call _trylock) to another;

Do you mean, it is not allowed to call pthread_mutex_trylock in a signal
handler?

[...]
> But anyway, POSIX prohibits it, glibc on GNU/Linux doesn't support it,
> I'm not aware of any other systems making that useful, certainly not for
> Emacs.
>
>> That would be a very unusual signal handler.  I hope no other
>> surprises happen in signal handlers.
>
> longjmp-based green threads?  (MPS currently assumes a simple linear
> stack, gcc can produce split-stack code, getting that combination to
> work would be good; I can dig up the patch for enabling it for the old
> GC).

I think MPS would require a special thread module that can handle green
threads.  Probably nothing we have to worry about.

Helmut




^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 17:49                               ` Eli Zaretskii
@ 2024-12-30 18:37                                 ` Gerd Möllmann
  2024-12-30 19:15                                   ` Eli Zaretskii
  0 siblings, 1 reply; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-30 18:37 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Helmut Eller, pipcet, spd, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Helmut Eller <eller.helmut@gmail.com>
>> Cc: Pip Cet <pipcet@protonmail.com>,  Eli Zaretskii <eliz@gnu.org>,
>>   spd@toadstyle.org,  emacs-devel@gnu.org
>> Date: Mon, 30 Dec 2024 17:57:02 +0100
>> 
>> On Mon, Dec 30 2024, Gerd Möllmann wrote:
>> 
>> > Bool (LockIsHeld)(Lock lock)
>> > {
>> >   AVERT(Lock, lock);
>> >   if (pthread_mutex_trylock(&lock->mut) == 0) {
>> >     Bool claimed = lock->claims > 0;
>> >     int res = pthread_mutex_unlock(&lock->mut);
>> >     AVER(res == 0);
>> >     return claimed;
>> >   }
>> >   return TRUE;
>> > }
>> >
>> > There might be a small window after pthread_mutex_trylock and being back
>> > in the signal handler. Can anything happen in this window?
>> >
>> > If no other Emacs threads are running, and the Emacs thread is in the
>> > signal handler, we can trust the "false" from the mps_arena_busy.
>> 
>> Theoretically, a signal handler could interrupt the Emacs thread and
>> lock the mutex without unlocking it.  That would be a very unusual
>> signal handler.  I hope no other surprises happen in signal handlers.
>
> We should keep our signal handlers very simple and safe, that's true.
> It is not very hard, and much of that work was already done, when we
> stopped running complex stuff in SIGIO handler etc.

So, to summarize, everyone agrees with Helmut? 



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 15:25                           ` Pip Cet via Emacs development discussions.
  2024-12-30 15:34                             ` Gerd Möllmann
@ 2024-12-30 19:02                             ` Helmut Eller
  2024-12-30 20:03                               ` Pip Cet via Emacs development discussions.
  1 sibling, 1 reply; 119+ messages in thread
From: Helmut Eller @ 2024-12-30 19:02 UTC (permalink / raw)
  To: Pip Cet; +Cc: Gerd Möllmann, Eli Zaretskii, spd, emacs-devel

On Mon, Dec 30 2024, Pip Cet wrote:

> IMHO, we now have three solutions that are still in the running (my
> order of preference):
>
> 1. keep the current code and special-case some signals which are needed
> for user responsiveness
> 2. use the signal serialization thread you proposed
> 3. use an allocation thread, but keep SIGSEGV on the main thread

I think this is missing:

4. add callbacks to ArenaEnter/ArenaLeave to block/unblock signals

Perhaps add 5. (or make it a variant of 1)

5. special-case some performance critical handlers and simplify all
   others so that they are obviously harmless.

   The SIGIO handler is an example for a harmless signal handler.
   handle_alarm_signal seems harmless too.

   handle_interrupt_signal is definitely not harmless, but may not be
   performance critical.

   So far SIGPROF seems to be the only performance critical handler.
   
Helmut



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 18:37                                 ` Gerd Möllmann
@ 2024-12-30 19:15                                   ` Eli Zaretskii
  2024-12-30 19:55                                     ` Gerd Möllmann
  0 siblings, 1 reply; 119+ messages in thread
From: Eli Zaretskii @ 2024-12-30 19:15 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: eller.helmut, pipcet, spd, emacs-devel

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: Helmut Eller <eller.helmut@gmail.com>,  pipcet@protonmail.com,
>   spd@toadstyle.org,  emacs-devel@gnu.org
> Date: Mon, 30 Dec 2024 19:37:38 +0100
> 
> So, to summarize, everyone agrees with Helmut? 

That the SIGPROF handler in the form he described would be safe?  I
agree.



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 19:15                                   ` Eli Zaretskii
@ 2024-12-30 19:55                                     ` Gerd Möllmann
  2024-12-31  7:34                                       ` Helmut Eller
  0 siblings, 1 reply; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-30 19:55 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eller.helmut, pipcet, spd, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Gerd Möllmann <gerd.moellmann@gmail.com>
>> Cc: Helmut Eller <eller.helmut@gmail.com>,  pipcet@protonmail.com,
>>   spd@toadstyle.org,  emacs-devel@gnu.org
>> Date: Mon, 30 Dec 2024 19:37:38 +0100
>> 
>> So, to summarize, everyone agrees with Helmut? 
>
> That the SIGPROF handler in the form he described would be safe?  I
> agree.

What we have in scratch/igc:

static void
handle_profiler_signal (int signal)
{
  EMACS_INT count = 1;
#if defined HAVE_ITIMERSPEC && defined HAVE_TIMER_GETOVERRUN
  if (profiler_timer_ok)
    {
      int overruns = timer_getoverrun (profiler_timer);
      eassert (overruns >= 0);
      count += overruns;
    }
#endif
  add_sample (&cpu, count);
}

static void
add_sample (struct profiler_log *plog, EMACS_INT count)
{
#ifdef HAVE_MPS
  if (igc_busy_p ())
#else
  if (EQ (backtrace_top_function (), QAutomatic_GC)) /* bug#60237 */
#endif
    /* Special case the time-count inside GC because the hash-table
       code is not prepared to be used while the GC is running.
       More specifically it uses ASIZE at many places where it does
       not expect the ARRAY_MARK_FLAG to be set.  We could try and
       harden the hash-table code, but it doesn't seem worth the
       effort.  */
    plog->gc_count = saturated_add (plog->gc_count, count);
  else
    record_backtrace (plog, count);
}

bool
igc_busy_p (void)
{
  return mps_arena_busy (global_igc->arena);
}

Now the question is if that's what Helmut was describing.



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 19:02                             ` Helmut Eller
@ 2024-12-30 20:03                               ` Pip Cet via Emacs development discussions.
  0 siblings, 0 replies; 119+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-30 20:03 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Gerd Möllmann, Eli Zaretskii, spd, emacs-devel

"Helmut Eller" <eller.helmut@gmail.com> writes:

> On Mon, Dec 30 2024, Pip Cet wrote:
>
>> IMHO, we now have three solutions that are still in the running (my
>> order of preference):
>>
>> 1. keep the current code and special-case some signals which are needed
>> for user responsiveness
>> 2. use the signal serialization thread you proposed
>> 3. use an allocation thread, but keep SIGSEGV on the main thread
>
> I think this is missing:
>
> 4. add callbacks to ArenaEnter/ArenaLeave to block/unblock signals

Everything that requires MPS modifications for all users (rather than
having a few users benefit from them, if they choose to run a modified
MPS) isn't on my list.

> Perhaps add 5. (or make it a variant of 1)
>
> 5. special-case some performance critical handlers and simplify all
>    others so that they are obviously harmless.

That's a "rewrite Emacs signal handling" project, not an "add MPS to
Emacs" project.  That said, I'm all for it.  The MPS part is definitely
subsumed by (1).

>    The SIGIO handler is an example for a harmless signal handler.
>    handle_alarm_signal seems harmless too.

It needs low latency, but not high throughput, AFAIK.  I consider that
part of "performance".

>    handle_interrupt_signal is definitely not harmless, but may not be
>    performance critical.

That's meant to do unsafe things on some systems, IIUC.

>    So far SIGPROF seems to be the only performance critical handler.

I wouldn't count out SIGALRM.

Pip




^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-30 19:55                                     ` Gerd Möllmann
@ 2024-12-31  7:34                                       ` Helmut Eller
  2024-12-31  9:19                                         ` Gerd Möllmann
                                                           ` (2 more replies)
  0 siblings, 3 replies; 119+ messages in thread
From: Helmut Eller @ 2024-12-31  7:34 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, pipcet, spd, emacs-devel

On Mon, Dec 30 2024, Gerd Möllmann wrote:

> Eli Zaretskii <eliz@gnu.org> writes:
>
>>> From: Gerd Möllmann <gerd.moellmann@gmail.com>
>>> Cc: Helmut Eller <eller.helmut@gmail.com>,  pipcet@protonmail.com,
>>>   spd@toadstyle.org,  emacs-devel@gnu.org
>>> Date: Mon, 30 Dec 2024 19:37:38 +0100
>>> 
>>> So, to summarize, everyone agrees with Helmut?

Except the POSIX police: it says that pthread_mutex_trylock isn't async
signal safe.  I suppose this also makes it unsafe to use MPS's fault
handler in an async signal handler.  Bummer.  (Does the police take
bribes?)

>> That the SIGPROF handler in the form he described would be safe?  I
>> agree.
>
> What we have in scratch/igc:
[...]
> static void
> add_sample (struct profiler_log *plog, EMACS_INT count)
> {
> #ifdef HAVE_MPS
>   if (igc_busy_p ())
> #else
>   if (EQ (backtrace_top_function (), QAutomatic_GC)) /* bug#60237 */
> #endif
>     /* Special case the time-count inside GC because the hash-table
>        code is not prepared to be used while the GC is running.
>        More specifically it uses ASIZE at many places where it does
>        not expect the ARRAY_MARK_FLAG to be set.  We could try and
>        harden the hash-table code, but it doesn't seem worth the
>        effort.  */
>     plog->gc_count = saturated_add (plog->gc_count, count);
>   else
>     record_backtrace (plog, count);
> }
[...]
>
> Now the question is if that's what Helmut was describing.

Yes, that's what I meant.

I wonder if the backtrace that we see in the signal handler is any
different from the backtrace that we would see at the next safe point
(i.e. the next time maybe_quit is called).  If the backtraces are the
same, then we could record the backtrace there; that would be much
nicer.

Helmut
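
A minimal sketch of the "record at the next safe point" idea, assuming
atomic_long is lock-free on the platform (which is what makes it usable
from a signal handler).  The names note_sample and drain_samples are
invented for illustration, and where the drain would be called from
(maybe_quit or elsewhere) is left open:

```c
#include <signal.h>
#include <stdatomic.h>

/* Assumption: lock-free atomics, so the handler never takes a lock.  */
_Static_assert (ATOMIC_LONG_LOCK_FREE == 2,
                "sketch assumes lock-free atomic_long");

/* Samples seen by the handler but not yet attributed to a backtrace.  */
static atomic_long pending_samples;

/* Hypothetical SIGPROF handler: it only bumps a counter and touches
   neither the Lisp heap nor any mutex.  */
static void
note_sample (int signo)
{
  (void) signo;
  atomic_fetch_add_explicit (&pending_samples, 1, memory_order_relaxed);
}

/* Hypothetical safe-point hook: fetch and reset the counter in one step.
   The caller would record one backtrace and weight it by the result.  */
static long
drain_samples (void)
{
  return atomic_exchange_explicit (&pending_samples, 0,
                                   memory_order_relaxed);
}
```

Note that the samples get attributed to wherever the safe point happens
to be, not to the code that was running when the signal arrived, so this
only helps if the two backtraces are close enough.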



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-31  7:34                                       ` Helmut Eller
@ 2024-12-31  9:19                                         ` Gerd Möllmann
  2024-12-31  9:51                                           ` Helmut Eller
                                                             ` (2 more replies)
  2024-12-31 10:09                                         ` Pip Cet via Emacs development discussions.
  2024-12-31 13:14                                         ` Eli Zaretskii
  2 siblings, 3 replies; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-31  9:19 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, pipcet, spd, emacs-devel

Helmut Eller <eller.helmut@gmail.com> writes:

> On Mon, Dec 30 2024, Gerd Möllmann wrote:
>
>> Eli Zaretskii <eliz@gnu.org> writes:
>>
>>>> From: Gerd Möllmann <gerd.moellmann@gmail.com>
>>>> Cc: Helmut Eller <eller.helmut@gmail.com>,  pipcet@protonmail.com,
>>>>   spd@toadstyle.org,  emacs-devel@gnu.org
>>>> Date: Mon, 30 Dec 2024 19:37:38 +0100
>>>>
>>>> So, to summarize, everyone agrees with Helmut?
>
> Except the POSIX police: it says that pthread_mutex_trylock isn't async
> signal safe.  I suppose this also makes it unsafe to use MPS's fault
> handler in an async signal handler.  Bummer.  (Does the police take
> bribes?)

Thanks. I guess it shows that I couldn't keep up with my mail, sorry for
that.

So we have this picture, I think

              t1           t2                    t3
  ------------|------------|---------------------|-----------------> t
   signal        pthread      other stuff          signal handler
   handler       trylock      until return to      branching
   calling                    signal handler       on result of busy
   mps_arena_
   busy

We have a window [t1, t2] where nested signals lead to undefined
behavior, and a window [t2, t3] where threads and nested signals can
come into play, but that's not a problem, iff signal handlers don't
leave a lock behind them.

Hm. Have you perhaps looked at a pthread implementation, what such a
mutex actually is on Linux?

>>
>> Now the question is if that's what Helmut was describing.
>
> Yes, that's what I meant.

Thanks.

> I wonder if the backtrace that we see in the signal handler is any
> different from the backtrace that we would see at the next safe point
> (i.e. the next time maybe_quit is called).  If the backtraces are the
> same, then we could record the backtrace there; that would be much
> nicer.

Yeah.




^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-31  9:19                                         ` Gerd Möllmann
@ 2024-12-31  9:51                                           ` Helmut Eller
  2024-12-31 10:00                                             ` Gerd Möllmann
  2024-12-31 13:49                                             ` Pip Cet via Emacs development discussions.
  2024-12-31  9:51                                           ` Gerd Möllmann
  2024-12-31 13:18                                           ` Eli Zaretskii
  2 siblings, 2 replies; 119+ messages in thread
From: Helmut Eller @ 2024-12-31  9:51 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, pipcet, spd, emacs-devel

On Tue, Dec 31 2024, Gerd Möllmann wrote:

> Helmut Eller <eller.helmut@gmail.com> writes:
[...]
>> Except the POSIX police: it says that pthread_mutex_trylock isn't async
>> signal safe.  I suppose this also makes it unsafe to use MPS's fault
>> handler in an async signal handler.  Bummer.  (Does the police take
>> bribes?)
[...]
> So we have this picture, I think
>
>               t1           t2                    t3
>   ------------|------------|---------------------|-----------------> t
>    signal        pthread      other stuff          signal handler
>    handler       trylock      until return to      branching
>    calling                    signal handler       on result of busy
>    mps_arena_
>    busy
>
> We have a window [t1, t2] where nested signals lead to undefined
> behavior, and a window [t2, t3] where threads and nested signals can
> come into play, but that's not a problem, iff signal handlers don't
> leave a lock behind them.
>
> Hm. Have you perhaps looked at a pthread implementation, what such a
> mutex actually is on Linux?

Judging from the source here

https://sourceware.org/git/?p=glibc.git;a=blob_plain;f=sysdeps/nptl/bits/struct_mutex.h;hb=HEAD

and here

https://sourceware.org/git/?p=glibc.git;a=blob_plain;f=nptl/pthread_mutex_trylock.c;hb=HEAD

I would say that the mutex is a struct with multiple fields and that
pthread_mutex_trylock is neither a syscall nor an atomic instruction.
The struct may simply be in an inconsistent state at the time t0, the
beginning of the SIGPROF handler.

Helmut
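
One way around the non-atomic mutex struct, sketched under the
assumption that something can run code on arena entry and exit (for
example hooks as in option 4 mentioned earlier in the thread; MPS has no
such hooks today).  The handler then reads a single lock-free atomic
instead of calling pthread_mutex_trylock.  All names are hypothetical:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Assumption: atomic_int is lock-free, so reading it from a signal
   handler can never observe a half-updated value.  */
_Static_assert (ATOMIC_INT_LOCK_FREE == 2,
                "sketch assumes lock-free atomic_int");

/* Number of outstanding arena entries, process-wide.  */
static atomic_int arena_depth;

/* Hypothetical hook: would run right after the arena lock is claimed.  */
static void
arena_mark_enter (void)
{
  atomic_fetch_add (&arena_depth, 1);
}

/* Hypothetical hook: would run right before the arena lock is released.  */
static void
arena_mark_leave (void)
{
  atomic_fetch_sub (&arena_depth, 1);
}

/* What the SIGPROF handler would call instead of mps_arena_busy.  The
   answer is conservative: it is true if any thread is inside the arena,
   not only the interrupted one.  */
static bool
arena_busy_from_handler (void)
{
  return atomic_load (&arena_depth) > 0;
}
```

There is still a short stretch between claiming the lock and bumping the
counter where the handler would see "not busy", so the ordering of the
hooks relative to the lock matters and would need care.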



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-31  9:19                                         ` Gerd Möllmann
  2024-12-31  9:51                                           ` Helmut Eller
@ 2024-12-31  9:51                                           ` Gerd Möllmann
  2024-12-31 13:18                                           ` Eli Zaretskii
  2 siblings, 0 replies; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-31  9:51 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, pipcet, spd, emacs-devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Hm. Have you perhaps looked at a pthread implementation, what such a
> mutex actually is on Linux?

Unless I got lost in Apple's sources, trylock on macOS is the below.
I find it hard to tell if a signal arriving here, while this is being
called from a signal handler, would do damage.

int
pthread_mutex_trylock(pthread_mutex_t *mutex)
{
	return _pthread_mutex_lock(mutex, true);
}

static inline int
_pthread_mutex_lock(pthread_mutex_t *mutex, bool trylock)
{
	if (os_unlikely(!_pthread_mutex_check_signature_fast(mutex))) {
		return _pthread_mutex_lock_init_slow(mutex, trylock);
	}

	if (os_unlikely(_pthread_mutex_is_fairshare(mutex))) {
		return _pthread_mutex_fairshare_lock(mutex, trylock);
	}

	if (os_unlikely(_pthread_mutex_uses_ulock(mutex))) {
		return _pthread_mutex_ulock_lock(mutex, trylock);
	}

#if ENABLE_USERSPACE_TRACE
	return _pthread_mutex_firstfit_lock_slow(mutex, trylock);
#elif PLOCKSTAT
	if (PLOCKSTAT_MUTEX_ACQUIRE_ENABLED() || PLOCKSTAT_MUTEX_ERROR_ENABLED()) {
		return _pthread_mutex_firstfit_lock_slow(mutex, trylock);
	}
#endif

	return _pthread_mutex_firstfit_lock(mutex, trylock);
}

static inline int
_pthread_mutex_firstfit_lock(pthread_mutex_t *mutex, bool trylock)
{
	/*
	 * This is the first-fit fast path. The fairshare fast-ish path is in
	 * _pthread_mutex_fairshare_lock()
	 */
	uint64_t *tidaddr;
	MUTEX_GETTID_ADDR(mutex, &tidaddr);
	uint64_t selfid = _pthread_threadid_self_np_direct();

	mutex_seq *seqaddr;
	MUTEX_GETSEQ_ADDR(mutex, &seqaddr);

	mutex_seq oldseq, newseq;
	mutex_seq_load(seqaddr, &oldseq);

	if (os_unlikely(!trylock && (oldseq.lgenval & PTH_RWL_EBIT))) {
		return _pthread_mutex_firstfit_lock_slow(mutex, trylock);
	}

	bool gotlock;
	do {
		newseq = oldseq;
		gotlock = is_rwl_ebit_clear(oldseq.lgenval);

		if (trylock && !gotlock) {
#if __LP64__
			// The sequence load is atomic, so we can bail here without writing
			// it and avoid some unnecessary coherence traffic - rdar://57259033
			os_atomic_thread_fence(acquire);
			return EBUSY;
#else
			// A trylock on a held lock will fail immediately. But since
			// we did not load the sequence words atomically, perform a
			// no-op CAS64 to ensure that nobody has unlocked concurrently.
#endif
		} else if (os_likely(gotlock)) {
			// In first-fit, getting the lock simply adds the E-bit
			newseq.lgenval |= PTH_RWL_EBIT;
		} else {
			return _pthread_mutex_firstfit_lock_slow(mutex, trylock);
		}
	} while (os_unlikely(!mutex_seq_atomic_cmpxchgv(seqaddr, &oldseq, &newseq,
			acquire)));

	if (os_likely(gotlock)) {
		os_atomic_store_wide(tidaddr, selfid, relaxed);
		return 0;
	} else if (trylock) {
		return EBUSY;
	} else {
		__builtin_trap();
	}
}




^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-31  9:51                                           ` Helmut Eller
@ 2024-12-31 10:00                                             ` Gerd Möllmann
  2024-12-31 13:49                                             ` Pip Cet via Emacs development discussions.
  1 sibling, 0 replies; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-31 10:00 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, pipcet, spd, emacs-devel

Helmut Eller <eller.helmut@gmail.com> writes:

>> Hm. Have you perhaps looked at a pthread implementation, what such a
>> mutex actually is on Linux?
>
> Judging from the source here
>
> https://sourceware.org/git/?p=glibc.git;a=blob_plain;f=sysdeps/nptl/bits/struct_mutex.h;hb=HEAD
>
> and here
>
> https://sourceware.org/git/?p=glibc.git;a=blob_plain;f=nptl/pthread_mutex_trylock.c;hb=HEAD
>
> I would say that the mutex is a struct with multiple fields and that
> pthread_mutex_trylock is neither a syscall nor an atomic instruction.
> The struct may simply be in an inconsistent state at the time t0, the
> beginning of the SIGPROF handler.
>
> Helmut

Thanks. It's similar on macOS. Too bad, I had hoped for some OS call or
something.



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-31  7:34                                       ` Helmut Eller
  2024-12-31  9:19                                         ` Gerd Möllmann
@ 2024-12-31 10:09                                         ` Pip Cet via Emacs development discussions.
  2024-12-31 13:27                                           ` Eli Zaretskii
  2024-12-31 13:14                                         ` Eli Zaretskii
  2 siblings, 1 reply; 119+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-31 10:09 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Gerd Möllmann, Eli Zaretskii, spd, emacs-devel

"Helmut Eller" <eller.helmut@gmail.com> writes:

> On Mon, Dec 30 2024, Gerd Möllmann wrote:
>
>> Eli Zaretskii <eliz@gnu.org> writes:
>>
>>>> From: Gerd Möllmann <gerd.moellmann@gmail.com>
>>>> Cc: Helmut Eller <eller.helmut@gmail.com>,  pipcet@protonmail.com,
>>>>   spd@toadstyle.org,  emacs-devel@gnu.org
>>>> Date: Mon, 30 Dec 2024 19:37:38 +0100
>>>>
>>>> So, to summarize, everyone agrees with Helmut?
>
> Except the POSIX police: it says that pthread_mutex_trylock isn't async
> signal safe.  I suppose this also makes it unsafe to use MPS's fault
> handler in an async signal handler.  Bummer.  (Does the police take
> bribes?)
>
>>> That the SIGPROF handler in the form he described would be safe?  I
>>> agree.
>>
>> What we have in scratch/igc:
> [...]
>> static void
>> add_sample (struct profiler_log *plog, EMACS_INT count)
>> {
>> #ifdef HAVE_MPS
>>   if (igc_busy_p ())
>> #else
>>   if (EQ (backtrace_top_function (), QAutomatic_GC)) /* bug#60237 */
>> #endif
>>     /* Special case the time-count inside GC because the hash-table
>>        code is not prepared to be used while the GC is running.
>>        More specifically it uses ASIZE at many places where it does
>>        not expect the ARRAY_MARK_FLAG to be set.  We could try and
>>        harden the hash-table code, but it doesn't seem worth the
>>        effort.  */
>>     plog->gc_count = saturated_add (plog->gc_count, count);
>>   else
>>     record_backtrace (plog, count);
>> }
> [...]
>>
>> Now the question is if that's what Helmut was describing.
>
> Yes, that's what I meant.
>
> I wonder if the backtrace that we see in the signal handler is any
> different from the backtrace that we would see at the next safe point
> (i.e. the next time maybe_quit is called).

If we keep a shadow signal mask, the only requirement for a safe point
is that we made some progress OR the lock was released.  But the
backtrace will change if we wait for the next maybe_quit, IIUC.

maybe_quit is not a great safe point, it's just the best we have.  It's
insufficient if Emacs becomes idle, and how often we call rarely_quit is
quite unpredictable.

> If the backtraces are the same, then we could record the backtrace
> there; that would be much nicer.

I'm still hoping for more useful backtraces.  Those require looking at
the C stack or global variables; I'd prefer not to make the assumption
that the SIGPROF handler is interested only in some words of the specpdl.

If modifying MPS is fair game, we could at least eliminate the
igc_busy_p () false positives, on systems nice enough to let us know
which thread holds a locked mutex.  On GNU/Linux, only same-thread
deadlocks are detected with EDEADLK so the distinction can be made.

Pip




^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-31  7:34                                       ` Helmut Eller
  2024-12-31  9:19                                         ` Gerd Möllmann
  2024-12-31 10:09                                         ` Pip Cet via Emacs development discussions.
@ 2024-12-31 13:14                                         ` Eli Zaretskii
  2024-12-31 14:19                                           ` Pip Cet via Emacs development discussions.
  2024-12-31 14:40                                           ` Helmut Eller
  2 siblings, 2 replies; 119+ messages in thread
From: Eli Zaretskii @ 2024-12-31 13:14 UTC (permalink / raw)
  To: Helmut Eller; +Cc: gerd.moellmann, pipcet, spd, emacs-devel

> From: Helmut Eller <eller.helmut@gmail.com>
> Cc: Eli Zaretskii <eliz@gnu.org>,  pipcet@protonmail.com,
>   spd@toadstyle.org,  emacs-devel@gnu.org
> Date: Tue, 31 Dec 2024 08:34:42 +0100
> 
> On Mon, Dec 30 2024, Gerd Möllmann wrote:
> 
> > Eli Zaretskii <eliz@gnu.org> writes:
> >
> >>> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> >>> Cc: Helmut Eller <eller.helmut@gmail.com>,  pipcet@protonmail.com,
> >>>   spd@toadstyle.org,  emacs-devel@gnu.org
> >>> Date: Mon, 30 Dec 2024 19:37:38 +0100
> >>> 
> >>> So, to summarize, everyone agrees with Helmut?
> 
> Except the POSIX police: it says that pthread_mutex_trylock isn't async
> signal safe.  I suppose this also makes it unsafe to use MPS's fault
> handler in an async signal handler.  Bummer.  (Does the police take
> bribes?)

Doesn't MPS itself call it from a SIGSEGV handler?

> I wonder if the backtrace that we see in the signal handler is any
> different from the backtrace that we would see at the next safe point
> (i.e. the next time maybe_quit is called).

I think we cannot rely on that, because maybe_quit must be called by
hand, it isn't magic.  We call it from various places in the
interpreter, which could well be in some other place of a Lisp
program.

Once again, why not ask the MPS folks to give us a callback?  Or maybe
we could try hacking MPS ourselves first, to see if that does the job,
and ask them then?



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-31  9:19                                         ` Gerd Möllmann
  2024-12-31  9:51                                           ` Helmut Eller
  2024-12-31  9:51                                           ` Gerd Möllmann
@ 2024-12-31 13:18                                           ` Eli Zaretskii
  2024-12-31 14:15                                             ` Gerd Möllmann
  2 siblings, 1 reply; 119+ messages in thread
From: Eli Zaretskii @ 2024-12-31 13:18 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: eller.helmut, pipcet, spd, emacs-devel

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: Eli Zaretskii <eliz@gnu.org>,  pipcet@protonmail.com,
>   spd@toadstyle.org,  emacs-devel@gnu.org
> Date: Tue, 31 Dec 2024 10:19:15 +0100
> 
> Helmut Eller <eller.helmut@gmail.com> writes:
> 
> > Except the POSIX police: it says that pthread_mutex_trylock isn't async
> > signal safe.  I suppose this also makes it unsafe to use MPS's fault
> > handler in an async signal handler.  Bummer.  (Does the police take
> > bribes?)
> 
> Thanks. I guess it shows that I couldn't keep up with my mail, sorry for
> that.
> 
> So we have this picture, I think
> 
>               t1           t2                    t3
>   ------------|------------|---------------------|-----------------> t
>    signal        pthread      other stuff          signal handler
>    handler       trylock      until return to      branching
>    calling                    signal handler       on result of busy
>    mps_arena_
>    busy
> 
> We have a window [t1, t2] where the nested signals lead to undefined
> behavior, and [t2, t3] where threads and nested signals can come into
> play, but that's not a problem, iff signal handlers don't leave a lock
> behind them.

If the problem is other signals in [t1, t2], we could install the
signal handler in a way that masks all other signals while the handler
runs.



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-31 10:09                                         ` Pip Cet via Emacs development discussions.
@ 2024-12-31 13:27                                           ` Eli Zaretskii
  2024-12-31 14:29                                             ` Pip Cet via Emacs development discussions.
  0 siblings, 1 reply; 119+ messages in thread
From: Eli Zaretskii @ 2024-12-31 13:27 UTC (permalink / raw)
  To: Pip Cet; +Cc: eller.helmut, gerd.moellmann, spd, emacs-devel

> Date: Tue, 31 Dec 2024 10:09:25 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: Gerd Möllmann <gerd.moellmann@gmail.com>, Eli Zaretskii <eliz@gnu.org>, spd@toadstyle.org, emacs-devel@gnu.org
> 
> "Helmut Eller" <eller.helmut@gmail.com> writes:
> 
> > I wonder if the backtrace that we see in the signal handler is any
> > different from the backtrace that we would see at the next safe point
> > (i.e. the next time maybe_quit is called).
> 
> If we keep a shadow signal mask, the only requirement for a safe point
> is that we made some progress OR the lock was released.  But the
> backtrace will change if we wait for the next maybe_quit, IIUC.
> 
> maybe_quit is not a great safe point, it's just the best we have.  It's
> insufficient if Emacs becomes idle, and how often we call rarely_quit is
> quite unpredictable.

What about doing that from process_pending_signals?



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-31  9:51                                           ` Helmut Eller
  2024-12-31 10:00                                             ` Gerd Möllmann
@ 2024-12-31 13:49                                             ` Pip Cet via Emacs development discussions.
  2024-12-31 14:13                                               ` Eli Zaretskii
  1 sibling, 1 reply; 119+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-31 13:49 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Gerd Möllmann, Eli Zaretskii, spd, emacs-devel

"Helmut Eller" <eller.helmut@gmail.com> writes:

> On Tue, Dec 31 2024, Gerd Möllmann wrote:
>
>> Helmut Eller <eller.helmut@gmail.com> writes:
> [...]
>>> Except the POSIX police: it says that pthread_mutex_trylock isn't async
>>> signal safe.  I suppose this also makes it unsafe to use MPS's fault
>>> handler in an async signal handler.  Bummer.  (Does the police take
>>> bribes?)
> [...]
>> So we have this picture, I think
>>
>>               t1           t2                    t3
>>   ------------|------------|---------------------|-----------------> t
>>    signal        pthread      other stuff          signal handler
>>    handler       trylock      until return to      branching
>>    calling                    signal handler       on result of busy
>>    mps_arena_
>>    busy
>>
>> We have a window [t1, t2] where the nested signals lead to undefined
>> behavior, and [t2, t3] where threads and nested signals can come into
>> play, but that's not a problem, iff signal handlers don't leave a lock
>> behind them.
>>
>> Hm. Have you perhaps looked at a pthread implementation, what such a
>> mutex actually is on Linux?
>
> Judging from the source here
>
> https://sourceware.org/git/?p=glibc.git;a=blob_plain;f=sysdeps/nptl/bits/struct_mutex.h;hb=HEAD
>
> and here
>
> https://sourceware.org/git/?p=glibc.git;a=blob_plain;f=nptl/pthread_mutex_trylock.c;hb=HEAD
>
> I would say that the mutex is a struct with multiple fields and that
> pthread_mutex_trylock is neither a syscall nor an atomic instruction.

The PTHREAD_MUTEX_ERRORCHECK_NP case looks fine to me, assuming signal
handlers balance successful trylock calls with unlocks in the same
handler.  It uses __data.__lock to serialize access to the plain old C
data in the mutex struct, and doesn't touch anything if it can't acquire
the lock atomically (that's what lll does).

PTHREAD_MUTEX_RECURSIVE_NP, not so much, so don't use recursive mutexes
:-)

> The struct may simply be in an inconsistent state at the time t0, the
> beginning of the SIGPROF handler.

It could certainly be made to fail more loudly in the case of acquiring
a recursive mutex from a signal handler.  Someone might try to do that
to build a non-recursive mutex based on recursive mutexes, but that's
the only use case I can see.

Alternatively, let's all switch to FreeDOS!  No memory barriers, no
signals, pdumper works now, and MPS just survived its first GC cycle.

I said I wasn't going to port MPS to DOS, but I didn't have to; the
ANSI C environment works fine, eagerly tripping all memory barriers
rather than installing them.

The real point of that experiment was to test whether the ANSI C
environment might be a useful fire exit in case there are unfixable
protection problems on some platforms.

(It's true that most DOS "extenders" appear to use the MMU, and that
might make it sound like mprotect() could simply involve walking the
(unprotected) page tables; but my understanding is it's 4MB pages, which
is inconvenient for us).

Pip




^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-31 13:49                                             ` Pip Cet via Emacs development discussions.
@ 2024-12-31 14:13                                               ` Eli Zaretskii
  0 siblings, 0 replies; 119+ messages in thread
From: Eli Zaretskii @ 2024-12-31 14:13 UTC (permalink / raw)
  To: Pip Cet; +Cc: eller.helmut, gerd.moellmann, spd, emacs-devel

> Date: Tue, 31 Dec 2024 13:49:05 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: Gerd Möllmann <gerd.moellmann@gmail.com>, Eli Zaretskii <eliz@gnu.org>, spd@toadstyle.org, emacs-devel@gnu.org
> 
> Alternatively, let's all switch to FreeDOS!  No memory barriers, no
> signals

That's not true: DJGPP (which is the only way to compile Emacs on DOS)
supports signals.  SIGSEGV certainly, but also SIGPROF and SIGALRM,
AFAIR.

> (It's true that most DOS "extenders" appear to use the MMU, and that
> might make it sound like mprotect() could simply involve walking the
> (unprotected) page tables; but my understanding is it's 4MB pages, which
> is inconvenient for us).

The only relevant DOS extenders for Emacs are DPMI hosts, cwsdpmi
being the first and the most obvious candidate.  cwsdpmi supports
mprotect.

But we digress.



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-31 13:18                                           ` Eli Zaretskii
@ 2024-12-31 14:15                                             ` Gerd Möllmann
  2024-12-31 14:27                                               ` Eli Zaretskii
  2024-12-31 15:12                                               ` Helmut Eller
  0 siblings, 2 replies; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-31 14:15 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eller.helmut, pipcet, spd, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Gerd Möllmann <gerd.moellmann@gmail.com>
>> Cc: Eli Zaretskii <eliz@gnu.org>,  pipcet@protonmail.com,
>>   spd@toadstyle.org,  emacs-devel@gnu.org
>> Date: Tue, 31 Dec 2024 10:19:15 +0100
>> 
>> Helmut Eller <eller.helmut@gmail.com> writes:
>> 
>> > Except the POSIX police: it says that pthread_mutex_trylock isn't async
>> > signal safe.  I suppose this also makes it unsafe to use MPS's fault
>> > handler in an async signal handler.  Bummer.  (Does the police take
>> > bribes?)
>> 
>> Thanks. I guess it shows that I couldn't keep up with my mail, sorry for
>> that.
>> 
>> So we have this picture, I think
>> 
>>               t1           t2                    t3
>>   ------------|------------|---------------------|-----------------> t
>>    signal        pthread      other stuff          signal handler
>>    handler       trylock      until return to      branching
>>    calling                    signal handler       on result of busy
>>    mps_arena_
>>    busy
>> 
>> We have a window [t1, t2] where the nested signals lead to undefined
>> behavior, and [t2, t3] where threads and nested signals can come into
>> play, but that's not a problem, iff signal handlers don't leave a lock
>> behind them.
>
> If the problem is other signals in [t1, t2], we could install the
> signal handler in a way that masks all other signals while the handler
> runs.

That would be necessary, but there's another thing Helmut pointed out.
At t0, when we enter the SIGPROF handler, we may have interrupted
pthread code in the Emacs thread, so pthread may currently be in an
inconsistent state.


I'd like to instead revive the idea of getting the backtrace in the
signal handler and doing everything else elsewhere. The alternatives I've
seen so far are, for my taste, too difficult in the end.

We have established that calling get_backtrace is safe since it doesn't
access memory in our AMC pool, which might have a barrier. The counter
argument was that one would have to know too much about what is safe to
access and what is not, and that would be unmaintainable.

What one has to know in get_backtrace is

- struct thread_state is safe to access: it is a Lisp object, but it is
  not in the AMC pool; it is in another pool that does not use barriers.
  One could "hide" that knowledge by putting get_backtrace into igc.c.
  We only need the binding stack members (specpdl*) from current_thread.
  That could be another function.
  
- Accessing any other Lisp objects is taboo. That includes any memory of
  the object, in particular it includes their headers, i.e. type checks
  for PVEC types. One could require no type checks.
  
- Copying Lisp_Object and such is okay because that does not access
  the memory of the referenced object.

Maybe, after reading igc.org, that is acceptable maintenance-wise?
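
To make the rules above concrete, here is a rough sketch of the kind of copying I have in mind. It is not the real get_backtrace: word_t stands in for Lisp_Object, the flat array stands in for the binding stack, and copy_backtrace and struct sample are names invented for this example. The only thing it is meant to show is that the handler copies words into preallocated storage and never follows them.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for Lisp_Object: an opaque machine word that is copied
   but never dereferenced here.  */
typedef uintptr_t word_t;

enum { MAX_DEPTH = 16 };

/* Preallocated sample buffer; nothing is allocated in the handler.  */
struct sample
{
  size_t depth;
  word_t frames[MAX_DEPTH];
};

/* Copy at most MAX_DEPTH words from STACK[0..N) into OUT, innermost
   (last pushed) entry first.  The words are treated as plain values:
   no type checks, no header access, no pointer chasing.  */
static void
copy_backtrace (const word_t *stack, size_t n, struct sample *out)
{
  size_t depth = n < MAX_DEPTH ? n : MAX_DEPTH;
  for (size_t i = 0; i < depth; i++)
    out->frames[i] = stack[n - 1 - i];
  out->depth = depth;
}
```

Everything that needs to look inside the objects (hashing, naming functions, and so on) would then happen later, outside the handler, on the copied sample.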



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-31 13:14                                         ` Eli Zaretskii
@ 2024-12-31 14:19                                           ` Pip Cet via Emacs development discussions.
  2024-12-31 14:31                                             ` Eli Zaretskii
  2024-12-31 14:40                                           ` Helmut Eller
  1 sibling, 1 reply; 119+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-31 14:19 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Helmut Eller, gerd.moellmann, spd, emacs-devel

"Eli Zaretskii" <eliz@gnu.org> writes:

>> From: Helmut Eller <eller.helmut@gmail.com>
>> Cc: Eli Zaretskii <eliz@gnu.org>,  pipcet@protonmail.com,
>>   spd@toadstyle.org,  emacs-devel@gnu.org
>> Date: Tue, 31 Dec 2024 08:34:42 +0100
>>
>> On Mon, Dec 30 2024, Gerd Möllmann wrote:
>>
>> > Eli Zaretskii <eliz@gnu.org> writes:
>> >
>> >>> From: Gerd Möllmann <gerd.moellmann@gmail.com>
>> >>> Cc: Helmut Eller <eller.helmut@gmail.com>,  pipcet@protonmail.com,
>> >>>   spd@toadstyle.org,  emacs-devel@gnu.org
>> >>> Date: Mon, 30 Dec 2024 19:37:38 +0100
>> >>>
>> >>> So, to summarize, everyone agrees with Helmut?
>>
>> Except the POSIX police: it says that pthread_mutex_trylock isn't async
>> signal safe.  I suppose this also makes it unsafe to use MPS's fault
>> handler in an async signal handler.  Bummer.  (Does the police take
>> bribes?)
>
> Doesn't MPS itself call it from a SIGSEGV handler?

Worse, it calls pthread_mutex_lock, IIUC.  This isn't POSIX but works
(on GNU/Linux, it works as long as you don't make the mutex recursive or
rely on deadlock detection.  IIRC MPS uses normal or error-checking
mutexes only and only uses deadlock detection to abort).

>> I wonder if the backtrace that we see in the signal handler is any
>> different from the backtrace that we would see at the next safe point
>> (i.e. the next time maybe_quit is called).
>
> I think we cannot rely on that, because maybe_quit must be called by
> hand, it isn't magic.  We call it from various places in the
> interpreter, which could well be in some other place of a Lisp
> program.
>
> Once again, why not ask the MPS folks to give us a callback?  Or maybe
> we could try hacking MPS ourselves first, to see if that does the job,
> and ask them then?

I'm not sure what your suggestion is at this point, sorry.  You said in
another email not too long ago that you wanted to improve the current
code by special-casing "simple" signals; I agreed; now you appear to be
saying we should wait for an MPS modification.

Pip




^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-31 14:15                                             ` Gerd Möllmann
@ 2024-12-31 14:27                                               ` Eli Zaretskii
  2024-12-31 15:05                                                 ` Gerd Möllmann
  2024-12-31 15:12                                               ` Helmut Eller
  1 sibling, 1 reply; 119+ messages in thread
From: Eli Zaretskii @ 2024-12-31 14:27 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: eller.helmut, pipcet, spd, emacs-devel

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: eller.helmut@gmail.com,  pipcet@protonmail.com,  spd@toadstyle.org,
>   emacs-devel@gnu.org
> Date: Tue, 31 Dec 2024 15:15:04 +0100
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > If the problem is other signals in [t1, t2], we could install the
> > signal handler in a way that masks all other signals while the handler
> > runs.
> 
> That would be necessary, but there's another thing Helmut pointed out.
> At t0, when we enter the SIGPROF handler, we may have interrupted
> pthread code in the Emacs thread, so pthread may currently be in an
> inconsistent state.

If that really can happen, then pthreads is more fragile than I hoped.
I hoped they don't let signals interrupt them when they are in
critical sections like that.  Are we sure this danger is real?

> I'd like to instead revive the idea of getting the backtrace in the
> signal handler and doing anything else elsewhere.

If it works, sure.  But I thought copying the data had the same
problems as accessing it from the handler?

> Maybe, after reading igc.org, that is acceptable maintenance-wise?

Can you show a patch?



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-31 13:27                                           ` Eli Zaretskii
@ 2024-12-31 14:29                                             ` Pip Cet via Emacs development discussions.
  2024-12-31 14:34                                               ` Eli Zaretskii
  2024-12-31 15:07                                               ` Gerd Möllmann
  0 siblings, 2 replies; 119+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-31 14:29 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eller.helmut, gerd.moellmann, spd, emacs-devel

"Eli Zaretskii" <eliz@gnu.org> writes:

>> Date: Tue, 31 Dec 2024 10:09:25 +0000
>> From: Pip Cet <pipcet@protonmail.com>
>> Cc: Gerd Möllmann <gerd.moellmann@gmail.com>, Eli Zaretskii <eliz@gnu.org>, spd@toadstyle.org, emacs-devel@gnu.org
>>
>> "Helmut Eller" <eller.helmut@gmail.com> writes:
>>
>> > I wonder if the backtrace that we see in the signal handler is any
>> > different from the backtrace that we would see at the next safe point
>> > (i.e. the next time maybe_quit is called).
>>
>> If we keep a shadow signal mask, the only requirement for a safe point
>> is that we made some progress OR the lock was released.  But the
>> backtrace will change if we wait for the next maybe_quit, IIUC.
>>
>> maybe_quit is not a great safe point, it's just the best we have.  It's
>> insufficient if Emacs becomes idle, and how often we call rarely_quit
>> is quite unpredictable.
>
> What about doing that from process_pending_signals?

Yes.  The rest of this email is a half-hearted defense of why I didn't
do that right away.

We certainly want to call it from unblock_to if the count reaches (I
think that's what you meant?), but I wasn't convinced we wouldn't need a
shadow signal mask for that.

Merging the pending_signals flag in keyboard.c and the one in igc.c (if
that's what you meant) sounds like a good idea, too, but needs some more
thought: if we handle some signals while input is blocked, but not
others, what should pending_signals be?

Anyway, I'm perfectly willing to give up on the "no shadow mask"
assumption.  Maybe it'll help us catch some signal bugs on weird
platforms if we use it when --enable-checking --with-mps=no.

Pip




^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-31 14:19                                           ` Pip Cet via Emacs development discussions.
@ 2024-12-31 14:31                                             ` Eli Zaretskii
  0 siblings, 0 replies; 119+ messages in thread
From: Eli Zaretskii @ 2024-12-31 14:31 UTC (permalink / raw)
  To: Pip Cet; +Cc: eller.helmut, gerd.moellmann, spd, emacs-devel

> Date: Tue, 31 Dec 2024 14:19:44 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: Helmut Eller <eller.helmut@gmail.com>, gerd.moellmann@gmail.com, spd@toadstyle.org, emacs-devel@gnu.org
> 
> > Once again, why not ask the MPS folks to give us a callback?  Or maybe
> > we could try hacking MPS ourselves first, to see if that does the job,
> > and ask them then?
> 
> I'm not sure what your suggestion is at this point, sorry.  You said in
> another email not too long ago that you wanted to improve the current
> code by special-casing "simple" signals; I agreed; now you appear to be
> saying we should wait for an MPS modification.

There are several possible alternative ways of solving these issues.
Moreover, it could be that we prefer different solutions for different
signals.  A callback, if MPS would let us have it, could allow
delaying the signal processing (or most of it) till we are safe to
touch data.



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-31 14:29                                             ` Pip Cet via Emacs development discussions.
@ 2024-12-31 14:34                                               ` Eli Zaretskii
  2024-12-31 15:08                                                 ` Gerd Möllmann
  2025-01-03 18:37                                                 ` Helmut Eller
  2024-12-31 15:07                                               ` Gerd Möllmann
  1 sibling, 2 replies; 119+ messages in thread
From: Eli Zaretskii @ 2024-12-31 14:34 UTC (permalink / raw)
  To: Pip Cet; +Cc: eller.helmut, gerd.moellmann, spd, emacs-devel

> Date: Tue, 31 Dec 2024 14:29:18 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: eller.helmut@gmail.com, gerd.moellmann@gmail.com, spd@toadstyle.org, emacs-devel@gnu.org
> 
> "Eli Zaretskii" <eliz@gnu.org> writes:
> 
> >> maybe_quit is not a great safe point, it's just the best we have.  It's
> >> insufficient if Emacs becomes idle, and how often we call rarely_quit
> >> is quite unpredictable.
> >
> > What about doing that from process_pending_signals?
> 
> Yes.  The rest of this email is a half-hearted defense of why I didn't
> do that right away.
> 
> We certainly want to call it from unblock_to if the count reaches (I
> think that's what you meant?), but I wasn't convinced we wouldn't need a
> shadow signal mask for that.
> 
> Merging the pending_signals flag in keyboard.c and the one in igc.c (if
> that's what you meant) sounds like a good idea, too, but needs some more
> thought: if we handle some signals while input is blocked, but not
> others, what should pending_signals be?

We'd need to add a new function to process_pending_signals, which
would process SIGPROF and maybe also SIGALRM.  The signal handlers for
those would then only set a flag (not pending_signals, some other
flag).




^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-31 13:14                                         ` Eli Zaretskii
  2024-12-31 14:19                                           ` Pip Cet via Emacs development discussions.
@ 2024-12-31 14:40                                           ` Helmut Eller
  2024-12-31 14:55                                             ` Gerd Möllmann
  1 sibling, 1 reply; 119+ messages in thread
From: Helmut Eller @ 2024-12-31 14:40 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, pipcet, spd, emacs-devel

On Tue, Dec 31 2024, Eli Zaretskii wrote:

>> Except the POSIX police: it says that pthread_mutex_trylock isn't async
>> signal safe.  I suppose this also makes it unsafe to use MPS's fault
>> handler in an async signal handler.  Bummer.  (Does the police take
>> bribes?)
>
> Doesn't MPS itself call it from a SIGSEGV handler?

Yes.  But I'm not sure that SIGSEGV is considered an async signal.  I
suppose there is such a thing as synchronous signals.

>> I wonder if the backtrace that we see in the signal handler is any
>> different from the backtrace that we would see at the next safe point
>> (i.e. the next time maybe_quit is called).
>
> I think we cannot rely on that, because maybe_quit must be called by
> hand, it isn't magic.  We call it from various places in the
> interpreter, which could well be in some other place of a Lisp
> program.

Maybe somebody could do an experiment: record both backtraces for a
while and see if they differ.

> Once again, why not ask the MPS folks to give us a callback?  Or maybe
> we could try hacking MPS ourselves first, to see if that does the job,
> and ask them then?

Nobody stops you from doing this. :-)

Helmut



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-31 14:40                                           ` Helmut Eller
@ 2024-12-31 14:55                                             ` Gerd Möllmann
  2024-12-31 15:07                                               ` Eli Zaretskii
  0 siblings, 1 reply; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-31 14:55 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, pipcet, spd, emacs-devel

Helmut Eller <eller.helmut@gmail.com> writes:

>> Once again, why not ask the MPS folks to give us a callback?  Or maybe
>> we could try hacking MPS ourselves first, to see if that does the job,
>> and ask them then?
>
> Nobody stops you from doing this. :-)

+1



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-31 14:27                                               ` Eli Zaretskii
@ 2024-12-31 15:05                                                 ` Gerd Möllmann
  2024-12-31 15:14                                                   ` Eli Zaretskii
  0 siblings, 1 reply; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-31 15:05 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eller.helmut, pipcet, spd, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Gerd Möllmann <gerd.moellmann@gmail.com>
>> Cc: eller.helmut@gmail.com,  pipcet@protonmail.com,  spd@toadstyle.org,
>>   emacs-devel@gnu.org
>> Date: Tue, 31 Dec 2024 15:15:04 +0100
>> 
>> Eli Zaretskii <eliz@gnu.org> writes:
>> 
>> > If the problem is other signals in [t1, t2], we could install the
>> > signal handler in a way that masks all other signals while the handler
>> > runs.
>> 
>> That would be necessary, but there's another thing Helmut pointed out.
>> At t0, when we enter the SIGPROF handler, we may have interrupted
>> pthread code in the Emacs thread, so pthread may currently be in an
>> inconsistent state.
>
> If that really can happen, then pthreads is more fragile than I hoped.
> I hoped they don't let signals interrupt them when they are in
> critical sections like that.  Are we sure this danger is real?

I'm not a pthread expert. I don't know.

>> I'd like to instead revive the idea of getting the backtrace in the
>> signal handler and doing anything else elsewhere.
>
> If it works, sure.  But I thought copying the data had the same
> problems as accessing it from the handler?

Copying Lisp_Object around is not a problem, AFAIK (only saying that
because I'm getting cautious).

>> Maybe, after reading igc.org, that is acceptable maintenance-wise?
>
> Can you show a patch?

We haven't talked about what to do with the sample after get_backtrace yet.
And I'm not sure if you accept the approach now or not, TBH. If that's
not the case, we can stop here and save time.



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-31 14:29                                             ` Pip Cet via Emacs development discussions.
  2024-12-31 14:34                                               ` Eli Zaretskii
@ 2024-12-31 15:07                                               ` Gerd Möllmann
  1 sibling, 0 replies; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-31 15:07 UTC (permalink / raw)
  To: Pip Cet; +Cc: Eli Zaretskii, eller.helmut, spd, emacs-devel

Pip Cet <pipcet@protonmail.com> writes:

> "Eli Zaretskii" <eliz@gnu.org> writes:
>
>>> Date: Tue, 31 Dec 2024 10:09:25 +0000
>>> From: Pip Cet <pipcet@protonmail.com>
>>> Cc: Gerd Möllmann <gerd.moellmann@gmail.com>, Eli Zaretskii
>>> <eliz@gnu.org>, spd@toadstyle.org, emacs-devel@gnu.org
>>>
>>> "Helmut Eller" <eller.helmut@gmail.com> writes:
>>>
>>> > I wonder if the backtrace that we see in the signal handler is any
>>> > different from the backtrace that we would see at the next safe point
>>> > (i.e. the next time maybe_quit is called).
>>>
>>> If we keep a shadow signal mask, the only requirement for a safe point
>>> is that we made some progress OR the lock was released.  But the
>>> backtrace will change if we wait for the next maybe_quit, IIUC.
>>>
>>> maybe_quit is not a great safe point, it's just the best we have.  It's
>>> insufficient if Emacs becomes idle, and how often we call rarely_quit
>>> is quite unpredictable.
>>
>> What about doing that from process_pending_signals?
>
> Yes.  

I'm all for it.




* Re: igc, macOS avoiding signals
  2024-12-31 14:55                                             ` Gerd Möllmann
@ 2024-12-31 15:07                                               ` Eli Zaretskii
  2024-12-31 15:13                                                 ` Gerd Möllmann
  0 siblings, 1 reply; 119+ messages in thread
From: Eli Zaretskii @ 2024-12-31 15:07 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: eller.helmut, pipcet, spd, emacs-devel

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: Eli Zaretskii <eliz@gnu.org>,  pipcet@protonmail.com,
>   spd@toadstyle.org,  emacs-devel@gnu.org
> Date: Tue, 31 Dec 2024 15:55:56 +0100
> 
> Helmut Eller <eller.helmut@gmail.com> writes:
> 
> >> Once again, why not ask the MPS folks to give us a callback?  Or maybe
> >> we could try hacking MPS ourselves first, to see if that does the job,
> >> and ask them then?
> >
> > Nobody stops you from doing this. :-)
> 
> +1

Mercy: I have a lot of other Emacs-related stuff on my plate, as you
are well aware.  Just keeping up with this discussion is already hard
for me.  Please help me by reaching out to the MPS folks about this
issue.




* Re: igc, macOS avoiding signals
  2024-12-31 14:34                                               ` Eli Zaretskii
@ 2024-12-31 15:08                                                 ` Gerd Möllmann
  2025-01-03 18:37                                                 ` Helmut Eller
  1 sibling, 0 replies; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-31 15:08 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Pip Cet, eller.helmut, spd, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> Date: Tue, 31 Dec 2024 14:29:18 +0000
>> From: Pip Cet <pipcet@protonmail.com>
>> Cc: eller.helmut@gmail.com, gerd.moellmann@gmail.com, spd@toadstyle.org, emacs-devel@gnu.org
>> 
>> "Eli Zaretskii" <eliz@gnu.org> writes:
>> 
>> >> maybe_quit is not a great safe point, it's just the best we have.  It's
>> >> insufficient if Emacs becomes idle, and how often we call rarely_quit
>> >> is quite unpredictable.
>> >
>> > What about doing that from process_pending_signals?
>> 
>> Yes.  The rest of this email is a half-hearted defense of why I didn't
>> do that right away.
>> 
>> We certainly want to call it from unblock_to if the count reaches (I
>> think that's what you meant?), but I wasn't convinced we wouldn't need a
>> shadow signal mask for that.
>> 
>> Merging the pending_signals flag in keyboard.c and the one in igc.c (if
>> that's what you meant) sounds like a good idea, too, but needs some more
>> thought: if we handle some signals while input is blocked, but not
>> others, what should pending_signals be?
>
> We'd need to add a new function to process_pending_signals, which
> would process SIGPROF and maybe also SIGALRM.  The signal handlers for
> those would then only set a flag (not pending_signals, some other
> flag).

Perfect. Please put my other mail about get_backtrace on hold.
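
A minimal sketch of the flag-only handlers described in the quoted
text, assuming nothing about the real Emacs code: the names
sigprof_pending, sigalrm_pending and process_deferred_signals are
hypothetical, and the counters merely stand in for the real work
(taking a profiler sample, running timers).  The handlers do nothing
but set a flag; the work happens later, at a safe point.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flags, one per deferred signal.  These are not the
   names used in keyboard.c or igc.c.  */
static volatile sig_atomic_t sigprof_pending;
static volatile sig_atomic_t sigalrm_pending;

/* Counters standing in for the real deferred work.  */
static int profiler_samples_taken;
static int alarms_handled;

/* Async-signal-safe handlers: only record that the signal arrived.  */
static void
deferred_sigprof_handler (int sig)
{
  (void) sig;
  sigprof_pending = 1;
}

static void
deferred_sigalrm_handler (int sig)
{
  (void) sig;
  sigalrm_pending = 1;
}

/* Install both handlers.  Return 0 on success, -1 on failure.  */
static int
install_deferred_handlers (void)
{
  struct sigaction sa;
  memset (&sa, 0, sizeof sa);
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = SA_RESTART;

  sa.sa_handler = deferred_sigprof_handler;
  if (sigaction (SIGPROF, &sa, NULL) != 0)
    return -1;

  sa.sa_handler = deferred_sigalrm_handler;
  return sigaction (SIGALRM, &sa, NULL);
}

/* Meant to be called from a safe point (for instance, from whatever
   processes pending signals).  Each flag is cleared before the work
   is done, so a signal arriving during the work is not lost.  */
static void
process_deferred_signals (void)
{
  if (sigprof_pending)
    {
      sigprof_pending = 0;
      profiler_samples_taken++;
    }
  if (sigalrm_pending)
    {
      sigalrm_pending = 0;
      alarms_handled++;
    }
}
```

Note that several signals arriving between two safe points collapse
into a single flag, so such a scheme would record at most one sample
per safe point.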




* Re: igc, macOS avoiding signals
  2024-12-31 14:15                                             ` Gerd Möllmann
  2024-12-31 14:27                                               ` Eli Zaretskii
@ 2024-12-31 15:12                                               ` Helmut Eller
  2024-12-31 15:31                                                 ` Gerd Möllmann
  1 sibling, 1 reply; 119+ messages in thread
From: Helmut Eller @ 2024-12-31 15:12 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, pipcet, spd, emacs-devel

On Tue, Dec 31 2024, Gerd Möllmann wrote:
[...]
> We have established that calling get_backtrace is safe since it doesn't
> access memory in our AMC pool, which might have a barrier. Counter
> argument was that one would have to know too much about what is safe to
> access and what cannot, and that would be unmaintainable.

I thought the problem with this was that writing GC roots in SIGPROF
is not safe: if we have interrupted MPS, it may have partly scanned roots
or something like that.  Is that not a problem?

Helmut




* Re: igc, macOS avoiding signals
  2024-12-31 15:07                                               ` Eli Zaretskii
@ 2024-12-31 15:13                                                 ` Gerd Möllmann
  2024-12-31 15:16                                                   ` Helmut Eller
  2025-01-02  8:37                                                   ` Stefan Kangas
  0 siblings, 2 replies; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-31 15:13 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eller.helmut, pipcet, spd, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Gerd Möllmann <gerd.moellmann@gmail.com>
>> Cc: Eli Zaretskii <eliz@gnu.org>,  pipcet@protonmail.com,
>>   spd@toadstyle.org,  emacs-devel@gnu.org
>> Date: Tue, 31 Dec 2024 15:55:56 +0100
>> 
>> Helmut Eller <eller.helmut@gmail.com> writes:
>> 
>> >> Once again, why not ask the MPS folks to give us a callback?  Or maybe
>> >> we could try hacking MPS ourselves first, to see if that does the job,
>> >> and ask them then?
>> >
>> > Nobody stops you from doing this. :-)
>> 
>> +1
>
> Mercy: I have a lot of other Emacs-related stuff on my plate, as you
> are well aware.  Just keeping up with this discussion is already hard
> for me.  Please help me by reaching out to the MPS folks about this
> issue.

I think it should be something "official". Maybe Stefan Kangas could
contact them, or Richard.

Trying to contact Richard Brooksby here on emacs-devel apparently didn't
work when I did it.




* Re: igc, macOS avoiding signals
  2024-12-31 15:05                                                 ` Gerd Möllmann
@ 2024-12-31 15:14                                                   ` Eli Zaretskii
  2024-12-31 15:20                                                     ` Gerd Möllmann
  0 siblings, 1 reply; 119+ messages in thread
From: Eli Zaretskii @ 2024-12-31 15:14 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: eller.helmut, pipcet, spd, emacs-devel

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: eller.helmut@gmail.com,  pipcet@protonmail.com,  spd@toadstyle.org,
>   emacs-devel@gnu.org
> Date: Tue, 31 Dec 2024 16:05:09 +0100
> 
> >> Maybe, after reading igc.org, that is acceptable maintenance-wise?
> >
> > Can you show a patch?
> 
> We haven't talked about what to do with the sample after get_backtrace yet.

OK, let's talk about that.  What did you have in mind?

> And I'm not sure if you accept the approach now or not, TBH. If that's
> not the case, we can stop here and save time.

I don't think I have a clear idea what this approach would be.  I
thought showing a patch was a good way of getting us on the same page.
But if that needs a lot of work, then maybe describe the idea in some
detail first?




* Re: igc, macOS avoiding signals
  2024-12-31 15:13                                                 ` Gerd Möllmann
@ 2024-12-31 15:16                                                   ` Helmut Eller
  2025-01-02  8:37                                                   ` Stefan Kangas
  1 sibling, 0 replies; 119+ messages in thread
From: Helmut Eller @ 2024-12-31 15:16 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, pipcet, spd, emacs-devel

On Tue, Dec 31 2024, Gerd Möllmann wrote:

>> Mercy: I have a lot of other Emacs-related stuff on my plate, as you
>> are well aware.  Just keeping up with this discussion is already hard
>> for me.  Please help me by reaching out to the MPS folks about this
>> issue.
>
> I think it should be something "official". Maybe Stefan Kangas could
> contact them, or Richard.

I think the MPS mailing list might be more appropriate for asking
potentially stupid questions:

https://mailman.ravenbrook.com/mailman/listinfo/mps-discussion

Helmut




* Re: igc, macOS avoiding signals
  2024-12-31 15:14                                                   ` Eli Zaretskii
@ 2024-12-31 15:20                                                     ` Gerd Möllmann
  0 siblings, 0 replies; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-31 15:20 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eller.helmut, pipcet, spd, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Gerd Möllmann <gerd.moellmann@gmail.com>
>> Cc: eller.helmut@gmail.com,  pipcet@protonmail.com,  spd@toadstyle.org,
>>   emacs-devel@gnu.org
>> Date: Tue, 31 Dec 2024 16:05:09 +0100
>> 
>> >> Maybe, after reading igc.org, that is acceptable maintenance-wise?
>> >
>> > Can you show a patch?
>> 
>> We haven't talked about what to do with the sample after get_backtrace yet.
>
> OK, let's talk about that.  What did you have in mind?
>
>> And I'm not sure if you accept the approach now or not, TBH. If that's
>> not the case, we can stop here and save time.
>
> I don't think I have a clear idea what this approach would be.  I
> thought showing a patch was a good way of getting us on the same page.
> But if that needs a lot of work, then maybe describe the idea in some
> detail first?

Will do if the process_pending_signals approach turns out not to work.
If that works, it would be even better, I think.




* Re: igc, macOS avoiding signals
  2024-12-31 15:12                                               ` Helmut Eller
@ 2024-12-31 15:31                                                 ` Gerd Möllmann
  2024-12-31 15:37                                                   ` Helmut Eller
  0 siblings, 1 reply; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-31 15:31 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, pipcet, spd, emacs-devel

Helmut Eller <eller.helmut@gmail.com> writes:

> On Tue, Dec 31 2024, Gerd Möllmann wrote:
> [...]
>> We have established that calling get_backtrace is safe since it doesn't
>> access memory in our AMC pool, which might have a barrier. Counter
>> argument was that one would have to know too much about what is safe to
>> access and what cannot, and that would be unmaintainable.
>
> I thought the problem with this was that writing GC roots in SIGPROF
> is not safe: if we have interrupted MPS, it may have partly scanned roots
> or something like that.  Is that not a problem?

I think you meant reading roots. We're reading the binding stack which
is a root.

From my POV, that has the same problem that we currently have already.
The binding stack may be inconsistent because SIGPROF hits at the wrong
place, with a small probability.

With MPS, it might be a problem that the root is currently exact IIRC,
but it could be made ambiguous, if MPS is fixing addresses, in which
case I think we'd be safe.

I'm too lazy to look at the code, ATM, because I find the
process_pending_signals approach more attractive.




* Re: igc, macOS avoiding signals
  2024-12-31 15:31                                                 ` Gerd Möllmann
@ 2024-12-31 15:37                                                   ` Helmut Eller
  2024-12-31 15:39                                                     ` Gerd Möllmann
  0 siblings, 1 reply; 119+ messages in thread
From: Helmut Eller @ 2024-12-31 15:37 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, pipcet, spd, emacs-devel

On Tue, Dec 31 2024, Gerd Möllmann wrote:

>> I thought the problem with this was that writing GC roots in SIGPROF
>> is not safe: if we have interrupted MPS, it may have partly scanned roots
>> or something like that.  Is that not a problem?
>
> I think you meant reading roots. We're reading the binding stack which
> is a root.

Actually I meant writing: we need to store the backtrace somewhere.  And
doing that behind the back of MPS may not be safe.

But let's stop here.  These lengthy email discussions are terribly
unproductive.

Helmut




* Re: igc, macOS avoiding signals
  2024-12-31 15:37                                                   ` Helmut Eller
@ 2024-12-31 15:39                                                     ` Gerd Möllmann
  0 siblings, 0 replies; 119+ messages in thread
From: Gerd Möllmann @ 2024-12-31 15:39 UTC (permalink / raw)
  To: Helmut Eller; +Cc: Eli Zaretskii, pipcet, spd, emacs-devel

Helmut Eller <eller.helmut@gmail.com> writes:

> But let's stop here.  These lengthy email discussions are terribly
> unproductive.

+2




* Re: igc, macOS avoiding signals
  2024-12-31 15:13                                                 ` Gerd Möllmann
  2024-12-31 15:16                                                   ` Helmut Eller
@ 2025-01-02  8:37                                                   ` Stefan Kangas
  2025-01-02  9:05                                                     ` Eli Zaretskii
  1 sibling, 1 reply; 119+ messages in thread
From: Stefan Kangas @ 2025-01-02  8:37 UTC (permalink / raw)
  To: Gerd Möllmann, Eli Zaretskii; +Cc: eller.helmut, pipcet, emacs-devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Eli Zaretskii <eliz@gnu.org> writes:
>
>>> From: Gerd Möllmann <gerd.moellmann@gmail.com>
>>> Cc: Eli Zaretskii <eliz@gnu.org>,  pipcet@protonmail.com,
>>>   spd@toadstyle.org,  emacs-devel@gnu.org
>>> Date: Tue, 31 Dec 2024 15:55:56 +0100
>>>
>>> Helmut Eller <eller.helmut@gmail.com> writes:
>>>
>>> >> Once again, why not ask the MPS folks to give us a callback?  Or maybe
>>> >> we could try hacking MPS ourselves first, to see if that does the job,
>>> >> and ask them then?
>>> >
>>> > Nobody stops you from doing this. :-)
>>>
>>> +1
>>
>> Mercy: I have a lot of other Emacs-related stuff on my plate, as you
>> are well aware.  Just keeping up with this discussion is already hard
>> for me.  Please help me by reaching out to the MPS folks about this
>> issue.
>
> I think it should be something "official". Maybe Stefan Kangas could
> contact them, or Richard.

I'm happy to reach out to them in official capacity, but I'm not really
close enough to the code to be able to usefully discuss the issue with
them.  So I think it might be best to put some or all of you in Cc.

Before we do anything though, are we sure that it is faster to ask them
to do things for us, instead of, say, just sending a patch?  I'm not
sure how confident people are with hacking MPS, but if we are still
seriously entertaining the idea of a fork then maybe we should be, to
some extent.

If we do decide to contact them, I'm afraid that I don't have sufficient
context to accurately describe the proposed callback.  I would need to
ask someone to summarize the idea in sufficient detail so that we can
start a conversation.

Please let me know what you think is best here, and let's take it from
there.  Thanks.




* Re: igc, macOS avoiding signals
  2025-01-02  8:37                                                   ` Stefan Kangas
@ 2025-01-02  9:05                                                     ` Eli Zaretskii
  2025-01-02 10:00                                                       ` Helmut Eller
  2025-01-02 12:34                                                       ` Pip Cet via Emacs development discussions.
  0 siblings, 2 replies; 119+ messages in thread
From: Eli Zaretskii @ 2025-01-02  9:05 UTC (permalink / raw)
  To: Stefan Kangas; +Cc: gerd.moellmann, eller.helmut, pipcet, emacs-devel

> From: Stefan Kangas <stefankangas@gmail.com>
> Date: Thu, 2 Jan 2025 02:37:12 -0600
> Cc: eller.helmut@gmail.com, pipcet@protonmail.com, emacs-devel@gnu.org
> 
> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
> 
> >> Mercy: I have a lot of other Emacs-related stuff on my plate, as you
> >> are well aware.  Just keeping up with this discussion is already hard
> >> for me.  Please help me by reaching out to the MPS folks about this
> >> issue.
> >
> > I think it should be something "official". Maybe Stefan Kangas could
> > contact them, or Richard.
> 
> I'm happy to reach out to them in official capacity, but I'm not really
> close enough to the code to be able to usefully discuss the issue with
> them.  So I think it might be best to put some or all of you in Cc.
> 
> Before we do anything though, are we sure that it is faster to ask them
> to do things for us, instead of, say, just sending a patch?  I'm not
> sure how confident people are with hacking MPS, but if we are still
> seriously entertaining the idea of a fork then maybe we should be, to
> some extent.

We don't know the answer, AFAIU.  We could tell them that if they
prefer a patch, we can send one.

> If we do decide to contact them, I'm afraid that I don't have sufficient
> context to accurately describe the proposed callback.  I would need to
> ask someone to summarize the idea in sufficient detail so that we can
> start a conversation.

The problem is that evidently (at least on Posix platforms), a program
that uses MPS and runs application code from a SIGPROF, SIGALRM, or
SIGCHLD signal handler can trigger a recursive access to the MPS
arena, which causes a fatal signal if that happens while MPS holds the
arena lock.  So we want to ask for a callback when MPS is
about to lock the arena, and another callback immediately after it
releases the lock.  With that, we could defer the application code of
these signal handlers until after the arena is free to be accessed
again.
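
A minimal sketch of how such a pair of callbacks might be used,
under loud assumptions: MPS offers no such callbacks today, so
before_arena_lock and after_arena_unlock are hypothetical names, and
mock_arena_enter / mock_arena_leave merely stand in for the place
where MPS would take and release its arena lock.  The sketch blocks
the signals in question for the duration of the lock, so their
handlers run only after the lock has been released.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <pthread.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the arena lock; a real lock would be a mutex.  */
static bool mock_arena_locked;

/* Signal mask in effect before the lock was taken.  A single saved
   mask means this sketch does not support nested locking.  */
static sigset_t mask_before_lock;

/* Counts handler invocations, so deferral can be observed.  */
static volatile sig_atomic_t handler_runs;

static void
count_sigprof (int sig)
{
  (void) sig;
  handler_runs++;
}

static int
install_counting_handler (void)
{
  struct sigaction sa;
  memset (&sa, 0, sizeof sa);
  sigemptyset (&sa.sa_mask);
  sa.sa_handler = count_sigprof;
  return sigaction (SIGPROF, &sa, NULL);
}

/* Hypothetical "about to lock the arena" callback: block the signals
   whose handlers run application code.  */
static void
before_arena_lock (void)
{
  sigset_t deferred;
  sigemptyset (&deferred);
  sigaddset (&deferred, SIGPROF);
  sigaddset (&deferred, SIGALRM);
  sigaddset (&deferred, SIGCHLD);
  pthread_sigmask (SIG_BLOCK, &deferred, &mask_before_lock);
}

/* Hypothetical "lock released" callback: restore the old mask.  Any
   signal that became pending in the meantime is delivered now.  */
static void
after_arena_unlock (void)
{
  pthread_sigmask (SIG_SETMASK, &mask_before_lock, NULL);
}

static void
mock_arena_enter (void)
{
  before_arena_lock ();
  mock_arena_locked = true;
}

static void
mock_arena_leave (void)
{
  mock_arena_locked = false;
  after_arena_unlock ();
}
```

With this in place, a SIGPROF raised between mock_arena_enter and
mock_arena_leave stays pending and its handler runs when the mask is
restored.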

Alternatively, if MPS already has a solution for such applications
that use signals, we'd like to hear what they suggest.

As background, you can point them to this discussion:

  https://lists.gnu.org/archive/html/emacs-devel/2024-06/msg00568.html




* Re: igc, macOS avoiding signals
  2025-01-02  9:05                                                     ` Eli Zaretskii
@ 2025-01-02 10:00                                                       ` Helmut Eller
  2025-01-02 12:34                                                       ` Pip Cet via Emacs development discussions.
  1 sibling, 0 replies; 119+ messages in thread
From: Helmut Eller @ 2025-01-02 10:00 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Stefan Kangas, gerd.moellmann, pipcet, emacs-devel

On Thu, Jan 02 2025, Eli Zaretskii wrote:

> The problem is that evidently (at least on Posix platforms), a program
> that uses MPS and runs application code from a SIGPROF, SIGALRM, or
> SIGCHLD signal handler can trigger a recursive access to the MPS
> arena, which causes a fatal signal if that happens while MPS holds the
> arena lock.  So we want to ask for a callback when MPS is
> about to lock the arena, and another callback immediately after it
> releases the lock.  With that, we could defer the application code of
> these signal handlers until after the arena is free to be accessed
> again.

The functions to change would most likely be ArenaEnter and ArenaLeave.
As an alternative to callbacks, we could give the Arena a signal mask.

Arenas can be created with the function mps_arena_create_k.  This
accepts keyword arguments.  So the callbacks or signal mask could be
given with a new keyword argument.

Helmut




* Re: igc, macOS avoiding signals
  2025-01-02  9:05                                                     ` Eli Zaretskii
  2025-01-02 10:00                                                       ` Helmut Eller
@ 2025-01-02 12:34                                                       ` Pip Cet via Emacs development discussions.
  2025-01-02 13:08                                                         ` Gerd Möllmann
  2025-01-02 15:42                                                         ` Eli Zaretskii
  1 sibling, 2 replies; 119+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2025-01-02 12:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Stefan Kangas, gerd.moellmann, eller.helmut, emacs-devel

"Eli Zaretskii" <eliz@gnu.org> writes:

>> From: Stefan Kangas <stefankangas@gmail.com>
>> Date: Thu, 2 Jan 2025 02:37:12 -0600
>> Cc: eller.helmut@gmail.com, pipcet@protonmail.com, emacs-devel@gnu.org
>>
>> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
>>
>> >> Mercy: I have a lot of other Emacs-related stuff on my plate, as you
>> >> are well aware.  Just keeping up with this discussion is already hard
>> >> for me.  Please help me by reaching out to the MPS folks about this
>> >> issue.
>> >
>> > I think it should be something "official". Maybe Stefan Kangas could
>> > contact them, or Richard.
>>
>> I'm happy to reach out to them in official capacity, but I'm not really
>> close enough to the code to be able to usefully discuss the issue with
>> them.  So I think it might be best to put some or all of you in Cc.
>>
>> Before we do anything though, are we sure that it is faster to ask them
>> to do things for us, instead of, say, just sending a patch?  I'm not
>> sure how confident people are with hacking MPS, but if we are still
>> seriously entertaining the idea of a fork then maybe we should be, to
>> some extent.
>
> We don't know the answer, AFAIU.  We could tell them that if they
> prefer a patch, we can send one.

I see no requirement to modify MPS, so far.

The current solution should work fine for Emacs once the special-casing
of fast symbol handlers, which Helmut is working on IIUC, is in place.

>> If we do decide to contact them, I'm afraid that I don't have sufficient
>> context to accurately describe the proposed callback.  I would need to
>> ask someone to summarize the idea in sufficient detail so that we can
>> start a conversation.
>
> The problem is that evidently (at least on Posix platforms), a program
> that uses MPS and runs application code from a SIGPROF, SIGALRM, or
> SIGCHLD signal handler can trigger a recursive access to
> the MPS arena,

Let's be specific here: it's about accessing MPS memory, not about
allocating memory.  In particular, we're fully aware that thread APs
cannot be used from signal handlers if they're also used by the main
thread.

> which causes a fatal signal if that happens while MPS
> holds the arena lock.

I don't know what "fatal signal" is supposed to mean in this case, to
the MPS folks.

If there is a memory barrier in place, the result will be a deadlock
situation, which may or may not be detected.  In the first case, there
is an assertion violation, in some builds (I believe the "rash" build
will simply continue and corrupt memory).  In the second case, there's a
user-visible deadlock.

If there is no memory barrier in place (for example, while the segment
in question is being scanned), but the arena lock is held, silent data
corruption will occur, because we'll read an invalid intermediate state
of the memory.

IOW, the abort() situation is the best of three possible outcomes, not
the only one we want to avoid.  (I say this because I sometimes get the
impression that "make the MPS arena lock recursive" is considered as a
possible "solution"; it's not, for various reasons).

> So we want to ask for a callback when MPS is
> about to lock the arena, and another callback immediately after it
> releases the lock.

That's the first time I hear about the "about to lock the arena"
callback.  Wouldn't hurt, of course, but it's also a new idea.

"Immediately", of course, may be misleading: if another thread is
waiting for the lock, the lock will not be available until that thread
is done with it.

There's a third lock operation, of course: _trylock, used for
(invasively) checking whether a mutex is currently available.  Do we
want a pair of callbacks for that, or a third callback, or nothing at
all?
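
To illustrate why a trylock-based check counts as invasive, here is a
small sketch using plain POSIX mutexes (not MPS's own lock code; the
helper name mutex_was_available is made up for this example).  The
probe can only answer by actually taking the lock when it is free, so
any lock/unlock hooks would see an acquire/release pair.

```c
#include <pthread.h>
#include <stdbool.h>

/* Return true if MUTEX was free at the time of the call.

   The probe is invasive: when the mutex is free, it is briefly
   acquired and then released again.  The answer may also be stale by
   the time the caller acts on it.  */
static bool
mutex_was_available (pthread_mutex_t *mutex)
{
  /* pthread_mutex_trylock returns 0 if it acquired the mutex and
     EBUSY if the mutex is already locked.  Any other error is
     treated here as "not available".  */
  if (pthread_mutex_trylock (mutex) != 0)
    return false;

  pthread_mutex_unlock (mutex);
  return true;
}
```

A successful probe therefore looks, from the point of view of a
lock callback, just like a very short critical section.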

> With that, we could defer the application code of
> these signal handlers until after the arena is free to be accessed
> again.

We could block the appropriate signals before (in some cases, quite a
while before) we take the arena lock and unblock them after we release
it.  That's not obviously the best solution, but it's the only one this
change would enable, AFAICS.

Running signal handler code from C without ever blocking signals at the
OS level is much more complicated, and I'm not convinced it would work
even with atomic types.

This would also require the OS to continue guaranteeing full POSIX
signal semantics while a "SIGSEGV" exception is being handled, since
that involves taking the arena lock.

> Alternatively, if MPS already has a solution for such applications
> that use signals, we'd like to hear what they suggest.

That's a useful question to ask, of course.  My understanding is that
MPS is configured mostly at build time, and your idea would amount to
creating a replacement for lockix.c and lockw3.c which allows blocking
signals.

> As background, you can point them to this discussion:
>
>   https://lists.gnu.org/archive/html/emacs-devel/2024-06/msg00568.html

I think the polite thing to do would be to agree on a short but accurate
summary of what it is we want, explaining why it would be helpful (it
may simplify things but isn't required for correctness).

Pip





* Re: igc, macOS avoiding signals
  2025-01-02 12:34                                                       ` Pip Cet via Emacs development discussions.
@ 2025-01-02 13:08                                                         ` Gerd Möllmann
  2025-01-02 15:42                                                         ` Eli Zaretskii
  1 sibling, 0 replies; 119+ messages in thread
From: Gerd Möllmann @ 2025-01-02 13:08 UTC (permalink / raw)
  To: Pip Cet; +Cc: Eli Zaretskii, Stefan Kangas, eller.helmut, emacs-devel

Pip Cet <pipcet@protonmail.com> writes:

> The current solution should work fine for Emacs once the special-casing
> of fast symbol handlers, which Helmut is working on IIUC, is in place.
          ^^^^^^
          signal

I think the same. If that works, we're fine without support.




* Re: igc, macOS avoiding signals
  2025-01-02 12:34                                                       ` Pip Cet via Emacs development discussions.
  2025-01-02 13:08                                                         ` Gerd Möllmann
@ 2025-01-02 15:42                                                         ` Eli Zaretskii
  2025-01-02 17:56                                                           ` Pip Cet via Emacs development discussions.
  1 sibling, 1 reply; 119+ messages in thread
From: Eli Zaretskii @ 2025-01-02 15:42 UTC (permalink / raw)
  To: Pip Cet; +Cc: stefankangas, gerd.moellmann, eller.helmut, emacs-devel

> Date: Thu, 02 Jan 2025 12:34:58 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: Stefan Kangas <stefankangas@gmail.com>, gerd.moellmann@gmail.com, eller.helmut@gmail.com, emacs-devel@gnu.org
> 
> > We don't know the answer, AFAIU.  We could tell them that if they
> > prefer a patch, we can send one.
> 
> I see no requirement to modify MPS, so far.

There's no requirement, but it would be silly, IMO, to try to solve
these issues without ever talking to the MPS developers.  We have
nothing to lose, but it's possible that they will point out other
solutions.  Why give up that up front?  It makes no sense to me.

> The current solution should work fine for Emacs once the special-casing
> of fast symbol handlers, which Helmut is working on IIUC, is in place.

Another pair (or several pairs) of eyes, and by people who know more
than we do about the library, cannot do any harm.  We can always
decide to go our way, regardless.

> >> If we do decide to contact them, I'm afraid that I don't have sufficient
> >> context to accurately describe the proposed callback.  I would need to
> >> ask someone to summarize the idea in sufficient detail so that we can
> >> start a conversation.
> >
> > The problem is that evidently (at least on Posix platforms), a program
> > that uses MPS and runs application code from a SIGPROF, SIGALRM, or
> > SIGCHLD signal handler can trigger a recursive access to
> > the MPS arena,
> 
> Let's be specific here: it's about accessing MPS memory, not about
> allocating memory.

I agree.  But then I didn't say anything about allocations.

> > which causes a fatal signal if that happens while MPS
> > holds the arena lock.
> 
> I don't know what "fatal signal" is supposed to mean in this case, to
> the MPS folks.

It's an accepted terminology, but we could explain if needed.  My text
was for Stefan (who does know what "fatal signal" means), not for
quoting it verbatim in a message to MPS.

> > So we want to ask for a callback when MPS is
> > about to lock the arena, and another callback immediately after it
> > releases the lock.
> 
> That's the first time I hear about the "about to lock the arena"
> callback.  Wouldn't hurt, of course, but it's also a new idea.

It was always the idea.

> "Immediately", of course, may be misleading: if another thread is
> waiting for the lock, the lock will not be available until that thread
> is done with it.

Which other thread?  There can be only one Lisp thread running at any
given time.

> There's a third lock operation, of course: _trylock, used for
> (invasively) checking whether a mutex is currently available.  Do we
> want a pair of callbacks for that, or a third callback, or nothing at
> all?

I think these questions are ahead of time.  We should first hear what
the MPS developers think about this idea.

> We could block the appropriate signals before (in some cases, quite a
> while before) we take the arena lock and unblock them after we release
> it.  That's not obviously the best solution, but it's the only one this
> change would enable, AFAICS.

The problem is, we don't know when the arena will be locked, thus the
request for a callback.

> > Alternatively, if MPS already has a solution for such applications
> > that use signals, we'd like to hear what they suggest.
> 
> That's a useful question to ask, of course.  My understanding is that
> MPS is configured mostly at build time, and your idea would amount to
> creating a replacement for lockix.c and lockw3.c which allows blocking
> signals.

Once again, let's hear what the MPS developers have to say about that.

> > As background, you can point them to this discussion:
> >
> >   https://lists.gnu.org/archive/html/emacs-devel/2024-06/msg00568.html
> 
> I think the polite thing to do would be to agree on a short but accurate
> summary of what it is we want, explaining why it would be helpful (it
> may simplify things but isn't required for correctness).

That discussion has several backtraces which might be useful for them
to better understand the issue.




* Re: igc, macOS avoiding signals
  2025-01-02 15:42                                                         ` Eli Zaretskii
@ 2025-01-02 17:56                                                           ` Pip Cet via Emacs development discussions.
  0 siblings, 0 replies; 119+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2025-01-02 17:56 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: stefankangas, gerd.moellmann, eller.helmut, emacs-devel

"Eli Zaretskii" <eliz@gnu.org> writes:

>> Date: Thu, 02 Jan 2025 12:34:58 +0000
>> From: Pip Cet <pipcet@protonmail.com>
>> Cc: Stefan Kangas <stefankangas@gmail.com>, gerd.moellmann@gmail.com, eller.helmut@gmail.com, emacs-devel@gnu.org
>>
>> > We don't know the answer, AFAIU.  We could tell them that if they
>> > prefer a patch, we can send one.
>>
>> I see no requirement to modify MPS, so far.
>
> There's no requirement, but it would be silly, IMO, to try to solve
> these issues without ever talking to the MPS developers.  We have
> nothing to lose, but it's possible that they will point out other

(Or we'll point them to a discussion which accuses MPS of being
"unreasonable" and pretty much demands MPS is fundamentally changed to
match our alleged requirements, and they'll get the wrong idea and we'll
lose whatever goodwill we might have had.)

> solutions.  Why give up that up front?  It makes no sense to me.

I never said you shouldn't talk to the MPS developers!

I'm not going to unless I can present them with a patch that I could at
least imagine I'd consider if positions were reversed (unless I were
asked to do so, which seems unlikely :-) ).  I think it would be good to
treat their time as valuable enough not to simply point them to a very
long thread, but that's just my opinion.

>> >> If we do decide to contact them, I'm afraid that I don't have sufficient
>> >> context to accurately describe the proposed callback.  I would need to
>> >> ask someone to summarize the idea in sufficient detail so that we can
>> >> start a conversation.
>> >
>> > The problem is that evidently (at least on Posix platforms), if a
>> > program that uses MPS runs application code from a SIGPROF or a
>> > SIGALRM or a SIGCHLD signal handler, that code can trigger a
>> > recursive access to the MPS arena,
>>
>> Let's be specific here: it's about accessing MPS memory, not about
>> allocating memory.
>
> I agree.  But then I didn't say anything about allocations.

(... that's what "being unspecific" means, isn't it?)

>> > which causes a fatal signal if that happens while MPS
>> > holds the arena lock.
>>
>> I don't know what "fatal signal" is supposed to mean in this case, to
>> the MPS folks.
>
> It's an accepted terminology, but we could explain if needed.  My text
> was for Stefan (who does know what "fatal signal" means), not for
> quoting it verbatim in a message to MPS.

As I explained, it's also not the only problem.  A "fatal signal" is the
best of three possible undesirable outcomes.

>> > So we want to ask for a callback when MPS is
>> > about to lock the arena, and another callback immediately after it
>> > releases the lock.
>>
>> That's the first time I hear about the "about to lock the arena"
>> callback.  Wouldn't hurt, of course, but it's also a new idea.
>
> It was always the idea.

I'm sorry, but I really don't see how you can make that statement.  Your
most recent proposal said nothing about that part of your idea, see:

https://mail.gnu.org/archive/html/emacs-devel/2024-12/msg01537.html

If you expected me to be smart enough to read that email and conclude
that you want another callback which would block signals, and you meant
"unblock the signal" when you wrote "run the handler's body", I'm not.

>> "Immediately", of course, may be misleading: if another thread is
>> waiting for the lock, the lock will not be available until that thread
>> is done with it.
>
> Which other thread?  There can be only one Lisp thread running at any
> given time.

I thought we agreed we should trigger GC from a separate POSIX thread
for at least some builds (to debug things).  Anyway, I don't see why we
should present a solution for a locking problem that assumes things are
single-threaded anyway.

>> We could block the appropriate signals before (in some cases, quite a
>> while before) we take the arena lock and unblock them after we release
>> it.  That's not obviously the best solution, but it's the only one this
>> change would enable, AFAICS.
>
> The problem is, we don't know when the arena will be locked, thus the
> request for a callback.

The problem is that these precise callbacks would enable ONLY this
solution, while a different callback mechanism might enable others.
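
For illustration, the signal-blocking solution that a pre-lock and a post-unlock callback would enable could look roughly like the sketch below.  This is not MPS or Emacs code: arena_lock is a stand-in mutex, and before_arena_lock, after_arena_unlock, arena_enter, arena_leave and signal_blocked_p are hypothetical names.  It assumes a single lock and a single thread, so one saved mask is enough.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the MPS arena lock (an assumption, not the real lock).  */
static pthread_mutex_t arena_lock = PTHREAD_MUTEX_INITIALIZER;

/* Signal mask saved by the pre-lock callback.  One slot suffices only
   because this sketch assumes a single lock and a single thread.  */
static sigset_t saved_mask;

/* Hypothetical callback: run just before the arena lock is taken.
   Blocks the signals whose handlers might touch MPS memory.  */
static void
before_arena_lock (void)
{
  sigset_t block;
  sigemptyset (&block);
  sigaddset (&block, SIGPROF);
  sigaddset (&block, SIGALRM);
  sigaddset (&block, SIGCHLD);
  int err = pthread_sigmask (SIG_BLOCK, &block, &saved_mask);
  assert (err == 0);
}

/* Hypothetical callback: run right after the arena lock is released.
   Restores the mask; blocked signals that are pending get delivered.  */
static void
after_arena_unlock (void)
{
  int err = pthread_sigmask (SIG_SETMASK, &saved_mask, NULL);
  assert (err == 0);
}

/* Return true if SIG is blocked in the calling thread.  */
static bool
signal_blocked_p (int sig)
{
  sigset_t current;
  pthread_sigmask (SIG_BLOCK, NULL, &current);  /* query only */
  return sigismember (&current, sig) == 1;
}

/* What the lock and unlock paths would look like with the callbacks.  */
static void
arena_enter (void)
{
  before_arena_lock ();
  pthread_mutex_lock (&arena_lock);
}

static void
arena_leave (void)
{
  pthread_mutex_unlock (&arena_lock);
  after_arena_unlock ();
}
```

Between arena_enter and arena_leave, signal_blocked_p (SIGPROF) is true, so a handler cannot run while the lock is held.  With more than one thread or more than one lock, the single saved_mask would no longer do.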

I also think it would be better to simply set a custom lock/unlock
function which is run INSTEAD of the lockix.c code, which would give us
options to fine-tune the behavior in case of lock contention (it's
perfectly okay to run signal handlers "while" trying to grab the lock,
which potentially takes a while.  They might finish, or they might need
the arena lock before we do.  Most importantly, that would give us a
timeout mechanism for interrupting lengthy scans for user interaction.

I think this is highly relevant; either we need to split long vectors,
or we need a way to interrupt scanning, and this is precisely the point
where we could do so.  If all we have is a pre-lock callback, we can't
interrupt scanning, and then we're forced to split long vectors...).
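
A minimal sketch of such a replacement lock function with a timeout, assuming a plain (non-recursive) pthread mutex.  The names arena_lock and lock_with_timeout are hypothetical, and the polling loop is only one way to do it; nothing here is taken from lockix.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Stand-in for the MPS arena lock (an assumption, not the real lock).  */
static pthread_mutex_t arena_lock = PTHREAD_MUTEX_INITIALIZER;

/* Try to take LOCK at most MAX_TRIES times, sleeping about 1 ms after
   each failed attempt.  Return true if the lock was acquired, false
   if every attempt found it busy.  */
static bool
lock_with_timeout (pthread_mutex_t *lock, int max_tries)
{
  const struct timespec pause = { .tv_sec = 0, .tv_nsec = 1000000 };

  for (int i = 0; i < max_tries; i++)
    {
      int err = pthread_mutex_trylock (lock);
      if (err == 0)
        return true;
      assert (err == EBUSY);
      /* Signals are not blocked here, so handlers are free to run
         while we wait for the lock.  */
      nanosleep (&pause, NULL);
    }
  return false;
}
```

A caller that gets false back could check for user input and then either retry or give up, which is the interruption point a pre-lock callback alone would not provide.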

The main problem, to me, is that the single-lock design of MPS might
change to a multi-lock design, and what do we do then?  Do we need
per-thread per-lock storage, or does the per-thread signal mask suffice?

>> > Alternatively, if MPS already has a solution for such applications
>> > that use signals, we'd like to hear what they suggest.
>>
>> That's a useful question to ask, of course.  My understanding is that
>> MPS is configured mostly at build time, and your idea would amount to
>> creating a replacement for lockix.c and lockw3.c which allows blocking
>> signals.
>
> Once again, let's hear what the MPS developers have to say about that.

As I said, whoever wants to hit "send" can do so.  Not me, for now.

>> > As background, you can point them to this discussion:
>> >
>> >   https://lists.gnu.org/archive/html/emacs-devel/2024-06/msg00568.html
>>
>> I think the polite thing to do would be to agree on a short but accurate
>> summary of what it is we want, explaining why it would be helpful (it
>> may simplify things but isn't required for correctness).
>
> That discussion has several backtraces which might be useful for them
> to better understand the issue.

It also contains the "unreasonable" thing, IIRC, and it's quite long.
But, again, my consent is certainly not required :-)

All that said, while I don't think it's the best idea, if you insist on
this MPS change and do the signal-blocking thing, that would, at least,
finally settle the question.

Strictly speaking, we can merge without even splitting long vectors
(some people will decide to give it a try, create a billion-entry
vector, observe the totally unusable Emacs session that leaves them
with, and decide MPS isn't ready).

Pip




^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2024-12-31 14:34                                               ` Eli Zaretskii
  2024-12-31 15:08                                                 ` Gerd Möllmann
@ 2025-01-03 18:37                                                 ` Helmut Eller
  2025-01-03 19:55                                                   ` Eli Zaretskii
  1 sibling, 1 reply; 119+ messages in thread
From: Helmut Eller @ 2025-01-03 18:37 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Pip Cet, gerd.moellmann, spd, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 3490 bytes --]

On Tue, Dec 31 2024, Eli Zaretskii wrote:
> We'd need to add a new function to process_pending_signals, which
> would process SIGPROF and maybe also SIGALRM.  The signal handlers for
> those would then only set a flag (not pending_signals, some other
> flag).

I implemented this with the two attached patches.  The trouble is that
the recorded backtraces are not the same.  This can be seen by looking at
the call tree produced by profiler.el and the attached profiler-test.el.
When add_sample is called in the signal handler, the call tree for
the foo example looks like this:

    ...
    1986 100%         main
    1986 100%           record-samples
    1986 100%             foo
    1074  54%               float-time
       0   0%   ...

When add_sample is called from process_pending_signals, it looks like
this:

    ...
    1986 100%         main
    1986 100%           record-samples
    1986 100%             foo
       0   0%   ...

Note the absence of float-time.  The reason for this is that in
bytecode.c, maybe_quit is called before the function is pushed to the
backtrace with record_in_backtrace.  In the second patch, I moved this
call forward to just before the function is popped with lisp_eval_depth--.
With this patch, the call tree includes float-time again:

    ...
    1989 100%         main
    1989 100%           record-samples
    1989 100%             foo
    1981  99%               float-time
       0   0%   ...

However, float-time now gets 99%, as opposed to 54% in the first call
tree.
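
The effect can be reproduced with a toy model, sketched here under loud assumptions: backtrace_stack, safepoint, call_function, sample_call and sampled_in are made-up names, raise (SIGUSR1) stands in for a SIGPROF tick, and none of this is Emacs code.  The handler only sets a flag, and the sample is attributed to whatever is on top of the backtrace at the next safepoint.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy backtrace: names of the functions currently being executed.  */
static const char *backtrace_stack[16];
static int backtrace_depth;

/* Set by the signal handler, consumed at the next safepoint.  */
static volatile sig_atomic_t pending_samples;

/* Function the most recent sample was attributed to.  */
static const char *last_sampled;

static void
handle_signal (int sig)
{
  (void) sig;
  pending_samples = 1;  /* defer the real work */
}

/* Safepoint: take the deferred sample from the current backtrace.  */
static void
safepoint (void)
{
  if (pending_samples)
    {
      pending_samples = 0;
      last_sampled = (backtrace_depth > 0
                      ? backtrace_stack[backtrace_depth - 1] : NULL);
    }
}

/* Call NAME.  The safepoint is either before the push (as before the
   second patch) or just before the pop (as after it).  */
static void
call_function (const char *name, bool safepoint_before_push)
{
  if (safepoint_before_push)
    safepoint ();
  backtrace_stack[backtrace_depth++] = name;
  raise (SIGUSR1);  /* a profiler tick arrives while NAME runs */
  if (!safepoint_before_push)
    safepoint ();
  backtrace_depth--;
}

/* Run foo calling float-time; return the function that got the sample.  */
static const char *
sample_call (bool safepoint_before_push)
{
  signal (SIGUSR1, handle_signal);
  pending_samples = 0;
  last_sampled = NULL;
  backtrace_depth = 0;
  backtrace_stack[backtrace_depth++] = "foo";
  call_function ("float-time", safepoint_before_push);
  safepoint ();  /* the caller's next safepoint */
  return last_sampled;
}

static bool
sampled_in (bool safepoint_before_push, const char *name)
{
  const char *got = sample_call (safepoint_before_push);
  return got != NULL && strcmp (got, name) == 0;
}
```

With the safepoint before the push, the tick that arrived inside float-time is charged to foo; with the safepoint just before the pop, every such tick is charged to float-time, however little of the call had run when it arrived.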

A more complex pair of call trees is attached in the files
bar-0.report and bar-2.report.  A significant difference there is
in this section:

     ...
     781  73%                     animate-place-char
      19   1%                       delete-char
      16   1%                       floor
       4   0%                       undo-auto--undoable-change
       4   0%                         undo-auto--boundary-ensure-timer
      96   9%                       insert-char
      14   1%                         undo-auto--undoable-change
       6   0%                           undo-auto--boundary-ensure-timer
       5   0%                       beginning-of-line
     232  21%                       move-to-column
     ...

compared to the version with both patches applied:

     ...
     693  72%                     animate-place-char
      32   3%                       delete-char
      29   3%                       window-start
      43   4%                       insert-char
     309  32%                       move-to-column
     222  23%                       beginning-of-line
       8   0%                       undo-auto--undoable-change
       8   0%                         undo-auto--boundary-ensure-timer
       8   0%                           run-at-time
       8   0%                             timer-set-function
       8   0%                               timerp
       8   0%                                 vectorp
     ...
     
E.g. the percentage attributed to beginning-of-line is quite different
in those two versions (0% without the patches, 23% with both applied).

I'm not sure if those differences are acceptable.  I also have no good
idea how to reduce them, except inserting more calls to maybe_quit.

(In eval_sub and Ffuncall, it would also help the profiler to move the
maybe_quit call forward before lisp_eval_depth--. This would only matter
for interpreted functions, not in byte compiled code.  Curiously,
apply_lambda doesn't call maybe_quit at all.)

Helmut



[-- Attachment #2: 0001-Delay-processing-of-SIGPROF-to-the-next-safepoint.patch --]
[-- Type: text/x-diff, Size: 2359 bytes --]

From ed26a2e8fc92a321f0afeb38c1f88db46ec957a6 Mon Sep 17 00:00:00 2001
From: Helmut Eller <eller.helmut@gmail.com>
Date: Fri, 3 Jan 2025 17:27:10 +0100
Subject: [PATCH 1/2] Delay processing of SIGPROF to the next safepoint

* src/lisp.h (process_pending_profiler_signals): New function.
* src/profiler.c (pending_profiler_signals): New variable.
(handle_profiler_signal): Instead of calling add_sample,
set pending_signals and increment pending_profiler_signals.
(process_pending_profiler_signals): New function.
* src/keyboard.c (process_pending_signals): Call
process_pending_profiler_signals.
---
 src/keyboard.c |  1 +
 src/lisp.h     |  1 +
 src/profiler.c | 17 ++++++++++++++++-
 3 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/src/keyboard.c b/src/keyboard.c
index e875e98fde6..5d6cebdc990 100644
--- a/src/keyboard.c
+++ b/src/keyboard.c
@@ -8191,6 +8191,7 @@ process_pending_signals (void)
   handle_async_input ();
   do_pending_atimers ();
   do_async_work ();
+  process_pending_profiler_signals ();
 }
 
 /* Undo any number of BLOCK_INPUT calls down to level LEVEL,
diff --git a/src/lisp.h b/src/lisp.h
index 48585c2d8a1..774667a4f9c 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -5938,6 +5938,7 @@ maybe_disable_address_randomization (int argc, char **argv)
 extern void malloc_probe (size_t);
 extern void syms_of_profiler (void);
 extern void mark_profiler (void);
+extern void process_pending_profiler_signals (void);
 
 
 #ifdef DOS_NT
diff --git a/src/profiler.c b/src/profiler.c
index 3db7fe0eb3e..47367982cab 100644
--- a/src/profiler.c
+++ b/src/profiler.c
@@ -387,6 +387,9 @@ add_sample (struct profiler_log *plog, EMACS_INT count)
 /* Hash-table log of CPU profiler.  */
 static struct profiler_log cpu;
 
+/* Number of unprocessed profiler signals. */
+static uintptr_t pending_profiler_signals;
+
 /* The current sampling interval in nanoseconds.  */
 static EMACS_INT current_sampling_interval;
 
@@ -402,7 +405,19 @@ handle_profiler_signal (int signal)
       count += overruns;
     }
 #endif
-  add_sample (&cpu, count);
+  pending_signals = true;
+  pending_profiler_signals += count;
+}
+
+void
+process_pending_profiler_signals (void)
+{
+  uintptr_t count = pending_profiler_signals;
+  if (count)
+    {
+      pending_profiler_signals = 0;
+      add_sample (&cpu, count);
+    }
 }
 
 static void
-- 
2.39.5


[-- Attachment #3: 0002-Call-maybe_quit-at-a-different-point-to-the-help-the.patch --]
[-- Type: text/x-diff, Size: 1154 bytes --]

From ec3227c060f12ca137b5a5bd1e607b922a6dafec Mon Sep 17 00:00:00 2001
From: Helmut Eller <eller.helmut@gmail.com>
Date: Fri, 3 Jan 2025 18:12:41 +0100
Subject: [PATCH 2/2] Call maybe_quit at a different point to help the
 profiler.

* src/bytecode.c (exec_byte_code): In the docall sequence, move
the call to maybe_quit forward to immediately before lisp_eval_depth--.
This helps the profiler to see the function that was
interrupted by the SIGPROF signal.
---
 src/bytecode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/bytecode.c b/src/bytecode.c
index 31f7404cbd1..53c200e2c18 100644
--- a/src/bytecode.c
+++ b/src/bytecode.c
@@ -784,7 +784,6 @@ #define DEFINE(name, value) [name] = &&insn_ ## name,
 		  }
 	      }
 #endif
-	    maybe_quit ();
 
 	    if (++lisp_eval_depth > max_lisp_eval_depth)
 	      {
@@ -829,6 +828,7 @@ #define DEFINE(name, value) [name] = &&insn_ ## name,
 	    else
 	      val = funcall_general (original_fun, call_nargs, call_args);
 
+	    maybe_quit ();
 	    lisp_eval_depth--;
 	    if (backtrace_debug_on_exit (specpdl_ptr - 1))
 	      val = call_debugger (list2 (Qexit, val));
-- 
2.39.5


[-- Attachment #4: profiler-test.el --]
[-- Type: application/emacs-lisp, Size: 1730 bytes --]

[-- Attachment #5: bar-0.report --]
[-- Type: text/plain, Size: 1532 bytes --]

       0   0% nil
    1056 100%   normal-top-level
    1056 100%     command-line
    1056 100%       command-line-1
    1056 100%         main
    1056 100%           record-samples
    1056 100%             bar
    1056 100%               animate-birthday-present
    1046  99%                 animate-string
      61   5%                   sit-for
      61   5%                     sleep-for
     805  76%                   animate-step
     781  73%                     animate-place-char
      19   1%                       delete-char
      16   1%                       floor
       4   0%                       undo-auto--undoable-change
       4   0%                         undo-auto--boundary-ensure-timer
      96   9%                       insert-char
      14   1%                         undo-auto--undoable-change
       6   0%                           undo-auto--boundary-ensure-timer
       5   0%                       beginning-of-line
     232  21%                       move-to-column
     132  12%                   primitive-undo
       6   0%                     undo-auto--undoable-change
      10   0%                 capitalize
      10   0%                   load-with-code-conversion
       8   0%                     hack-read-symbol-shorthands
       8   0%                       hack-local-variables--find-variables
       8   0%                         search-backward
       2   0%                     generate-new-buffer
       2   0%                       get-buffer-create
       0   0%   ...

[-- Attachment #6: bar-2.report --]
[-- Type: text/plain, Size: 1526 bytes --]

       0   0% nil
     959 100%   normal-top-level
     959 100%     command-line
     959 100%       command-line-1
     959 100%         main
     959 100%           record-samples
     959 100%             bar
     959 100%               animate-birthday-present
     953  99%                 animate-string
      12   1%                   animate-initialize
      12   1%                     window-width
     139  14%                   sit-for
     139  14%                     sleep-for
     109  11%                   primitive-undo
      15   1%                     markerp
      32   3%                     abs
     693  72%                   animate-step
     693  72%                     animate-place-char
      32   3%                       delete-char
      29   3%                       window-start
      43   4%                       insert-char
     309  32%                       move-to-column
     222  23%                       beginning-of-line
       8   0%                       undo-auto--undoable-change
       8   0%                         undo-auto--boundary-ensure-timer
       8   0%                           run-at-time
       8   0%                             timer-set-function
       8   0%                               timerp
       8   0%                                 vectorp
       6   0%                 capitalize
       6   0%                   load-with-code-conversion
       6   0%                     eval-buffer
       6   0%                       read
       0   0%   ...

^ permalink raw reply related	[flat|nested] 119+ messages in thread

* Re: igc, macOS avoiding signals
  2025-01-03 18:37                                                 ` Helmut Eller
@ 2025-01-03 19:55                                                   ` Eli Zaretskii
  0 siblings, 0 replies; 119+ messages in thread
From: Eli Zaretskii @ 2025-01-03 19:55 UTC (permalink / raw)
  To: Helmut Eller; +Cc: pipcet, gerd.moellmann, spd, emacs-devel

> From: Helmut Eller <eller.helmut@gmail.com>
> Cc: Pip Cet <pipcet@protonmail.com>,  gerd.moellmann@gmail.com,
>   spd@toadstyle.org,  emacs-devel@gnu.org
> Date: Fri, 03 Jan 2025 19:37:59 +0100
> 
> On Tue, Dec 31 2024, Eli Zaretskii wrote:
> > We'd need to add a new function to process_pending_signals, which
> > would process SIGPROF and maybe also SIGALRM.  The signal handlers for
> > those would then only set a flag (not pending_signals, some other
> > flag).
> 
> I implemented this with the two attached patches.  The trouble is that
> the recorded backtraces are not the same.  This can be seen by looking at
> the call tree produced by profiler.el and the attached profiler-test.el.
> When add_sample is called in the signal handler, the call tree for
> the foo example looks like this:
> 
>     ...
>     1986 100%         main
>     1986 100%           record-samples
>     1986 100%             foo
>     1074  54%               float-time
>        0   0%   ...
> 
> When add_sample is called from process_pending_signals, it looks like
> this:
> 
>     ...
>     1986 100%         main
>     1986 100%           record-samples
>     1986 100%             foo
>        0   0%   ...
> 
> Note the absence of float-time.  The reason for this is that in
> bytecode.c, maybe_quit is called before the function is pushed to the
> backtrace with record_in_backtrace.  In the second patch, I moved this
> call forward to just before the function is popped with lisp_eval_depth--.
> With this patch, the call tree includes float-time again:
> 
>     ...
>     1989 100%         main
>     1989 100%           record-samples
>     1989 100%             foo
>     1981  99%               float-time
>        0   0%   ...
> 
> However, float-time now gets 99%, as opposed to 54% in the first call
> tree.
> 
> A more complex pair of call trees is attached in the files
> bar-0.report and bar-2.report.  A significant difference there is
> in this section:
> 
>      ...
>      781  73%                     animate-place-char
>       19   1%                       delete-char
>       16   1%                       floor
>        4   0%                       undo-auto--undoable-change
>        4   0%                         undo-auto--boundary-ensure-timer
>       96   9%                       insert-char
>       14   1%                         undo-auto--undoable-change
>        6   0%                           undo-auto--boundary-ensure-timer
>        5   0%                       beginning-of-line
>      232  21%                       move-to-column
>      ...
> 
> compared to the version with both patches applied:
> 
>      ...
>      693  72%                     animate-place-char
>       32   3%                       delete-char
>       29   3%                       window-start
>       43   4%                       insert-char
>      309  32%                       move-to-column
>      222  23%                       beginning-of-line
>        8   0%                       undo-auto--undoable-change
>        8   0%                         undo-auto--boundary-ensure-timer
>        8   0%                           run-at-time
>        8   0%                             timer-set-function
>        8   0%                               timerp
>        8   0%                                 vectorp
>      ...
>      
> E.g. the percentage attributed to beginning-of-line is quite different
> in those two versions (0% without the patches, 23% with both applied).
> 
> I'm not sure if those differences are acceptable.  I also have no good
> idea how to reduce them, except inserting more calls to maybe_quit.

Thanks.

I guess this means we don't call maybe_quit frequently enough to
produce accurate profiles using this method.



^ permalink raw reply	[flat|nested] 119+ messages in thread
