* Some experience with the igc branch
@ 2024-12-22 15:40 Óscar Fuentes
2024-12-22 17:18 ` Gerd Möllmann
` (2 more replies)
0 siblings, 3 replies; 91+ messages in thread
From: Óscar Fuentes @ 2024-12-22 15:40 UTC (permalink / raw)
To: emacs-devel
I've been using the igc branch for the past weeks. It was mostly Dart/Flutter
development with lsp-dart / lsp-mode enabled, with all its default
features enabled. On top of that, I use the flx completion algorithm.
This setup puts a lot of stress on GC. To illustrate, on master Emacs
after setting garbage-collection-messages to t, one can see that simply
writing a few characters triggers GC several times, each with its
corresponding pause, which may be very noticeable ("uh! that keypress
didn't register... wait, there it is.") The experience is not great.
Quite miserable, I would say. People suggest playing with
gc-cons-threshold (I have mine set to 10'000'000) but those tricks
simply make things a bit less awful.
With igc the pauses are still there, but they are much shorter and
predictable; they no longer distract me from thinking about what I'm
writing, which is a huge improvement. I suspect that some of those
pauses are not related to garbage collection (executing code and moving
data also takes time.)
TL/DR: now I enjoy using Emacs with this setup and I'm no longer tempted
to switch to other editors for this type of work.
A big thank you to all involved on this feature.
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-22 15:40 Some experience with the igc branch Óscar Fuentes
@ 2024-12-22 17:18 ` Gerd Möllmann
2024-12-22 17:29 ` Gerd Möllmann
2024-12-22 17:41 ` Pip Cet via Emacs development discussions.
2 siblings, 0 replies; 91+ messages in thread
From: Gerd Möllmann @ 2024-12-22 17:18 UTC (permalink / raw)
To: Óscar Fuentes; +Cc: emacs-devel
Óscar Fuentes <ofv@wanadoo.es> writes:
> I've been using the igc branch for the past weeks. It was mostly Dart/Flutter
> development with lsp-dart / lsp-mode enabled, with all its default
> features enabled. On top of that, I use the flx completion algorithm.
>
> This setup puts a lot of stress on GC. To illustrate, on master Emacs
> after setting garbage-collection-messages to t, one can see that simply
> writing a few characters triggers GC several times, each with its
> corresponding pause, which may be very noticeable ("uh! that keypress
> didn't register... wait, there it is.") The experience is not great.
> Quite miserable, I would say. People suggest playing with
> gc-cons-threshold (I have mine set to 10'000'000) but those tricks
> simply make things a bit less awful.
>
> With igc the pauses are still there, but they are much shorter and
> predictable, they no longer distract me from thinking on what I'm
> writing, which is a huge improvement. I suspect that some of those
> pauses are not related to garbage collection (executing code and moving
> data also takes time.)
>
> TL/DR: now I enjoy using Emacs with this setup and I'm no longer tempted
> to switch to other editors for this type of work.
>
> A big thank you to all involved on this feature.
👍
* Re: Some experience with the igc branch
2024-12-22 15:40 Some experience with the igc branch Óscar Fuentes
2024-12-22 17:18 ` Gerd Möllmann
@ 2024-12-22 17:29 ` Gerd Möllmann
2024-12-22 17:41 ` Pip Cet via Emacs development discussions.
2 siblings, 0 replies; 91+ messages in thread
From: Gerd Möllmann @ 2024-12-22 17:29 UTC (permalink / raw)
To: Óscar Fuentes; +Cc: emacs-devel
Óscar Fuentes <ofv@wanadoo.es> writes:
> With igc the pauses are still there, but they are much shorter and
> predictable, they no longer distract me from thinking on what I'm
> writing, which is a huge improvement. I suspect that some of those
> pauses are not related to garbage collection (executing code and moving
> data also takes time.)
In my case, with Eglot, the following settings made a difference:
  :custom
  (eglot-sync-connect nil)
  (eglot-events-buffer-config '(:size 0 :format full)))
Don't know if lsp-mode has similar knobs.
* Re: Some experience with the igc branch
2024-12-22 15:40 Some experience with the igc branch Óscar Fuentes
2024-12-22 17:18 ` Gerd Möllmann
2024-12-22 17:29 ` Gerd Möllmann
@ 2024-12-22 17:41 ` Pip Cet via Emacs development discussions.
2024-12-22 17:56 ` Gerd Möllmann
` (3 more replies)
2 siblings, 4 replies; 91+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-22 17:41 UTC (permalink / raw)
To: Óscar Fuentes
Cc: emacs-devel, Gerd Möllmann, Helmut Eller, Andrea Corallo
Óscar Fuentes <ofv@wanadoo.es> writes:
> With igc the pauses are still there, but they are much shorter and
> predictable, they no longer distract me from thinking on what I'm
> writing, which is a huge improvement. I suspect that some of those
> pauses are not related to garbage collection (executing code and moving
> data also takes time.)
Quite possible. Even if it is GC, please keep in mind that MPS has many
settings which you can play with, and it can improve things a lot. It's
not too early to become a fan of the scratch/igc branch, but it is too
early to reject it for performance reasons. It's a "heads you lose, tails I
win" situation, I guess.
> TL/DR: now I enjoy using Emacs with this setup and I'm no longer tempted
> to switch to other editors for this type of work.
I think this is an important point: ultimately, it's about having daily
drivers. We need to remove the remaining impediments for that:
1. The signal issue. I don't have a good way to fix this and make
everyone happy, but I do have a solution which hasn't caused a crash for
me in quite a while. It may be good enough.
2. no-purespace. Merging that into scratch/igc would help, well, me.
What do others think?
3. bytecode stack marking. That comment raises my red-flag alert,
because it sounds like we're just accepting a preventable crash at this
stage rather than wanting to do anything about it. The reality, of
course, is different, but I'd be happier if we refused to create a byte
code object that intends to use more stack than we can guarantee we
would scan. Can we do that?
Pip
* Re: Some experience with the igc branch
2024-12-22 17:41 ` Pip Cet via Emacs development discussions.
@ 2024-12-22 17:56 ` Gerd Möllmann
2024-12-22 19:11 ` Óscar Fuentes
` (2 subsequent siblings)
3 siblings, 0 replies; 91+ messages in thread
From: Gerd Möllmann @ 2024-12-22 17:56 UTC (permalink / raw)
To: Pip Cet; +Cc: Óscar Fuentes, emacs-devel, Helmut Eller, Andrea Corallo
Pip Cet <pipcet@protonmail.com> writes:
> 3. bytecode stack marking. That comment raises my red-flag alert,
> because it sounds like we're just accepting a preventable crash at this
> stage rather than wanting to do anything about it. The reality, of
> course, is different, but I'd be happier if we refused to create a byte
> code object that intends to use more stack than we can guarantee we
> would scan. Can we do that?
>
> Pip
You mean my comment here?
static mps_res_t
scan_bc (mps_ss_t ss, void *start, void *end, void *closure)
{
  MPS_SCAN_BEGIN (ss)
  {
    struct igc_thread_list *t = closure;
    struct bc_thread_state *bc = &t->d.ts->bc;
    igc_assert (start == (void *) bc->stack);
    igc_assert (end == (void *) bc->stack_end);
    /* FIXME/igc: AFAIU the current top frame starts at
       bc->fp->next_stack and has a maximum length that is given by the
       bytecode being executed (COMPILED_STACK_DEPTH).  So, we need to
       scan up to bc->fp->next_stack + that max depth to be safe.  Since
       I don't have that number ATM, I'm using an arbitrary estimate for
       now.

       This must be changed to something better.  Note that Mattias said
       the bc stack marking will be changed in the future.  */
    const size_t HORRIBLE_ESTIMATE = 1024;
    char *scan_end = bc_next_frame (bc->fp);
    scan_end += HORRIBLE_ESTIMATE;
    end = min (end, (void *) scan_end);
    if (end > start)
      IGC_FIX_CALL (ss, scan_ambig (ss, start, end, NULL));
  }
  MPS_SCAN_END (ss);
  return MPS_RES_OK;
}
I never felt like changing the byte code stack, TBH. For reasons :-).
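For illustration only, here is a minimal sketch of the kind of check Pip
asks about: refuse a byte code object whose declared stack depth exceeds
what the scanner is guaranteed to cover. Nothing here is from igc.c or
bytecode.c; lisp_word, SCAN_WINDOW_BYTES, stack_depth_scannable_p and
scan_window_bytes are hypothetical names, and the 1024-byte window simply
mirrors HORRIBLE_ESTIMATE in scan_bc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for Lisp_Object: assumed to be one machine word.  */
typedef void *lisp_word;

/* Hypothetical: number of bytes past the top frame's start that the
   scanner is guaranteed to cover (same value as HORRIBLE_ESTIMATE).  */
enum { SCAN_WINDOW_BYTES = 1024 };

/* Return true if a byte code object declaring DEPTH_WORDS stack slots
   fits entirely inside the guaranteed scan window.  A constructor could
   signal an error instead of creating the object when this is false.  */
static bool
stack_depth_scannable_p (size_t depth_words)
{
  return depth_words <= SCAN_WINDOW_BYTES / sizeof (lisp_word);
}

/* For an object that passed the check, return the exact number of bytes
   the scanner needs to cover, which could replace the fixed estimate.  */
static size_t
scan_window_bytes (size_t depth_words)
{
  assert (stack_depth_scannable_p (depth_words));
  return depth_words * sizeof (lisp_word);
}
```

With such a check at object creation time, an over-deep function would
fail loudly when it is made rather than risk an unscanned reference
later. Whether the real constructor can cheaply enforce this is exactly
the open question above.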
* Re: Some experience with the igc branch
2024-12-22 17:41 ` Pip Cet via Emacs development discussions.
2024-12-22 17:56 ` Gerd Möllmann
@ 2024-12-22 19:11 ` Óscar Fuentes
2024-12-23 0:05 ` Pip Cet via Emacs development discussions.
2024-12-23 6:27 ` Jean Louis
2024-12-22 20:29 ` Helmut Eller
2024-12-22 20:50 ` Gerd Möllmann
3 siblings, 2 replies; 91+ messages in thread
From: Óscar Fuentes @ 2024-12-22 19:11 UTC (permalink / raw)
To: emacs-devel; +Cc: Pip Cet, Gerd Möllmann, Helmut Eller, Andrea Corallo
Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
writes:
>> I suspect that some of those
>> pauses are not related to garbage collection (executing code and moving
>> data also takes time.)
>
> Quite possible. Even if it is GC, please keep in mind that MPS has many
> settings which you can play with, and it can improve things a lot. It's
> not too early to become a fan of the scratch/igc branch, but it is too
> early to reject it for performance reasons. It's a "heads you lose, tails I
> win" situation, I guess.
IIRC MPS is well documented and I can look up those settings, but does
Emacs collect the required info for taking informed decisions?
Anyway, with the setup I'm using for this job it is totally unrealistic to
expect an instant reaction from Emacs; there is too much heavy stuff
kicking in on every keypress.
> 1. The signal issue. I don't have a good way to fix this and make
> everyone happy, but I do have a solution which hasn't caused a crash for
> me in quite a while. It may be good enough.
Inevitably, a few minutes after sending my message Emacs froze after
working flawlessly since you fixed the JSON issue.
Redisplay just stopped while showing the menu, no crash nor infinite
loop, its CPU usage was typical for the repeating timers that my config
creates. Sadly, instead of attaching gdb I tried to wake up Emacs by
sending SIGUSR1 (no effect, as it is the wrong signal, should be
SIGUSR2) and then sent SIGINT by mistake, which terminated the process.
It's very likely that MPS is innocent on this, but I'm happy to apply
and test any stability improvement patch you have and wish to share.
Thanks.
* Re: Some experience with the igc branch
2024-12-22 17:41 ` Pip Cet via Emacs development discussions.
2024-12-22 17:56 ` Gerd Möllmann
2024-12-22 19:11 ` Óscar Fuentes
@ 2024-12-22 20:29 ` Helmut Eller
2024-12-22 20:50 ` Gerd Möllmann
3 siblings, 0 replies; 91+ messages in thread
From: Helmut Eller @ 2024-12-22 20:29 UTC (permalink / raw)
To: Pip Cet; +Cc: Óscar Fuentes, emacs-devel, Gerd Möllmann,
Andrea Corallo
On Sun, Dec 22 2024, Pip Cet wrote:
> 2. no-purespace. Merging that into scratch/igc would help, well, me.
> What do others think?
No objections from me.
> 3. bytecode stack marking. That comment raises my red-flag alert,
> because it sounds like we're just accepting a preventable crash at this
> stage rather than wanting to do anything about it. The reality, of
> course, is different, but I'd be happier if we refused to create a byte
> code object that intends to use more stack than we can guarantee we
> would scan. Can we do that?
Maybe the bytecode engine could handle large stack frames differently
from small stack frames.
For large stack frames we would:
1. initialize the stack frame with NULLs
2. bump the stack pointer
3. now the stack frame is usable
For small stack frames, we would skip step 1 but the GC would always
scan one extra "small frame with maximal length".
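A rough sketch of that scheme, read literally and under simplifying
assumptions: the stack is a flat array of words growing upward, and
lisp_word, SMALL_FRAME_MAX_WORDS, struct bc_stack, push_frame and
scan_limit are all hypothetical names, not the real bytecode engine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for Lisp_Object.  */
typedef void *lisp_word;

/* Hypothetical threshold between "small" and "large" frames.  */
enum { SMALL_FRAME_MAX_WORDS = 128 };

struct bc_stack
{
  lisp_word *top;    /* stack pointer: first free slot */
  lisp_word *limit;  /* one past the last usable slot */
};

/* Reserve a frame of NWORDS slots.  Returns the frame start, or NULL
   if the frame does not fit.  */
static lisp_word *
push_frame (struct bc_stack *s, size_t nwords)
{
  assert (s->top <= s->limit);
  if ((size_t) (s->limit - s->top) < nwords)
    return NULL;

  lisp_word *frame = s->top;
  if (nwords > SMALL_FRAME_MAX_WORDS)
    /* Step 1 (large frames only): clear the slots before they become
       visible below the stack pointer.  */
    for (size_t i = 0; i < nwords; i++)
      frame[i] = NULL;

  s->top += nwords;  /* Step 2: bump the stack pointer.  */
  return frame;      /* Step 3: the frame is usable.  */
}

/* End of the region the collector would scan: everything below the
   stack pointer plus one small frame of maximal length, clamped to the
   end of the stack.  */
static lisp_word *
scan_limit (const struct bc_stack *s)
{
  size_t room = (size_t) (s->limit - s->top);
  size_t extra = room < SMALL_FRAME_MAX_WORDS ? room : SMALL_FRAME_MAX_WORDS;
  return s->top + extra;
}
```

The trade-off in this model is that small frames pay nothing at call
time, while the collector always scans a bounded amount of possibly
stale memory, which an ambiguous scan can tolerate.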
Helmut
* Re: Some experience with the igc branch
2024-12-22 17:41 ` Pip Cet via Emacs development discussions.
` (2 preceding siblings ...)
2024-12-22 20:29 ` Helmut Eller
@ 2024-12-22 20:50 ` Gerd Möllmann
2024-12-22 22:26 ` Pip Cet via Emacs development discussions.
3 siblings, 1 reply; 91+ messages in thread
From: Gerd Möllmann @ 2024-12-22 20:50 UTC (permalink / raw)
To: Pip Cet; +Cc: Óscar Fuentes, emacs-devel, Helmut Eller, Andrea Corallo
Pip Cet <pipcet@protonmail.com> writes:
> Óscar Fuentes <ofv@wanadoo.es> writes:
>> With igc the pauses are still there, but they are much shorter and
>> predictable, they no longer distract me from thinking on what I'm
>> writing, which is a huge improvement. I suspect that some of those
>> pauses are not related to garbage collection (executing code and moving
>> data also takes time.)
>
> Quite possible. Even if it is GC, please keep in mind that MPS has many
> settings which you can play with, and it can improve things a lot. It's
> not too early to become a fan of the scratch/igc branch, but it is too
> early to reject it for performance reasons. It's a "heads you lose, tails I
> win" situation, I guess.
>
>> TL/DR: now I enjoy using Emacs with this setup and I'm no longer tempted
>> to switch to other editors for this type of work.
>
> I think this is an important point: ultimately, it's about having daily
> drivers. We need to remove the remaining impediments for that:
>
> 1. The signal issue. I don't have a good way to fix this and make
> everyone happy, but I do have a solution which hasn't caused a crash for
> me in quite a while. It may be good enough.
TBH, I'd have put it in already.
> 2. no-purespace. Merging that into scratch/igc would help, well, me.
> What do others think?
Doesn't affect me much.
* Re: Some experience with the igc branch
2024-12-22 20:50 ` Gerd Möllmann
@ 2024-12-22 22:26 ` Pip Cet via Emacs development discussions.
2024-12-23 3:23 ` Gerd Möllmann
2024-12-23 13:35 ` Some experience with the igc branch Eli Zaretskii
0 siblings, 2 replies; 91+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-22 22:26 UTC (permalink / raw)
To: Gerd Möllmann
Cc: Óscar Fuentes, emacs-devel, Helmut Eller, Andrea Corallo
Gerd Möllmann <gerd.moellmann@gmail.com> writes:
> Pip Cet <pipcet@protonmail.com> writes:
>
>> Óscar Fuentes <ofv@wanadoo.es> writes:
>>> With igc the pauses are still there, but they are much shorter and
>>> predictable, they no longer distract me from thinking on what I'm
>>> writing, which is a huge improvement. I suspect that some of those
>>> pauses are not related to garbage collection (executing code and moving
>>> data also takes time.)
>>
>> Quite possible. Even if it is GC, please keep in mind that MPS has many
>> settings which you can play with, and it can improve things a lot. It's
>> not too early to become a fan of the scratch/igc branch, but it is too
>> early to reject it for performance reasons. It's a "heads you lose, tails I
>> win" situation, I guess.
>>
>>> TL/DR: now I enjoy using Emacs with this setup and I'm no longer tempted
>>> to switch to other editors for this type of work.
>>
>> I think this is an important point: ultimately, it's about having daily
>> drivers. We need to remove the remaining impediments for that:
>>
>> 1. The signal issue. I don't have a good way to fix this and make
>> everyone happy, but I do have a solution which hasn't caused a crash for
>> me in quite a while. It may be good enough.
>
> TBH, I'd have put it in already.
Pushed it now. It is imperfect, but better than crashing.
>> 2. no-purespace. Merging that into scratch/igc would help, well, me.
>> What do others think?
>
> Doesn't affect me much.
Well, it does cause some noise, so I thought I'd ask first.
Pip
* Re: Some experience with the igc branch
2024-12-22 19:11 ` Óscar Fuentes
@ 2024-12-23 0:05 ` Pip Cet via Emacs development discussions.
2024-12-23 1:00 ` Óscar Fuentes
2024-12-23 3:42 ` Some experience with the igc branch Gerd Möllmann
2024-12-23 6:27 ` Jean Louis
1 sibling, 2 replies; 91+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-23 0:05 UTC (permalink / raw)
To: Óscar Fuentes
Cc: emacs-devel, Gerd Möllmann, Helmut Eller, Andrea Corallo
Óscar Fuentes <ofv@wanadoo.es> writes:
> Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
> writes:
>
>>> I suspect that some of those
>>> pauses are not related to garbage collection (executing code and moving
>>> data also takes time.)
>>
>> Quite possible. Even if it is GC, please keep in mind that MPS has many
>> settings which you can play with, and it can improve things a lot. It's
>> not too early to become a fan of the scratch/igc branch, but it is too
>> early to reject it for performance reasons. It's a "heads you lose, tails I
>> win" situation, I guess.
>
> IIRC MPS is well documented and I can look up those settings, but does
> Emacs collect the required info for taking informed decisions?
Not that I'm aware of, at this point.
>> 1. The signal issue. I don't have a good way to fix this and make
>> everyone happy, but I do have a solution which hasn't caused a crash for
>> me in quite a while. It may be good enough.
>
> Inevitably, a few minutes after sending my message Emacs froze after
> working flawlessly since you fixed the JSON issue.
Sorry to hear it, and thanks for letting us know! If it happens again,
any additional information you can provide would be very helpful.
> Redisplay just stopped while showing the menu, no crash nor infinite
> loop, its CPU usage was typical for the repeating timers that my config
> creates.
That's a bit odd. It might be the signal issue, but that's purely a
guess. If it happens again, please let us know.
Which windowing system are you using, and how are you displaying menus,
though?
> It's very likely that MPS is innocent on this, but I'm happy to apply
> and test any stability improvement patch you have and wish to share.
I just pushed the temporary fix for the signal issue, which should
improve stability.
Pip
* Re: Some experience with the igc branch
2024-12-23 0:05 ` Pip Cet via Emacs development discussions.
@ 2024-12-23 1:00 ` Óscar Fuentes
2024-12-24 22:34 ` Pip Cet via Emacs development discussions.
2024-12-23 3:42 ` Some experience with the igc branch Gerd Möllmann
1 sibling, 1 reply; 91+ messages in thread
From: Óscar Fuentes @ 2024-12-23 1:00 UTC (permalink / raw)
To: Pip Cet; +Cc: emacs-devel, Gerd Möllmann, Helmut Eller, Andrea Corallo
Pip Cet <pipcet@protonmail.com> writes:
>> Redisplay just stopped while showing the menu, no crash nor infinite
>> loop, its CPU usage was typical for the repeating timers that my config
>> creates.
>
> That's a bit odd. It might be the signal issue, but that's purely a
> guess. If it happens again, please let us know.
Sure.
> Which windowing system are you using, and how are you displaying menus,
> though?
Configured using:
'configure CPPFLAGS=-I/home/oscar/dev/include/mps
LDFLAGS=-L/home/oscar/dev/other/mps/code --with-native-compilation
--with-tree-sitter --without-toolkit-scroll-bars --with-x-toolkit=lucid
--with-modules --without-imagemagick --with-mps=yes'
Configured features:
CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GSETTINGS HARFBUZZ JPEG LIBOTF
LIBSELINUX LIBXML2 MODULES MPS NATIVE_COMP NOTIFY INOTIFY PDUMPER PNG
SECCOMP SOUND SQLITE3 THREADS TIFF TREE_SITTER WEBP X11 XAW3D XDBE XIM
XINPUT2 XPM LUCID ZLIB
I have the menubar disabled (menu-bar-mode -1) and use a custom command
to open it:
(defun my-menu-bar-open-after ()
  (remove-hook 'pre-command-hook 'my-menu-bar-open-after)
  (when (eq menu-bar-mode 42)
    (menu-bar-mode -1)))

(defun my-menu-bar-open (&rest args)
  (interactive)
  (let ((open menu-bar-mode))
    (unless open
      (menu-bar-mode 1))
    (funcall 'menu-bar-open args)
    (unless open
      (setq menu-bar-mode 42)
      (add-hook 'pre-command-hook 'my-menu-bar-open-after))))

(global-set-key [f10] 'my-menu-bar-open)
On that same session I used the command multiple times.
>> It's very likely that MPS is innocent on this, but I'm happy to apply
>> and test any stability improvement patch you have and wish to share.
>
> I just pushed the temporary fix for the signal issue, which should
> improve stability.
Emacs is already running here with that commit. Thanks!
* Re: Some experience with the igc branch
2024-12-22 22:26 ` Pip Cet via Emacs development discussions.
@ 2024-12-23 3:23 ` Gerd Möllmann
[not found] ` <m234ieddeu.fsf_-_@gmail.com>
2024-12-23 13:35 ` Some experience with the igc branch Eli Zaretskii
1 sibling, 1 reply; 91+ messages in thread
From: Gerd Möllmann @ 2024-12-23 3:23 UTC (permalink / raw)
To: Pip Cet; +Cc: Óscar Fuentes, emacs-devel, Helmut Eller, Andrea Corallo
Pip Cet <pipcet@protonmail.com> writes:
> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
>
>> Pip Cet <pipcet@protonmail.com> writes:
>>
>>> Óscar Fuentes <ofv@wanadoo.es> writes:
>>>> With igc the pauses are still there, but they are much shorter and
>>>> predictable, they no longer distract me from thinking on what I'm
>>>> writing, which is a huge improvement. I suspect that some of those
>>>> pauses are not related to garbage collection (executing code and moving
>>>> data also takes time.)
>>>
>>> Quite possible. Even if it is GC, please keep in mind that MPS has many
>>> settings which you can play with, and it can improve things a lot. It's
>>> not too early to become a fan of the scratch/igc branch, but it is too
>>> early to reject it for performance reasons. It's a "heads you lose, tails I
>>> win" situation, I guess.
>>>
>>>> TL/DR: now I enjoy using Emacs with this setup and I'm no longer tempted
>>>> to switch to other editors for this type of work.
>>>
>>> I think this is an important point: ultimately, it's about having daily
>>> drivers. We need to remove the remaining impediments for that:
>>>
>>> 1. The signal issue. I don't have a good way to fix this and make
>>> everyone happy, but I do have a solution which hasn't caused a crash for
>>> me in quite a while. It may be good enough.
>>
>> TBH, I'd have put it in already.
>
> Pushed it now. It is imperfect, but better than crashing.
100%. Thanks!
>
>>> 2. no-purespace. Merging that into scratch/igc would help, well, me.
>>> What do others think?
>>
>> Doesn't affect me much.
>
> Well, it does cause some noise, so I thought I'd ask first.
>
> Pip
That's nice of you.
* Re: Some experience with the igc branch
2024-12-23 0:05 ` Pip Cet via Emacs development discussions.
2024-12-23 1:00 ` Óscar Fuentes
@ 2024-12-23 3:42 ` Gerd Möllmann
1 sibling, 0 replies; 91+ messages in thread
From: Gerd Möllmann @ 2024-12-23 3:42 UTC (permalink / raw)
To: Pip Cet; +Cc: Óscar Fuentes, emacs-devel, Helmut Eller, Andrea Corallo
Pip Cet <pipcet@protonmail.com> writes:
> Óscar Fuentes <ofv@wanadoo.es> writes:
>
>> Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
>> writes:
>>
>>>> I suspect that some of those
>>>> pauses are not related to garbage collection (executing code and moving
>>>> data also takes time.)
>>>
>>> Quite possible. Even if it is GC, please keep in mind that MPS has many
>>> settings which you can play with, and it can improve things a lot. It's
>>> not too early to become a fan of the scratch/igc branch, but it is too
>>> early to reject it for performance reasons. It's a "heads you lose, tails I
>>> win" situation, I guess.
>>
>> IIRC MPS is well documented and I can look up those settings, but does
>> Emacs collect the required info for taking informed decisions?
>
> Not that I'm aware of, at this point.
Me neither.
(And, at least for me personally, "interactive performance", i.e. the
impression a user gets when he's using Emacs interactively, is the only
interesting part. That's difficult to measure of course. I don't care
much about performance improvements I don't notice :-)).
>
>>> 1. The signal issue. I don't have a good way to fix this and make
>>> everyone happy, but I do have a solution which hasn't caused a crash for
>>> me in quite a while. It may be good enough.
>>
>> Inevitably, a few minutes after sending my message Emacs froze after
>> working flawlessly since you fixed the JSON issue.
>
> Sorry to hear it, and thanks for letting us know! If it happens again,
> any additional information you can provide would be very helpful.
>
>> Redisplay just stopped while showing the menu, no crash nor infinite
>> loop, its CPU usage was typical for the repeating timers that my config
>> creates.
>
> That's a bit odd. It might be the signal issue, but that's purely a
> guess. If it happens again, please let us know.
>
> Which windowing system are you using, and how are you displaying menus,
> though?
Yes. I don't think I've ever seen a freeze caused by igc here. It always
was crashes. But one never knows, of course.
* Re: Some experience with the igc branch
2024-12-22 19:11 ` Óscar Fuentes
2024-12-23 0:05 ` Pip Cet via Emacs development discussions.
@ 2024-12-23 6:27 ` Jean Louis
1 sibling, 0 replies; 91+ messages in thread
From: Jean Louis @ 2024-12-23 6:27 UTC (permalink / raw)
To: Óscar Fuentes
Cc: emacs-devel, Pip Cet, Gerd Möllmann, Helmut Eller,
Andrea Corallo
* Óscar Fuentes <ofv@wanadoo.es> [2024-12-22 22:13]:
> > 1. The signal issue. I don't have a good way to fix this and make
> > everyone happy, but I do have a solution which hasn't caused a crash for
> > me in quite a while. It may be good enough.
>
> Inevitably, a few minutes after sending my message Emacs froze after
> working flawlessly since you fixed the JSON issue.
>
> Redisplay just stopped while showing the menu, no crash nor infinite
> loop, its CPU usage was typical for the repeating timers that my config
> creates. Sadly, instead of attaching gdb I tried to wake up Emacs by
> sending SIGUSR1 (no effect, as it is the wrong signal, should be
> SIGUSR2) and then sent SIGINT by mistake, which terminated the process.
>
> It's very likely that MPS is innocent on this, but I'm happy to apply
> and test any stability improvement patch you have and wish to share.
I used that branch for a longer time, but as a heavy daily user of
Emacs for serious business, I cannot use it; it is not yet stable.
I have already sent my reasons to this list. I have had no issues
since I switched back to standard Emacs.
The worst part was the ghostly appearance of words and characters
that I didn't type, and the total scrambling of characters that I did type.
--
Jean Louis
* Re: Some experience with the igc branch
2024-12-22 22:26 ` Pip Cet via Emacs development discussions.
2024-12-23 3:23 ` Gerd Möllmann
@ 2024-12-23 13:35 ` Eli Zaretskii
2024-12-23 14:03 ` Discussion with MPS people Gerd Möllmann
2024-12-23 15:07 ` Some experience with the igc branch Pip Cet via Emacs development discussions.
1 sibling, 2 replies; 91+ messages in thread
From: Eli Zaretskii @ 2024-12-23 13:35 UTC (permalink / raw)
To: Pip Cet; +Cc: gerd.moellmann, ofv, emacs-devel, eller.helmut, acorallo
> Date: Sun, 22 Dec 2024 22:26:11 +0000
> Cc: Óscar Fuentes <ofv@wanadoo.es>, emacs-devel@gnu.org,
> Helmut Eller <eller.helmut@gmail.com>, Andrea Corallo <acorallo@gnu.org>
> From: Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
>
> >> 1. The signal issue. I don't have a good way to fix this and make
> >> everyone happy, but I do have a solution which hasn't caused a crash for
> >> me in quite a while. It may be good enough.
> >
> > TBH, I'd have put it in already.
>
> Pushed it now. It is imperfect, but better than crashing.
Why didn't we discuss this with MPS folks? A program can legitimately
call some code from a signal handler, so the limitations that MPS
seems to impose now are not very reasonable. Maybe we are missing
some feature, or maybe the MPS folks will agree to extend the library
to provide better support for programs that use signals. E.g., AFAIU
with this code installed, we are limiting our profiler too much (it
will never report GC, IIRC?). I think igc_busy_p returns non-zero in
too many situations where delivering signals could not possibly cause
harm, like during object allocation, AFAIR. According to
documentation, that function is not intended for this kind of purpose.
IOW, we had discussions about this which never concluded anything, and
we should pick up where we left off and solve this problem.
We should definitely try improving this before we land the branch on
master. We shouldn't consider this solution "good enough", but just a
temporary kludge meant to avoid too frequent crashes.
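To make the trade-off concrete, here is a minimal sketch of the general
"defer while busy" pattern, assuming that is roughly the shape of the
temporary fix. It is not the code that was pushed: collector_busy is a
hypothetical stand-in for whatever igc_busy_p reports, and on_signal,
real_work, install_handler and leave_collector are invented for the
example. It uses only standard C signal facilities.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* Hypothetical stand-in for the condition igc_busy_p () tests.  */
static volatile sig_atomic_t collector_busy;

/* Signal number remembered while the collector was busy, or 0.  */
static volatile sig_atomic_t pending_signal;

/* Counts how often the real handler body ran (for demonstration).  */
static volatile sig_atomic_t handled_count;

/* The work the handler really wants to do, e.g. take a profiler sample.  */
static void
real_work (int sig)
{
  (void) sig;
  handled_count++;
}

/* Handler: run the work now if it is safe, otherwise only record that
   the signal arrived.  Anything deferred here is invisible to a
   profiler until the collector is left, which is the cost Eli notes.  */
static void
on_signal (int sig)
{
  if (collector_busy)
    pending_signal = sig;
  else
    real_work (sig);
}

/* Install the handler for SIGTERM; returns true on success.  */
static bool
install_handler (void)
{
  return signal (SIGTERM, on_signal) != SIG_ERR;
}

/* Called when the collector stops being busy: run deferred work.  */
static void
leave_collector (void)
{
  assert (collector_busy);
  collector_busy = 0;
  if (pending_signal)
    {
      int sig = pending_signal;
      pending_signal = 0;
      real_work (sig);
    }
}
```

The coarser the busy predicate, the more signals land in the deferred
path, so narrowing when collector_busy is set matters as much as the
mechanism itself.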
* Discussion with MPS people
2024-12-23 13:35 ` Some experience with the igc branch Eli Zaretskii
@ 2024-12-23 14:03 ` Gerd Möllmann
2024-12-23 14:04 ` Gerd Möllmann
2024-12-23 15:07 ` Some experience with the igc branch Pip Cet via Emacs development discussions.
1 sibling, 1 reply; 91+ messages in thread
From: Gerd Möllmann @ 2024-12-23 14:03 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: Pip Cet, ofv, emacs-devel, eller.helmut, acorallo
Eli Zaretskii <eliz@gnu.org> writes:
> Why didn't we discuss this with MPS folks?
That would be a good thing. Maybe we can get at least some initial
contact going.
I've CC'd Richard Brooksby, who is one of the main people behind MPS.
(AFAIU; sorry Richard if I under-represent your role.) I've seen that he
recently answered on the bug list, so maybe he's interested in helping
us.
@Richard:
Eli is an Emacs co-maintainer, and this mail goes in CC to the
emacs-devel mailing list. Pip Cet has taken over further development of
the branch scratch/igc, which contains a GC for Emacs based on MPS.
Helmut Eller is basically the third person doing important work on
scratch/igc in the past.
* Re: Discussion with MPS people
2024-12-23 14:03 ` Discussion with MPS people Gerd Möllmann
@ 2024-12-23 14:04 ` Gerd Möllmann
0 siblings, 0 replies; 91+ messages in thread
From: Gerd Möllmann @ 2024-12-23 14:04 UTC (permalink / raw)
To: Eli Zaretskii
Cc: Pip Cet, ofv, emacs-devel, eller.helmut, acorallo,
Richard Brooksby
Gerd Möllmann <gerd.moellmann@gmail.com> writes:
> Eli Zaretskii <eliz@gnu.org> writes:
>
>> Why didn't we discuss this with MPS folks?
>
> That would be a good thing. Maybe we can get at least some initial
> contact going.
>
> I've CC'd Richard Brooksby, who is one of the main people behind MPS.
> (AFAIU; sorry Richard if I under-represent your role.) I've seen that he
> recently answered on the bug list, so maybe he's interested in helping
> us.
>
> @Richard:
>
> Eli is an Emacs co-maintainer, and this mail goes in CC to the
> emacs-devel mailing list. Pip Cet has taken over further development of
> the branch scratch/igc, which contains a GC for Emacs based on MPS.
> Helmut Eller is basically the third person doing important work on
> scratch/igc in the past.
And of course I've forgotten to actually add Richard in CC. Now done.
* Re: Make Signal handling patch platform-dependent?
[not found] ` <m2bjx2h8dh.fsf@gmail.com>
@ 2024-12-23 14:45 ` Pip Cet via Emacs development discussions.
2024-12-23 14:54 ` Gerd Möllmann
0 siblings, 1 reply; 91+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-23 14:45 UTC (permalink / raw)
To: Gerd Möllmann
Cc: Óscar Fuentes, emacs-devel, Helmut Eller, Andrea Corallo
Gerd Möllmann <gerd.moellmann@gmail.com> writes:
> Pip Cet <pipcet@protonmail.com> writes:
>
>> And, who knows, using a separate thread might help (debugging, not
>> performance).
>
> Yeah, more long-term goals, I'd guess. I'm glad we're moving forward,
> ATM :-).
If we come up with a solution to the signal issue which works but
requires the creation of extra threads, would that prevent the merge?
>> The rest of this email is about a half-baked idea to perform dry-run
>> background GCs to facilitate debugging. It's tantalizingly close to
>> offering performance benefits, but doesn't quite get there, and it
>> doesn't have to: it'd help us detect leaked references to objects, and
>> that's all it needs to do.
>>
>> I'm still thinking about double-mapping MPS segments so one thread can
>> scan them using a "privileged" mapping while the barrier is in place for
>> the ordinary mapping and prevents access to that segment.
>
> Quick question upfront; I'll have to think longer about the rest, and
> maybe try to find existing examples: the double-mapping. How would that
> be done? I know about page-table manipulation, but I don't think it's
> easily doable, at least not on macOS. What would you use for
> double-mapping?
My understanding is the two options are SysV shm* (clunky) or mmapping a
file handle corresponding to an already-deleted file, twice (some risk
the OS will synchronize the file to disk, maybe even page it out; also
might count towards the disk quota).
I prefer the latter because I wouldn't actually delete the file, which
would give us a snapshot of the MPS heap in the event of a crash. If
that isn't enough, we could explicitly snapshot the file once in a
while, before moving objects, giving us the ability to detect where an
object moved to. (If THAT isn't enough, we'd have two additional
options: either hack MPS not to reuse virtual addresses unless it really
has to, or store the file on a fully journaled file system allowing us
to time-travel through the MPS heap.) Also, I've never used shm*.
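For illustration, here is a minimal sketch of the file-backed variant, assuming POSIX mmap and mprotect. The double_map_* names are made up for this example and are not part of MPS or igc.c; the temporary path under /tmp is likewise an assumption. It maps one file twice, writes through the "privileged" view, and shows that protecting the ordinary view leaves the privileged one usable:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Two views of the same pages.  Hypothetical type, not MPS API.  */
struct double_map
{
  void *privileged;  /* read/write view, e.g. for a scanning thread */
  void *ordinary;    /* view whose protection plays the barrier role */
  size_t size;
  int fd;
};

/* Map SIZE bytes of a temporary file twice.  Return 0 on success.  */
static int
double_map_create (struct double_map *dm, size_t size)
{
  char path[] = "/tmp/double-map-XXXXXX";
  int fd = mkstemp (path);
  if (fd < 0)
    return -1;
  /* Unlinking gives the "already-deleted file" variant; skip this call
     to keep the file around as a post-mortem snapshot.  */
  unlink (path);
  if (ftruncate (fd, (off_t) size) != 0)
    {
      close (fd);
      return -1;
    }
  dm->privileged = mmap (NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  dm->ordinary = mmap (NULL, size, PROT_READ, MAP_SHARED, fd, 0);
  if (dm->privileged == MAP_FAILED || dm->ordinary == MAP_FAILED)
    {
      if (dm->privileged != MAP_FAILED)
        munmap (dm->privileged, size);
      if (dm->ordinary != MAP_FAILED)
        munmap (dm->ordinary, size);
      close (fd);
      return -1;
    }
  dm->size = size;
  dm->fd = fd;
  return 0;
}

static void
double_map_destroy (struct double_map *dm)
{
  munmap (dm->privileged, dm->size);
  munmap (dm->ordinary, dm->size);
  close (dm->fd);
}

/* Return 0 if both views alias the same memory and the privileged view
   stays writable while the ordinary view is protected.  */
int
double_map_demo (void)
{
  struct double_map dm;
  size_t size = (size_t) sysconf (_SC_PAGESIZE);
  if (double_map_create (&dm, size) != 0)
    return -1;
  strcpy (dm.privileged, "hello");
  int aliased = strcmp (dm.ordinary, "hello") == 0;
  /* Raise the "barrier" on the ordinary view only.  */
  int protected_ok = mprotect (dm.ordinary, size, PROT_NONE) == 0;
  ((char *) dm.privileged)[0] = 'J';
  int still_writable = ((char *) dm.privileged)[0] == 'J';
  double_map_destroy (&dm);
  return aliased && protected_ok && still_writable ? 0 : -1;
}
```

Calling double_map_demo () from a test program is enough to try it. Whether MPS could be made to place its segments in such a mapping is a separate question that this sketch does not address.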
If there's a third option, it'd be great to learn about it.
Needless to say, double-mapping doubles the VM size, which is limited on
32-bit systems.
I don't think virtually-indexed caches are a thing anymore (if the cache
doesn't recognize two VAs correspond to the same PA, well, great fun
ensues).
IIUC, some aarch64 systems, but not those usually running macOS, have
weak cache coherency, and as double-mapping is a valid but rare thing to
do, who knows what would happen.
Pip
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Make Signal handling patch platform-dependent?
2024-12-23 14:45 ` Make Signal handling patch platform-dependent? Pip Cet via Emacs development discussions.
@ 2024-12-23 14:54 ` Gerd Möllmann
2024-12-23 15:11 ` Eli Zaretskii
0 siblings, 1 reply; 91+ messages in thread
From: Gerd Möllmann @ 2024-12-23 14:54 UTC (permalink / raw)
To: Pip Cet; +Cc: Óscar Fuentes, emacs-devel, Helmut Eller, Andrea Corallo
Pip Cet <pipcet@protonmail.com> writes:
> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
>
>> Pip Cet <pipcet@protonmail.com> writes:
>>
>>> And, who knows, using a separate thread might help (debugging, not
>>> performance).
>>
>> Yeah, more long-term goals, I'd guess. I'm glad we're moving forward,
>> ATM :-).
>
> If we come up with a solution to the signal issue which works but
> requires the creation of extra threads, would that prevent the merge?
I think that's for Eli to answer.
>>> The rest of this email is about a half-baked idea to perform dry-run
>>> background GCs to facilitate debugging. It's tantalizingly close to
>>> offering performance benefits, but doesn't quite get there, and it
>>> doesn't have to: it'd help us detect leaked references to objects, and
>>> that's all it needs to do.
>>>
>>> I'm still thinking about double-mapping MPS segments so one thread can
>>> scan them using a "privileged" mapping while the barrier is in place for
>>> the ordinary mapping and prevents access to that segment.
>>
>> Quick question upfront; I'll have to think longer about the rest, and
>> maybe try to find existing examples: the double-mapping. How would that
>> be done? I know about page-table manipulation, but I don't think it's
>> easily doable, at least not on macOS. What would you use for
>> double-mapping?
>
> My understanding is the two options are SysV shm* (clunky) or mmapping a
> file handle corresponding to an already-deleted file, twice (some risk
> the OS will synchronize the file to disk, maybe even page it out; also
> might count towards the disk quota).
>
> I prefer the latter because I wouldn't actually delete the file, which
> would give us a snapshot of the MPS heap in the event of a crash. If
> that isn't enough, we could explicitly snapshot the file once in a
> while, before moving objects, giving us the ability to detect where an
> object moved to. (If THAT isn't enough, we'd have two additional
> options: either hack MPS not to reuse virtual addresses unless it really
> has to, or store the file on a fully journaled file system allowing us
> to time-travel through the MPS heap.) Also, I've never used shm*.
>
> If there's a third option, it'd be great to learn about it.
I don't know of a third option.
>
> Needless to say, double-mapping doubles the VM size, which is limited on
> 32-bit systems.
>
> I don't think virtually-indexed caches are a thing anymore (if the cache
> doesn't recognize two VAs correspond to the same PA, well, great fun
> ensues).
>
> IIUC, some aarch64 systems, but not those usually running macOS, have
> weak cache coherency, and as double-mapping is a valid but rare thing to
> do, who knows what would happen.
>
> Pip
Thanks so far!
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-23 13:35 ` Some experience with the igc branch Eli Zaretskii
2024-12-23 14:03 ` Discussion with MPS people Gerd Möllmann
@ 2024-12-23 15:07 ` Pip Cet via Emacs development discussions.
2024-12-23 15:26 ` Gerd Möllmann
1 sibling, 1 reply; 91+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-23 15:07 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: gerd.moellmann, ofv, emacs-devel, eller.helmut, acorallo
"Eli Zaretskii" <eliz@gnu.org> writes:
>> Date: Sun, 22 Dec 2024 22:26:11 +0000
>> Cc: Óscar Fuentes <ofv@wanadoo.es>, emacs-devel@gnu.org,
>> Helmut Eller <eller.helmut@gmail.com>, Andrea Corallo <acorallo@gnu.org>
>> From: Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
>>
>> >> 1. The signal issue. I don't have a good way to fix this and make
>> >> everyone happy, but I do have a solution which hasn't caused a crash for
>> >> me in quite a while. It may be good enough.
>> >
>> > TBH, I'd have put it in already.
>>
>> Pushed it now. It is imperfect, but better than crashing.
>
> Why didn't we discuss this with MPS folks? A program can legitimately
Because...
> call some code from a signal handler, so the limitations that MPS
> seems to impose now are not very reasonable. Maybe we are missing
...if they were interested, maybe they've read this or some other
blanket accusation of being "unreasonable", and became uninterested
quickly. I know I would.
> some feature, or maybe the MPS folks will agree to extend the library
> to provide better support for programs that use signals. E.g., AFAIU
> with this code installed, we are limiting our profiler too much (it
> will never report GC, IIRC?). I think igc_busy_p returns non-zero in
> too many situations where delivering signals could not possibly cause
> harm, like during object allocation, AFAIR. According to
> documentation, that function is not intended for this kind of purpose.
>
> IOW, we had discussions about this which never concluded anything, and
> we should pick up where we left off and solve this problem.
I have a different idea using a separate allocation thread (for the slow
path only, of course). Would that be potentially acceptable?
It would limit MPS to systems providing a working stdatomic.h header, and
in practice also require some sort of working (and reasonably fast)
inter-thread signalling (though I suspect it'd be faster to run both
threads on the same core, since it's a handover rather than a
parallelism situation). That excludes very few systems these days
(sorry, MS-DOS).
I'll spare you most of the details for now, but having read the mps
header, MPS allocation is not safe to use from separate threads without
locking the AP (or having per-thread APs), which we might end up doing
on Windows, IIRC. I'd rather give those (potential) issues a wide
berth. Also, by the campsite rule, merging MPS shouldn't make it harder
to move in the direction of multi-threaded Emacs.
Better debugging (which I agree with you is something we need to
improve), no MPS modification. Performance implications TBD.
> We should definitely try improving this before we land the branch on
> master. We shouldn't consider this solution "good enough", but just a
> temporary kludge meant to avoid too frequent crashes.
Agreed.
Pip
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Make Signal handling patch platform-dependent?
2024-12-23 14:54 ` Gerd Möllmann
@ 2024-12-23 15:11 ` Eli Zaretskii
0 siblings, 0 replies; 91+ messages in thread
From: Eli Zaretskii @ 2024-12-23 15:11 UTC (permalink / raw)
To: Gerd Möllmann; +Cc: pipcet, ofv, emacs-devel, eller.helmut, acorallo
> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: Óscar Fuentes <ofv@wanadoo.es>, emacs-devel@gnu.org,
> Helmut Eller <eller.helmut@gmail.com>, Andrea Corallo <acorallo@gnu.org>
> Date: Mon, 23 Dec 2024 15:54:36 +0100
>
> Pip Cet <pipcet@protonmail.com> writes:
>
> > If we come up with a solution to the signal issue which works but
> > requires the creation of extra threads, would that prevent the merge?
>
> I think that's for Eli to answer.
I don't see why extra threads would be a problem, as long as they
don't use the Lisp machine or any parts of the global state. We
already have several threads on MS-Windows, including for emulating
Posix signals, so I don't see why adding C threads on Posix systems
would be "verboten".
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-23 15:07 ` Some experience with the igc branch Pip Cet via Emacs development discussions.
@ 2024-12-23 15:26 ` Gerd Möllmann
2024-12-23 16:03 ` Pip Cet via Emacs development discussions.
0 siblings, 1 reply; 91+ messages in thread
From: Gerd Möllmann @ 2024-12-23 15:26 UTC (permalink / raw)
To: Pip Cet; +Cc: Eli Zaretskii, ofv, emacs-devel, eller.helmut, acorallo
Pip Cet <pipcet@protonmail.com> writes:
> I'll spare you most of the details for now, but having read the mps
> header, MPS allocation is not safe to use from separate threads without
> locking the AP (or having per-thread APs), which we might end up doing
> on Windows, IIRC.
Now I'm confused. We're using thread allocation points. See
create_thread_aps, thread_ap, and so on.
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-23 15:26 ` Gerd Möllmann
@ 2024-12-23 16:03 ` Pip Cet via Emacs development discussions.
2024-12-23 16:44 ` Eli Zaretskii
2024-12-23 17:44 ` Gerd Möllmann
0 siblings, 2 replies; 91+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-23 16:03 UTC (permalink / raw)
To: Gerd Möllmann
Cc: Eli Zaretskii, ofv, emacs-devel, eller.helmut, acorallo
Gerd Möllmann <gerd.moellmann@gmail.com> writes:
> Pip Cet <pipcet@protonmail.com> writes:
>
>> I'll spare you most of the details for now, but having read the mps
>> header, MPS allocation is not safe to use from separate threads without
>> locking the AP (or having per-thread APs), which we might end up doing
>> on Windows, IIRC.
>
> Now I'm confused. We're using thread allocation points. See
> create_thread_aps, thread_ap, and so on.
I was confused. This is only a problem if we allocate memory from a
signal handler, which is effectively sharing the per-thread structure.
(I'm still confused. My patch worked on the first attempt, which my code
never does. I suspect that while I made a mistake, it caused a subtle
bug rather than an obvious one.)
And we don't want to allocate memory from signal handlers, right? We
could, now (see warnings below):
diff --git a/src/igc.c b/src/igc.c
index eb72406e529..14ecc30f982 100644
--- a/src/igc.c
+++ b/src/igc.c
@@ -747,19 +747,41 @@ IGC_DEFINE_LIST (igc_root);
/* Registry entry for an MPS thread mps_thr_t. */
+#include <pthread.h>
+#include <stdatomic.h>
+
+struct emacs_ap
+{
+ mps_ap_t mps_ap;
+ struct igc *gc;
+ pthread_t allocation_thread;
+ atomic_uintptr_t usable_memory;
+ atomic_uintptr_t usable_bytes;
+
+ atomic_uintptr_t waiting_threads;
+ atomic_uintptr_t requested_bytes;
+ atomic_intptr_t requested_type;
+};
+
+typedef struct emacs_ap emacs_ap_t;
+
+#if ATOMIC_POINTER_LOCK_FREE != 2
+#error "this probably won't work"
+#endif
+
struct igc_thread
{
struct igc *gc;
mps_thr_t thr;
/* Allocation points for the thread. */
- mps_ap_t dflt_ap;
- mps_ap_t leaf_ap;
- mps_ap_t weak_strong_ap;
- mps_ap_t weak_weak_ap;
- mps_ap_t weak_hash_strong_ap;
- mps_ap_t weak_hash_weak_ap;
- mps_ap_t immovable_ap;
+ emacs_ap_t dflt_ap;
+ emacs_ap_t leaf_ap;
+ emacs_ap_t weak_strong_ap;
+ emacs_ap_t weak_weak_ap;
+ emacs_ap_t weak_hash_strong_ap;
+ emacs_ap_t weak_hash_weak_ap;
+ emacs_ap_t immovable_ap;
/* Quick access to the roots used for specpdl, bytecode stack and
control stack. */
@@ -805,6 +827,8 @@ IGC_DEFINE_LIST (igc_thread);
/* Registered threads. */
struct igc_thread_list *threads;
+
+ pthread_cond_t cond;
};
static bool process_one_message (struct igc *gc);
@@ -2904,8 +2928,84 @@ igc_root_destroy_comp_unit_eph (struct Lisp_Native_Comp_Unit *u)
maybe_destroy_root (&u->data_eph_relocs_root);
}
+static mps_addr_t alloc_impl_raw (size_t size, enum igc_obj_type type, mps_ap_t ap);
+static mps_addr_t alloc_impl (size_t size, enum igc_obj_type type, emacs_ap_t *ap);
+
+static void *igc_allocation_thread (void *ap_v)
+{
+ emacs_ap_t *ap = ap_v;
+ while (true)
+ {
+ if (ap->requested_bytes)
+ {
+ void *p = alloc_impl_raw (ap->requested_bytes, (enum igc_obj_type) ap->requested_type, ap->mps_ap);
+ atomic_store (&ap->usable_memory, (uintptr_t) p);
+ atomic_store (&ap->usable_bytes, ap->requested_bytes);
+ atomic_store (&ap->requested_type, -1);
+ atomic_store (&ap->requested_bytes, 0);
+ }
+ }
+
+ return NULL;
+}
+
+static mps_addr_t alloc_impl (size_t size, enum igc_obj_type type, emacs_ap_t *ap)
+{
+ if (size == 0)
+ return 0;
+ while (true)
+ {
+ uintptr_t other_threads = atomic_fetch_add (&ap->waiting_threads, 1);
+ if (other_threads != 0)
+ {
+ /* we know that the other "thread" is actually on top of us,
+ * and we're a signal handler. Wait, should we even be
+ * allocating memory? We should still eassert that we're the
+ * right thread. */
+ emacs_ap_t saved_state;
+ while (ap->requested_bytes);
+ memcpy (&saved_state, ap, sizeof saved_state);
+ atomic_store (&ap->waiting_threads, 0);
+ mps_addr_t ret = alloc_impl (size, type, ap);
+ atomic_store (&ap->waiting_threads, saved_state.waiting_threads);
+ memcpy (ap, &saved_state, sizeof saved_state);
+ atomic_fetch_add (&ap->waiting_threads, -1);
+ return ret;
+ }
+
+ atomic_store (&ap->requested_type, (uintptr_t) type);
+ atomic_store (&ap->requested_bytes, (uintptr_t) size);
+
+ while (ap->requested_bytes);
+
+ mps_addr_t ret = (mps_addr_t) ap->usable_memory;
+ atomic_fetch_add (&ap->waiting_threads, -1);
+ return ret;
+ }
+}
+
+static mps_res_t emacs_ap_create_k (emacs_ap_t *ap, mps_pool_t pool,
+ mps_arg_s *args)
+{
+ atomic_store(&ap->usable_memory, 0);
+ atomic_store(&ap->usable_bytes, 0);
+ atomic_store(&ap->waiting_threads, 0);
+ atomic_store(&ap->requested_bytes, 0);
+
+ pthread_attr_t thread_attr;
+ pthread_attr_init (&thread_attr);
+ pthread_create(&ap->allocation_thread, &thread_attr, igc_allocation_thread, ap);
+
+ return mps_ap_create_k (&ap->mps_ap, pool, args);
+}
+
+static void emacs_ap_destroy (emacs_ap_t *ap)
+{
+ return;
+}
+
static mps_res_t
-create_weak_ap (mps_ap_t *ap, struct igc_thread *t, bool weak)
+create_weak_ap (emacs_ap_t *ap, struct igc_thread *t, bool weak)
{
struct igc *gc = t->gc;
mps_res_t res;
@@ -2914,14 +3014,14 @@ create_weak_ap (mps_ap_t *ap, struct igc_thread *t, bool weak)
{
MPS_ARGS_ADD (args, MPS_KEY_RANK,
weak ? mps_rank_weak () : mps_rank_exact ());
- res = mps_ap_create_k (ap, pool, args);
+ res = emacs_ap_create_k (ap, pool, args);
}
MPS_ARGS_END (args);
return res;
}
static mps_res_t
-create_weak_hash_ap (mps_ap_t *ap, struct igc_thread *t, bool weak)
+create_weak_hash_ap (emacs_ap_t *ap, struct igc_thread *t, bool weak)
{
struct igc *gc = t->gc;
mps_res_t res;
@@ -2930,7 +3030,7 @@ create_weak_hash_ap (mps_ap_t *ap, struct igc_thread *t, bool weak)
{
MPS_ARGS_ADD (args, MPS_KEY_RANK,
weak ? mps_rank_weak () : mps_rank_exact ());
- res = mps_ap_create_k (ap, pool, args);
+ res = emacs_ap_create_k (ap, pool, args);
}
MPS_ARGS_END (args);
return res;
@@ -2940,12 +3040,15 @@ create_weak_hash_ap (mps_ap_t *ap, struct igc_thread *t, bool weak)
create_thread_aps (struct igc_thread *t)
{
struct igc *gc = t->gc;
+ pthread_condattr_t condattr;
+ pthread_condattr_init (&condattr);
+ pthread_cond_init (&gc->cond, &condattr);
mps_res_t res;
- res = mps_ap_create_k (&t->dflt_ap, gc->dflt_pool, mps_args_none);
+ res = emacs_ap_create_k (&t->dflt_ap, gc->dflt_pool, mps_args_none);
IGC_CHECK_RES (res);
- res = mps_ap_create_k (&t->leaf_ap, gc->leaf_pool, mps_args_none);
+ res = emacs_ap_create_k (&t->leaf_ap, gc->leaf_pool, mps_args_none);
IGC_CHECK_RES (res);
- res = mps_ap_create_k (&t->immovable_ap, gc->immovable_pool, mps_args_none);
+ res = emacs_ap_create_k (&t->immovable_ap, gc->immovable_pool, mps_args_none);
IGC_CHECK_RES (res);
res = create_weak_ap (&t->weak_strong_ap, t, false);
res = create_weak_hash_ap (&t->weak_hash_strong_ap, t, false);
@@ -3007,13 +3110,13 @@ igc_thread_remove (void **pinfo)
destroy_root (&t->d.stack_root);
destroy_root (&t->d.specpdl_root);
destroy_root (&t->d.bc_root);
- mps_ap_destroy (t->d.dflt_ap);
- mps_ap_destroy (t->d.leaf_ap);
- mps_ap_destroy (t->d.weak_strong_ap);
- mps_ap_destroy (t->d.weak_weak_ap);
- mps_ap_destroy (t->d.weak_hash_strong_ap);
- mps_ap_destroy (t->d.weak_hash_weak_ap);
- mps_ap_destroy (t->d.immovable_ap);
+ emacs_ap_destroy (&t->d.dflt_ap);
+ emacs_ap_destroy (&t->d.leaf_ap);
+ emacs_ap_destroy (&t->d.weak_strong_ap);
+ emacs_ap_destroy (&t->d.weak_weak_ap);
+ emacs_ap_destroy (&t->d.weak_hash_strong_ap);
+ emacs_ap_destroy (&t->d.weak_hash_weak_ap);
+ emacs_ap_destroy (&t->d.immovable_ap);
mps_thread_dereg (deregister_thread (t));
}
@@ -3677,7 +3780,7 @@ igc_on_idle (void)
}
}
-static mps_ap_t
+static emacs_ap_t *
thread_ap (enum igc_obj_type type)
{
struct igc_thread_list *t = current_thread->gc_info;
@@ -3698,13 +3801,13 @@ thread_ap (enum igc_obj_type type)
emacs_abort ();
case IGC_OBJ_MARKER_VECTOR:
- return t->d.weak_weak_ap;
+ return &t->d.weak_weak_ap;
case IGC_OBJ_WEAK_HASH_TABLE_WEAK_PART:
- return t->d.weak_hash_weak_ap;
+ return &t->d.weak_hash_weak_ap;
case IGC_OBJ_WEAK_HASH_TABLE_STRONG_PART:
- return t->d.weak_hash_strong_ap;
+ return &t->d.weak_hash_strong_ap;
case IGC_OBJ_VECTOR:
case IGC_OBJ_CONS:
@@ -3719,12 +3822,12 @@ thread_ap (enum igc_obj_type type)
case IGC_OBJ_FACE_CACHE:
case IGC_OBJ_BLV:
case IGC_OBJ_HANDLER:
- return t->d.dflt_ap;
+ return &t->d.dflt_ap;
case IGC_OBJ_STRING_DATA:
case IGC_OBJ_FLOAT:
case IGC_OBJ_BYTES:
- return t->d.leaf_ap;
+ return &t->d.leaf_ap;
}
emacs_abort ();
}
@@ -3796,7 +3899,7 @@ igc_hash (Lisp_Object key)
object. */
static mps_addr_t
-alloc_impl (size_t size, enum igc_obj_type type, mps_ap_t ap)
+alloc_impl_raw (size_t size, enum igc_obj_type type, mps_ap_t ap)
{
mps_addr_t p UNINIT;
size = alloc_size (size);
@@ -3845,7 +3948,7 @@ alloc (size_t size, enum igc_obj_type type)
alloc_immovable (size_t size, enum igc_obj_type type)
{
struct igc_thread_list *t = current_thread->gc_info;
- return alloc_impl (size, type, t->d.immovable_ap);
+ return alloc_impl (size, type, &t->d.immovable_ap);
}
#ifdef HAVE_MODULES
@@ -4883,17 +4986,17 @@ igc_on_pdump_loaded (void *dump_base, void *hot_start, void *hot_end,
igc_alloc_dump (size_t nbytes)
{
igc_assert (global_igc->park_count > 0);
- mps_ap_t ap = thread_ap (IGC_OBJ_CONS);
+ emacs_ap_t *ap = thread_ap (IGC_OBJ_CONS);
size_t block_size = igc_header_size () + nbytes;
mps_addr_t block;
do
{
- mps_res_t res = mps_reserve (&block, ap, block_size);
+ mps_res_t res = mps_reserve (&block, ap->mps_ap, block_size);
if (res != MPS_RES_OK)
memory_full (0);
set_header (block, IGC_OBJ_INVALID, block_size, 0);
}
- while (!mps_commit (ap, block, block_size));
+ while (!mps_commit (ap->mps_ap, block, block_size));
return (char *) block + igc_header_size ();
}
Warnings:
This is the "slow path" only, used for all allocations. Will cause a
great number of busy-looping threads. Will be very slow. Creating
additional emacs threads will result in a proportional number of
additional threads, which will be very, very slow, so don't. Requires
pthread.h and stdatomic.h, and still does things not covered by those
APIs (memcpying over an atomic_uintptr_t, even if we know that its value
won't change, is probably verboten, and definitely should be). I
*think* this code might work if we allocate from signal handlers, and I
think this code might work on systems that don't have lock-free atomics
(once the #error is removed), but it definitely won't do both at the
same time.
Pip
^ permalink raw reply related [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-23 16:03 ` Pip Cet via Emacs development discussions.
@ 2024-12-23 16:44 ` Eli Zaretskii
2024-12-23 17:16 ` Pip Cet via Emacs development discussions.
2024-12-23 17:44 ` Gerd Möllmann
1 sibling, 1 reply; 91+ messages in thread
From: Eli Zaretskii @ 2024-12-23 16:44 UTC (permalink / raw)
To: Pip Cet; +Cc: gerd.moellmann, ofv, emacs-devel, eller.helmut, acorallo
> Date: Mon, 23 Dec 2024 16:03:53 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: Eli Zaretskii <eliz@gnu.org>, ofv@wanadoo.es, emacs-devel@gnu.org, eller.helmut@gmail.com, acorallo@gnu.org
>
> --- a/src/igc.c
> +++ b/src/igc.c
> @@ -747,19 +747,41 @@ IGC_DEFINE_LIST (igc_root);
>
> /* Registry entry for an MPS thread mps_thr_t. */
>
> +#include <pthread.h>
We cannot use pthread.h in portable code. If we want to use threads,
we need separate implementations for Posix and Windows, like we did in
systhread.c for Lisp threads.
> +struct emacs_ap
> +{
> + mps_ap_t mps_ap;
> + struct igc *gc;
> + pthread_t allocation_thread;
pthread_t is non-portable, for the same reasons.
> This is the "slow path" only, used for all allocations. Will cause a
> great number of busy-looping threads.
A lot of threads might be problematic. Each thread reserves memory
for its stack, so you end up with lots of reserved memory, and on
32-bit systems can run out of address space.
Why do we need this, again?
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-23 16:44 ` Eli Zaretskii
@ 2024-12-23 17:16 ` Pip Cet via Emacs development discussions.
2024-12-23 18:35 ` Eli Zaretskii
0 siblings, 1 reply; 91+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-23 17:16 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: gerd.moellmann, ofv, emacs-devel, eller.helmut, acorallo
"Eli Zaretskii" <eliz@gnu.org> writes:
>> Date: Mon, 23 Dec 2024 16:03:53 +0000
>> From: Pip Cet <pipcet@protonmail.com>
>> Cc: Eli Zaretskii <eliz@gnu.org>, ofv@wanadoo.es, emacs-devel@gnu.org, eller.helmut@gmail.com, acorallo@gnu.org
>>
>> --- a/src/igc.c
>> +++ b/src/igc.c
>> @@ -747,19 +747,41 @@ IGC_DEFINE_LIST (igc_root);
>>
>> /* Registry entry for an MPS thread mps_thr_t. */
>>
>> +#include <pthread.h>
>
> We cannot use pthread.h in portable code. If we want to use threads,
> we need separate implementations for Posix and Windows, like we did in
> systhread.c for Lisp threads.
Noted.
As an aside, without any relevance to the fact that we should avoid
using them, aren't pthreads available on "mingw"64 systems?
>> +struct emacs_ap
>> +{
>> + mps_ap_t mps_ap;
>> + struct igc *gc;
>> + pthread_t allocation_thread;
>
> pthread_t is non-portable, for the same reasons.
>
>> This is the "slow path" only, used for all allocations. Will cause a
>> great number of busy-looping threads.
>
> A lot of threads might be problematic. Each thread reserves memory
> for its stack, so you end up with lots of reserved memory, and on
> 32-bit systems can run out of address space.
This is a PoC. While we shouldn't share structures between Emacs-side
threads, we should of course use (at most) a single allocation thread
rather than one per thread per AP. Also, yield the CPU once in a while
:-)
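To make that concrete, here is a rough sketch of a single allocation thread that serves one requester through an atomic mailbox and yields instead of spinning flat out. It is only an illustration under stated assumptions: the alloc_mailbox_* names are invented, malloc stands in for the MPS slow path, only one request may be in flight at a time, and the nested signal-handler case from the patch is not handled at all.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Mailbox shared by one requesting thread and one allocation thread.
   Hypothetical type, not part of igc.c.  */
struct alloc_mailbox
{
  pthread_t worker;
  atomic_size_t requested_bytes;  /* nonzero while a request is pending */
  atomic_uintptr_t result;        /* address produced for the last request */
  atomic_bool stop;
};

static void *
alloc_worker (void *arg)
{
  struct alloc_mailbox *mb = arg;
  while (!atomic_load (&mb->stop))
    {
      size_t n = atomic_load (&mb->requested_bytes);
      if (n != 0)
        {
          /* malloc is a stand-in for the real allocator here.  */
          atomic_store (&mb->result, (uintptr_t) malloc (n));
          /* Clearing the request last publishes the result.  */
          atomic_store (&mb->requested_bytes, 0);
        }
      else
        sched_yield ();
    }
  return NULL;
}

/* Return 0 if the allocation thread was started.  */
int
alloc_mailbox_start (struct alloc_mailbox *mb)
{
  atomic_init (&mb->requested_bytes, 0);
  atomic_init (&mb->result, 0);
  atomic_init (&mb->stop, false);
  return pthread_create (&mb->worker, NULL, alloc_worker, mb) == 0 ? 0 : -1;
}

/* Hand a request to the allocation thread and wait for the answer.  */
void *
alloc_mailbox_request (struct alloc_mailbox *mb, size_t n)
{
  if (n == 0)
    return NULL;
  atomic_store (&mb->requested_bytes, n);
  while (atomic_load (&mb->requested_bytes) != 0)
    sched_yield ();
  return (void *) atomic_load (&mb->result);
}

void
alloc_mailbox_stop (struct alloc_mailbox *mb)
{
  atomic_store (&mb->stop, true);
  pthread_join (mb->worker, NULL);
}
```

The point of the handover is that the worker finishes its allocation regardless of what happens to the requester in the meantime. A condition variable or futex would avoid the polling entirely, at the cost of more platform-specific code.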
> Why do we need this, again?
We can't interrupt allocation, so we move it to a separate thread where
it will complete (unlocking the arena) even if a signal interrupts us.
Pip
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-23 16:03 ` Pip Cet via Emacs development discussions.
2024-12-23 16:44 ` Eli Zaretskii
@ 2024-12-23 17:44 ` Gerd Möllmann
2024-12-23 19:00 ` Eli Zaretskii
1 sibling, 1 reply; 91+ messages in thread
From: Gerd Möllmann @ 2024-12-23 17:44 UTC (permalink / raw)
To: Pip Cet; +Cc: Eli Zaretskii, ofv, emacs-devel, eller.helmut, acorallo
Pip Cet <pipcet@protonmail.com> writes:
> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
>
>> Pip Cet <pipcet@protonmail.com> writes:
>>
>>> I'll spare you most of the details for now, but having read the mps
>>> header, MPS allocation is not safe to use from separate threads without
>>> locking the AP (or having per-thread APs), which we might end up doing
>>> on Windows, IIRC.
>>
>> Now I'm confused. We're using thread allocation points. See
>> create_thread_aps, thread_ap, and so on.
>
> I was confused. This is only a problem if we allocate memory from a
> signal handler, which is effectively sharing the per-thread structure.
>
> (I'm still confused. My patch worked on the first attempt, which my code
> never does. I suspect that while I made a mistake, it caused a subtle
> bug rather than an obvious one.)
>
> And we don't want to allocate memory from signal handlers, right? We
> could, now (see warnings below):
Can't speak for others, but I wouldn't want it :-).
I can't cite myself, but I'm pretty sure I said some time ago already
that in a portable program one cannot do much in a signal handler in the
first place. So I wouldn't be surprised if MPS didn't support being
called from a signal handler. Not unreasonable for me.
But whatever. Maybe Richard Brooksby answers, and can shed light on that
or has ideas, if we don't overload him :-). And anyway, there is now
something workable in the igc branch. Maybe we could wait a bit, and
just proceed with something else meanwhile.
[... Thanks for the patch ...]
> Warnings:
>
> This is the "slow path" only, used for all allocations. Will cause a
> great number of busy-looping threads.
Don't know why, but the busy-looping threads make me feel a bit
uncomfortable :-).
> Will be very slow. Creating additional emacs threads will result in a
> proportional number of additional threads, which will be very, very
> slow, so don't. Requires pthread.h and stdatomic.h, and still does
> things not covered by those APIs (memcpying over an atomic_uintptr_t,
> even if we know that its value won't change, is probably verboten, and
> definitely should be). I *think* this code might work if we allocate
> from signal handlers, and I think this code might work on systems that
> don't have lock-free atomics (once the #error is removed), but it
> definitely won't do both at the same time.
>
> Pip
BTW, do you know which signal handlers use Lisp, i.e. allocate Lisp
objects or access some? All? Or, would it be realistic to rewrite signal
handlers to not do that?
One thing I've seen done elsewhere is to publish a message to a message
board so that it can be handled outside of the signal handler. Something
like that, you know what I mean.
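A minimal sketch of that pattern, assuming POSIX sigaction and lock-free C11 atomics; the function names are made up for illustration and this is not how Emacs's handlers are written today. The handler only bumps a counter, and the real work happens later in ordinary code:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdatomic.h>
#include <stddef.h>

/* Number of signals noted by the handler but not yet processed.  */
static atomic_int pending_signals;

/* The handler does nothing but record that the signal arrived.  A
   lock-free atomic increment is safe in a signal handler.  */
static void
note_signal (int signo)
{
  (void) signo;
  atomic_fetch_add (&pending_signals, 1);
}

/* Install the deferring handler for SIGNO.  Return 0 on success.  */
int
install_deferred_handler (int signo)
{
  struct sigaction sa;
  sa.sa_handler = note_signal;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = SA_RESTART;
  return sigaction (signo, &sa, NULL);
}

/* Called from the main loop, outside any handler: fetch and reset the
   count of noted signals.  Work that touches Lisp objects or allocates
   would be done here, once per noted signal.  */
int
drain_pending_signals (void)
{
  return atomic_exchange (&pending_signals, 0);
}
```

The main loop would call drain_pending_signals () at a safe point and act on the returned count. Note that standard signals can coalesce while blocked, so the count is a lower bound on the signals actually sent.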
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-23 17:16 ` Pip Cet via Emacs development discussions.
@ 2024-12-23 18:35 ` Eli Zaretskii
2024-12-23 18:48 ` Gerd Möllmann
2024-12-23 20:30 ` Benjamin Riefenstahl
0 siblings, 2 replies; 91+ messages in thread
From: Eli Zaretskii @ 2024-12-23 18:35 UTC (permalink / raw)
To: Pip Cet; +Cc: gerd.moellmann, ofv, emacs-devel, eller.helmut, acorallo
> Date: Mon, 23 Dec 2024 17:16:32 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: gerd.moellmann@gmail.com, ofv@wanadoo.es, emacs-devel@gnu.org, eller.helmut@gmail.com, acorallo@gnu.org
>
> "Eli Zaretskii" <eliz@gnu.org> writes:
>
> >> +#include <pthread.h>
> >
> > We cannot use pthread.h in portable code. If we want to use threads,
> > we need separate implementations for Posix and Windows, like we did in
> > systhread.c for Lisp threads.
>
> Noted.
>
> As an aside, without any relevance to the fact that we should avoid
> using them, aren't pthreads available on "mingw"64 systems?
pthreads are ported to both 32-bit and 64-bit Windows (more than
once), but the ports are buggy, and pthread.h itself defines all
kinds of stuff that conflicts with various w32 places in Emacs. The
following lines from nt/mingw-cfg.site are one sign of that:
# We don't want pthread.h to be picked up just because it defines timespec
gl_cv_sys_struct_timespec_in_pthread_h=no
# Or at all...
ac_cv_header_pthread_h=no
> > Why do we need this, again?
>
> We can't interrupt allocation, so we move it to a separate thread where
> it will complete (unlocking the arena) even if a signal interrupts us.
How will this allow us to run the Lisp machine from a signal? Because
this is the goal, right?
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-23 18:35 ` Eli Zaretskii
@ 2024-12-23 18:48 ` Gerd Möllmann
2024-12-23 19:25 ` Eli Zaretskii
2024-12-23 20:30 ` Benjamin Riefenstahl
1 sibling, 1 reply; 91+ messages in thread
From: Gerd Möllmann @ 2024-12-23 18:48 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: Pip Cet, ofv, emacs-devel, eller.helmut, acorallo
Eli Zaretskii <eliz@gnu.org> writes:
> How will this allow us to run the Lisp machine from a signal? Because
> this is the goal, right?
Today I'm confused.
Can I ask what you mean by running the Lisp Machine from a signal
handler? Sounds to me like calling eval, but I'd doubt that works with
the old GC, or does it?
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-23 17:44 ` Gerd Möllmann
@ 2024-12-23 19:00 ` Eli Zaretskii
2024-12-23 19:37 ` Eli Zaretskii
` (2 more replies)
0 siblings, 3 replies; 91+ messages in thread
From: Eli Zaretskii @ 2024-12-23 19:00 UTC (permalink / raw)
To: Gerd Möllmann; +Cc: pipcet, ofv, emacs-devel, eller.helmut, acorallo
> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: Eli Zaretskii <eliz@gnu.org>, ofv@wanadoo.es, emacs-devel@gnu.org,
> eller.helmut@gmail.com, acorallo@gnu.org
> Date: Mon, 23 Dec 2024 18:44:42 +0100
>
> BTW, do you know which signal handlers use Lisp, i.e. allocate Lisp
> objects or access some? All? Or, would it be realistic to rewrite signal
> handlers to not do that?
SIGPROF does (it's the basis for our Lisp profiler).
SIGCHLD doesn't run Lisp (I think), but it examines objects and data
structures of the Lisp machine (those related to child processes).
> One thing I've seen done elsewhere is to publish a message to a message
> board so that it can be handled outside of the signal handler. Something
> like that, you know what I mean.
This is tricky for the profiler, because you want to sample the
function in which you are right there and then, not some time later.
For SIGCHLD this could work, but it might make Emacs slower in
handling subprocesses (there are some Lisp packages that fire
subprocesses at very high rate).
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-23 18:48 ` Gerd Möllmann
@ 2024-12-23 19:25 ` Eli Zaretskii
0 siblings, 0 replies; 91+ messages in thread
From: Eli Zaretskii @ 2024-12-23 19:25 UTC (permalink / raw)
To: Gerd Möllmann; +Cc: pipcet, ofv, emacs-devel, eller.helmut, acorallo
> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: Pip Cet <pipcet@protonmail.com>, ofv@wanadoo.es, emacs-devel@gnu.org,
> eller.helmut@gmail.com, acorallo@gnu.org
> Date: Mon, 23 Dec 2024 19:48:08 +0100
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> > How will this allow us to run the Lisp machine from a signal? Because
> > this is the goal, right?
>
> Today I'm confused.
>
> Can I ask what you mean by running the Lisp Machine from a signal
> handler? Sounds to me like calling eval, but I'd doubt that works with
> the old GC, or does it?
See what I wrote in response to your other questions, regarding what
SIGPROF and SIGCHLD handlers do.
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-23 19:00 ` Eli Zaretskii
@ 2024-12-23 19:37 ` Eli Zaretskii
2024-12-23 20:49 ` Gerd Möllmann
2024-12-23 23:37 ` Some experience with the igc branch Pip Cet via Emacs development discussions.
2 siblings, 0 replies; 91+ messages in thread
From: Eli Zaretskii @ 2024-12-23 19:37 UTC (permalink / raw)
To: gerd.moellmann, pipcet; +Cc: ofv, emacs-devel, eller.helmut, acorallo
> Date: Mon, 23 Dec 2024 21:00:53 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: pipcet@protonmail.com, ofv@wanadoo.es, emacs-devel@gnu.org,
> eller.helmut@gmail.com, acorallo@gnu.org
>
> > From: Gerd Möllmann <gerd.moellmann@gmail.com>
> > Cc: Eli Zaretskii <eliz@gnu.org>, ofv@wanadoo.es, emacs-devel@gnu.org,
> > eller.helmut@gmail.com, acorallo@gnu.org
> > Date: Mon, 23 Dec 2024 18:44:42 +0100
> >
> > BTW, do you know which signal handlers use Lisp, i.e. allocate Lisp
> > objects or access some? All? Or, would it be realistic to rewrite signal
> > handlers to not do that?
>
> SIGPROF does (it's the basis for our Lisp profiler).
Let me clarify to avoid possible confusion: the SIGPROF handler
doesn't run Lisp, but it does access the Lisp machine, via the
backtrace_top_function and get_backtrace functions.
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-23 18:35 ` Eli Zaretskii
2024-12-23 18:48 ` Gerd Möllmann
@ 2024-12-23 20:30 ` Benjamin Riefenstahl
2024-12-23 23:39 ` Pip Cet via Emacs development discussions.
2024-12-24 3:37 ` Eli Zaretskii
1 sibling, 2 replies; 91+ messages in thread
From: Benjamin Riefenstahl @ 2024-12-23 20:30 UTC (permalink / raw)
To: Eli Zaretskii
Cc: Pip Cet, gerd.moellmann, ofv, emacs-devel, eller.helmut, acorallo
>> From: Pip Cet <pipcet@protonmail.com>
>> >> +#include <pthread.h>
Eli Zaretskii writes:
>> > We cannot use pthread.h in portable code. If we want to use
>> > threads, we need separate implementations for Posix and Windows,
>> > like we did in systhread.c for Lisp threads.
Just a drive-by observation: Signals are a POSIX feature, so we have to
think about the potential conflict between signals and MPS only on
POSIX, not on MS Windows, right?
Regards, benny
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-23 19:00 ` Eli Zaretskii
2024-12-23 19:37 ` Eli Zaretskii
@ 2024-12-23 20:49 ` Gerd Möllmann
2024-12-23 21:43 ` Helmut Eller
2024-12-24 6:03 ` SIGPROF + SIGCHLD and igc Gerd Möllmann
2024-12-23 23:37 ` Some experience with the igc branch Pip Cet via Emacs development discussions.
2 siblings, 2 replies; 91+ messages in thread
From: Gerd Möllmann @ 2024-12-23 20:49 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: pipcet, ofv, emacs-devel, eller.helmut, acorallo
Eli Zaretskii <eliz@gnu.org> writes:
>> From: Gerd Möllmann <gerd.moellmann@gmail.com>
>> Cc: Eli Zaretskii <eliz@gnu.org>, ofv@wanadoo.es, emacs-devel@gnu.org,
>> eller.helmut@gmail.com, acorallo@gnu.org
>> Date: Mon, 23 Dec 2024 18:44:42 +0100
>>
>> BTW, do you know which signal handlers use Lisp, i.e. allocate Lisp
>> objects or access some? All? Or, would it be realistic to rewrite signal
>> handlers to not do that?
>
> SIGPROF does (it's the basis for our Lisp profiler).
>
> SIGCHLD doesn't run Lisp (I think), but it examines objects and data
> structures of the Lisp machine (those related to child processes).
>
>> One thing I've seen done elsewhere is to publish a message to a message
>> board so that it can be handled outside of the signal handler. Something
>> like that, you know what I mean.
>
> This is tricky for the profiler, because you want to sample the
> function in which you are right there and then, not some time later.
>
> For SIGCHLD this could work, but it might make Emacs slower in
> handling subprocesses (there are some Lisp packages that fire
> subprocesses at very high rate).
Thanks.
I've looked at SIGPROF. From an admittedly brief look at this, I'd
summarize my results as:
- The important part is get_backtrace. The rest could be done elsewhere
by posting that to a message board, or whatever the mechanism is at
the end.
- Didn't see get_backtrace or functions called from it allocating Lisp
objects.
- It reads from a Lisp object because of
#define specpdl (current_thread->m_specpdl)
#define specpdl_end (current_thread->m_specpdl_end)
#define specpdl_ptr (current_thread->m_specpdl_ptr)
current_thread is a struct thread_state which is a PVEC_THREAD.
- I remember that I wrote a scanner for the specpdl stacks, so that's
not a Lisp object but a root, so no problem here, I think.
- struct thread_state allocation is done in igc.c via alloc_immovable in
igc_alloc_pseudovector. That allocates from an AMS pool, which
doesn't use barriers.
- It doesn't seem to access other Lisp objects except current_thread.
That doesn't look bad, I think. Worth mentioning is perhaps that
directly after get_backtrace here
static void
record_backtrace (struct profiler_log *plog, EMACS_INT count)
{
log_t *log = plog->log;
get_backtrace (log->trace, log->depth);
EMACS_UINT hash = trace_hash (log->trace, log->depth);
we access Lisp objects in trace_hash when computing the hash and in the
other hash table code. IIUC that code counts hits with the same
backtrace. Don't know how long that takes. But if posting the backtrace
would take the same time, we would be on par.
I'll try to also look at SIGCHLD at some later point, but Christmas,
family etc.
Happy holidays!
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-23 20:49 ` Gerd Möllmann
@ 2024-12-23 21:43 ` Helmut Eller
2024-12-23 21:49 ` Pip Cet via Emacs development discussions.
2024-12-24 4:05 ` Gerd Möllmann
2024-12-24 6:03 ` SIGPROF + SIGCHLD and igc Gerd Möllmann
1 sibling, 2 replies; 91+ messages in thread
From: Helmut Eller @ 2024-12-23 21:43 UTC (permalink / raw)
To: Gerd Möllmann; +Cc: Eli Zaretskii, pipcet, ofv, emacs-devel, acorallo
On Mon, Dec 23 2024, Gerd Möllmann wrote:
> [...]
> Worth mentioning is perhaps that [...]
> directly after get_backtrace here [...]
> we access Lisp objects in trace_hash when computing the hash and in the
> other hash table code.
Also worth mentioning is that trace_hash uses XHASH, which is probably
problematic in combination with a moving GC.
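To make the concern concrete, here is a minimal sketch in plain C. It
contains no Emacs or MPS code: struct obj, addr_hash, id_hash and
move_obj are made-up names for illustration. It assumes, as the remark
above implies, that an XHASH-style hash is derived from the object's
address bits, so it changes as soon as a collector relocates the object,
whereas a hash computed from data stored inside the object does not.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a heap object; not an Emacs type.  */
struct obj
{
  uintptr_t id;   /* identity stored inside the object */
  int payload;
};

/* Address-based hash: changes whenever the object is relocated.  */
static uintptr_t
addr_hash (const struct obj *o)
{
  return (uintptr_t) o >> 3;
}

/* Hash computed from data stored in the object: survives relocation.  */
static uintptr_t
id_hash (const struct obj *o)
{
  return o->id * 2654435761u;
}

/* Simulate a moving collector: copy the object to a new address.  */
static struct obj *
move_obj (struct obj *to, const struct obj *from)
{
  memcpy (to, from, sizeof *to);
  return to;
}
```

If a profiler log is keyed on something like addr_hash, an entry
inserted before a move can no longer be found afterwards. Keying on
stored identity, or rehashing after moves, avoids that at some cost.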
Helmut
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-23 21:43 ` Helmut Eller
@ 2024-12-23 21:49 ` Pip Cet via Emacs development discussions.
2024-12-23 21:58 ` Helmut Eller
2024-12-24 4:05 ` Gerd Möllmann
1 sibling, 1 reply; 91+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-23 21:49 UTC (permalink / raw)
To: Helmut Eller
Cc: Gerd Möllmann, Eli Zaretskii, ofv, emacs-devel, acorallo
"Helmut Eller" <eller.helmut@gmail.com> writes:
> On Mon, Dec 23 2024, Gerd Möllmann wrote:
>
>> [...]
>> Worth mentioning is perhaps that [...]
>> directly after get_backtrace here [...]
>> we access Lisp objects in trace_hash when computing the hash and in the
>> other hash table code.
>
> Also worth mentioning is that trace_hash uses XHASH, which is probably
> problematic in combination with a moving GC.
Good catch. s/XHASH/sxhash_eq/ there, I think? And let's poison XHASH
when MPS is in use?
Pip
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-23 21:49 ` Pip Cet via Emacs development discussions.
@ 2024-12-23 21:58 ` Helmut Eller
2024-12-23 23:20 ` Pip Cet via Emacs development discussions.
0 siblings, 1 reply; 91+ messages in thread
From: Helmut Eller @ 2024-12-23 21:58 UTC (permalink / raw)
To: Pip Cet; +Cc: Gerd Möllmann, Eli Zaretskii, ofv, emacs-devel, acorallo
On Mon, Dec 23 2024, Pip Cet wrote:
[...]
>> Also worth mentioning is that trace_hash uses XHASH, which is probably
>> problematic in combination with a moving GC.
>
> Good catch. s/XHASH/sxhash_eq/ there, I think? And let's poison XHASH
> when MPS is in use?
sxhash_eq doesn't fly with headerless objects. It should be obsoleted,
IMO.
Helmut
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-23 21:58 ` Helmut Eller
@ 2024-12-23 23:20 ` Pip Cet via Emacs development discussions.
2024-12-24 5:38 ` Helmut Eller
0 siblings, 1 reply; 91+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-23 23:20 UTC (permalink / raw)
To: Helmut Eller
Cc: Gerd Möllmann, Eli Zaretskii, ofv, emacs-devel, acorallo
"Helmut Eller" <eller.helmut@gmail.com> writes:
> On Mon, Dec 23 2024, Pip Cet wrote:
> [...]
>>> Also worth mentioning is that trace_hash uses XHASH, which is probably
>>> problematic in combination with a moving GC.
>>
>> Good catch. s/XHASH/sxhash_eq/ there, I think? And let's poison XHASH
>> when MPS is in use?
>
> sxhash_eq doesn't fly with headerless objects.
Which objects would that be?
Right now all IGC objects have headers, right? Did I miss any?
> It should be obsoleted, IMO.
I don't see why.
Is this about cons cells exclusively? Because 3 words/cons is too
much (possibly 4 words on W64 or 32-bit systems)?
For vectors we can usually derive the length of the vector from the IGC
header (which has plenty of extra bits), which would have the equivalent
effect. Strings, symbols, floats shouldn't matter.
That leaves conses. My guess so far was that you wanted to implement a
hack where a headerless cons is a two-word object that would turn into a
tagged pointer to another two-word object with a header as soon as its
hash value is taken. That requires slowing down either XCAR or XCDR, I
think, and that's sufficient reason for me not to do it, but I guess I
misunderstood your plans. This would also mean sxhash_eq would allocate
memory, so it couldn't be called from a signal handler without yet
another workaround.
(Note that cons cells used to store long lists are inherently
inefficient: naively storing an n-element list with a header requires
n+1 words, but even headerless cons cells will require 2*n words. So if
we really decide we need to reduce cons memory usage, I'd look into that
instead.)
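The word counts above can be written out as a small sketch. The
function names are hypothetical, and a one-word header is an assumption
taken from the "3 words/cons" figure; the real header size may differ
by platform, as noted.

```c
#include <stddef.h>

/* Words needed to store an N-element list under three layouts.
   HEADER_WORDS is the per-object header size in words.  */

/* One header followed by N element slots (a vector-like block).  */
static size_t
words_packed (size_t n, size_t header_words)
{
  return header_words + n;
}

/* N cons cells, each holding car and cdr, with no header.  */
static size_t
words_headerless_conses (size_t n)
{
  return 2 * n;
}

/* N cons cells, each holding a header plus car and cdr.  */
static size_t
words_headered_conses (size_t n, size_t header_words)
{
  return (header_words + 2) * n;
}
```

For a 1000-element list and a one-word header this gives 1001, 2000 and
3000 words respectively, so dropping the header saves a third, while a
packed representation would save about two thirds.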
Pip
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-23 19:00 ` Eli Zaretskii
2024-12-23 19:37 ` Eli Zaretskii
2024-12-23 20:49 ` Gerd Möllmann
@ 2024-12-23 23:37 ` Pip Cet via Emacs development discussions.
2024-12-24 4:03 ` Gerd Möllmann
2024-12-24 12:11 ` Eli Zaretskii
2 siblings, 2 replies; 91+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-23 23:37 UTC (permalink / raw)
To: Eli Zaretskii
Cc: Gerd Möllmann, ofv, emacs-devel, eller.helmut, acorallo
"Eli Zaretskii" <eliz@gnu.org> writes:
>> From: Gerd Möllmann <gerd.moellmann@gmail.com>
>> Cc: Eli Zaretskii <eliz@gnu.org>, ofv@wanadoo.es, emacs-devel@gnu.org,
>> eller.helmut@gmail.com, acorallo@gnu.org
>> Date: Mon, 23 Dec 2024 18:44:42 +0100
>>
>> BTW, do you know which signal handlers use Lisp, i.e. allocate Lisp
>> objects or access some? All? Or, would it be realistic to rewrite signal
>> handlers to not do that?
I think there are several questions hiding behind the first question
mark:
1. which signal handlers want to read Lisp data
2. which signal handlers want to write Lisp data
3. which signal handlers want to allocate Lisp objects temporarily,
while guaranteeing no references to those objects survive when the
signal handler returns.
4. which signal handlers want to allocate Lisp objects permanently,
storing references to the new objects in "old" data
4a. ... and are willing to call a special transformation function to do
so
4b. ... and want to do so implicitly, expecting memory manipulation to
"just work".
1: definitely works
2: should work, but may hit a write barrier
3: could be made to work if there's interest
4a: if we must
4b: see the other thread. If we have both make_object_writable
(formerly CHECK_IMPURE) and commit_object_changes functions and call
them consistently, it might be possible to find a way.
> SIGPROF does (it's the basis for our Lisp profiler).
That's 1, 2, but not 3 or 4, right?
> SIGCHLD doesn't run Lisp (I think), but it examines objects and data
> structures of the Lisp machine (those related to child processes).
Just 1, then?
>> One thing I've seen done elsewhere is to publish a message to a message
>> board so that it can be handled outside of the signal handler. Something
>> like that, you know what I mean.
>
> This is tricky for the profiler, because you want to sample the
> function in which you are right there and then, not some time later.
But would it be so bad to use a copy of the specpdl stack, placed in a
prepared area which is a GC root so we'd guarantee survival (but not
immutability; I don't think that matters in practice) of entries?
memcpy is safe to call from a signal handler, and then we could do all
of the processing safely.
(My preference would be to make the specpdl stack an ambiguous root
while the profiler is in use: that way, we'd get usable backtraces even
if the SIGPROF happened during GC, which is probably more useful than
merely saying that we were in GC).
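A rough sketch of the copy-then-process idea, in plain C rather than
actual Emacs code. Everything here is hypothetical: struct entry stands
in for a specpdl entry, live_stack for the stack being sampled, and
snapshot for the prepared area. Registering that area as a GC root is
not shown. The handler only copies into preallocated memory and raises
a flag; all further processing happens later in normal context.

```c
#include <signal.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

enum { SNAPSHOT_MAX = 64 };

/* Hypothetical stand-in for a specpdl entry.  */
struct entry { void *function; };

/* The live stack the handler wants to sample.  */
static struct entry live_stack[SNAPSHOT_MAX];
static size_t live_depth;

/* Prepared area, allocated before any signal can arrive.  */
static struct entry snapshot[SNAPSHOT_MAX];
static size_t snapshot_depth;
static volatile sig_atomic_t snapshot_ready;

/* Signal handler: copy the stack and raise the flag, nothing else.  */
static void
sample_handler (int sig)
{
  (void) sig;
  if (snapshot_ready)
    return;   /* previous sample not consumed yet; drop this one */
  size_t n = live_depth < SNAPSHOT_MAX ? live_depth : SNAPSHOT_MAX;
  memcpy (snapshot, live_stack, n * sizeof snapshot[0]);
  snapshot_depth = n;
  atomic_signal_fence (memory_order_seq_cst);
  snapshot_ready = 1;
}

/* Normal context: copy a pending sample to OUT and return its depth,
   or return 0 if no sample is pending.  */
static size_t
consume_snapshot (struct entry *out, size_t out_len)
{
  if (!snapshot_ready)
    return 0;
  atomic_signal_fence (memory_order_seq_cst);
  size_t n = snapshot_depth < out_len ? snapshot_depth : out_len;
  memcpy (out, snapshot, n * sizeof out[0]);
  atomic_signal_fence (memory_order_seq_cst);
  snapshot_ready = 0;
  return n;
}
```

With a single slot, a sample that arrives before the previous one is
consumed is dropped. A small ring of such slots would reduce the loss
without changing what the handler is allowed to do.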
Pip
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-23 20:30 ` Benjamin Riefenstahl
@ 2024-12-23 23:39 ` Pip Cet via Emacs development discussions.
2024-12-24 12:14 ` Eli Zaretskii
2024-12-24 3:37 ` Eli Zaretskii
1 sibling, 1 reply; 91+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-23 23:39 UTC (permalink / raw)
To: Benjamin Riefenstahl
Cc: Eli Zaretskii, gerd.moellmann, ofv, emacs-devel, eller.helmut,
acorallo
"Benjamin Riefenstahl" <b.riefenstahl@turtle-trading.net> writes:
>>> From: Pip Cet <pipcet@protonmail.com>
>>> >> +#include <pthread.h>
>
> Eli Zaretskii writes:
>>> > We cannot use pthread.h in portable code. If we want to use
>>> > threads, we need separate implementations for Posix and Windows,
>>> > like we did in systhread.c for Lisp threads.
>
> Just a drive-by observation: Signals are a POSIX feature, so we have to
> think about the potential conflict between signals and MPS only on
> POSIX, not on MS Windows, right?
I believe it affects all operating systems we're playing with (well, I'm
also playing with FreeDOS but I'm not going to port MPS to it. It's my
New Year's resolution not to).
The allocation thread approach should work for all of them. If we have
stdatomic.h, performance should be acceptable.
Pip
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-23 20:30 ` Benjamin Riefenstahl
2024-12-23 23:39 ` Pip Cet via Emacs development discussions.
@ 2024-12-24 3:37 ` Eli Zaretskii
2024-12-24 8:48 ` Benjamin Riefenstahl
1 sibling, 1 reply; 91+ messages in thread
From: Eli Zaretskii @ 2024-12-24 3:37 UTC (permalink / raw)
To: Benjamin Riefenstahl
Cc: pipcet, gerd.moellmann, ofv, emacs-devel, eller.helmut, acorallo
> From: Benjamin Riefenstahl <b.riefenstahl@turtle-trading.net>
> Cc: Pip Cet <pipcet@protonmail.com>, gerd.moellmann@gmail.com,
> ofv@wanadoo.es, emacs-devel@gnu.org, eller.helmut@gmail.com,
> acorallo@gnu.org
> Date: Mon, 23 Dec 2024 22:30:57 +0200
>
> >> From: Pip Cet <pipcet@protonmail.com>
> >> >> +#include <pthread.h>
>
> Eli Zaretskii writes:
> >> > We cannot use pthread.h in portable code. If we want to use
> >> > threads, we need separate implementations for Posix and Windows,
> >> > like we did in systhread.c for Lisp threads.
>
> Just a drive-by observation: Signals are a POSIX feature, so we have to
> think about the potential conflict between signals and MPS only on
> POSIX, not on MS Windows, right?
Emacs on Windows emulates some Posix signals (SIGPROF, SIGCHLD,
SIGALRM), so this affects the Windows build as well.
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-23 23:37 ` Some experience with the igc branch Pip Cet via Emacs development discussions.
@ 2024-12-24 4:03 ` Gerd Möllmann
2024-12-24 10:25 ` Pip Cet via Emacs development discussions.
2024-12-24 12:26 ` Eli Zaretskii
2024-12-24 12:11 ` Eli Zaretskii
1 sibling, 2 replies; 91+ messages in thread
From: Gerd Möllmann @ 2024-12-24 4:03 UTC (permalink / raw)
To: Pip Cet; +Cc: Eli Zaretskii, ofv, emacs-devel, eller.helmut, acorallo
Pip Cet <pipcet@protonmail.com> writes:
> "Eli Zaretskii" <eliz@gnu.org> writes:
>
>>> From: Gerd Möllmann <gerd.moellmann@gmail.com>
>>> Cc: Eli Zaretskii <eliz@gnu.org>, ofv@wanadoo.es, emacs-devel@gnu.org,
>>> eller.helmut@gmail.com, acorallo@gnu.org
>>> Date: Mon, 23 Dec 2024 18:44:42 +0100
>>>
>>> BTW, do you know which signal handlers use Lisp, i.e. allocate Lisp
>>> objects or access some? All? Or, would it be realistic to rewrite signal
>>> handlers to not do that?
>
> I think there are several questions hiding behind the first question
> mark:
>
> 1. which signal handlers want to read Lisp data
> 2. which signal handlers want to write Lisp data
> 3. which signal handlers want to allocate Lisp objects temporarily,
> while guaranteeing no references to those objects survive when the
> signal handler returns.
> 4. which signal handlers want to allocate Lisp objects permanently,
> storing references to the new objects in "old" data
> 4a. ... and are willing to call a special transformation function to do
> so
> 4b. ... and want to do so implicitly, expecting memory manipulation to
> "just work".
New day, new beliefs :-). Today, when I read my question again, I'd
actually be surprised if a signal handler could allocate Lisp objects
because I wouldn't be able to explain how that works with alloc.c which
isn't reentrant. Not even Fcons is reentrant when I look at it now.
Correct, or am I overlooking something? Could others please check? If
it's right, things get a lot easier.
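To illustrate what "not reentrant" means here, a toy sketch of a
free-list allocator, loosely modelled on a free-list pop and not taken
from alloc.c. All names are hypothetical. interrupt_hook stands in for
a signal handler that fires between reading the list head and storing
the new head, which is the window in which a second allocation sees a
stale head.

```c
#include <stddef.h>

struct cell { struct cell *next; };

enum { NCELLS = 4 };
static struct cell cells[NCELLS];
static struct cell *free_list;

/* Optional hook run in the middle of alloc_cell, standing in for a
   signal handler arriving at the worst possible moment.  */
static void (*interrupt_hook) (void);

/* Put every cell on the free list.  */
static void
init_cells (void)
{
  free_list = NULL;
  for (size_t i = 0; i < NCELLS; i++)
    {
      cells[i].next = free_list;
      free_list = &cells[i];
    }
}

/* Pop one cell.  Not reentrant: between reading the head and storing
   the new head, a nested call pops the very same cell.  */
static struct cell *
alloc_cell (void)
{
  struct cell *c = free_list;
  if (!c)
    return NULL;
  if (interrupt_hook)
    {
      void (*hook) (void) = interrupt_hook;
      interrupt_hook = NULL;   /* fire only once */
      hook ();
    }
  free_list = c->next;
  return c;
}

static struct cell *allocated_in_handler;

/* The simulated handler: allocates while alloc_cell is mid-flight.  */
static void
handler_allocates (void)
{
  allocated_in_handler = alloc_cell ();
}
```

With the hook installed, the interrupted call and the nested call both
return the same cell, so two owners believe they hold fresh memory.
That is the kind of corruption one would expect if a handler allocated
through a non-reentrant allocator.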
Maybe allocation of Lisp objects on the stack remains as some sort of
problem (AUTO_CONS etc)? I don't see how though, ATM.
> 1: definitely works
> 2: should work, but may hit a write barrier
> 3: could be made to work if there's interest
> 4a: if we must
> 4b: see the other thread. If we have both make_object_writable
> (formerly CHECK_IMPURE) and commit_object_changes functions and call
> them consistently, it might be possible to find a way.
>
>> SIGPROF does (it's the basis for our Lisp profiler).
>
> That's 1, 2, but not 3 or 4, right?
>
>> SIGCHLD doesn't run Lisp (I think), but it examines objects and data
>> structures of the Lisp machine (those related to child processes).
>
> Just 1, then?
>
>>> One thing I've seen done elsewhere is to publish a message to a message
>>> board so that it can be handled outside of the signal handler. Something
>>> like that, you know what I mean.
>>
>> This is tricky for the profiler, because you want to sample the
>> function in which you are right there and then, not some time later.
>
> But would it be so bad to use a copy of the specpdl stack, placed in a
> prepared area which is a GC root so we'd guarantee survival (but not
> immutability; I don't think that matters in practice) of entries?
> memcpy is safe to call from a signal handler, and then we could do all
> of the processing safely.
>
> (My preference would be to make the specpdl stack an ambiguous root
> while the profiler is in use: that way, we'd get usable backtraces even
> if the SIGPROF happened during GC, which is probably more useful than
> merely saying that we were in GC).
>
> Pip
I'd prefer to send messages from handle_profiler_signal. Or something
equivalent to sending messages. Please see my other mail where I looked
at that function. Implementing such a message board is of course not
easy. But I think it would be easy to understand how things work once one
has something like that.
And if I'm right with what I wrote above about allocation (and I think I
am), we also don't need provisions for allocating Lisp objects from
signal handlers, which would be a great simplification.
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-23 21:43 ` Helmut Eller
2024-12-23 21:49 ` Pip Cet via Emacs development discussions.
@ 2024-12-24 4:05 ` Gerd Möllmann
2024-12-24 8:50 ` Gerd Möllmann
1 sibling, 1 reply; 91+ messages in thread
From: Gerd Möllmann @ 2024-12-24 4:05 UTC (permalink / raw)
To: Helmut Eller; +Cc: Eli Zaretskii, pipcet, ofv, emacs-devel, acorallo
Helmut Eller <eller.helmut@gmail.com> writes:
> On Mon, Dec 23 2024, Gerd Möllmann wrote:
>
>> [...]
>> Worth mentioning is perhaps that [...]
>> directly after get_backtrace here [...]
>> we access Lisp objects in trace_hash when computing the hash and in the
>> other hash table code.
>
> Also worth mentioning is that trace_hash uses XHASH, which is probably
> problematic in combination with a moving GC.
>
> Helmut
Right, I must have overlooked that back then :-/
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-23 23:20 ` Pip Cet via Emacs development discussions.
@ 2024-12-24 5:38 ` Helmut Eller
2024-12-24 6:27 ` Gerd Möllmann
2024-12-24 10:09 ` Pip Cet via Emacs development discussions.
0 siblings, 2 replies; 91+ messages in thread
From: Helmut Eller @ 2024-12-24 5:38 UTC (permalink / raw)
To: Pip Cet; +Cc: Gerd Möllmann, Eli Zaretskii, ofv, emacs-devel, acorallo
On Mon, Dec 23 2024, Pip Cet wrote:
>> sxhash_eq doesn't fly with headerless objects.
>
> Which objects would that be?
>
> Right now all IGC objects have headers, right? Did I miss any?
Right, but I'd like to keep that option on the table.
>> It should be obsoleted, IMO.
[...]
> That leaves conses. My guess so far was that you wanted to implement a
> hack where a headerless cons is a two-word object that would turn into a
> tagged pointer to another two-word object with a header as soon as its
> hash value is taken. That requires slowing down either XCAR or XCDR, I
> think, and that's sufficient reason for me not to do it, but I guess I
> misunderstood your plans. This would also mean sxhash_eq would allocate
> memory, so it couldn't be called from a signal handler without yet
> another workaround.
I would go the obvious way: use segregated allocation. Each Lisp_Type
gets its own MPS pool, without igc-headers. The dflt pool would only
contain non-lisp types, like IGC_OBJ_STRING_DATA, with igc-headers.
That wouldn't slow down XCAR, but it requires that hash tables use MPS's
location dependencies.
Helmut
^ permalink raw reply [flat|nested] 91+ messages in thread
* SIGPROF + SIGCHLD and igc
2024-12-23 20:49 ` Gerd Möllmann
2024-12-23 21:43 ` Helmut Eller
@ 2024-12-24 6:03 ` Gerd Möllmann
2024-12-24 8:23 ` Helmut Eller
2024-12-24 12:54 ` Eli Zaretskii
1 sibling, 2 replies; 91+ messages in thread
From: Gerd Möllmann @ 2024-12-24 6:03 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: pipcet, ofv, emacs-devel, eller.helmut, acorallo
(I've given this a new subject.)
Gerd Möllmann <gerd.moellmann@gmail.com> writes:
> Eli Zaretskii <eliz@gnu.org> writes:
>
>>> From: Gerd Möllmann <gerd.moellmann@gmail.com>
>>> Cc: Eli Zaretskii <eliz@gnu.org>, ofv@wanadoo.es, emacs-devel@gnu.org,
>>> eller.helmut@gmail.com, acorallo@gnu.org
>>> Date: Mon, 23 Dec 2024 18:44:42 +0100
>>>
>>> BTW, do you know which signal handlers use Lisp, i.e. allocate Lisp
>>> objects or access some? All? Or, would it be realistic to rewrite signal
>>> handlers to not do that?
>>
>> SIGPROF does (it's the basis for our Lisp profiler).
>>
>> SIGCHLD doesn't run Lisp (I think), but it examines objects and data
>> structures of the Lisp machine (those related to child processes).
>>
>>> One thing I've seen done elsewhere is to publish a message to a message
>>> board so that it can be handled outside of the signal handler. Something
>>> like that, you know what I mean.
>>
>> This is tricky for the profiler, because you want to sample the
>> function in which you are right there and then, not some time later.
>>
>> For SIGCHLD this could work, but it might make Emacs slower in
>> handling subprocesses (there are some Lisp packages that fire
>> subprocesses at very high rate).
>
> Thanks.
>
> I've looked at SIGPROF. From an admittedly brief look at this, I'd
> summarize my results as:
>
> - The important part is get_backtrace. The rest could be done elsewhere
> by posting that to a message board, or whatever the mechanism is at
> the end.
>
> - Didn't see get_backtrace or functions called from it allocating Lisp
> objects.
>
> - It reads from a Lisp object because of
>
> #define specpdl (current_thread->m_specpdl)
> #define specpdl_end (current_thread->m_specpdl_end)
> #define specpdl_ptr (current_thread->m_specpdl_ptr)
>
> current_thread is a struct thread_state which is a PVEC_THREAD.
>
> - I remember that I wrote a scanner for the specpdl stacks, so that's
> not a Lisp object but a root, so no problem here, I think.
>
> - struct thread_state allocation is done in igc.c via alloc_immovable in
> igc_alloc_pseudovector. That allocates from an AMS pool, which
> doesn't use barriers.
>
> - It doesn't seem to access other Lisp objects except current_thread.
>
> That doesn't look bad, I think. Worth mentioning is perhaps that
> directly after get_backtrace here
>
> static void
> record_backtrace (struct profiler_log *plog, EMACS_INT count)
> {
> log_t *log = plog->log;
> get_backtrace (log->trace, log->depth);
> EMACS_UINT hash = trace_hash (log->trace, log->depth);
>
> we access Lisp objects in trace_hash when computing the hash and in the
> other hash table code. IIUC that code counts hits with the same
> backtrace. Don't know how long that takes. But if posting the backtrace
> would take the same time, we would be on par.
>
> I'll try to also look at SIGCHLD at some later point, but Christmas,
> family etc.
>
> Happy holidays!
Been up a bit early, so...
This is about SIGCHLD, and I must say I find it a bit hard to tell if
all other platforms do the same. There are simply too many #if's to
consider in the signal handling code.
Anyway, what I see here: SIGCHLD doesn't do anything dangerous in the
signal handler. Instead, the occurrence of SIGCHLD is added to a queue
with enqueue_async_work and that's basically it.
The work items in the queue are processed by process_pending_signals,
outside of the signal handler. Very nice, that's how it should be :-).
(And maybe, just as an inspiration, one could use that construct for
SIGPROF?)
So, there is actually no problem at all with SIGCHLD that I can see.
My personal summary for SIGPROF + SIGCHLD at this point:
- I recommend rewriting SIGPROF handling in the way I tried to describe,
possibly using the existing work queue mechanism. Everything else looks
too complicated to me.
- Lisp allocation in signal handlers cannot exist because alloc.c is not
reentrant which means we would crash with the old GC. We don't need
anything extra for that in igc.
- No longer wondering why macOS does not show any problems in that whole
area. The only problem is SIGPROF accessing Lisp objects, and the
memory barrier is not a problem on macOS because it doesn't use
signals.
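The general shape of "record in the handler, process outside of it" can
be sketched as follows. This is plain C with hypothetical names
(work_kind, note_work, drain_pending_work, count_run) and is not the
queue code on either branch. The handler only stores to volatile
sig_atomic_t flags; the main loop later drains them.

```c
#include <signal.h>

/* Hypothetical kinds of deferred work; not Emacs identifiers.  */
enum work_kind { WORK_CHILD_EXITED, WORK_PROFILE_TICK, WORK_NKINDS };

static volatile sig_atomic_t pending[WORK_NKINDS];
static volatile sig_atomic_t any_pending;

/* Safe to call from a handler: only stores to sig_atomic_t flags.  */
static void
note_work (enum work_kind kind)
{
  pending[kind] = 1;
  any_pending = 1;
}

/* Example handler for a SIGCHLD-like signal.  */
static void
child_handler (int sig)
{
  (void) sig;
  note_work (WORK_CHILD_EXITED);
}

/* Example callback that just counts how often each kind was run.  */
static int runs[WORK_NKINDS];

static void
count_run (enum work_kind kind)
{
  runs[kind]++;
}

/* Main loop side: call RUN once for each pending kind and return the
   number of kinds handled.  */
static int
drain_pending_work (void (*run) (enum work_kind))
{
  int handled = 0;
  if (!any_pending)
    return 0;
  any_pending = 0;
  for (int k = 0; k < WORK_NKINDS; k++)
    if (pending[k])
      {
        pending[k] = 0;
        run ((enum work_kind) k);
        handled++;
      }
  return handled;
}
```

Flags coalesce repeated signals of the same kind, so the deferred code
has to loop over everything that might have happened (for child exits,
reaping until no more children report). Because the handler writes no
shared structure beyond single flags, nested signals cannot corrupt it.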
Please double-check!
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-24 5:38 ` Helmut Eller
@ 2024-12-24 6:27 ` Gerd Möllmann
2024-12-24 10:09 ` Pip Cet via Emacs development discussions.
1 sibling, 0 replies; 91+ messages in thread
From: Gerd Möllmann @ 2024-12-24 6:27 UTC (permalink / raw)
To: Helmut Eller; +Cc: Pip Cet, Eli Zaretskii, ofv, emacs-devel, acorallo
Helmut Eller <eller.helmut@gmail.com> writes:
> On Mon, Dec 23 2024, Pip Cet wrote:
>
>>> sxhash_eq doesn't fly with headerless objects.
>>
>> Which objects would that be?
>>
>> Right now all IGC objects have headers, right? Did I miss any?
>
> Right, but I'd like to keep that option on the table.
>
>>> It should be obsoleted, IMO.
Agree. I think sxhash-eq sort of leaks details of the GC implementation.
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: SIGPROF + SIGCHLD and igc
2024-12-24 6:03 ` SIGPROF + SIGCHLD and igc Gerd Möllmann
@ 2024-12-24 8:23 ` Helmut Eller
2024-12-24 8:39 ` Gerd Möllmann
2024-12-24 13:05 ` Eli Zaretskii
2024-12-24 12:54 ` Eli Zaretskii
1 sibling, 2 replies; 91+ messages in thread
From: Helmut Eller @ 2024-12-24 8:23 UTC (permalink / raw)
To: Gerd Möllmann; +Cc: Eli Zaretskii, pipcet, ofv, emacs-devel, acorallo
On Tue, Dec 24 2024, Gerd Möllmann wrote:
[...]
> Anyway, what I see here: SIGCHLD doesn't do anything dangerous in the
> signal handler. Instead, the occurrence of SIGCHLD is added to a queue
> with enqueue_async_work and that's basically it.
Wrong branch! enqueue_async_work doesn't exist in master. ISTR that in
master the handler iterates through process-list. Also, Pip said
something to the effect that the queue is not signal safe, because
signals can nest or something like that. Also, Eli didn't like
enqueue_async_work much.
> My personal summary for SIGPROF + SIGCHLD at this point:
>
> - I recommend rewriting SIGPROF handling in the way I tried to describe,
> possibly using the existing work queue mechanism. Everything else looks
> too complicated to me.
>
> - Lisp allocation in signal handlers cannot exist because alloc.c is not
> reentrant which means we would crash with the old GC. We don't need
> anything extra for that in igc.
>
> - No longer wondering why macOS does not show any problems in that whole
> area. The only problem is SIGPROF accessing Lisp objects, and the
> memory barrier is not a problem on macOS because it doesn't use
> signals.
>
> Please double-check!
I think, SIGIO might cause trouble. But that async IO code in process.c
is sooo hard to read. I wonder if it would be simpler with threads,
e.g. one thread per Lisp_Process.
Helmut
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: SIGPROF + SIGCHLD and igc
2024-12-24 8:23 ` Helmut Eller
@ 2024-12-24 8:39 ` Gerd Möllmann
2024-12-25 9:22 ` Helmut Eller
2024-12-24 13:05 ` Eli Zaretskii
1 sibling, 1 reply; 91+ messages in thread
From: Gerd Möllmann @ 2024-12-24 8:39 UTC (permalink / raw)
To: Helmut Eller; +Cc: Eli Zaretskii, pipcet, ofv, emacs-devel, acorallo
Helmut Eller <eller.helmut@gmail.com> writes:
> On Tue, Dec 24 2024, Gerd Möllmann wrote:
>
> [...]
>> Anyway, what I see here: SIGCHLD doesn't do anything dangerous in the
>> signal handler. Instead, the occurrence of SIGCHLD is added to a queue
>> with enqueue_async_work and that's basically it.
>
> Wrong branch! enqueue_async_work doesn't exist in master. ISTR that in
> master it iterates through process-list. Also, Pip said something that
> the queue is not signal safe, because signals can nest or something like
> that. Also, Eli didn't like enqueue_async_work much.
Oops, thanks for checking! And 👍 to Pip. Then we have to see what to do
with nested signals if that's a problem.
>> My personal summary for SIGPROF + SIGCHLD at this point:
>>
>> - I recommend rewriting SIGPROF handling in the way I tried to describe,
>> possibly using the existing work queue mechanism. Everything else looks
>> too complicated to me.
>>
>> - Lisp allocation in signal handlers cannot exist because alloc.c is not
>> reentrant which means we would crash with the old GC. We don't need
>> anything extra for that in igc.
>>
>> - No longer wondering why macOS does not show any problems in that whole
>> area. The only problem is SIGPROF accessing Lisp objects, and the
>> memory barrier is not a problem on macOS because it doesn't use
>> signals.
>>
>> Please double-check!
>
> I think, SIGIO might cause trouble. But that async IO code in process.c
> is sooo hard to read. I wonder if it would be simpler with threads,
> e.g. one thread per Lisp_Process.
It's a maze :-(.
BTW, do you agree with my analysis that Lisp allocations can't possibly
exist in signal handlers today?
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-24 3:37 ` Eli Zaretskii
@ 2024-12-24 8:48 ` Benjamin Riefenstahl
2024-12-24 13:52 ` Eli Zaretskii
0 siblings, 1 reply; 91+ messages in thread
From: Benjamin Riefenstahl @ 2024-12-24 8:48 UTC (permalink / raw)
To: Eli Zaretskii
Cc: pipcet, gerd.moellmann, ofv, emacs-devel, eller.helmut, acorallo
Eli Zaretskii writes:
> Emacs on Windows emulates some Posix signals (SIGPROF, SIGCHLD,
> SIGALRM), so this affects the Windows build as well.
That's interesting. But does this emulation have the same constraints
as POSIX signals have?
benny
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-24 4:05 ` Gerd Möllmann
@ 2024-12-24 8:50 ` Gerd Möllmann
0 siblings, 0 replies; 91+ messages in thread
From: Gerd Möllmann @ 2024-12-24 8:50 UTC (permalink / raw)
To: Helmut Eller; +Cc: Eli Zaretskii, pipcet, ofv, emacs-devel, acorallo
Gerd Möllmann <gerd.moellmann@gmail.com> writes:
> Helmut Eller <eller.helmut@gmail.com> writes:
>
>> On Mon, Dec 23 2024, Gerd Möllmann wrote:
>>
>>> [...]
>>> Worth mentioning is perhaps that [...]
>>> directly after get_backtrace here [...]
>>> we access Lisp objects in trace_hash when computing the hash and in the
>>> other hash table code.
>>
>> Also worth mentioning is that trace_hash uses XHASH, which is probably
>> problematic in combination with a moving GC.
>>
>> Helmut
>
> Right, I must have overlooked that back then :-/
I've pushed 2 things to igc. One for the above, and a second that
removes XHASH for HAVE_MPS. Hope the second works for all platforms; at
least I couldn't find uses elsewhere with git-grep.
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-24 5:38 ` Helmut Eller
2024-12-24 6:27 ` Gerd Möllmann
@ 2024-12-24 10:09 ` Pip Cet via Emacs development discussions.
1 sibling, 0 replies; 91+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-24 10:09 UTC (permalink / raw)
To: Helmut Eller
Cc: Gerd Möllmann, Eli Zaretskii, ofv, emacs-devel, acorallo
"Helmut Eller" <eller.helmut@gmail.com> writes:
> On Mon, Dec 23 2024, Pip Cet wrote:
>
>>> sxhash_eq doesn't fly with headerless objects.
>>
>> Which objects would that be?
>>
>> Right now all IGC objects have headers, right? Did I miss any?
>
> Right, but I'd like to keep that option on the table.
I see one specific case where it would be useful: storing 64-bit
integers on 32-bit systems. We don't need the entire integer range,
since -256M .. +256M - 1 are fixnums (assuming we reduce fixnum range by
1 bit). So we have 512M unused values, which is precisely the number of
possible forwarding pointers if we maintain 8-byte alignment. We can
use two "impossible" forwarding pointers for 1-word padding and N-word
padding, so this case works out precisely. No hash problems, since a
u64 is constant and we can hash the contents instead.
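
The counting argument above can be checked with a few lines of C. This is
only a sketch of the arithmetic: it assumes "M" means 2^20, a 32-bit address
space, and that the current fixnum range is exactly one bit wider than the
reduced range -256M .. +256M - 1 quoted above. The function names are
hypothetical and do not exist in Emacs.

```c
#include <stdint.h>

/* Number of fixnums in the reduced range -256M .. +256M - 1.  */
static uint64_t
reduced_fixnum_count (void)
{
  int64_t lo = -(INT64_C (1) << 28);     /* -256M */
  int64_t hi = (INT64_C (1) << 28) - 1;  /* +256M - 1 */
  return (uint64_t) (hi - lo + 1);
}

/* Values freed by giving up one bit of fixnum range: the wider range
   holds twice as many values as the reduced one (assumption stated
   in the lead-in).  */
static uint64_t
freed_fixnum_values (void)
{
  uint64_t reduced = reduced_fixnum_count ();
  uint64_t original = 2 * reduced;
  return original - reduced;
}

/* Number of distinct 8-byte-aligned addresses in a 32-bit space,
   i.e. the number of possible forwarding pointers.  */
static uint64_t
aligned_pointer_slots (void)
{
  return (UINT64_C (1) << 32) / 8;
}
```

Both counts come out as 2^29, which is the "512M" in the paragraph above.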
The only relevant 2-word object is conses, and I don't see a way to do
it for them.
Most N-word objects with N>2 are either fairly large to begin with, or
they're vectorlikes and we have a redundant size field, which we can get
rid of.
>>> It should be obsoleted, IMO.
>
> [...]
>> That leaves conses. My guess so far was that you wanted to implement a
>> hack where a headerless cons is a two-word object that would turn into a
>> tagged pointer to another two-word object with a header as soon as its
>> hash value is taken. That requires slowing down either XCAR or XCDR, I
>> think, and that's sufficient reason for me not to do it, but I guess I
>> misunderstood your plans. This would also mean sxhash_eq would allocate
>> memory, so it couldn't be called from a signal handler without yet
>> another workaround.
>
> I would go the obvious way: use segregated allocation. Each Lisp_Type
> gets its own MPS pool, without igc-headers. The dflt pool would only
Why bother for non-conses?
> contain non-lisp types, like IGC_OBJ_STRING_DATA, with igc-headers.
> That wouldn't slow down XCAR, but it requires that hash tables use MPS's
> location dependencies.
I don't think we want to use location dependencies: even if we solved
all the other problems (Fsxhash_eq, permanent hashes for those places
where we can't rehash), I'm pretty sure rehashing would kill us. In
particular, if we somehow managed to make GC more fine-grained and move
fewer objects, we'd end up rehashing more, so suddenly we'd have an
incentive not to use minor GCs.
But I confess that I haven't looked at the location dependency code.
There's no need for us to use it, and from the documentation it seemed
it wouldn't be a good idea to start using it if you don't have to.
(Also, at that point, shouldn't we just use an AMS pool for conses?)
Pip
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-24 4:03 ` Gerd Möllmann
@ 2024-12-24 10:25 ` Pip Cet via Emacs development discussions.
2024-12-24 10:50 ` Gerd Möllmann
2024-12-24 13:15 ` Eli Zaretskii
2024-12-24 12:26 ` Eli Zaretskii
1 sibling, 2 replies; 91+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-24 10:25 UTC (permalink / raw)
To: Gerd Möllmann
Cc: Eli Zaretskii, ofv, emacs-devel, eller.helmut, acorallo
Gerd Möllmann <gerd.moellmann@gmail.com> writes:
> New day, new beliefs :-). Today, when I read my question again, I'd
> actually be surprised if a signal handler could allocate Lisp objects
> because I wouldn't be able to explain how that works with alloc.c which
> isn't reentrant. Not even Fcons is reentrant when I look at it now.
>
> Correct, or am I overlooking something? Could others please check? If
> it's right, things get a lot easier.
I agree. But Eli said something about wanting to run Lisp from a signal
handler, which would change that. I was trying to explain why we don't
want to do that.
> Maybe allocation of Lisp objects on the stack remains as some sort of
> problem (AUTO_CONS etc)? I don't see how though, ATM.
Stack objects are always optional, so if there is code that attempts to
avoid alloc.c by using those, it's broken.
My current patch makes it so the main thread never takes the arena lock,
ever. Performance isn't quite the same as scratch/igc: for some reason
I don't understand, it's slightly better. Still needs cleanup,
de-pthreading, and we probably don't need to use atomic types
everywhere.
Pip
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-24 10:25 ` Pip Cet via Emacs development discussions.
@ 2024-12-24 10:50 ` Gerd Möllmann
2024-12-24 13:15 ` Eli Zaretskii
1 sibling, 0 replies; 91+ messages in thread
From: Gerd Möllmann @ 2024-12-24 10:50 UTC (permalink / raw)
To: Pip Cet; +Cc: Eli Zaretskii, ofv, emacs-devel, eller.helmut, acorallo
Pip Cet <pipcet@protonmail.com> writes:
> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
>
>> New day, new beliefs :-). Today, when I read my question again, I'd
>> actually be surprised if a signal handler could allocate Lisp objects
>> because I wouldn't be able to explain how that works with alloc.c which
>> isn't reentrant. Not even Fcons is reentrant when I look at it now.
>>
>> Correct, or am I overlooking something? Could others please check? If
>> it's right, things get a lot easier.
>
> I agree. But Eli said something about wanting to run Lisp from a signal
> handler, which would change that. I was trying to explain why we don't
> want to do that.
Thanks for checking! Must be kind of a misunderstanding going on. And
anyway, it would be a feature we don't have with the old GC, so I'd
declare it out of scope :-).
>> Maybe allocation of Lisp objects on the stack remains as some sort of
>> problem (AUTO_CONS etc)? I don't see how though, ATM.
>
> Stack objects are always optional, so if there is code that attempts to
> avoid alloc.c by using those, it's broken.
Yes!
> My current patch makes it so the main thread never takes the arena lock,
> ever.
Hm, how and why does that work?
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-23 23:37 ` Some experience with the igc branch Pip Cet via Emacs development discussions.
2024-12-24 4:03 ` Gerd Möllmann
@ 2024-12-24 12:11 ` Eli Zaretskii
1 sibling, 0 replies; 91+ messages in thread
From: Eli Zaretskii @ 2024-12-24 12:11 UTC (permalink / raw)
To: Pip Cet; +Cc: gerd.moellmann, ofv, emacs-devel, eller.helmut, acorallo
> Date: Mon, 23 Dec 2024 23:37:13 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: Gerd Möllmann <gerd.moellmann@gmail.com>, ofv@wanadoo.es, emacs-devel@gnu.org, eller.helmut@gmail.com, acorallo@gnu.org
>
> "Eli Zaretskii" <eliz@gnu.org> writes:
>
> 1. which signal handlers want to read Lisp data
> 2. which signal handlers want to write Lisp data
> 3. which signal handlers want to allocate Lisp objects temporarily,
> while guaranteeing no references to those objects survive when the
> signal handler returns.
> 4. which signal handlers want to allocate Lisp objects permanently,
> storing references to the new objects in "old" data
> 4a. ... and are willing to call a special transformation function to do
> so
> 4b. ... and want to do so implicitly, expecting memory manipulation to
> "just work".
>
> 1: definitely works
> 2: should work, but may hit a write barrier
> 3: could be made to work if there's interest
> 4a: if we must
> 4b: see the other thread. If we have both make_object_writable
> (formerly CHECK_IMPURE) and commit_object_changes functions and call
> them consistently, it might be possible to find a way.
>
> > SIGPROF does (it's the basis for our Lisp profiler).
>
> That's 1, 2, but not 3 or 4, right?
I don't think I understand your categories well enough, and anyway
didn't look at the code to find out where it stops in that scale.
> > SIGCHLD doesn't run Lisp (I think), but it examines objects and data
> > structures of the Lisp machine (those related to child processes).
>
> Just 1, then?
Ditto. It calls various functions, which I didn't trace into.
> >> One thing I've seen done elsewhere is to publish a message to a message
> >> board so that it can be handled outside of the signal handler. Something
> >> like that, you know what I mean.
> >
> > This is tricky for the profiler, because you want to sample the
> > function in which you are right there and then, not some time later.
>
> But would it be so bad to use a copy of the specpdl stack, placed in a
> prepared area which is a GC root so we'd guarantee survival (but not
> immutability; I don't think that matters in practice) of entries?
> memcpy is safe to call from a signal handler, and then we could do all
> of the processing safely.
How will you ensure that the copied specpdl stack faithfully tells the
profile info? It will most probably introduce bias into the profile.
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-23 23:39 ` Pip Cet via Emacs development discussions.
@ 2024-12-24 12:14 ` Eli Zaretskii
2024-12-24 13:18 ` Pip Cet via Emacs development discussions.
2024-12-24 13:42 ` Benjamin Riefenstahl
0 siblings, 2 replies; 91+ messages in thread
From: Eli Zaretskii @ 2024-12-24 12:14 UTC (permalink / raw)
To: Pip Cet
Cc: b.riefenstahl, gerd.moellmann, ofv, emacs-devel, eller.helmut,
acorallo
> Date: Mon, 23 Dec 2024 23:39:41 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: Eli Zaretskii <eliz@gnu.org>, gerd.moellmann@gmail.com, ofv@wanadoo.es, emacs-devel@gnu.org, eller.helmut@gmail.com, acorallo@gnu.org
>
> "Benjamin Riefenstahl" <b.riefenstahl@turtle-trading.net> writes:
>
> The allocation thread approach should work for all of them. If we have
> stdatomic.h, performance should be acceptable.
We should carefully discuss the design and its implications before we
conclude that this is an idea that is good enough to justify such a
significant change. If nothing else, it throws out the window several
months of everyone's experience with the current implementation.
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-24 4:03 ` Gerd Möllmann
2024-12-24 10:25 ` Pip Cet via Emacs development discussions.
@ 2024-12-24 12:26 ` Eli Zaretskii
2024-12-24 12:56 ` Gerd Möllmann
1 sibling, 1 reply; 91+ messages in thread
From: Eli Zaretskii @ 2024-12-24 12:26 UTC (permalink / raw)
To: Gerd Möllmann; +Cc: pipcet, ofv, emacs-devel, eller.helmut, acorallo
> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: Eli Zaretskii <eliz@gnu.org>, ofv@wanadoo.es, emacs-devel@gnu.org,
> eller.helmut@gmail.com, acorallo@gnu.org
> Date: Tue, 24 Dec 2024 05:03:36 +0100
>
> I'd prefer to send messages from handle_profiler_signal. Or something
> equivalent to sending messages.
How would that be different? If the messages arrive asynchronously
and are handled asynchronously, that's the moral equivalent of
signals, no? If the messages are not handled asynchronously, how do
we make sure the obtained profile is accurate?
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: SIGPROF + SIGCHLD and igc
2024-12-24 6:03 ` SIGPROF + SIGCHLD and igc Gerd Möllmann
2024-12-24 8:23 ` Helmut Eller
@ 2024-12-24 12:54 ` Eli Zaretskii
2024-12-24 12:59 ` Gerd Möllmann
1 sibling, 1 reply; 91+ messages in thread
From: Eli Zaretskii @ 2024-12-24 12:54 UTC (permalink / raw)
To: Gerd Möllmann; +Cc: pipcet, ofv, emacs-devel, eller.helmut, acorallo
> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: pipcet@protonmail.com, ofv@wanadoo.es, emacs-devel@gnu.org,
> eller.helmut@gmail.com, acorallo@gnu.org
> Date: Tue, 24 Dec 2024 07:03:53 +0100
>
> (I've given this a new subject.)
Not a second too soon!
> This is about SIGCHLD, and I must say I find it a bit hard to tell if
> all other platforms do the same. There are simply too many #if's to
> consider in the signal handling code.
>
> Anyway, what I see here: SIGCHLD doesn't do anything dangerous in the
> signal handler. Instead, the occurrence of SIGCHLD is added to a queue
> with enqueue_async_work and that's basically it.
Are we looking at the same code? I was talking about
handle_child_signal, which is called thusly:
  static void
  deliver_child_signal (int sig)
  {
    deliver_process_signal (sig, handle_child_signal);
  }
What I see in handle_child_signal is not what you describe above.
> The work items in the queue are processed by process_pending_signals,
> outside of the signal handler. Very nice, that's how it should be :-).
I think you are looking at how SIGIO and SIGALRM are processed.
> (And maybe, just as an inspiration, one could use that construct for
> SIGPROF?)
Could one? SIGPROF's handler should sample the "program counter", so
delaying the sample will sample it in a wrong place. Right?
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-24 12:26 ` Eli Zaretskii
@ 2024-12-24 12:56 ` Gerd Möllmann
2024-12-24 13:19 ` Pip Cet via Emacs development discussions.
2024-12-24 13:46 ` Eli Zaretskii
0 siblings, 2 replies; 91+ messages in thread
From: Gerd Möllmann @ 2024-12-24 12:56 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: pipcet, ofv, emacs-devel, eller.helmut, acorallo
Eli Zaretskii <eliz@gnu.org> writes:
>> From: Gerd Möllmann <gerd.moellmann@gmail.com>
>> Cc: Eli Zaretskii <eliz@gnu.org>, ofv@wanadoo.es, emacs-devel@gnu.org,
>> eller.helmut@gmail.com, acorallo@gnu.org
>> Date: Tue, 24 Dec 2024 05:03:36 +0100
>>
>> I'd prefer to send messages from handle_profiler_signal. Or something
>> equivalent to sending messages.
>
> How would that be different? If the messages arrive asynchronously
> and are handled asynchronously, that's the moral equivalent of
> signals, no?
I'm using SIGPROF below to make it more concrete. Similar for other
signals.
The idea is to get the backtrace in the SIGPROF handler, without
accessing Lisp data. That can be done, as I've tried to show.
Then place that backtrace somewhere.
In an actor model architecture, one would use a message that contains
the backtrace and post it to a message board. I used that architecture
just as an example, because I like it a lot. In the same architecture,
typically a scheduler thread would then assign a thread to handle the
message. The handler handling the profiler message would then do what
record_backtrace today does after get_backtrace, i.e. count same
backtraces.
That's only one example architecture, of course. One can use something
else, like queues that are handled by another thread, one doesn't need a
scheduler thread, and so on, and so on. Pip's work queue is an
example.
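
A minimal sketch of the "take the backtrace in the handler, count it
elsewhere" idea, under loud assumptions: a single producer (the handler is
never re-entered, so nested signals are not handled here), a single
consumer, lock-free size_t atomics, and frames represented as opaque
word-sized ids rather than Lisp_Objects. None of these names (post_sample,
drain_samples, tally_sample) exist in Emacs; they are made up for
illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

enum { MAX_DEPTH = 16, RING_SIZE = 64 };

/* Hypothetical sample record: a truncated backtrace of opaque ids.  */
struct sample
{
  size_t depth;
  unsigned long frames[MAX_DEPTH];
};

static struct sample ring[RING_SIZE];  /* preallocated, never resized */
static atomic_size_t ring_head;        /* advanced only by the producer */
static atomic_size_t ring_tail;        /* advanced only by the consumer */
static atomic_size_t dropped;          /* samples lost to a full ring */

/* Producer side, meant to be called from the signal handler:
   no allocation, no locks, only memcpy and atomic loads/stores.  */
static void
post_sample (const unsigned long *frames, size_t depth)
{
  size_t head = atomic_load_explicit (&ring_head, memory_order_relaxed);
  size_t tail = atomic_load_explicit (&ring_tail, memory_order_acquire);
  if (head - tail == RING_SIZE)
    {
      /* Ring is full: drop the sample instead of blocking.  */
      atomic_fetch_add_explicit (&dropped, 1, memory_order_relaxed);
      return;
    }
  struct sample *s = &ring[head % RING_SIZE];
  s->depth = depth < MAX_DEPTH ? depth : MAX_DEPTH;
  memcpy (s->frames, frames, s->depth * sizeof *frames);
  /* Publish the slot only after its contents are written.  */
  atomic_store_explicit (&ring_head, head + 1, memory_order_release);
}

/* Consumer side, run outside the handler.  Returns the number of
   samples passed to HANDLE.  */
static size_t
drain_samples (void (*handle) (const struct sample *))
{
  assert (handle != NULL);
  size_t n = 0;
  size_t tail = atomic_load_explicit (&ring_tail, memory_order_relaxed);
  while (tail != atomic_load_explicit (&ring_head, memory_order_acquire))
    {
      handle (&ring[tail % RING_SIZE]);
      tail++;
      /* Release the slot back to the producer.  */
      atomic_store_explicit (&ring_tail, tail, memory_order_release);
      n++;
    }
  return n;
}

/* Stand-in consumer: a real one would count identical backtraces;
   this one only counts samples.  */
static size_t samples_seen;

static void
tally_sample (const struct sample *s)
{
  (void) s;
  samples_seen++;
}
```

The sample is still taken at signal time, so what gets recorded is the state
at the moment of the signal; only the bookkeeping is deferred. A full ring
drops samples rather than blocking, which is one possible source of bias.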
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: SIGPROF + SIGCHLD and igc
2024-12-24 12:54 ` Eli Zaretskii
@ 2024-12-24 12:59 ` Gerd Möllmann
0 siblings, 0 replies; 91+ messages in thread
From: Gerd Möllmann @ 2024-12-24 12:59 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: pipcet, ofv, emacs-devel, eller.helmut, acorallo
Eli Zaretskii <eliz@gnu.org> writes:
>> From: Gerd Möllmann <gerd.moellmann@gmail.com>
>> Cc: pipcet@protonmail.com, ofv@wanadoo.es, emacs-devel@gnu.org,
>> eller.helmut@gmail.com, acorallo@gnu.org
>> Date: Tue, 24 Dec 2024 07:03:53 +0100
>>
>> (I've given this a new subject.)
>
> Not a second too soon!
>
>> This is about SIGCHLD, and I must say I find it a bit hard to tell if
>> all other platforms do the same. There are simply too many #if's to
>> consider in the signal handling code.
>>
>> Anyway, what I see here: SIGCHLD doesn't do anything dangerous in the
>> signal handler. Instead, the occurrence of SIGCHLD is added to a queue
>> with enqueue_async_work and that's basically it.
>
> Are we looking at the same code? I was talking about
> handle_child_signal, which is called thusly:
No we aren't :-). My mistake. I was looking at the code Pip wrote.
See Helmut's later message and my response.
>
> static void
> deliver_child_signal (int sig)
> {
> deliver_process_signal (sig, handle_child_signal);
> }
>
> What I see in handle_child_signal is not what you describe above.
>
>> The work items in the queue are processed by process_pending_signals,
>> outside of the signal handler. Very nice, that's how it should be :-).
>
> I think you are looking at how SIGIO and SIGALRM are processed.
>
>> (And maybe, just as an inspiration, one could use that construct for
>> SIGPROF?)
>
> Could one? SIGPROF's handler should sample the "program counter", so
> delaying the sample will sample it in a wrong place. Right?
Taking the backtrace would be done in the signal handler, the rest would
be done elsewhere. So, no.
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: SIGPROF + SIGCHLD and igc
2024-12-24 8:23 ` Helmut Eller
2024-12-24 8:39 ` Gerd Möllmann
@ 2024-12-24 13:05 ` Eli Zaretskii
2024-12-25 10:46 ` Helmut Eller
1 sibling, 1 reply; 91+ messages in thread
From: Eli Zaretskii @ 2024-12-24 13:05 UTC (permalink / raw)
To: Helmut Eller; +Cc: gerd.moellmann, pipcet, ofv, emacs-devel, acorallo
> From: Helmut Eller <eller.helmut@gmail.com>
> Cc: Eli Zaretskii <eliz@gnu.org>, pipcet@protonmail.com, ofv@wanadoo.es,
> emacs-devel@gnu.org, acorallo@gnu.org
> Date: Tue, 24 Dec 2024 09:23:11 +0100
>
> I think, SIGIO might cause trouble.
Why do you think so? Its handler does almost nothing, just sets a
flag (if you ignore the Android-specific stuff there).
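
For illustration, a handler that "just sets a flag" can be sketched as
follows. This is not the code in Emacs; the names pending_io, note_io and
take_pending_io are hypothetical, and the flag is polled from ordinary code
outside the handler.

```c
#include <signal.h>
#include <stdbool.h>

/* Hypothetical flag; sig_atomic_t is the type the C standard allows
   a handler to write.  */
static volatile sig_atomic_t pending_io;

/* The handler touches nothing but the flag.  */
static void
note_io (int sig)
{
  (void) sig;
  pending_io = 1;
}

/* Polled from normal code: report and clear the flag.  */
static bool
take_pending_io (void)
{
  if (!pending_io)
    return false;
  pending_io = 0;
  return true;
}
```

A handler of this shape reads and writes no heap objects, so it cannot run
into a memory barrier by itself.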
> But that async IO code in process.c is sooo hard to read. I wonder
> if it would be simpler with threads, e.g. one thread per
> Lisp_Process.
Heh, see w32proc.c.
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-24 10:25 ` Pip Cet via Emacs development discussions.
2024-12-24 10:50 ` Gerd Möllmann
@ 2024-12-24 13:15 ` Eli Zaretskii
1 sibling, 0 replies; 91+ messages in thread
From: Eli Zaretskii @ 2024-12-24 13:15 UTC (permalink / raw)
To: Pip Cet; +Cc: gerd.moellmann, ofv, emacs-devel, eller.helmut, acorallo
> Date: Tue, 24 Dec 2024 10:25:38 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: Eli Zaretskii <eliz@gnu.org>, ofv@wanadoo.es, emacs-devel@gnu.org, eller.helmut@gmail.com, acorallo@gnu.org
>
> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
>
> > New day, new beliefs :-). Today, when I read my question again, I'd
> > actually be surprised if a signal handler could allocate Lisp objects
> > because I wouldn't be able to explain how that works with alloc.c which
> > isn't reentrant. Not even Fcons is reentrant when I look at it now.
> >
> > Correct, or am I overlooking something? Could others please check? If
> > it's right, things get a lot easier.
>
> I agree. But Eli said something about wanting to run Lisp from a signal
> handler, which would change that.
Not Lisp, but the Lisp machine in general. Which includes access to
Lisp data to read and write it.
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-24 12:14 ` Eli Zaretskii
@ 2024-12-24 13:18 ` Pip Cet via Emacs development discussions.
2024-12-24 13:42 ` Benjamin Riefenstahl
1 sibling, 0 replies; 91+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-24 13:18 UTC (permalink / raw)
To: Eli Zaretskii
Cc: b.riefenstahl, gerd.moellmann, ofv, emacs-devel, eller.helmut,
acorallo
"Eli Zaretskii" <eliz@gnu.org> writes:
>> Date: Mon, 23 Dec 2024 23:39:41 +0000
>> From: Pip Cet <pipcet@protonmail.com>
>> Cc: Eli Zaretskii <eliz@gnu.org>, gerd.moellmann@gmail.com, ofv@wanadoo.es, emacs-devel@gnu.org, eller.helmut@gmail.com, acorallo@gnu.org
>>
>> "Benjamin Riefenstahl" <b.riefenstahl@turtle-trading.net> writes:
>>
>> The allocation thread approach should work for all of them. If we have
>> stdatomic.h, performance should be acceptable.
>
> We should carefully discuss the design and its implications before we
> conclude that this is an idea that is good enough to justify such a
> significant change. If nothing else, it throws out the window several
> months of everyone's experience with the current implementation.
I mostly agree, though I think to say that we throw out months of
experience is to overestimate the magnitude of the change a bit.
I'll push the bugfix I found, but I won't push this until there's some
sort of consensus about whether it's a good idea (we seem to be close to
a consensus that it isn't necessary or desirable; I'm the only one who
disagrees with that, and I can live with the "not desirable" part if
someone can convince me this change isn't necessary).
Pip
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-24 12:56 ` Gerd Möllmann
@ 2024-12-24 13:19 ` Pip Cet via Emacs development discussions.
2024-12-24 13:38 ` Gerd Möllmann
2024-12-24 13:46 ` Eli Zaretskii
1 sibling, 1 reply; 91+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-24 13:19 UTC (permalink / raw)
To: Gerd Möllmann
Cc: Eli Zaretskii, ofv, emacs-devel, eller.helmut, acorallo
Gerd Möllmann <gerd.moellmann@gmail.com> writes:
> I'm using SIGPROF below to make it more concrete. Similar for other
> signals.
>
> The idea is to get the backtrace in the SIGPROF handler, without
> accessing Lisp data. That can be done, as I've tried to show.
I don't understand. We need to access the specpdl, which I consider
Lisp data, and certainly the backtrace includes data which can only be
generated using MPS-managed memory.
> Then place that backtrace somewhere.
I still think it's better to copy the specpdl, since that allows us to
generate the "backtrace" (whatever we choose to use for that) in Lisp.
If we spend too much time allocating short-lived data which triggers too
many GCs, we want to know what to fix in the Lisp code.
Honestly, though, it doesn't matter much, does it?
> That's only one example architecture, of course. One can use something
> else, like queues that are handled by another thread, one doesn't need a
> scheduler thread, and so on, and so on. Pip's work queue is an
> example.
That's Helmut's code, not mine.
Pip
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-24 13:19 ` Pip Cet via Emacs development discussions.
@ 2024-12-24 13:38 ` Gerd Möllmann
0 siblings, 0 replies; 91+ messages in thread
From: Gerd Möllmann @ 2024-12-24 13:38 UTC (permalink / raw)
To: Pip Cet; +Cc: Eli Zaretskii, ofv, emacs-devel, eller.helmut, acorallo
Pip Cet <pipcet@protonmail.com> writes:
> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
>
>> I'm using SIGPROF below to make it more concrete. Similar for other
>> signals.
>>
>> The idea is to get the backtrace in the SIGPROF handler, without
>> accessing Lisp data. That can be done, as I've tried to show.
>
> I don't understand. We need to access the specpdl, which I consider
> Lisp data, and certainly the backtrace includes data which can only be
> generated using MPS-managed memory.
What I meant with it not being Lisp data is union specbinding. The stack
of bindings is a root, and doesn't have a barrier. And accessing the
stack is not a problem because PVEC_THREAD is allocated from AMS which
doesn't have barriers. What we collect in get_backtrace is an array of
Lisp_Objects for the functions, and that's okay.
>
>> Then place that backtrace somewhere.
>
> I still think it's better to copy the specpdl, since that allows us to
> generate the "backtrace" (whatever we choose to use for that) in Lisp.
> If we spend too much time allocating short-lived data which triggers too
> many GCs, we want to know what to fix in the Lisp code.
In a way, what get_backtrace does is copy part of the bindings stack,
only the functions. The resulting backtrace that the user sees could
be done in Lisp, maybe, don't know. Important part for me is that we get
out of the signal handler to do stuff.
> Honestly, though, it doesn't matter much, does it?
Right, it's all details.
>
>> That's only one example architecture, of course. One can use something
>> else, like queues that are handled by another thread, one doesn't need a
>> scheduler thread, and so on, and so on. Pip's work queue is an
>> example.
>
> That's Helmut's code, not mine.
+2👍 to Helmut, -👍 to Pip, -👍 to me :-)
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-24 12:14 ` Eli Zaretskii
2024-12-24 13:18 ` Pip Cet via Emacs development discussions.
@ 2024-12-24 13:42 ` Benjamin Riefenstahl
1 sibling, 0 replies; 91+ messages in thread
From: Benjamin Riefenstahl @ 2024-12-24 13:42 UTC (permalink / raw)
To: Eli Zaretskii
Cc: Pip Cet, gerd.moellmann, ofv, emacs-devel, eller.helmut, acorallo
>> From: Pip Cet <pipcet@protonmail.com>
>> [...]
>> "Benjamin Riefenstahl" <b.riefenstahl@turtle-trading.net> writes:
>>
>> The allocation thread approach should work for all of them. If we have
>> stdatomic.h, performance should be acceptable.
JFTR, that quote wasn't from me, I believe it belongs to Pip. ;-)
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-24 12:56 ` Gerd Möllmann
2024-12-24 13:19 ` Pip Cet via Emacs development discussions.
@ 2024-12-24 13:46 ` Eli Zaretskii
2024-12-24 14:12 ` Gerd Möllmann
1 sibling, 1 reply; 91+ messages in thread
From: Eli Zaretskii @ 2024-12-24 13:46 UTC (permalink / raw)
To: Gerd Möllmann; +Cc: pipcet, ofv, emacs-devel, eller.helmut, acorallo
> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: pipcet@protonmail.com, ofv@wanadoo.es, emacs-devel@gnu.org,
> eller.helmut@gmail.com, acorallo@gnu.org
> Date: Tue, 24 Dec 2024 13:56:18 +0100
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> >> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> >> Cc: Eli Zaretskii <eliz@gnu.org>, ofv@wanadoo.es, emacs-devel@gnu.org,
> >> eller.helmut@gmail.com, acorallo@gnu.org
> >> Date: Tue, 24 Dec 2024 05:03:36 +0100
> >>
> >> I'd prefer to send messages from handle_profiler_signal. Or something
> >> equivalent to sending messages.
> >
> > How would that be different? If the messages arrive asynchronously
> > and are handled asynchronously, that's the moral equivalent of
> > signals, no?
>
> I'm using SIGPROF below to make it more concrete. Similar for other
> signals.
>
> The idea is to get the backtrace in the SIGPROF handler, without
> accessing Lisp data. That can be done, as I've tried to show.
> Then place that backtrace somewhere.
Let's be more accurate: when I said "Lisp data", I actually meant any
data that is part of the Lisp machine's global state. That's because
you cannot safely access that state while the Lisp machine runs (and
modifies the state). You need the Lisp machine stopped in its tracks.
Agreed?
Now, with that definition, isn't specpdl stack part of "Lisp data"?
If so, and if we can safely access it from a signal handler, why do we
need to move it aside at all? And how would the "message handler" be
different in that aspect from a signal handler?
> In an actor model architecture, one would use a message that contains
> the backtrace and post it to a message board. I used that architecture
> just as an example, because I like it a lot. In the same architecture,
> typically a scheduler thread would then assign a thread to handle the
> message. The handler handling the profiler message would then do what
> record_backtrace today does after get_backtrace, i.e. count same
> backtraces.
What is the purpose of delaying the part of record_backtrace after
get_backtrace to later? Is the counting it does dangerous when done
from a signal handler?
> That's only one example architecture, of course. One can use something
> else, like queues that are handled by another thread, one doesn't need a
> scheduler thread, and so on, and so on. Pip's work queue is an
> example.
Doing this from another thread raises the problem I describe above: we
need the Lisp thread(s) stopped, because you cannot examine the data
of the Lisp machine while the machine is running. And if we stop the
Lisp threads, why do we need the other thread at all?
I guess we are tossing ideas without sufficient detail, so each one
understands something different from each idea (since we have
different backgrounds and experiences). My suggestion is to
describe each idea in enough detail to make the design and its
implications clear to all. A kind of DR, if you want. Then we will
be on the same page, and can have an effective discussion of the
various ideas.
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-24 8:48 ` Benjamin Riefenstahl
@ 2024-12-24 13:52 ` Eli Zaretskii
2024-12-24 13:54 ` Benjamin Riefenstahl
0 siblings, 1 reply; 91+ messages in thread
From: Eli Zaretskii @ 2024-12-24 13:52 UTC (permalink / raw)
To: Benjamin Riefenstahl
Cc: pipcet, gerd.moellmann, ofv, emacs-devel, eller.helmut, acorallo
> From: Benjamin Riefenstahl <b.riefenstahl@turtle-trading.net>
> Cc: pipcet@protonmail.com, gerd.moellmann@gmail.com, ofv@wanadoo.es,
> emacs-devel@gnu.org, eller.helmut@gmail.com, acorallo@gnu.org
> Date: Tue, 24 Dec 2024 10:48:34 +0200
>
> Eli Zaretskii writes:
> > Emacs on Windows emulates some Posix signals (SIGPROF, SIGCHLD,
> > SIGALRM), so this affects the Windows build as well.
>
> That's interesting. But does this emulation have the same constraints
> as POSIX signals have?
If it's a useful emulation, it must somehow generate an asynchronous
event, and then arrange for that event to call the signal handler.
Right? So the constraints we are talking about, which have to do with
the fact that the signal handlers are invoked asynchronously, are
definitely relevant for this emulation (or any useful emulation),
because the problem we discuss here is the situation where the signal
handler is invoked while MPS holds the arena lock.
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-24 13:52 ` Eli Zaretskii
@ 2024-12-24 13:54 ` Benjamin Riefenstahl
0 siblings, 0 replies; 91+ messages in thread
From: Benjamin Riefenstahl @ 2024-12-24 13:54 UTC (permalink / raw)
To: Eli Zaretskii
Cc: pipcet, gerd.moellmann, ofv, emacs-devel, eller.helmut, acorallo
Eli Zaretskii writes:
> So the constraints we are talking about, which have to do with the
> fact that the signal handlers are invoked asynchronously, are
> definitely relevant for this emulation (or any useful emulation),
> because the problem we discuss here is the situation where the signal
> handler is invoked while MPS holds the arena lock.
Understood.
Regards, benny
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-24 13:46 ` Eli Zaretskii
@ 2024-12-24 14:12 ` Gerd Möllmann
2024-12-24 14:40 ` Eli Zaretskii
2024-12-24 21:18 ` Pip Cet via Emacs development discussions.
0 siblings, 2 replies; 91+ messages in thread
From: Gerd Möllmann @ 2024-12-24 14:12 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: pipcet, ofv, emacs-devel, eller.helmut, acorallo
Eli Zaretskii <eliz@gnu.org> writes:
>>
>> I'm using SIGPROF below to make it more concrete. Similar for other
>> signals.
>>
>> The idea is to get the backtrace in the SIGPROF handler, without
>> accessing Lisp data. That can be done, as I've tried to show.
>> Then place that backtrace somewhere.
>
> Let's be more accurate: when I said "Lisp data", I actually meant any
> data that is part of the Lisp machine's global state. That's because
> you cannot safely access that state while the Lisp machine runs (and
> modifies the state). You need the Lisp machine stopped in its tracks.
> Agreed?
Ok, let's use that definition.
> Now, with that definition, isn't specpdl stack part of "Lisp data"?
> If so, and if we can safely access it from a signal handler, why do we
> need to move it aside at all? And how would the "message handler" be
> different in that aspect from a signal handler?
We're coming from the problem that MPS uses signals for memory barriers.
On platforms != macOS. And I am proposing a solution for that.
The SIGPROF handler does two things: (1) get the current backtrace,
which does not trip on memory barriers, and (2) build a summary, i.e.
count same backtraces using a hash table. (2) trips on memory barriers.
So my proposal is to do (1) in the signal handler and do (2)
elsewhere, not in the signal handler. Where (2) is done is a matter of
design. If we use Helmut's work queue, it would be the main thread, I
suppose.
In any case we're in "normal" multi-threading territory, with the usual
restrictions and so on, but these are restrictions Emacs has. And we
don't need anything from MPS, which might or might not be possible to
get.
>
>> In an actor model architecture, one would use a message that contains
>> the backtrace and post it to a message board. I used that architecture
>> just as an example, because I like it a lot. In the same architecture,
>> typically a scheduler thread would then assign a thread to handle the
>> message. The handler handling the profiler message would then do what
>> record_backtrace today does after get_backtrace, i.e. count same
>> backtraces.
>
> What is the purpose of delaying the part of record_backtrace after
> get_backtrace to later? Is the counting it does dangerous when done
> from a signal handler?
That part is (2), which can trip on memory barriers because it accesses
MPS-managed memory like vectors and so on.
>
>> That's only one example architecture, of course. One can use something
>> else, like queues that are handled by another thread, one doesn't need a
>> scheduler thread, and so on, and so on. Pip's work queue is an
>> example.
>
> Doing this from another thread raises the problem I describe above: we
> need the Lisp thread(s) stopped, because you cannot examine the data
> of the Lisp machine while the machine is running. And if we stop the
> Lisp threads, why do we need the other thread at all?
>
> I guess we are tossing ideas without sufficient detail, so each one
> understands something different from each idea (since we have
> different backgrounds and experiences). My suggestion is to
> describe each idea in enough detail to make the design and its
> implications clear to all. A kind of DR, if you want. Then we will
> be on the same page, and can have an effective discussion of the
> various ideas.
I hope the above helps. Please understand that I'm not proposing a
ready-made design, but mainly recommend moving (2) out of the signal
handler. Sorry if that was too abstract so far, I guess that's just the
way I'm thinking.
If it helps, maybe we should concentrate on solving this with Helmut's
work queue. Put the backtrace from (1) in the work queue, then do (2)
where the work queue is processed. Something like that.
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-24 14:12 ` Gerd Möllmann
@ 2024-12-24 14:40 ` Eli Zaretskii
2024-12-25 4:56 ` Gerd Möllmann
2024-12-24 21:18 ` Pip Cet via Emacs development discussions.
1 sibling, 1 reply; 91+ messages in thread
From: Eli Zaretskii @ 2024-12-24 14:40 UTC (permalink / raw)
To: Gerd Möllmann; +Cc: pipcet, ofv, emacs-devel, eller.helmut, acorallo
> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: pipcet@protonmail.com, ofv@wanadoo.es, emacs-devel@gnu.org,
> eller.helmut@gmail.com, acorallo@gnu.org
> Date: Tue, 24 Dec 2024 15:12:40 +0100
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> > Now, with that definition, isn't specpdl stack part of "Lisp data"?
> > If so, and if we can safely access it from a signal handler, why do we
> > need to move it aside at all? And how would the "message handler" be
> > different in that aspect from a signal handler?
>
> We're coming from the problem that MPS uses signals for memory barriers.
> On platforms != macOS. And I am proposing a solution for that.
>
> The SIGPROF handler does two things: (1) get the current backtrace,
> which does not trip on memory barriers, and (2) build a summary, i.e.
> count same backtraces using a hash table. (2) trips on memory barriers.
Can you elaborate on (2) and why it trips? I guess I'm missing
something because I don't understand which code in record_backtrace
does trip on memory barriers and why.
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-24 14:12 ` Gerd Möllmann
2024-12-24 14:40 ` Eli Zaretskii
@ 2024-12-24 21:18 ` Pip Cet via Emacs development discussions.
2024-12-25 5:23 ` Gerd Möllmann
1 sibling, 1 reply; 91+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-24 21:18 UTC (permalink / raw)
To: Gerd Möllmann
Cc: Eli Zaretskii, ofv, emacs-devel, eller.helmut, acorallo
Gerd Möllmann <gerd.moellmann@gmail.com> writes:
> Eli Zaretskii <eliz@gnu.org> writes:
> We're coming from the problem that MPS uses signals for memory barriers.
> On platforms != macOS. And I am proposing a solution for that.
I don't think that's the problem. The problem is that signals can
interrupt MPS, on all platforms. We can't have MPS-signal-MPS stacks,
ever. The best way to ensure that is to keep signals on one stack, and
MPS on another stack. MacOS already does that for their SIGSEGV
equivalent, but we need to do it for all entry points into MPS.
If they don't have separate stacks, and we interrupt MPS, the signal
handler cannot look at any MPS-modifiable memory (including roots, which
may be in an inconsistent state mid-GC), ever. This includes the
specpdl. We can't write to MPS-known memory, ever. This includes any
area we might want to copy the backtrace or specpdl to.
> The SIGPROF handler does two things: (1) get the current backtrace,
> which does not trip on memory barriers, and
Even if the specpdl were an ambiguous root, we'd be making very
permanent and far-reaching assumptions about how MPS handles such roots
if we assumed that we could even look at such roots during GC. This
goes doubly for assuming that we can extract references to
ambiguously-rooted objects and put them into other areas of MPS-visible
memory. Even if this worked perfectly with current MPS on all
platforms, it would still be unreasonable for us to rely on such
implementation details.
We can't do (1).
>> Doing this from another thread raises the problem I describe above: we
>> need the Lisp thread(s) stopped, because you cannot examine the data
>> of the Lisp machine while the machine is running. And if we stop the
>> Lisp threads, why do we need the other thread at all?
Because MPS can continue and reach an MPS-consistent state only if it
has its own stack. In practice, this means an extra thread.
Or we re-raise signals (scratch/igc right now; this will delay signals
unpredictably), or we block them for the allocation fast path
(significant slowdown on some systems) *and* in the SIGSEGV handler
(which we'd need to "steal" from MPS, calling it from our real signal
handler by extracting sa_sigaction and calling that pointer. Recipe for
disaster).
I'm still convinced the extra thread is the least painful option,
followed by what we have now.
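
For readers following along, the "re-raise" option can be sketched
roughly as below. This is only an illustration of the general
pattern, not the scratch/igc code: `collector_busy`, `pending_signal`,
`enter_collector`, `leave_collector` and `handled` are hypothetical
names, and SIGUSR1 stands in for SIGPROF so no timer is needed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>

/* Set while the (hypothetical) collector may hold its lock.  */
static volatile sig_atomic_t collector_busy;
/* Signal number that arrived while busy, or 0 if none.  */
static volatile sig_atomic_t pending_signal;
/* Counts how often the "real" handler body ran.  */
static volatile sig_atomic_t handled;

static void
on_signal (int sig)
{
  if (collector_busy)
    {
      /* Do nothing that could touch collector state; just remember
	 that the signal happened.  Repeated signals are coalesced.  */
      pending_signal = sig;
      return;
    }
  handled++;			/* stand-in for the real handler work */
}

static int
install_handler (void)
{
  struct sigaction sa;
  sa.sa_handler = on_signal;
  sa.sa_flags = 0;
  sigemptyset (&sa.sa_mask);
  return sigaction (SIGUSR1, &sa, 0);
}

static void
enter_collector (void)
{
  collector_busy = 1;
}

static void
leave_collector (void)
{
  collector_busy = 0;
  int sig = pending_signal;
  if (sig)
    {
      pending_signal = 0;
      raise (sig);		/* deliver it now, outside the collector */
    }
}
```

The delay is visible in the sketch: a signal that arrives just after
`enter_collector` is not acted on until `leave_collector`, however
long that takes.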
Pip
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-23 1:00 ` Óscar Fuentes
@ 2024-12-24 22:34 ` Pip Cet via Emacs development discussions.
2024-12-25 4:25 ` Freezing frame with igc Gerd Möllmann
0 siblings, 1 reply; 91+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-24 22:34 UTC (permalink / raw)
To: Óscar Fuentes
Cc: emacs-devel, Gerd Möllmann, Helmut Eller, Andrea Corallo
Óscar Fuentes <ofv@wanadoo.es> writes:
> Pip Cet <pipcet@protonmail.com> writes:
>
>>> Redisplay just stopped while showing the menu, no crash nor infinite
>>> loop, its CPU usage was typical for the repeating timers that my config
>>> creates.
>>
>> That's a bit odd. It might be the signal issue, but that's purely a
>> guess. If it happens again, please let us know.
>
> Sure.
I'm not a hundred percent sure, because I was testing other changes, but
I just observed an Emacs session in a very similar state to what you
describe: very little but nonzero CPU usage, but unresponsive to X
interactions. I attached gdb, observed it was stuck in read_char, then
I messed up and set Vquit_flag to Qt, at which point the Emacs session
recovered and seems fully usable once more (it did take a while to do
so, though). So no valuable debug info this time, hope I'll hit it
again.
Again, it's possible this is a similar-looking but different bug,
possibly caused by local changes.
I don't think read_char or its subroutines even use MPS memory, though?
As this is a GTK build, and yours wasn't, we should probably look at X
interaction code shared between the GTK and non-GTK builds.
Pip
^ permalink raw reply [flat|nested] 91+ messages in thread
* Freezing frame with igc
2024-12-24 22:34 ` Pip Cet via Emacs development discussions.
@ 2024-12-25 4:25 ` Gerd Möllmann
2024-12-25 11:19 ` Pip Cet via Emacs development discussions.
2024-12-25 11:55 ` Óscar Fuentes
0 siblings, 2 replies; 91+ messages in thread
From: Gerd Möllmann @ 2024-12-25 4:25 UTC (permalink / raw)
To: Pip Cet; +Cc: Óscar Fuentes, emacs-devel, Helmut Eller, Andrea Corallo
(Subject changed.)
Pip Cet <pipcet@protonmail.com> writes:
> Óscar Fuentes <ofv@wanadoo.es> writes:
>
>> Pip Cet <pipcet@protonmail.com> writes:
>>
>>>> Redisplay just stopped while showing the menu, no crash nor infinite
>>>> loop, its CPU usage was typical for the repeating timers that my config
>>>> creates.
>>>
>>> That's a bit odd. It might be the signal issue, but that's purely a
>>> guess. If it happens again, please let us know.
>>
>> Sure.
>
> I'm not a hundred percent sure, because I was testing other changes, but
> I just observed an Emacs session in a very similar state to what you
> describe: very little but nonzero CPU usage, but unresponsive to X
> interactions. I attached gdb, observed it was stuck in read_char, then
> I messed up and set Vquit_flag to Qt, at which point the Emacs session
> recovered and seems fully usable once more (it did take a while to do
> so, though). So no valuable debug info this time, hope I'll hit it
> again.
>
> Again, it's possible this is a similar-looking but different bug,
> possibly caused by local changes.
>
> I don't think read_char or its subroutines even use MPS memory, though?
> As this is a GTK build, and yours wasn't, we should probably look at X
> interaction code shared between the GTK and non-GTK builds.
>
> Pip
That reminds me of something. Maybe what you've seen is completely
unrelated, it's impossible to tell, but please find below a comment that
I added to do_switch_frame in frame.c.
/* We want to make sure that the next event generates a frame-switch
event to the appropriate frame. This seems kludgy to me, but
before you take it out, make sure that evaluating something like
(select-window (frame-root-window (make-frame))) doesn't end up
with your typing being interpreted in the new frame instead of
the one you're actually typing in. */
/* FIXME/tty: I don't understand this. (The comment above is from
Jim Blandy 1993 BTW, and the frame_ancestor_p from 2017.)
Setting the last event frame to nil leads to switch-frame events
being generated even if they normally wouldn't be because the frame
in question equals selected-frame. See the places in keyboard.c
where make_lispy_switch_frame is called.
This leads to problems at least on ttys.
Imagine that we have functions in post-command-hook that use
select-frame in some way (e.g., with-selected-window). Let these
functions select different frames during the execution of
post-command-hook in command_loop_1. Setting
internal_last_event_frame to nil here makes these select-frame
calls (potentially and in reality) generate switch-frame events.
(But only in one direction (frame_ancestor_p), which I also don't
understand).
These switch-frame events form an endless loop in
command_loop_1. It runs post-command-hook, which generates
switch-frame events, which command_loop_1 finds (bound to #'ignore)
and executes, which again runs post-command-hook etc., ad
infinitum.
Let's not do that for now on ttys. */
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-24 14:40 ` Eli Zaretskii
@ 2024-12-25 4:56 ` Gerd Möllmann
2024-12-25 12:19 ` Eli Zaretskii
0 siblings, 1 reply; 91+ messages in thread
From: Gerd Möllmann @ 2024-12-25 4:56 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: pipcet, ofv, emacs-devel, eller.helmut, acorallo
Eli Zaretskii <eliz@gnu.org> writes:
>> From: Gerd Möllmann <gerd.moellmann@gmail.com>
>> Cc: pipcet@protonmail.com, ofv@wanadoo.es, emacs-devel@gnu.org,
>> eller.helmut@gmail.com, acorallo@gnu.org
>> Date: Tue, 24 Dec 2024 15:12:40 +0100
>>
>> Eli Zaretskii <eliz@gnu.org> writes:
>>
>> > Now, with that definition, isn't specpdl stack part of "Lisp data"?
>> > If so, and if we can safely access it from a signal handler, why do we
>> > need to move it aside at all? And how would the "message handler" be
>> > different in that aspect from a signal handler?
>>
>> We're coming from the problem that MPS uses signals for memory barriers.
>> On platforms != macOS. And I am proposing a solution for that.
>>
>> The SIGPROF handler does two things: (1) get the current backtrace,
>> which does not trip on memory barriers, and (2) build a summary, i.e.
>> count same backtraces using a hash table. (2) trips on memory barriers.
>
> Can you elaborate on (2) and why it trips? I guess I'm missing
> something because I don't understand which code in record_backtrace
> does trip on memory barriers and why.
Ok, (2) begins as shown below.
static void
record_backtrace (struct profiler_log *plog, EMACS_INT count)
{
  log_t *log = plog->log;
  get_backtrace (log->trace, log->depth);
--- (2) begins after this line -------------------------------
  EMACS_UINT hash = trace_hash (log->trace, log->depth);
SIGPROF can have interrupted Emacs at any point, both the MPS thread
and all others. MPS may have been doing arbitrary stuff when
interrupted, and Emacs threads too. Memory barriers may be on
unpredictable segments of memory, as they usually are, as part of MPS'
GC implementation. Do you agree with this picture?
Elsewhere I tried to explain why I think this works up to the line
marked (2) above. Now enter trace_hash. Current implementation:
static EMACS_UINT
trace_hash (Lisp_Object *trace, int depth)
{
  EMACS_UINT hash = 0;
  for (int i = 0; i < depth; i++)
    {
      Lisp_Object f = trace[i];
      EMACS_UINT hash1;
#ifdef HAVE_MPS
      hash1 = (CLOSUREP (f) ? igc_hash (AREF (f, CLOSURE_CODE)) : igc_hash (f));
               ^^^^^^^^       ^^^^^^^^  ^^^^
The constructs I marked with ^^^ all access the memory of F. F is a
vectorlike, its memory is managed by MPS in an MPS pool that uses
memory barriers, so the memory of F can currently be behind a barrier.
It doesn't have to, but it can.
When we access F's memory and it is behind a barrier, the result is a
nested SIGSEGV while handling SIGPROF.
More code accessing memory that is potentially behind a barrier follows
in record_backtrace.
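
To illustrate what "behind a barrier" means mechanically, here is a
small stand-alone sketch, assuming Linux and a page protected with
mprotect. It is not how MPS is implemented in detail; `page`,
`barrier_hits`, `on_segv` and `barrier_demo` are hypothetical names.
A read of the protected page raises SIGSEGV, the handler lifts the
protection, and the faulting load is retried.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

static char *page;
static size_t page_size;
static volatile sig_atomic_t barrier_hits;

/* Barrier handler.  mprotect is not on the POSIX list of
   async-signal-safe functions; calling it here is an assumption that
   holds in practice on Linux, where it is a plain system call.  */
static void
on_segv (int sig, siginfo_t *info, void *context)
{
  (void) sig;
  (void) context;
  char *addr = info->si_addr;
  if (addr < page || addr >= page + page_size)
    _exit (1);			/* a genuine crash, not our barrier */
  barrier_hits++;
  mprotect (page, page_size, PROT_READ | PROT_WRITE);
}

/* Returns the number of barrier hits caused by one read, or -1 on
   failure.  */
static int
barrier_demo (void)
{
  page_size = (size_t) sysconf (_SC_PAGESIZE);
  page = mmap (NULL, page_size, PROT_READ | PROT_WRITE,
	       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (page == MAP_FAILED)
    return -1;
  page[0] = 42;

  struct sigaction sa = { 0 };
  sa.sa_sigaction = on_segv;
  sa.sa_flags = SA_SIGINFO;
  sigemptyset (&sa.sa_mask);
  if (sigaction (SIGSEGV, &sa, NULL) != 0)
    return -1;

  /* Raise the barrier, then touch the page.  The load faults once,
     the handler unprotects the page, and the load is retried.  */
  if (mprotect (page, page_size, PROT_NONE) != 0)
    return -1;
  char value = *(volatile char *) page;
  return value == 42 ? (int) barrier_hits : -1;
}
```

In the sketch the fault happens in ordinary code. The case discussed
above is the same fault occurring inside the SIGPROF handler, possibly
while the collector was itself interrupted mid-operation.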
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-24 21:18 ` Pip Cet via Emacs development discussions.
@ 2024-12-25 5:23 ` Gerd Möllmann
2024-12-25 10:48 ` Pip Cet via Emacs development discussions.
` (2 more replies)
0 siblings, 3 replies; 91+ messages in thread
From: Gerd Möllmann @ 2024-12-25 5:23 UTC (permalink / raw)
To: Pip Cet; +Cc: Eli Zaretskii, ofv, emacs-devel, eller.helmut, acorallo
Pip Cet <pipcet@protonmail.com> writes:
> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
>
>> Eli Zaretskii <eliz@gnu.org> writes:
>> We're coming from the problem that MPS uses signals for memory barriers.
>> On platforms != macOS. And I am proposing a solution for that.
>
> I don't think that's the problem. The problem is that signals can
> interrupt MPS, on all platforms. We can't have MPS-signal-MPS stacks,
> ever. The best way to ensure that is to keep signals on one stack, and
> MPS on another stack. MacOS already does that for their SIGSEGV
> equivalent, but we need to do it for all entry points into MPS.
>
> If they don't have separate stacks, and we interrupt MPS, the signal
> handler cannot look at any MPS-modifiable memory (including roots, which
> may be in an inconsistent state mid-GC), ever. This includes the
> specpdl. We can't write to MPS-known memory, ever. This includes any
> area we might want to copy the backtrace or specpdl to.
And I don't think that's right :-). It's completely right that in the
SIGPROF handler everything can be inconsistent. That's true both for MPS
and Emacs. For example, the bindings stack (specpdl) may be inconsistent
when SIGPROF arrives. Literally everything we do in the SIGPROF handler
runs the risk of encountering inconsistencies.
I think that's already true for the old GC. There is nothing
guaranteeing that the contents of the binding stack are consistent, for
example. But we get away with it well enough that the profiler is
useful.
With MPS, from my POV, the situation is pretty similar. Try to get away
with it by not triggering MPS while in a state that we must assume is
inconsistent.
>> The SIGPROF handler does two things: (1) get the current backtrace,
>> which does not trip on memory barriers, and
>
> Even if the specpdl were an ambiguous root, we'd be making very
> permanent and far-reaching assumptions about how MPS handles such roots
> if we assumed that we could even look at such roots during GC. This
> goes doubly for assuming that we can extract references to
> ambiguously-rooted objects and put them into other areas of MPS-visible
> memory. Even if this worked perfectly with current MPS on all
> platforms, it would still be unreasonable for us to rely on such
> implementation details.
>
> We can't do (1).
I disagree, obviously :-)
For me, it's not about a theoretical or even practical solution that
somehow ensures a consistent state in MPS, or some future changes in MPS
or something. It's about getting away with what we do in the profiler
_now_, as we do with the old GC, which is already seeing potentially
inconsistent state in Emacs' memory.
I think the _now_ is also important. From my POV, we could discuss
better solutions later.
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: SIGPROF + SIGCHLD and igc
2024-12-24 8:39 ` Gerd Möllmann
@ 2024-12-25 9:22 ` Helmut Eller
2024-12-25 9:43 ` Gerd Möllmann
0 siblings, 1 reply; 91+ messages in thread
From: Helmut Eller @ 2024-12-25 9:22 UTC (permalink / raw)
To: Gerd Möllmann; +Cc: Eli Zaretskii, pipcet, ofv, emacs-devel, acorallo
On Tue, Dec 24 2024, Gerd Möllmann wrote:
[...]
>>> - Lisp allocation in signal handlers cannot exist because alloc.c is not
>>> reentrant which means we would crash with the old GC. We don't need
>>> anything extra for that in igc.
[...]
> BTW, do you agree with my analysis that Lisp allocations can't possibly
> exist in signal handlers today?
I don't know alloc.c well enough to make a judgment. This comment for
XMALLOC_BLOCK_INPUT_CHECK seems to say that signal handlers used to
allocate but no longer do:
If compiled with XMALLOC_BLOCK_INPUT_CHECK, define a symbol
BLOCK_INPUT_IN_MEMORY_ALLOCATORS that is visible to the debugger.
If that variable is set, block input while in one of Emacs's memory
allocation functions. There should be no need for this debugging
option, since signal handlers do not allocate memory, but Emacs
formerly allocated memory in signal handlers and this compile-time
option remains as a way to help debug the issue should it rear its
ugly head again. */
Helmut
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: SIGPROF + SIGCHLD and igc
2024-12-25 9:22 ` Helmut Eller
@ 2024-12-25 9:43 ` Gerd Möllmann
0 siblings, 0 replies; 91+ messages in thread
From: Gerd Möllmann @ 2024-12-25 9:43 UTC (permalink / raw)
To: Helmut Eller; +Cc: Eli Zaretskii, pipcet, ofv, emacs-devel, acorallo
Helmut Eller <eller.helmut@gmail.com> writes:
> On Tue, Dec 24 2024, Gerd Möllmann wrote:
>
> [...]
>>>> - Lisp allocation in signal handlers cannot exist because alloc.c is not
>>>> reentrant which means we would crash with the old GC. We don't need
>>>> anything extra for that in igc.
> [...]
>> BTW, do you agree with my analysis that Lisp allocations can't possibly
>> exist in signal handlers today?
>
> I don't know alloc.c well enough to make a judgment. This comment for
> XMALLOC_BLOCK_INPUT_CHECK seems to say that signal handlers used to
> allocate but no longer do:
>
> If compiled with XMALLOC_BLOCK_INPUT_CHECK, define a symbol
> BLOCK_INPUT_IN_MEMORY_ALLOCATORS that is visible to the debugger.
> If that variable is set, block input while in one of Emacs's memory
> allocation functions. There should be no need for this debugging
> option, since signal handlers do not allocate memory, but Emacs
> formerly allocated memory in signal handlers and this compile-time
> option remains as a way to help debug the issue should it rear its
> ugly head again. */
>
> Helmut
Thanks.
Stefan Monnier seems to have added that in 2012, judging from git grep in
the ChangeLogs, and it reads as if it has something to do with
SYNC_INPUT, which I think means no longer doing X event handling in a
SIGIO handler.
And it seems to be no longer in use. XMALLOC_BLOCK_INPUT_CHECK appears
nowhere, and MALLOC_BLOCK_INPUT is always a no-op. alloc.c could use
some love.
Anyway, just looking at Fcons wrt async-signal-safety makes me pretty
sure.
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: SIGPROF + SIGCHLD and igc
2024-12-24 13:05 ` Eli Zaretskii
@ 2024-12-25 10:46 ` Helmut Eller
2024-12-25 12:45 ` Eli Zaretskii
0 siblings, 1 reply; 91+ messages in thread
From: Helmut Eller @ 2024-12-25 10:46 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: gerd.moellmann, pipcet, ofv, emacs-devel, acorallo
On Tue, Dec 24 2024, Eli Zaretskii wrote:
>> From: Helmut Eller <eller.helmut@gmail.com>
>> I think, SIGIO might cause trouble.
>
> Why do you think so? Its handler does almost nothing, just sets a
> flag (if you ignore the Android-specific stuff there).
Indeed that looks quite tame. Makes me wonder why
handle_interrupt_signal needs to be so complicated in comparison. E.g.
the line
internal_last_event_frame = terminal->display_info.tty->top_frame;
looks problematic for MPS.
Helmut
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-25 5:23 ` Gerd Möllmann
@ 2024-12-25 10:48 ` Pip Cet via Emacs development discussions.
2024-12-25 11:48 ` Helmut Eller
2024-12-25 12:31 ` Eli Zaretskii
2 siblings, 0 replies; 91+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-25 10:48 UTC (permalink / raw)
To: Gerd Möllmann
Cc: Eli Zaretskii, ofv, emacs-devel, eller.helmut, acorallo
Gerd Möllmann <gerd.moellmann@gmail.com> writes:
> Pip Cet <pipcet@protonmail.com> writes:
>
>> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
>>
>>> Eli Zaretskii <eliz@gnu.org> writes:
>>> We're coming from the problem that MPS uses signals for memory barriers.
>>> On platforms != macOS. And I am proposing a solution for that.
>>
>> I don't think that's the problem. The problem is that signals can
>> interrupt MPS, on all platforms. We can't have MPS-signal-MPS stacks,
>> ever. The best way to ensure that is to keep signals on one stack, and
>> MPS on another stack. MacOS already does that for their SIGSEGV
>> equivalent, but we need to do it for all entry points into MPS.
>>
>> If they don't have separate stacks, and we interrupt MPS, the signal
>> handler cannot look at any MPS-modifiable memory (including roots, which
>> may be in an inconsistent state mid-GC), ever. This includes the
>> specpdl. We can't write to MPS-known memory, ever. This includes any
>> area we might want to copy the backtrace or specpdl to.
>
> And I don't think that's right :-). It's completely right that in the
> SIGPROF handler everything can be inconsistent. That's true both for MPS
> and Emacs. For example, the bindings stack (specpdl) may be inconsistent
> when SIGPROF arrives. Literally everything we do in the SIGPROF runs the
> risk of encountering inconsistencies.
This is getting into technical details, but I think the specpdl code
was, at one point, carefully written so the specpdl stack would always
look consistent, making some assumptions about the compiler in use.
Then compilers changed to make this impossible (automatic inlining,
LTO), then they changed to make this possible again (stdatomic.h, memory
ordering), and we also introduced an unfortunate feature which breaks
consistency. Now, we can (and should) restore the consistency
assumption, at least if we drop that unfortunate feature (as we should).
Inconsistency of the specpdl stack is avoidable, because we control both
the mutator and the inspection code. Inconsistency of MPS data is not,
unless we take over control of the entire MPS library so we can make
far-reaching assumptions there.
> I think that's already true for the old GC. There is nothing
> guaranteeing that the contents of the binding stack is consistent, for
> example. But we get away with it well enough that the profiler is
> useful.
My understanding is that was true at one point, before C caught up to
memory ordering between a thread and its signal handlers, but with C11,
we have everything we need to ensure consistency, at least on systems
that store words atomically (we don't use memcpy for modifying the
specpdl stack).
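
A rough sketch of that C11 idea, assuming a toy binding stack rather
than the real specpdl: `struct binding`, `push_binding`, `pop_binding`
and `snapshot` are hypothetical names. The mutator writes an entry
completely, issues `atomic_signal_fence`, and only then makes the
entry visible by bumping the depth, so a handler running on the same
thread never sees a half-written entry below the depth it reads.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

enum { STACK_MAX = 128 };

struct binding
{
  const char *symbol;
  long old_value;
};

static struct binding bindings[STACK_MAX];
static atomic_size_t depth;	/* number of complete entries */

static void
push_binding (const char *symbol, long old_value)
{
  size_t d = atomic_load_explicit (&depth, memory_order_relaxed);
  assert (d < STACK_MAX);
  bindings[d].symbol = symbol;
  bindings[d].old_value = old_value;
  /* Keep the compiler from moving the entry writes past the depth
     update; a signal fence costs no hardware barrier.  */
  atomic_signal_fence (memory_order_release);
  atomic_store_explicit (&depth, d + 1, memory_order_relaxed);
}

static void
pop_binding (void)
{
  size_t d = atomic_load_explicit (&depth, memory_order_relaxed);
  assert (d > 0);
  /* Shrink first, so the slot is invisible before it can be reused.  */
  atomic_store_explicit (&depth, d - 1, memory_order_relaxed);
  atomic_signal_fence (memory_order_release);
}

/* What a signal handler on the same thread would do: copy the
   entries that were complete when it read the depth.  */
static size_t
snapshot (struct binding *out, size_t max)
{
  size_t d = atomic_load_explicit (&depth, memory_order_relaxed);
  atomic_signal_fence (memory_order_acquire);
  size_t n = d < max ? d : max;
  for (size_t i = 0; i < n; i++)
    out[i] = bindings[i];
  return n;
}
```

This covers only the ordering between one thread and its own signal
handlers. It says nothing about whether the entries themselves refer
to memory that the handler may safely follow.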
And about the usefulness thing: I really want SIGPROF specifically to
improve MPS performance, which means we need to do something in the
"we've interrupted MPS" situation. Or at least I want the option,
rather than make "signal handlers can't do anything useful if
igc_busy_p()" an axiom. And if we start declaring huge chunks of Emacs
data (the entire specpdl, a large area of storage for storing the
"backtrace", all thread stacks, why not the pdumper area while we're at
it?) as ambiguous roots, we risk ending up with AMS pools everywhere and
no copying.
> With MPS, from my POV, the situation is pretty similar. Try to get away
> with it by not triggering MPS while in a state that we must assume is
> inconsistent.
That's one approach. The other approach is to keep arguing about this
until we get a SIGPROF that we're actually happy with, and then we can
tell people interested in other signal handlers to copy that code :-)
> For me, it's not about a theoretical or even practical solution that
> somehow ensures a consistent state in MPS, or some future changes in MPS
> or something. It's about getting away with what we do in the profiler
> _now_, as we do with the old GC. which is already seeing potentially
> inconsistent state in Emacs' memory.
>
> I think the _now_ is also important. From my POV, we could discuss
> better solutions later.
I misunderstood.
We've got a solution that I'm convinced we can get away with, on
scratch/igc, now. It's not pretty or permanent, and I don't think it's
"good enough"; but I don't think splitting SIGPROF handlers improves it
enough to make it "good enough", either.
Pip
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Freezing frame with igc
2024-12-25 4:25 ` Freezing frame with igc Gerd Möllmann
@ 2024-12-25 11:19 ` Pip Cet via Emacs development discussions.
2024-12-25 11:55 ` Óscar Fuentes
1 sibling, 0 replies; 91+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-25 11:19 UTC (permalink / raw)
To: Gerd Möllmann
Cc: Óscar Fuentes, emacs-devel, Helmut Eller, Andrea Corallo
Gerd Möllmann <gerd.moellmann@gmail.com> writes:
> That reminds me of something. Maybe what you've seen is completely
> unrelated, it's impossible to tell, but please find below a comment that
> I added to do_switch_frame in frame.c.
Thanks! I hope I'll be able to reproduce the issue again.
> These switch-frame events form an endless loop in
> command_loop_1. It runs post-command-hook, which generates
> switch-frame events, which command_loop_1 finds (bound to #'ignore)
> and executes, which again runs post-command-hook etc., ad
> infinitum.
The strange thing is that there is apparently no infinite loop: Emacs
doesn't use 100% CPU, it seems to be calling *select() as usual.
Definitely something to check for, though!
Pip
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Some experience with the igc branch
2024-12-25 5:23 ` Gerd Möllmann
2024-12-25 10:48 ` Pip Cet via Emacs development discussions.
@ 2024-12-25 11:48 ` Helmut Eller
2024-12-25 11:58 ` Gerd Möllmann
2024-12-25 12:52 ` Eli Zaretskii
2024-12-25 12:31 ` Eli Zaretskii
2 siblings, 2 replies; 91+ messages in thread
From: Helmut Eller @ 2024-12-25 11:48 UTC (permalink / raw)
To: Gerd Möllmann; +Cc: Pip Cet, Eli Zaretskii, ofv, emacs-devel, acorallo
On Wed, Dec 25 2024, Gerd Möllmann wrote:
> Pip Cet <pipcet@protonmail.com> writes:
>
>> I don't think that's the problem. The problem is that signals can
>> interrupt MPS, on all platforms.
[...]
> And I don't think that's right :-). It's completely right that in the
> SIGPROF handler everything can be inconsistent. That's true both for MPS
> and Emacs. For example, the bindings stack (specpdl) may be inconsistent
> when SIGPROF arrives. Literally everything we do in the SIGPROF runs the
> risk of encountering inconsistencies.
The SIGPROF handler copies part of the potentially inconsistent state to
the profiler log. That same potentially inconsistent profiler log is
used later, outside the signal handler. Sounds like a problem to me.
Is it not? Or is the probability of inconsistencies being copied so low
that we ignore it?
Helmut
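As a rough illustration of the "log filled by the handler, read later"
pattern in question, here is a generic POSIX sketch; it is not Emacs's
profiler code. The names sample_log, on_sample, install_sampler, and
drain_samples are invented for the example. It shows one common
precaution, blocking the signal while the log is read outside the
handler, and says nothing about whether the sampled state itself was
consistent when the signal arrived.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose sigaction and sigprocmask */
#endif
#include <assert.h>
#include <signal.h>
#include <string.h>

enum { LOG_SLOTS = 8 };

/* Hypothetical, preallocated log: the handler never allocates.  */
static volatile sig_atomic_t sample_log[LOG_SLOTS];
static volatile sig_atomic_t sample_count;

/* Stand-in for "program state" that the handler samples.  */
static volatile sig_atomic_t current_state;

/* Handler: copy the current state into the next free log slot.  */
static void
on_sample (int sig)
{
  (void) sig;
  if (sample_count < LOG_SLOTS)
    {
      sample_log[sample_count] = current_state;
      sample_count = sample_count + 1;
    }
}

/* Install the sampling handler for SIGNO; returns 0 on success.  */
static int
install_sampler (int signo)
{
  struct sigaction sa;
  memset (&sa, 0, sizeof sa);
  sa.sa_handler = on_sample;
  sigemptyset (&sa.sa_mask);
  return sigaction (signo, &sa, NULL);
}

/* Read the log outside the handler.  SIGNO is blocked during the
   copy, so the handler cannot modify the log halfway through.
   Returns the number of samples copied, or -1 on error.  */
static int
drain_samples (int signo, int *out, int max)
{
  sigset_t block, old;
  sigemptyset (&block);
  sigaddset (&block, signo);
  if (sigprocmask (SIG_BLOCK, &block, &old) != 0)
    return -1;

  int n = 0;
  for (; n < sample_count && n < max; n++)
    out[n] = sample_log[n];
  sample_count = 0;

  sigprocmask (SIG_SETMASK, &old, NULL);
  return n;
}
```

The sketch assumes a single-threaded program; with several threads,
pthread_sigmask would be the call to use instead of sigprocmask.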
* Re: Freezing frame with igc
2024-12-25 4:25 ` Freezing frame with igc Gerd Möllmann
2024-12-25 11:19 ` Pip Cet via Emacs development discussions.
@ 2024-12-25 11:55 ` Óscar Fuentes
1 sibling, 0 replies; 91+ messages in thread
From: Óscar Fuentes @ 2024-12-25 11:55 UTC (permalink / raw)
To: Gerd Möllmann; +Cc: Pip Cet, emacs-devel
(CC list trimmed)
Gerd Möllmann <gerd.moellmann@gmail.com> writes:
>>>>> Redisplay just stopped while showing the menu, no crash nor infinite
>>>>> loop, its CPU usage was typical for the repeating timers that my config
>>>>> creates.
>>>>
>>>> That's a bit odd. It might be the signal issue, but that's purely a
>>>> guess. If it happens again, please let us know.
>>>
>>> Sure.
>>
>> I'm not a hundred percent sure, because I was testing other changes, but
>> I just observed an Emacs session in a very similar state to what you
>> describe: very little but nonzero CPU usage, but unresponsive to X
>> interactions. I attached gdb, observed it was stuck in read_char, then
>> I messed up and set Vquit_flag to Qt, at which point the Emacs session
>> recovered and seems fully usable once more (it did take a while to do
>> so, though). So no valuable debug info this time, hope I'll hit it
>> again.
>>
>> Again, it's possible this is a similar-looking but different bug,
>> possibly caused by local changes.
>>
>> I don't think read_char or its subroutines even use MPS memory, though?
>> As this is a GTK build, and yours wasn't, we should probably look at X
>> interaction code shared between the GTK and non-GTK builds.
>>
>> Pip
>
> That reminds me of something. Maybe what you've seen is completely
> unrelated, it's impossible to tell, but please find below a comment that
> I added to do_switch_frame in frame.c.
[snip]
At the time I was working with two frames. The frame where I tried to
show the menu went blank.
I use mini-echo [1], which removes the mode line and uses the echo area
instead, periodically (0.3 seconds) updating the echo area (it has a
cache for not updating when there are no changes, but my experience is
that the updates are very frequent.)
So it could be that mini-echo tried to update the echo area (of both
frames, because it shows the same text on all frames) at an "unfortunate
time" while the menu was being displayed. I don't know if this
hypothesis even makes sense.
BTW, I'm fairly sure that mini-echo was responsible for the small CPU
activity I saw in htop after the Emacs UI froze.
1. https://github.com/liuyinz/mini-echo.el
* Re: Some experience with the igc branch
2024-12-25 11:48 ` Helmut Eller
@ 2024-12-25 11:58 ` Gerd Möllmann
2024-12-25 12:52 ` Eli Zaretskii
1 sibling, 0 replies; 91+ messages in thread
From: Gerd Möllmann @ 2024-12-25 11:58 UTC (permalink / raw)
To: Helmut Eller; +Cc: Pip Cet, Eli Zaretskii, ofv, emacs-devel, acorallo
Helmut Eller <eller.helmut@gmail.com> writes:
> On Wed, Dec 25 2024, Gerd Möllmann wrote:
>
>> Pip Cet <pipcet@protonmail.com> writes:
>>
>>> I don't think that's the problem. The problem is that signals can
>>> interrupt MPS, on all platforms.
> [...]
>> And I don't think that's right :-). It's completely right that in the
>> SIGPROF handler everything can be inconsistent. That's true both for MPS
>> and Emacs. For example, the bindings stack (specpdl) may be inconsistent
>> when SIGPROF arrives. Literally everything we do in the SIGPROF runs the
>> risk of encountering inconsistencies.
>
> The SIGPROF handler copies part of the potentially inconsistent state to
> the profiler log. That same potentially inconsistent profiler log is
> used later, outside the signal handler. Sounds like a problem to me.
> Is it not? Or is the probability of inconsistencies being copied so low
> that we ignore it?
>
> Helmut
I think the latter, i.e. we ignore it. I think, but I can't prove
anything, that the probability is good that we get away with it. For
example, we're only using the backtrace_p binding stack entries,
so the SIGPROF would have to arrive while some code is in the middle of
putting such an entry on the stack for us to be in danger, so to speak.
That's not so likely, I think.
* Re: Some experience with the igc branch
2024-12-25 4:56 ` Gerd Möllmann
@ 2024-12-25 12:19 ` Eli Zaretskii
2024-12-25 12:50 ` Gerd Möllmann
0 siblings, 1 reply; 91+ messages in thread
From: Eli Zaretskii @ 2024-12-25 12:19 UTC (permalink / raw)
To: Gerd Möllmann; +Cc: pipcet, ofv, emacs-devel, eller.helmut, acorallo
> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: pipcet@protonmail.com, ofv@wanadoo.es, emacs-devel@gnu.org,
> eller.helmut@gmail.com, acorallo@gnu.org
> Date: Wed, 25 Dec 2024 05:56:26 +0100
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> >> The SIGPROF handler does two things: (1) get the current backtrace,
> >> which does not trip on memory barriers, and (2) build a summary, i.e.
> >> count same backtraces using a hash table. (2) trips on memory barriers.
> >
> > Can you elaborate on (2) and why it trips? I guess I'm missing
> > something because I don't understand which code in record_backtrace
> > does trip on memory barriers and why.
>
> Ok, (2) begins as shown below.
>
> static void
> record_backtrace (struct profiler_log *plog, EMACS_INT count)
> {
> log_t *log = plog->log;
> get_backtrace (log->trace, log->depth);
> --- (2) begins after this line -------------------------------
> EMACS_UINT hash = trace_hash (log->trace, log->depth);
>
> The SIGPROF can have interrupted Emacs at any point, both the MPS thread
> and all others. MPS may have been doing arbitrary stuff when
> interrupted, and Emacs threads too. Memory barriers may be on
> unpredictable segments of memory, as they usually are, as part of MPS'
> GC implementation. Do you agree with this picture?
>
> Elsewhere I tried to explain why I think this works up to the line
> marked (2) above. Now enter trace_hash. Current implementation:
>
> static EMACS_UINT
> trace_hash (Lisp_Object *trace, int depth)
> {
> EMACS_UINT hash = 0;
> for (int i = 0; i < depth; i++)
> {
> Lisp_Object f = trace[i];
> EMACS_UINT hash1;
> #ifdef HAVE_MPS
> hash1 = (CLOSUREP (f) ? igc_hash (AREF (f, CLOSURE_CODE)) : igc_hash (f));
> ^^^^^^^^ ^^^^^^^^ ^^^^
>
> The constructs I marked with ^^^ all access the memory of F. F is a
> vectorlike, its memory is managed by MPS in an MPS pool that uses
> memory barriers, so the memory of F can currently be behind a barrier.
> It doesn't have to, but it can.
>
> When we access F's memory and it is behind a barrier, the result is a
> nested SIGSEGV while handling SIGPROF.
Two followup questions:
. how is accessing F different from accessing the specpdl stack?
. how does this work with the current GC, where F could have been
collected and its memory freed?
The first question is more important, from where I stand. Looking
forward beyond the point where we land igc on master, I wonder how
we will be able to tell, for a random non-trivial change on the C level,
whether what it does can cause trouble with MPS? That is, how can a
mere mortal determine whether a given data structure in igc Emacs can
or cannot be safely touched when MPS happens to do its thing, whether
synchronously or asynchronously? We must have some reasonably
practical way of telling this, or else we will be breaking Emacs high
and low.
> More code accessing memory that is potentially behind a barrier follows
> in record_backtrace.
Which code is that? (It's a serious question: I tried to identify
that code, but couldn't. I'm probably missing something.)
* Re: Some experience with the igc branch
2024-12-25 5:23 ` Gerd Möllmann
2024-12-25 10:48 ` Pip Cet via Emacs development discussions.
2024-12-25 11:48 ` Helmut Eller
@ 2024-12-25 12:31 ` Eli Zaretskii
2024-12-25 12:54 ` Gerd Möllmann
2 siblings, 1 reply; 91+ messages in thread
From: Eli Zaretskii @ 2024-12-25 12:31 UTC (permalink / raw)
To: Gerd Möllmann; +Cc: pipcet, ofv, emacs-devel, eller.helmut, acorallo
> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: Eli Zaretskii <eliz@gnu.org>, ofv@wanadoo.es, emacs-devel@gnu.org,
> eller.helmut@gmail.com, acorallo@gnu.org
> Date: Wed, 25 Dec 2024 06:23:49 +0100
>
> > If they don't have separate stacks, and we interrupt MPS, the signal
> > handler cannot look at any MPS-modifiable memory (including roots, which
> > may be in an inconsistent state mid-GC), ever. This includes the
> > specpdl. We can't write to MPS-known memory, ever. This includes any
> > area we might want to copy the backtrace or specpdl to.
>
> And I don't think that's right :-). It's completely right that in the
> SIGPROF handler everything can be inconsistent. That's true both for MPS
> and Emacs. For example, the bindings stack (specpdl) may be inconsistent
> when SIGPROF arrives.
Theoretically, maybe. But in practice, you'd need to identify the
code which manipulates specpdl that could have specpdl in inconsistent
state if interrupted at some opportune point. Can you identify such
places in the code?
> Literally everything we do in the SIGPROF runs the
> risk of encountering inconsistencies.
Only if we interrupt code which leaves the global state inconsistent,
and if what the SIGPROF handler does involves accessing those
potentially-inconsistent data.
> I think that's already true for the old GC. There is nothing
> guaranteeing that the contents of the binding stack is consistent, for
> example. But we get away with it well enough that the profiler is
> useful.
With the old GC, we have special code to deal with this:
  /* Signal handler for sampling profiler. */
  static void
  add_sample (struct profiler_log *plog, EMACS_INT count)
  {
    if (EQ (backtrace_top_function (), QAutomatic_GC)) /* bug#60237 */
      /* Special case the time-count inside GC because the hash-table
         code is not prepared to be used while the GC is running.
         More specifically it uses ASIZE at many places where it does
         not expect the ARRAY_MARK_FLAG to be set. We could try and
         harden the hash-table code, but it doesn't seem worth the
         effort. */
      plog->gc_count = saturated_add (plog->gc_count, count);
So all we need is for backtrace_top_function to be safe when SIGPROF
arrives while we are in GC. Are you saying backtrace_top_function is
unsafe in that case?
> With MPS, from my POV, the situation is pretty similar. Try to get away
> with it by not triggering MPS while in a state that we must assume is
> inconsistent.
The difference with MPS is that the old GC is synchronous with the
Lisp machine, so it couldn't possibly start while we are modifying
specpdl. That is no longer true with MPS, AFAIU, because MPS could
start GC asynchronously.
> >> The SIGPROF handler does two things: (1) get the current backtrace,
> >> which does not trip on memory barriers, and
> >
> > Even if the specpdl were an ambiguous root, we'd be making very
> > permanent and far-reaching assumptions about how MPS handles such roots
> > if we assumed that we could even look at such roots during GC. This
> > goes doubly for assuming that we can extract references to
> > ambiguously-rooted objects and put them into other areas of MPS-visible
> > memory. Even if this worked perfectly with current MPS on all
> > platforms, it would still be unreasonable for us to rely on such
> > implementation details.
> >
> > We can't do (1).
>
> I disagree, obviously :-)
>
> For me, it's not about a theoretical or even practical solution that
> somehow ensures a consistent state in MPS, or some future changes in MPS
> or something. It's about getting away with what we do in the profiler
> _now_, as we do with the old GC, which is already seeing potentially
> inconsistent state in Emacs' memory.
See above: there's a difference. So I would really like to hear why
you think accessing specpdl from a SIGPROF handler in an igc build is
safe.
> I think the _now_ is also important. From my POV, we could discuss
> better solutions later.
If you are right in your conclusions, certainly.
* Re: SIGPROF + SIGCHLD and igc
2024-12-25 10:46 ` Helmut Eller
@ 2024-12-25 12:45 ` Eli Zaretskii
0 siblings, 0 replies; 91+ messages in thread
From: Eli Zaretskii @ 2024-12-25 12:45 UTC (permalink / raw)
To: Helmut Eller; +Cc: gerd.moellmann, pipcet, ofv, emacs-devel, acorallo
> From: Helmut Eller <eller.helmut@gmail.com>
> Cc: gerd.moellmann@gmail.com, pipcet@protonmail.com, ofv@wanadoo.es,
> emacs-devel@gnu.org, acorallo@gnu.org
> Date: Wed, 25 Dec 2024 11:46:38 +0100
>
> On Tue, Dec 24 2024, Eli Zaretskii wrote:
>
> >> From: Helmut Eller <eller.helmut@gmail.com>
> >> I think, SIGIO might cause trouble.
> >
> > Why do you think so? Its handler does almost nothing, just sets a
> > flag (if you ignore the Android-specific stuff there).
>
> Indeed that looks quite tame. Makes me wonder why
> handle_interrupt_signal needs to be so complicated in comparison. E.g.
> the line
>
> internal_last_event_frame = terminal->display_info.tty->top_frame;
>
> looks problematic for MPS.
SIGINT is different because C-g is programmed to trigger SIGINT on TTY
frames.
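
For illustration, the "just sets a flag" style of handler mentioned
above can be sketched as follows. This is a generic POSIX example and
not the actual Emacs SIGIO handler; the names pending_input, on_sigio,
install_flag_handler, and poll_pending_input are made up for the
sketch. The point is only that such a handler touches a single
sig_atomic_t and no heap or Lisp object memory.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose sigaction */
#endif
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Hypothetical flag; the only state the handler touches.  */
static volatile sig_atomic_t pending_input;

/* Handler: no heap access, no object memory, just a flag store.  */
static void
on_sigio (int sig)
{
  (void) sig;
  pending_input = 1;
}

/* Install the handler for SIGNO; returns 0 on success.  */
static int
install_flag_handler (int signo)
{
  struct sigaction sa;
  memset (&sa, 0, sizeof sa);
  sa.sa_handler = on_sigio;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = SA_RESTART;
  return sigaction (signo, &sa, NULL);
}

/* Called from the main loop, outside the handler: consume the flag
   and report whether a signal arrived since the last call.  */
static int
poll_pending_input (void)
{
  if (!pending_input)
    return 0;
  pending_input = 0;
  return 1;
}
```

All real work happens in the main loop after poll_pending_input
returns 1, at a point where the program state is known to be sane.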
* Re: Some experience with the igc branch
2024-12-25 12:19 ` Eli Zaretskii
@ 2024-12-25 12:50 ` Gerd Möllmann
2024-12-25 13:00 ` Eli Zaretskii
2024-12-25 13:09 ` Eli Zaretskii
0 siblings, 2 replies; 91+ messages in thread
From: Gerd Möllmann @ 2024-12-25 12:50 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: pipcet, ofv, emacs-devel, eller.helmut, acorallo
Eli Zaretskii <eliz@gnu.org> writes:
>> From: Gerd Möllmann <gerd.moellmann@gmail.com>
>> Cc: pipcet@protonmail.com, ofv@wanadoo.es, emacs-devel@gnu.org,
>> eller.helmut@gmail.com, acorallo@gnu.org
>> Date: Wed, 25 Dec 2024 05:56:26 +0100
>>
>> Eli Zaretskii <eliz@gnu.org> writes:
>>
>> >> The SIGPROF handler does two things: (1) get the current backtrace,
>> >> which does not trip on memory barriers, and (2) build a summary, i.e.
>> >> count same backtraces using a hash table. (2) trips on memory barriers.
>> >
>> > Can you elaborate on (2) and why it trips? I guess I'm missing
>> > something because I don't understand which code in record_backtrace
>> > does trip on memory barriers and why.
>>
>> Ok, (2) begins as shown below.
>>
>> static void
>> record_backtrace (struct profiler_log *plog, EMACS_INT count)
>> {
>> log_t *log = plog->log;
>> get_backtrace (log->trace, log->depth);
>> --- (2) begins after this line -------------------------------
>> EMACS_UINT hash = trace_hash (log->trace, log->depth);
>>
>> The SIGPROF can have interrupted Emacs at any point, both the MPS thread
>> and all others. MPS may have been doing arbitrary stuff when
>> interrupted, and Emacs threads too. Memory barriers may be on
>> unpredictable segments of memory, as they usually are, as part of MPS'
>> GC implementation. Do you agree with this picture?
>>
>> Elsewhere I tried to explain why I think this works up to the line
>> marked (2) above. Now enter trace_hash. Current implementation:
>>
>> static EMACS_UINT
>> trace_hash (Lisp_Object *trace, int depth)
>> {
>> EMACS_UINT hash = 0;
>> for (int i = 0; i < depth; i++)
>> {
>> Lisp_Object f = trace[i];
>> EMACS_UINT hash1;
>> #ifdef HAVE_MPS
>> hash1 = (CLOSUREP (f) ? igc_hash (AREF (f, CLOSURE_CODE)) : igc_hash (f));
>> ^^^^^^^^ ^^^^^^^^ ^^^^
>>
>> The constructs I marked with ^^^ all access the memory of F. F is a
>> vectorlike, its memory is managed by MPS in an MPS pool that uses
>> memory barriers, so the memory of F can currently be behind a barrier.
>> It doesn't have to, but it can.
>>
>> When we access F's memory and it is behind a barrier, the result is a
>> nested SIGSEGV while handling SIGPROF.
>
> Two followup questions:
>
> . how is accessing F different from accessing the specpdl stack?
F's memory is allocated from an MPS pool via alloc_impl in igc.c. Most
objects are allocated from a pool that uses barriers (I think except
PVEC_THREAD). The specpdl stacks are malloc'd (see
grow_specpdl_allocation) and used as roots. There are currently no
barriers on roots.
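
To make "memory behind a barrier" concrete, here is a toy sketch of
the general mechanism using mprotect and a SIGSEGV handler. It is not
MPS code and not how igc.c is written; page, on_fault,
setup_barrier_demo, and read_protected_byte are names invented for the
example, and it assumes a POSIX system with mmap (tested mentally
against Linux behavior only). Reading the protected page faults, the
handler lifts the protection, and the read is then retried. Ordinary
malloc'd memory, like the specpdl stacks, never gets this treatment.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* glibc: expose MAP_ANONYMOUS */
#endif
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static char *page;                         /* stand-in for pool memory */
static size_t page_size;
static volatile sig_atomic_t barrier_hits; /* faults seen so far */

/* Toy barrier handler: count the hit and lift the protection so that
   the faulting access can be retried.  Calling mprotect here is not
   formally async-signal-safe; it is done only for illustration.  */
static void
on_fault (int sig, siginfo_t *info, void *ctx)
{
  (void) sig;
  (void) ctx;
  char *addr = info->si_addr;
  if (addr < page || addr >= page + page_size)
    _exit (2);                  /* a genuine crash, not our barrier */
  barrier_hits++;
  mprotect (page, page_size, PROT_READ | PROT_WRITE);
}

/* Map one page, store a value in it, install the fault handler, and
   then protect the page.  Returns 0 on success.  */
static int
setup_barrier_demo (void)
{
  page_size = (size_t) sysconf (_SC_PAGESIZE);
  page = mmap (NULL, page_size, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (page == MAP_FAILED)
    return -1;
  page[0] = 42;

  struct sigaction sa;
  memset (&sa, 0, sizeof sa);
  sa.sa_sigaction = on_fault;
  sa.sa_flags = SA_SIGINFO;
  sigemptyset (&sa.sa_mask);
  /* Some systems report protection faults as SIGBUS.  */
  if (sigaction (SIGSEGV, &sa, NULL) != 0
      || sigaction (SIGBUS, &sa, NULL) != 0)
    return -1;

  return mprotect (page, page_size, PROT_NONE);
}

/* An innocent-looking read that trips the barrier the first time.  */
static int
read_protected_byte (void)
{
  return *(volatile char *) page;
}
```

If read_protected_byte were called from inside another signal handler,
the fault would be delivered while that handler is still running,
which is the nested-signal situation described for SIGPROF.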
> . how does this work with the current GC, where F could have been
> collected and its memory freed?
I think when we find F in a specpdl stack, GC should have seen it and
marked it too in mark_specpdl. So it wouldn't be freed.
(Same for igc, where the stacks are roots, and MPS should have seen F
that way in scan_specpdl.)
> The first question is more important, from where I stand. Looking
> forward beyond the point where we land igc on master, I wonder how
> we will be able to tell, for a random non-trivial change on the C level,
> whether what it does can cause trouble with MPS? That is, how can a
> mere mortal determine whether a given data structure in igc Emacs can
> or cannot be safely touched when MPS happens to do its thing, whether
> synchronously or asynchronously? We must have some reasonably
> practical way of telling this, or else we will be breaking Emacs high
> and low.
>
>> More code accessing memory that is potentially behind a barrier follows
>> in record_backtrace.
>
> Which code is that? (It's a serious question: I tried to identify
> that code, but couldn't. I'm probably missing something.)
The example I saw, with ^^^^ marking the call sites:
  static void
  record_backtrace (struct profiler_log *plog, EMACS_INT count)
  {
    log_t *log = plog->log;
    get_backtrace (log->trace, log->depth);
    EMACS_UINT hash = trace_hash (log->trace, log->depth);
    int hidx = log_hash_index (log, hash);
    int idx = log->index[hidx];
    while (idx >= 0)
      {
        if (log->hash[idx] == hash
            && trace_equal (log->trace, get_key_vector (log, idx), log->depth))
               ^^^^^^^^^^^

  static bool
  trace_equal (Lisp_Object *bt1, Lisp_Object *bt2, int depth)
  {
    for (int i = 0; i < depth; i++)
      if (!BASE_EQ (bt1[i], bt2[i]) && NILP (Ffunction_equal (bt1[i], bt2[i])))
                                             ^^^^^^^^^^^^^^^

  DEFUN ("function-equal", Ffunction_equal, Sfunction_equal, 2, 2, 0,
         doc: /* Return non-nil if F1 and F2 come from the same source.
  Used to determine if different closures are just different instances of
  the same lambda expression, or are really unrelated function. */)
    (Lisp_Object f1, Lisp_Object f2)
  {
    bool res;
    if (EQ (f1, f2))
      res = true;
    else if (CLOSUREP (f1) && CLOSUREP (f2))
             ^^^^^^^^         ^^^^^^^^
      res = EQ (AREF (f1, CLOSURE_CODE), AREF (f2, CLOSURE_CODE));
                ^^^^                     ^^^^
Didn't look further than that, though.
* Re: Some experience with the igc branch
2024-12-25 11:48 ` Helmut Eller
2024-12-25 11:58 ` Gerd Möllmann
@ 2024-12-25 12:52 ` Eli Zaretskii
1 sibling, 0 replies; 91+ messages in thread
From: Eli Zaretskii @ 2024-12-25 12:52 UTC (permalink / raw)
To: Helmut Eller; +Cc: gerd.moellmann, pipcet, ofv, emacs-devel, acorallo
> From: Helmut Eller <eller.helmut@gmail.com>
> Cc: Pip Cet <pipcet@protonmail.com>, Eli Zaretskii <eliz@gnu.org>,
> ofv@wanadoo.es, emacs-devel@gnu.org, acorallo@gnu.org
> Date: Wed, 25 Dec 2024 12:48:44 +0100
>
> On Wed, Dec 25 2024, Gerd Möllmann wrote:
>
> > Pip Cet <pipcet@protonmail.com> writes:
> >
> >> I don't think that's the problem. The problem is that signals can
> >> interrupt MPS, on all platforms.
> [...]
> > And I don't think that's right :-). It's completely right that in the
> > SIGPROF handler everything can be inconsistent. That's true both for MPS
> > and Emacs. For example, the bindings stack (specpdl) may be inconsistent
> > when SIGPROF arrives. Literally everything we do in the SIGPROF runs the
> > risk of encountering inconsistencies.
>
> The SIGPROF handler copies part of the potentially inconsistent state to
> the profiler log. That same potentially inconsistent profiler log is
> used later, outside the signal handler. Sounds like a problem to me.
> Is it not? Or is the probability of inconsistencies being copied so low
> that we ignore it?
Or maybe the profiling code is robust in the face of these
inconsistencies?
* Re: Some experience with the igc branch
2024-12-25 12:31 ` Eli Zaretskii
@ 2024-12-25 12:54 ` Gerd Möllmann
0 siblings, 0 replies; 91+ messages in thread
From: Gerd Möllmann @ 2024-12-25 12:54 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: pipcet, ofv, emacs-devel, eller.helmut, acorallo
Eli Zaretskii <eliz@gnu.org> writes:
>> From: Gerd Möllmann <gerd.moellmann@gmail.com>
>> Cc: Eli Zaretskii <eliz@gnu.org>, ofv@wanadoo.es, emacs-devel@gnu.org,
>> eller.helmut@gmail.com, acorallo@gnu.org
>> Date: Wed, 25 Dec 2024 06:23:49 +0100
>>
>> > If they don't have separate stacks, and we interrupt MPS, the signal
>> > handler cannot look at any MPS-modifiable memory (including roots, which
>> > may be in an inconsistent state mid-GC), ever. This includes the
>> > specpdl. We can't write to MPS-known memory, ever. This includes any
>> > area we might want to copy the backtrace or specpdl to.
>>
>> And I don't think that's right :-). It's completely right that in the
>> SIGPROF handler everything can be inconsistent. That's true both for MPS
>> and Emacs. For example, the bindings stack (specpdl) may be inconsistent
>> when SIGPROF arrives.
>
> Theoretically, maybe. But in practice, you'd need to identify the
> code which manipulates specpdl that could have specpdl in inconsistent
> state if interrupted at some opportune point. Can you identify such
> places in the code?
Which is basically what I answered to Helmut in another sub-thread.
...
>> For me, it's not about a theoretical or even practical solution that
>> somehow ensures a consistent state in MPS, or some future changes in MPS
>> or something. It's about getting away with what we do in the profiler
>> _now_, as we do with the old GC, which is already seeing potentially
>> inconsistent state in Emacs' memory.
>
> See above: there's a difference. So I would really like to hear why
> you think accessing specpdl from a SIGPROF handler in an igc build is
> safe.
I tried to answer your questions in a different reply sent a few minutes
ago.
* Re: Some experience with the igc branch
2024-12-25 12:50 ` Gerd Möllmann
@ 2024-12-25 13:00 ` Eli Zaretskii
2024-12-25 13:08 ` Gerd Möllmann
2024-12-25 13:09 ` Eli Zaretskii
1 sibling, 1 reply; 91+ messages in thread
From: Eli Zaretskii @ 2024-12-25 13:00 UTC (permalink / raw)
To: Gerd Möllmann; +Cc: pipcet, ofv, emacs-devel, eller.helmut, acorallo
> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: pipcet@protonmail.com, ofv@wanadoo.es, emacs-devel@gnu.org,
> eller.helmut@gmail.com, acorallo@gnu.org
> Date: Wed, 25 Dec 2024 13:50:37 +0100
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> >> More code accessing memory that is potentially behind a barrier follows
> >> in record_backtrace.
> >
> > Which code is that? (It's a serious question: I tried to identify
> > that code, but couldn't. I'm probably missing something.)
>
> The example I saw, with ^^^^ marking the call sites:
>
> static void
> record_backtrace (struct profiler_log *plog, EMACS_INT count)
> {
> log_t *log = plog->log;
> get_backtrace (log->trace, log->depth);
> EMACS_UINT hash = trace_hash (log->trace, log->depth);
> int hidx = log_hash_index (log, hash);
> int idx = log->index[hidx];
> while (idx >= 0)
> {
> if (log->hash[idx] == hash
> && trace_equal (log->trace, get_key_vector (log, idx), log->depth))
> ^^^^^^^^^^^
>
> static bool
> trace_equal (Lisp_Object *bt1, Lisp_Object *bt2, int depth)
> {
> for (int i = 0; i < depth; i++)
> if (!BASE_EQ (bt1[i], bt2[i]) && NILP (Ffunction_equal (bt1[i], bt2[i])))
> ^^^^^^^^^^^^^^^
>
> DEFUN ("function-equal", Ffunction_equal, Sfunction_equal, 2, 2, 0,
> doc: /* Return non-nil if F1 and F2 come from the same source.
> Used to determine if different closures are just different instances of
> the same lambda expression, or are really unrelated function. */)
> (Lisp_Object f1, Lisp_Object f2)
> {
> bool res;
> if (EQ (f1, f2))
> res = true;
> else if (CLOSUREP (f1) && CLOSUREP (f2))
> ^^^^^^^^ ^^^^^^^^
> res = EQ (AREF (f1, CLOSURE_CODE), AREF (f2, CLOSURE_CODE));
> ^^^^ ^^^^
>
> Didn't look further than that, though.
But CLOSUREP is just
  INLINE bool
  CLOSUREP (Lisp_Object a)
  {
    return PSEUDOVECTORP (a, PVEC_CLOSURE);
  }
And AREF is even simpler:
  INLINE Lisp_Object
  AREF (Lisp_Object array, ptrdiff_t idx)
  {
    eassert (0 <= idx && idx < gc_asize (array));
    return XVECTOR (array)->contents[idx];
  }
So why are those unsafe? Because they access Lisp objects, or for
some other reason?
* Re: Some experience with the igc branch
2024-12-25 13:00 ` Eli Zaretskii
@ 2024-12-25 13:08 ` Gerd Möllmann
0 siblings, 0 replies; 91+ messages in thread
From: Gerd Möllmann @ 2024-12-25 13:08 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: pipcet, ofv, emacs-devel, eller.helmut, acorallo
Eli Zaretskii <eliz@gnu.org> writes:
>> From: Gerd Möllmann <gerd.moellmann@gmail.com>
>> Cc: pipcet@protonmail.com, ofv@wanadoo.es, emacs-devel@gnu.org,
>> eller.helmut@gmail.com, acorallo@gnu.org
>> Date: Wed, 25 Dec 2024 13:50:37 +0100
>>
>> Eli Zaretskii <eliz@gnu.org> writes:
>>
>> >> More code accessing memory that is potentially behind a barrier follows
>> >> in record_backtrace.
>> >
>> > Which code is that? (It's a serious question: I tried to identify
>> > that code, but couldn't. I'm probably missing something.)
>>
>> The example I saw, with ^^^^ marking the call sites:
>>
>> static void
>> record_backtrace (struct profiler_log *plog, EMACS_INT count)
>> {
>> log_t *log = plog->log;
>> get_backtrace (log->trace, log->depth);
>> EMACS_UINT hash = trace_hash (log->trace, log->depth);
>> int hidx = log_hash_index (log, hash);
>> int idx = log->index[hidx];
>> while (idx >= 0)
>> {
>> if (log->hash[idx] == hash
>> && trace_equal (log->trace, get_key_vector (log, idx), log->depth))
>> ^^^^^^^^^^^
>>
>> static bool
>> trace_equal (Lisp_Object *bt1, Lisp_Object *bt2, int depth)
>> {
>> for (int i = 0; i < depth; i++)
>> if (!BASE_EQ (bt1[i], bt2[i]) && NILP (Ffunction_equal (bt1[i], bt2[i])))
>> ^^^^^^^^^^^^^^^
>>
>> DEFUN ("function-equal", Ffunction_equal, Sfunction_equal, 2, 2, 0,
>> doc: /* Return non-nil if F1 and F2 come from the same source.
>> Used to determine if different closures are just different instances of
>> the same lambda expression, or are really unrelated function. */)
>> (Lisp_Object f1, Lisp_Object f2)
>> {
>> bool res;
>> if (EQ (f1, f2))
>> res = true;
>> else if (CLOSUREP (f1) && CLOSUREP (f2))
>> ^^^^^^^^ ^^^^^^^^
>> res = EQ (AREF (f1, CLOSURE_CODE), AREF (f2, CLOSURE_CODE));
>> ^^^^ ^^^^
>>
>> Didn't look further than that, though.
>
> But CLOSUREP is just
>
> INLINE bool
> CLOSUREP (Lisp_Object a)
> {
> return PSEUDOVECTORP (a, PVEC_CLOSURE);
> }
PSEUDOVECTORP reads the vectorlike_header header from A's memory.
> And AREF is even simpler:
>
> INLINE Lisp_Object
> AREF (Lisp_Object array, ptrdiff_t idx)
> {
> eassert (0 <= idx && idx < gc_asize (array));
> return XVECTOR (array)->contents[idx];
> }
And AREF accesses ARRAY's memory via ->contents.
> So why are those unsafe? Because they access Lisp objects, or for
> some other reason?
What do you mean by unsafe? We are accessing an object's memory. That
memory may potentially be protected by a barrier. I thought we agreed on
that.
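
The distinction can be shown with a toy model; this is not Emacs code,
and toy_object, toy_vectorlike, TOY_PVEC_CLOSURE, and the toy_*
functions are all hypothetical stand-ins. Comparing two object words
(as BASE_EQ does) looks only at the words themselves, while a type
predicate or an element access has to read memory belonging to the
object. The counter heap_reads makes those reads visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for Lisp_Object: one machine word that, in
   this toy, always holds a pointer to a toy_vectorlike.  */
typedef uintptr_t toy_object;

struct toy_vectorlike
{
  uintptr_t header;          /* made-up type code lives here */
  toy_object contents[2];    /* slot 0 plays the role of CLOSURE_CODE */
};

enum { TOY_PVEC_CLOSURE = 7 };  /* arbitrary value for the sketch */

/* Counts reads of object memory, i.e. accesses that could land on a
   protected page if the object lived in a barrier-using pool.  */
static int heap_reads;

/* Like BASE_EQ: compares the two words, touches no object memory.  */
static bool
toy_base_eq (toy_object a, toy_object b)
{
  return a == b;
}

/* Like CLOSUREP via PSEUDOVECTORP: reads the header from the object.  */
static bool
toy_closurep (toy_object a)
{
  heap_reads++;
  return ((struct toy_vectorlike *) a)->header == TOY_PVEC_CLOSURE;
}

/* Like AREF: reads a slot from the object's contents.  */
static toy_object
toy_aref (toy_object a, int idx)
{
  heap_reads++;
  return ((struct toy_vectorlike *) a)->contents[idx];
}

/* Same shape as the function-equal test quoted above.  */
static bool
toy_function_equal (toy_object f1, toy_object f2)
{
  if (toy_base_eq (f1, f2))
    return true;                       /* no object memory touched */
  return (toy_closurep (f1) && toy_closurep (f2)
          && toy_base_eq (toy_aref (f1, 0), toy_aref (f2, 0)));
}
```

Identical words are settled without any read; two distinct closures
cost four reads of object memory before the comparison finishes.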
* Re: Some experience with the igc branch
2024-12-25 12:50 ` Gerd Möllmann
2024-12-25 13:00 ` Eli Zaretskii
@ 2024-12-25 13:09 ` Eli Zaretskii
1 sibling, 0 replies; 91+ messages in thread
From: Eli Zaretskii @ 2024-12-25 13:09 UTC (permalink / raw)
To: Gerd Möllmann; +Cc: pipcet, ofv, emacs-devel, eller.helmut, acorallo
> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: pipcet@protonmail.com, ofv@wanadoo.es, emacs-devel@gnu.org,
> eller.helmut@gmail.com, acorallo@gnu.org
> Date: Wed, 25 Dec 2024 13:50:37 +0100
>
> > . how is accessing F different from accessing the specpdl stack?
>
> F's memory is allocated from an MPS pool via alloc_impl in igc.c. Most
> objects are allocated from a pool that uses barriers (I think except
> PVEC_THREAD). The specpdl stacks are malloc'd (see
> grow_specpdl_allocation) and used as roots. There are currently no
> barriers on roots.
So you are saying that the answer to this:
> > The first question is more important, from where I stand. Looking
> > forward beyond the point where we land igc on master, I wonder how
> > we will be able to tell, for a random non-trivial change on the C level,
> > whether what it does can cause trouble with MPS? That is, how can a
> > mere mortal determine whether a given data structure in igc Emacs can
> > or cannot be safely touched when MPS happens to do its thing, whether
> > synchronously or asynchronously? We must have some reasonably
> > practical way of telling this, or else we will be breaking Emacs high
> > and low.
is that we need to trace each datum to see whether it is "used as
roots" (what does that mean in practice, btw?) or is "allocated via
alloc_impl in igc.c"? Does the latter include all the Lisp objects
(except fixnums)? Do we allocate non-Lisp data via alloc_impl, and if
so, which data?
Once again, I think this is very important for future maintenance. I
feel that this barrier thing in MPS introduces significant
complications into reasoning about safety of C-level changes.
Previously, we only had the mark bit to worry about if we wanted to
access Lisp objects during GC (see gc_asize, for example), but now we
have a much larger problem, AFAIU. How do we manage that for the next
40 years?
end of thread, other threads:[~2024-12-25 13:09 UTC | newest]
Thread overview: 91+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-12-22 15:40 Some experience with the igc branch Óscar Fuentes
2024-12-22 17:18 ` Gerd Möllmann
2024-12-22 17:29 ` Gerd Möllmann
2024-12-22 17:41 ` Pip Cet via Emacs development discussions.
2024-12-22 17:56 ` Gerd Möllmann
2024-12-22 19:11 ` Óscar Fuentes
2024-12-23 0:05 ` Pip Cet via Emacs development discussions.
2024-12-23 1:00 ` Óscar Fuentes
2024-12-24 22:34 ` Pip Cet via Emacs development discussions.
2024-12-25 4:25 ` Freezing frame with igc Gerd Möllmann
2024-12-25 11:19 ` Pip Cet via Emacs development discussions.
2024-12-25 11:55 ` Óscar Fuentes
2024-12-23 3:42 ` Some experience with the igc branch Gerd Möllmann
2024-12-23 6:27 ` Jean Louis
2024-12-22 20:29 ` Helmut Eller
2024-12-22 20:50 ` Gerd Möllmann
2024-12-22 22:26 ` Pip Cet via Emacs development discussions.
2024-12-23 3:23 ` Gerd Möllmann
[not found] ` <m234ieddeu.fsf_-_@gmail.com>
[not found] ` <87ttaueqp9.fsf@protonmail.com>
[not found] ` <m2frme921u.fsf@gmail.com>
[not found] ` <87ldw6ejkv.fsf@protonmail.com>
[not found] ` <m2bjx2h8dh.fsf@gmail.com>
2024-12-23 14:45 ` Make Signal handling patch platform-dependent? Pip Cet via Emacs development discussions.
2024-12-23 14:54 ` Gerd Möllmann
2024-12-23 15:11 ` Eli Zaretskii
2024-12-23 13:35 ` Some experience with the igc branch Eli Zaretskii
2024-12-23 14:03 ` Discussion with MPS people Gerd Möllmann
2024-12-23 14:04 ` Gerd Möllmann
2024-12-23 15:07 ` Some experience with the igc branch Pip Cet via Emacs development discussions.
2024-12-23 15:26 ` Gerd Möllmann
2024-12-23 16:03 ` Pip Cet via Emacs development discussions.
2024-12-23 16:44 ` Eli Zaretskii
2024-12-23 17:16 ` Pip Cet via Emacs development discussions.
2024-12-23 18:35 ` Eli Zaretskii
2024-12-23 18:48 ` Gerd Möllmann
2024-12-23 19:25 ` Eli Zaretskii
2024-12-23 20:30 ` Benjamin Riefenstahl
2024-12-23 23:39 ` Pip Cet via Emacs development discussions.
2024-12-24 12:14 ` Eli Zaretskii
2024-12-24 13:18 ` Pip Cet via Emacs development discussions.
2024-12-24 13:42 ` Benjamin Riefenstahl
2024-12-24 3:37 ` Eli Zaretskii
2024-12-24 8:48 ` Benjamin Riefenstahl
2024-12-24 13:52 ` Eli Zaretskii
2024-12-24 13:54 ` Benjamin Riefenstahl
2024-12-23 17:44 ` Gerd Möllmann
2024-12-23 19:00 ` Eli Zaretskii
2024-12-23 19:37 ` Eli Zaretskii
2024-12-23 20:49 ` Gerd Möllmann
2024-12-23 21:43 ` Helmut Eller
2024-12-23 21:49 ` Pip Cet via Emacs development discussions.
2024-12-23 21:58 ` Helmut Eller
2024-12-23 23:20 ` Pip Cet via Emacs development discussions.
2024-12-24 5:38 ` Helmut Eller
2024-12-24 6:27 ` Gerd Möllmann
2024-12-24 10:09 ` Pip Cet via Emacs development discussions.
2024-12-24 4:05 ` Gerd Möllmann
2024-12-24 8:50 ` Gerd Möllmann
2024-12-24 6:03 ` SIGPROF + SIGCHLD and igc Gerd Möllmann
2024-12-24 8:23 ` Helmut Eller
2024-12-24 8:39 ` Gerd Möllmann
2024-12-25 9:22 ` Helmut Eller
2024-12-25 9:43 ` Gerd Möllmann
2024-12-24 13:05 ` Eli Zaretskii
2024-12-25 10:46 ` Helmut Eller
2024-12-25 12:45 ` Eli Zaretskii
2024-12-24 12:54 ` Eli Zaretskii
2024-12-24 12:59 ` Gerd Möllmann
2024-12-23 23:37 ` Some experience with the igc branch Pip Cet via Emacs development discussions.
2024-12-24 4:03 ` Gerd Möllmann
2024-12-24 10:25 ` Pip Cet via Emacs development discussions.
2024-12-24 10:50 ` Gerd Möllmann
2024-12-24 13:15 ` Eli Zaretskii
2024-12-24 12:26 ` Eli Zaretskii
2024-12-24 12:56 ` Gerd Möllmann
2024-12-24 13:19 ` Pip Cet via Emacs development discussions.
2024-12-24 13:38 ` Gerd Möllmann
2024-12-24 13:46 ` Eli Zaretskii
2024-12-24 14:12 ` Gerd Möllmann
2024-12-24 14:40 ` Eli Zaretskii
2024-12-25 4:56 ` Gerd Möllmann
2024-12-25 12:19 ` Eli Zaretskii
2024-12-25 12:50 ` Gerd Möllmann
2024-12-25 13:00 ` Eli Zaretskii
2024-12-25 13:08 ` Gerd Möllmann
2024-12-25 13:09 ` Eli Zaretskii
2024-12-24 21:18 ` Pip Cet via Emacs development discussions.
2024-12-25 5:23 ` Gerd Möllmann
2024-12-25 10:48 ` Pip Cet via Emacs development discussions.
2024-12-25 11:48 ` Helmut Eller
2024-12-25 11:58 ` Gerd Möllmann
2024-12-25 12:52 ` Eli Zaretskii
2024-12-25 12:31 ` Eli Zaretskii
2024-12-25 12:54 ` Gerd Möllmann
2024-12-24 12:11 ` Eli Zaretskii