* MPS: Bytecode stack
@ 2024-05-06 9:58 Gerd Möllmann
2024-05-06 12:12 ` Mattias Engdegård
0 siblings, 1 reply; 10+ messages in thread
From: Gerd Möllmann @ 2024-05-06 9:58 UTC (permalink / raw)
To: Eli Zaretskii, Helmut Eller; +Cc: Emacs Devel
I think I fixed, or better said worked around, something important today
in scan_bc which can lead to objects being "lost". Might be interesting
to see if some errors magically disappear.
A real fix has yet to be found, but see the FIXME comment in the commit.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: MPS: Bytecode stack
2024-05-06 9:58 MPS: Bytecode stack Gerd Möllmann
@ 2024-05-06 12:12 ` Mattias Engdegård
2024-05-06 12:47 ` Gerd Möllmann
0 siblings, 1 reply; 10+ messages in thread
From: Mattias Engdegård @ 2024-05-06 12:12 UTC (permalink / raw)
To: Gerd Möllmann; +Cc: Eli Zaretskii, Helmut Eller, Emacs Devel
6 maj 2024 kl. 11.58 skrev Gerd Möllmann <gerd.moellmann@gmail.com>:
> I think I fixed, or better said worked around, something important today
> in scan_bc which can lead to objects being "lost". Might be interesting
> to see if some errors magically disappear.
I'm not sure what you are trying to fix. Dodging a data race?
To see if I understood what your problem is, and how MPS works, consider the thread state
Lisp_Object *a;
int top;
where `a` is a heap-allocated array of some size, but only indices [0..top] are actually used so the GC doesn't need to look at the rest. The mutator thread does:
1 x = a[top];
2 top = top - 1;
What does the GC thread see? Even if it reads a (signal-atomic) snapshot of the mutator's register and stack, does this atomicity extend to `top` which isn't even a Lisp_Object? Is the mutator suspended during this scan?
Furthermore, the compiler is allowed to hoist the store in line 2 past the load in line 1. If that's the only problem, it may just be a matter of adding a compiler barrier (acquire).
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: MPS: Bytecode stack
2024-05-06 12:12 ` Mattias Engdegård
@ 2024-05-06 12:47 ` Gerd Möllmann
2024-05-06 13:07 ` Mattias Engdegård
0 siblings, 1 reply; 10+ messages in thread
From: Gerd Möllmann @ 2024-05-06 12:47 UTC (permalink / raw)
To: Mattias Engdegård; +Cc: Eli Zaretskii, Helmut Eller, Emacs Devel
Mattias Engdegård <mattias.engdegard@gmail.com> writes:
> 6 maj 2024 kl. 11.58 skrev Gerd Möllmann <gerd.moellmann@gmail.com>:
>
>> I think I fixed, or better said worked around, something important today
>> in scan_bc which can lead to objects being "lost". Might be interesting
>> to see if some errors magically disappear.
>
> I'm not sure what you are trying to fix. Dodging a data race?
>
> To see if I understood what your problem is, and how MPS works, consider the thread state
>
> Lisp_Object *a;
> int top;
>
> where `a` is a heap-allocated array of some size, but only indices [0..top] are actually used so the GC doesn't need to look at the rest. The mutator thread does:
>
> 1 x = a[top];
> 2 top = top - 1;
>
> What does the GC thread see? Even if it reads a (signal-atomic)
> snapshot of the mutator's register and stack, does this atomicity
> extend to `top` which isn't even a Lisp_Object? Is the mutator
> suspended during this scan?
>
> Furthermore, the compiler is allowed to hoist the store in line 2 past
> the load in line 1. If that's the only problem, it may just be a
> matter of adding a compiler barrier (acquire).
It would be easiest if you'd look at scan_bc in igc.c on the branch
That explains it better als prosa:
static mps_res_t
scan_bc (mps_ss_t ss, void *start, void *end, void *closure)
{
MPS_SCAN_BEGIN (ss)
{x
struct igc_thread_list *t = closure;
struct bc_thread_state *bc = &t->d.ts->bc;
igc_assert (start == (void *) bc->stack);
igc_assert (end == (void *) bc->stack_end);
/* FIXME: AFAIU the current top frame starts at bc->fp->next_stack
and has a maximum length that is given by the bytecode being
executed (COMPILED_STACK_DEPTH). So, we need to scan upto
bc->fo->next_stack + that max depth to be safe. Since I don't
have that number ATM, I'm using an arbitrary estimate for
now.
This must be changed to something better. Note that Mattias said
the bc stack marking will be changed in the future. */
const size_t HORRIBLE_ESTIMATE = 1024;
char *scan_end = bc_next_frame (bc->fp);
scan_end += HORRIBLE_ESTIMATE;
end = min (end, (void *) scan_end);
if (end > start)
IGC_FIX_CALL (ss, scan_ambig (ss, start, end, NULL));
}
MPS_SCAN_END (ss);
return MPS_RES_OK;
}
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: MPS: Bytecode stack
2024-05-06 12:47 ` Gerd Möllmann
@ 2024-05-06 13:07 ` Mattias Engdegård
2024-05-06 13:23 ` Gerd Möllmann
0 siblings, 1 reply; 10+ messages in thread
From: Mattias Engdegård @ 2024-05-06 13:07 UTC (permalink / raw)
To: Gerd Möllmann; +Cc: Eli Zaretskii, Helmut Eller, Emacs Devel
6 maj 2024 kl. 14.47 skrev Gerd Möllmann <gerd.moellmann@gmail.com>:
> It would be easiest if you'd look at scan_bc in igc.c on the branch
I did, which is why I wondered whether I had understood the problem correctly. Maybe we can help each other:
> /* FIXME: AFAIU the current top frame starts at bc->fp->next_stack
Does it? Look at the ASCII art in bytecode.c. During bytecode execution, fp->next_stack is the lowest completely unused entry of the bytecode stack, but that refers to the engine register `fp` which is almost but not entirely in sync with `bc->fp`. There are gaps when there is live data on the stack beyond bc->fp->next_stack, such as during frame setup:
514 Lisp_Object *frame_base = bc->fp->next_stack;
515 struct bc_frame *fp = (struct bc_frame *)(frame_base + max_stack);
516
517 if ((char *)fp->next_stack > bc->stack_end)
518 error ("Bytecode stack overflow");
519
520 /* Save the function object so that the bytecode and vector are
521 held from removal by the GC. */
522 fp->fun = fun;
523 /* Save previous stack pointer and pc in the new frame. If we came
524 directly from outside, these will be NULL. */
525 fp->saved_top = top;
526 fp->saved_pc = pc;
527 fp->saved_fp = bc->fp;
528 bc->fp = fp;
which is why I was concerned about whether there might be a race somewhere. Note, however, that in this particular window we don't stash anything to the stack beyond bc->fp->next_stack that isn't already accessible via ambig refs elsewhere (`fun` in particular).
There may be similar gaps when we pop stack frames: return from a function and in the catch and condition-case handling, but it would be easier if I knew what I was looking for.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: MPS: Bytecode stack
2024-05-06 13:07 ` Mattias Engdegård
@ 2024-05-06 13:23 ` Gerd Möllmann
2024-05-06 13:35 ` Gerd Möllmann
0 siblings, 1 reply; 10+ messages in thread
From: Gerd Möllmann @ 2024-05-06 13:23 UTC (permalink / raw)
To: Mattias Engdegård; +Cc: Eli Zaretskii, Helmut Eller, Emacs Devel
Mattias Engdegård <mattias.engdegard@gmail.com> writes:
> 6 maj 2024 kl. 14.47 skrev Gerd Möllmann <gerd.moellmann@gmail.com>:
>
>> It would be easiest if you'd look at scan_bc in igc.c on the branch
>
> I did, which is why I wondered whether I had understood the problem correctly. Maybe we can help each other:
>
>> /* FIXME: AFAIU the current top frame starts at bc->fp->next_stack
>
> Does it? Look at the ASCII art in bytecode.c. During bytecode
> execution, fp->next_stack is the lowest completely unused entry of the
> bytecode stack, but that refers to the engine register `fp` which is
> almost but not entirely in sync with `bc->fp`. There are gaps when
> there is live data on the stack beyond bc->fp->next_stack, such as
> during frame setup:
>
> 514 Lisp_Object *frame_base = bc->fp->next_stack;
> 515 struct bc_frame *fp = (struct bc_frame *)(frame_base + max_stack);
> 516
> 517 if ((char *)fp->next_stack > bc->stack_end)
> 518 error ("Bytecode stack overflow");
> 519
> 520 /* Save the function object so that the bytecode and vector are
> 521 held from removal by the GC. */
> 522 fp->fun = fun;
> 523 /* Save previous stack pointer and pc in the new frame. If we came
> 524 directly from outside, these will be NULL. */
> 525 fp->saved_top = top;
> 526 fp->saved_pc = pc;
> 527 fp->saved_fp = bc->fp;
> 528 bc->fp = fp;
>
> which is why I was concerned about whether there might be a race
> somewhere. Note, however, that in this particular window we don't
> stash anything to the stack beyond bc->fp->next_stack that isn't
> already accessible via ambig refs elsewhere (`fun` in particular).
>
> There may be similar gaps when we pop stack frames: return from a
> function and in the catch and condition-case handling, but it would be
> easier if I knew what I was looking for.
Ok, let me try to explain. Thanks for looking into this!
I made the bc stack(s) ambig roots (conservativly scanned, like with
mark_memory). For the root address range I use the stack's start and
end. Since that is quite large (512K?), I though it a good idea to scan
less. So scan_bc tries to figure out an 'end' address that is smaller
the whole stack's end. That's all it is about. How to find a realistic,
scan end.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: MPS: Bytecode stack
2024-05-06 13:23 ` Gerd Möllmann
@ 2024-05-06 13:35 ` Gerd Möllmann
2024-05-06 13:54 ` Mattias Engdegård
0 siblings, 1 reply; 10+ messages in thread
From: Gerd Möllmann @ 2024-05-06 13:35 UTC (permalink / raw)
To: Mattias Engdegård; +Cc: Eli Zaretskii, Helmut Eller, Emacs Devel
Gerd Möllmann <gerd.moellmann@gmail.com> writes:
> Mattias Engdegård <mattias.engdegard@gmail.com> writes:
>
>> 6 maj 2024 kl. 14.47 skrev Gerd Möllmann <gerd.moellmann@gmail.com>:
>>
>>> It would be easiest if you'd look at scan_bc in igc.c on the branch
>>
>> I did, which is why I wondered whether I had understood the problem correctly. Maybe we can help each other:
>>
>>> /* FIXME: AFAIU the current top frame starts at bc->fp->next_stack
>>
>> Does it? Look at the ASCII art in bytecode.c. During bytecode
>> execution, fp->next_stack is the lowest completely unused entry of the
>> bytecode stack, but that refers to the engine register `fp` which is
>> almost but not entirely in sync with `bc->fp`. There are gaps when
>> there is live data on the stack beyond bc->fp->next_stack, such as
>> during frame setup:
>>
>> 514 Lisp_Object *frame_base = bc->fp->next_stack;
>> 515 struct bc_frame *fp = (struct bc_frame *)(frame_base + max_stack);
>> 516
>> 517 if ((char *)fp->next_stack > bc->stack_end)
>> 518 error ("Bytecode stack overflow");
>> 519
>> 520 /* Save the function object so that the bytecode and vector are
>> 521 held from removal by the GC. */
>> 522 fp->fun = fun;
>> 523 /* Save previous stack pointer and pc in the new frame. If we came
>> 524 directly from outside, these will be NULL. */
>> 525 fp->saved_top = top;
>> 526 fp->saved_pc = pc;
>> 527 fp->saved_fp = bc->fp;
>> 528 bc->fp = fp;
>>
>> which is why I was concerned about whether there might be a race
>> somewhere. Note, however, that in this particular window we don't
>> stash anything to the stack beyond bc->fp->next_stack that isn't
>> already accessible via ambig refs elsewhere (`fun` in particular).
>>
>> There may be similar gaps when we pop stack frames: return from a
>> function and in the catch and condition-case handling, but it would be
>> easier if I knew what I was looking for.
>
> Ok, let me try to explain. Thanks for looking into this!
>
> I made the bc stack(s) ambig roots (conservativly scanned, like with
> mark_memory). For the root address range I use the stack's start and
> end. Since that is quite large (512K?), I though it a good idea to scan
> less. So scan_bc tries to figure out an 'end' address that is smaller
> the whole stack's end. That's all it is about. How to find a realistic,
> scan end.
Or, wait a moment... I'm wondering about what happens when the client
continues to exec byte code while we are scanning... The root is not
like objects in MPS memory to which we get granted exclusive access. Or
so I think ATM. But roots can have a barrier applied to them, I think.
MPS_RM_PROT. I have to read a bit.
(BTW, funny I think: Emacs/master froze while having Emacs/igc in LLDB,
and Emacs/igc was still functioning well :-))
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: MPS: Bytecode stack
2024-05-06 13:35 ` Gerd Möllmann
@ 2024-05-06 13:54 ` Mattias Engdegård
2024-05-06 14:02 ` Gerd Möllmann
0 siblings, 1 reply; 10+ messages in thread
From: Mattias Engdegård @ 2024-05-06 13:54 UTC (permalink / raw)
To: Gerd Möllmann; +Cc: Eli Zaretskii, Helmut Eller, Emacs Devel
6 maj 2024 kl. 15.35 skrev Gerd Möllmann <gerd.moellmann@gmail.com>:
> Or, wait a moment... I'm wondering about what happens when the client
> continues to exec byte code while we are scanning... The root is not
> like objects in MPS memory to which we get granted exclusive access. Or
> so I think ATM. But roots can have a barrier applied to them, I think.
> MPS_RM_PROT. I have to read a bit.
Yes, that's basically what I was wondering and the reason for the synthetic little example.
Essentially, what context is the scanning code running in? The mutator thread? If so, as an async signal handler, or in a more controlled context? If in a separate thread, does the mutator keep running? How did we get hold of the mutator stack and registers? And so on.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: MPS: Bytecode stack
2024-05-06 13:54 ` Mattias Engdegård
@ 2024-05-06 14:02 ` Gerd Möllmann
2024-05-06 14:14 ` Gerd Möllmann
2024-05-06 15:59 ` Mattias Engdegård
0 siblings, 2 replies; 10+ messages in thread
From: Gerd Möllmann @ 2024-05-06 14:02 UTC (permalink / raw)
To: Mattias Engdegård; +Cc: Eli Zaretskii, Helmut Eller, Emacs Devel
Mattias Engdegård <mattias.engdegard@gmail.com> writes:
> 6 maj 2024 kl. 15.35 skrev Gerd Möllmann <gerd.moellmann@gmail.com>:
>
>> Or, wait a moment... I'm wondering about what happens when the client
>> continues to exec byte code while we are scanning... The root is not
>> like objects in MPS memory to which we get granted exclusive access. Or
>> so I think ATM. But roots can have a barrier applied to them, I think.
>> MPS_RM_PROT. I have to read a bit.
>
> Yes, that's basically what I was wondering and the reason for the synthetic little example.
>
> Essentially, what context is the scanning code running in? The mutator
> thread? If so, as an async signal handler, or in a more controlled
> context? If in a separate thread, does the mutator keep running? How
> did we get hold of the mutator stack and registers? And so on.
Yes, we're in the GC thread when scanning.
The docs say about MPS_RM_PROT:
MPS_RM_PROT
The root mode for protectable roots. This tells the MPS that it may
place a barrier (1) on any page containing any part of the root. No
format method or scan method (except for the one for this root) may
write data in this root. They may read it.
Note
You must not specify MPS_RM_PROT on a root allocated by the MPS.
No page may contain parts of two or more protectable roots. You mustn’t
specify MPS_RM_PROT if the client program or anything other than (this
instance of) the MPS is going to protect or unprotect the relevant
pages.
This mode may not be suitable if the client program wants the operating
system to be able to access the root. Many operating systems can’t cope
with writing to protected pages.
which is wanted to use for the loaded pdump, but have never tried.
Sounds reasonable to me.
Hm, I guess for now it's best to forget about optimization, and just
scan the whole stack until we have something that works, and then come
back later. The three rules of optimization again... don't, don't do it
yet, ...
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: MPS: Bytecode stack
2024-05-06 14:02 ` Gerd Möllmann
@ 2024-05-06 14:14 ` Gerd Möllmann
2024-05-06 15:59 ` Mattias Engdegård
1 sibling, 0 replies; 10+ messages in thread
From: Gerd Möllmann @ 2024-05-06 14:14 UTC (permalink / raw)
To: Mattias Engdegård; +Cc: Eli Zaretskii, Helmut Eller, Emacs Devel
Gerd Möllmann <gerd.moellmann@gmail.com> writes:
> Hm, I guess for now it's best to forget about optimization, and just
> scan the whole stack until we have something that works, and then come
> back later. The three rules of optimization again... don't, don't do it
> yet, ...
Now pushed. No optimizations anymore.
Thanks, Mattias, that helped me!
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: MPS: Bytecode stack
2024-05-06 14:02 ` Gerd Möllmann
2024-05-06 14:14 ` Gerd Möllmann
@ 2024-05-06 15:59 ` Mattias Engdegård
1 sibling, 0 replies; 10+ messages in thread
From: Mattias Engdegård @ 2024-05-06 15:59 UTC (permalink / raw)
To: Gerd Möllmann; +Cc: Eli Zaretskii, Helmut Eller, Emacs Devel
6 maj 2024 kl. 16.02 skrev Gerd Möllmann <gerd.moellmann@gmail.com>:
> Hm, I guess for now it's best to forget about optimization, and just
> scan the whole stack until we have something that works, and then come
> back later.
Agreed. The stack is huge, though, so we definitely want to do better, and we shouldn't forget about the ambition to scan it precisely; I haven't given up on that patch. It even helps performance with the standard GC, and would help with debug info.
And we should understand where the possible races are, partly for a performant solution but also because similar problems could be lurking elsewhere.
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2024-05-06 15:59 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-05-06 9:58 MPS: Bytecode stack Gerd Möllmann
2024-05-06 12:12 ` Mattias Engdegård
2024-05-06 12:47 ` Gerd Möllmann
2024-05-06 13:07 ` Mattias Engdegård
2024-05-06 13:23 ` Gerd Möllmann
2024-05-06 13:35 ` Gerd Möllmann
2024-05-06 13:54 ` Mattias Engdegård
2024-05-06 14:02 ` Gerd Möllmann
2024-05-06 14:14 ` Gerd Möllmann
2024-05-06 15:59 ` Mattias Engdegård
Code repositories for project(s) associated with this external index
https://git.savannah.gnu.org/cgit/emacs.git
https://git.savannah.gnu.org/cgit/emacs/org-mode.git
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.