* MPS: Bytecode stack
From: Gerd Möllmann @ 2024-05-06 9:58 UTC
To: Eli Zaretskii, Helmut Eller; Cc: Emacs Devel

I think I fixed, or rather worked around, something important in scan_bc
today which could lead to objects being "lost". Might be interesting to
see if some errors magically disappear. A real fix has yet to be found,
but see the FIXME comment in the commit.
* Re: MPS: Bytecode stack
From: Mattias Engdegård @ 2024-05-06 12:12 UTC
To: Gerd Möllmann; Cc: Eli Zaretskii, Helmut Eller, Emacs Devel

On 6 May 2024 at 11:58, Gerd Möllmann <gerd.moellmann@gmail.com> wrote:

> I think I fixed, or rather worked around, something important in scan_bc
> today which could lead to objects being "lost". Might be interesting to
> see if some errors magically disappear.

I'm not sure what you are trying to fix. Dodging a data race?

To see whether I have understood what your problem is, and how MPS works,
consider the thread state

    Lisp_Object *a;
    int top;

where `a` is a heap-allocated array of some size, but only indices
[0..top] are actually used, so the GC doesn't need to look at the rest.
The mutator thread does:

    1   x = a[top];
    2   top = top - 1;

What does the GC thread see? Even if it reads a (signal-atomic) snapshot
of the mutator's registers and stack, does this atomicity extend to
`top`, which isn't even a Lisp_Object? Is the mutator suspended during
this scan?

Furthermore, the compiler is allowed to hoist the store in line 2 above
the load in line 1. If that's the only problem, it may just be a matter
of adding a compiler barrier (acquire).
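A minimal, self-contained C sketch of the reordering hazard described
above; the `bc_state`/`bc_pop` names are invented for illustration and
are not taken from the branch, and the fence only constrains the
compiler, not the cross-thread visibility questions raised above.

#include <stdatomic.h>

typedef void *Lisp_Object;   /* stand-in for the real Emacs type */

struct bc_state
{
  Lisp_Object *a;   /* heap-allocated value stack */
  int top;          /* only indices [0..top] are live */
};

static Lisp_Object
bc_pop (struct bc_state *bc)
{
  Lisp_Object x = bc->a[bc->top];

  /* Without a barrier, the compiler may perform the store to `top`
     before the load above; an asynchronous scanner could then see
     `top` already decremented while the popped value lives only in a
     register it may not have snapshotted yet.  atomic_signal_fence
     acts as a compiler barrier and forbids that reordering.  */
  atomic_signal_fence (memory_order_seq_cst);

  bc->top = bc->top - 1;
  return x;
}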
* Re: MPS: Bytecode stack
From: Gerd Möllmann @ 2024-05-06 12:47 UTC
To: Mattias Engdegård; Cc: Eli Zaretskii, Helmut Eller, Emacs Devel

Mattias Engdegård <mattias.engdegard@gmail.com> writes:

> I'm not sure what you are trying to fix. Dodging a data race?
>
> [...]

It would be easiest if you'd look at scan_bc in igc.c on the branch.
That explains it better than prose:

static mps_res_t
scan_bc (mps_ss_t ss, void *start, void *end, void *closure)
{
  MPS_SCAN_BEGIN (ss)
  {
    struct igc_thread_list *t = closure;
    struct bc_thread_state *bc = &t->d.ts->bc;
    igc_assert (start == (void *) bc->stack);
    igc_assert (end == (void *) bc->stack_end);
    /* FIXME: AFAIU the current top frame starts at bc->fp->next_stack
       and has a maximum length that is given by the bytecode being
       executed (COMPILED_STACK_DEPTH).  So, we need to scan up to
       bc->fp->next_stack + that max depth to be safe.  Since I don't
       have that number ATM, I'm using an arbitrary estimate for now.
       This must be changed to something better.  Note that Mattias
       said the bc stack marking will be changed in the future.  */
    const size_t HORRIBLE_ESTIMATE = 1024;
    char *scan_end = bc_next_frame (bc->fp);
    scan_end += HORRIBLE_ESTIMATE;
    end = min (end, (void *) scan_end);
    if (end > start)
      IGC_FIX_CALL (ss, scan_ambig (ss, start, end, NULL));
  }
  MPS_SCAN_END (ss);
  return MPS_RES_OK;
}
* Re: MPS: Bytecode stack
From: Mattias Engdegård @ 2024-05-06 13:07 UTC
To: Gerd Möllmann; Cc: Eli Zaretskii, Helmut Eller, Emacs Devel

On 6 May 2024 at 14:47, Gerd Möllmann <gerd.moellmann@gmail.com> wrote:

> It would be easiest if you'd look at scan_bc in igc.c on the branch.

I did, which is why I wondered whether I had understood the problem
correctly. Maybe we can help each other:

> /* FIXME: AFAIU the current top frame starts at bc->fp->next_stack

Does it? Look at the ASCII art in bytecode.c. During bytecode execution,
fp->next_stack is the lowest completely unused entry of the bytecode
stack, but that refers to the engine register `fp`, which is almost but
not entirely in sync with `bc->fp`. There are gaps when there is live
data on the stack beyond bc->fp->next_stack, such as during frame setup
in bytecode.c:

    Lisp_Object *frame_base = bc->fp->next_stack;
    struct bc_frame *fp = (struct bc_frame *)(frame_base + max_stack);

    if ((char *)fp->next_stack > bc->stack_end)
      error ("Bytecode stack overflow");

    /* Save the function object so that the bytecode and vector are
       held from removal by the GC. */
    fp->fun = fun;
    /* Save previous stack pointer and pc in the new frame. If we came
       directly from outside, these will be NULL. */
    fp->saved_top = top;
    fp->saved_pc = pc;
    fp->saved_fp = bc->fp;
    bc->fp = fp;

which is why I was concerned about whether there might be a race
somewhere. Note, however, that in this particular window we don't stash
anything onto the stack beyond bc->fp->next_stack that isn't already
accessible via ambig refs elsewhere (`fun` in particular).

There may be similar gaps when we pop stack frames: on return from a
function, and in the catch and condition-case handling. But it would be
easier if I knew what I was looking for.
* Re: MPS: Bytecode stack
From: Gerd Möllmann @ 2024-05-06 13:23 UTC
To: Mattias Engdegård; Cc: Eli Zaretskii, Helmut Eller, Emacs Devel

Mattias Engdegård <mattias.engdegard@gmail.com> writes:

> Does it? Look at the ASCII art in bytecode.c. During bytecode execution,
> fp->next_stack is the lowest completely unused entry of the bytecode
> stack, but that refers to the engine register `fp`, which is almost but
> not entirely in sync with `bc->fp`. There are gaps when there is live
> data on the stack beyond bc->fp->next_stack, such as during frame setup
> in bytecode.c:
>
> [...]
>
> There may be similar gaps when we pop stack frames: on return from a
> function, and in the catch and condition-case handling. But it would be
> easier if I knew what I was looking for.

Ok, let me try to explain. Thanks for looking into this!

I made the bc stack(s) ambig roots (conservatively scanned, like with
mark_memory). For the root address range I use the stack's start and
end. Since that range is quite large (512K?), I thought it a good idea
to scan less. So scan_bc tries to figure out an 'end' address that is
smaller than the whole stack's end. That's all it is about: how to find
a realistic scan end.
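For context, a rough sketch of what registering the byte-code stack as
an ambiguous MPS root scanned by scan_bc could look like, using the MPS
area-root API. This is an illustration, not necessarily what igc.c
actually does; the root_bc field and the helper's name are assumptions,
and error handling is reduced to an abort.

/* Assumes the igc.c context: mps.h included, `arena` and the
   igc_thread_list `t` already set up.  */
static void
register_bc_root (mps_arena_t arena, struct igc_thread_list *t)
{
  struct bc_thread_state *bc = &t->d.ts->bc;
  mps_root_t root;
  mps_res_t res
    = mps_root_create_area (&root, arena,
			    mps_rank_ambig (),	/* conservative scan */
			    0,			/* no MPS_RM_PROT for now */
			    (void *) bc->stack,
			    (void *) bc->stack_end,
			    scan_bc,		/* custom area scanner */
			    t);			/* closure passed to scan_bc */
  if (res != MPS_RES_OK)
    emacs_abort ();
  t->d.root_bc = root;	/* hypothetical field remembering the root */
}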
* Re: MPS: Bytecode stack
From: Gerd Möllmann @ 2024-05-06 13:35 UTC
To: Mattias Engdegård; Cc: Eli Zaretskii, Helmut Eller, Emacs Devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> [...]
>
> I made the bc stack(s) ambig roots (conservatively scanned, like with
> mark_memory). For the root address range I use the stack's start and
> end. Since that range is quite large (512K?), I thought it a good idea
> to scan less. So scan_bc tries to figure out an 'end' address that is
> smaller than the whole stack's end. That's all it is about: how to find
> a realistic scan end.

Or, wait a moment... I'm wondering about what happens when the client
continues to execute byte code while we are scanning... The root is not
like objects in MPS memory, to which we get granted exclusive access. Or
so I think ATM. But roots can have a barrier applied to them, I think:
MPS_RM_PROT. I have to read a bit.

(BTW, funny, I think: Emacs/master froze while I had Emacs/igc in LLDB,
and Emacs/igc was still functioning well :-))
* Re: MPS: Bytecode stack
From: Mattias Engdegård @ 2024-05-06 13:54 UTC
To: Gerd Möllmann; Cc: Eli Zaretskii, Helmut Eller, Emacs Devel

On 6 May 2024 at 15:35, Gerd Möllmann <gerd.moellmann@gmail.com> wrote:

> Or, wait a moment... I'm wondering about what happens when the client
> continues to execute byte code while we are scanning... The root is not
> like objects in MPS memory, to which we get granted exclusive access. Or
> so I think ATM. But roots can have a barrier applied to them, I think:
> MPS_RM_PROT. I have to read a bit.

Yes, that's basically what I was wondering, and the reason for the
synthetic little example.

Essentially, what context is the scanning code running in? The mutator
thread? If so, as an async signal handler, or in a more controlled
context? If in a separate thread, does the mutator keep running? How did
we get hold of the mutator's stack and registers? And so on.
* Re: MPS: Bytecode stack
From: Gerd Möllmann @ 2024-05-06 14:02 UTC
To: Mattias Engdegård; Cc: Eli Zaretskii, Helmut Eller, Emacs Devel

Mattias Engdegård <mattias.engdegard@gmail.com> writes:

> Yes, that's basically what I was wondering, and the reason for the
> synthetic little example.
>
> Essentially, what context is the scanning code running in? The mutator
> thread? If so, as an async signal handler, or in a more controlled
> context? If in a separate thread, does the mutator keep running? How did
> we get hold of the mutator's stack and registers? And so on.

Yes, we're in the GC thread when scanning. The docs say about MPS_RM_PROT:

    MPS_RM_PROT
        The root mode for protectable roots. This tells the MPS that it
        may place a barrier (1) on any page containing any part of the
        root. No format method or scan method (except for the one for
        this root) may write data in this root. They may read it.

        Note
            You must not specify MPS_RM_PROT on a root allocated by the
            MPS.

            No page may contain parts of two or more protectable roots.

            You mustn't specify MPS_RM_PROT if the client program or
            anything other than (this instance of) the MPS is going to
            protect or unprotect the relevant pages.

            This mode may not be suitable if the client program wants
            the operating system to be able to access the root. Many
            operating systems can't cope with writing to protected
            pages.

which is what I wanted to use for the loaded pdump, but have never
tried. Sounds reasonable to me.

Hm, I guess for now it's best to forget about optimization, and just
scan the whole stack until we have something that works, and then come
back later. The three rules of optimization again... don't, don't do it
yet, ...
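As an illustration of the "just scan the whole stack" approach, a
simplified scan_bc might look like the sketch below, i.e. the earlier
version quoted above with the end-of-range estimate removed. This is a
sketch only, not necessarily the exact code that was pushed.

static mps_res_t
scan_bc (mps_ss_t ss, void *start, void *end, void *closure)
{
  MPS_SCAN_BEGIN (ss)
  {
    struct igc_thread_list *t = closure;
    struct bc_thread_state *bc = &t->d.ts->bc;
    igc_assert (start == (void *) bc->stack);
    igc_assert (end == (void *) bc->stack_end);
    /* No attempt to shrink the range: scan the whole registered
       root, from the stack's start to its very end, treating every
       word as a potential (ambiguous) reference.  */
    if (end > start)
      IGC_FIX_CALL (ss, scan_ambig (ss, start, end, NULL));
  }
  MPS_SCAN_END (ss);
  return MPS_RES_OK;
}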
* Re: MPS: Bytecode stack
From: Gerd Möllmann @ 2024-05-06 14:14 UTC
To: Mattias Engdegård; Cc: Eli Zaretskii, Helmut Eller, Emacs Devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Hm, I guess for now it's best to forget about optimization, and just
> scan the whole stack until we have something that works, and then come
> back later. The three rules of optimization again... don't, don't do it
> yet, ...

Now pushed. No optimizations anymore. Thanks, Mattias, that helped me!
* Re: MPS: Bytecode stack
From: Mattias Engdegård @ 2024-05-06 15:59 UTC
To: Gerd Möllmann; Cc: Eli Zaretskii, Helmut Eller, Emacs Devel

On 6 May 2024 at 16:02, Gerd Möllmann <gerd.moellmann@gmail.com> wrote:

> Hm, I guess for now it's best to forget about optimization, and just
> scan the whole stack until we have something that works, and then come
> back later.

Agreed. The stack is huge, though, so we definitely want to do better,
and we shouldn't forget about the ambition to scan it precisely; I
haven't given up on that patch. It even helps performance with the
standard GC, and would help with debug info.

And we should understand where the possible races are, partly for a
performant solution but also because similar problems could be lurking
elsewhere.