* MPS: Bytecode stack
From: Gerd Möllmann @ 2024-05-06 9:58 UTC
To: Eli Zaretskii, Helmut Eller; Cc: Emacs Devel

I think I fixed, or rather worked around, something important in scan_bc
today which could lead to objects being "lost". Might be interesting to
see if some errors magically disappear. A real fix has yet to be found,
but see the FIXME comment in the commit.
* Re: MPS: Bytecode stack
From: Mattias Engdegård @ 2024-05-06 12:12 UTC
To: Gerd Möllmann; Cc: Eli Zaretskii, Helmut Eller, Emacs Devel

On 6 May 2024 at 11:58, Gerd Möllmann <gerd.moellmann@gmail.com> wrote:

> I think I fixed, or rather worked around, something important in scan_bc
> today which could lead to objects being "lost". Might be interesting to
> see if some errors magically disappear.

I'm not sure what you are trying to fix. Dodging a data race?

To see whether I have understood what your problem is, and how MPS works,
consider the thread state

    Lisp_Object *a;
    int top;

where `a` is a heap-allocated array of some size, but only indices
[0..top] are actually used, so the GC doesn't need to look at the rest.
The mutator thread does:

    1   x = a[top];
    2   top = top - 1;

What does the GC thread see? Even if it reads a (signal-atomic) snapshot
of the mutator's registers and stack, does this atomicity extend to
`top`, which isn't even a Lisp_Object? Is the mutator suspended during
this scan?

Furthermore, the compiler is allowed to hoist the store in line 2 above
the load in line 1. If that's the only problem, it may just be a matter
of adding a compiler barrier (acquire).
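A minimal, self-contained C sketch of the reordering hazard described
above; the `bc_state`/`bc_pop` names are invented for illustration and
are not taken from the branch, and the fence only constrains the
compiler, not the cross-thread visibility questions raised above.

#include <stdatomic.h>

typedef void *Lisp_Object;   /* stand-in for the real Emacs type */

struct bc_state
{
  Lisp_Object *a;   /* heap-allocated value stack */
  int top;          /* only indices [0..top] are live */
};

static Lisp_Object
bc_pop (struct bc_state *bc)
{
  Lisp_Object x = bc->a[bc->top];

  /* Without a barrier, the compiler may perform the store to `top`
     before the load above; an asynchronous scanner could then see
     `top` already decremented while the popped value lives only in a
     register it may not have snapshotted yet.  atomic_signal_fence
     acts as a compiler barrier and forbids that reordering.  */
  atomic_signal_fence (memory_order_seq_cst);

  bc->top = bc->top - 1;
  return x;
}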
* Re: MPS: Bytecode stack
From: Gerd Möllmann @ 2024-05-06 12:47 UTC
To: Mattias Engdegård; Cc: Eli Zaretskii, Helmut Eller, Emacs Devel

Mattias Engdegård <mattias.engdegard@gmail.com> writes:

> I'm not sure what you are trying to fix. Dodging a data race?
>
> [...]

It would be easiest if you'd look at scan_bc in igc.c on the branch.
That explains it better than prose:

static mps_res_t
scan_bc (mps_ss_t ss, void *start, void *end, void *closure)
{
  MPS_SCAN_BEGIN (ss)
  {
    struct igc_thread_list *t = closure;
    struct bc_thread_state *bc = &t->d.ts->bc;
    igc_assert (start == (void *) bc->stack);
    igc_assert (end == (void *) bc->stack_end);
    /* FIXME: AFAIU the current top frame starts at bc->fp->next_stack
       and has a maximum length that is given by the bytecode being
       executed (COMPILED_STACK_DEPTH).  So, we need to scan up to
       bc->fp->next_stack + that max depth to be safe.  Since I don't
       have that number ATM, I'm using an arbitrary estimate for now.
       This must be changed to something better.  Note that Mattias
       said the bc stack marking will be changed in the future.  */
    const size_t HORRIBLE_ESTIMATE = 1024;
    char *scan_end = bc_next_frame (bc->fp);
    scan_end += HORRIBLE_ESTIMATE;
    end = min (end, (void *) scan_end);
    if (end > start)
      IGC_FIX_CALL (ss, scan_ambig (ss, start, end, NULL));
  }
  MPS_SCAN_END (ss);
  return MPS_RES_OK;
}
* Re: MPS: Bytecode stack
From: Mattias Engdegård @ 2024-05-06 13:07 UTC
To: Gerd Möllmann; Cc: Eli Zaretskii, Helmut Eller, Emacs Devel

On 6 May 2024 at 14:47, Gerd Möllmann <gerd.moellmann@gmail.com> wrote:

> It would be easiest if you'd look at scan_bc in igc.c on the branch.

I did, which is why I wondered whether I had understood the problem
correctly. Maybe we can help each other:

> /* FIXME: AFAIU the current top frame starts at bc->fp->next_stack

Does it? Look at the ASCII art in bytecode.c. During bytecode execution,
fp->next_stack is the lowest completely unused entry of the bytecode
stack, but that refers to the engine register `fp`, which is almost but
not entirely in sync with `bc->fp`. There are gaps when there is live
data on the stack beyond bc->fp->next_stack, such as during frame setup
in bytecode.c:

    Lisp_Object *frame_base = bc->fp->next_stack;
    struct bc_frame *fp = (struct bc_frame *)(frame_base + max_stack);

    if ((char *)fp->next_stack > bc->stack_end)
      error ("Bytecode stack overflow");

    /* Save the function object so that the bytecode and vector are
       held from removal by the GC. */
    fp->fun = fun;
    /* Save previous stack pointer and pc in the new frame. If we came
       directly from outside, these will be NULL. */
    fp->saved_top = top;
    fp->saved_pc = pc;
    fp->saved_fp = bc->fp;
    bc->fp = fp;

which is why I was concerned about whether there might be a race
somewhere. Note, however, that in this particular window we don't stash
anything onto the stack beyond bc->fp->next_stack that isn't already
accessible via ambig refs elsewhere (`fun` in particular).

There may be similar gaps when we pop stack frames: on return from a
function, and in the catch and condition-case handling. But it would be
easier if I knew what I was looking for.
* Re: MPS: Bytecode stack
From: Gerd Möllmann @ 2024-05-06 13:23 UTC
To: Mattias Engdegård; Cc: Eli Zaretskii, Helmut Eller, Emacs Devel

Mattias Engdegård <mattias.engdegard@gmail.com> writes:

> Does it? Look at the ASCII art in bytecode.c. During bytecode execution,
> fp->next_stack is the lowest completely unused entry of the bytecode
> stack, but that refers to the engine register `fp`, which is almost but
> not entirely in sync with `bc->fp`. There are gaps when there is live
> data on the stack beyond bc->fp->next_stack, such as during frame setup
> in bytecode.c:
>
> [...]
>
> There may be similar gaps when we pop stack frames: on return from a
> function, and in the catch and condition-case handling. But it would be
> easier if I knew what I was looking for.

Ok, let me try to explain. Thanks for looking into this!

I made the bc stack(s) ambig roots (conservatively scanned, like with
mark_memory). For the root address range I use the stack's start and
end. Since that range is quite large (512K?), I thought it a good idea
to scan less. So scan_bc tries to figure out an 'end' address that is
smaller than the whole stack's end. That's all it is about: how to find
a realistic scan end.
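For context, a rough sketch of what registering the byte-code stack as
an ambiguous MPS root scanned by scan_bc could look like, using the MPS
area-root API. This is an illustration, not necessarily what igc.c
actually does; the root_bc field and the helper's name are assumptions,
and error handling is reduced to an abort.

/* Assumes the igc.c context: mps.h included, `arena` and the
   igc_thread_list `t` already set up.  */
static void
register_bc_root (mps_arena_t arena, struct igc_thread_list *t)
{
  struct bc_thread_state *bc = &t->d.ts->bc;
  mps_root_t root;
  mps_res_t res
    = mps_root_create_area (&root, arena,
			    mps_rank_ambig (),	/* conservative scan */
			    0,			/* no MPS_RM_PROT for now */
			    (void *) bc->stack,
			    (void *) bc->stack_end,
			    scan_bc,		/* custom area scanner */
			    t);			/* closure passed to scan_bc */
  if (res != MPS_RES_OK)
    emacs_abort ();
  t->d.root_bc = root;	/* hypothetical field remembering the root */
}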
* Re: MPS: Bytecode stack
From: Gerd Möllmann @ 2024-05-06 13:35 UTC
To: Mattias Engdegård; Cc: Eli Zaretskii, Helmut Eller, Emacs Devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> [...]
>
> I made the bc stack(s) ambig roots (conservatively scanned, like with
> mark_memory). For the root address range I use the stack's start and
> end. Since that range is quite large (512K?), I thought it a good idea
> to scan less. So scan_bc tries to figure out an 'end' address that is
> smaller than the whole stack's end. That's all it is about: how to find
> a realistic scan end.

Or, wait a moment... I'm wondering about what happens when the client
continues to execute byte code while we are scanning... The root is not
like objects in MPS memory, to which we get granted exclusive access. Or
so I think ATM. But roots can have a barrier applied to them, I think:
MPS_RM_PROT. I have to read a bit.

(BTW, funny, I think: Emacs/master froze while I had Emacs/igc in LLDB,
and Emacs/igc was still functioning well :-))
* Re: MPS: Bytecode stack
From: Mattias Engdegård @ 2024-05-06 13:54 UTC
To: Gerd Möllmann; Cc: Eli Zaretskii, Helmut Eller, Emacs Devel

On 6 May 2024 at 15:35, Gerd Möllmann <gerd.moellmann@gmail.com> wrote:

> Or, wait a moment... I'm wondering about what happens when the client
> continues to execute byte code while we are scanning... The root is not
> like objects in MPS memory, to which we get granted exclusive access. Or
> so I think ATM. But roots can have a barrier applied to them, I think:
> MPS_RM_PROT. I have to read a bit.

Yes, that's basically what I was wondering, and the reason for the
synthetic little example.

Essentially, what context is the scanning code running in? The mutator
thread? If so, as an async signal handler, or in a more controlled
context? If in a separate thread, does the mutator keep running? How did
we get hold of the mutator's stack and registers? And so on.
* Re: MPS: Bytecode stack
From: Gerd Möllmann @ 2024-05-06 14:02 UTC
To: Mattias Engdegård; Cc: Eli Zaretskii, Helmut Eller, Emacs Devel

Mattias Engdegård <mattias.engdegard@gmail.com> writes:

> Yes, that's basically what I was wondering, and the reason for the
> synthetic little example.
>
> Essentially, what context is the scanning code running in? The mutator
> thread? If so, as an async signal handler, or in a more controlled
> context? If in a separate thread, does the mutator keep running? How did
> we get hold of the mutator's stack and registers? And so on.

Yes, we're in the GC thread when scanning. The docs say about MPS_RM_PROT:

    MPS_RM_PROT
        The root mode for protectable roots. This tells the MPS that it
        may place a barrier (1) on any page containing any part of the
        root. No format method or scan method (except for the one for
        this root) may write data in this root. They may read it.

        Note
            You must not specify MPS_RM_PROT on a root allocated by the
            MPS.

            No page may contain parts of two or more protectable roots.

            You mustn't specify MPS_RM_PROT if the client program or
            anything other than (this instance of) the MPS is going to
            protect or unprotect the relevant pages.

            This mode may not be suitable if the client program wants
            the operating system to be able to access the root. Many
            operating systems can't cope with writing to protected
            pages.

which is what I wanted to use for the loaded pdump, but have never
tried. Sounds reasonable to me.

Hm, I guess for now it's best to forget about optimization, and just
scan the whole stack until we have something that works, and then come
back later. The three rules of optimization again... don't, don't do it
yet, ...
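As an illustration of the "just scan the whole stack" approach, a
simplified scan_bc might look like the sketch below, i.e. the earlier
version quoted above with the end-of-range estimate removed. This is a
sketch only, not necessarily the exact code that was pushed.

static mps_res_t
scan_bc (mps_ss_t ss, void *start, void *end, void *closure)
{
  MPS_SCAN_BEGIN (ss)
  {
    struct igc_thread_list *t = closure;
    struct bc_thread_state *bc = &t->d.ts->bc;
    igc_assert (start == (void *) bc->stack);
    igc_assert (end == (void *) bc->stack_end);
    /* No attempt to shrink the range: scan the whole registered
       root, from the stack's start to its very end, treating every
       word as a potential (ambiguous) reference.  */
    if (end > start)
      IGC_FIX_CALL (ss, scan_ambig (ss, start, end, NULL));
  }
  MPS_SCAN_END (ss);
  return MPS_RES_OK;
}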
* Re: MPS: Bytecode stack
From: Gerd Möllmann @ 2024-05-06 14:14 UTC
To: Mattias Engdegård; Cc: Eli Zaretskii, Helmut Eller, Emacs Devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Hm, I guess for now it's best to forget about optimization, and just
> scan the whole stack until we have something that works, and then come
> back later. The three rules of optimization again... don't, don't do it
> yet, ...

Now pushed. No optimizations anymore. Thanks, Mattias, that helped me!
* Re: MPS: Bytecode stack
From: Mattias Engdegård @ 2024-05-06 15:59 UTC
To: Gerd Möllmann; Cc: Eli Zaretskii, Helmut Eller, Emacs Devel

On 6 May 2024 at 16:02, Gerd Möllmann <gerd.moellmann@gmail.com> wrote:

> Hm, I guess for now it's best to forget about optimization, and just
> scan the whole stack until we have something that works, and then come
> back later.

Agreed. The stack is huge, though, so we definitely want to do better,
and we shouldn't forget about the ambition to scan it precisely; I
haven't given up on that patch. It even helps performance with the
standard GC, and would help with debug info.

And we should understand where the possible races are, partly for a
performant solution but also because similar problems could be lurking
elsewhere.