* Re: GC and stack marking
@ 2014-05-21 19:31 Barry OReilly
2014-05-21 20:13 ` Eli Zaretskii
0 siblings, 1 reply; 40+ messages in thread
From: Barry OReilly @ 2014-05-21 19:31 UTC (permalink / raw)
To: emacs-devel
> It might simply be a slot that's unused by the current stack frame,
> whose value comes from some stack frame that existed some time in
> the past.
So should the relevant C code try to initialize variables with
non-garbage values? I took a look at Fgarbage_collect and found that
stack_top_variable, for example, holds a garbage value.
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: GC and stack marking
2014-05-21 19:31 GC and stack marking Barry OReilly
@ 2014-05-21 20:13 ` Eli Zaretskii
2014-05-21 20:49 ` Barry OReilly
0 siblings, 1 reply; 40+ messages in thread
From: Eli Zaretskii @ 2014-05-21 20:13 UTC (permalink / raw)
To: Barry OReilly; +Cc: emacs-devel
> Date: Wed, 21 May 2014 15:31:49 -0400
> From: Barry OReilly <gundaetiapo@gmail.com>
>
> So should the relevant C code try to initialize variables with non
> garbage?

No. It's prohibitively expensive.
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: GC and stack marking 2014-05-21 20:13 ` Eli Zaretskii @ 2014-05-21 20:49 ` Barry OReilly 2014-05-22 2:43 ` Eli Zaretskii 0 siblings, 1 reply; 40+ messages in thread From: Barry OReilly @ 2014-05-21 20:49 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel [-- Attachment #1: Type: text/plain, Size: 300 bytes --] Even if we're only talking about the stack variables in the frames that are active during your particular problematic case (and perhaps in the idle Emacs GC case)? Have you already ruled out whether stack_top_variable contributes one of the bytes in your false positive lookup in the mem_node tree? [-- Attachment #2: Type: text/html, Size: 342 bytes --] ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: GC and stack marking 2014-05-21 20:49 ` Barry OReilly @ 2014-05-22 2:43 ` Eli Zaretskii 2014-05-22 3:12 ` Daniel Colascione 2014-05-22 14:59 ` Barry OReilly 0 siblings, 2 replies; 40+ messages in thread From: Eli Zaretskii @ 2014-05-22 2:43 UTC (permalink / raw) To: Barry OReilly; +Cc: emacs-devel > Date: Wed, 21 May 2014 16:49:22 -0400 > From: Barry OReilly <gundaetiapo@gmail.com> > Cc: emacs-devel@gnu.org > > Even if we're only talking about the stack variables in the frames that are > active during your particular problematic case (and perhaps in the idle > Emacs GC case)? I thought you were asking about having the compiler generate the code to do that, which would then happen everywhere. If you propose doing that selectively, I don't know how this would be possible, since on the C level you don't have a way of telling how much stack is allocated in a given function. > Have you already ruled out whether stack_top_variable contributes one of > the bytes in your false positive lookup in the mem_node tree? Yes. I looked at all the local variables in that stack frame, and their addresses on the stack are different from the one that triggers the problem. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: GC and stack marking 2014-05-22 2:43 ` Eli Zaretskii @ 2014-05-22 3:12 ` Daniel Colascione 2014-05-22 5:37 ` David Kastrup 2014-05-22 15:49 ` Eli Zaretskii 2014-05-22 14:59 ` Barry OReilly 1 sibling, 2 replies; 40+ messages in thread From: Daniel Colascione @ 2014-05-22 3:12 UTC (permalink / raw) To: Eli Zaretskii, Barry OReilly; +Cc: emacs-devel [-- Attachment #1: Type: text/plain, Size: 1081 bytes --] On 05/21/2014 07:43 PM, Eli Zaretskii wrote: >> Date: Wed, 21 May 2014 16:49:22 -0400 >> From: Barry OReilly <gundaetiapo@gmail.com> >> Cc: emacs-devel@gnu.org >> >> Even if we're only talking about the stack variables in the frames that are >> active during your particular problematic case (and perhaps in the idle >> Emacs GC case)? > > I thought you were asking about having the compiler generate the code > to do that, which would then happen everywhere. > > If you propose doing that selectively, I don't know how this would be > possible, since on the C level you don't have a way of telling how > much stack is allocated in a given function. > >> Have you already ruled out whether stack_top_variable contributes one of >> the bytes in your false positive lookup in the mem_node tree? > > Yes. I looked at all the local variables in that stack frame, and > their addresses on the stack are different from the one that triggers > the problem. What about cleaning the stack (memset from the top to the high water mark) every once in a while? [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 884 bytes --] ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: GC and stack marking 2014-05-22 3:12 ` Daniel Colascione @ 2014-05-22 5:37 ` David Kastrup 2014-05-22 13:57 ` Stefan Monnier 2014-05-22 15:49 ` Eli Zaretskii 1 sibling, 1 reply; 40+ messages in thread From: David Kastrup @ 2014-05-22 5:37 UTC (permalink / raw) To: emacs-devel Daniel Colascione <dancol@dancol.org> writes: > On 05/21/2014 07:43 PM, Eli Zaretskii wrote: >>> Date: Wed, 21 May 2014 16:49:22 -0400 >>> From: Barry OReilly <gundaetiapo@gmail.com> >>> Cc: emacs-devel@gnu.org >>> >>> Even if we're only talking about the stack variables in the frames that are >>> active during your particular problematic case (and perhaps in the idle >>> Emacs GC case)? >> >> I thought you were asking about having the compiler generate the code >> to do that, which would then happen everywhere. >> >> If you propose doing that selectively, I don't know how this would be >> possible, since on the C level you don't have a way of telling how >> much stack is allocated in a given function. >> >>> Have you already ruled out whether stack_top_variable contributes one of >>> the bytes in your false positive lookup in the mem_node tree? >> >> Yes. I looked at all the local variables in that stack frame, and >> their addresses on the stack are different from the one that triggers >> the problem. > > What about cleaning the stack (memset from the top to the high water > mark) every once in a while? How about explicitly triggering garbage collection at a point of time where the water mark is really low? For the few remaining variables, initializing them explicitly would then not be a high cost. -- David Kastrup ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: GC and stack marking
2014-05-22 5:37 ` David Kastrup
@ 2014-05-22 13:57 ` Stefan Monnier
0 siblings, 0 replies; 40+ messages in thread
From: Stefan Monnier @ 2014-05-22 13:57 UTC (permalink / raw)
To: David Kastrup; +Cc: emacs-devel
> How about explicitly triggering garbage collection at a point of time
> where the water mark is really low?

We already do that.

Stefan
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: GC and stack marking 2014-05-22 3:12 ` Daniel Colascione 2014-05-22 5:37 ` David Kastrup @ 2014-05-22 15:49 ` Eli Zaretskii 1 sibling, 0 replies; 40+ messages in thread From: Eli Zaretskii @ 2014-05-22 15:49 UTC (permalink / raw) To: Daniel Colascione; +Cc: gundaetiapo, emacs-devel > Date: Wed, 21 May 2014 20:12:45 -0700 > From: Daniel Colascione <dancol@dancol.org> > CC: emacs-devel@gnu.org > > What about cleaning the stack (memset from the top to the high water > mark) every once in a while? I believe this would be as tedious and expensive as clearing the stack on entry to a function. It also requires ugly OS-dependent code/assembly. Also, when would you exactly do that, except where we call GC? I think what I suggested a few minutes ago is better, and seems to solve the problem at hand. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: GC and stack marking 2014-05-22 2:43 ` Eli Zaretskii 2014-05-22 3:12 ` Daniel Colascione @ 2014-05-22 14:59 ` Barry OReilly 2014-05-22 17:03 ` Eli Zaretskii 1 sibling, 1 reply; 40+ messages in thread From: Barry OReilly @ 2014-05-22 14:59 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel [-- Attachment #1: Type: text/plain, Size: 794 bytes --] > Yes. I looked at all the local variables in that stack frame, and > their addresses on the stack are different from the one that > triggers the problem. [I assume you mean "void* values on the stack" rather than "addresses on the stack".] So when you printed the value of a one byte variable like stack_top_variable, you printed it with any alignment padding there might be? Or in case of GC_POINTER_ALIGNMENT < sizeof(void*), you accounted for mark_stack's candidate void* coming partially from different stack variables? And you accounted for the compiler reordering stack variables, eg to more optimally align data? I confirmed for example that stack_top_variable and message_p are allocated next to each other on the stack in my build, with the i variable not between them in memory. [-- Attachment #2: Type: text/html, Size: 920 bytes --] ^ permalink raw reply [flat|nested] 40+ messages in thread
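The concern about GC_POINTER_ALIGNMENT being smaller than sizeof(void*) can be illustrated with a small sketch. It is not Emacs code: candidate_at, count_hits and straddle_hits are hypothetical names, and the buffer stands in for two adjacent stack slots. It shows that a scanner stepping by less than a pointer's width can assemble a candidate value from bytes of two neighbouring variables, neither of which holds that value itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a pointer-sized candidate at byte offset OFF of BUF, the way a
   scanner with a stride smaller than sizeof (void *) would.  memcpy
   avoids undefined behaviour from a misaligned load.  */
static uintptr_t
candidate_at (const unsigned char *buf, size_t off)
{
  uintptr_t v;
  memcpy (&v, buf + off, sizeof v);
  return v;
}

/* Count the candidates in BUF[0..LEN-1] that equal TARGET when the
   scan advances by STRIDE bytes.  STRIDE must be nonzero.  */
static size_t
count_hits (const unsigned char *buf, size_t len, size_t stride,
            uintptr_t target)
{
  size_t hits = 0;
  assert (stride > 0);
  for (size_t off = 0; off + sizeof (uintptr_t) <= len; off += stride)
    if (candidate_at (buf, off) == target)
      hits++;
  return hits;
}

/* Lay out two adjacent word-sized "variables" so that neither one
   holds TARGET, but the bytes straddling their boundary do.  Return
   how many times a scan with the given STRIDE sees TARGET.  */
static size_t
straddle_hits (size_t stride)
{
  enum { W = sizeof (uintptr_t) };
  uintptr_t target = (uintptr_t) 0x12345678;
  unsigned char buf[2 * W];

  memset (buf, 0xAA, sizeof buf);
  /* The target's bytes start in the middle of the first word.  */
  memcpy (buf + W / 2, &target, W);
  return count_hits (buf, sizeof buf, stride, target);
}
```

A scan that steps by a full word never sees the value, while a scan that steps by half a word does, which is why comparing only the addresses of whole variables may not rule out a straddling candidate.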
* Re: GC and stack marking
2014-05-22 14:59 ` Barry OReilly
@ 2014-05-22 17:03 ` Eli Zaretskii
0 siblings, 0 replies; 40+ messages in thread
From: Eli Zaretskii @ 2014-05-22 17:03 UTC (permalink / raw)
To: Barry OReilly; +Cc: emacs-devel
> Date: Thu, 22 May 2014 10:59:00 -0400
> From: Barry OReilly <gundaetiapo@gmail.com>
> Cc: emacs-devel@gnu.org
>
> > Yes. I looked at all the local variables in that stack frame, and
> > their addresses on the stack are different from the one that
> > triggers the problem.
>
> [I assume you mean "void* values on the stack" rather than "addresses
> on the stack".]

No, I meant addresses on the stack. Like this:

  (gdb) info locals
  foo = 0xbaadf00d
  bar = 191919191
  baz = 0 '\000'
  (gdb) p/x &foo
  $1 = 0x12345678
  (gdb) p/x &bar
  $2 = 0x23456789
  (gdb) p/x &baz
  $3 = 0x87654321

I compared these addresses with the value the 'pp' variable had in
mark_memory, here:

  for (pp = start; (void *) pp < end; pp++)
    for (i = 0; i < sizeof *pp; i += GC_POINTER_ALIGNMENT)
      {
        void *p = *(void **) ((char *) pp + i);
        mark_maybe_pointer (p);
        if (POINTERS_MIGHT_HIDE_IN_OBJECTS)
          mark_maybe_object (XIL ((intptr_t) p));
      }

when the value of 'p' was the address of the hash-table struct that
was passed to mark_maybe_pointer.

> So when you printed the value of a one byte variable like
> stack_top_variable, you printed it with any alignment padding there
> might be?

I didn't print any values, just the addresses, see above. That's
because I already knew the address of the stack slot where the
offending value was stored, so I didn't need to look for it. That
address was the value of 'pp' above.

> And you accounted for the compiler reordering stack variables, eg to
> more optimally align data?

Yes, in a way: I looked at the disassembly of the offending function,
and reviewed every reference to a stack slot via $ebp and $esp.
Since I knew the values of $ebp and $esp of that function when
mark_stack was called, and I also knew the address of the stack slot
where the offending value was stored, it was simple to calculate the
offsets from $ebp and $esp corresponding to that stack slot. I looked
for those offsets in the disassembly, but they weren't there.

> I confirmed for example that stack_top_variable and message_p are
> allocated next to each other on the stack in my build, with the i
> variable not between them in memory.

Again, I checked all the locals in that function, and I also checked
all the references to the stack in the disassembly, thus accounting
for temporary values that have no C variables in the source. I think
this covers all the possibilities, and isn't affected by how the
compiler allocates the variables on the stack.
^ permalink raw reply [flat|nested] 40+ messages in thread
* GC and stack marking
@ 2014-05-19 16:31 Eli Zaretskii
2014-05-19 18:47 ` Paul Eggert
2014-05-20 13:44 ` Stefan Monnier
0 siblings, 2 replies; 40+ messages in thread
From: Eli Zaretskii @ 2014-05-19 16:31 UTC (permalink / raw)
To: emacs-devel; +Cc: Fabrice Popineau
I have a question regarding GC and stack marking.

This issue popped up during testing of the new code written by Fabrice
for managing Emacs memory on MS-Windows. I don't think this issue is
Windows specific, and I don't think the details of the new
implementation matter for what I'm about to ask (but if someone wants
the gory details, please holler).

The short version of the question is: is it possible that a Lisp
object which is no longer referenced by anything won't be GC'ed
because it is marked by mark_stack due to some kind of coincidence?

The specific situation where I think I see something like this is
during dumping. When temacs loads and runs loadup.el, it does this
near the beginning:

  (if (eq t purify-flag)
      (setq purify-flag (make-hash-table :test 'equal :size 70000)))

This creates a large hash-table and stores its reference in
purify-flag. Then, after loading all the preloaded packages, temacs
does this:

  ;; Avoid error if user loads some more libraries now and make sure the
  ;; hash-consing hash table is GC'd.
  (setq purify-flag nil)

  (if (null (garbage-collect))
      (setq pure-space-overflow t))

Note the comment: "...and make sure the hash-consing hash table is
GC'd.". Well, on one machine to which I have access, it isn't GC'd.
Why? because mark_stack happens to find its address somewhere on the
stack. (I have a backtrace to prove it.) So the huge hash-table gets
dumped into the emacs executable, and causes all kinds of trouble in
the dumped Emacs.

On another machine (with a different version of the OS and of GCC),
the problem doesn't happen, and the table is indeed GC'd.

My question is: is this a legitimate situation?
Since all mark_stack does is look for values recorded in the red-black
tree, it might find such a value by sheer luck (or lack thereof).
Right? Or is this a bug that needs to be researched further?

If this can legitimately happen, then how can we make sure this
hash-table indeed gets GC'd before we dump Emacs?

TIA
^ permalink raw reply [flat|nested] 40+ messages in thread
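To make the coincidence described above concrete, here is a minimal sketch of conservative stack scanning. It is not the Emacs implementation: struct block, find_block and scan_words are hypothetical names, a linear search stands in for the red-black tree of mem_node entries, and the addresses are made up. The point is only that any stack word whose value falls inside a known object marks that object, whether or not the word is a real reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one heap object known to the collector.  */
struct block
{
  uintptr_t start;   /* first address of the object */
  size_t size;       /* size in bytes */
  bool marked;       /* set when the scan believes the object is live */
};

/* Return the block containing ADDR, or NULL if there is none.  A real
   collector would use a balanced tree; a linear search keeps the
   sketch short.  */
static struct block *
find_block (struct block *blocks, size_t n, uintptr_t addr)
{
  for (size_t i = 0; i < n; i++)
    if (addr >= blocks[i].start && addr - blocks[i].start < blocks[i].size)
      return &blocks[i];
  return NULL;
}

/* Scan WORDS[0..NWORDS-1] as if it were the C stack: every word whose
   value lies inside a known block marks that block.  Returns the
   number of blocks newly marked.  */
static size_t
scan_words (const uintptr_t *words, size_t nwords,
            struct block *blocks, size_t nblocks)
{
  size_t newly_marked = 0;
  assert (words != NULL && blocks != NULL);
  for (size_t i = 0; i < nwords; i++)
    {
      struct block *b = find_block (blocks, nblocks, words[i]);
      if (b && !b->marked)
        {
          b->marked = true;
          newly_marked++;
        }
    }
  return newly_marked;
}

/* Build a fake stack with one stale word that happens to equal an
   address inside a dead object, and report whether that object (and
   only that object) survives the scan.  */
static bool
stale_word_keeps_object_alive (void)
{
  struct block blocks[] = {
    { 0x10000, 0x1000, false },   /* the "hash table": nothing refers to it */
    { 0x40000, 0x100, false },    /* an unrelated dead object */
  };
  /* Slot 2 is leftover data from a frame that has already returned.  */
  uintptr_t fake_stack[] = { 0, 42, 0x10008, 7 };

  scan_words (fake_stack, 4, blocks, 2);
  return blocks[0].marked && !blocks[1].marked;
}
```

Under these assumptions the first block stays marked purely because of the leftover word in slot 2, which matches the situation in the question: the scan cannot tell a stale value from a live reference.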
* Re: GC and stack marking 2014-05-19 16:31 Eli Zaretskii @ 2014-05-19 18:47 ` Paul Eggert 2014-05-19 19:14 ` Eli Zaretskii 2014-05-20 13:44 ` Stefan Monnier 1 sibling, 1 reply; 40+ messages in thread From: Paul Eggert @ 2014-05-19 18:47 UTC (permalink / raw) To: Eli Zaretskii, emacs-devel; +Cc: Fabrice Popineau On 05/19/2014 09:31 AM, Eli Zaretskii wrote: > is it possible that a Lisp object which is no longer referenced by anything won't be GC'ed because it is marked by mark_stack due to some kind of coincidence? Yes. Normally Emacs uses a conservative approach, which means it occasionally does not collect something that is in fact garbage. See, for example, <https://www.gnu.org/software/guile/manual/html_node/Conservative-GC.html>. > how can we make sure this hash-table indeed gets GC'd before we dump Emacs? > We could have the garbage collector treat purify-flag specially, I suppose. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: GC and stack marking 2014-05-19 18:47 ` Paul Eggert @ 2014-05-19 19:14 ` Eli Zaretskii 2014-05-19 19:58 ` Paul Eggert 0 siblings, 1 reply; 40+ messages in thread From: Eli Zaretskii @ 2014-05-19 19:14 UTC (permalink / raw) To: Paul Eggert; +Cc: fabrice.popineau, emacs-devel > Date: Mon, 19 May 2014 11:47:28 -0700 > From: Paul Eggert <eggert@cs.ucla.edu> > CC: Fabrice Popineau <fabrice.popineau@gmail.com> > > On 05/19/2014 09:31 AM, Eli Zaretskii wrote: > > is it possible that a Lisp object which is no longer referenced by anything won't be GC'ed because it is marked by mark_stack due to some kind of coincidence? > > Yes. Normally Emacs uses a conservative approach, which means it > occasionally does not collect something that is in fact garbage. See, > for example, > <https://www.gnu.org/software/guile/manual/html_node/Conservative-GC.html>. Thanks for confirming. I couldn't explain what I saw in the debugger except as such a coincidence. > > how can we make sure this hash-table indeed gets GC'd before we dump Emacs? > > > > We could have the garbage collector treat purify-flag specially, I suppose. I'm not sure I understand the suggestion. Can you elaborate? ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: GC and stack marking
2014-05-19 19:14 ` Eli Zaretskii
@ 2014-05-19 19:58 ` Paul Eggert
2014-05-19 20:03 ` Eli Zaretskii
0 siblings, 1 reply; 40+ messages in thread
From: Paul Eggert @ 2014-05-19 19:58 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: fabrice.popineau, emacs-devel
On 05/19/2014 12:14 PM, Eli Zaretskii wrote:
>> We could have the garbage collector treat purify-flag specially, I suppose.
> I'm not sure I understand the suggestion. Can you elaborate?

I was thinking of a horrible hack where the GC knows about
purify-flag, so that when you set purify-flag to nil the GC
immediately frees the object that purify-flag used to contain, and
that we make purify-flag special in this way.

I hope there's a better idea out there somewhere. Maybe we should get
rid of the hash-table purify-flag hack, for example. I'm just thinking
out loud here.
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: GC and stack marking 2014-05-19 19:58 ` Paul Eggert @ 2014-05-19 20:03 ` Eli Zaretskii 2014-05-19 20:17 ` Paul Eggert 0 siblings, 1 reply; 40+ messages in thread From: Eli Zaretskii @ 2014-05-19 20:03 UTC (permalink / raw) To: Paul Eggert; +Cc: fabrice.popineau, emacs-devel > Date: Mon, 19 May 2014 12:58:34 -0700 > From: Paul Eggert <eggert@cs.ucla.edu> > CC: emacs-devel@gnu.org, fabrice.popineau@gmail.com > > On 05/19/2014 12:14 PM, Eli Zaretskii wrote: > >> > > >> >We could have the garbage collector treat purify-flag specially, I suppose. > > I'm not sure I understand the suggestion. Can you elaborate? > > I was thinking of a horrible hack where the GC knows about purify-flag, > so that when you set purify-flag to nil the GCC immediately frees the > object that purify-flag used to contain, and that we make purify-flag > special in this way. Right, but that would only work for that single object. The problem, by contrast, sounds more general than that. > Maybe we should get rid of the hash-table purify-flag hack, for > example. Maybe, I'm not sure I fully understand its purpose to begin with (speed of finding objects?). There's a comment by Stefan there saying something about some savings. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: GC and stack marking
2014-05-19 20:03 ` Eli Zaretskii
@ 2014-05-19 20:17 ` Paul Eggert
2014-05-20 16:37 ` Eli Zaretskii
0 siblings, 1 reply; 40+ messages in thread
From: Paul Eggert @ 2014-05-19 20:17 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: fabrice.popineau, emacs-devel
On 05/19/2014 01:03 PM, Eli Zaretskii wrote:
> The problem, by contrast, sounds more general than that.

Yes, it's a general problem with conservative garbage collection; it's
why such garbage collection is called "conservative" rather than
"accurate".

If it's essential that GC be accurate, then Emacs shouldn't be using
conservative GC. My impression, though, is that the goal is to arrange
Emacs's internals so that accurate GC isn't essential. If purify-flag
is a counterexample, it's almost surely simpler to change how
purify-flag works than to insist on accurate GC.

What happens if you change this:

  (setq purify-flag nil)

to something like this?

  (clrhash purify-flag)
  (setq purify-flag nil)
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: GC and stack marking
2014-05-19 20:17 ` Paul Eggert
@ 2014-05-20 16:37 ` Eli Zaretskii
0 siblings, 0 replies; 40+ messages in thread
From: Eli Zaretskii @ 2014-05-20 16:37 UTC (permalink / raw)
To: Paul Eggert; +Cc: fabrice.popineau, emacs-devel
> Date: Mon, 19 May 2014 13:17:40 -0700
> From: Paul Eggert <eggert@cs.ucla.edu>
> CC: emacs-devel@gnu.org, fabrice.popineau@gmail.com
>
> What happens if you change this:
>
>   (setq purify-flag nil)
>
> to something like this?
>
>   (clrhash purify-flag)
>   (setq purify-flag nil)

Thanks, but it didn't help.
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: GC and stack marking
2014-05-19 16:31 Eli Zaretskii
2014-05-19 18:47 ` Paul Eggert
@ 2014-05-20 13:44 ` Stefan Monnier
2014-05-20 16:57 ` Eli Zaretskii
2014-05-31 6:31 ` Florian Weimer
1 sibling, 2 replies; 40+ messages in thread
From: Stefan Monnier @ 2014-05-20 13:44 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: Fabrice Popineau, emacs-devel
> The short version of the question is: is it possible that a Lisp
> object which is no longer referenced by anything won't be GC'ed
> because it is marked by mark_stack due to some kind of coincidence?

Yes, of course, it's what makes a conservative marking conservative.

> So the huge hash-table gets dumped into the emacs executable, and

That's bad luck, indeed.

> causes all kinds of trouble in the dumped Emacs.

But it shouldn't cause any trouble (other than extra memory use).

> If this can legitimately happen, then how can we make sure this
> hash-table indeed gets GC'd before we dump Emacs?

First we should make sure that even if this table is not GC'd, Emacs
behaves correctly. Otherwise, we probably have a bug that can appear
in other situations.

As for ensuring that the table gets GC'd, there are 2 approaches:
- provide a low-level "free-this-table" function which loadup.el could
  use. This is dangerous, since it basically says "trust me there are
  no other references to this object". Even implementing this function
  can be tricky; it would probably be easier to provide it as a
  C function only.
- find where the spurious "reference" is coming from and add code to
  set this reference to some other value (e.g. it might be some
  variable left uninitialized, or a dead variable which we could
  explicitly set back to NULL or something), or to mark this memory
  location as "not a pointer" (like GCPRO but reversed: we'd do
  a NEGATIVE_GCPRO on the var (presumably of a type like int or
  float)).

Boehm's GC has developed ways to do this second option automatically:
if during a GC, a memory cell is found to "point to" unallocated
memory, then it is assumed to be of non-pointer type and this fact is
recorded somewhere so that if in subsequent GC's this cell ends up
"pointing" to allocated memory that won't be considered as an actual
pointer. This can be very important when you get close to using the
whole address space, in which case most addresses are allocated, so
that many/most ints and floats end up spuriously pointing "somewhere".

This doesn't work for us, tho, because we don't know when a stack
location is reused for some other purpose (i.e. when it changes type),
and more importantly because we have the Lisp_Object type which is
a memory cell which can sometimes contain integers and sometimes
pointers.

OTOH, we are only conservative w.r.t stack scanning, so we're only
subject to spurious pointers coming from the stack, not from the rest
of the heap. And furthermore we have the great advantage that, as an
interactive application, our stack regularly comes back to "almost
empty" (and since we do "opportunistic GC" during idle time, we often
GC at the very moment the stack is almost empty).

Stefan
^ permalink raw reply [flat|nested] 40+ messages in thread
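The "record that this cell is not a pointer" idea described above can be sketched roughly as follows. This follows the description in the message, not the Boehm collector's actual source; struct range, is_allocated, scan_pass and second_pass_hits are hypothetical names, and the addresses are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one allocated heap object.  */
struct range
{
  uintptr_t start;
  size_t size;
};

/* Return true if ADDR lies inside one of the NHEAP allocated ranges.  */
static bool
is_allocated (const struct range *heap, size_t nheap, uintptr_t addr)
{
  for (size_t i = 0; i < nheap; i++)
    if (addr >= heap[i].start && addr - heap[i].start < heap[i].size)
      return true;
  return false;
}

/* One collection pass over CELLS.  NOT_POINTER persists between
   passes: a cell seen "pointing" at unallocated memory is flagged,
   and flagged cells are skipped from then on.  Returns how many cells
   were treated as real pointers in this pass.  */
static size_t
scan_pass (const uintptr_t *cells, bool *not_pointer, size_t ncells,
           const struct range *heap, size_t nheap)
{
  size_t treated_as_pointer = 0;
  assert (cells != NULL && not_pointer != NULL);
  for (size_t i = 0; i < ncells; i++)
    {
      if (not_pointer[i])
        continue;
      if (is_allocated (heap, nheap, cells[i]))
        treated_as_pointer++;
      else
        not_pointer[i] = true;
    }
  return treated_as_pointer;
}

/* Two passes: the integer in cell 1 misses the heap in the first
   pass, so it is ignored in the second pass even though an object now
   lives at that address.  Returns the second pass's count.  */
static size_t
second_pass_hits (void)
{
  uintptr_t cells[] = { 0x1004, 0x5000 };
  bool not_pointer[] = { false, false };
  struct range heap1[] = { { 0x1000, 0x100 } };
  struct range heap2[] = { { 0x1000, 0x100 }, { 0x5000, 0x100 } };

  scan_pass (cells, not_pointer, 2, heap1, 1);
  return scan_pass (cells, not_pointer, 2, heap2, 2);
}
```

The sketch only works because each cell keeps its meaning between passes. That is exactly the assumption the message says does not hold for stack slots or for Lisp_Object cells.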
* Re: GC and stack marking
2014-05-20 13:44 ` Stefan Monnier
@ 2014-05-20 16:57 ` Eli Zaretskii
2014-05-20 17:54 ` Stefan Monnier
2014-05-20 19:12 ` Daniel Colascione
2014-05-31 6:31 ` Florian Weimer
1 sibling, 2 replies; 40+ messages in thread
From: Eli Zaretskii @ 2014-05-20 16:57 UTC (permalink / raw)
To: Stefan Monnier; +Cc: fabrice.popineau, emacs-devel
> From: Stefan Monnier <monnier@IRO.UMontreal.CA>
> Cc: emacs-devel@gnu.org, Fabrice Popineau <fabrice.popineau@gmail.com>
> Date: Tue, 20 May 2014 09:44:05 -0400
>
> > The short version of the question is: is it possible that a Lisp
> > object which is no longer referenced by anything won't be GC'ed
> > because it is marked by mark_stack due to some kind of coincidence?
>
> Yes, of course, it's what makes a conservative marking conservative.

I have nothing against conservative, but this failure to GC is too
spectacular to ignore.

> > So the huge hash-table gets dumped into the emacs executable, and
>
> That's bad luck, indeed.
>
> > causes all kinds of trouble in the dumped Emacs.
>
> But it shouldn't cause any trouble (other than extra memory use).

It does, due to all kinds of subtleties. The result is that the
large_vectors linked list gets dumped with a pointer to a non-existent
memory, and the dumped Emacs then crashes on the first GC when it
tries to traverse that linked list.

> > If this can legitimately happen, then how can we make sure this
> > hash-table indeed gets GC'd before we dump Emacs?
>
> First we should make sure that even if this table is not GC'd, Emacs
> behaves correctly.

Fabrice might have found a work-around, so there is hope. I found a
way to kludge around it, but my solution is more fragile.

> Otherwise, we probably have a bug that can appear in
> other situations.
>
> - find where the spurious "reference" is coming from and add code to set
>   this reference to some other value

I think this is hopeless: I see this problem on a single system; two
others don't have it.
It's just some semi-random garbage somewhere on the stack.
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: GC and stack marking 2014-05-20 16:57 ` Eli Zaretskii @ 2014-05-20 17:54 ` Stefan Monnier 2014-05-20 19:28 ` Eli Zaretskii 2014-05-20 19:12 ` Daniel Colascione 1 sibling, 1 reply; 40+ messages in thread From: Stefan Monnier @ 2014-05-20 17:54 UTC (permalink / raw) To: Eli Zaretskii; +Cc: fabrice.popineau, emacs-devel >> But it shouldn't cause any trouble (other than extra memory use). > It does, due to all kinds of subtleties. The result is that the > large_vectors linked list gets dumped with a pointer to a non-existent > memory, and the dumped Emacs then crashes on the first GC when it > tries to traverse that linked list. We should fix that. > I think this is hopeless: I see this problem on a single system; two > others don't have it. It's just some semi-random garbage somehwre on > the stack. Of course, but if you can find where it comes from, we can fix that one case. After all, we don't know of any other anyway. Stefan ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: GC and stack marking 2014-05-20 17:54 ` Stefan Monnier @ 2014-05-20 19:28 ` Eli Zaretskii 2014-05-20 22:01 ` Stefan Monnier 0 siblings, 1 reply; 40+ messages in thread From: Eli Zaretskii @ 2014-05-20 19:28 UTC (permalink / raw) To: Stefan Monnier; +Cc: fabrice.popineau, emacs-devel > From: Stefan Monnier <monnier@IRO.UMontreal.CA> > Cc: emacs-devel@gnu.org, fabrice.popineau@gmail.com > Date: Tue, 20 May 2014 13:54:16 -0400 > > >> But it shouldn't cause any trouble (other than extra memory use). > > It does, due to all kinds of subtleties. The result is that the > > large_vectors linked list gets dumped with a pointer to a non-existent > > memory, and the dumped Emacs then crashes on the first GC when it > > tries to traverse that linked list. > > We should fix that. No argument here. Otherwise the dumped Emacs crashes. > > I think this is hopeless: I see this problem on a single system; two > > others don't have it. It's just some semi-random garbage somehwre on > > the stack. > > Of course, but if you can find where it comes from, we can fix that > one case. I tried, but couldn't. Suggestions for how to set up a GDB session for that are welcome. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: GC and stack marking
2014-05-20 19:28 ` Eli Zaretskii
@ 2014-05-20 22:01 ` Stefan Monnier
2014-05-21 2:48 ` Eli Zaretskii
0 siblings, 1 reply; 40+ messages in thread
From: Stefan Monnier @ 2014-05-20 22:01 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: fabrice.popineau, emacs-devel
> I tried, but couldn't. Suggestions for how to set up a GDB session
> for that are welcome.

I guess you could try the following:
- interrupt the dump just before setting purify-flag to nil.
- get the value of purify-flag.
- set a conditional break point in mark_object that catches the case
  where the argument is equal to the value you just got.
- "c"

Stefan
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: GC and stack marking 2014-05-20 22:01 ` Stefan Monnier @ 2014-05-21 2:48 ` Eli Zaretskii 2014-05-21 3:01 ` Stefan Monnier 0 siblings, 1 reply; 40+ messages in thread From: Eli Zaretskii @ 2014-05-21 2:48 UTC (permalink / raw) To: Stefan Monnier; +Cc: fabrice.popineau, emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: emacs-devel@gnu.org, fabrice.popineau@gmail.com > Date: Tue, 20 May 2014 18:01:05 -0400 > > > I tried, but couldn't. Suggestions for how to set up a GDB session > > for that are welcome. > > I guess you could try the following: > - interrupt the dump just before setting purify-flag to nil. > - get the value of purify-flag. > - set a conditional break point in mark_object that catches the case > where the argument is equal to the value you just got. > - "c" That's how I found out that it was being marked by mark_stack. But that doesn't tell you how that value _got_ on the stack, does it? ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: GC and stack marking 2014-05-21 2:48 ` Eli Zaretskii @ 2014-05-21 3:01 ` Stefan Monnier 2014-05-21 15:39 ` Eli Zaretskii 0 siblings, 1 reply; 40+ messages in thread From: Stefan Monnier @ 2014-05-21 3:01 UTC (permalink / raw) To: Eli Zaretskii; +Cc: fabrice.popineau, emacs-devel > That's how I found out that it was being marked by mark_stack. But > that doesn't tell you how that value _got_ on the stack, does it? No, but it does tell you its address in the stack, so you can then walk up the backtrace and look at the address of local variables until you (hopefully) find the one that matters. Stefan ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: GC and stack marking 2014-05-21 3:01 ` Stefan Monnier @ 2014-05-21 15:39 ` Eli Zaretskii 2014-05-21 15:57 ` Dmitry Antipov 2014-05-21 17:40 ` Stefan Monnier 0 siblings, 2 replies; 40+ messages in thread From: Eli Zaretskii @ 2014-05-21 15:39 UTC (permalink / raw) To: Stefan Monnier; +Cc: fabrice.popineau, emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: emacs-devel@gnu.org, fabrice.popineau@gmail.com > Date: Tue, 20 May 2014 23:01:24 -0400 > > > That's how I found out that it was being marked by mark_stack. But > > that doesn't tell you how that value _got_ on the stack, does it? > > No, but it does tell you its address in the stack, so you can then walk > up the backtrace and look at the address of local variables until you > (hopefully) find the one that matters. I already tried that before, and came up empty-handed. I tried again now; the address of that value on the stack does not correspond to any local variable in the corresponding stack frame, and I also cannot find that address in the disassembly of the function whose stack frame includes the value. I might try setting a watchpoint at that address, but that might be impractical; we shall see. Now, I have a question: mark_stack stops examining the stack when it gets to its own stack frame. That is certainly safe, but it sounds too conservative: it should stop at the stack frame of Fgarbage_collect, I think, because no live Lisp object can appear while Fgarbage_collect runs, right? ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: GC and stack marking
2014-05-21 15:39 ` Eli Zaretskii
@ 2014-05-21 15:57 ` Dmitry Antipov
2014-05-21 16:06 ` Dmitry Antipov
2014-05-21 16:53 ` Eli Zaretskii
2014-05-21 17:40 ` Stefan Monnier
1 sibling, 2 replies; 40+ messages in thread
From: Dmitry Antipov @ 2014-05-21 15:57 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: fabrice.popineau, Stefan Monnier, emacs-devel
On 05/21/2014 07:39 PM, Eli Zaretskii wrote:
> Now, I have a question: mark_stack stops examining the stack when it
> gets to its own stack frame. That is certainly safe, but it sounds
> too conservative: it should stop at the stack frame of
> Fgarbage_collect, I think, because no live Lisp object can appear
> while Fgarbage_collect runs, right?

1) Yes, but you need ABI- and machine-specific tricks to find the
   stack frame boundaries. I.e. while in mark_stack, there is no easy
   way to find start and end of Fgarbage_collect's stack frame.

2) But see GCC's __builtin_frame_address,
   https://gcc.gnu.org/onlinedocs/gcc/Return-Address.html.

3) But even if 2) works on all platforms we have to support, I don't
   see a reason to complicate GC just to avoid scanning a few tens of
   bytes of an extra stack frame.

Dmitry
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: GC and stack marking
2014-05-21 15:57 ` Dmitry Antipov
@ 2014-05-21 16:06 ` Dmitry Antipov
2014-05-21 16:55 ` Eli Zaretskii
2014-05-21 16:53 ` Eli Zaretskii
1 sibling, 1 reply; 40+ messages in thread
From: Dmitry Antipov @ 2014-05-21 16:06 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: fabrice.popineau, Stefan Monnier, emacs-devel

On 05/21/2014 07:57 PM, Dmitry Antipov wrote:

> 1) Yes, but you need ABI- and machine-specific tricks to find the stack
> frame boundaries. I.e., while in mark_stack, there is no easy way to find
> the start and end of Fgarbage_collect's stack frame.
>
> 2) But see GCC's __builtin_frame_address,
> https://gcc.gnu.org/onlinedocs/gcc/Return-Address.html.
>
> 3) But even if 2) works on all platforms we have to support, I don't see
> a reason to complicate GC just to avoid scanning a few tens of bytes of
> an extra stack frame.

4) mark_stack calls __builtin_unwind_init to save registers onto the
stack. So if you stop at Fgarbage_collect's stack frame, you don't scan
the frame of mark_stack either, and may lose Lisp_Objects accessible
only from registers.

Dmitry
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: GC and stack marking
2014-05-21 16:06 ` Dmitry Antipov
@ 2014-05-21 16:55 ` Eli Zaretskii
0 siblings, 0 replies; 40+ messages in thread
From: Eli Zaretskii @ 2014-05-21 16:55 UTC (permalink / raw)
To: Dmitry Antipov; +Cc: fabrice.popineau, monnier, emacs-devel

> Date: Wed, 21 May 2014 20:06:55 +0400
> From: Dmitry Antipov <dmantipov@yandex.ru>
> CC: fabrice.popineau@gmail.com, Stefan Monnier <monnier@iro.umontreal.ca>,
> emacs-devel@gnu.org
>
> 4) mark_stack calls __builtin_unwind_init to save registers onto the
> stack. So if you stop at Fgarbage_collect's stack frame, you don't scan
> the frame of mark_stack either, and may lose Lisp_Objects accessible
> only from registers.

We could call __builtin_unwind_init in Fgarbage_collect, no?
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: GC and stack marking
2014-05-21 15:57 ` Dmitry Antipov
2014-05-21 16:06 ` Dmitry Antipov
@ 2014-05-21 16:53 ` Eli Zaretskii
1 sibling, 0 replies; 40+ messages in thread
From: Eli Zaretskii @ 2014-05-21 16:53 UTC (permalink / raw)
To: Dmitry Antipov; +Cc: fabrice.popineau, monnier, emacs-devel

> Date: Wed, 21 May 2014 19:57:42 +0400
> From: Dmitry Antipov <dmantipov@yandex.ru>
> CC: Stefan Monnier <monnier@iro.umontreal.ca>,
> fabrice.popineau@gmail.com, emacs-devel@gnu.org
>
> On 05/21/2014 07:39 PM, Eli Zaretskii wrote:
>
> > Now, I have a question: mark_stack stops examining the stack when it
> > gets to its own stack frame. That is certainly safe, but it sounds
> > too conservative: it should stop at the stack frame of
> > Fgarbage_collect, I think, because no live Lisp object can appear
> > while Fgarbage_collect runs, right?
>
> 1) Yes, but you need ABI- and machine-specific tricks to find the stack
> frame boundaries. I.e., while in mark_stack, there is no easy way to find
> the start and end of Fgarbage_collect's stack frame.

I thought of passing that to mark_stack as an argument when
Fgarbage_collect calls it. That should work as well as what we do in
mark_stack to find its own stack frame, no?

> 3) But even if 2) works on all platforms we have to support, I don't see
> a reason to complicate GC just to avoid scanning a few tens of bytes of
> an extra stack frame.

The issue discussed in this thread _is_ that reason: we are dumping
Emacs with a dead object, for no good reason, and that object is quite
large (around 1MB).
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: GC and stack marking
2014-05-21 15:39 ` Eli Zaretskii
2014-05-21 15:57 ` Dmitry Antipov
@ 2014-05-21 17:40 ` Stefan Monnier
2014-05-21 17:58 ` Eli Zaretskii
1 sibling, 1 reply; 40+ messages in thread
From: Stefan Monnier @ 2014-05-21 17:40 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: fabrice.popineau, emacs-devel

> I already tried that before, and came up empty-handed. I tried again
> now; the address of that value on the stack does not correspond to any
> local variable in the corresponding stack frame, and I also cannot
> find that address in the disassembly of the function whose stack frame
> includes the value.

It might simply be a slot that's unused by the current stack frame,
whose value comes from some stack frame that existed some time in
the past.

Which stack frame is that? Is it high up or very deep (both of which we
could hope to solve by using tighter bounds on the start and end
addresses of the stack scan), or neither?

> Now, I have a question: mark_stack stops examining the stack when it
> gets to its own stack frame. That is certainly safe, but it sounds
> too conservative: it should stop at the stack frame of
> Fgarbage_collect, I think, because no live Lisp object can appear
> while Fgarbage_collect runs, right?

Sounds right, yes.

Stefan
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: GC and stack marking
2014-05-21 17:40 ` Stefan Monnier
@ 2014-05-21 17:58 ` Eli Zaretskii
2014-05-22 15:20 ` Eli Zaretskii
0 siblings, 1 reply; 40+ messages in thread
From: Eli Zaretskii @ 2014-05-21 17:58 UTC (permalink / raw)
To: Stefan Monnier; +Cc: fabrice.popineau, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: emacs-devel@gnu.org, fabrice.popineau@gmail.com
> Date: Wed, 21 May 2014 13:40:21 -0400
>
> > I already tried that before, and came up empty-handed. I tried again
> > now; the address of that value on the stack does not correspond to any
> > local variable in the corresponding stack frame, and I also cannot
> > find that address in the disassembly of the function whose stack frame
> > includes the value.
>
> It might simply be a slot that's unused by the current stack frame,
> whose value comes from some stack frame that existed some time in
> the past.

That's probably what it is, yes.

> Which stack frame is that?

The one of Fgarbage_collect. That's why I asked about mark_stack
looking for objects too high on the stack.

> > Now, I have a question: mark_stack stops examining the stack when it
> > gets to its own stack frame. That is certainly safe, but it sounds
> > too conservative: it should stop at the stack frame of
> > Fgarbage_collect, I think, because no live Lisp object can appear
> > while Fgarbage_collect runs, right?
>
> Sounds right, yes.

I will try that and see if that helps. Of course, if my reading of
GDB data is correct, and the value was indeed in the
Fgarbage_collect's stack frame, it must help.
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: GC and stack marking
2014-05-21 17:58 ` Eli Zaretskii
@ 2014-05-22 15:20 ` Eli Zaretskii
2014-05-22 16:14 ` Stefan Monnier
0 siblings, 1 reply; 40+ messages in thread
From: Eli Zaretskii @ 2014-05-22 15:20 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: fabrice.popineau, monnier, emacs-devel

> Date: Wed, 21 May 2014 20:58:19 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: fabrice.popineau@gmail.com, emacs-devel@gnu.org
>
> > From: Stefan Monnier <monnier@iro.umontreal.ca>
> > Cc: emacs-devel@gnu.org, fabrice.popineau@gmail.com
> > Date: Wed, 21 May 2014 13:40:21 -0400
> >
> > > I already tried that before, and came up empty-handed. I tried again
> > > now; the address of that value on the stack does not correspond to any
> > > local variable in the corresponding stack frame, and I also cannot
> > > find that address in the disassembly of the function whose stack frame
> > > includes the value.
> >
> > It might simply be a slot that's unused by the current stack frame,
> > whose value comes from some stack frame that existed some time in
> > the past.
>
> That's probably what it is, yes.

That's definitely what it is. The value gets onto the stack when
loadup.el does this:

  (when (hash-table-p purify-flag)
    (let ((strings 0)
          (vectors 0)
          (bytecodes 0)
          (conses 0)
          (others 0))
      (maphash (lambda (k v)
                 (cond
                  ((stringp k) (setq strings (1+ strings)))
                  ((vectorp k) (setq vectors (1+ vectors)))
                  ((consp k) (setq conses (1+ conses)))
                  ((byte-code-function-p v) (setq bytecodes (1+ bytecodes)))
                  (t (setq others (1+ others)))))
               purify-flag)
      (message "Pure-hashed: %d strings, %d vectors, %d conses, %d bytecodes, %d others"
               strings vectors conses bytecodes others)))

The call to hash-table-p pushes the table address on the stack before
calling Fhash_table_p, and it remains there until the call to
mark_stack.

> > > Now, I have a question: mark_stack stops examining the stack when it
> > > gets to its own stack frame. That is certainly safe, but it sounds
> > > too conservative: it should stop at the stack frame of
> > > Fgarbage_collect, I think, because no live Lisp object can appear
> > > while Fgarbage_collect runs, right?
> >
> > Sounds right, yes.
>
> I will try that and see if that helps. Of course, if my reading of
> GDB data is correct, and the value was indeed in the
> Fgarbage_collect's stack frame, it must help.

It did help, at least in an unoptimized build.

The suggested patch is below. It just reshuffles the existing code:
we now determine the limit for searching the stack in
Fgarbage_collect, and then call a subroutine that does what
Fgarbage_collect should actually do. This way, none of the variables
local to Fgarbage_collect or its stack will be searched by mark_stack.

Is the patch below OK for the trunk? Does anyone see anything
problematic with it?

--- src/alloc.c~ 2014-05-21 18:04:29 +0300
+++ src/alloc.c 2014-05-22 18:18:32 +0300
@@ -4880,61 +4880,8 @@ dump_zombies (void)
    from the stack start.  */

 static void
-mark_stack (void)
+mark_stack (void *end)
 {
-  void *end;
-
-#ifdef HAVE___BUILTIN_UNWIND_INIT
-  /* Force callee-saved registers and register windows onto the stack.
-     This is the preferred method if available, obviating the need for
-     machine dependent methods.  */
-  __builtin_unwind_init ();
-  end = &end;
-#else /* not HAVE___BUILTIN_UNWIND_INIT */
-#ifndef GC_SAVE_REGISTERS_ON_STACK
-  /* jmp_buf may not be aligned enough on darwin-ppc64 */
-  union aligned_jmpbuf {
-    Lisp_Object o;
-    sys_jmp_buf j;
-  } j;
-  volatile bool stack_grows_down_p = (char *) &j > (char *) stack_base;
-#endif
-  /* This trick flushes the register windows so that all the state of
-     the process is contained in the stack.  */
-  /* Fixme: Code in the Boehm GC suggests flushing (with `flushrs') is
-     needed on ia64 too.  See mach_dep.c, where it also says inline
-     assembler doesn't work with relevant proprietary compilers.  */
-#ifdef __sparc__
-#if defined (__sparc64__) && defined (__FreeBSD__)
-  /* FreeBSD does not have a ta 3 handler.  */
-  asm ("flushw");
-#else
-  asm ("ta 3");
-#endif
-#endif
-
-  /* Save registers that we need to see on the stack.  We need to see
-     registers used to hold register variables and registers used to
-     pass parameters.  */
-#ifdef GC_SAVE_REGISTERS_ON_STACK
-  GC_SAVE_REGISTERS_ON_STACK (end);
-#else /* not GC_SAVE_REGISTERS_ON_STACK */
-
-#ifndef GC_SETJMP_WORKS  /* If it hasn't been checked yet that
-                            setjmp will definitely work, test it
-                            and print a message with the result
-                            of the test.  */
-  if (!setjmp_tested_p)
-    {
-      setjmp_tested_p = 1;
-      test_setjmp ();
-    }
-#endif /* GC_SETJMP_WORKS */
-
-  sys_setjmp (j.j);
-  end = stack_grows_down_p ? (char *) &j + sizeof j : (char *) &j;
-#endif /* not GC_SAVE_REGISTERS_ON_STACK */
-#endif /* not HAVE___BUILTIN_UNWIND_INIT */

   /* This assumes that the stack is a contiguous region in memory.  If
      that's not the case, something has to be done here to iterate
@@ -5542,22 +5489,14 @@ mark_pinned_symbols (void)
     }
 }

-DEFUN ("garbage-collect", Fgarbage_collect, Sgarbage_collect, 0, 0, "",
-       doc: /* Reclaim storage for Lisp objects no longer needed.
-Garbage collection happens automatically if you cons more than
-`gc-cons-threshold' bytes of Lisp data since previous garbage collection.
-`garbage-collect' normally returns a list with info on amount of space in use,
-where each entry has the form (NAME SIZE USED FREE), where:
-- NAME is a symbol describing the kind of objects this entry represents,
-- SIZE is the number of bytes used by each one,
-- USED is the number of those objects that were found live in the heap,
-- FREE is the number of those objects that are not live but that Emacs
-  keeps around for future allocations (maybe because it does not know how
-  to return them to the OS).
-However, if there was overflow in pure space, `garbage-collect'
-returns nil, because real GC can't be done.
-See Info node `(elisp)Garbage Collection'.  */)
-  (void)
+/* Subroutine of Fgarbage_collect that does most of the work.  It is a
+   separate function so that we could limit mark_stack in searching
+   the stack frames below this function, thus avoiding the rare cases
+   where mark_stack finds values that look like live Lisp objects on
+   portions of stack that couldn't possibly contain such live
+   objects.  */
+static Lisp_Object
+garbage_collect_1 (void *end)
 {
   struct buffer *nextb;
   char stack_top_variable;
@@ -5655,7 +5594,7 @@ See Info node `(elisp)Garbage Collection

 #if (GC_MARK_STACK == GC_MAKE_GCPROS_NOOPS \
      || GC_MARK_STACK == GC_MARK_STACK_CHECK_GCPROS)
-  mark_stack ();
+  mark_stack (end);
 #else
   {
     register struct gcpro *tail;
@@ -5678,7 +5617,7 @@ See Info node `(elisp)Garbage Collection
 #endif

 #if GC_MARK_STACK == GC_USE_GCPROS_CHECK_ZOMBIES
-  mark_stack ();
+  mark_stack (end);
 #endif

   /* Everything is now marked, except for the data in font caches
@@ -5838,6 +5777,82 @@ See Info node `(elisp)Garbage Collection
   return retval;
 }

+DEFUN ("garbage-collect", Fgarbage_collect, Sgarbage_collect, 0, 0, "",
+       doc: /* Reclaim storage for Lisp objects no longer needed.
+Garbage collection happens automatically if you cons more than
+`gc-cons-threshold' bytes of Lisp data since previous garbage collection.
+`garbage-collect' normally returns a list with info on amount of space in use,
+where each entry has the form (NAME SIZE USED FREE), where:
+- NAME is a symbol describing the kind of objects this entry represents,
+- SIZE is the number of bytes used by each one,
+- USED is the number of those objects that were found live in the heap,
+- FREE is the number of those objects that are not live but that Emacs
+  keeps around for future allocations (maybe because it does not know how
+  to return them to the OS).
+However, if there was overflow in pure space, `garbage-collect'
+returns nil, because real GC can't be done.
+See Info node `(elisp)Garbage Collection'.  */)
+  (void)
+{
+#if (GC_MARK_STACK == GC_MAKE_GCPROS_NOOPS \
+     || GC_MARK_STACK == GC_MARK_STACK_CHECK_GCPROS \
+     || GC_MARK_STACK == GC_USE_GCPROS_CHECK_ZOMBIES)
+  void *end;
+
+#ifdef HAVE___BUILTIN_UNWIND_INIT
+  /* Force callee-saved registers and register windows onto the stack.
+     This is the preferred method if available, obviating the need for
+     machine dependent methods.  */
+  __builtin_unwind_init ();
+  end = &end;
+#else /* not HAVE___BUILTIN_UNWIND_INIT */
+#ifndef GC_SAVE_REGISTERS_ON_STACK
+  /* jmp_buf may not be aligned enough on darwin-ppc64 */
+  union aligned_jmpbuf {
+    Lisp_Object o;
+    sys_jmp_buf j;
+  } j;
+  volatile bool stack_grows_down_p = (char *) &j > (char *) stack_base;
+#endif
+  /* This trick flushes the register windows so that all the state of
+     the process is contained in the stack.  */
+  /* Fixme: Code in the Boehm GC suggests flushing (with `flushrs') is
+     needed on ia64 too.  See mach_dep.c, where it also says inline
+     assembler doesn't work with relevant proprietary compilers.  */
+#ifdef __sparc__
+#if defined (__sparc64__) && defined (__FreeBSD__)
+  /* FreeBSD does not have a ta 3 handler.  */
+  asm ("flushw");
+#else
+  asm ("ta 3");
+#endif
+#endif
+
+  /* Save registers that we need to see on the stack.  We need to see
+     registers used to hold register variables and registers used to
+     pass parameters.  */
+#ifdef GC_SAVE_REGISTERS_ON_STACK
+  GC_SAVE_REGISTERS_ON_STACK (end);
+#else /* not GC_SAVE_REGISTERS_ON_STACK */
+
+#ifndef GC_SETJMP_WORKS  /* If it hasn't been checked yet that
+                            setjmp will definitely work, test it
+                            and print a message with the result
+                            of the test.  */
+  if (!setjmp_tested_p)
+    {
+      setjmp_tested_p = 1;
+      test_setjmp ();
+    }
+#endif /* GC_SETJMP_WORKS */
+
+  sys_setjmp (j.j);
+  end = stack_grows_down_p ? (char *) &j + sizeof j : (char *) &j;
+#endif /* not GC_SAVE_REGISTERS_ON_STACK */
+#endif /* not HAVE___BUILTIN_UNWIND_INIT */
+#endif /* GC_MARK_STACK */
+  return garbage_collect_1 (end);
+}

 /* Mark Lisp objects in glyph matrix MATRIX.  Currently the only
    interesting objects referenced from glyphs are strings.  */
^ permalink raw reply [flat|nested] 40+ messages in thread
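The effect of moving the scan limit can be shown with a small simulation.
This is only a sketch under stated assumptions, not Emacs code: the stack is
modeled as a plain array of words, and every name in it (toy_stack,
toy_mark_stack, toy_collect, heap_objects, marked) is hypothetical. A stale
value sitting in the slots that model the collector's own frame gets marked
when the scan covers the whole array, and is skipped when the caller passes a
tighter limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define N_OBJECTS 2
#define N_SLOTS 8
#define GC_FRAME_START 4   /* slots 4..7 model the collector's own frame */

/* Hypothetical "heap": objects whose addresses may appear on the stack.  */
int heap_objects[N_OBJECTS];
bool marked[N_OBJECTS];

/* Simulated stack, outermost frame first.  */
uintptr_t toy_stack[N_SLOTS];

/* Conservatively scan the words in [start, end) and mark every heap
   object whose address appears there.  */
void
toy_mark_stack (const uintptr_t *start, const uintptr_t *end)
{
  assert (start <= end);
  for (const uintptr_t *p = start; p < end; p++)
    for (size_t i = 0; i < N_OBJECTS; i++)
      if (*p == (uintptr_t) &heap_objects[i])
        marked[i] = true;
}

/* Set up the simulated stack and scan only its first LIMIT slots.  */
void
toy_collect (size_t limit)
{
  assert (limit <= N_SLOTS);
  for (size_t i = 0; i < N_OBJECTS; i++)
    marked[i] = false;
  for (size_t i = 0; i < N_SLOTS; i++)
    toy_stack[i] = 0;

  toy_stack[1] = (uintptr_t) &heap_objects[0];  /* live reference in a caller */
  toy_stack[5] = (uintptr_t) &heap_objects[1];  /* stale value in the GC frame */

  toy_mark_stack (toy_stack, toy_stack + limit);
}
```

Calling toy_collect (N_SLOTS) models the old behavior and marks both objects;
calling toy_collect (GC_FRAME_START) models a limit supplied by the outer
function and leaves the second object unmarked, so it could be reclaimed.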
* Re: GC and stack marking
2014-05-22 15:20 ` Eli Zaretskii
@ 2014-05-22 16:14 ` Stefan Monnier
2014-05-24 12:03 ` Eli Zaretskii
0 siblings, 1 reply; 40+ messages in thread
From: Stefan Monnier @ 2014-05-22 16:14 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: fabrice.popineau, emacs-devel

> Is the patch below OK for the trunk?

Looks good to me.

Stefan
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: GC and stack marking
2014-05-22 16:14 ` Stefan Monnier
@ 2014-05-24 12:03 ` Eli Zaretskii
0 siblings, 0 replies; 40+ messages in thread
From: Eli Zaretskii @ 2014-05-24 12:03 UTC (permalink / raw)
To: Stefan Monnier; +Cc: fabrice.popineau, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: fabrice.popineau@gmail.com, emacs-devel@gnu.org
> Date: Thu, 22 May 2014 12:14:36 -0400
>
> > Is the patch below OK for the trunk?
>
> Looks good to me.

Thanks, committed on the trunk.
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: GC and stack marking
2014-05-20 16:57 ` Eli Zaretskii
2014-05-20 17:54 ` Stefan Monnier
@ 2014-05-20 19:12 ` Daniel Colascione
2014-05-20 19:43 ` Eli Zaretskii
1 sibling, 1 reply; 40+ messages in thread
From: Daniel Colascione @ 2014-05-20 19:12 UTC (permalink / raw)
To: Eli Zaretskii, Stefan Monnier; +Cc: fabrice.popineau, emacs-devel
[-- Attachment #1: Type: text/plain, Size: 1189 bytes --]

On 05/20/2014 09:57 AM, Eli Zaretskii wrote:
>> From: Stefan Monnier <monnier@IRO.UMontreal.CA>
>> Cc: emacs-devel@gnu.org, Fabrice Popineau <fabrice.popineau@gmail.com>
>> Date: Tue, 20 May 2014 09:44:05 -0400
>>
>>> The short version of the question is: is it possible that a Lisp
>>> object which is no longer referenced by anything won't be GC'ed
>>> because it is marked by mark_stack due to some kind of coincidence?
>>
>> Yes, of course, it's what makes a conservative marking conservative.
>
> I have nothing against conservative, but this failure to GC is too
> spectacular to ignore.
>
>>> So the huge hash-table gets dumped into the emacs executable, and
>>
>> That's bad luck, indeed.
>>
>>> causes all kinds of trouble in the dumped Emacs.
>>
>> But it shouldn't cause any trouble (other than extra memory use).
>
> It does, due to all kinds of subtleties. The result is that the
> large_vectors linked list gets dumped with a pointer to a non-existent
> memory, and the dumped Emacs then crashes on the first GC when it
> tries to traverse that linked list.

Can you elaborate on how that happens? This behavior sounds like a plain
GC bug.
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 884 bytes --]
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: GC and stack marking
2014-05-20 19:12 ` Daniel Colascione
@ 2014-05-20 19:43 ` Eli Zaretskii
2014-05-20 22:03 ` Stefan Monnier
0 siblings, 1 reply; 40+ messages in thread
From: Eli Zaretskii @ 2014-05-20 19:43 UTC (permalink / raw)
To: Daniel Colascione; +Cc: fabrice.popineau, monnier, emacs-devel

> Date: Tue, 20 May 2014 12:12:45 -0700
> From: Daniel Colascione <dancol@dancol.org>
> CC: fabrice.popineau@gmail.com, emacs-devel@gnu.org
>
> >> But it shouldn't cause any trouble (other than extra memory use).
> >
> > It does, due to all kinds of subtleties. The result is that the
> > large_vectors linked list gets dumped with a pointer to a non-existent
> > memory, and the dumped Emacs then crashes on the first GC when it
> > tries to traverse that linked list.
>
> Can you elaborate on how that happens? This behavior sounds like a plain
> GC bug.

It's not a bug in GC. The memory management scheme that Fabrice wrote
does not dump the heap (because doing that is problematic on Windows,
and requires addition of a separate section to the executable, which
then precludes its stripping, and has also other complexities).
Instead, temacs uses a private fixed-address heap that is located in a
static array, and whose memory is allocated by a replacement malloc
function. So any address that points to memory allocated not in that
array, but in the real heap provided by malloc from libc, cannot be
safely dumped, because in the dumped Emacs it will point to some
random location.

Now, the large_vectors list is a linked list chained via the next
pointer. If one of these next pointers points to a memory on the
heap, following it in the dumped Emacs will surely crash. There's no
way GC can work around that, when it traverses that linked list.
^ permalink raw reply [flat|nested] 40+ messages in thread
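The failure mode described above can be sketched in a few lines of C. This is
an illustration under assumptions, not the Windows port's allocator: arena,
arena_alloc, in_arena, list_is_dumpable, and struct node are all hypothetical
names. The static array stands in for the private fixed-address heap, and the
check walks a singly linked list the way GC walks large_vectors, reporting
whether every node lives inside the array that would be dumped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ARENA_SIZE 4096

/* Hypothetical stand-in for the static array holding the private heap.  */
_Alignas (max_align_t) unsigned char arena[ARENA_SIZE];
size_t arena_used;

struct node
{
  struct node *next;
};

/* Bump-allocate SIZE bytes from the arena, or return NULL if it is full.  */
void *
arena_alloc (size_t size)
{
  size_t align = _Alignof (max_align_t);
  size_t rounded = (size + align - 1) / align * align;

  assert (arena_used <= ARENA_SIZE);
  if (rounded < size || rounded > ARENA_SIZE - arena_used)
    return NULL;

  void *p = arena + arena_used;
  arena_used += rounded;
  return p;
}

/* True if P points into the arena, i.e. into memory that gets dumped.  */
bool
in_arena (const void *p)
{
  uintptr_t a = (uintptr_t) p;
  uintptr_t lo = (uintptr_t) arena;
  return a >= lo && a - lo < ARENA_SIZE;
}

/* True if every node reachable from HEAD lives in the arena.  A node
   outside it would become a dangling pointer in the dumped image.  */
bool
list_is_dumpable (const struct node *head)
{
  for (const struct node *n = head; n != NULL; n = n->next)
    if (!in_arena (n))
      return false;
  return true;
}
```

A list built entirely with arena_alloc passes the check. As soon as one next
pointer refers to storage obtained elsewhere (the libc heap in the thread, or
any object outside arena in this sketch), list_is_dumpable returns false,
which corresponds to the case where the dumped Emacs crashes on its first GC.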
* Re: GC and stack marking
2014-05-20 19:43 ` Eli Zaretskii
@ 2014-05-20 22:03 ` Stefan Monnier
2014-05-21 2:51 ` Eli Zaretskii
0 siblings, 1 reply; 40+ messages in thread
From: Stefan Monnier @ 2014-05-20 22:03 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: Daniel Colascione, fabrice.popineau, emacs-devel

> It's not a bug in GC. The memory management scheme that Fabrice wrote
> does not dump the heap (because doing that is problematic on Windows,
> and requires addition of a separate section to the executable, which
> then precludes its stripping, and has also other complexities).
> Instead, temacs uses a private fixed-address heap that is located in a
> static array, and whose memory is allocated by a replacement malloc
> function. So any address that points to memory allocated not in that
> array, but in the real heap provided by malloc from libc, cannot be
> safely dumped, because in the dumped Emacs it will point to some
> random location.

OK, so why is the hash table allocated elsewhere than the other objects
(I understand why one might want to do that, but the question is about
what is different in the code in the case of this purify-flag hash-table
compared to other vectors/hashtables allocated during the dump).

Is it just based on size? I.e. would the same problem show up if some
large vector were to be allocated (and not freed) before dumping?

Stefan
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: GC and stack marking
2014-05-20 22:03 ` Stefan Monnier
@ 2014-05-21 2:51 ` Eli Zaretskii
0 siblings, 0 replies; 40+ messages in thread
From: Eli Zaretskii @ 2014-05-21 2:51 UTC (permalink / raw)
To: Stefan Monnier; +Cc: dancol, fabrice.popineau, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Daniel Colascione <dancol@dancol.org>, fabrice.popineau@gmail.com, emacs-devel@gnu.org
> Date: Tue, 20 May 2014 18:03:51 -0400
>
> > It's not a bug in GC. The memory management scheme that Fabrice wrote
> > does not dump the heap (because doing that is problematic on Windows,
> > and requires addition of a separate section to the executable, which
> > then precludes its stripping, and has also other complexities).
> > Instead, temacs uses a private fixed-address heap that is located in a
> > static array, and whose memory is allocated by a replacement malloc
> > function. So any address that points to memory allocated not in that
> > array, but in the real heap provided by malloc from libc, cannot be
> > safely dumped, because in the dumped Emacs it will point to some
> > random location.
>
> OK, so why is the hash table allocated elsewhere than the other objects
> (I understand why one might want to do that, but the question is about
> what is different in the code in the case of this purify-flag hash-table
> compared to other vectors/hashtables allocated during the dump).

Because fixed-address heaps on Windows are limited to allocations
whose size is at most 0x7f000, and one of the vectors allocated for a
70K hash-table is larger than that.

> Is it just based on size? I.e. would the same problem show up if some
> large vector were to be allocated (and not freed) before dumping?

Yes. And not just large vectors, any large object (e.g., string).

And that's what scared me, because I can always find a solution for
the case I know of, but how to make this reliable in the face of
future changes in Emacs?

Anyway, it looks like Fabrice found a way to work around the above
limitation, so I guess this issue is no longer such a big problem.
^ permalink raw reply [flat|nested] 40+ messages in thread
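The size limit quoted above lends itself to a quick arithmetic check. The
sketch below is an assumption-laden illustration: only the 0x7f000 figure
comes from the thread, while fits_private_heap, vector_bytes, and the slot
counts and slot sizes used with them are hypothetical, since the thread does
not say which vector was too large or how wide its slots were.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Largest single allocation the fixed-address heap accepts, as quoted
   in the message above (0x7f000 = 520192 bytes).  */
#define PRIVATE_HEAP_MAX_ALLOC 0x7f000

/* True if a request of NBYTES can be served from the private heap.  */
bool
fits_private_heap (size_t nbytes)
{
  return nbytes <= PRIVATE_HEAP_MAX_ALLOC;
}

/* Payload size of a vector with COUNT slots of SLOT_SIZE bytes each.
   Any object header is ignored here (an assumption of this sketch).  */
size_t
vector_bytes (size_t count, size_t slot_size)
{
  assert (slot_size != 0 && count <= (size_t) -1 / slot_size);
  return count * slot_size;
}
```

With assumed 4-byte slots, a 70,000-slot vector needs 280,000 bytes and fits,
whereas a 140,000-slot vector needs 560,000 bytes and exceeds the limit. Any
object above the threshold, whether a vector or a string, would fall outside
the private heap in the same way.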
* Re: GC and stack marking
2014-05-20 13:44 ` Stefan Monnier
2014-05-20 16:57 ` Eli Zaretskii
@ 2014-05-31 6:31 ` Florian Weimer
2014-05-31 14:24 ` Stefan Monnier
1 sibling, 1 reply; 40+ messages in thread
From: Florian Weimer @ 2014-05-31 6:31 UTC (permalink / raw)
To: Stefan Monnier; +Cc: Eli Zaretskii, Fabrice Popineau, emacs-devel

* Stefan Monnier:

> The Boehm's GC has developed ways to do this second option
> automatically: if during a GC, a memory cell is found to "point to"
> unallocated memory, then it is assumed to be of non-pointer type and
> this fact is recorded somewhere so that if in subsequent GC's this cell
> ends up "pointing" to allocated memory that won't be considered as an
> actual pointer.

I believe this is not a correct description of the mechanism. What
happens is that the pointer *target* is blacklisted and not used for
allocation. What you propose instead is not safe because it will
result in dangling pointers in some cases.
^ permalink raw reply [flat|nested] 40+ messages in thread
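The blacklisting idea described in this message can be modeled very simply.
The following is a toy sketch, not the Boehm collector's data structures or
API: toy_heap, page_of, note_root_word, and alloc_page are hypothetical names,
and the page granularity is assumed for illustration. A scanned word that
happens to point into a page that is not currently allocated causes that page
to be blacklisted, and the allocator then never hands that page out, so the
stray word can never pin a future object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define N_PAGES 4
#define PAGE_SIZE 256

/* Hypothetical heap, divided into fixed-size pages.  */
unsigned char toy_heap[N_PAGES][PAGE_SIZE];
bool allocated[N_PAGES];
bool blacklisted[N_PAGES];

/* Return the index of the page containing address W, or -1 if W does
   not point into the heap at all.  */
int
page_of (uintptr_t w)
{
  uintptr_t base = (uintptr_t) toy_heap;
  if (w < base || w - base >= sizeof toy_heap)
    return -1;
  return (int) ((w - base) / PAGE_SIZE);
}

/* Called for each word found while scanning roots.  A word that looks
   like a pointer into a free page gets that page blacklisted.  */
void
note_root_word (uintptr_t w)
{
  int page = page_of (w);
  assert (page < N_PAGES);
  if (page >= 0 && !allocated[page])
    blacklisted[page] = true;
}

/* Hand out the first page that is neither in use nor blacklisted;
   return -1 if there is none.  */
int
alloc_page (void)
{
  for (int i = 0; i < N_PAGES; i++)
    if (!allocated[i] && !blacklisted[i])
      {
        allocated[i] = true;
        return i;
      }
  return -1;
}
```

The design point is that the decision is attached to the target memory rather
than to the cell holding the value: words pointing into allocated pages are
still treated as genuine references, so no live object is ever freed because
of a wrong guess about a cell's type.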
* Re: GC and stack marking
2014-05-31 6:31 ` Florian Weimer
@ 2014-05-31 14:24 ` Stefan Monnier
0 siblings, 0 replies; 40+ messages in thread
From: Stefan Monnier @ 2014-05-31 14:24 UTC (permalink / raw)
To: Florian Weimer; +Cc: Eli Zaretskii, Fabrice Popineau, emacs-devel

> I believe this is not a correct description of the mechanism. What
> happens is that the pointer *target* is blacklisted and not used for
> allocation.

Oh right, sorry, and thanks for the correction,

Stefan
^ permalink raw reply [flat|nested] 40+ messages in thread