all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: Matt Armstrong <matt@rfc20.org>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: 58639@debbugs.gnu.org
Subject: bug#58639: 29.0.50; [noverlay] Nested overlay iteration in GC
Date: Wed, 19 Oct 2022 14:49:35 -0700	[thread overview]
Message-ID: <87k04vo55c.fsf@rfc20.org> (raw)
In-Reply-To: <jwvfsfjenp1.fsf-monnier+emacs@gnu.org>

[-- Attachment #1: Type: text/plain, Size: 2809 bytes --]

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> Two easy fixes I can think of:
>> a) Break the recursion while marking with a queue, rather than a stack.
>>    To bound the size of the queue do this only for buffers.
>
> Actually, the alloc.c code already has that, so it's a trivial change.

I don't see code for a mark *queue*.  I see one for a stack.  What am I
missing?  (admittedly, I was kinda lazy and just searched for the word
"queue", then paged through rather quickly, failing to spot anything...)


>> b) Add a specialized itree iteration function for gc.  In fact,
>>    properties already have one that we can copy:
>>
>>     extern void traverse_intervals_noorder (
>>         INTERVAL,
>>         void (*) (INTERVAL, void *), void *);
>
> We even have better: just undo commit
> b8fbd42f0a7caa4cd9e2d50dd4e4b2101ac78acd

Attached, with cosmetic changes to the original.

But I also added an eassert(!busy) right before `gc-post-hook', to
hopefully still retain some of the checking for the issue you were
originally trying find.


> Or use the (100% guaranteed untested) code below.

Looks like it would work, but I didn't like the idea of throwing all the
overlays on a stack during GC, since the count can become large.


> The good thing about using `ITREE_FOREACH` in the GC is that it checks
> that we're not running the GC from within an `ITREE_FOREACH` loop.

Ah, good point.

I suppose my patch demotes the checking in GC to just an eassert.  Let
me know what you think of that.

I was up to now thinking that the issue was that GC itself couldn't nest
with ITREE_FOREACH, but that isn't really it.  It is that we don't want
ELisp to run during ITREE_FOREACH *at all* because any ELisp might cause
a nested ITREE_FOREACH.

Is there someplace even more fundamental we can put a check for this?
Maybe Ffuncall?  We really want a "die if ELisp code runs" thing.


> Side comment about `traverse_intervals_noorder`: I added this function
> back when I played with using splay trees for `intervals.c`, which
> worked well except that it occasionally created "pathologically" deep
> trees, hence the need for `traverse_intervals_noorder` to keep the
> recursion depth in O(log N).

Definitely a side conversation.  :-) I don't think
`traverse_intervals_noorder` caps depth at O(log N) for arbitrarily
shaped trees like splay trees.  Counter example: Construct a tree where
every left child has two children, and every right child only one
additional right child.  In this wildly unbalanced tree 1/3 of the nodes
have two children, so `traverse_intervals_noorder` will recurse 1/3 N
times and so take O(N) stack space.  But, the trick will save a lot of
stack in the common cases that splay trees can often have (long chains
of single-child nodes), so I see why you wrote it that way.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Revert-mark_overlays-Use-the-normal-ITREE_FOREACH.patch --]
[-- Type: text/x-diff, Size: 2822 bytes --]

From 570352e1de01312a0e0b8a54a37066d47b7ab79a Mon Sep 17 00:00:00 2001
From: Matt Armstrong <matt@rfc20.org>
Date: Wed, 19 Oct 2022 13:42:35 -0700
Subject: [PATCH] Revert "mark_overlays: Use the normal ITREE_FOREACH"

This reverts commit b8fbd42f0a7caa4cd9e2d50dd4e4b2101ac78acd,
with edits.

* src/alloc.c (mark_overlays): restore function.
(mark_buffer): Call it, not ITREE_FOREACH.
(garbage_collect): eassert (!itree_busy_p ()).
* src/itree.h: Comment tweak: explain why GC is considered risky.  It
isn't that GC itself is risky, it is that GC can call ELisp by way of
a hook, and running ELisp during iteration is risks nested iteration.
---
 src/alloc.c | 22 +++++++++++++++++++---
 src/itree.h |  3 ++-
 2 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/src/alloc.c b/src/alloc.c
index 00f2991f250..189c3be7e23 100644
--- a/src/alloc.c
+++ b/src/alloc.c
@@ -6279,6 +6279,11 @@ garbage_collect (void)
   image_prune_animation_caches (false);
 #endif
 
+  /* ELisp code run by `gc-post-hook' could result in itree iteration,
+     which must not happen while the itree is already busy.  See
+     bug#58639.  */
+  eassert (!itree_busy_p ());
+
   if (!NILP (Vpost_gc_hook))
     {
       specpdl_ref gc_count = inhibit_garbage_collection ();
@@ -6510,6 +6515,18 @@ mark_overlay (struct Lisp_Overlay *ov)
   mark_object (ov->plist);
 }
 
+/* Mark the overlay subtree rooted at NODE.  */
+
+static void
+mark_overlays (struct interval_node *node)
+{
+  if (node == NULL)
+    return;
+  mark_object (node->data);
+  mark_overlays (node->left);
+  mark_overlays (node->right);
+}
+
 /* Mark Lisp_Objects and special pointers in BUFFER.  */
 
 static void
@@ -6531,9 +6548,8 @@ mark_buffer (struct buffer *buffer)
   if (!BUFFER_LIVE_P (buffer))
       mark_object (BVAR (buffer, undo_list));
 
-  struct interval_node *node;
-  ITREE_FOREACH (node, buffer->overlays, PTRDIFF_MIN, PTRDIFF_MAX, ASCENDING)
-    mark_object (node->data);
+  if (buffer->overlays)
+    mark_overlays (buffer->overlays->root);
 
   /* If this is an indirect buffer, mark its base buffer.  */
   if (buffer->base_buffer &&
diff --git a/src/itree.h b/src/itree.h
index 0e2e7d1f81f..860bd835a2c 100644
--- a/src/itree.h
+++ b/src/itree.h
@@ -146,7 +146,8 @@ #define ITREE_NULL NULL
      it is cheap a pure.
    - Only a single iteration can happen at a time, so make sure none of the
      code within the loop can start another tree iteration, i.e. it shouldn't
-     be able to run ELisp code (or GC for that matter).
+     be able to run ELisp code, nor GC since GC can run ELisp by way
+     of `post-gc-hook`.
    - If you need to exit the loop early, you *have* to call `ITREE_ABORT`
      just before exiting (e.g. with `break` or `return`).
    - Non-local exits are not supported within the body of the loop.
-- 
2.35.1


  reply	other threads:[~2022-10-19 21:49 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-19 16:40 bug#58639: 29.0.50; [noverlay] Nested overlay iteration in GC Matt Armstrong
2022-10-19 17:29 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-10-19 21:49   ` Matt Armstrong [this message]
2022-10-19 22:16     ` Matt Armstrong
2022-10-19 23:50       ` Matt Armstrong
2022-10-20  1:55         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-11-12 20:58           ` Stefan Kangas
2022-11-15 17:47             ` Matt Armstrong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87k04vo55c.fsf@rfc20.org \
    --to=matt@rfc20.org \
    --cc=58639@debbugs.gnu.org \
    --cc=monnier@iro.umontreal.ca \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.