From: Pip Cet <pipcet@gmail.com>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: Paul Eggert <eggert@cs.ucla.edu>, 41321@debbugs.gnu.org
Subject: bug#41321: 27.0.91; Emacs aborts due to invalid pseudovector objects
Date: Fri, 29 May 2020 09:43:24 +0000 [thread overview]
Message-ID: <CAOqdjBfaSdu8qjfFArnbB7PXsgVjTF3n_JWdGT91Vg=iSYaTNw@mail.gmail.com> (raw)
In-Reply-To: <CAOqdjBcASE_oWgomY7kg6Xi5gNMgaLPoR9sJOh+ZDSy-ja4WfQ@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 2013 bytes --]
On Thu, May 28, 2020 at 2:28 PM Pip Cet <pipcet@gmail.com> wrote:
> My suggestion is instead to put MEM_TYPE_SYMBOL blocks into the rbtree
> twice, once at their proper address and once at the lispsym-based
> offset.
>
> We could then look up each pointer precisely once, though sometimes
> the blocks might overlap and we'd end up marking two objects for one
> pointer.
>
> But that would lead to overlapping rbtree entries, and that requires
> some extra code which wouldn't be exercised very often... still, I
> think it might be worth doing, particularly since there are relatively
> few symbol blocks on most systems.
Okay, here's some initial code that does that. It's a little tricky,
because real addresses and symbol offsets can overlap arbitrarily and
become mapped and unmapped in any order. The basic idea is that symbol
offsets are marked two ways:
1. an overlaps_with_symbols flag on a "normal" memory node
2. a mem node type of MEM_TYPE_SYMBOL_ADJUSTED
(2) implies (1), but not the other way around. There's only one flag
per normal memory node, which is true if any of the addresses in the
node are also valid symbol offsets. MEM_TYPE_SYMBOL_ADJUSTED nodes
have start and end addresses that do not necessarily correspond to
symbol blocks or even symbols; their length is arbitrary.
When we insert or delete memory nodes, we perform the obvious
operations to keep MEM_TYPE_SYMBOL_ADJUSTED blocks accurate: i.e.,
when a MEM_TYPE_SYMBOL_ADJUSTED node is split by an
intervening/overlapping normal node, we insert one or two new
MEM_TYPE_SYMBOL_ADJUSTED nodes to cover the remaining offsets, and set
the overlaps_with_symbols flag on the normal node, to cover those,
etc.
As I said, the code is tricky (i.e. might contain bugs that can only
be discovered through extensive testing on 32-bit systems), and it
complicates what should be generic functions for the rbtree
implementation, so this is probably a 32-bit optimization that is too
late because 32-bit systems are no longer that relevant...
[-- Attachment #2: 0001-snapshot.patch --]
[-- Type: text/x-patch, Size: 8603 bytes --]
From 246493425f01fc6876ed2222fd4c1806dc0e12f1 Mon Sep 17 00:00:00 2001
From: Pip Cet <pipcet@gmail.com>
Date: Fri, 29 May 2020 09:40:36 +0000
Subject: [PATCH] snapshot
---
src/alloc.c | 200 +++++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 199 insertions(+), 1 deletion(-)
diff --git a/src/alloc.c b/src/alloc.c
index e241b9933a..65cbacbe87 100644
--- a/src/alloc.c
+++ b/src/alloc.c
@@ -475,6 +475,7 @@ no_sanitize_memcpy (void *dest, void const *src, size_t size)
MEM_TYPE_CONS,
MEM_TYPE_STRING,
MEM_TYPE_SYMBOL,
+ MEM_TYPE_SYMBOL_ADJUSTED,
MEM_TYPE_FLOAT,
/* Since all non-bool pseudovectors are small enough to be
allocated from vector blocks, this memory type denotes
@@ -534,6 +535,12 @@ deadp (Lisp_Object x)
/* Start and end of allocated region. */
void *start, *end;
+ /* Whether any symbol blocks are known to exist whose adjusted
+ offsets fall in this region. If only symbol offsets in this
+ region are valid, type == MEM_TYPE_SYMBOL_ADJUSTED, but this
+ flag will also be true. */
+ bool overlaps_with_symbols;
+
/* Node color. */
enum {MEM_BLACK, MEM_RED} color;
@@ -981,6 +988,17 @@ record_xmalloc (size_t size)
return p;
}
+static void *
+adjust_symbol (void *ptr)
+{
+ return (void *)((uintptr_t) ptr - (uintptr_t) &lispsym);
+}
+
+static void *
+unadjust_symbol (void *ptr)
+{
+ return (void *)((uintptr_t) ptr + (uintptr_t) &lispsym);
+}
/* Like malloc but used for allocating Lisp data. NBYTES is the
number of bytes to allocate, TYPE describes the intended use of the
@@ -1023,6 +1041,9 @@ lisp_malloc (size_t nbytes, bool clearit, enum mem_type type)
#ifndef GC_MALLOC_CHECK
if (val && type != MEM_TYPE_NON_LISP)
mem_insert (val, (char *) val + nbytes, type);
+ if (val && type == MEM_TYPE_SYMBOL)
+ mem_insert (adjust_symbol (val), (char *) adjust_symbol (val) + nbytes,
+ MEM_TYPE_SYMBOL_ADJUSTED)->overlaps_with_symbols = true;
#endif
MALLOC_UNBLOCK_INPUT;
@@ -1259,6 +1280,9 @@ lisp_align_malloc (size_t nbytes, enum mem_type type)
#ifndef GC_MALLOC_CHECK
if (type != MEM_TYPE_NON_LISP)
mem_insert (val, (char *) val + nbytes, type);
+ if (val && type == MEM_TYPE_SYMBOL)
+ mem_insert (adjust_symbol (val), (char *) adjust_symbol (val) + nbytes,
+ MEM_TYPE_SYMBOL_ADJUSTED)->overlaps_with_symbols = true;
#endif
MALLOC_UNBLOCK_INPUT;
@@ -4073,6 +4097,36 @@ mem_init (void)
mem_root = MEM_NIL;
}
+/* Value is a pointer to the first mem node not to start before START.
+ Value is MEM_NIL if there is no such node. */
+
+static struct mem_node *
+mem_find_next (void *start)
+{
+ struct mem_node *p, *parent;
+
+ p = mem_root;
+ parent = p;
+ while (p != MEM_NIL)
+ {
+ if (start >= p->end)
+ {
+ p = p->right;
+ }
+ else if (start <= p->start)
+ {
+ parent = p;
+ p = p->left;
+ }
+ else
+ return p;
+ }
+
+ if (start <= parent->start)
+ return parent;
+
+ return MEM_NIL;
+}
/* Value is a pointer to the mem_node containing START. Value is
MEM_NIL if there is no node in the tree containing START. */
@@ -4119,9 +4173,42 @@ mem_insert (void *start, void *end, enum mem_type type)
while (c != MEM_NIL)
{
parent = c;
- c = start < c->start ? c->left : c->right;
+ if (start < c->end && c->start < end)
+ break;
+ if (start < c->start)
+ c = c->left;
+ else if (end >= c->end)
+ c = c->right;
+ else
+ break;
}
+ if (parent && parent->end > start && parent->start < end)
+ {
+ void *old_start = parent->start;
+ void *old_end = parent->end;
+ enum mem_type old_type = parent->type;
+ if (type == MEM_TYPE_SYMBOL_ADJUSTED
+ && old_type != MEM_TYPE_SYMBOL_ADJUSTED)
+ {
+ if (start < old_start)
+ mem_insert (start, old_start, type)->overlaps_with_symbols = true;
+ if (old_end < end)
+ mem_insert (old_end, end, type)->overlaps_with_symbols = true;
+ }
+ else
+ {
+ eassert (parent->type == MEM_TYPE_SYMBOL_ADJUSTED);
+ mem_delete (parent);
+ parent = mem_insert (start, end, type);
+ if (old_start < start)
+ mem_insert (old_start, start, old_type)->overlaps_with_symbols = true;
+ if (old_end > end)
+ mem_insert (end, old_end, old_type)->overlaps_with_symbols = true;
+ }
+ parent->overlaps_with_symbols = true;
+ return parent;
+ }
/* Create a new node. */
#ifdef GC_MALLOC_CHECK
x = malloc (sizeof *x);
@@ -4136,6 +4223,7 @@ mem_insert (void *start, void *end, enum mem_type type)
x->parent = parent;
x->left = x->right = MEM_NIL;
x->color = MEM_RED;
+ x->overlaps_with_symbols = false;
/* Insert it as child of PARENT or install it as root. */
if (parent)
@@ -4301,12 +4389,92 @@ mem_rotate_right (struct mem_node *x)
x->parent = y;
}
+/* Set the overlaps_with_symbols flag based on MEM_TYPE_SYMBOL
+ blocks. */
+
+static void
+mem_set_overlaps_with_symbols (struct mem_node *x)
+{
+ x->overlaps_with_symbols = false;
+
+ for (void *p = unadjust_symbol (x->start);
+ p < unadjust_symbol (x->end);)
+ {
+ struct mem_node *y = mem_find_next (p);
+ p = y->end;
+ if (y->start >= x->end)
+ break;
+ if (y->type == MEM_TYPE_SYMBOL)
+ {
+ x->overlaps_with_symbols = true;
+ return;
+ }
+ }
+}
/* Delete node Z from the tree. If Z is null or MEM_NIL, do nothing. */
static void
mem_delete (struct mem_node *z)
{
+ if (z->overlaps_with_symbols)
+ {
+ void *z_start = z->start;
+ void *z_end = z->end;
+ z->overlaps_with_symbols = false;
+ mem_delete (z);
+ /* Find all the symbol blocks that intersected with z, and add
+ them to the rbtree. */
+ for (void *unadjusted_start = unadjust_symbol (z_start);
+ unadjusted_start < unadjust_symbol (z_end);)
+ {
+ struct mem_node *x = mem_find_next (unadjusted_start);
+ unadjusted_start = x->end;
+
+ if (x == MEM_NIL)
+ break;
+
+ if (x->start > unadjust_symbol (z_end))
+ break;
+
+ if (x->type == MEM_TYPE_SYMBOL)
+ {
+ mem_insert (max (z_start, adjust_symbol (x->start)),
+ min (z_end, adjust_symbol (x->end)),
+ MEM_TYPE_SYMBOL_ADJUSTED)
+ ->overlaps_with_symbols = true;
+ }
+ }
+ return;
+ }
+ if (z->type == MEM_TYPE_SYMBOL)
+ {
+ for (void *adjusted_start = adjust_symbol (z->start);
+ adjusted_start < adjust_symbol (z->end);)
+ {
+ struct mem_node *x = mem_find_next (adjusted_start);
+ adjusted_start = x->end;
+ if (x->type == MEM_TYPE_SYMBOL_ADJUSTED)
+ {
+ void *x_start = x->start;
+ void *x_end = x->end;
+ mem_delete (x);
+ if (x_start < adjust_symbol (z->start))
+ mem_insert (x_start, adjust_symbol (z->start),
+ MEM_TYPE_SYMBOL_ADJUSTED)->overlaps_with_symbols
+ = true;
+ if (x_end > adjust_symbol (z->end))
+ mem_insert (adjust_symbol (z->end), x_end,
+ MEM_TYPE_SYMBOL_ADJUSTED)->overlaps_with_symbols
+ = true;
+ }
+ else
+ {
+ eassert (x->overlaps_with_symbols);
+ mem_set_overlaps_with_symbols (x);
+ }
+ }
+ }
struct mem_node *x, *y;
if (!z || z == MEM_NIL)
@@ -4342,6 +4510,7 @@ mem_delete (struct mem_node *z)
z->start = y->start;
z->end = y->end;
z->type = y->type;
+ z->overlaps_with_symbols = y->overlaps_with_symbols;
}
if (y->color == MEM_BLACK)
@@ -4766,6 +4935,10 @@ mark_maybe_pointer (void *p)
obj = live_symbol_holding (m, p);
break;
+ case MEM_TYPE_SYMBOL_ADJUSTED:
+ /* handled below */
+ break;
+
case MEM_TYPE_FLOAT:
if (live_float_p (m, p))
obj = make_lisp_ptr (p, Lisp_Float);
@@ -4782,6 +4955,18 @@ mark_maybe_pointer (void *p)
if (!NILP (obj))
mark_object (obj);
+
+ if (m->overlaps_with_symbols)
+ {
+ obj = Qnil;
+ p = unadjust_symbol (p);
+ m = mem_find (p);
+ if (m != MEM_NIL
+ && m->type == MEM_TYPE_SYMBOL)
+ obj = live_symbol_holding (m, unadjust_symbol (p));
+ if (!NILP (obj))
+ mark_object (obj);
+ }
}
}
@@ -7077,6 +7262,19 @@ sweep_symbols (void)
/* Unhook from the free list. */
symbol_free_list = sblk->symbols[0].u.s.next;
lisp_free (sblk);
+ void *p = adjust_symbol (sblk);
+ while (true)
+ {
+ struct mem_node *m = mem_find_next (p);
+ if (m->start >=
+ adjust_symbol (sblk + 1))
+ break;
+ p = m->end;
+ mem_set_overlaps_with_symbols (m);
+ if (m->type == MEM_TYPE_SYMBOL_ADJUSTED
+ && !m->overlaps_with_symbols)
+ mem_delete (m);
+ }
}
else
{
--
2.27.0.rc0
next prev parent reply other threads:[~2020-05-29 9:43 UTC|newest]
Thread overview: 132+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-05-16 10:33 bug#41321: 27.0.91; Emacs aborts due to invalid pseudovector objects Eli Zaretskii
2020-05-16 16:33 ` Paul Eggert
2020-05-16 16:47 ` Eli Zaretskii
2020-05-17 10:56 ` Pip Cet
2020-05-17 15:28 ` Eli Zaretskii
2020-05-17 15:57 ` Eli Zaretskii
2020-05-22 7:22 ` Eli Zaretskii
2020-05-22 8:35 ` Andrea Corallo
2020-05-22 11:04 ` Eli Zaretskii
2020-05-22 12:55 ` Andrea Corallo
2020-05-22 10:54 ` Eli Zaretskii
2020-05-22 11:47 ` Pip Cet
2020-05-22 12:13 ` Eli Zaretskii
2020-05-22 12:39 ` Pip Cet
2020-05-22 12:48 ` Eli Zaretskii
2020-05-22 14:04 ` Pip Cet
2020-05-22 14:26 ` Eli Zaretskii
2020-05-22 14:40 ` Andrea Corallo
2020-05-22 19:03 ` Eli Zaretskii
[not found] ` <CAOqdjBdpU4U1NqErNH0idBmUxNeE3fL=2=KKpo9kbCM3DhW5gA@mail.gmail.com>
2020-05-23 17:58 ` Andrea Corallo
2020-05-23 22:37 ` Stefan Monnier
2020-05-23 22:41 ` Pip Cet
2020-05-23 23:26 ` Stefan Monnier
2020-05-22 12:32 ` Eli Zaretskii
2020-05-29 9:51 ` Eli Zaretskii
2020-05-29 10:00 ` Pip Cet
2020-05-23 23:54 ` Pip Cet
2020-05-24 14:24 ` Eli Zaretskii
2020-05-24 15:00 ` Pip Cet
2020-05-24 16:25 ` Eli Zaretskii
2020-05-24 16:55 ` Eli Zaretskii
2020-05-24 18:03 ` Pip Cet
2020-05-24 18:40 ` Eli Zaretskii
2020-05-24 19:40 ` Pip Cet
2020-05-25 2:30 ` Eli Zaretskii
2020-05-25 6:40 ` Pip Cet
2020-05-25 11:28 ` Pip Cet
2020-05-25 14:53 ` Eli Zaretskii
2020-05-25 15:12 ` Stefan Monnier
2020-05-26 3:39 ` Paul Eggert
2020-05-26 3:33 ` Paul Eggert
2020-05-26 6:18 ` Pip Cet
2020-05-26 7:51 ` Paul Eggert
2020-05-26 8:27 ` Pip Cet
2020-05-26 6:46 ` Paul Eggert
2020-05-26 15:17 ` Eli Zaretskii
2020-05-26 22:49 ` Paul Eggert
2020-05-27 15:26 ` Eli Zaretskii
2020-05-27 16:58 ` Paul Eggert
2020-05-27 17:33 ` Eli Zaretskii
2020-05-27 17:53 ` Paul Eggert
2020-05-27 18:24 ` Eli Zaretskii
2020-05-27 18:39 ` Paul Eggert
2020-05-28 2:43 ` Stefan Monnier
2020-05-28 7:27 ` Eli Zaretskii
2020-05-28 7:41 ` Paul Eggert
2020-05-28 13:30 ` Stefan Monnier
2020-05-28 14:28 ` Pip Cet
2020-05-28 16:24 ` Stefan Monnier
2020-05-29 9:43 ` Pip Cet [this message]
2020-05-29 18:31 ` Paul Eggert
2020-05-29 18:37 ` Pip Cet
2020-05-29 19:32 ` Paul Eggert
2020-05-29 19:37 ` Pip Cet
2020-05-29 20:26 ` Stefan Monnier
2020-05-29 20:40 ` Paul Eggert
2020-05-30 5:54 ` Eli Zaretskii
2020-05-30 17:52 ` Paul Eggert
2020-05-30 18:11 ` Eli Zaretskii
2020-05-30 18:17 ` Paul Eggert
2020-05-30 5:51 ` Eli Zaretskii
2020-05-30 14:26 ` Stefan Monnier
2020-05-27 17:57 ` Pip Cet
2020-05-27 18:39 ` Paul Eggert
2020-05-27 18:56 ` Pip Cet
2020-05-28 1:21 ` Paul Eggert
2020-05-28 6:31 ` Pip Cet
2020-05-28 7:47 ` Paul Eggert
2020-05-28 8:11 ` Pip Cet
2020-05-28 18:27 ` Eli Zaretskii
2020-05-28 19:33 ` Paul Eggert
2020-05-29 6:19 ` Eli Zaretskii
2020-05-29 20:24 ` Paul Eggert
2020-05-29 21:01 ` Pip Cet
2020-05-30 5:58 ` Eli Zaretskii
2020-05-30 7:19 ` Pip Cet
2020-05-30 9:08 ` Eli Zaretskii
2020-05-30 11:06 ` Pip Cet
2020-05-30 11:31 ` Eli Zaretskii
2020-05-30 13:29 ` Pip Cet
2020-05-30 16:32 ` Eli Zaretskii
2020-05-30 16:36 ` Pip Cet
2020-05-30 16:45 ` Eli Zaretskii
2020-05-30 18:04 ` Paul Eggert
2020-05-30 18:12 ` Pip Cet
2020-05-30 18:16 ` Eli Zaretskii
2020-05-30 18:45 ` Paul Eggert
2020-05-30 18:39 ` Pip Cet
2020-05-30 18:57 ` Paul Eggert
2020-05-30 19:06 ` Pip Cet
2020-05-30 21:27 ` Paul Eggert
2020-05-30 21:49 ` Pip Cet
2020-05-30 22:23 ` Paul Eggert
2020-05-30 22:54 ` Pip Cet
2020-05-30 16:31 ` Paul Eggert
2020-05-30 16:42 ` Eli Zaretskii
2020-05-30 17:06 ` Paul Eggert
2020-05-30 17:22 ` Eli Zaretskii
2020-05-30 18:12 ` Paul Eggert
2020-05-30 18:21 ` Eli Zaretskii
2020-05-30 19:14 ` Paul Eggert
2020-05-30 19:33 ` Eli Zaretskii
2020-05-30 22:18 ` Paul Eggert
2020-05-31 15:48 ` Eli Zaretskii
2020-06-01 14:48 ` Eli Zaretskii
2020-09-27 14:39 ` Lars Ingebrigtsen
2020-09-27 14:45 ` Pip Cet
2020-09-27 15:02 ` Lars Ingebrigtsen
2020-09-27 15:16 ` Eli Zaretskii
2020-05-30 16:53 ` Pip Cet
2020-05-30 5:50 ` Eli Zaretskii
2020-05-29 8:25 ` Pip Cet
2020-05-25 15:14 ` Eli Zaretskii
2020-05-25 17:41 ` Pip Cet
2020-05-24 19:00 ` Andy Moreton
2020-05-24 19:09 ` Pip Cet
2020-05-29 10:16 ` Eli Zaretskii
2020-05-29 10:34 ` Pip Cet
2020-05-29 10:55 ` Eli Zaretskii
2020-05-29 11:47 ` Pip Cet
2020-05-29 13:52 ` Eli Zaretskii
2020-05-29 14:19 ` Pip Cet
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://www.gnu.org/software/emacs/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAOqdjBfaSdu8qjfFArnbB7PXsgVjTF3n_JWdGT91Vg=iSYaTNw@mail.gmail.com' \
--to=pipcet@gmail.com \
--cc=41321@debbugs.gnu.org \
--cc=eggert@cs.ucla.edu \
--cc=monnier@iro.umontreal.ca \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).