From: Pip Cet via "Bug reports for GNU Emacs, the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org>
To: 72802@debbugs.gnu.org
Subject: bug#72802: 31.0.50; Crash in (equal sub-char-table-a sub-char-table-b)
Date: Sun, 25 Aug 2024 13:13:34 +0000
Message-ID: <871q2c4uf6.fsf@protonmail.com>

Summary: Comparing sub char tables with 'equal' can crash when the
tables were created via their read syntax.  Using high-level char table
manipulation routines and then comparing char tables (not sub char
tables directly) is almost certain to result in rare crashes as well.

The code in internal_equal compares sub-char-tables incorrectly and
segfaults on my machine (little-endian 64-bit words, LSB tags, 3 ==
Lisp_Cons) when evaluating this code:

(setq a #^^[3 2597376 (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3)])

(setq b #^^[3 2597504 (3) (3) (3) (3) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (3) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2) (2)])

(equal a b)


This happened to me while working on pdumper code for the no-purespace
branch and trying to compare dumped sub char tables, but it can happen
when reading sub char tables using their read syntax, too.

I'm almost certain the bug can actually happen when manipulating char
tables using higher-level routines, and comparing char tables, not sub
char tables.

Comparing char tables definitely results in nonsensical (but identical)
arguments 'o1' and 'o2' being passed to 'internal_equal', and put into
the hash table 'ht' in internal_equal.

Without using the read syntax for sub char tables, this is almost
certainly still a crashable bug, but a very rare one: it relies on
conservative stack marking finding our hash table and then trying to
mark the invalid conses in it.

On 32-bit machines, the crashes might be much more common.

The problem is this code:

	for (ptrdiff_t i = 0; i < size; i++)
	  {
	    Lisp_Object v1, v2;
	    v1 = AREF (o1, i);
	    v2 = AREF (o2, i);
	    if (!internal_equal (v1, v2, equal_kind, depth + 1, ht))
	      return false;
	  }

which assumes sub char tables are ordinary pseudovectors and can be
compared by comparing XVECTOR (o1)->contents to XVECTOR (o2)->contents.

However, sub char tables should be compared by comparing
XSUB_CHAR_TABLE (o1)->contents to XSUB_CHAR_TABLE (o2)->contents, after
checking that 'depth' and 'min_char' also match.
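
To illustrate the comparison order described above, here is a minimal
sketch on a simplified model.  It is not the actual fix and does not use
Emacs internals: 'struct mock_sub_char_table', its 'n_contents' field and
'mock_sub_char_table_equal' are hypothetical names, and plain ints stand
in for the Lisp_Object contents.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of a sub char table.  Plain ints
   stand in for the Lisp_Object elements of the real structure.  */
enum { MOCK_MAX_ELEMENTS = 128 };

struct mock_sub_char_table
{
  int depth;
  int min_char;
  size_t n_contents;
  int contents[MOCK_MAX_ELEMENTS];
};

/* Compare the two scalar fields first, then the contents, instead of
   treating 'depth' and 'min_char' as if they were the first element.  */
static bool
mock_sub_char_table_equal (const struct mock_sub_char_table *a,
                           const struct mock_sub_char_table *b)
{
  assert (a != NULL && b != NULL);

  if (a->depth != b->depth || a->min_char != b->min_char)
    return false;
  if (a->n_contents != b->n_contents)
    return false;

  for (size_t i = 0; i < a->n_contents; i++)
    if (a->contents[i] != b->contents[i])
      return false;

  return true;
}
```

In this model, two tables that differ only in 'min_char' compare as
unequal without any element ever being looked at.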

The memory layout of sub char tables is:

struct Lisp_Sub_Char_Table
  {
    /* HEADER.SIZE is the vector's size field, which also holds the
       pseudovector type information.  It holds the size, too.  */
    union vectorlike_header header;

    /* Depth of this sub char-table.  It should be 1, 2, or 3.  A sub
       char-table of depth 1 contains 16 elements, and each element
       covers 4096 (128*32) characters.  A sub char-table of depth 2
       contains 32 elements, and each element covers 128 characters.  A
       sub char-table of depth 3 contains 128 elements, and each element
       is for one character.  */
    int depth;

    /* Minimum character covered by the sub char-table.  */
    int min_char;

    /* Use set_sub_char_table_contents to set this.  */
    Lisp_Object contents[FLEXIBLE_ARRAY_MEMBER];
  } GCALIGNED_STRUCT;

So, in the above example, the first 64-bit word after the header has
'min_char' in the high 32 bits and 'depth' (3) in the low 32 bits.  In
my case, we end up calling

internal_equal (o1=XIL(0x27a20000000003), o2=XIL(0x27a28000000003),
equal_kind=EQUAL_PLAIN, depth=1, ht=XIL(0)) at fns.c:2887

with the nonsensical Lisp words o1 = 0x27a20000000003 (depth = 3,
min_char = 0x27a200) and o2 = 0x27a28000000003 (depth = 3, min_char =
0x27a280); these are interpreted as Lisp conses and we attempt to
dereference them, which leads to the segfault.
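
The arithmetic behind those two words can be checked with a small
standalone sketch.  It assumes the configuration described in this
report (little-endian, 64-bit words, 32-bit int, three low tag bits);
'fields_as_word' and 'low_tag_bits' are hypothetical helper names, not
Emacs functions.

```c
#include <assert.h>
#include <stdint.h>

/* Return the 64-bit word a little-endian machine sees when 'depth'
   (lower address) and 'min_char' (higher address) are read together
   as one word.  The value is built arithmetically, so the result does
   not depend on the endianness of the host running this sketch.  */
static uint64_t
fields_as_word (int32_t depth, int32_t min_char)
{
  /* The struct comment says depth should be 1, 2, or 3.  */
  assert (depth >= 1 && depth <= 3);
  return ((uint64_t) (uint32_t) min_char << 32) | (uint32_t) depth;
}

/* Low three bits of the word, used as the type tag on the build
   described in this report (assumption: LSB tags, 3 tag bits).  */
static unsigned
low_tag_bits (uint64_t word)
{
  return (unsigned) (word & 7);
}
```

With depth = 3 and min_char = 2597376 (0x27a200), 'fields_as_word'
yields 0x27a20000000003, and its low tag bits are 3, which this build
treats as Lisp_Cons.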

Relevant section of the backtrace:

(gdb) bt full
#0  0x0000555555838ea7 in internal_equal (o1=XIL(0x27a20000000003), o2=XIL(0x27a28000000003), equal_kind=EQUAL_PLAIN, depth=1, ht=XIL(0)) at fns.c:2887
        li = {
          tortoise = XIL(0x27a20000000003),
          max = 2,
          n = 0,
          q = 2
        }
#1  0x00005555558393c8 in internal_equal (o1=XIL(0x555557718945), o2=XIL(0x55555677bda5), equal_kind=EQUAL_PLAIN, depth=0, ht=XIL(0)) at fns.c:2963
        v1 = XIL(0x27a20000000003)
        v2 = XIL(0x27a28000000003)
        i = 0
        size = 129
#2  0x0000555555838a01 in Fequal (o1=XIL(0x555557718945), o2=XIL(0x55555677bda5)) at fns.c:2783

(gdb) l
2958		for (ptrdiff_t i = 0; i < size; i++)
2959		  {
2960		    Lisp_Object v1, v2;
2961		    v1 = AREF (o1, i);
2962		    v2 = AREF (o2, i);
2963		    if (!internal_equal (v1, v2, equal_kind, depth + 1, ht))
2964		      return false;
2965		  }
2966		return true;
2967	      }
(gdb) p SUB_CHAR_TABLE_P (o1)
$38 = true
(gdb) p SUB_CHAR_TABLE_P (o2)
$39 = true
(gdb) p *XSUB_CHAR_TABLE (o1)
$40 = {
  header = {
    size = 4611686018981036161
  },
  depth = 3,
  min_char = 2597376,
  contents = 0x555557718950
}
(gdb) p *XSUB_CHAR_TABLE (o2)
$41 = {
  header = {
    size = 4611686018981036161
  },
  depth = 3,
  min_char = 2597504,
  contents = 0x55555677bdb0
}
(gdb) p AREF (o1, 0)
$42 = (struct Lisp_X *) 0x27a20000000003
(gdb) p AREF (o2, 0)
$43 = (struct Lisp_X *) 0x27a28000000003
(gdb) p *XVECTOR (o1)
$44 = {
  header = {
    size = 4611686018981036161
  },
  contents = 0x555557718948
}
(gdb) p *XVECTOR (o2)
$45 = {
  header = {
    size = 4611686018981036161
  },
  contents = 0x55555677bda8
}
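
In the gdb output above, 'contents' as seen through XVECTOR sits 8 bytes
before 'contents' as seen through XSUB_CHAR_TABLE (0x555557718948
vs. 0x555557718950).  The following sketch reproduces that gap with
simplified mock structs; 'mock_object', 'struct mock_vector',
'struct mock_sub_char_table_layout' and 'contents_offset_gap' are
hypothetical names, and a single ptrdiff_t stands in for the real header.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a Lisp_Object: one pointer-sized word.  */
typedef void *mock_object;

/* Simplified model of an ordinary vector: header, then contents.  */
struct mock_vector
{
  ptrdiff_t size;
  mock_object contents[];
};

/* Simplified model of a sub char table: two ints sit between the
   header and the contents.  */
struct mock_sub_char_table_layout
{
  ptrdiff_t size;
  int depth;
  int min_char;
  mock_object contents[];
};

/* Number of bytes by which the vector view of 'contents' starts
   before the sub char table view.  */
static size_t
contents_offset_gap (void)
{
  size_t as_vector = offsetof (struct mock_vector, contents);
  size_t as_sub_char_table
    = offsetof (struct mock_sub_char_table_layout, contents);
  assert (as_sub_char_table >= as_vector);
  return as_sub_char_table - as_vector;
}
```

On a typical 64-bit host with 4-byte int, I would expect the gap to be
8 bytes, that is, exactly the 'depth' and 'min_char' fields that AREF
(o1, 0) ends up reading.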

Fix coming up once this has a bug number.







Thread overview: 6+ messages
2024-08-25 13:13 Pip Cet via Bug reports for GNU Emacs, the Swiss army knife of text editors [this message]
2024-08-25 14:11 ` bug#72802: 31.0.50; Crash in (equal sub-char-table-a sub-char-table-b) Pip Cet via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-08-25 14:50   ` Eli Zaretskii
2024-08-25 15:15     ` Pip Cet via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-08-25 15:39       ` Eli Zaretskii
2024-08-25 16:01         ` Pip Cet via Bug reports for GNU Emacs, the Swiss army knife of text editors
