* bug#52461: spontaneous crash with portable dumper
@ 2021-12-13 1:38 YAMAMOTO Mitsuharu
2021-12-13 1:44 ` YAMAMOTO Mitsuharu
2021-12-13 13:31 ` Eli Zaretskii
0 siblings, 2 replies; 11+ messages in thread
From: YAMAMOTO Mitsuharu @ 2021-12-13 1:38 UTC (permalink / raw)
To: 52461
During the development of the Mac port based on Emacs 28.0.90, I had
spontaneous crash inside dump_cold_charset.
% cd src; lldb temacs
(lldb) target create "temacs"
Current executable set to '/Users/mituharu/src/git/emacs-builds/work-debug/src/temacs' (arm64).
(lldb) r -batch -l loadup --temacs=pdump --bin-dest /usr/local/bin/ --eln-dest /usr/local/lib/emacs/28.0.90/
Process 19997 launched: '/Users/mituharu/src/git/emacs-builds/work-debug/src/temacs' (arm64)
Loading loadup.el (source)...
Dump mode: pdump
Using load-path (/Users/mituharu/src/git/emacs-builds/work-debug/../../emacs/work/lisp)
Loading emacs-lisp/byte-run...
Loading emacs-lisp/backquote...
Loading subr...
Loading version...
Loading widget...
Loading custom...
Loading emacs-lisp/map-ynp...
Loading international/mule...
Loading international/mule-conf...
Loading env...
Loading format...
Loading bindings...
Loading window...
Loading files...
Loading emacs-lisp/macroexp...
Loading cus-face...
Loading faces...
Loading loaddefs.el (source)...
Loading button...
Loading emacs-lisp/nadvice...
Loading emacs-lisp/cl-preloaded...
Loading obarray...
Loading abbrev...
Loading simple...
Loading help...
Loading jka-cmpr-hook...
Loading epa-hook...
Loading international/mule-cmds...
Loading case-table...
Loading international/charprop.el (source)...
Loading international/characters...
Loading international/charscript...
Loading international/emoji-zwj...
Loading composite...
Loading language/chinese...
Loading language/cyrillic...
Loading language/indian...
Loading language/sinhala...
Loading language/english...
Loading language/ethiopic...
Loading language/european...
Loading language/czech...
Loading language/slovak...
Loading language/romanian...
Loading language/greek...
Loading language/hebrew...
Loading international/cp51932...
Loading international/eucjp-ms...
Loading language/japanese...
Loading language/korean...
Loading language/lao...
Loading language/tai-viet...
Loading language/thai...
Loading language/tibetan...
Loading language/vietnamese...
Loading language/misc-lang...
Loading language/utf-8-lang...
Loading language/georgian...
Loading language/khmer...
Loading language/burmese...
Loading language/cham...
Loading indent...
Loading emacs-lisp/cl-generic...
Loading minibuffer...
Loading frame...
Loading startup...
Loading term/tty-colors...
Loading font-core...
Loading emacs-lisp/syntax...
Loading font-lock...
Loading jit-lock...
Loading mouse...
Loading scroll-bar...
Loading select...
Loading emacs-lisp/timer...
Loading emacs-lisp/easymenu...
Loading isearch...
Loading rfn-eshadow...
Loading menu-bar...
Loading tab-bar...
Loading emacs-lisp/lisp...
Loading textmodes/page...
Loading register...
Loading textmodes/paragraphs...
Loading progmodes/prog-mode...
Loading emacs-lisp/lisp-mode...
Loading textmodes/text-mode...
Loading textmodes/fill...
Loading newcomment...
Loading replace...
Loading emacs-lisp/tabulated-list...
Loading buff-menu...
Loading fringe...
Loading emacs-lisp/regexp-opt...
Loading image...
Loading international/fontset...
Loading dnd...
Loading tool-bar...
Loading term/common-win...
Loading term/mac-win...
Loading mwheel...
Loading progmodes/elisp-mode...
Loading emacs-lisp/float-sup...
Loading vc/vc-hooks...
Loading vc/ediff-hook...
Loading uniquify...
Loading electric...
Loading paren...
Loading emacs-lisp/shorthands...
Loading emacs-lisp/eldoc...
Loading cus-start...
Loading tooltip...
Loading international/iso-transl...
Loading leim/leim-list.el (source)...
Waiting for git...
Waiting for git...
Finding pointers to doc strings...
Finding pointers to doc strings...done
Pure-hashed: 17091 strings, 5197 vectors, 42628 conses, 4696 bytecodes, 270 others
Dumping under the name emacs.pdmp
Dumping fingerprint: 134341316bf9884828a54d89e5feeb5b0544373e345d945d5498970dc66fa98c
Process 19997 stopped
* thread #2, name = 'org.gnu.Emacs.lisp-main', stop reason = EXC_BAD_ACCESS (code=2, address=0x4300000020)
frame #0: 0x00000001912d41a0 libsystem_platform.dylib`_platform_memmove + 144
libsystem_platform.dylib`_platform_memmove:
-> 0x1912d41a0 <+144>: ldnp q2, q3, [x1]
0x1912d41a4 <+148>: sub x5, x3, x0
0x1912d41a8 <+152>: add x1, x1, x5
0x1912d41ac <+156>: ldnp q0, q1, [x1]
Target 0: (temacs) stopped.
(lldb) up
frame #1: 0x0000000100247c78 temacs`dump_write(ctx=0x0000000170793bf8, buf=0x0000004300000020, nbyte=256) at pdumper.c:779:3
776 eassert (ctx->flags.dump_object_contents);
777 while (ctx->offset + nbyte > ctx->buf_size)
778 dump_grow_buffer (ctx);
-> 779 memcpy ((char *)ctx->buf + ctx->offset, buf, nbyte);
780 ctx->offset += nbyte;
781 }
782
(lldb) p buf
(const void *) $0 = 0x0000004300000020
(lldb) up
frame #2: 0x0000000100253654 temacs`dump_cold_charset(ctx=0x0000000170793bf8, data=(i = 0x0000000101121f53)) at pdumper.c:3361:3
3358 cs_dump_offset + dump_offsetof (struct charset, code_space_mask),
3359 ctx->offset);
3360 struct charset *cs = charset_table + cs_i;
-> 3361 dump_write (ctx, cs->code_space_mask, 256);
3362 }
3363
3364 static void
(lldb) p *cs
(charset) $1 = {
id = 90
hash_index = 386547056672
dimension = 108
code_space = ([0] = 32, [1] = 90, [2] = 112, [3] = 32, [4] = 67, [5] = 99, [6] = 32, [7] = 67, [8] = 102, [9] = 32, [10] = 67, [11] = 115, [12] = 32, [13] = 67, [14] = 111)
code_space_mask = 0x0000004300000020 ""
code_linear_p = false
iso_chars_96 = true
ascii_compatible_p = true
supplementary_p = true
compact_codes_p = false
unified_p = true
iso_final = 93
iso_revision = 93
emacs_mule_id = 10
method = 0x20
min_code = 32
max_code = 34
char_index_offset = 85
min_char = 110
max_char = 105
invalid_code = 99
fast_map = "o"
code_offset = 104
}
(lldb) p cs_i
(int) $2 = 183
(lldb) p charset_table_used
(int) $3 = 183
Because cs_i >= charset_table_used, charset_table[cs_i] (i.e., *cs)
contains uninitialized contents. So writing to the area that
cs->code_space_mask points to can cause crash or memory corruption.
YAMAMOTO Mitsuharu
mituharu@math.s.chiba-u.ac.jp
^ permalink raw reply [flat|nested] 11+ messages in thread
* bug#52461: spontaneous crash with portable dumper
2021-12-13 1:38 bug#52461: spontaneous crash with portable dumper YAMAMOTO Mitsuharu
@ 2021-12-13 1:44 ` YAMAMOTO Mitsuharu
2021-12-13 13:31 ` Eli Zaretskii
1 sibling, 0 replies; 11+ messages in thread
From: YAMAMOTO Mitsuharu @ 2021-12-13 1:44 UTC (permalink / raw)
To: 52461
On Mon, 13 Dec 2021 10:38:28 +0900,
YAMAMOTO Mitsuharu wrote:
>
> Because cs_i >= charset_table_used, charset_table[cs_i] (i.e., *cs)
> contains uninitialized contents. So writing to the area that
> cs->code_space_mask points to can cause crash or memory corruption.
Sorry, cs->code_space_mask was not the destination address but the
source address. So it does not cause memory corruption, but still
crash can happen.
YAMAMOTO Mitsuharu
mituharu@math.s.chiba-u.ac.jp
^ permalink raw reply [flat|nested] 11+ messages in thread
* bug#52461: spontaneous crash with portable dumper
2021-12-13 1:38 bug#52461: spontaneous crash with portable dumper YAMAMOTO Mitsuharu
2021-12-13 1:44 ` YAMAMOTO Mitsuharu
@ 2021-12-13 13:31 ` Eli Zaretskii
2021-12-13 14:43 ` Pip Cet
1 sibling, 1 reply; 11+ messages in thread
From: Eli Zaretskii @ 2021-12-13 13:31 UTC (permalink / raw)
To: YAMAMOTO Mitsuharu; +Cc: 52461
> Date: Mon, 13 Dec 2021 10:38:28 +0900
> From: YAMAMOTO Mitsuharu <mituharu@math.s.chiba-u.ac.jp>
>
> (lldb) p cs_i
> (int) $2 = 183
> (lldb) p charset_table_used
> (int) $3 = 183
>
> Because cs_i >= charset_table_used, charset_table[cs_i] (i.e., *cs)
> contains uninitialized contents.
So we somehow have a charset that is not in charset_table, is that
what you are saying? Because otherwise how could its ID be beyond the
table, when define-charset-internal enlarges the table as needed?
Any idea what charset is that, and where it is added/loaded?
FWIW, on my system, charset_table_used is 179, so maybe the mac port
defines some additional charsets?
Thanks.
^ permalink raw reply [flat|nested] 11+ messages in thread
* bug#52461: spontaneous crash with portable dumper
2021-12-13 13:31 ` Eli Zaretskii
@ 2021-12-13 14:43 ` Pip Cet
2021-12-13 16:52 ` Eli Zaretskii
0 siblings, 1 reply; 11+ messages in thread
From: Pip Cet @ 2021-12-13 14:43 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 52461
On Mon, Dec 13, 2021 at 1:53 PM Eli Zaretskii <eliz@gnu.org> wrote:
> > Date: Mon, 13 Dec 2021 10:38:28 +0900
> > From: YAMAMOTO Mitsuharu <mituharu@math.s.chiba-u.ac.jp>
> >
> > (lldb) p cs_i
> > (int) $2 = 183
> > (lldb) p charset_table_used
> > (int) $3 = 183
> >
> > Because cs_i >= charset_table_used, charset_table[cs_i] (i.e., *cs)
> > contains uninitialized contents.
>
> So we somehow have a charset that is not in charset_table, is that
> what you are saying? Because otherwise how could its ID be beyond the
> table, when define-charset-internal enlarges the table as needed?
I think there's no great mystery here, and the initial analysis is correct:
define-charset-internal enlarges the table, but doesn't zero out the
added entries (>= charset_table_used, < charset_table_size). pdumper
assumes all entries (< charset_table_size), including the unused ones,
contain either valid pointers or NULL. When they're not, but happen to
have an invalid pointer in the wrong place, we get a crash.
Often, realloc will return zeroed memory so this bug may have stayed
invisible for some time.
What we can try is debugging the crashing temacs binary, setting a
breakpoint to the point where define-charset-internal calls xpalloc
(in a very strange manner), at about line 1126 in charset.c, and zero
the memory after the call by executing
memset (new_table, 0, new_size * sizeof (new_table[0]))
in the debugger (I don't use lldb so I'm not sure precisely how to do
that), then continuing to the memcpy.
If that fixes things, we should make the obvious change, I think.
Pip
^ permalink raw reply [flat|nested] 11+ messages in thread
* bug#52461: spontaneous crash with portable dumper
2021-12-13 14:43 ` Pip Cet
@ 2021-12-13 16:52 ` Eli Zaretskii
2021-12-14 8:04 ` YAMAMOTO Mitsuharu
0 siblings, 1 reply; 11+ messages in thread
From: Eli Zaretskii @ 2021-12-13 16:52 UTC (permalink / raw)
To: Pip Cet; +Cc: 52461
> From: Pip Cet <pipcet@gmail.com>
> Date: Mon, 13 Dec 2021 14:43:55 +0000
> Cc: YAMAMOTO Mitsuharu <mituharu@math.s.chiba-u.ac.jp>, 52461@debbugs.gnu.org
>
> What we can try is debugging the crashing temacs binary, setting a
> breakpoint to the point where define-charset-internal calls xpalloc
> (in a very strange manner), at about line 1126 in charset.c, and zero
> the memory after the call by executing
> memset (new_table, 0, new_size * sizeof (new_table[0]))
> in the debugger (I don't use lldb so I'm not sure precisely how to do
> that), then continuing to the memcpy.
>
> If that fixes things, we should make the obvious change, I think.
Sounds like a plan, thanks.
^ permalink raw reply [flat|nested] 11+ messages in thread
* bug#52461: spontaneous crash with portable dumper
2021-12-13 16:52 ` Eli Zaretskii
@ 2021-12-14 8:04 ` YAMAMOTO Mitsuharu
2021-12-14 13:20 ` Eli Zaretskii
0 siblings, 1 reply; 11+ messages in thread
From: YAMAMOTO Mitsuharu @ 2021-12-14 8:04 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 52461, Pip Cet
On Mon, 13 Dec 2021 22:31:09 +0900,
Eli Zaretskii wrote:
>
> FWIW, on my system, charset_table_used is 179, so maybe the mac port
> defines some additional charsets?
Yes.
On Tue, 14 Dec 2021 01:52:51 +0900,
Eli Zaretskii wrote:
>
> > From: Pip Cet <pipcet@gmail.com>
> > Date: Mon, 13 Dec 2021 14:43:55 +0000
> > Cc: YAMAMOTO Mitsuharu <mituharu@math.s.chiba-u.ac.jp>, 52461@debbugs.gnu.org
> >
> > What we can try is debugging the crashing temacs binary, setting a
> > breakpoint to the point where define-charset-internal calls xpalloc
> > (in a very strange manner), at about line 1126 in charset.c, and zero
> > the memory after the call by executing
> > memset (new_table, 0, new_size * sizeof (new_table[0]))
> > in the debugger (I don't use lldb so I'm not sure precisely how to do
> > that), then continuing to the memcpy.
> >
> > If that fixes things, we should make the obvious change, I think.
>
> Sounds like a plan, thanks.
I directly inserted the memset line just after the xpalloc call in
charset.c, and dumped 10 times. No crash occured.
I also tried the change below, and it seems to work, too.
YAMAMOTO Mitsuharu
mituharu@math.s.chiba-u.ac.jp
diff --git a/src/charset.c b/src/charset.c
index 7cd0fa78f0..670fd48a2d 100644
--- a/src/charset.c
+++ b/src/charset.c
@@ -63,7 +63,7 @@ Copyright (C) 2003, 2004
/* Table of struct charset. */
struct charset *charset_table;
int charset_table_size;
-static int charset_table_used;
+int charset_table_used;
/* Special charsets corresponding to symbols. */
int charset_ascii;
diff --git a/src/charset.h b/src/charset.h
index 97122d82a6..8c538234d8 100644
--- a/src/charset.h
+++ b/src/charset.h
@@ -249,6 +249,7 @@ #define EMACS_CHARSET_H
/* Table of struct charset. */
extern struct charset *charset_table;
extern int charset_table_size;
+extern int charset_table_used;
#define CHARSET_FROM_ID(id) (charset_table + (id))
diff --git a/src/pdumper.c b/src/pdumper.c
index 98c760162e..2782648e7a 100644
--- a/src/pdumper.c
+++ b/src/pdumper.c
@@ -3174,7 +3174,7 @@ dump_charset (struct dump_context *ctx, int cs_i)
DUMP_FIELD_COPY (&out, cs, hash_index);
DUMP_FIELD_COPY (&out, cs, dimension);
memcpy (out.code_space, &cs->code_space, sizeof (cs->code_space));
- if (cs->code_space_mask)
+ if (cs_i < charset_table_used && cs->code_space_mask)
dump_field_fixup_later (ctx, &out, cs, &cs->code_space_mask);
DUMP_FIELD_COPY (&out, cs, code_linear_p);
DUMP_FIELD_COPY (&out, cs, iso_chars_96);
@@ -3195,7 +3195,7 @@ dump_charset (struct dump_context *ctx, int cs_i)
memcpy (out.fast_map, &cs->fast_map, sizeof (cs->fast_map));
DUMP_FIELD_COPY (&out, cs, code_offset);
dump_off offset = dump_object_finish (ctx, &out, sizeof (out));
- if (cs->code_space_mask)
+ if (cs_i < charset_table_used && cs->code_space_mask)
dump_remember_cold_op (ctx, COLD_OP_CHARSET,
Fcons (dump_off_to_lisp (cs_i),
dump_off_to_lisp (offset)));
^ permalink raw reply related [flat|nested] 11+ messages in thread
* bug#52461: spontaneous crash with portable dumper
2021-12-14 8:04 ` YAMAMOTO Mitsuharu
@ 2021-12-14 13:20 ` Eli Zaretskii
2021-12-15 3:04 ` YAMAMOTO Mitsuharu
0 siblings, 1 reply; 11+ messages in thread
From: Eli Zaretskii @ 2021-12-14 13:20 UTC (permalink / raw)
To: YAMAMOTO Mitsuharu; +Cc: 52461, pipcet
> Date: Tue, 14 Dec 2021 17:04:48 +0900
> From: YAMAMOTO Mitsuharu <mituharu@math.s.chiba-u.ac.jp>
> Cc: Pip Cet <pipcet@gmail.com>,
> 52461@debbugs.gnu.org
>
> I directly inserted the memset line just after the xpalloc call in
> charset.c, and dumped 10 times. No crash occured.
>
> I also tried the change below, and it seems to work, too.
Thanks. I think I prefer the second variant.
Do you think we need more testing of this, since the problem was
intermittent?
^ permalink raw reply [flat|nested] 11+ messages in thread
* bug#52461: spontaneous crash with portable dumper
2021-12-14 13:20 ` Eli Zaretskii
@ 2021-12-15 3:04 ` YAMAMOTO Mitsuharu
2021-12-15 3:24 ` YAMAMOTO Mitsuharu
2021-12-15 3:30 ` Eli Zaretskii
0 siblings, 2 replies; 11+ messages in thread
From: YAMAMOTO Mitsuharu @ 2021-12-15 3:04 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 52461, pipcet
On Tue, 14 Dec 2021 22:20:30 +0900,
Eli Zaretskii wrote:
>
> > I directly inserted the memset line just after the xpalloc call in
> > charset.c, and dumped 10 times. No crash occured.
> >
> > I also tried the change below, and it seems to work, too.
>
> Thanks. I think I prefer the second variant.
>
> Do you think we need more testing of this, since the problem was
> intermittent?
I don't think so. The members of struct charset other than
code_space_mask is non-pointer values, so they do not involve any
dereference.
Can I install it to the emacs-28 branch? Currently, we have
charset_table_used == 179, and charset_table_size == 180, so the
problem does not manifest itself without additional charsets defined
before dumping.
YAMAMOTO Mitsuharu
mituharu@math.s.chiba-u.ac.jp
^ permalink raw reply [flat|nested] 11+ messages in thread
* bug#52461: spontaneous crash with portable dumper
2021-12-15 3:04 ` YAMAMOTO Mitsuharu
@ 2021-12-15 3:24 ` YAMAMOTO Mitsuharu
2021-12-15 3:30 ` Eli Zaretskii
1 sibling, 0 replies; 11+ messages in thread
From: YAMAMOTO Mitsuharu @ 2021-12-15 3:24 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 52461, pipcet
On Wed, 15 Dec 2021 12:04:31 +0900,
YAMAMOTO Mitsuharu wrote:
>
> Can I install it to the emacs-28 branch? Currently, we have
> charset_table_used == 179, and charset_table_size == 180, so the
> problem does not manifest itself without additional charsets defined
> before dumping.
Oops. The latter half of the last sentence was not right. Forget
about it.
YAMAMOTO Mitsuharu
mituharu@math.s.chiba-u.ac.jp
^ permalink raw reply [flat|nested] 11+ messages in thread
* bug#52461: spontaneous crash with portable dumper
2021-12-15 3:04 ` YAMAMOTO Mitsuharu
2021-12-15 3:24 ` YAMAMOTO Mitsuharu
@ 2021-12-15 3:30 ` Eli Zaretskii
2021-12-15 4:15 ` YAMAMOTO Mitsuharu
1 sibling, 1 reply; 11+ messages in thread
From: Eli Zaretskii @ 2021-12-15 3:30 UTC (permalink / raw)
To: YAMAMOTO Mitsuharu; +Cc: 52461, pipcet
> Date: Wed, 15 Dec 2021 12:04:31 +0900
> From: YAMAMOTO Mitsuharu <mituharu@math.s.chiba-u.ac.jp>
> Cc: pipcet@gmail.com,
> 52461@debbugs.gnu.org
>
> On Tue, 14 Dec 2021 22:20:30 +0900,
> Eli Zaretskii wrote:
> >
> > Thanks. I think I prefer the second variant.
> >
> > Do you think we need more testing of this, since the problem was
> > intermittent?
>
> I don't think so. The members of struct charset other than
> code_space_mask is non-pointer values, so they do not involve any
> dereference.
>
> Can I install it to the emacs-28 branch?
Yes, please.
^ permalink raw reply [flat|nested] 11+ messages in thread
* bug#52461: spontaneous crash with portable dumper
2021-12-15 3:30 ` Eli Zaretskii
@ 2021-12-15 4:15 ` YAMAMOTO Mitsuharu
0 siblings, 0 replies; 11+ messages in thread
From: YAMAMOTO Mitsuharu @ 2021-12-15 4:15 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 52461-done, pipcet
On Wed, 15 Dec 2021 12:30:19 +0900,
Eli Zaretskii wrote:
>
> > Can I install it to the emacs-28 branch?
>
> Yes, please.
Done. Closing.
YAMAMOTO Mitsuharu
mituharu@math.s.chiba-u.ac.jp
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2021-12-15 4:15 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-12-13 1:38 bug#52461: spontaneous crash with portable dumper YAMAMOTO Mitsuharu
2021-12-13 1:44 ` YAMAMOTO Mitsuharu
2021-12-13 13:31 ` Eli Zaretskii
2021-12-13 14:43 ` Pip Cet
2021-12-13 16:52 ` Eli Zaretskii
2021-12-14 8:04 ` YAMAMOTO Mitsuharu
2021-12-14 13:20 ` Eli Zaretskii
2021-12-15 3:04 ` YAMAMOTO Mitsuharu
2021-12-15 3:24 ` YAMAMOTO Mitsuharu
2021-12-15 3:30 ` Eli Zaretskii
2021-12-15 4:15 ` YAMAMOTO Mitsuharu
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).