unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
* bug#52461: spontaneous crash with portable dumper
@ 2021-12-13  1:38 YAMAMOTO Mitsuharu
  2021-12-13  1:44 ` YAMAMOTO Mitsuharu
  2021-12-13 13:31 ` Eli Zaretskii
  0 siblings, 2 replies; 11+ messages in thread
From: YAMAMOTO Mitsuharu @ 2021-12-13  1:38 UTC (permalink / raw)
  To: 52461

During the development of the Mac port based on Emacs 28.0.90, I had
spontaneous crash inside dump_cold_charset.

% cd src; lldb temacs
(lldb) target create "temacs"
Current executable set to '/Users/mituharu/src/git/emacs-builds/work-debug/src/temacs' (arm64).
(lldb) r -batch  -l loadup --temacs=pdump --bin-dest /usr/local/bin/ --eln-dest /usr/local/lib/emacs/28.0.90/
Process 19997 launched: '/Users/mituharu/src/git/emacs-builds/work-debug/src/temacs' (arm64)
Loading loadup.el (source)...
Dump mode: pdump
Using load-path (/Users/mituharu/src/git/emacs-builds/work-debug/../../emacs/work/lisp)
Loading emacs-lisp/byte-run...
Loading emacs-lisp/backquote...
Loading subr...
Loading version...
Loading widget...
Loading custom...
Loading emacs-lisp/map-ynp...
Loading international/mule...
Loading international/mule-conf...
Loading env...
Loading format...
Loading bindings...
Loading window...
Loading files...
Loading emacs-lisp/macroexp...
Loading cus-face...
Loading faces...
Loading loaddefs.el (source)...
Loading button...
Loading emacs-lisp/nadvice...
Loading emacs-lisp/cl-preloaded...
Loading obarray...
Loading abbrev...
Loading simple...
Loading help...
Loading jka-cmpr-hook...
Loading epa-hook...
Loading international/mule-cmds...
Loading case-table...
Loading international/charprop.el (source)...
Loading international/characters...
Loading international/charscript...
Loading international/emoji-zwj...
Loading composite...
Loading language/chinese...
Loading language/cyrillic...
Loading language/indian...
Loading language/sinhala...
Loading language/english...
Loading language/ethiopic...
Loading language/european...
Loading language/czech...
Loading language/slovak...
Loading language/romanian...
Loading language/greek...
Loading language/hebrew...
Loading international/cp51932...
Loading international/eucjp-ms...
Loading language/japanese...
Loading language/korean...
Loading language/lao...
Loading language/tai-viet...
Loading language/thai...
Loading language/tibetan...
Loading language/vietnamese...
Loading language/misc-lang...
Loading language/utf-8-lang...
Loading language/georgian...
Loading language/khmer...
Loading language/burmese...
Loading language/cham...
Loading indent...
Loading emacs-lisp/cl-generic...
Loading minibuffer...
Loading frame...
Loading startup...
Loading term/tty-colors...
Loading font-core...
Loading emacs-lisp/syntax...
Loading font-lock...
Loading jit-lock...
Loading mouse...
Loading scroll-bar...
Loading select...
Loading emacs-lisp/timer...
Loading emacs-lisp/easymenu...
Loading isearch...
Loading rfn-eshadow...
Loading menu-bar...
Loading tab-bar...
Loading emacs-lisp/lisp...
Loading textmodes/page...
Loading register...
Loading textmodes/paragraphs...
Loading progmodes/prog-mode...
Loading emacs-lisp/lisp-mode...
Loading textmodes/text-mode...
Loading textmodes/fill...
Loading newcomment...
Loading replace...
Loading emacs-lisp/tabulated-list...
Loading buff-menu...
Loading fringe...
Loading emacs-lisp/regexp-opt...
Loading image...
Loading international/fontset...
Loading dnd...
Loading tool-bar...
Loading term/common-win...
Loading term/mac-win...
Loading mwheel...
Loading progmodes/elisp-mode...
Loading emacs-lisp/float-sup...
Loading vc/vc-hooks...
Loading vc/ediff-hook...
Loading uniquify...
Loading electric...
Loading paren...
Loading emacs-lisp/shorthands...
Loading emacs-lisp/eldoc...
Loading cus-start...
Loading tooltip...
Loading international/iso-transl...
Loading leim/leim-list.el (source)...
Waiting for git...
Waiting for git...
Finding pointers to doc strings...
Finding pointers to doc strings...done
Pure-hashed: 17091 strings, 5197 vectors, 42628 conses, 4696 bytecodes, 270 others
Dumping under the name emacs.pdmp
Dumping fingerprint: 134341316bf9884828a54d89e5feeb5b0544373e345d945d5498970dc66fa98c
Process 19997 stopped
* thread #2, name = 'org.gnu.Emacs.lisp-main', stop reason = EXC_BAD_ACCESS (code=2, address=0x4300000020)
    frame #0: 0x00000001912d41a0 libsystem_platform.dylib`_platform_memmove + 144
libsystem_platform.dylib`_platform_memmove:
->  0x1912d41a0 <+144>: ldnp   q2, q3, [x1]
    0x1912d41a4 <+148>: sub    x5, x3, x0
    0x1912d41a8 <+152>: add    x1, x1, x5
    0x1912d41ac <+156>: ldnp   q0, q1, [x1]
Target 0: (temacs) stopped.
(lldb) up
frame #1: 0x0000000100247c78 temacs`dump_write(ctx=0x0000000170793bf8, buf=0x0000004300000020, nbyte=256) at pdumper.c:779:3
   776 	  eassert (ctx->flags.dump_object_contents);
   777 	  while (ctx->offset + nbyte > ctx->buf_size)
   778 	    dump_grow_buffer (ctx);
-> 779 	  memcpy ((char *)ctx->buf + ctx->offset, buf, nbyte);
   780 	  ctx->offset += nbyte;
   781 	}
   782 	
(lldb) p buf
(const void *) $0 = 0x0000004300000020
(lldb) up
frame #2: 0x0000000100253654 temacs`dump_cold_charset(ctx=0x0000000170793bf8, data=(i = 0x0000000101121f53)) at pdumper.c:3361:3
   3358	     cs_dump_offset + dump_offsetof (struct charset, code_space_mask),
   3359	     ctx->offset);
   3360	  struct charset *cs = charset_table + cs_i;
-> 3361	  dump_write (ctx, cs->code_space_mask, 256);
   3362	}
   3363	
   3364	static void
(lldb) p *cs
(charset) $1 = {
  id = 90
  hash_index = 386547056672
  dimension = 108
  code_space = ([0] = 32, [1] = 90, [2] = 112, [3] = 32, [4] = 67, [5] = 99, [6] = 32, [7] = 67, [8] = 102, [9] = 32, [10] = 67, [11] = 115, [12] = 32, [13] = 67, [14] = 111)
  code_space_mask = 0x0000004300000020 ""
  code_linear_p = false
  iso_chars_96 = true
  ascii_compatible_p = true
  supplementary_p = true
  compact_codes_p = false
  unified_p = true
  iso_final = 93
  iso_revision = 93
  emacs_mule_id = 10
  method = 0x20
  min_code = 32
  max_code = 34
  char_index_offset = 85
  min_char = 110
  max_char = 105
  invalid_code = 99
  fast_map = "o"
  code_offset = 104
}
(lldb) p cs_i
(int) $2 = 183
(lldb) p charset_table_used
(int) $3 = 183

Because cs_i >= charset_table_used, charset_table[cs_i] (i.e., *cs)
contains uninitialized contents.  So writing to the area that
cs->code_space_mask points to can cause crash or memory corruption.

				     YAMAMOTO Mitsuharu
				mituharu@math.s.chiba-u.ac.jp





^ permalink raw reply	[flat|nested] 11+ messages in thread

* bug#52461: spontaneous crash with portable dumper
  2021-12-13  1:38 bug#52461: spontaneous crash with portable dumper YAMAMOTO Mitsuharu
@ 2021-12-13  1:44 ` YAMAMOTO Mitsuharu
  2021-12-13 13:31 ` Eli Zaretskii
  1 sibling, 0 replies; 11+ messages in thread
From: YAMAMOTO Mitsuharu @ 2021-12-13  1:44 UTC (permalink / raw)
  To: 52461

On Mon, 13 Dec 2021 10:38:28 +0900,
YAMAMOTO Mitsuharu wrote:
> 
> Because cs_i >= charset_table_used, charset_table[cs_i] (i.e., *cs)
> contains uninitialized contents.  So writing to the area that
> cs->code_space_mask points to can cause crash or memory corruption.

Sorry, cs->code_space_mask was not the destination address but the
source address.  So it does not cause memory corruption, but still
crash can happen.

				     YAMAMOTO Mitsuharu
				mituharu@math.s.chiba-u.ac.jp





^ permalink raw reply	[flat|nested] 11+ messages in thread

* bug#52461: spontaneous crash with portable dumper
  2021-12-13  1:38 bug#52461: spontaneous crash with portable dumper YAMAMOTO Mitsuharu
  2021-12-13  1:44 ` YAMAMOTO Mitsuharu
@ 2021-12-13 13:31 ` Eli Zaretskii
  2021-12-13 14:43   ` Pip Cet
  1 sibling, 1 reply; 11+ messages in thread
From: Eli Zaretskii @ 2021-12-13 13:31 UTC (permalink / raw)
  To: YAMAMOTO Mitsuharu; +Cc: 52461

> Date: Mon, 13 Dec 2021 10:38:28 +0900
> From: YAMAMOTO Mitsuharu <mituharu@math.s.chiba-u.ac.jp>
> 
> (lldb) p cs_i
> (int) $2 = 183
> (lldb) p charset_table_used
> (int) $3 = 183
> 
> Because cs_i >= charset_table_used, charset_table[cs_i] (i.e., *cs)
> contains uninitialized contents.

So we somehow have a charset that is not in charset_table, is that
what you are saying?  Because otherwise how could its ID be beyond the
table, when define-charset-internal enlarges the table as needed?

Any idea what charset is that, and where it is added/loaded?

FWIW, on my system, charset_table_used is 179, so maybe the mac port
defines some additional charsets?

Thanks.





^ permalink raw reply	[flat|nested] 11+ messages in thread

* bug#52461: spontaneous crash with portable dumper
  2021-12-13 13:31 ` Eli Zaretskii
@ 2021-12-13 14:43   ` Pip Cet
  2021-12-13 16:52     ` Eli Zaretskii
  0 siblings, 1 reply; 11+ messages in thread
From: Pip Cet @ 2021-12-13 14:43 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 52461

On Mon, Dec 13, 2021 at 1:53 PM Eli Zaretskii <eliz@gnu.org> wrote:
> > Date: Mon, 13 Dec 2021 10:38:28 +0900
> > From: YAMAMOTO Mitsuharu <mituharu@math.s.chiba-u.ac.jp>
> >
> > (lldb) p cs_i
> > (int) $2 = 183
> > (lldb) p charset_table_used
> > (int) $3 = 183
> >
> > Because cs_i >= charset_table_used, charset_table[cs_i] (i.e., *cs)
> > contains uninitialized contents.
>
> So we somehow have a charset that is not in charset_table, is that
> what you are saying?  Because otherwise how could its ID be beyond the
> table, when define-charset-internal enlarges the table as needed?

I think there's no great mystery here, and the initial analysis is correct:

define-charset-internal enlarges the table, but doesn't zero out the
added entries (>= charset_table_used, < charset_table_size). pdumper
assumes all entries (< charset_table_size), including the unused ones,
contain either valid pointers or NULL. When they're not, but happen to
have an invalid pointer in the wrong place, we get a crash.

Often, realloc will return zeroed memory so this bug may have stayed
invisible for some time.

What we can try is debugging the crashing temacs binary, setting a
breakpoint to the point where define-charset-internal calls xpalloc
(in a very strange manner), at about line 1126 in charset.c, and zero
the memory after the call by executing
  memset (new_table, 0, new_size * sizeof (new_table[0]))
in the debugger (I don't use lldb so I'm not sure precisely how to do
that), then continuing to the memcpy.

If that fixes things, we should make the obvious change, I think.

Pip





^ permalink raw reply	[flat|nested] 11+ messages in thread

* bug#52461: spontaneous crash with portable dumper
  2021-12-13 14:43   ` Pip Cet
@ 2021-12-13 16:52     ` Eli Zaretskii
  2021-12-14  8:04       ` YAMAMOTO Mitsuharu
  0 siblings, 1 reply; 11+ messages in thread
From: Eli Zaretskii @ 2021-12-13 16:52 UTC (permalink / raw)
  To: Pip Cet; +Cc: 52461

> From: Pip Cet <pipcet@gmail.com>
> Date: Mon, 13 Dec 2021 14:43:55 +0000
> Cc: YAMAMOTO Mitsuharu <mituharu@math.s.chiba-u.ac.jp>, 52461@debbugs.gnu.org
> 
> What we can try is debugging the crashing temacs binary, setting a
> breakpoint to the point where define-charset-internal calls xpalloc
> (in a very strange manner), at about line 1126 in charset.c, and zero
> the memory after the call by executing
>   memset (new_table, 0, new_size * sizeof (new_table[0]))
> in the debugger (I don't use lldb so I'm not sure precisely how to do
> that), then continuing to the memcpy.
> 
> If that fixes things, we should make the obvious change, I think.

Sounds like a plan, thanks.





^ permalink raw reply	[flat|nested] 11+ messages in thread

* bug#52461: spontaneous crash with portable dumper
  2021-12-13 16:52     ` Eli Zaretskii
@ 2021-12-14  8:04       ` YAMAMOTO Mitsuharu
  2021-12-14 13:20         ` Eli Zaretskii
  0 siblings, 1 reply; 11+ messages in thread
From: YAMAMOTO Mitsuharu @ 2021-12-14  8:04 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 52461, Pip Cet

On Mon, 13 Dec 2021 22:31:09 +0900,
Eli Zaretskii wrote:
> 
> FWIW, on my system, charset_table_used is 179, so maybe the mac port
> defines some additional charsets?

Yes.

On Tue, 14 Dec 2021 01:52:51 +0900,
Eli Zaretskii wrote:
> 
> > From: Pip Cet <pipcet@gmail.com>
> > Date: Mon, 13 Dec 2021 14:43:55 +0000
> > Cc: YAMAMOTO Mitsuharu <mituharu@math.s.chiba-u.ac.jp>, 52461@debbugs.gnu.org
> > 
> > What we can try is debugging the crashing temacs binary, setting a
> > breakpoint to the point where define-charset-internal calls xpalloc
> > (in a very strange manner), at about line 1126 in charset.c, and zero
> > the memory after the call by executing
> >   memset (new_table, 0, new_size * sizeof (new_table[0]))
> > in the debugger (I don't use lldb so I'm not sure precisely how to do
> > that), then continuing to the memcpy.
> > 
> > If that fixes things, we should make the obvious change, I think.
> 
> Sounds like a plan, thanks.

I directly inserted the memset line just after the xpalloc call in
charset.c, and dumped 10 times.  No crash occured.

I also tried the change below, and it seems to work, too.

				     YAMAMOTO Mitsuharu
				mituharu@math.s.chiba-u.ac.jp

diff --git a/src/charset.c b/src/charset.c
index 7cd0fa78f0..670fd48a2d 100644
--- a/src/charset.c
+++ b/src/charset.c
@@ -63,7 +63,7 @@ Copyright (C) 2003, 2004
 /* Table of struct charset.  */
 struct charset *charset_table;
 int charset_table_size;
-static int charset_table_used;
+int charset_table_used;
 
 /* Special charsets corresponding to symbols.  */
 int charset_ascii;
diff --git a/src/charset.h b/src/charset.h
index 97122d82a6..8c538234d8 100644
--- a/src/charset.h
+++ b/src/charset.h
@@ -249,6 +249,7 @@ #define EMACS_CHARSET_H
 /* Table of struct charset.  */
 extern struct charset *charset_table;
 extern int charset_table_size;
+extern int charset_table_used;
 
 #define CHARSET_FROM_ID(id) (charset_table + (id))
 
diff --git a/src/pdumper.c b/src/pdumper.c
index 98c760162e..2782648e7a 100644
--- a/src/pdumper.c
+++ b/src/pdumper.c
@@ -3174,7 +3174,7 @@ dump_charset (struct dump_context *ctx, int cs_i)
   DUMP_FIELD_COPY (&out, cs, hash_index);
   DUMP_FIELD_COPY (&out, cs, dimension);
   memcpy (out.code_space, &cs->code_space, sizeof (cs->code_space));
-  if (cs->code_space_mask)
+  if (cs_i < charset_table_used && cs->code_space_mask)
     dump_field_fixup_later (ctx, &out, cs, &cs->code_space_mask);
   DUMP_FIELD_COPY (&out, cs, code_linear_p);
   DUMP_FIELD_COPY (&out, cs, iso_chars_96);
@@ -3195,7 +3195,7 @@ dump_charset (struct dump_context *ctx, int cs_i)
   memcpy (out.fast_map, &cs->fast_map, sizeof (cs->fast_map));
   DUMP_FIELD_COPY (&out, cs, code_offset);
   dump_off offset = dump_object_finish (ctx, &out, sizeof (out));
-  if (cs->code_space_mask)
+  if (cs_i < charset_table_used && cs->code_space_mask)
     dump_remember_cold_op (ctx, COLD_OP_CHARSET,
                            Fcons (dump_off_to_lisp (cs_i),
                                   dump_off_to_lisp (offset)));






^ permalink raw reply related	[flat|nested] 11+ messages in thread

* bug#52461: spontaneous crash with portable dumper
  2021-12-14  8:04       ` YAMAMOTO Mitsuharu
@ 2021-12-14 13:20         ` Eli Zaretskii
  2021-12-15  3:04           ` YAMAMOTO Mitsuharu
  0 siblings, 1 reply; 11+ messages in thread
From: Eli Zaretskii @ 2021-12-14 13:20 UTC (permalink / raw)
  To: YAMAMOTO Mitsuharu; +Cc: 52461, pipcet

> Date: Tue, 14 Dec 2021 17:04:48 +0900
> From: YAMAMOTO Mitsuharu <mituharu@math.s.chiba-u.ac.jp>
> Cc: Pip Cet <pipcet@gmail.com>,
> 	52461@debbugs.gnu.org
> 
> I directly inserted the memset line just after the xpalloc call in
> charset.c, and dumped 10 times.  No crash occured.
> 
> I also tried the change below, and it seems to work, too.

Thanks.  I think I prefer the second variant.

Do you think we need more testing of this, since the problem was
intermittent?





^ permalink raw reply	[flat|nested] 11+ messages in thread

* bug#52461: spontaneous crash with portable dumper
  2021-12-14 13:20         ` Eli Zaretskii
@ 2021-12-15  3:04           ` YAMAMOTO Mitsuharu
  2021-12-15  3:24             ` YAMAMOTO Mitsuharu
  2021-12-15  3:30             ` Eli Zaretskii
  0 siblings, 2 replies; 11+ messages in thread
From: YAMAMOTO Mitsuharu @ 2021-12-15  3:04 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 52461, pipcet

On Tue, 14 Dec 2021 22:20:30 +0900,
Eli Zaretskii wrote:
> 
> > I directly inserted the memset line just after the xpalloc call in
> > charset.c, and dumped 10 times.  No crash occured.
> > 
> > I also tried the change below, and it seems to work, too.
> 
> Thanks.  I think I prefer the second variant.
> 
> Do you think we need more testing of this, since the problem was
> intermittent?

I don't think so.  The members of struct charset other than
code_space_mask is non-pointer values, so they do not involve any
dereference.

Can I install it to the emacs-28 branch?  Currently, we have
charset_table_used == 179, and charset_table_size == 180, so the
problem does not manifest itself without additional charsets defined
before dumping.

				     YAMAMOTO Mitsuharu
				mituharu@math.s.chiba-u.ac.jp





^ permalink raw reply	[flat|nested] 11+ messages in thread

* bug#52461: spontaneous crash with portable dumper
  2021-12-15  3:04           ` YAMAMOTO Mitsuharu
@ 2021-12-15  3:24             ` YAMAMOTO Mitsuharu
  2021-12-15  3:30             ` Eli Zaretskii
  1 sibling, 0 replies; 11+ messages in thread
From: YAMAMOTO Mitsuharu @ 2021-12-15  3:24 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 52461, pipcet

On Wed, 15 Dec 2021 12:04:31 +0900,
YAMAMOTO Mitsuharu wrote:
> 
> Can I install it to the emacs-28 branch?  Currently, we have
> charset_table_used == 179, and charset_table_size == 180, so the
> problem does not manifest itself without additional charsets defined
> before dumping.

Oops.  The latter half of the last sentence was not right.  Forget
about it.
 
				     YAMAMOTO Mitsuharu
				mituharu@math.s.chiba-u.ac.jp





^ permalink raw reply	[flat|nested] 11+ messages in thread

* bug#52461: spontaneous crash with portable dumper
  2021-12-15  3:04           ` YAMAMOTO Mitsuharu
  2021-12-15  3:24             ` YAMAMOTO Mitsuharu
@ 2021-12-15  3:30             ` Eli Zaretskii
  2021-12-15  4:15               ` YAMAMOTO Mitsuharu
  1 sibling, 1 reply; 11+ messages in thread
From: Eli Zaretskii @ 2021-12-15  3:30 UTC (permalink / raw)
  To: YAMAMOTO Mitsuharu; +Cc: 52461, pipcet

> Date: Wed, 15 Dec 2021 12:04:31 +0900
> From: YAMAMOTO Mitsuharu <mituharu@math.s.chiba-u.ac.jp>
> Cc: pipcet@gmail.com,
> 	52461@debbugs.gnu.org
> 
> On Tue, 14 Dec 2021 22:20:30 +0900,
> Eli Zaretskii wrote:
> > 
> > Thanks.  I think I prefer the second variant.
> > 
> > Do you think we need more testing of this, since the problem was
> > intermittent?
> 
> I don't think so.  The members of struct charset other than
> code_space_mask is non-pointer values, so they do not involve any
> dereference.
> 
> Can I install it to the emacs-28 branch?

Yes, please.





^ permalink raw reply	[flat|nested] 11+ messages in thread

* bug#52461: spontaneous crash with portable dumper
  2021-12-15  3:30             ` Eli Zaretskii
@ 2021-12-15  4:15               ` YAMAMOTO Mitsuharu
  0 siblings, 0 replies; 11+ messages in thread
From: YAMAMOTO Mitsuharu @ 2021-12-15  4:15 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 52461-done, pipcet

On Wed, 15 Dec 2021 12:30:19 +0900,
Eli Zaretskii wrote:
> 
> > Can I install it to the emacs-28 branch?
> 
> Yes, please.

Done.  Closing.

				     YAMAMOTO Mitsuharu
				mituharu@math.s.chiba-u.ac.jp





^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-12-15  4:15 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-12-13  1:38 bug#52461: spontaneous crash with portable dumper YAMAMOTO Mitsuharu
2021-12-13  1:44 ` YAMAMOTO Mitsuharu
2021-12-13 13:31 ` Eli Zaretskii
2021-12-13 14:43   ` Pip Cet
2021-12-13 16:52     ` Eli Zaretskii
2021-12-14  8:04       ` YAMAMOTO Mitsuharu
2021-12-14 13:20         ` Eli Zaretskii
2021-12-15  3:04           ` YAMAMOTO Mitsuharu
2021-12-15  3:24             ` YAMAMOTO Mitsuharu
2021-12-15  3:30             ` Eli Zaretskii
2021-12-15  4:15               ` YAMAMOTO Mitsuharu

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).