unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#68690: Segmentation fault building with native-comp
@ 2024-01-24 14:36 john muhl via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-01-24 17:10 ` Eli Zaretskii
  2024-01-25 18:12 ` Mattias Engdegård
  0 siblings, 2 replies; 29+ messages in thread
From: john muhl via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-01-24 14:36 UTC
  To: 68690

Bisect says 3018c6e7ba5 is the first bad commit. A build using
‘--without-native-compilation’ works fine. The segfault can be
reproduced on Fedora 39 and Debian testing.

make bootstrap
…
make -C ../lisp compile-first EMACS="../src/bootstrap-emacs"
make[3]: Entering directory '/home/jm/src/emacs-0/lisp'
  ELC+ELN  emacs-lisp/macroexp.elc
  ELC+ELN  emacs-lisp/cconv.elc
  ELC+ELN  emacs-lisp/byte-opt.elc
  ELC+ELN  emacs-lisp/bytecomp.elc
  ELC+ELN  emacs-lisp/comp.elc
  ELC+ELN  emacs-lisp/comp-cstr.elc
  ELC+ELN  emacs-lisp/comp-common.elc
  ELC+ELN  emacs-lisp/comp-run.elc
  ELC+ELN  emacs-lisp/loaddefs-gen.elc
  ELC+ELN  emacs-lisp/radix-tree.elc

Backtrace:
../src/bootstrap-emacs[0x57863b]
../src/bootstrap-emacs[0x42651e]
../src/bootstrap-emacs[0x426a10]
../src/bootstrap-emacs[0x576cf8]
../src/bootstrap-emacs[0x576d69]
/lib64/libc.so.6(+0x3e9a0)[0x7fce2b5089a0]
../src/bootstrap-emacs[0x60bb50]
../src/bootstrap-emacs[0x60df74]
../src/bootstrap-emacs[0x5e8910]
../src/bootstrap-emacs[0x5e9891]
../src/bootstrap-emacs[0x5e9d8c]
../src/bootstrap-emacs[0x5e8111]
../src/bootstrap-emacs[0x5e87f7]
../src/bootstrap-emacs[0x5e9891]
../src/bootstrap-emacs[0x5e4fe6]
../src/bootstrap-emacs[0x5e5d3a]
../src/bootstrap-emacs[0x428f09]
../src/bootstrap-emacs[0x5e4fe6]
/home/jm/src/emacs-0/native-lisp/30.0.50-ed70e66e/comp-7672a6ed-93f96353.eln(F636f6d702d2d6e61746976652d636f6d70696c65_comp__native_compile_0+0xaeb)[0x7fce281a030b]
../src/bootstrap-emacs[0x5e4fe6]
/home/jm/src/emacs-0/native-lisp/30.0.50-ed70e66e/comp-7672a6ed-93f96353.eln(F62617463682d6e61746976652d636f6d70696c65_batch_native_compile_0+0x186)[0x7fce281a0c26]
../src/bootstrap-emacs[0x5e4fe6]
/home/jm/src/emacs-0/native-lisp/30.0.50-ed70e66e/comp-7672a6ed-93f96353.eln(F62617463682d627974652b6e61746976652d636f6d70696c65_batch_bytenative_compile_0+0x144)[0x7fce281a0f84]
../src/bootstrap-emacs[0x5e4fe6]
../src/bootstrap-emacs[0x5e860d]
../src/bootstrap-emacs[0x5e9251]
../src/bootstrap-emacs[0x5e87f7]
../src/bootstrap-emacs[0x5e92d9]
../src/bootstrap-emacs[0x5e87f7]
../src/bootstrap-emacs[0x5ea5d1]
../src/bootstrap-emacs[0x5e87f7]
../src/bootstrap-emacs[0x5e9101]
../src/bootstrap-emacs[0x5e87f7]
../src/bootstrap-emacs[0x5ea5d1]
../src/bootstrap-emacs[0x5e87f7]
../src/bootstrap-emacs[0x5e8ab1]
../src/bootstrap-emacs[0x5e87f7]
../src/bootstrap-emacs[0x5e87f7]
../src/bootstrap-emacs[0x5ea1b9]
../src/bootstrap-emacs[0x5e87f7]
../src/bootstrap-emacs[0x5ea1b9]
...
make[3]: *** [Makefile:330: emacs-lisp/byte-opt.elc] Segmentation fault (core dumped)
make[3]: *** Waiting for unfinished jobs....

Backtrace:
../src/bootstrap-emacs[0x57863b]
../src/bootstrap-emacs[0x42651e]
../src/bootstrap-emacs[0x426a10]
../src/bootstrap-emacs[0x576cf8]
../src/bootstrap-emacs[0x576d69]
/lib64/libc.so.6(+0x3e9a0)[0x7f8c2be5c9a0]
../src/bootstrap-emacs[0x60b9e0]
../src/bootstrap-emacs[0x60df74]
../src/bootstrap-emacs[0x5e8910]
../src/bootstrap-emacs[0x5e9891]
../src/bootstrap-emacs[0x5e9d8c]
../src/bootstrap-emacs[0x5e8111]
../src/bootstrap-emacs[0x5e87f7]
../src/bootstrap-emacs[0x5e9891]
../src/bootstrap-emacs[0x5e4fe6]
../src/bootstrap-emacs[0x5e5d3a]
../src/bootstrap-emacs[0x428f09]
../src/bootstrap-emacs[0x5e4fe6]
/home/jm/src/emacs-0/native-lisp/30.0.50-ed70e66e/comp-7672a6ed-93f96353.eln(F636f6d702d2d6e61746976652d636f6d70696c65_comp__native_compile_0+0xaeb)[0x7f8c28abf30b]
../src/bootstrap-emacs[0x5e4fe6]
/home/jm/src/emacs-0/native-lisp/30.0.50-ed70e66e/comp-7672a6ed-93f96353.eln(F62617463682d6e61746976652d636f6d70696c65_batch_native_compile_0+0x186)[0x7f8c28abfc26]
../src/bootstrap-emacs[0x5e4fe6]
/home/jm/src/emacs-0/native-lisp/30.0.50-ed70e66e/comp-7672a6ed-93f96353.eln(F62617463682d627974652b6e61746976652d636f6d70696c65_batch_bytenative_compile_0+0x144)[0x7f8c28abff84]
../src/bootstrap-emacs[0x5e4fe6]
../src/bootstrap-emacs[0x5e860d]
../src/bootstrap-emacs[0x5e9251]
../src/bootstrap-emacs[0x5e87f7]
../src/bootstrap-emacs[0x5e92d9]
../src/bootstrap-emacs[0x5e87f7]
../src/bootstrap-emacs[0x5ea5d1]
../src/bootstrap-emacs[0x5e87f7]
../src/bootstrap-emacs[0x5e9101]
../src/bootstrap-emacs[0x5e87f7]
../src/bootstrap-emacs[0x5ea5d1]
../src/bootstrap-emacs[0x5e87f7]
../src/bootstrap-emacs[0x5e8ab1]
../src/bootstrap-emacs[0x5e87f7]
../src/bootstrap-emacs[0x5e87f7]
../src/bootstrap-emacs[0x5ea1b9]
../src/bootstrap-emacs[0x5e87f7]
../src/bootstrap-emacs[0x5ea1b9]
...
make[3]: *** [Makefile:330: emacs-lisp/bytecomp.elc] Segmentation fault (core dumped)
make[3]: Leaving directory '/home/jm/src/emacs-0/lisp'
make[2]: *** [Makefile:1019: bootstrap-emacs.pdmp] Error 2
make[2]: Leaving directory '/home/jm/src/emacs-0/src'
make[1]: *** [Makefile:554: src] Error 2
make[1]: Leaving directory '/home/jm/src/emacs-0'
make[1]: Entering directory '/home/jm/src/emacs-0'
***
*** "make all" failed with exit status 2.
***
*** You could try to:
*** - run "make bootstrap", which might fix the problem
*** - run "make V=1", which displays the full commands invoked by make,
***   to further investigate the problem
***
make[1]: *** [Makefile:418: advice-on-failure] Error 2
make[1]: Leaving directory '/home/jm/src/emacs-0'
make: *** [Makefile:374: all] Error 2






* bug#68690: Segmentation fault building with native-comp
  2024-01-24 14:36 bug#68690: Segmentation fault building with native-comp john muhl via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-01-24 17:10 ` Eli Zaretskii
  2024-01-24 19:52   ` Gerd Möllmann
  2024-01-24 19:56   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-01-25 18:12 ` Mattias Engdegård
  1 sibling, 2 replies; 29+ messages in thread
From: Eli Zaretskii @ 2024-01-24 17:10 UTC
  To: john muhl, Stefan Monnier; +Cc: 68690

> Date: Wed, 24 Jan 2024 08:36:15 -0600
> From:  john muhl via "Bug reports for GNU Emacs,
>  the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org>
> 
> Bisect says 3018c6e7ba5 is the first bad commit. A build using
> ‘--without-native-compilation’ works fine. The segfault can be
> reproduced on Fedora 39 and Debian testing.
> 
> make bootstrap
> …
> make -C ../lisp compile-first EMACS="../src/bootstrap-emacs"
> make[3]: Entering directory '/home/jm/src/emacs-0/lisp'
>   ELC+ELN  emacs-lisp/macroexp.elc
>   ELC+ELN  emacs-lisp/cconv.elc
>   ELC+ELN  emacs-lisp/byte-opt.elc
>   ELC+ELN  emacs-lisp/bytecomp.elc
>   ELC+ELN  emacs-lisp/comp.elc
>   ELC+ELN  emacs-lisp/comp-cstr.elc
>   ELC+ELN  emacs-lisp/comp-common.elc
>   ELC+ELN  emacs-lisp/comp-run.elc
>   ELC+ELN  emacs-lisp/loaddefs-gen.elc
>   ELC+ELN  emacs-lisp/radix-tree.elc
> 
> Backtrace:
> ../src/bootstrap-emacs[0x57863b]
> ../src/bootstrap-emacs[0x42651e]

Adding Stefan, who installed that commit.






* bug#68690: Segmentation fault building with native-comp
  2024-01-24 17:10 ` Eli Zaretskii
@ 2024-01-24 19:52   ` Gerd Möllmann
  2024-01-24 19:56   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 0 replies; 29+ messages in thread
From: Gerd Möllmann @ 2024-01-24 19:52 UTC
  To: Eli Zaretskii; +Cc: john muhl, 68690, Stefan Monnier

Eli Zaretskii <eliz@gnu.org> writes:

>> Date: Wed, 24 Jan 2024 08:36:15 -0600
>> From:  john muhl via "Bug reports for GNU Emacs,
>>  the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org>
>> 
>> Bisect says 3018c6e7ba5 is the first bad commit. A build using
>> ‘--without-native-compilation’ works fine. The segfault can be
>> reproduced on Fedora 39 and Debian testing.
>> 
>> make bootstrap
>> …
>> make -C ../lisp compile-first EMACS="../src/bootstrap-emacs"
>> make[3]: Entering directory '/home/jm/src/emacs-0/lisp'
>>   ELC+ELN  emacs-lisp/macroexp.elc
>>   ELC+ELN  emacs-lisp/cconv.elc
>>   ELC+ELN  emacs-lisp/byte-opt.elc
>>   ELC+ELN  emacs-lisp/bytecomp.elc
>>   ELC+ELN  emacs-lisp/comp.elc
>>   ELC+ELN  emacs-lisp/comp-cstr.elc
>>   ELC+ELN  emacs-lisp/comp-common.elc
>>   ELC+ELN  emacs-lisp/comp-run.elc
>>   ELC+ELN  emacs-lisp/loaddefs-gen.elc
>>   ELC+ELN  emacs-lisp/radix-tree.elc
>> 
>> Backtrace:
>> ../src/bootstrap-emacs[0x57863b]
>> ../src/bootstrap-emacs[0x42651e]
>
> Adding Stefan, who installed that commit.

FWIW, in an ASAN build, I see an abort.  This is with
1f3371b46e8a6a51f88c56785175b48af2a0bed7, on macOS.

  ELC+ELN  emacs-lisp/macroexp.elc
=================================================================
==32930==ERROR: AddressSanitizer: heap-use-after-free on address 0x60c0000353e0 at pc 0x000102b3fc97 bp 0x7ff7bdaf7250 sp 0x7ff7bdaf7248
READ of size 8 at 0x60c0000353e0 thread T0
    #0 0x102b3fc96 in Fmaphash fns.c:5665
    #1 0x102b062c8 in funcall_subr eval.c:3092
    #2 0x102bf85af in exec_byte_code bytecode.c:815
    #3 0x102b0fd66 in fetch_and_exec_byte_code eval.c:3135
    #4 0x102b0766b in funcall_lambda eval.c:3207
    #5 0x102b05b80 in funcall_general eval.c:2972
    #6 0x102af5c86 in Ffuncall eval.c:3022
    #7 0x102b3fdee in Fmaphash fns.c:5666
    #8 0x102b062c8 in funcall_subr eval.c:3092
    #9 0x102bf85af in exec_byte_code bytecode.c:815
    #10 0x102b0fd66 in fetch_and_exec_byte_code eval.c:3135
    #11 0x102b0766b in funcall_lambda eval.c:3207
    #12 0x102b05b80 in funcall_general eval.c:2972
    #13 0x102af5c86 in Ffuncall eval.c:3022
    #14 0x102af238f in eval_sub eval.c:2497
    #15 0x102af4477 in Fprogn eval.c:432
    #16 0x102af429d in Fif eval.c:388
    #17 0x102af1ecc in eval_sub eval.c:2476
    #18 0x102af4477 in Fprogn eval.c:432
    #19 0x102af46ae in Fcond eval.c:412
    #20 0x102af1ecc in eval_sub eval.c:2476
    #21 0x102af4477 in Fprogn eval.c:432
    #22 0x102af908b in FletX eval.c:972
    #23 0x102af1ecc in eval_sub eval.c:2476
    #24 0x102af4477 in Fprogn eval.c:432
    #25 0x102af4754 in prog_ignore eval.c:443
    #26 0x102afa345 in Fwhile eval.c:1061
    #27 0x102af1ecc in eval_sub eval.c:2476
    #28 0x102af4477 in Fprogn eval.c:432
    #29 0x102af908b in FletX eval.c:972
    #30 0x102af1ecc in eval_sub eval.c:2476
    #31 0x102af4477 in Fprogn eval.c:432
    #32 0x102af1ecc in eval_sub eval.c:2476
    #33 0x102af4244 in Fif eval.c:387
    #34 0x102af1ecc in eval_sub eval.c:2476
    #35 0x102af4477 in Fprogn eval.c:432
    #36 0x102af9d17 in Flet eval.c:1040
    #37 0x102af1ecc in eval_sub eval.c:2476
    #38 0x102af4477 in Fprogn eval.c:432
    #39 0x102af9d17 in Flet eval.c:1040
    #40 0x102af1ecc in eval_sub eval.c:2476
    #41 0x102af4477 in Fprogn eval.c:432
    #42 0x102b07db5 in funcall_lambda eval.c:3287
    #43 0x102b03941 in apply_lambda eval.c:3157
    #44 0x102af3d68 in eval_sub eval.c:2615
    #45 0x102af4477 in Fprogn eval.c:432
    #46 0x102af9d17 in Flet eval.c:1040
    #47 0x102af1ecc in eval_sub eval.c:2476
    #48 0x102af4477 in Fprogn eval.c:432
    #49 0x102b07db5 in funcall_lambda eval.c:3287
    #50 0x102b03941 in apply_lambda eval.c:3157
    #51 0x102af3d68 in eval_sub eval.c:2615
    #52 0x102afb992 in Funwind_protect eval.c:1321
    #53 0x102af1ecc in eval_sub eval.c:2476
    #54 0x102af4477 in Fprogn eval.c:432
    #55 0x102af9d17 in Flet eval.c:1040
    #56 0x102af1ecc in eval_sub eval.c:2476
    #57 0x102af4477 in Fprogn eval.c:432
    #58 0x102af429d in Fif eval.c:388
    #59 0x102af1ecc in eval_sub eval.c:2476
    #60 0x102af4477 in Fprogn eval.c:432
    #61 0x102b07db5 in funcall_lambda eval.c:3287
    #62 0x102b03941 in apply_lambda eval.c:3157
    #63 0x102af3d68 in eval_sub eval.c:2615
    #64 0x102b02223 in Feval eval.c:2389
    #65 0x1028d087a in top_level_2 keyboard.c:1173
    #66 0x102afd8e8 in internal_condition_case eval.c:1537
    #67 0x1028d06e0 in top_level_1 keyboard.c:1185
    #68 0x102afb4b5 in internal_catch eval.c:1217
    #69 0x10288e149 in command_loop keyboard.c:1134
    #70 0x10288db6d in recursive_edit_1 keyboard.c:744
    #71 0x10288eb2c in Frecursive_edit keyboard.c:827
    #72 0x1028867be in main emacs.c:2624
    #73 0x7ff808461385 in start+0x795 (dyld:x86_64+0xfffffffffff5c385)

0x60c0000353e0 is located 96 bytes inside of 128-byte region [0x60c000035380,0x60c000035400)
freed by thread T0 here:
    #0 0x1052b0e16 in free+0xa6 (libclang_rt.asan_osx_dynamic.dylib:x86_64h+0xe0e16)
    #1 0x102eca876 in rpl_free free.c:48
    #2 0x102a567bf in xfree alloc.c:831
    #3 0x102a5eada in hash_table_free_bytes alloc.c:5653
    #4 0x102b3b781 in maybe_resize_hash_table fns.c:4723
    #5 0x102b3ae12 in hash_put fns.c:4864
    #6 0x102b3fa6f in Fputhash fns.c:5639
    #7 0x102b06416 in funcall_subr eval.c:3094
    #8 0x102bf85af in exec_byte_code bytecode.c:815
    #9 0x102b0fd66 in fetch_and_exec_byte_code eval.c:3135
    #10 0x102b0766b in funcall_lambda eval.c:3207
    #11 0x102b05b80 in funcall_general eval.c:2972
    #12 0x102af5c86 in Ffuncall eval.c:3022
    #13 0x102b3fdee in Fmaphash fns.c:5666
    #14 0x102b062c8 in funcall_subr eval.c:3092
    #15 0x102bf85af in exec_byte_code bytecode.c:815
    #16 0x102b0fd66 in fetch_and_exec_byte_code eval.c:3135
    #17 0x102b0766b in funcall_lambda eval.c:3207
    #18 0x102b05b80 in funcall_general eval.c:2972
    #19 0x102af5c86 in Ffuncall eval.c:3022
    #20 0x102b3fdee in Fmaphash fns.c:5666
    #21 0x102b062c8 in funcall_subr eval.c:3092
    #22 0x102bf85af in exec_byte_code bytecode.c:815
    #23 0x102b0fd66 in fetch_and_exec_byte_code eval.c:3135
    #24 0x102b0766b in funcall_lambda eval.c:3207
    #25 0x102b05b80 in funcall_general eval.c:2972
    #26 0x102af5c86 in Ffuncall eval.c:3022
    #27 0x102af238f in eval_sub eval.c:2497
    #28 0x102af4477 in Fprogn eval.c:432
    #29 0x102af429d in Fif eval.c:388

previously allocated by thread T0 here:
    #0 0x1052b0ccd in malloc+0x9d (libclang_rt.asan_osx_dynamic.dylib:x86_64h+0xe0ccd)
    #1 0x102a564bd in lmalloc alloc.c:1402
    #2 0x102a563d6 in xmalloc alloc.c:772
    #3 0x102a5ea87 in hash_table_alloc_bytes alloc.c:5644
    #4 0x102b3b295 in maybe_resize_hash_table fns.c:4700
    #5 0x102b3ae12 in hash_put fns.c:4864
    #6 0x102b3fa6f in Fputhash fns.c:5639
    #7 0x102b06416 in funcall_subr eval.c:3094
    #8 0x102bf85af in exec_byte_code bytecode.c:815
    #9 0x102b0fd66 in fetch_and_exec_byte_code eval.c:3135
    #10 0x102b0766b in funcall_lambda eval.c:3207
    #11 0x102b05b80 in funcall_general eval.c:2972
    #12 0x102af5c86 in Ffuncall eval.c:3022
    #13 0x102b3fdee in Fmaphash fns.c:5666
    #14 0x102b062c8 in funcall_subr eval.c:3092
    #15 0x102bf85af in exec_byte_code bytecode.c:815
    #16 0x102b0fd66 in fetch_and_exec_byte_code eval.c:3135
    #17 0x102b0766b in funcall_lambda eval.c:3207
    #18 0x102b05b80 in funcall_general eval.c:2972
    #19 0x102af5c86 in Ffuncall eval.c:3022
    #20 0x102af238f in eval_sub eval.c:2497
    #21 0x102af4477 in Fprogn eval.c:432
    #22 0x102af429d in Fif eval.c:388
    #23 0x102af1ecc in eval_sub eval.c:2476
    #24 0x102af4477 in Fprogn eval.c:432
    #25 0x102af46ae in Fcond eval.c:412
    #26 0x102af1ecc in eval_sub eval.c:2476
    #27 0x102af4477 in Fprogn eval.c:432
    #28 0x102af908b in FletX eval.c:972
    #29 0x102af1ecc in eval_sub eval.c:2476

SUMMARY: AddressSanitizer: heap-use-after-free fns.c:5665 in Fmaphash
Shadow bytes around the buggy address:
  0x60c000035100: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
  0x60c000035180: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
  0x60c000035200: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x60c000035280: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
  0x60c000035300: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
=>0x60c000035380: fd fd fd fd fd fd fd fd fd fd fd fd[fd]fd fd fd
  0x60c000035400: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
  0x60c000035480: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
  0x60c000035500: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x60c000035580: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
  0x60c000035600: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==32930==ABORTING
Fatal error 6: Aborted






* bug#68690: Segmentation fault building with native-comp
  2024-01-24 17:10 ` Eli Zaretskii
  2024-01-24 19:52   ` Gerd Möllmann
@ 2024-01-24 19:56   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-01-24 20:27     ` Eli Zaretskii
  2024-01-25  5:33     ` Gerd Möllmann
  1 sibling, 2 replies; 29+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-01-24 19:56 UTC
  To: Eli Zaretskii; +Cc: john muhl, 68690

> Adding Stefan, who installed that commit.

Oops, should be fixed now,


        Stefan







* bug#68690: Segmentation fault building with native-comp
  2024-01-24 19:56   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-01-24 20:27     ` Eli Zaretskii
  2024-01-24 23:59       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-01-25  5:33     ` Gerd Möllmann
  1 sibling, 1 reply; 29+ messages in thread
From: Eli Zaretskii @ 2024-01-24 20:27 UTC
  To: Stefan Monnier; +Cc: jm, 68690

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: john muhl <jm@pub.pink>,  68690@debbugs.gnu.org
> Date: Wed, 24 Jan 2024 14:56:49 -0500
> 
> > Adding Stefan, who installed that commit.
> 
> Oops, should be fixed now,

The build now crashes here (this is a 32-bit build with large ints):

  '../src/bootstrap-emacs.exe' -batch --no-site-file --no-site-lisp --eval "(setq load-prefer-newer t byte-compile-warnings 'all)"  --eval "(setq org--inhibit-version-check t)"  \
	  -l bytecomp -f byte-compile-refresh-preloaded \
	  -f batch-byte-compile ../lisp/mwheel.el

  lisp.h:1784: Emacs fatal error: assertion failed: VECTORLIKEP (a)

Here's the backtrace from GDB:

  lisp.h:1784: Emacs fatal error: assertion failed: VECTORLIKEP (a)

  Thread 1 hit Breakpoint 1, terminate_due_to_signal (sig=22,
      backtrace_limit=2147483647) at emacs.c:442
  442       signal (sig, SIG_DFL);
  (gdb) bt
  #0  terminate_due_to_signal (sig=22, backtrace_limit=2147483647) at emacs.c:442
  #1  0x00772401 in die (msg=0xddc80d <b_fwd+233> "VECTORLIKEP (a)",
      file=0xddc740 <b_fwd+28> "lisp.h", line=1784) at alloc.c:8062
  #2  0x00626a44 in XVECTOR (a=XIL(0x92348b000000000)) at lisp.h:1784
  #3  0x00626ace in gc_asize (array=XIL(0x92348b000000000)) at lisp.h:1800
  #4  0x00626bba in AREF (array=XIL(0x92348b000000000), idx=1) at lisp.h:1971
  #5  0x0063174d in Fcharset_after (pos=make_fixnum(113)) at charset.c:2084
  #6  0x007ba520 in funcall_subr (subr=0xdb7a80 <Scharset_after>, numargs=1,
      args=0x9d14a08) at eval.c:3090
  #7  0x0082d5b7 in exec_byte_code (fun=XIL(0xa0000000091dcec8),
      args_template=0, nargs=0, args=0x9d14a08) at bytecode.c:815
  #8  0x007baafa in fetch_and_exec_byte_code (fun=XIL(0xa000000009136058),
      args_template=642, nargs=10, args=0x5dce628) at eval.c:3135
  #9  0x007bb059 in funcall_lambda (fun=XIL(0xa000000009136058), nargs=10,
      arg_vector=0x5dce628) at eval.c:3207
  #10 0x007b9e7d in funcall_general (fun=XIL(0xa000000009136058), numargs=10,
      args=0x5dce628) at eval.c:2972
  #11 0x007ba202 in Ffuncall (nargs=11, args=0x5dce620) at eval.c:3022
  #12 0x007b90d0 in Fapply (nargs=2, args=0x9d14648) at eval.c:2693
  #13 0x007ba9ab in funcall_subr (subr=0xdc1f40 <Sapply>, numargs=2,
      args=0x9d14648) at eval.c:3113
  #14 0x0082d5b7 in exec_byte_code (fun=XIL(0xa000000009108c20),
      args_template=513, nargs=2, args=0x9d145e8) at bytecode.c:815
  #15 0x007baafa in fetch_and_exec_byte_code (fun=XIL(0xa00000000a6685e0),
      args_template=0, nargs=0, args=0x5dcee70) at eval.c:3135
  #16 0x007bb059 in funcall_lambda (fun=XIL(0xa00000000a6685e0), nargs=0,
      arg_vector=0x5dcee70) at eval.c:3207
  #17 0x007b9e7d in funcall_general (fun=XIL(0xa00000000a6685e0), numargs=0,
      args=0x5dcee70) at eval.c:2972
  #18 0x007ba202 in Ffuncall (nargs=1, args=0x5dcee68) at eval.c:3022
  #19 0x007adffe in call0 (fn=XIL(0xa00000000a6685e0)) at lisp.h:3349
  #20 0x007b397d in Fhandler_bind_1 (nargs=3, args=0x9d14478) at eval.c:1403
  #21 0x007ba9ab in funcall_subr (subr=0xdc1dc0 <Shandler_bind_1>, numargs=3,
      args=0x9d14478) at eval.c:3113
  #22 0x0082d5b7 in exec_byte_code (fun=XIL(0xa000000009041ed0),
      args_template=257, nargs=1, args=0x9d14440) at bytecode.c:815
  #23 0x007baafa in fetch_and_exec_byte_code (fun=XIL(0xa000000009365d18),
      args_template=0, nargs=0, args=0x5dcf5a0) at eval.c:3135
  #24 0x007bb059 in funcall_lambda (fun=XIL(0xa000000009365d18), nargs=0,
      arg_vector=0x5dcf5a0) at eval.c:3207
  #25 0x007baca9 in apply_lambda (fun=XIL(0xa000000009365d18), args=XIL(0),
      count=128) at eval.c:3157
  #26 0x007b85e6 in eval_sub (form=XIL(0xc00000000982c7b0)) at eval.c:2572
  #27 0x007b75b9 in Feval (form=XIL(0xc00000000982c7b0), lexical=XIL(0x30))
      at eval.c:2389
  #28 0x006b01ec in top_level_2 () at keyboard.c:1173
  #29 0x007b4530 in internal_condition_case (bfun=0x6b0163 <top_level_2>,
      handlers=XIL(0x90), hfun=0x6af6c2 <cmd_error>) at eval.c:1537
  #30 0x006b027f in top_level_1 (ignore=XIL(0)) at keyboard.c:1185
  #31 0x007b319e in internal_catch (tag=XIL(0x10a40),
      func=0x6b020b <top_level_1>, arg=XIL(0)) at eval.c:1217
  #32 0x006b0056 in command_loop () at keyboard.c:1134
  #33 0x006af122 in recursive_edit_1 () at keyboard.c:744
  #34 0x006af3c0 in Frecursive_edit () at keyboard.c:827
  #35 0x006aa3cc in main (argc=15, argv=0x77a2970) at emacs.c:2624

  Lisp Backtrace:
  "charset-after" (0x9d14a08)
  "fill-move-to-break-point" (0x9d149b8)
  "fill-region-as-paragraph" (0x9d14948)
  "fill-region" (0x9d14898)
  "easy-mmode--mode-docstring" (0x9d14778)
  0x9136058 PVEC_COMPILED
  "apply" (0x9d14648)
  "macroexpand-1" (0x9d145d8)
  "macroexp-macroexpand" (0x9d14578)
  "byte-compile-recurse-toplevel" (0x9d14538)
  "byte-compile-toplevel-file-form" (0x9d144f0)
  0xa6685a8 PVEC_COMPILED
  0xa6685e0 PVEC_COMPILED
  "handler-bind-1" (0x9d14478)
  0x9041ed0 PVEC_COMPILED
  "bytecomp--displaying-warnings" (0x9d14390)
  "byte-compile-from-buffer" (0x9d14328)
  "byte-compile-file" (0x9d14288)
  "batch-byte-compile-file" (0x9d14228)
  "batch-byte-compile" (0x9d14198)
  "command-line-1" (0x9d140b8)
  "command-line" (0x9d14048)
  "normal-top-level" (0x5dcf5a0)
  (gdb)






* bug#68690: Segmentation fault building with native-comp
  2024-01-24 20:27     ` Eli Zaretskii
@ 2024-01-24 23:59       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-01-25 10:26         ` Eli Zaretskii
  0 siblings, 1 reply; 29+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-01-24 23:59 UTC
  To: Eli Zaretskii; +Cc: jm, 68690

> The build now crashes here (this is a 32-bit build with large ints):
>
>   '../src/bootstrap-emacs.exe' -batch --no-site-file --no-site-lisp --eval "(setq load-prefer-newer t byte-compile-warnings 'all)"  --eval "(setq org--inhibit-version-check t)"  \
> 	  -l bytecomp -f byte-compile-refresh-preloaded \
> 	  -f batch-byte-compile ../lisp/mwheel.el
>
>   lisp.h:1784: Emacs fatal error: assertion failed: VECTORLIKEP (a)
>
> Here's the backtrace from GDB:
>
>   lisp.h:1784: Emacs fatal error: assertion failed: VECTORLIKEP (a)
>
>   Thread 1 hit Breakpoint 1, terminate_due_to_signal (sig=22,
>       backtrace_limit=2147483647) at emacs.c:442
>   442       signal (sig, SIG_DFL);
>   (gdb) bt
>   #0  terminate_due_to_signal (sig=22, backtrace_limit=2147483647) at emacs.c:442
>   #1  0x00772401 in die (msg=0xddc80d <b_fwd+233> "VECTORLIKEP (a)",
>       file=0xddc740 <b_fwd+28> "lisp.h", line=1784) at alloc.c:8062
>   #2  0x00626a44 in XVECTOR (a=XIL(0x92348b000000000)) at lisp.h:1784
>   #3  0x00626ace in gc_asize (array=XIL(0x92348b000000000)) at lisp.h:1800
>   #4  0x00626bba in AREF (array=XIL(0x92348b000000000), idx=1) at lisp.h:1971
>   #5  0x0063174d in Fcharset_after (pos=make_fixnum(113)) at charset.c:2084

Hmm... I can't reproduce it here (even with native-comp and
`--with-wide-int`).  The above stack frame suggests it might be related
to commit 33b8d5b6c5a (and hence unrelated to the original bug#68690
which was a bug in `DOHASH`).

Any chance you can investigate what this `0x92348b000000000` is?
It should be a charset's attributes and the "idx=1" is because
we're using `CHARSET_ATTR_NAME` to extract the name.


        Stefan







* bug#68690: Segmentation fault building with native-comp
  2024-01-24 19:56   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-01-24 20:27     ` Eli Zaretskii
@ 2024-01-25  5:33     ` Gerd Möllmann
  2024-01-25  8:33       ` Gerd Möllmann
  1 sibling, 1 reply; 29+ messages in thread
From: Gerd Möllmann @ 2024-01-25  5:33 UTC
  To: 68690; +Cc: eliz, jm, monnier

Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of
text editors" <bug-gnu-emacs@gnu.org> writes:

>> Adding Stefan, who installed that commit.
>
> Oops, should be fixed now,
>

I wonder if a puthash while being in a DOHASH (which is the ASAN failure
I showed) is something we should pursue. I don't think that's something
that's guaranteed to work in a meaningful way. WDYT?
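
To make the concern concrete, here is a toy model in plain C. It is not
Emacs code: `toy_table`, `toy_put` and `toy_iterate_and_insert` are
made-up names, and the table is just a growable array. It only assumes
what the ASAN report shows, namely that an insertion may free the old
storage and allocate a new block. A loop that cached the storage pointer
before inserting would then read freed memory; the sketch avoids that by
re-reading the pointer on every step.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a table's key vector (hypothetical, not
   struct Lisp_Hash_Table).  */
struct toy_table
{
  int *keys;        /* Heap storage, replaced when the table grows.  */
  ptrdiff_t count;  /* Number of entries in use.  */
  ptrdiff_t size;   /* Allocated capacity.  */
};

static void
toy_init (struct toy_table *t, ptrdiff_t size)
{
  t->keys = malloc (size * sizeof *t->keys);
  if (!t->keys)
    abort ();
  t->count = 0;
  t->size = size;
}

/* Append KEY.  When full, grow the way a resize would: allocate a new
   block, copy the entries, free the old block.  */
static void
toy_put (struct toy_table *t, int key)
{
  if (t->count == t->size)
    {
      ptrdiff_t new_size = 2 * t->size;
      int *new_keys = malloc (new_size * sizeof *new_keys);
      if (!new_keys)
        abort ();
      memcpy (new_keys, t->keys, t->count * sizeof *new_keys);
      free (t->keys);  /* Any pointer cached before this call dangles.  */
      t->keys = new_keys;
      t->size = new_size;
    }
  t->keys[t->count++] = key;
}

/* Sum the entries present at loop start while inserting one new key
   per step.  */
static int
toy_iterate_and_insert (struct toy_table *t)
{
  int sum = 0;
  ptrdiff_t n = t->count;  /* Snapshot: new entries are not visited.  */
  for (ptrdiff_t i = 0; i < n; i++)
    {
      /* t->keys is re-read on every step.  Hoisting
         `int *k = t->keys' above the loop would be a use-after-free
         as soon as toy_put resizes.  */
      sum += t->keys[i];
      toy_put (t, 100 + (int) i);
    }
  return sum;
}
```

Even in this safe form, which entries get visited depends on the
snapshot taken at loop start, so "works in a meaningful way" still needs
a definition.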






* bug#68690: Segmentation fault building with native-comp
  2024-01-25  5:33     ` Gerd Möllmann
@ 2024-01-25  8:33       ` Gerd Möllmann
  2024-01-25 15:58         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 29+ messages in thread
From: Gerd Möllmann @ 2024-01-25  8:33 UTC
  To: 68690; +Cc: eliz, jm, monnier

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of
> text editors" <bug-gnu-emacs@gnu.org> writes:
>
>>> Adding Stefan, who installed that commit.
>>
>> Oops, should be fixed now,
>>
>
> I wonder if a puthash while being in a DOHASH (which is the ASAN failure
> I showed) is something we should pursue. I don't think that's something
> that's guaranteed to work in a meaningful way. WDYT?

BTW, I'm using the code below for CL packages, which have a hash table.
A bit less hideous ;-).

/* Iterator for hash tables.  */

struct h_iter
{
  /* Hash table being iterated over.  */
  const struct Lisp_Hash_Table *h;

  /* Current index in key/value vector of H.  */
  ptrdiff_t i;

  /* Key and value at I, or nil.  */
  Lisp_Object key, value;
};

/* Return a freshly initialized iterator for iterating over hash table
   TABLE.  */

static struct h_iter
h_init (Lisp_Object table)
{
  struct Lisp_Hash_Table *h = check_hash_table (table);
  struct h_iter it = {.h = h, .i = 0, .key = Qnil, .value = Qnil};
  return it;
}

/* Value is true if iterator IT is on a valid position.  If it is,
   IT->key and IT->value are set to key and value at that
   position.  */

static bool
h_valid (struct h_iter *it)
{
  for (; it->i < HASH_TABLE_SIZE (it->h); ++it->i)
    if (!hash_unused_entry_key_p (HASH_KEY (it->h, it->i)))
      {
	it->key = HASH_KEY (it->h, it->i);
	it->value = HASH_VALUE (it->h, it->i);
	return true;
      }
  return false;
}

/* Advance to next element.  */

static void
h_next (struct h_iter *it)
{
  ++it->i;
}

/* Macrology.  IT is a variable name that is bound to an iterator over
   hash table TABLE for the duration of the loop.  */

#define FOR_EACH_KEY_VALUE(it, table) \
  for (struct h_iter it = h_init (table); h_valid (&it); h_next (&it))






* bug#68690: Segmentation fault building with native-comp
  2024-01-24 23:59       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-01-25 10:26         ` Eli Zaretskii
  2024-01-26  2:43           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 29+ messages in thread
From: Eli Zaretskii @ 2024-01-25 10:26 UTC
  To: Stefan Monnier; +Cc: jm, 68690

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: jm@pub.pink,  68690@debbugs.gnu.org
> Date: Wed, 24 Jan 2024 18:59:44 -0500
> 
> > Here's the backtrace from GDB:
> >
> >   lisp.h:1784: Emacs fatal error: assertion failed: VECTORLIKEP (a)
> >
> >   Thread 1 hit Breakpoint 1, terminate_due_to_signal (sig=22,
> >       backtrace_limit=2147483647) at emacs.c:442
> >   442       signal (sig, SIG_DFL);
> >   (gdb) bt
> >   #0  terminate_due_to_signal (sig=22, backtrace_limit=2147483647) at emacs.c:442
> >   #1  0x00772401 in die (msg=0xddc80d <b_fwd+233> "VECTORLIKEP (a)",
> >       file=0xddc740 <b_fwd+28> "lisp.h", line=1784) at alloc.c:8062
> >   #2  0x00626a44 in XVECTOR (a=XIL(0x92348b000000000)) at lisp.h:1784
> >   #3  0x00626ace in gc_asize (array=XIL(0x92348b000000000)) at lisp.h:1800
> >   #4  0x00626bba in AREF (array=XIL(0x92348b000000000), idx=1) at lisp.h:1971
> >   #5  0x0063174d in Fcharset_after (pos=make_fixnum(113)) at charset.c:2084
> 
> Hmm... I can't reproduce it here (even with native-comp and
> `--with-wide-int`).

This build is without native-comp, but it's a 32-bit build.  Did you
try that?  I think that's the key to unlock this (see below).

> The above stack frame suggests it might be related
> to commit 33b8d5b6c5a (and hence unrelated to the original bug#68690
> which was a bug in `DOHASH`).
> 
> Any chance you can investigate what this `0x92348b000000000` is?

It's obviously a bogus value, since Lisp objects in this build should
have their high 32 bits zero except for the type tag in the MSBs.
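
As a rough illustration of that check, here is a standalone C sketch. It
assumes a 64-bit object word with a 3-bit type tag in the most
significant bits and a 32-bit pointer in the low half; the helper names
are hypothetical and this is not the lisp.h code. It applies only to
pointer-like objects, since fixnums may legitimately use the bits in
between.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 3 tag bits at the top of a 64-bit word.  */
enum { TOY_TAG_BITS = 3, TOY_WORD_BITS = 64 };

/* Return the tag held in the top TOY_TAG_BITS bits of V.  */
static unsigned
toy_tag (uint64_t v)
{
  return (unsigned) (v >> (TOY_WORD_BITS - TOY_TAG_BITS));
}

/* True if V could be a tagged 32-bit pointer, i.e. every bit between
   the 32-bit pointer and the tag (bits 32..60) is zero.  */
static bool
toy_plausible_pointer_word (uint64_t v)
{
  uint64_t mask = (UINT64_C (1) << (32 - TOY_TAG_BITS)) - 1;
  return ((v >> 32) & mask) == 0;
}
```

With these helpers, `0xa000000009023d88` (the attributes value seen in
temacs below) passes with tag 5, while `0x92348b000000000` fails because
its upper half carries pointer-like bits.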

> It should be a charset's attributes and the "idx=1" is because
> we're using `CHARSET_ATTR_NAME` to extract the name.

It sounds like we are not dumping the charset attributes correctly,
and that also corrupts all the fields of a struct charset following
the attributes.  Here's this charset in temacs:

  Thread 1 hit Breakpoint 2, dump_charset (ctx=0x5f6dad0, cs_i=0)
      at pdumper.c:3224
  3224      dump_field_lv (ctx, &out, cs, &cs->attributes, WEIGHT_NORMAL);
  (gdb) p cs
  $1 = (const struct charset *) 0x1050de0 <charset_table_init>
  (gdb) p *cs
  $2 = {
    id = 0,
    attributes = XIL(0xa000000009023d88),
    dimension = 1,
    code_space = {0, 127, 128, 128, 0, 0, 1, 128, 0, 0, 1, 128, 0, 0, 1},
    code_space_mask = 0x0,
    code_linear_p = 1,
    iso_chars_96 = 0,
    ascii_compatible_p = 1,
    supplementary_p = 0,
    compact_codes_p = 1,
    unified_p = 0,
    iso_final = 66,
    iso_revision = -1,
    emacs_mule_id = 0,
    method = CHARSET_METHOD_OFFSET,
    min_code = 0,
    max_code = 127,
    char_index_offset = 0,
    min_char = 0,
    max_char = 127,
    invalid_code = 128,
    fast_map = "\001", '\000' <repeats 188 times>,
    code_offset = 0
  }
  (gdb) p cs->attributes
  $3 = XIL(0xa000000009023d88)
  (gdb) xtype
  Lisp_Vectorlike
  PVEC_NORMAL_VECTOR
  (gdb) xvector
  $4 = (struct Lisp_Vector *) 0x9023d88
  {make_fixnum(0), XIL(0x2ca0), XIL(0xc0000000091014e0), XIL(0), XIL(0), XIL(0),
    XIL(0), XIL(0), XIL(0), XIL(0)}
  (gdb) p AREF(cs->attributes,1)
  $5 = 11424
  (gdb) xtype
  Lisp_Symbol
  (gdb) xsymbol
  $6 = (struct Lisp_Symbol *) 0x10beda0 <lispsym+11424>
  "ascii"

Looks entirely reasonable, and is the ASCII charset (makes sense since
the ID is zero).

And here's the same charset in emacs, after we restore from dump:

  #5  0x0063174d in Fcharset_after (pos=make_fixnum(113)) at charset.c:2084
  2084      return (CHARSET_NAME (charset));
  (gdb) p charset
  $1 = (struct charset *) 0x9100064
  (gdb) p *charset
  $2 = {
    id = 0,
    attributes = XIL(0x92848b000000000),
    dimension = -1610612736,
    code_space = {1, 0, 127, 128, 128, 0, 0, 1, 128, 0, 0, 1, 128, 0, 0},
    code_space_mask = 0x1 <error: Cannot access memory at address 0x1>,
    code_linear_p = 0,
    iso_chars_96 = 0,
    ascii_compatible_p = 0,
    supplementary_p = 0,
    compact_codes_p = 0,
    unified_p = 0,
    iso_final = 21,
    iso_revision = 66,
    emacs_mule_id = -1,
    method = CHARSET_METHOD_OFFSET,
    min_code = 0,
    max_code = 0,
    char_index_offset = 127,
    min_char = 0,
    max_char = 0,
    invalid_code = 127,
    fast_map = "\200\000\000\000\001", '\000' <repeats 184 times>,
    code_offset = 0
  }

Note that the attributes are bogus (zero-extended on the right to 64
bits), and all the fields after that are shifted (by 32 bits, I'm
guessing).

So I think we fail to dump the attributes, and my guess is that this
is related to the fact that in this build a pointer is 32-bit wide,
but a Lisp object is a 64-bit data type.

I tried to figure out what is wrong with how we dump this new field,
but got lost in the proverbial twisty little passages of pdumper.c,
all alike.  For example, I cannot understand why some fields which are
Lisp objects are dumped with dump_field_lv while others with
dump_field_lv_or_rawptr, and what is the significance of WEIGHT_NORMAL
vs WEIGHT_STRONG.  Hopefully, the above gives enough information for
you to figure this out.

TIA





^ permalink raw reply	[flat|nested] 29+ messages in thread

* bug#68690: Segmentation fault building with native-comp
  2024-01-25  8:33       ` Gerd Möllmann
@ 2024-01-25 15:58         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 29+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-01-25 15:58 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: eliz, jm, 68690

>> I wonder if a puthash while being in a DOHASH (which is the ASAN failure
>> I showed) is something we should pursue. I don't think that's something
>> that's guaranteed to work in a meaningful way. WDYT?

The original DOHASH's comment indeed said it didn't support that
operation, yet the code used DOHASH to implement `maphash`, which *does*
support such operations, and it used DOHASH in places which perform such
operations, so I think it's clear we do want to support `puthash` there.

> BTW, I'm using the code below for CL packages, which have a hash table.
> A bit less hideous ;-)

Nice.

The motivation for the change from `DOHASH (h, i)` to `DOHASH (h, k, v)`
was not only to offer cleaner code but also to avoid reloading
`h->key_and_value` and `h->table_size` at every iteration
(`h->key_and_value` is particularly annoying because it's on the
critical path to load `key` and `value`).


        Stefan






^ permalink raw reply	[flat|nested] 29+ messages in thread

* bug#68690: Segmentation fault building with native-comp
  2024-01-24 14:36 bug#68690: Segmentation fault building with native-comp john muhl via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-01-24 17:10 ` Eli Zaretskii
@ 2024-01-25 18:12 ` Mattias Engdegård
  2024-01-25 22:39   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 1 reply; 29+ messages in thread
From: Mattias Engdegård @ 2024-01-25 18:12 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Gerd Möllmann, Eli Zaretskii, jm, 68690

[-- Attachment #1: Type: text/plain, Size: 1060 bytes --]

> The original DOHASH's comment indeed said it didn't support that
> operation, yet the code used DOHASH to implement `maphash`, which *does*
> support such operations, and it used DOHASH in places which perform such
> operations, so I think it's clear we do want to support `puthash` there.

Sorry, my fault -- indeed maphash 'supports' irregular mutation in the sense that it shouldn't crash or corrupt Emacs if the rules are violated. I can't reproduce the reported crash(es) on my platform but is my understanding correct that no other uses of DOHASH caused any trouble?

This patch reverts my last change to Fmaphash and yours to DOHASH. It's perfectly fine to forego DOHASH in Fmaphash, it's chums with the hash-table implementation. Assuming that the problems were confined to Fmaphash, this should be safe to apply.

What I certainly would accept is an assertion in DOHASH that verifies the assumptions but doesn't result in any code at all with checking disabled. I'll add that if you think it's warranted (and maybe even if you don't).



[-- Attachment #2: 0001-Revert-to-fast-and-simple-DOHASH-keeping-Fmaphash-ro.patch --]
[-- Type: application/octet-stream, Size: 4139 bytes --]

From 1b44dc419c55cba7e41e8fd8376ebfbae12f04e4 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org>
Date: Thu, 25 Jan 2024 18:56:03 +0100
Subject: [PATCH] Revert to fast and simple DOHASH keeping Fmaphash robust
 (bug#68690)

`maphash` mustn't crash if the supplied function does odd things but
that doesn't mean we have to make DOHASH more expensive.
This change essentially reverts ad004f10f3 and the Fmaphash part of
fec87a4b36.

* src/lisp.h (DOHASH): Go back to fast design with more rules.
* src/fns.c (Fmaphash): Ditch DOHASH in favour of explicit loop.
---
 src/fns.c  | 11 +++++++++--
 src/lisp.h | 38 ++++++++++++++------------------------
 2 files changed, 23 insertions(+), 26 deletions(-)

diff --git a/src/fns.c b/src/fns.c
index 859df6748f7..519d85df288 100644
--- a/src/fns.c
+++ b/src/fns.c
@@ -5662,8 +5662,15 @@ DEFUN ("maphash", Fmaphash, Smaphash, 2, 2, 0,
   (Lisp_Object function, Lisp_Object table)
 {
   struct Lisp_Hash_Table *h = check_hash_table (table);
-  DOHASH (h, k, v)
-    call2 (function, k, v);
+  /* We can't use DOHASH here since FUNCTION may violate the rules and
+     we shouldn't crash as a result (although the effects are
+     unpredictable).  */
+  for (ptrdiff_t i = 0; i < HASH_TABLE_SIZE (h); i++)
+    {
+      Lisp_Object k = HASH_KEY (h, i);
+      if (!hash_unused_entry_key_p (k))
+        call2 (function, k, HASH_VALUE (h, i));
+    }
   return Qnil;
 }
 
diff --git a/src/lisp.h b/src/lisp.h
index d07d9d14e2f..f822417ffb1 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -2604,30 +2604,20 @@ hash_from_key (struct Lisp_Hash_Table *h, Lisp_Object key)
 }
 
 /* Iterate K and V as key and value of valid entries in hash table H.
-   The body may mutate the hash-table.  */
-#define DOHASH(h, k, v)							 \
-  for (Lisp_Object *dohash_##k##_##v##_base = (h)->key_and_value,	 \
-                   *dohash_##k##_##v##_kv   = dohash_##k##_##v##_base,	 \
-                   *dohash_##k##_##v##_end  = dohash_##k##_##v##_base	 \
-                                              + 2 * HASH_TABLE_SIZE (h), \
-                   k, v;						 \
-       dohash_##k##_##v##_kv < dohash_##k##_##v##_end			 \
-       && (dohash_##k##_##v##_base == (h)->key_and_value                 \
-           /* The `key_and_value` table has been reallocated!  */        \
-           || (dohash_##k##_##v##_kv                                     \
-                  = (dohash_##k##_##v##_kv - dohash_##k##_##v##_base)	 \
-                    + (h)->key_and_value,                                \
-               dohash_##k##_##v##_base = (h)->key_and_value,             \
-               dohash_##k##_##v##_end  = dohash_##k##_##v##_base	 \
-                                         + 2 * HASH_TABLE_SIZE (h),      \
-               /* Check again, in case the table has shrunk.  */         \
-               dohash_##k##_##v##_kv < dohash_##k##_##v##_end))          \
-       && (k = dohash_##k##_##v##_kv[0],                                 \
-           v = dohash_##k##_##v##_kv[1], /*maybe unused*/ (void)v,       \
-           true);			                                 \
-        dohash_##k##_##v##_kv += 2)				         \
-    if (hash_unused_entry_key_p (k))				         \
-      ;								         \
+   The body may remove the current entry or alter its value slot, but not
+   mutate TABLE in any other way.  */
+#define DOHASH(h, k, v)							\
+  for (Lisp_Object *dohash_##k##_##v##_kv = (h)->key_and_value,		\
+                   *dohash_##k##_##v##_end = dohash_##k##_##v##_kv	\
+                                             + 2 * HASH_TABLE_SIZE (h),	\
+                   k, v;						\
+       dohash_##k##_##v##_kv < dohash_##k##_##v##_end			\
+       && (k = dohash_##k##_##v##_kv[0],				\
+           v = dohash_##k##_##v##_kv[1], /*maybe unsed*/ (void)v,       \
+           true);			                                \
+        dohash_##k##_##v##_kv += 2)					\
+    if (hash_unused_entry_key_p (k))					\
+      ;									\
     else
 
 
-- 
2.32.0 (Apple Git-132)


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* bug#68690: Segmentation fault building with native-comp
  2024-01-25 18:12 ` Mattias Engdegård
@ 2024-01-25 22:39   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-01-26 16:07     ` Mattias Engdegård
  0 siblings, 1 reply; 29+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-01-25 22:39 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: Gerd Möllmann, Eli Zaretskii, jm, 68690

> Sorry, my fault -- indeed maphash 'supports' irregular mutation in the sense
> that it shouldn't crash or corrupt Emacs if the rules are violated. I can't
> reproduce the reported crash(es) on my platform but is my understanding
> correct that no other uses of DOHASH caused any trouble?

AFAIK the current DOHASH code in `master` works fine (tho a bit ugly).
The remaining build failure that Eli is seeing seems unrelated.

> This patch reverts my last change to Fmaphash and yours to DOHASH. It's
> perfectly fine to forego DOHASH in Fmaphash, it's chums with the hash-table
> implementation. Assuming that the problems were confined to Fmaphash, this
> should be safe to apply.

The build failure didn't come via `maphash` but via the DOHASH in
`comp.c` that calls `compile_function` (which apparently can cause the
hash table to be resized).
So `maphash` is clearly not the only "offender".


        Stefan






^ permalink raw reply	[flat|nested] 29+ messages in thread

* bug#68690: Segmentation fault building with native-comp
  2024-01-25 10:26         ` Eli Zaretskii
@ 2024-01-26  2:43           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-01-26  8:40             ` Eli Zaretskii
                               ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-01-26  2:43 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: jm, 68690

>> Hmm... I can't reproduce it here (even with native-comp and
>> `--with-wide-int`).
> This build is without native-comp, but it's a 32-bit build.  Did you
> try that?  I think that's the key to unlock this (see below).

I tried 32bit with and without native-comp, and with and without wide-int,
but can't reproduce it here.  Maybe it only manifests itself under w32?

In your message I see it crashes compiling `mwheel.el`; is that the
first place where it crashes?  Does it crash on most other files as
well?  In interactive use?

>> The above stack frame suggests it might be related
>> to commit 33b8d5b6c5a (and hence unrelated to the original bug#68690
>> which was a bug in `DOHASH`).
>> Any chance you can investigate what is this `0x92348b000000000`?
> It's obviously a bogus value, since Lisp objects in this build should
> have their high 32 bits zero except for the type tag in the MSBs.

Indeed.

>> It should be a charset's attributes and the "idx=1" is because
>> we're using `CHARSET_ATTR_NAME` to extract the name.
> It sounds like we are not dumping the charset attributes correctly,
> and that also corrupts all the fields of a struct charset following
> the attributes.  Here's this charset in temacs:
[...]
>   (gdb) p cs->attributes
>   $3 = XIL(0xa000000009023d88)
[...]
> And here's the same charset in emacs, after we restore from dump:
[...]
>     attributes = XIL(0x92848b000000000),

Yup, sure looks like the bytes got shifted by 4 bytes for some reason.

> I tried to figure out what is wrong with how we dump this new field,
> but got lost in the proverbial twisty little passages of pdumper.c,
> all alike.

🙁

> For example, I cannot understand why some fields which are
> Lisp objects are dumped with dump_field_lv while others with
> dump_field_lv_or_rawptr, and what is the significance of WEIGHT_NORMAL
> vs WEIGHT_STRONG.  Hopefully, the above gives enough information for
> you to figure this out.

I'm just as lost as you are in pdumper.c, sadly.


        Stefan






^ permalink raw reply	[flat|nested] 29+ messages in thread

* bug#68690: Segmentation fault building with native-comp
  2024-01-26  2:43           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-01-26  8:40             ` Eli Zaretskii
  2024-01-26  9:26             ` Gerd Möllmann
  2024-01-26 10:18             ` Andreas Schwab
  2 siblings, 0 replies; 29+ messages in thread
From: Eli Zaretskii @ 2024-01-26  8:40 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: jm, 68690

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: jm@pub.pink,  68690@debbugs.gnu.org
> Date: Thu, 25 Jan 2024 21:43:01 -0500
> 
> >> Hmm... I can't reproduce it here (even with native-comp and
> >> `--with-wide-int`).
> > This build is without native-comp, but it's a 32-bit build.  Did you
> > try that?  I think that's the key to unlock this (see below).
> 
> I tried 32bit with and without native-comp, and with and without wide-int,
> but can't reproduce it here.  Maybe it only manifests itself under w32?

It could be, but I cannot imagine why the way we dump charsets has
anything to do with w32.

> In your message I see it crashes compiling `mwheel.el`; is that the
> first place where it crashes?

"First place" in what sense?

What happens in the build is that temacs is built and dumped to
produce bootstrap-emacs, then bootstrap-emacs starts compiling Lisp
files and crashes on the first one it tries to compile, with the
backtrace I posted.

> Does it crash on most other files as well?

It seems to crash with any Lisp file I try.  It also crashes in this
command:

  '../src/emacs.exe' -batch --no-site-file --no-site-lisp \
	      -l ./emacs-lisp/loaddefs-gen.elc \
      -f loaddefs-generate--emacs-batch . ./calc ./calendar ./cedet ./cedet/ede ./cedet/semantic ./cedet/semantic/analyze ./cedet/semantic/bovine ./cedet/semantic/decorate ./cedet/semantic/symref ./cedet/semantic/wisent ./cedet/srecode ./emacs-lisp ./emulation ./erc ./eshell ./gnus ./image ./international ./language ./leim ./leim/ja-dic ./leim/quail ./mail ./mh-e ./net ./nxml ./org ./play ./progmodes ./textmodes ./url ./use-package ./vc

> In interactive use?

If I invoke bootstrap-emacs interactively, it crashes during startup,
in window-system-initialization.  The backtrace for that is below, and
it again seems to show a bogus value of charset attributes.

> >> The above stack frame suggests it might be related
> >> to commit 33b8d5b6c5a (and hence unrelated to the original bug#68690
> >> which was a bug in `DOHASH`).
> >> Any chance you can investigate what is this `0x92348b000000000`?
> > It's obviously a bogus value, since Lisp objects in this build should
> > have their high 32 bits zero except for the type tag in the MSBs.
> 
> Indeed.
> 
> >> It should be a charset's attributes and the "idx=1" is because
> >> we're using `CHARSET_ATTR_NAME` to extract the name.
> > It sounds like we are not dumping the charset attributes correctly,
> > and that also corrupts all the fields of a struct charset following
> > the attributes.  Here's this charset in temacs:
> [...]
> >   (gdb) p cs->attributes
> >   $3 = XIL(0xa000000009023d88)
> [...]
> > And here's the same charset in emacs, after we restore from dump:
> [...]
> >     attributes = XIL(0x92848b000000000),
> 
> Yup, sure looks like the bytes got shifted by 4 bytes for some reason.
> 
> > I tried to figure out what is wrong with how we dump this new field,
> > but got lost in the proverbial twisty little passages of pdumper.c,
> > all alike.
> 
> 🙁
> 
> > For example, I cannot understand why some fields which are
> > Lisp objects are dumped with dump_field_lv while others with
> > dump_field_lv_or_rawptr, and what is the significance of WEIGHT_NORMAL
> > vs WEIGHT_STRONG.  Hopefully, the above gives enough information for
> > you to figure this out.
> 
> I'm just as lost as you are in pdumper.c, sadly.

Well, can I provide more info for you to try to figure out what's
wrong?  E.g., why and how did you decide to use dump_field_lv for
dumping the charset's attributes (which is a Lisp vector)?
Alternatively, any ideas for how I should proceed to debug this
myself?

See, I cannot afford to have the master branch be broken for me for
prolonged periods of time, as that prevents me from doing my job of
installing patches by others and testing various issues.  We must fix
this build, or face the need to revert the charset-related changes for
now.

Here's the backtrace during startup when bootstrap-emacs is invoked
with -Q:

  lisp.h:1784: Emacs fatal error: assertion failed: VECTORLIKEP (a)

  Thread 1 hit Breakpoint 1, terminate_due_to_signal (sig=22,
      backtrace_limit=2147483647) at emacs.c:442
  442       signal (sig, SIG_DFL);
  (gdb) bt
  #0  terminate_due_to_signal (sig=22, backtrace_limit=2147483647) at emacs.c:442
  #1  0x00802435 in die (msg=0xe6c80d <b_fwd+233> "VECTORLIKEP (a)",
      file=0xe6c740 <b_fwd+28> "lisp.h", line=1784) at alloc.c:8062
  #2  0x006b6a78 in XVECTOR (a=XIL(0x926755800000000)) at lisp.h:1784
  #3  0x006b6b02 in gc_asize (array=XIL(0x926755800000000)) at lisp.h:1800
  #4  0x006b6bee in AREF (array=XIL(0x926755800000000), idx=7) at lisp.h:1971
  #5  0x006ba13f in map_charset_chars (c_function=0x9a7fa1 <set_fontset_font>,
      function=XIL(0), arg=XIL(0xa000000009140a08), charset=0x91d912c, from=0,
      to=33) at charset.c:775
  #6  0x009a9cf2 in Fset_fontset_font (fontset=XIL(0x800000000912a370),
      characters=XIL(0xa3e0), font_spec=XIL(0xa000000009122ed0),
      frame=XIL(0xa0000000090d9de8), add=XIL(0x2a90)) at fontset.c:1668
  #7  0x009aa8f2 in Fnew_fontset (name=XIL(0x800000000912a370),
      fontlist=XIL(0xc000000005ed6570)) at fontset.c:1786
  #8  0x0084a583 in funcall_subr (subr=0xe5a5c0 <Snew_fontset>, numargs=2,
      args=0x9cd7250) at eval.c:3092
  #9  0x008bd5eb in exec_byte_code (fun=XIL(0xa000000009576978),
      args_template=0, nargs=0, args=0x9cd7208) at bytecode.c:815
  #10 0x0084ab2e in fetch_and_exec_byte_code (fun=XIL(0xa000000009279ff0),
      args_template=256, nargs=0, args=0x9cd7148) at eval.c:3135
  #11 0x0084b08d in funcall_lambda (fun=XIL(0xa000000009279ff0), nargs=0,
      arg_vector=0x9cd7148) at eval.c:3207
  #12 0x00849eb1 in funcall_general (fun=XIL(0xa000000009279ff0), numargs=0,
      args=0x9cd7148) at eval.c:2972
  #13 0x0084a236 in Ffuncall (nargs=1, args=0x9cd7140) at eval.c:3022
  #14 0x00848c4a in Fapply (nargs=2, args=0x9cd7140) at eval.c:2646
  #15 0x0084a9df in funcall_subr (subr=0xe51f40 <Sapply>, numargs=2,
      args=0x9cd7140) at eval.c:3113
  #16 0x008bd5eb in exec_byte_code (fun=XIL(0xa0000000095b4058),
      args_template=770, nargs=3, args=0x9cd7390) at bytecode.c:815
  #17 0x0084ab2e in fetch_and_exec_byte_code (fun=XIL(0xa00000000945dc90),
      args_template=0, nargs=0, args=0x5e5f5a0) at eval.c:3135
  #18 0x0084b08d in funcall_lambda (fun=XIL(0xa00000000945dc90), nargs=0,
      arg_vector=0x5e5f5a0) at eval.c:3207
  #19 0x0084acdd in apply_lambda (fun=XIL(0xa00000000945dc90), args=XIL(0),
      count=128) at eval.c:3157
  #20 0x0084861a in eval_sub (form=XIL(0xc0000000098d6248)) at eval.c:2572
  #21 0x008475ed in Feval (form=XIL(0xc0000000098d6248), lexical=XIL(0x30))
      at eval.c:2389
  #22 0x00740220 in top_level_2 () at keyboard.c:1173
  #23 0x00844564 in internal_condition_case (bfun=0x740197 <top_level_2>,
      handlers=XIL(0x90), hfun=0x73f6f6 <cmd_error>) at eval.c:1537
  #24 0x007402b3 in top_level_1 (ignore=XIL(0)) at keyboard.c:1185
  #25 0x008431d2 in internal_catch (tag=XIL(0x10a40),
      func=0x74023f <top_level_1>, arg=XIL(0)) at eval.c:1217
  #26 0x0074008a in command_loop () at keyboard.c:1134
  #27 0x0073f156 in recursive_edit_1 () at keyboard.c:744
  #28 0x0073f3f4 in Frecursive_edit () at keyboard.c:827
  #29 0x0073a400 in main (argc=2, argv=0x1f2570) at emacs.c:2624

  Lisp Backtrace:
  "new-fontset" (0x9cd7250)
  "setup-default-fontset" (0x9cd7208)
  "create-default-fontset" (0x9cd71c0)
  0x9279ff0 PVEC_COMPILED
  "apply" (0x9cd7140)
  "window-system-initialization" (0x9cd70b8)
  "command-line" (0x9cd7048)
  "normal-top-level" (0x5e5f5a0)
  (gdb) fr 5
  #5  0x006ba13f in map_charset_chars (c_function=0x9a7fa1 <set_fontset_font>,
      function=XIL(0), arg=XIL(0xa000000009140a08), charset=0x91d912c, from=0,
      to=33) at charset.c:775
  775           for (parents = CHARSET_SUPERSET (charset); CONSP (parents);
  (gdb) p charset
  $1 = (struct charset *) 0x91d912c
  (gdb) p *$
  $2 = {
    id = 0,
    attributes = XIL(0x926755800000000),
    dimension = -1610612736,
    code_space = {1, 33, 126, 94, 94, 0, 0, 1, 94, 0, 0, 1, 94, 0, 0},
    code_space_mask = 0x1 <error: Cannot access memory at address 0x1>,
    code_linear_p = 0,
    iso_chars_96 = 0,
    ascii_compatible_p = 0,
    supplementary_p = 0,
    compact_codes_p = 0,
    unified_p = 0,
    iso_final = 25,
    iso_revision = 49,
    emacs_mule_id = -1,
    method = 167,
    min_code = 0,
    max_code = 33,
    char_index_offset = 126,
    min_char = 0,
    max_char = 3713,
    invalid_code = 3806,
    fast_map = "\000\000\000\000\001\000\000 ", '\000' <repeats 181 times>,
    code_offset = 0
  }
  (gdb)





^ permalink raw reply	[flat|nested] 29+ messages in thread

* bug#68690: Segmentation fault building with native-comp
  2024-01-26  2:43           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-01-26  8:40             ` Eli Zaretskii
@ 2024-01-26  9:26             ` Gerd Möllmann
  2024-01-26 13:48               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-01-26 14:30               ` Eli Zaretskii
  2024-01-26 10:18             ` Andreas Schwab
  2 siblings, 2 replies; 29+ messages in thread
From: Gerd Möllmann @ 2024-01-26  9:26 UTC (permalink / raw)
  To: 68690; +Cc: eliz, jm, monnier

Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of
text editors" <bug-gnu-emacs@gnu.org> writes:

>> For example, I cannot understand why some fields which are
>> Lisp objects are dumped with dump_field_lv while others with
>> dump_field_lv_or_rawptr, and what is the significance of WEIGHT_NORMAL
>> vs WEIGHT_STRONG.  Hopefully, the above gives enough information for
>> you to figure this out.
>
> I'm just as lost as you are in pdumper.c, sadly.

I remembered seeing something in pdumper.c that could be related, namely

  /* Start the cold section.  This section contains bytes that should
     never change and so can be direct-mapped from the dump without
     special processing.  */
  dump_drain_cold_data (ctx);

And if you follow that function you'll see that it treats charsets
specially.

I find the comment about directly mapping very suspicious, when the
charset contains a Lisp_Object, possibly requiring relocation. But it
could well be that I misunderstand something here.





^ permalink raw reply	[flat|nested] 29+ messages in thread

* bug#68690: Segmentation fault building with native-comp
  2024-01-26  2:43           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-01-26  8:40             ` Eli Zaretskii
  2024-01-26  9:26             ` Gerd Möllmann
@ 2024-01-26 10:18             ` Andreas Schwab
  2024-01-26 13:49               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2 siblings, 1 reply; 29+ messages in thread
From: Andreas Schwab @ 2024-01-26 10:18 UTC (permalink / raw)
  To: 68690; +Cc: eliz, jm, monnier

On Jan 25 2024, Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of text editors" wrote:

>>> Hmm... I can't reproduce it here (even with native-comp and
>>> `--with-wide-int`).
>> This build is without native-comp, but it's a 32-bit build.  Did you
>> try that?  I think that's the key to unlock this (see below).
>
> I tried 32bit with and without native-comp, and with and without wide-int,
> but can't reproduce it here.  Maybe it only manifests itself under w32?

https://build.opensuse.org/package/live_build_log/home:AndreasSchwab:emacs:master/emacs/13.1/ppc
https://build.opensuse.org/package/live_build_log/home:AndreasSchwab:emacs:master/emacs/a/armv6l
https://build.opensuse.org/package/live_build_log/home:AndreasSchwab:emacs:master/emacs/a/armv7l

They are all 32-bit with wide-int (with and without native-comp).  And
the crash happens since 33b8d5b6c5a.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."





^ permalink raw reply	[flat|nested] 29+ messages in thread

* bug#68690: Segmentation fault building with native-comp
  2024-01-26  9:26             ` Gerd Möllmann
@ 2024-01-26 13:48               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-01-26 14:36                 ` Eli Zaretskii
  2024-01-26 14:30               ` Eli Zaretskii
  1 sibling, 1 reply; 29+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-01-26 13:48 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: 68690, jm, eliz

[-- Attachment #1: Type: text/plain, Size: 643 bytes --]

> I remembered seeing something in pdumper.c that could be related, namely
>
>   /* Start the cold section.  This section contains bytes that should
>      never change and so can be direct-mapped from the dump without
>      special processing.  */
>   dump_drain_cold_data (ctx);
>
> And if you follow that function you'll see that it treats charsets
> specially.
>
> I find the comment about directly mapping very suspicious, when the
> charset contains a Lisp_Object, possibly requiring relocation. But it
> could well be that I misunderstand something here.

Hmm... would a patch like the one below fix the problem, then?


        Stefan

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: cold-charset.patch --]
[-- Type: text/x-diff, Size: 1852 bytes --]

diff --git a/src/pdumper.c b/src/pdumper.c
index f42d1777371..56177d3fd89 100644
--- a/src/pdumper.c
+++ b/src/pdumper.c
@@ -440,7 +440,6 @@ dump_fingerprint (FILE *output, char const *label,
   {
     COLD_OP_OBJECT,
     COLD_OP_STRING,
-    COLD_OP_CHARSET,
     COLD_OP_BUFFER,
     COLD_OP_BIGNUM,
     COLD_OP_NATIVE_SUBR,
@@ -3245,10 +3244,6 @@ dump_charset (struct dump_context *ctx, int cs_i)
   memcpy (out.fast_map, &cs->fast_map, sizeof (cs->fast_map));
   DUMP_FIELD_COPY (&out, cs, code_offset);
   dump_off offset = dump_object_finish (ctx, &out, sizeof (out));
-  if (cs_i < charset_table_used && cs->code_space_mask)
-    dump_remember_cold_op (ctx, COLD_OP_CHARSET,
-                           Fcons (dump_off_to_lisp (cs_i),
-                                  dump_off_to_lisp (offset)));
   return offset;
 }
 
@@ -3402,20 +3397,6 @@ dump_cold_string (struct dump_context *ctx, Lisp_Object string)
   dump_write (ctx, XSTRING (string)->u.s.data, total_size);
 }
 
-static void
-dump_cold_charset (struct dump_context *ctx, Lisp_Object data)
-{
-  /* Dump charset lookup tables.  */
-  int cs_i = XFIXNUM (XCAR (data));
-  dump_off cs_dump_offset = dump_off_from_lisp (XCDR (data));
-  dump_remember_fixup_ptr_raw
-    (ctx,
-     cs_dump_offset + dump_offsetof (struct charset, code_space_mask),
-     ctx->offset);
-  struct charset *cs = charset_table + cs_i;
-  dump_write (ctx, cs->code_space_mask, 256);
-}
-
 static void
 dump_cold_buffer (struct dump_context *ctx, Lisp_Object data)
 {
@@ -3509,9 +3490,6 @@ dump_drain_cold_data (struct dump_context *ctx)
         case COLD_OP_STRING:
           dump_cold_string (ctx, data);
           break;
-        case COLD_OP_CHARSET:
-          dump_cold_charset (ctx, data);
-          break;
         case COLD_OP_BUFFER:
           dump_cold_buffer (ctx, data);
           break;

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* bug#68690: Segmentation fault building with native-comp
  2024-01-26 10:18             ` Andreas Schwab
@ 2024-01-26 13:49               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-01-26 14:50                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 29+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-01-26 13:49 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: 68690, jm, eliz

>>>> Hmm... I can't reproduce it here (even with native-comp and
>>>> `--with-wide-int`).
>>> This build is without native-comp, but it's a 32-bit build.  Did you
>>> try that?  I think that's the key to unlock this (see below).
>>
>> I tried 32bit with and without native-comp, and with and without wide-int,
>> but can't reproduce it here.  Maybe it only manifests itself under w32?
>
> https://build.opensuse.org/package/live_build_log/home:AndreasSchwab:emacs:master/emacs/13.1/ppc
> https://build.opensuse.org/package/live_build_log/home:AndreasSchwab:emacs:master/emacs/a/armv6l
> https://build.opensuse.org/package/live_build_log/home:AndreasSchwab:emacs:master/emacs/a/armv7l
>
> They are all 32-bit with wide-int (with and without native-comp).  And
> the crash happens since 33b8d5b6c5a.

Thanks Andreas.  So clearly it's not just w32.


        Stefan






^ permalink raw reply	[flat|nested] 29+ messages in thread

* bug#68690: Segmentation fault building with native-comp
  2024-01-26  9:26             ` Gerd Möllmann
  2024-01-26 13:48               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-01-26 14:30               ` Eli Zaretskii
  2024-01-26 14:47                 ` Gerd Möllmann
  1 sibling, 1 reply; 29+ messages in thread
From: Eli Zaretskii @ 2024-01-26 14:30 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: 68690, monnier

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: Eli Zaretskii <eliz@gnu.org>,  Stefan Monnier
>  <monnier@iro.umontreal.ca>,  jm@pub.pink,  68690@debbugs.gnu.org
> Date: Fri, 26 Jan 2024 10:26:00 +0100
> 
> Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of
> text editors" <bug-gnu-emacs@gnu.org> writes:
> 
> > I'm just as lost as you are in pdumper.c, sadly.
> 
> I remembered seeing something in pdumper.c that could be related, namely
> 
>   /* Start the cold section.  This section contains bytes that should
>      never change and so can be direct-mapped from the dump without
>      special processing.  */
>   dump_drain_cold_data (ctx);
> 
> And if you follow that function you'll see that it treats charsets
> specially.

AFAIU, that special handling is for dumping fields that are pointers.
For example, the string data in a Lisp string, buffer text in a
buffer, and the data pointed to by code_space_mask in a charset.

But the charset's attributes are not a pointer, they are a Lisp
vector.

Moreover, the offending charset (ID = 0) is not processed by
dump_cold_charset because its code_space_mask is NULL (which makes
sense since the dimension of the ASCII charset is 1).

> I find the comment about directly mapping very suspicious, when the
> charset contains a Lisp_Object, possibly requiring relocation. But it
> could well be that I misunderstand something here.

First, before Stefan's changes there were no Lisp objects in 'struct
charset'.

And second, what do you mean by "possibly requiring relocation"?  Do
you mean relocation after restoring from dump, or do you mean
relocation during dumping?  Or something else entirely?






* bug#68690: Segmentation fault building with native-comp
  2024-01-26 13:48               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-01-26 14:36                 ` Eli Zaretskii
  2024-01-26 15:51                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 29+ messages in thread
From: Eli Zaretskii @ 2024-01-26 14:36 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: gerd.moellmann, 68690

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of
>  text editors" <bug-gnu-emacs@gnu.org>,  Eli Zaretskii <eliz@gnu.org>,
>   jm@pub.pink,  68690@debbugs.gnu.org
> Date: Fri, 26 Jan 2024 08:48:46 -0500
> 
> > I remembered seeing something in pdumper.c that could be related, namely
> >
> >   /* Start the cold section.  This section contains bytes that should
> >      never change and so can be direct-mapped from the dump without
> >      special processing.  */
> >   dump_drain_cold_data (ctx);
> >
> > And if you follow that function you'll see that it treats charsets
> > specially.
> >
> > I find the comment about directly mapping very suspicious, when the
> > charset contains a Lisp_Object, possibly requiring relocation. But it
> > could well be that I misunderstand something here.
> 
> Hmm... would a patch like the one below fix the problem, then?

What is the logic behind this patch?

Anyway, the offending charset (ASCII) doesn't have a non-NULL
code_space_mask, so it is not processed by dump_cold_charset.  I
therefore doubt that this will have any effect on the problem.






* bug#68690: Segmentation fault building with native-comp
  2024-01-26 14:30               ` Eli Zaretskii
@ 2024-01-26 14:47                 ` Gerd Möllmann
  2024-01-26 14:55                   ` Eli Zaretskii
  0 siblings, 1 reply; 29+ messages in thread
From: Gerd Möllmann @ 2024-01-26 14:47 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 68690, monnier

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Gerd Möllmann <gerd.moellmann@gmail.com>
>> Cc: Eli Zaretskii <eliz@gnu.org>,  Stefan Monnier
>>  <monnier@iro.umontreal.ca>,  jm@pub.pink,  68690@debbugs.gnu.org
>> Date: Fri, 26 Jan 2024 10:26:00 +0100
>> 
>> Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of
>> text editors" <bug-gnu-emacs@gnu.org> writes:
>> 
>> > I'm just as lost as you are in pdumper.c, sadly.
>> 
>> I remembered seeing something in pdumper.c that could be related, namely
>> 
>>   /* Start the cold section.  This section contains bytes that should
>>      never change and so can be direct-mapped from the dump without
>>      special processing.  */
>>   dump_drain_cold_data (ctx);
>> 
>> And if you follow that function you'll see that it treats charsets
>> specially.
>
> AFAIU, that special handling is for dumping fields that are pointers.
> For example, the string data in a Lisp string, buffer text in a
> buffer, and the data pointed to by code_space_mask in a charset.
>
> But the charset's attributes are not a pointer, they are a Lisp
> vector.
>

We're probably talking about different things. I was talking about the
fact that struct charset, before Stefan's change, consisted of,
basically, integers only (no pointer, nothing), so that it could just be
dumped as-is, and, after loading the dump file, used as-is.

> Moreover, the offending charset (ID = 0) is not processed by
> dump_cold_charset because its code_space_mask is NULL (which makes
> sense since the dimension of the ASCII charset is 1).
>
>> I find the comment about directly mapping very suspicious, when the
>> charset contains a Lisp_Object, possibly requiring relocation. But it
>> could well be that I misunderstand something here.
>
> First, before Stefan's changes there were no Lisp objects in 'struct
> charset'.

My point.

> And second, what do you mean by "possibly requiring relocation"?  Do
> you mean relocation after restoring from dump, or do you mean
> relocation during dumping?  Or something else entirely?

Lisp_Object fields require writing something to the dump file that can
be used, when the dump is loaded, to compute the real value in the
new Emacs session. So, something is done when dumping, and when loading.






* bug#68690: Segmentation fault building with native-comp
  2024-01-26 13:49               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-01-26 14:50                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 29+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-01-26 14:50 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: 68690, jm, eliz

> Thanks Andreas.  So clearly it's not just w32.

Ha!  I can reproduce it on my `armhf` machine!


        Stefan








* bug#68690: Segmentation fault building with native-comp
  2024-01-26 14:47                 ` Gerd Möllmann
@ 2024-01-26 14:55                   ` Eli Zaretskii
  2024-01-27  0:08                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 29+ messages in thread
From: Eli Zaretskii @ 2024-01-26 14:55 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: 68690, monnier

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: 68690@debbugs.gnu.org,  monnier@iro.umontreal.ca
> Date: Fri, 26 Jan 2024 15:47:06 +0100
> 
> > And second, what do you mean by "possibly requiring relocation"?  Do
> > you mean relocation after restoring from dump, or do you mean
> > relocation during dumping?  Or something else entirely?
> 
> Lisp_Object fields require writing something to the dump file that can
> be used, when the dump is loaded, to compute the real value in the
> new Emacs session. So, something is done when dumping, and when loading.

Something _is_ being done, AFAIU.  If you step through dump_field_lv,
you will see that it dumps a placeholder (0xDEADF00D) instead of the
actual value, and records a "fixup" to be processed later.  When the
fixup is processed, it schedules a "relocation", which AFAIU is
supposed to replace the placeholder with the offset of the actual Lisp
object in the dump file.  So the machinery seems to be in place, it
just doesn't work somehow in this case...
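
The placeholder, fixup and relocation sequence described above can be
sketched in a few lines of C.  This is only a toy model under stated
assumptions: every toy_* name is hypothetical, the layout is a flat
byte buffer, and none of it is the actual pdumper.c code.  Only the
placeholder value 0xDEADF00D is taken from the description.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Marker written while the final value is still unknown.  */
#define TOY_PLACEHOLDER ((uintptr_t) 0xDEADF00D)

/* Step 1 (dumping): emit a placeholder for the field at FIELD_OFF.  */
void
toy_dump_field (unsigned char *dump, size_t field_off)
{
  uintptr_t word = TOY_PLACEHOLDER;
  memcpy (dump + field_off, &word, sizeof word);
}

/* Step 2 (fixup, still dumping): the target object's offset in the
   dump is now known, so overwrite the placeholder with it.  */
void
toy_apply_fixup (unsigned char *dump, size_t field_off, size_t target_off)
{
  uintptr_t word = (uintptr_t) target_off;
  memcpy (dump + field_off, &word, sizeof word);
}

/* Step 3 (loading): turn the stored offset into an address by adding
   the BASE at which the dump ended up in memory.  */
void
toy_apply_relocation (unsigned char *dump, size_t field_off, uintptr_t base)
{
  uintptr_t word;
  memcpy (&word, dump + field_off, sizeof word);
  word += base;
  memcpy (dump + field_off, &word, sizeof word);
}

/* Read back the word stored at FIELD_OFF.  */
uintptr_t
toy_read_field (const unsigned char *dump, size_t field_off)
{
  uintptr_t word;
  memcpy (&word, dump + field_off, sizeof word);
  return word;
}
```

In this model all three steps must agree on FIELD_OFF; the sketch
says nothing about how the real dumper computes that offset.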






* bug#68690: Segmentation fault building with native-comp
  2024-01-26 14:36                 ` Eli Zaretskii
@ 2024-01-26 15:51                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 29+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-01-26 15:51 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 68690

>> Hmm... would a patch like the one below fix the problem, then?

[ Tried it on armhf.  ]

> What is the logic behind this patch?

The logic was "shot in the dark".
As usual, that logic didn't give good results.


        Stefan







* bug#68690: Segmentation fault building with native-comp
  2024-01-25 22:39   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-01-26 16:07     ` Mattias Engdegård
  0 siblings, 0 replies; 29+ messages in thread
From: Mattias Engdegård @ 2024-01-26 16:07 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Gerd Möllmann, Eli Zaretskii, jm, 68690

25 jan. 2024 kl. 23.39 skrev Stefan Monnier <monnier@iro.umontreal.ca>:

> AFAIK the current DOHASH code in `master` works fine (tho a bit ugly).

I think it's worse (and more complex) than not caching anything at all, where the compiler will actually help out a bit, and it's slower than the fast caching version.

> The build failure didn't come via `maphash` but via the DOHASH in
> `comp.c` that calls `compile_function` (which apparently can cause the
> hash table to be resized).

Do you know which one? It's quite possible that that code wasn't written taking into account the problems of extending the table being iterated over. We should definitely ask Andrea.

Meanwhile, I suggest adding a DOHASH_SAFE (simple, safe) to be used in these cases, including Fmaphash, and using the (somewhat less simple, fast) DOHASH from my previous patch, with an extra assertion to catch mistakes elsewhere in checking builds.
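
The difference between the two iteration styles can be sketched with a toy growable table.  This is a minimal illustration, not the Emacs macros: struct toy_table and all toy_* functions are hypothetical, and plain assert stands in for whatever checking-build assertion would be used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_table { int *keys; size_t count; size_t capacity; };

/* Append KEY; growing the storage may move t->keys.  */
void
toy_add (struct toy_table *t, int key)
{
  if (t->count == t->capacity)
    {
      size_t new_cap = t->capacity ? 2 * t->capacity : 4;
      int *p = realloc (t->keys, new_cap * sizeof (int));
      if (!p)
        abort ();
      t->keys = p;
      t->capacity = new_cap;
    }
  t->keys[t->count++] = key;
}

typedef void (*toy_visitor) (struct toy_table *, int key, void *arg);

/* "Safe" style: re-read t->keys and t->count on every iteration, so
   the visitor may grow the table.  */
void
toy_iter_safe (struct toy_table *t, toy_visitor f, void *arg)
{
  for (size_t i = 0; i < t->count; i++)
    f (t, t->keys[i], arg);
}

/* "Fast" style: cache the storage pointer and count once, and assert
   after each call that the visitor did not resize the table.  */
void
toy_iter_fast (struct toy_table *t, toy_visitor f, void *arg)
{
  int *keys = t->keys;
  size_t n = t->count;
  for (size_t i = 0; i < n; i++)
    {
      f (t, keys[i], arg);
      assert (t->keys == keys && t->count == n);
    }
}

/* Visitor that only reads: add KEY to the long that ARG points to.  */
void
toy_sum (struct toy_table *t, int key, void *arg)
{
  (void) t;
  *(long *) arg += key;
}

/* Visitor that also grows the table: for each key below 10, append
   key + 10.  Only valid with toy_iter_safe.  */
void
toy_sum_and_grow (struct toy_table *t, int key, void *arg)
{
  *(long *) arg += key;
  if (key < 10)
    toy_add (t, key + 10);
}
```

Passing toy_sum_and_grow to toy_iter_fast would trip the assertion (or, with assertions disabled, read through a possibly stale pointer), which is the kind of mistake the extra check is meant to catch.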







* bug#68690: Segmentation fault building with native-comp
  2024-01-26 14:55                   ` Eli Zaretskii
@ 2024-01-27  0:08                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-01-27  4:07                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 29+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-01-27  0:08 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Gerd Möllmann, 68690

I've been spending a lot of time on this and still haven't found the
origin of the problem nor a fix.  Here's what I have currently, in
short:

    % src/bootstrap-emacs --batch -f batch-byte-compile lisp/ido.el 
    charset 0: a0000000f20e1328 => f1f2c070
    CHARSET_ATTRIBUTES(ID=0, cs=f1f2c064) = f20e132800000000 (@ f1f2c06c)
    
    lisp.h:1784: Emacs fatal error: assertion failed: VECTORLIKEP (a)

The first message comes from `dump_do_dump_relocation` when we load the
dump file and shows (I believe) the relocation we do for the
`attributes` field of the charset 0: we write the (relocated)
Lisp_Object value `a0000000f20e1328` at address `f1f2c070`.

The second message comes from `CHARSET_ATTRIBUTES` which tells us that
it reads the Lisp_Object value `f20e132800000000` at address `f1f2c06c`.

I sadly have no clue yet why there is this 4byte difference between
those two addresses (but it does explain the bogus Lisp_Object we get).
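
The two values in the messages above are consistent with a 4-byte
shift.  The following sketch reproduces the arithmetic under three
assumptions that are not stated in the output itself: the machine is
little-endian, a Lisp_Object is 8 bytes wide, and the 4 bytes just
before the write address are zero.  The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Store VALUE as 8 little-endian bytes at WRITE_OFF in a zeroed
   buffer, then read 8 little-endian bytes back from READ_OFF.
   Both offsets must be at most 24.  */
uint64_t
toy_write_then_read (uint64_t value, size_t write_off, size_t read_off)
{
  unsigned char mem[32] = {0};

  for (int i = 0; i < 8; i++)
    mem[write_off + i] = (unsigned char) (value >> (8 * i));

  uint64_t result = 0;
  for (int i = 0; i < 8; i++)
    result |= (uint64_t) mem[read_off + i] << (8 * i);
  return result;
}
```

With write_off = 4 and read_off = 0 (mirroring f1f2c070 versus
f1f2c06c), writing a0000000f20e1328 and reading back gives
f20e132800000000: the low half of the written value lands in the high
half of the value read.  This only shows how the bogus value arises
from the offset, not where the offset comes from.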


        Stefan







* bug#68690: Segmentation fault building with native-comp
  2024-01-27  0:08                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-01-27  4:07                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-01-27  7:50                         ` Eli Zaretskii
  0 siblings, 1 reply; 29+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-01-27  4:07 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Gerd Möllmann, 68690

> I sadly have no clue yet why there is this 4byte difference between
> those two addresses (but it does explain the bogus Lisp_Object we get).

OK, I think I found it.
Please try again and let me know if it fixes it for you.
Damn, that was a nasty bugger!


        Stefan







* bug#68690: Segmentation fault building with native-comp
  2024-01-27  4:07                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-01-27  7:50                         ` Eli Zaretskii
  2024-01-27 14:45                           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 29+ messages in thread
From: Eli Zaretskii @ 2024-01-27  7:50 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: gerd.moellmann, 68690

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Gerd Möllmann <gerd.moellmann@gmail.com>,
>   68690@debbugs.gnu.org
> Date: Fri, 26 Jan 2024 23:07:31 -0500
> 
> > I sadly have no clue yet why there is this 4byte difference between
> > those two addresses (but it does explain the bogus Lisp_Object we get).
> 
> OK, I think I found it.
> Please try again and let me know if it fixes it for you.
> Damn, that was a nasty bugger!

Thanks, the build works now.






* bug#68690: Segmentation fault building with native-comp
  2024-01-27  7:50                         ` Eli Zaretskii
@ 2024-01-27 14:45                           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 29+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-01-27 14:45 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 68690-done

>> > I sadly have no clue yet why there is this 4byte difference between
>> > those two addresses (but it does explain the bogus Lisp_Object we get).
>> OK, I think I found it.
>> Please try again and let me know if it fixes it for you.
>> Damn, that was a nasty bugger!
> Thanks, the build works now.

Great, closing.


        Stefan







end of thread, other threads:[~2024-01-27 14:45 UTC | newest]

Thread overview: 29+ messages
2024-01-24 14:36 bug#68690: Segmentation fault building with native-comp john muhl via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-01-24 17:10 ` Eli Zaretskii
2024-01-24 19:52   ` Gerd Möllmann
2024-01-24 19:56   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-01-24 20:27     ` Eli Zaretskii
2024-01-24 23:59       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-01-25 10:26         ` Eli Zaretskii
2024-01-26  2:43           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-01-26  8:40             ` Eli Zaretskii
2024-01-26  9:26             ` Gerd Möllmann
2024-01-26 13:48               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-01-26 14:36                 ` Eli Zaretskii
2024-01-26 15:51                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-01-26 14:30               ` Eli Zaretskii
2024-01-26 14:47                 ` Gerd Möllmann
2024-01-26 14:55                   ` Eli Zaretskii
2024-01-27  0:08                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-01-27  4:07                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-01-27  7:50                         ` Eli Zaretskii
2024-01-27 14:45                           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-01-26 10:18             ` Andreas Schwab
2024-01-26 13:49               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-01-26 14:50                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-01-25  5:33     ` Gerd Möllmann
2024-01-25  8:33       ` Gerd Möllmann
2024-01-25 15:58         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-01-25 18:12 ` Mattias Engdegård
2024-01-25 22:39   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-01-26 16:07     ` Mattias Engdegård
