* MPS: Please check if scratch/igc builds with native compilation
@ 2024-05-21 14:00 Gerd Möllmann
  2024-05-21 16:57 ` Andrea Corallo
  ` (2 more replies)
  0 siblings, 3 replies; 72+ messages in thread
From: Gerd Möllmann @ 2024-05-21 14:00 UTC
To: Emacs Devel; +Cc: Eli Zaretskii, Helmut Eller

I'm throwing in the towel now wrt native compilation + MPS on macOS. It
fails here on both arm64 and x86_64 on macOS 14. It's a long story what
all I tried to debug this; let me just say I suspect, with the highest
probability among all the possibilities, a bug in MPS, without me being
able to point to it. Gut feeling. Anyway - it was an experiment.

What I'd like to ask anyone who can is to try building scratch/igc with
native compilation (default) and --enable-checking=all. Please tell me your
OS, and whether you get assertion failures. Maybe do 2 or more builds.

This could help to assess if scratch/igc is viable.

I currently think it isn't on macOS, to be honest.

* Re: MPS: Please check if scratch/igc builds with native compilation
  2024-05-21 14:00 MPS: Please check if scratch/igc builds with native compilation Gerd Möllmann
@ 2024-05-21 16:57 ` Andrea Corallo
  2024-05-21 17:02 ` Gerd Möllmann
  2024-05-21 17:06 ` Gerd Möllmann
  2024-05-22 8:18 ` Helmut Eller
  2024-06-03 5:35 ` Gerd Möllmann
  2 siblings, 2 replies; 72+ messages in thread
From: Andrea Corallo @ 2024-05-21 16:57 UTC
To: Gerd Möllmann; +Cc: Emacs Devel, Eli Zaretskii, Helmut Eller

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> I'm throwing in the towel now wrt native compilation + MPS on macOS. It
> fails here on both arm64 and x86_64 on macOS 14. It's a long story what
> all I tried to debug this; let me just say I suspect, with the highest
> probability among all the possibilities, a bug in MPS, without me being
> able to point to it. Gut feeling. Anyway - it was an experiment.
>
> What I'd like to ask anyone who can is to try building scratch/igc with
> native compilation (default) and --enable-checking=all. Please tell me your
> OS, and whether you get assertion failures. Maybe do 2 or more builds.
>
> This could help to assess if scratch/igc is viable.
>
> I currently think it isn't on macOS, to be honest.

Hi Gerd,

I tried bootstrapping scratch/igc (my MPS is latest master) and got the
following assertion triggered on GNU/Linux x86_64.

igc.c:345: Emacs fatal error: assertion failed: h->obj_type != IGC_OBJ_FWD

Fatal error 6: Aborted
Backtrace:
../src/bootstrap-emacs(+0x226e47)[0x5d4f66028e47]
../src/bootstrap-emacs(+0x6047d)[0x5d4f65e6247d]
../src/bootstrap-emacs(+0x34a13c)[0x5d4f6614c13c]
../src/bootstrap-emacs(+0x34aebe)[0x5d4f6614cebe]
../src/bootstrap-emacs(+0x34d6d8)[0x5d4f6614f6d8]
../src/bootstrap-emacs(+0x291c51)[0x5d4f66093c51]
../src/bootstrap-emacs(+0x291dbd)[0x5d4f66093dbd]
/home/andcor03/emacs4/native-lisp/30.0.50-00c2e4a4/comp-7672a6ed-f5e09f0c.eln(F636f6d702d2d61737369676e2d6f702d70_comp__assign_op_p_0+0x27)[0x7e2cc4156a77]
../src/bootstrap-emacs(+0x2a9bac)[0x5d4f660abbac]
/home/andcor03/emacs4/native-lisp/30.0.50-00c2e4a4/comp-7672a6ed-f5e09f0c.eln(F636f6d702d2d7373612d72656e616d652d696e736e_comp__ssa_rename_insn_0+0x43d)[0x7e2cc419139d]
../src/bootstrap-emacs(+0x3063a4)[0x5d4f661083a4]
../src/bootstrap-emacs(+0x2a9bac)[0x5d4f660abbac]
/home/andcor03/emacs4/native-lisp/30.0.50-00c2e4a4/comp-7672a6ed-f5e09f0c.eln(F636f6d702d2d7373612d72656e616d65_comp__ssa_rename_0+0x21e)[0x7e2cc419200e]
../src/bootstrap-emacs(+0x2a9bac)[0x5d4f660abbac]
/home/andcor03/emacs4/native-lisp/30.0.50-00c2e4a4/comp-7672a6ed-f5e09f0c.eln(F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_100+0x1c0)[0x7e2cc4192c10]
../src/bootstrap-emacs(+0x2a9bac)[0x5d4f660abbac]
../src/bootstrap-emacs(+0x2b6014)[0x5d4f660b8014]
../src/bootstrap-emacs(+0x2a9bac)[0x5d4f660abbac]
/home/andcor03/emacs4/native-lisp/30.0.50-00c2e4a4/comp-7672a6ed-f5e09f0c.eln(F636f6d702d2d667770726f70_comp__fwprop_0+0x35)[0x7e2cc4197c35]
../src/bootstrap-emacs(+0x2a9bac)[0x5d4f660abbac]
/home/andcor03/emacs4/native-lisp/30.0.50-00c2e4a4/comp-7672a6ed-f5e09f0c.eln(F636f6d702d2d6e61746976652d636f6d70696c65_comp__native_compile_0+0x7d6)[0x7e2cc41a0976]
../src/bootstrap-emacs(+0x2a9bac)[0x5d4f660abbac]
/home/andcor03/emacs4/native-lisp/30.0.50-00c2e4a4/comp-7672a6ed-f5e09f0c.eln(F62617463682d6e61746976652d636f6d70696c65_batch_native_compile_0+0x186)[0x7e2cc41a1526]
../src/bootstrap-emacs(+0x2a9bac)[0x5d4f660abbac]
/home/andcor03/emacs4/native-lisp/30.0.50-00c2e4a4/comp-7672a6ed-f5e09f0c.eln(F62617463682d627974652b6e61746976652d636f6d70696c65_batch_bytenative_compile_0+0x144)[0x7e2cc41a1884]
[...]

Not sure it's the same one you see.

Anyway, this is the native compiler that (after being native compiled) is
compiling something else.

If you need more details I can look into it with gdb.

Thanks

  Andrea

PS I'm rebuilding with -j1 to make it reproducible here.

* Re: MPS: Please check if scratch/igc builds with native compilation
  2024-05-21 16:57 ` Andrea Corallo
@ 2024-05-21 17:02 ` Gerd Möllmann
  2024-05-21 17:06 ` Gerd Möllmann
  1 sibling, 0 replies; 72+ messages in thread
From: Gerd Möllmann @ 2024-05-21 17:02 UTC
To: Andrea Corallo; +Cc: Emacs Devel, Eli Zaretskii, Helmut Eller

Andrea Corallo <acorallo@gnu.org> writes:

> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
>
>> I'm throwing in the towel now wrt native compilation + MPS on macOS. It
>> fails here on both arm64 and x86_64 on macOS 14. It's a long story what
>> all I tried to debug this; let me just say I suspect, with the highest
>> probability among all the possibilities, a bug in MPS, without me being
>> able to point to it. Gut feeling. Anyway - it was an experiment.
>>
>> What I'd like to ask anyone who can is to try building scratch/igc with
>> native compilation (default) and --enable-checking=all. Please tell me your
>> OS, and whether you get assertion failures. Maybe do 2 or more builds.
>>
>> This could help to assess if scratch/igc is viable.
>>
>> I currently think it isn't on macOS, to be honest.
>
> Hi Gerd,
>
> I tried bootstrapping scratch/igc (my MPS is latest master) and got the
> following assertion triggered on GNU/Linux x86_64.
>
> igc.c:345: Emacs fatal error: assertion failed: h->obj_type != IGC_OBJ_FWD
> Fatal error 6: Aborted
>
> Not sure it's the same one you see.

Thanks, Andrea. That's exactly the assertion I see.

> Anyway, this is the native compiler that (after being native compiled) is
> compiling something else.

Same here on macOS, with x86_64 and arm64.

* Re: MPS: Please check if scratch/igc builds with native compilation
  2024-05-21 16:57 ` Andrea Corallo
  2024-05-21 17:02 ` Gerd Möllmann
@ 2024-05-21 17:06 ` Gerd Möllmann
  2024-05-21 17:57 ` Andrea Corallo
  2024-05-22 5:43 ` Helmut Eller
  1 sibling, 2 replies; 72+ messages in thread
From: Gerd Möllmann @ 2024-05-21 17:06 UTC
To: Andrea Corallo; +Cc: Emacs Devel, Eli Zaretskii, Helmut Eller

Here's something about my debugging attempts so far:

I'm throwing in the towel wrt native compilation with MPS on macOS.
Which makes it a failure for me.

The situation is as follows:

When building with native compilation and --enable-checking=all, I am
observing errors of the form

  igc.c:345: Emacs fatal error: assertion failed: h->obj_type != IGC_OBJ_FWD

when compiling Lisp files, for example

  ELC+ELN ../lisp/international/mule-cmds.elc
  ELC+ELN ../lisp/files.elc

Which file triggers the error is not predictable, and it is not
reproducible when running under LLDB, with or without ASLR.

To debug this, I changed the check in igc.c to not assert, but print
the PID, and enter an endless loop sleeping. This makes it possible to
attach to the process with LLDB.

In all cases I investigated in this way, I'm seeing a pattern: What is
happening is that a function in the Emacs core is called from a
native-compiled function. Things look like, simplified,

  /* In some .eln */
  Lisp_Object d_reloc[100];

  Lisp_Object some_native_compiled_lisp_function ()
  {
    Lisp_Object frame[2];
    frame[0] = d_reloc[17]; // some symbol
    frame[1] = ...
    f_reloc->funcall (2, frame);
  }

where f_reloc is a large struct with function pointer members for the
functions being called from the .eln. Doesn't matter. We then land in
Ffuncall in the Emacs core, and the first element of its args vector,
a symbol, is found to be forwarded, which leads to the assertion.

d_reloc in the .eln is scanned in igc.c, and the symbol being on the
control stack, in frame[], or in a register, should pin it, one would
assume. So how come Ffuncall in Emacs receives an invalid symbol?

I've checked that d_reloc is indeed scanned by fix_comp_unit. The
check gives me reasonable confidence that this "should work". But as
an alternative, I also made all the things like d_reloc in the .elns
ambiguous roots, so that they cannot possibly be moved, if all works as
expected.

- No change, it still asserts in the same way.

- Changing optimization levels - no change.
- Changing from arm64 to x86_64 - no change.

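The "park instead of abort" trick described in this message can be sketched roughly as follows. This is only an illustration and not the actual change made to igc.c: the `header` struct, the `OBJ_FWD` tag, `keep_waiting`, and both function names are hypothetical stand-ins.

```c
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical object tags; the real ones (e.g. IGC_OBJ_FWD) live in igc.c. */
enum obj_type { OBJ_SYMBOL, OBJ_FWD };

/* Hypothetical, heavily simplified object header. */
struct header { enum obj_type obj_type; };

/* Clear this from the debugger (or a test) to let the process go on. */
static volatile bool keep_waiting = true;

/* Print the PID and sleep until keep_waiting is cleared, so that a
   debugger can be attached to the still-living process. */
static void
wait_for_debugger (const char *what)
{
  fprintf (stderr, "%s: attach with `lldb -p %ld' or `gdb -p %ld'\n",
           what, (long) getpid (), (long) getpid ());
  while (keep_waiting)
    sleep (1);
}

/* Stand-in for the failing assertion: instead of aborting on a
   forwarded object, park the process.  Returns true if the header
   looks valid, false if it was forwarded. */
static bool
check_not_forwarded (const struct header *h)
{
  if (h->obj_type == OBJ_FWD)
    {
      wait_for_debugger ("forwarded object seen");
      return false;
    }
  return true;
}
```

Once attached, the call stack of the parked process can be inspected at leisure; setting `keep_waiting` to false from the debugger lets it continue.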
* Re: MPS: Please check if scratch/igc builds with native compilation
  2024-05-21 17:06 ` Gerd Möllmann
@ 2024-05-21 17:57 ` Andrea Corallo
  2024-05-21 18:09 ` Gerd Möllmann
  2024-05-21 18:34 ` Helmut Eller
  2024-05-22 5:43 ` Helmut Eller
  1 sibling, 2 replies; 72+ messages in thread
From: Andrea Corallo @ 2024-05-21 17:57 UTC
To: Gerd Möllmann; +Cc: Emacs Devel, Eli Zaretskii, Helmut Eller

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Here's something about my debugging attempts so far:
>
> I'm throwing in the towel wrt native compilation with MPS on macOS.
> Which makes it a failure for me.
>
> The situation is as follows:
>
> When building with native compilation and --enable-checking=all, I am
> observing errors of the form
>
> igc.c:345: Emacs fatal error: assertion failed: h->obj_type != IGC_OBJ_FWD
>
> when compiling Lisp files, for example
>
> ELC+ELN ../lisp/international/mule-cmds.elc
> ELC+ELN ../lisp/files.elc
>
> Which file triggers the error is not predictable, and it is not
> reproducible when running under LLDB, with or without ASLR.

At least here the error seems reproducible. Bootstrapping with -j1
makes native compiling leim/ja-dic/ja-dic.el always fail.

And if I run it under gdb I see we get a SIGSEGV in
'maybe_resize_hash_table' at fns.c:4987

  memcpy (key, h->key, old_size * sizeof *key);

with the following bt

(gdb) bt
#0 maybe_resize_hash_table (h=0x7fffe7dabd48) at fns.c:4987
#1 hash_put (h=0x7fffe7dabd48, key=XIL(0x7fffe4fc297b), value=XIL(0x30), hash=1644298) at fns.c:5162
#2 0x0000555555817fc0 in Fputhash (key=XIL(0x7fffe4fc297b), value=XIL(0x30), table=<optimized out>) at fns.c:5993
#3 0x00007ffff14f6313 in F627974652d72756e2d2d73747269702d6c697374_byte_run__strip_list_0 () at /home/andcor03/emacs4/src/../native-lisp/30.0.50-00c2e4a4/preloaded/byte-run-79ff048e-d52588ab.eln
#4 0x00005555557fdbac in Ffuncall (nargs=2, args=0x7fffffffc010) at eval.c:3032
#5 0x00007ffff14f6476 in F627974652d72756e2d2d73747269702d6c697374_byte_run__strip_list_0 () at /home/andcor03/emacs4/src/../native-lisp/30.0.50-00c2e4a4/preloaded/byte-run-79ff048e-d52588ab.eln
#6 0x00005555557fdbac in Ffuncall (nargs=2, args=0x7fffffffc0d0) at eval.c:3032
#7 0x00007ffff14f6476 in F627974652d72756e2d2d73747269702d6c697374_byte_run__strip_list_0 () at /home/andcor03/emacs4/src/../native-lisp/30.0.50-00c2e4a4/preloaded/byte-run-79ff048e-d52588ab.eln
#8 0x00005555557fdbac in Ffuncall (nargs=2, args=0x7fffffffc190) at eval.c:3032
#9 0x00007ffff14f6476 in F627974652d72756e2d2d73747269702d6c697374_byte_run__strip_list_0 () at /home/andcor03/emacs4/src/../native-lisp/30.0.50-00c2e4a4/preloaded/byte-run-79ff048e-d52588ab.eln
#10 0x00005555557fdbac in Ffuncall (nargs=2, args=0x7fffffffc250) at eval.c:3032
#11 0x00007ffff14f6476 in F627974652d72756e2d2d73747269702d6c697374_byte_run__strip_list_0 () at /home/andcor03/emacs4/src/../native-lisp/30.0.50-00c2e4a4/preloaded/byte-run-79ff048e-d52588ab.eln
#12 0x00005555557fdbac in Ffuncall (nargs=2, args=0x7fffffffc310) at eval.c:3032
#13 0x00007ffff14f6476 in F627974652d72756e2d2d73747269702d6c697374_byte_run__strip_list_0 () at /home/andcor03/emacs4/src/../native-lisp/30.0.50-00c2e4a4/preloaded/byte-run-79ff048e-d52588ab.eln
#14 0x00005555557fdbac in Ffuncall (nargs=2, args=0x7fffffffc3d0) at eval.c:3032
#15 0x00007ffff14f6476 in F627974652d72756e2d2d73747269702d6c697374_byte_run__strip_list_0 () at /home/andcor03/emacs4/src/../native-lisp/30.0.50-00c2e4a4/preloaded/byte-run-79ff048e-d52588ab.eln
#16 0x00005555557fdbac in Ffuncall (nargs=2, args=0x7fffffffc490) at eval.c:3032
#17 0x00007ffff14f6476 in F627974652d72756e2d2d73747269702d6c697374_byte_run__strip_list_0 () at /home/andcor03/emacs4/src/../native-lisp/30.0.50-00c2e4a4/preloaded/byte-run-79ff048e-d52588ab.eln
#18 0x00005555557fdbac in Ffuncall (nargs=2, args=0x7fffffffc550) at eval.c:3032
#19 0x00007ffff14f6476 in F627974652d72756e2d2d73747269702d6c697374_byte_run__strip_list_0 () at /home/andcor03/emacs4/src/../native-lisp/30.0.50-00c2e4a4/preloaded/byte-run-79ff048e-d52588ab.eln
#20 0x00005555557fdbac in Ffuncall (nargs=2, args=0x7fffffffc610) at eval.c:3032
#21 0x00007ffff14f6476 in F627974652d72756e2d2d73747269702d6c697374_byte_run__strip_list_0 () at /home/andcor03/emacs4/src/../native-lisp/30.0.50-00c2e4a4/preloaded/byte-run-79ff048e-d52588ab.eln
#22 0x00005555557fdbac in Ffuncall (nargs=2, args=0x7fffffffc6d0) at eval.c:3032
#23 0x00007ffff14f6476 in F627974652d72756e2d2d73747269702d6c697374_byte_run__strip_list_0 () at /home/andcor03/emacs4/src/../native-lisp/30.0.50-00c2e4a4/preloaded/byte-run-79ff048e-d52588ab.eln
#24 0x00005555557fdbac in Ffuncall (nargs=2, args=0x7fffffffc760) at eval.c:3032
#25 0x00007ffff14f692c in F627974652d72756e2d73747269702d73796d626f6c2d706f736974696f6e73_byte_run_strip_symbol_positions_0 ()
[...]

Which is admittedly different from what I saw from the command line.

> To debug this, I changed the check in igc.c to not assert, but print
> the PID, and enter an endless loop sleeping. This makes it possible to
> attach to the process with LLDB.
>
> In all cases I investigated in this way, I'm seeing a pattern: What is
> happening is that a function in the Emacs core is called from a
> native-compiled function. Things look like, simplified,
>
>   /* In some .eln */
>   Lisp_Object d_reloc[100];
>
>   Lisp_Object some_native_compiled_lisp_function ()
>   {
>     Lisp_Object frame[2];
>     frame[0] = d_reloc[17]; // some symbol
>     frame[1] = ...
>     f_reloc->funcall (2, frame);
>   }
>
> where f_reloc is a large struct with function pointer members for the
> functions being called from the .eln. Doesn't matter. We then land in
> Ffuncall in the Emacs core, and the first element of its args vector,
> a symbol, is found to be forwarded, which leads to the assertion.
>
> d_reloc in the .eln is scanned in igc.c, and the symbol being on the
> control stack, in frame[], or in a register, should pin it, one would
> assume. So how come Ffuncall in Emacs receives an invalid symbol?
>
> I've checked that d_reloc is indeed scanned by fix_comp_unit. The
> check gives me reasonable confidence that this "should work". But as
> an alternative, I also made all the things like d_reloc in the .elns
> ambiguous roots, so that they cannot possibly be moved, if all works as
> expected.
>
> - No change, it still asserts in the same way.
>
> - Changing optimization levels - no change.
> - Changing from arm64 to x86_64 - no change.

That's very bizarre, I have a hard time believing we are hitting such a bug :/
Hope we are missing something.

  Andrea

* Re: MPS: Please check if scratch/igc builds with native compilation
  2024-05-21 17:57 ` Andrea Corallo
@ 2024-05-21 18:09 ` Gerd Möllmann
  2024-05-21 18:17 ` Andrea Corallo
  2024-05-21 18:35 ` Eli Zaretskii
  2024-05-21 18:34 ` Helmut Eller
  1 sibling, 2 replies; 72+ messages in thread
From: Gerd Möllmann @ 2024-05-21 18:09 UTC
To: Andrea Corallo; +Cc: Emacs Devel, Eli Zaretskii, Helmut Eller

Andrea Corallo <acorallo@gnu.org> writes:

> At least here the error seems reproducible. Bootstrapping with -j1
> makes native compiling leim/ja-dic/ja-dic.el always fail.
>
> And if I run it under gdb I see we get a SIGSEGV in
> 'maybe_resize_hash_table' at fns.c:4987
>
> memcpy (key, h->key, old_size * sizeof *key);

That's a new one for me. Maybe you are hitting a read/write barrier?
I think Eli & Helmut can help here with what to do for the signals in
GDB. (On macOS, MPS is using Mach exceptions, not signals.)

> with the following bt
>
> (gdb) bt
> #0 maybe_resize_hash_table (h=0x7fffe7dabd48) at fns.c:4987
> #1 hash_put (h=0x7fffe7dabd48, key=XIL(0x7fffe4fc297b), value=XIL(0x30), hash=1644298) at fns.c:5162
> #2 0x0000555555817fc0 in Fputhash (key=XIL(0x7fffe4fc297b), value=XIL(0x30), table=<optimized out>) at fns.c:5993
> [...]
>
> Which is admittedly different from what I saw from the command line.
>
>> To debug this, I changed the check in igc.c to not assert, but print
>> the PID, and enter an endless loop sleeping. This makes it possible to
>> attach to the process with LLDB.
>>
>> In all cases I investigated in this way, I'm seeing a pattern: What is
>> happening is that a function in the Emacs core is called from a
>> native-compiled function. Things look like, simplified,
>>
>>   /* In some .eln */
>>   Lisp_Object d_reloc[100];
>>
>>   Lisp_Object some_native_compiled_lisp_function ()
>>   {
>>     Lisp_Object frame[2];
>>     frame[0] = d_reloc[17]; // some symbol
>>     frame[1] = ...
>>     f_reloc->funcall (2, frame);
>>   }
>>
>> where f_reloc is a large struct with function pointer members for the
>> functions being called from the .eln. Doesn't matter. We then land in
>> Ffuncall in the Emacs core, and the first element of its args vector,
>> a symbol, is found to be forwarded, which leads to the assertion.
>>
>> d_reloc in the .eln is scanned in igc.c, and the symbol being on the
>> control stack, in frame[], or in a register, should pin it, one would
>> assume. So how come Ffuncall in Emacs receives an invalid symbol?
>>
>> I've checked that d_reloc is indeed scanned by fix_comp_unit. The
>> check gives me reasonable confidence that this "should work". But as
>> an alternative, I also made all the things like d_reloc in the .elns
>> ambiguous roots, so that they cannot possibly be moved, if all works as
>> expected.
>>
>> - No change, it still asserts in the same way.
>>
>> - Changing optimization levels - no change.
>> - Changing from arm64 to x86_64 - no change.
>
> That's very bizarre, I have a hard time believing we are hitting such a bug :/
> Hope we are missing something.

Yes, bizarre is a good description. I'm out of ideas.

* Re: MPS: Please check if scratch/igc builds with native compilation
  2024-05-21 18:09 ` Gerd Möllmann
@ 2024-05-21 18:17 ` Andrea Corallo
  2024-05-21 19:00 ` Gerd Möllmann
  2024-05-21 18:35 ` Eli Zaretskii
  1 sibling, 1 reply; 72+ messages in thread
From: Andrea Corallo @ 2024-05-21 18:17 UTC
To: Gerd Möllmann; +Cc: Emacs Devel, Eli Zaretskii, Helmut Eller

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Andrea Corallo <acorallo@gnu.org> writes:
>
>> At least here the error seems reproducible. Bootstrapping with -j1
>> makes native compiling leim/ja-dic/ja-dic.el always fail.
>>
>> And if I run it under gdb I see we get a SIGSEGV in
>> 'maybe_resize_hash_table' at fns.c:4987
>>
>> memcpy (key, h->key, old_size * sizeof *key);
>
> That's a new one for me. Maybe you are hitting a read/write barrier?

Ah right maybe, interesting!

> I think Eli & Helmut can help here with what to do for the signals in
> GDB. (On macOS, MPS is using Mach exceptions, not signals.)
>
>> with the following bt
>>
>> (gdb) bt
>> #0 maybe_resize_hash_table (h=0x7fffe7dabd48) at fns.c:4987
>> #1 hash_put (h=0x7fffe7dabd48, key=XIL(0x7fffe4fc297b), value=XIL(0x30), hash=1644298) at fns.c:5162
>> #2 0x0000555555817fc0 in Fputhash (key=XIL(0x7fffe4fc297b), value=XIL(0x30), table=<optimized out>) at fns.c:5993
>> [...]
>>
>> Which is admittedly different from what I saw from the command line.
>>
>>> To debug this, I changed the check in igc.c to not assert, but print
>>> the PID, and enter an endless loop sleeping. This makes it possible to
>>> attach to the process with LLDB.
>>>
>>> In all cases I investigated in this way, I'm seeing a pattern: What is
>>> happening is that a function in the Emacs core is called from a
>>> native-compiled function. Things look like, simplified,
>>>
>>>   /* In some .eln */
>>>   Lisp_Object d_reloc[100];
>>>
>>>   Lisp_Object some_native_compiled_lisp_function ()
>>>   {
>>>     Lisp_Object frame[2];
>>>     frame[0] = d_reloc[17]; // some symbol
>>>     frame[1] = ...
>>>     f_reloc->funcall (2, frame);
>>>   }
>>>
>>> where f_reloc is a large struct with function pointer members for the
>>> functions being called from the .eln. Doesn't matter. We then land in
>>> Ffuncall in the Emacs core, and the first element of its args vector,
>>> a symbol, is found to be forwarded, which leads to the assertion.
>>>
>>> d_reloc in the .eln is scanned in igc.c, and the symbol being on the
>>> control stack, in frame[], or in a register, should pin it, one would
>>> assume. So how come Ffuncall in Emacs receives an invalid symbol?
>>>
>>> I've checked that d_reloc is indeed scanned by fix_comp_unit. The
>>> check gives me reasonable confidence that this "should work". But as
>>> an alternative, I also made all the things like d_reloc in the .elns
>>> ambiguous roots, so that they cannot possibly be moved, if all works as
>>> expected.
>>>
>>> - No change, it still asserts in the same way.
>>>
>>> - Changing optimization levels - no change.
>>> - Changing from arm64 to x86_64 - no change.
>>
>> That's very bizarre, I have a hard time believing we are hitting such a bug :/
>> Hope we are missing something.
>
> Yes, bizarre is a good description. I'm out of ideas.

Do you think it is very difficult to debug MPS to understand why a certain
object is being moved (while it should not be)? On GNU/Linux we can record
an rr trace (so that everything is reproducible) and step back and
forth to try to shed some light on this, maybe?

  Andrea

* Re: MPS: Please check if scratch/igc builds with native compilation
  2024-05-21 18:17 ` Andrea Corallo
@ 2024-05-21 19:00 ` Gerd Möllmann
  2024-05-21 19:05 ` Andrea Corallo
  0 siblings, 1 reply; 72+ messages in thread
From: Gerd Möllmann @ 2024-05-21 19:00 UTC
To: Andrea Corallo; +Cc: Emacs Devel, Eli Zaretskii, Helmut Eller

Andrea Corallo <acorallo@gnu.org> writes:

>> Yes, bizarre is a good description. I'm out of ideas.
>
> Do you think it is very difficult to debug MPS to understand why a certain
> object is being moved (while it should not be)? On GNU/Linux we can record
> an rr trace (so that everything is reproducible) and step back and
> forth to try to shed some light on this, maybe?

I at least have no idea how to do it.

* Re: MPS: Please check if scratch/igc builds with native compilation
  2024-05-21 19:00 ` Gerd Möllmann
@ 2024-05-21 19:05 ` Andrea Corallo
  2024-05-21 19:08 ` Gerd Möllmann
  0 siblings, 1 reply; 72+ messages in thread
From: Andrea Corallo @ 2024-05-21 19:05 UTC
To: Gerd Möllmann; +Cc: Emacs Devel, Eli Zaretskii, Helmut Eller

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Andrea Corallo <acorallo@gnu.org> writes:
>
>>> Yes, bizarre is a good description. I'm out of ideas.
>>
>> Do you think it is very difficult to debug MPS to understand why a certain
>> object is being moved (while it should not be)? On GNU/Linux we can record
>> an rr trace (so that everything is reproducible) and step back and
>> forth to try to shed some light on this, maybe?
>
> I at least have no idea how to do it.

Maybe we could ask the MPS project how they would go about
debugging this?

* Re: MPS: Please check if scratch/igc builds with native compilation
  2024-05-21 19:05 ` Andrea Corallo
@ 2024-05-21 19:08 ` Gerd Möllmann
  2024-05-21 19:20 ` Andrea Corallo
  2024-05-21 19:21 ` Eli Zaretskii
  0 siblings, 2 replies; 72+ messages in thread
From: Gerd Möllmann @ 2024-05-21 19:08 UTC
To: Andrea Corallo; +Cc: Emacs Devel, Eli Zaretskii, Helmut Eller

Andrea Corallo <acorallo@gnu.org> writes:

> Maybe we could ask the MPS project how they would go about
> debugging this?

So far I haven't found an active mailing list or something, alas.

* Re: MPS: Please check if scratch/igc builds with native compilation
  2024-05-21 19:08 ` Gerd Möllmann
@ 2024-05-21 19:20 ` Andrea Corallo
  2024-05-21 19:21 ` Eli Zaretskii
  1 sibling, 0 replies; 72+ messages in thread
From: Andrea Corallo @ 2024-05-21 19:20 UTC
To: Gerd Möllmann; +Cc: Emacs Devel, Eli Zaretskii, Helmut Eller

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Andrea Corallo <acorallo@gnu.org> writes:
>
>> Maybe we could ask the MPS project how they would go about
>> debugging this?
>
> So far I haven't found an active mailing list or something, alas.

😢, maybe a GitHub issue?

* Re: MPS: Please check if scratch/igc builds with native compilation
  2024-05-21 19:08 ` Gerd Möllmann
  2024-05-21 19:20 ` Andrea Corallo
@ 2024-05-21 19:21 ` Eli Zaretskii
  2024-05-21 19:23 ` Gerd Möllmann
  1 sibling, 1 reply; 72+ messages in thread
From: Eli Zaretskii @ 2024-05-21 19:21 UTC
To: Gerd Möllmann; +Cc: acorallo, emacs-devel, eller.helmut

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: Emacs Devel <emacs-devel@gnu.org>, Eli Zaretskii <eliz@gnu.org>,
>  Helmut Eller <eller.helmut@gmail.com>
> Date: Tue, 21 May 2024 21:08:45 +0200
>
> Andrea Corallo <acorallo@gnu.org> writes:
>
> > Maybe we could ask the MPS project how they would go about
> > debugging this?
>
> So far I haven't found an active mailing list or something, alas.

They have an issue tracker, but I don't know how actively they respond
to new issues.

* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-21 19:21 ` Eli Zaretskii @ 2024-05-21 19:23 ` Gerd Möllmann 0 siblings, 0 replies; 72+ messages in thread From: Gerd Möllmann @ 2024-05-21 19:23 UTC (permalink / raw) To: Eli Zaretskii; +Cc: acorallo, emacs-devel, eller.helmut On 2024-05-21 21:21, Eli Zaretskii wrote: >> From: Gerd Möllmann <gerd.moellmann@gmail.com> >> Cc: Emacs Devel <emacs-devel@gnu.org>, Eli Zaretskii <eliz@gnu.org>, >> Helmut Eller <eller.helmut@gmail.com> >> Date: Tue, 21 May 2024 21:08:45 +0200 >> >> Andrea Corallo <acorallo@gnu.org> writes: >> >>> Maybe is something we could ask the mps project how they would go >>> debugging this? >> >> So far I haven't found an active mailing list or something, alas. > > They have an issue tracker, but I don't know how actively they respond > to new issues. Pretty inactive. ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-21 18:09 ` Gerd Möllmann 2024-05-21 18:17 ` Andrea Corallo @ 2024-05-21 18:35 ` Eli Zaretskii 1 sibling, 0 replies; 72+ messages in thread From: Eli Zaretskii @ 2024-05-21 18:35 UTC (permalink / raw) To: Gerd Möllmann; +Cc: acorallo, emacs-devel, eller.helmut > From: Gerd Möllmann <gerd.moellmann@gmail.com> > Cc: Emacs Devel <emacs-devel@gnu.org>, Eli Zaretskii <eliz@gnu.org>, > Helmut Eller <eller.helmut@gmail.com> > Date: Tue, 21 May 2024 20:09:50 +0200 > > Andrea Corallo <acorallo@gnu.org> writes: > > > At least here the error seems reproducible. Bootstrapping with -j1 > > makes native compiling leim/ja-dic/ja-dic.el always fail. > > > > And if I run it under gdb I see we get a SIGSEGV in > > 'maybe_resize_hash_table' at fns.c:4987 > > > > memcpy (key, h->key, old_size * sizeof *key); > > That's a new one for me. Maybe you are hitting a read/write barrier? > I think Eli & Helmut can help here with what to do for the signals in > GDB. (On macOS, MPS is using Mach exceptions, not signals.) If this is a SIGSEGV due to a write barrier, you can "continue" the program and it continues running. If it's a real SIGSEGV, typing "continue" will hit the same SIGSEGV again. ^ permalink raw reply [flat|nested] 72+ messages in thread
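The difference between a barrier hit and a real crash can be illustrated with a toy program. The sketch below is not MPS code and makes no claim about how MPS implements its barriers; it assumes a POSIX system, and all names in it (barrier_page, barrier_handler, barrier_demo) are made up for the example. A page is mapped read-only, the handler lifts the protection when the fault address lies inside that page, and the faulting store is then retried, which is why continuing past such a SIGSEGV in a debugger is harmless. A fault anywhere else keeps recurring, as described above.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical toy barrier state; none of this is MPS or Emacs code.  */
static char *barrier_page;
static size_t barrier_size;
static volatile sig_atomic_t barrier_hits;

static void
barrier_handler (int sig, siginfo_t *info, void *ctx)
{
  (void) ctx;
  char *addr = info->si_addr;
  if (addr >= barrier_page && addr < barrier_page + barrier_size)
    {
      /* Barrier hit: lift the protection so the store can be retried.
         (Calling mprotect from a handler is common practice for this
         technique, but POSIX does not list it as async-signal-safe.)  */
      barrier_hits++;
      mprotect (barrier_page, barrier_size, PROT_READ | PROT_WRITE);
    }
  else
    {
      /* Real crash: restore the default action; the fault will recur.  */
      signal (sig, SIG_DFL);
    }
}

/* Return 1 if exactly one barrier hit was handled and the store landed,
   0 if not, -1 on setup failure.  */
int
barrier_demo (void)
{
  barrier_hits = 0;
  barrier_size = (size_t) sysconf (_SC_PAGESIZE);
  barrier_page = mmap (NULL, barrier_size, PROT_READ,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (barrier_page == MAP_FAILED)
    return -1;

  struct sigaction sa;
  memset (&sa, 0, sizeof sa);
  sa.sa_sigaction = barrier_handler;
  sa.sa_flags = SA_SIGINFO;
  sigemptyset (&sa.sa_mask);
  /* Some systems report protection faults as SIGBUS instead of SIGSEGV.  */
  if (sigaction (SIGSEGV, &sa, NULL) != 0 || sigaction (SIGBUS, &sa, NULL) != 0)
    return -1;

  volatile char *p = barrier_page;
  p[0] = 42;   /* Faults once; the handler unprotects; the store is retried.  */
  return barrier_hits == 1 && p[0] == 42;
}
```

Calling barrier_demo () under gdb stops at the store unless SIGSEGV has been set to nostop and pass; typing "continue" then lets the program finish normally.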
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-21 17:57 ` Andrea Corallo 2024-05-21 18:09 ` Gerd Möllmann @ 2024-05-21 18:34 ` Helmut Eller 2024-05-21 18:46 ` Andrea Corallo 1 sibling, 1 reply; 72+ messages in thread From: Helmut Eller @ 2024-05-21 18:34 UTC (permalink / raw) To: Andrea Corallo; +Cc: Gerd Möllmann, Emacs Devel, Eli Zaretskii > And if I run it under gdb I see we get a SIGSEGV in > 'maybe_resize_hash_table' at fns.c:4987 > > memcpy (key, h->key, old_size * sizeof *key); Yes, this is probably a memory barrier triggered by an allocation. In gdb you need to say: signal SIGSEGV nostop print pass so that it can continue. Or noprint, whichever you prefer. ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-21 18:34 ` Helmut Eller @ 2024-05-21 18:46 ` Andrea Corallo 2024-05-21 19:10 ` Helmut Eller 0 siblings, 1 reply; 72+ messages in thread From: Andrea Corallo @ 2024-05-21 18:46 UTC (permalink / raw) To: Helmut Eller; +Cc: Gerd Möllmann, Emacs Devel, Eli Zaretskii Helmut Eller <eller.helmut@gmail.com> writes: >> And if I run it under gdb I see we get a SIGSEGV in >> 'maybe_resize_hash_table' at fns.c:4987 >> >> memcpy (key, h->key, old_size * sizeof *key); > > Yes, this is probably a memory barrier triggered by an allocation. In > gdb you need to say: signal SIGSEGV nostop print pass > so that it can continue. Or noprint, whichever you prefer. Thanks Helmut, (for the record, if someone else sees this, it was 'handle SIGSEGV nostop print pass') Anyway yes, I can confirm that at least here it looks reproducible. Andrea ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-21 18:46 ` Andrea Corallo @ 2024-05-21 19:10 ` Helmut Eller 2024-05-21 19:17 ` Andrea Corallo 2024-05-21 19:22 ` Eli Zaretskii 0 siblings, 2 replies; 72+ messages in thread From: Helmut Eller @ 2024-05-21 19:10 UTC (permalink / raw) To: Andrea Corallo; +Cc: Gerd Möllmann, Emacs Devel, Eli Zaretskii Is there some way to disable optimizations for libgccjit so that the native compiler isn't quite so slow? ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-21 19:10 ` Helmut Eller @ 2024-05-21 19:17 ` Andrea Corallo 2024-05-21 19:35 ` Andrea Corallo 2024-05-21 19:22 ` Eli Zaretskii 1 sibling, 1 reply; 72+ messages in thread From: Andrea Corallo @ 2024-05-21 19:17 UTC (permalink / raw) To: Helmut Eller; +Cc: Gerd Möllmann, Emacs Devel, Eli Zaretskii Helmut Eller <eller.helmut@gmail.com> writes: > Is there some way to disable optimizations for libgccjit so that the > native compiler isn't quite so slow? The following should do the job, maybe is also an interesting test to do... modified lisp/emacs-lisp/comp.el @@ -54,7 +54,7 @@ comp "Emacs Lisp native compiler." :group 'lisp) -(defcustom native-comp-speed 2 +(defcustom native-comp-speed 0 "Optimization level for native compilation, a number between -1 and 3. -1 functions are kept in bytecode form and no native compilation is performed (but *.eln files are still produced, and include the compiled code in ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-21 19:17 ` Andrea Corallo @ 2024-05-21 19:35 ` Andrea Corallo 2024-05-21 19:38 ` Gerd Möllmann 0 siblings, 1 reply; 72+ messages in thread From: Andrea Corallo @ 2024-05-21 19:35 UTC (permalink / raw) To: Helmut Eller; +Cc: Gerd Möllmann, Emacs Devel, Eli Zaretskii Andrea Corallo <acorallo@gnu.org> writes: > Helmut Eller <eller.helmut@gmail.com> writes: > >> Is there some way to disable optimizations for libgccjit so that the >> native compiler isn't quite so slow? > > The following should do the job, maybe is also an interesting test to > do... > > modified lisp/emacs-lisp/comp.el > @@ -54,7 +54,7 @@ comp > "Emacs Lisp native compiler." > :group 'lisp) > > -(defcustom native-comp-speed 2 > +(defcustom native-comp-speed 0 > "Optimization level for native compilation, a number between -1 and 3. > -1 functions are kept in bytecode form and no native compilation is performed > (but *.eln files are still produced, and include the compiled code in Okay a datapoint, I did manage to bootstrap two times a native compiled Emacs with only this patch installed with no errors. But it might be related to the fact that the executed code is simpler and less stressful to the GC, AFAIR SSA renaming (the nativecomp pass where I saw the crash initially) is heavy on consing other than computing. Andrea ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-21 19:35 ` Andrea Corallo @ 2024-05-21 19:38 ` Gerd Möllmann 2024-05-21 19:50 ` Andrea Corallo 0 siblings, 1 reply; 72+ messages in thread From: Gerd Möllmann @ 2024-05-21 19:38 UTC (permalink / raw) To: Andrea Corallo; +Cc: Helmut Eller, Emacs Devel, Eli Zaretskii Andrea Corallo <acorallo@gnu.org> writes: > Andrea Corallo <acorallo@gnu.org> writes: > >> Helmut Eller <eller.helmut@gmail.com> writes: >> >>> Is there some way to disable optimizations for libgccjit so that the >>> native compiler isn't quite so slow? >> >> The following should do the job, maybe is also an interesting test to >> do... >> >> modified lisp/emacs-lisp/comp.el >> @@ -54,7 +54,7 @@ comp >> "Emacs Lisp native compiler." >> :group 'lisp) >> >> -(defcustom native-comp-speed 2 >> +(defcustom native-comp-speed 0 >> "Optimization level for native compilation, a number between -1 and 3. >> -1 functions are kept in bytecode form and no native compilation is performed >> (but *.eln files are still produced, and include the compiled code in > > Okay a datapoint, I did manage to bootstrap two times a native compiled > Emacs with only this patch installed with no errors. Interesting. On my systems, speed 0 didn't change things. ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-21 19:38 ` Gerd Möllmann @ 2024-05-21 19:50 ` Andrea Corallo 0 siblings, 0 replies; 72+ messages in thread From: Andrea Corallo @ 2024-05-21 19:50 UTC (permalink / raw) To: Gerd Möllmann; +Cc: Helmut Eller, Emacs Devel, Eli Zaretskii Gerd Möllmann <gerd.moellmann@gmail.com> writes: > Andrea Corallo <acorallo@gnu.org> writes: > >> Andrea Corallo <acorallo@gnu.org> writes: >> >>> Helmut Eller <eller.helmut@gmail.com> writes: >>> >>>> Is there some way to disable optimizations for libgccjit so that the >>>> native compiler isn't quite so slow? >>> >>> The following should do the job, maybe is also an interesting test to >>> do... >>> >>> modified lisp/emacs-lisp/comp.el >>> @@ -54,7 +54,7 @@ comp >>> "Emacs Lisp native compiler." >>> :group 'lisp) >>> >>> -(defcustom native-comp-speed 2 >>> +(defcustom native-comp-speed 0 >>> "Optimization level for native compilation, a number between -1 and 3. >>> -1 functions are kept in bytecode form and no native compilation is performed >>> (but *.eln files are still produced, and include the compiled code in >> >> Okay a datapoint, I did manage to bootstrap two times a native compiled >> Emacs with only this patch installed with no errors. > > Interesting. On my systems, speed 0 didn't change things. Yeah, I have now tried speed 0 + --with-native-compilation=aot and it fails with the very same assertion just after, so it's probably just that speed 0 stresses the GC less, but the bug is still there. Andrea ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-21 19:10 ` Helmut Eller 2024-05-21 19:17 ` Andrea Corallo @ 2024-05-21 19:22 ` Eli Zaretskii 2024-05-21 19:28 ` Andrea Corallo 1 sibling, 1 reply; 72+ messages in thread From: Eli Zaretskii @ 2024-05-21 19:22 UTC (permalink / raw) To: Helmut Eller; +Cc: acorallo, gerd.moellmann, emacs-devel > From: Helmut Eller <eller.helmut@gmail.com> > Cc: Gerd Möllmann <gerd.moellmann@gmail.com>, Emacs Devel > <emacs-devel@gnu.org>, Eli Zaretskii <eliz@gnu.org> > Date: Tue, 21 May 2024 21:10:14 +0200 > > Is there some way to disable optimizations for libgccjit so that the > native compiler isn't quite so slow? You shouldn't need to do that, you just need to have comp*.el byte-compiled ASAP, then native compilations are much faster. ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-21 19:22 ` Eli Zaretskii @ 2024-05-21 19:28 ` Andrea Corallo 0 siblings, 0 replies; 72+ messages in thread From: Andrea Corallo @ 2024-05-21 19:28 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Helmut Eller, gerd.moellmann, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: >> From: Helmut Eller <eller.helmut@gmail.com> >> Cc: Gerd Möllmann <gerd.moellmann@gmail.com>, Emacs Devel >> <emacs-devel@gnu.org>, Eli Zaretskii <eliz@gnu.org> >> Date: Tue, 21 May 2024 21:10:14 +0200 >> >> Is there some way to disable optimizations for libgccjit so that the >> native compiler isn't quite so slow? > > You shouldn't need to do that, you just need to have comp*.el > byte-compiled ASAP, then native compilations are much faster. Right, that's a good point: compiling with no optimizations will build faster at the beginning, but having an unoptimized compiler will make compilations afterwards slower. ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-21 17:06 ` Gerd Möllmann 2024-05-21 17:57 ` Andrea Corallo @ 2024-05-22 5:43 ` Helmut Eller 2024-05-22 6:07 ` Gerd Möllmann 1 sibling, 1 reply; 72+ messages in thread From: Helmut Eller @ 2024-05-22 5:43 UTC (permalink / raw) To: Gerd Möllmann; +Cc: Andrea Corallo, Emacs Devel, Eli Zaretskii > I've checked that d_reloc is indeed scanned by fix_comp_unit. The > check gives me reasonable confidence that this "should work". But as > an alternative, I also made all the things like d_reloc in the .elns > ambiguous roots, so that they cannot possibly be moved, if all works as > expected. Registering the dump as root happens rather late. The relocation code allocates a hash table and stores a reference to it in comp_u->lambda_gc_guard_h. By that time the dump should already be a root. Can we register the dump earlier? AFAIU, the dumper writes zeros in the cells for to-be-relocated references and the scan code will ignore them. So I think this could work: diff --git a/src/pdumper.c b/src/pdumper.c index b039e375c1f..39484a16c95 100644 --- a/src/pdumper.c +++ b/src/pdumper.c @@ -5958,6 +5958,7 @@ pdumper_load (const char *dump_filename, char *argv0) & ~(DUMP_ALIGNMENT - 1)); void *hot_start = (void *) (dump_base + aligned_header_size); void *hot_end = (void *) (dump_base + header->discardable_start); + igc_on_pdump_loaded (hot_start, hot_end); #endif dump_do_all_dump_reloc_for_phase (header, dump_base, EARLY_RELOCS); @@ -6002,10 +6003,6 @@ pdumper_load (const char *dump_filename, char *argv0) dump_private.load_time = timespectod (load_timespec); dump_private.dump_filename = dump_filename_copy; -# ifdef HAVE_MPS - igc_on_pdump_loaded (hot_start, hot_end); -# endif - out: for (int i = 0; i < ARRAYELTS (sections); ++i) dump_mmap_release (&sections[i]); ^ permalink raw reply related [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-22 5:43 ` Helmut Eller @ 2024-05-22 6:07 ` Gerd Möllmann 2024-05-22 6:27 ` Gerd Möllmann 2024-05-22 6:34 ` Helmut Eller 0 siblings, 2 replies; 72+ messages in thread From: Gerd Möllmann @ 2024-05-22 6:07 UTC (permalink / raw) To: Helmut Eller; +Cc: Andrea Corallo, Emacs Devel, Eli Zaretskii Helmut Eller <eller.helmut@gmail.com> writes: >> I've checked that d_reloc is indeed scanned by fix_comp_unit. The >> check gives me reasonable confidence that this "should work". But as >> an alternative, I also made all the things like d_reloc in the .elns >> ambiguous roots, so that they cannot possibly be moved, if all works as >> expected. > > Registering the dump as root happens rather late. The relocation code > allocates a hash table and stores a reference to it in > comp_u->lambda_gc_guard_h. By that time the dump should already be a > root. Can we register the dump earlier? AFAIU, the dumper writes zeros > in the cells for to-be-relocated references and the scan code will > ignore them. So I think this could work: > > diff --git a/src/pdumper.c b/src/pdumper.c > index b039e375c1f..39484a16c95 100644 > --- a/src/pdumper.c > +++ b/src/pdumper.c > @@ -5958,6 +5958,7 @@ pdumper_load (const char *dump_filename, char *argv0) > & ~(DUMP_ALIGNMENT - 1)); > void *hot_start = (void *) (dump_base + aligned_header_size); > void *hot_end = (void *) (dump_base + header->discardable_start); > + igc_on_pdump_loaded (hot_start, hot_end); > #endif AFAIU, at the point above the necessary relocations haven't yet been done, so the dump is still in its "raw" form as it is in the file. Which means, among other things, that Lisp_Objects haven't been changed to point to where the dump is mmap'd or where the data segment of Emacs is and so on. So I don't think that would work. 
> dump_do_all_dump_reloc_for_phase (header, dump_base, EARLY_RELOCS); > @@ -6002,10 +6003,6 @@ pdumper_load (const char *dump_filename, char *argv0) > dump_private.load_time = timespectod (load_timespec); > dump_private.dump_filename = dump_filename_copy; > > -# ifdef HAVE_MPS > - igc_on_pdump_loaded (hot_start, hot_end); > -# endif > - > out: > for (int i = 0; i < ARRAYELTS (sections); ++i) > dump_mmap_release (&sections[i]); But, what if we park MPS while we are loading the dump? WDYT? ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-22 6:07 ` Gerd Möllmann @ 2024-05-22 6:27 ` Gerd Möllmann 2024-05-22 6:34 ` Helmut Eller 1 sibling, 0 replies; 72+ messages in thread From: Gerd Möllmann @ 2024-05-22 6:27 UTC (permalink / raw) To: Helmut Eller; +Cc: Andrea Corallo, Emacs Devel, Eli Zaretskii Gerd Möllmann <gerd.moellmann@gmail.com> writes: > Helmut Eller <eller.helmut@gmail.com> writes: > >>> I've checked that d_reloc is indeed scanned by fix_comp_unit. The >>> check gives me reasonable confidence that this "should work". But as >>> an alternative, I also made all the things like d_reloc in the .elns >>> ambiguous roots, so that they cannot possibly be moved, if all works as >>> expected. >> >> Registering the dump as root happens rather late. The relocation code >> allocates a hash table and stores a reference to it in >> comp_u->lambda_gc_guard_h. By that time the dump should already be a >> root. Can we register the dump earlier? AFAIU, the dumper writes zeros >> in the cells for to-be-relocated references and the scan code will >> ignore them. So I think this could work: >> >> diff --git a/src/pdumper.c b/src/pdumper.c >> index b039e375c1f..39484a16c95 100644 >> --- a/src/pdumper.c >> +++ b/src/pdumper.c >> @@ -5958,6 +5958,7 @@ pdumper_load (const char *dump_filename, char *argv0) >> & ~(DUMP_ALIGNMENT - 1)); >> void *hot_start = (void *) (dump_base + aligned_header_size); >> void *hot_end = (void *) (dump_base + header->discardable_start); >> + igc_on_pdump_loaded (hot_start, hot_end); >> #endif > > AFAIU, at the point above the necessary relocations haven't yet been > done, so the dump is still in its "raw" form as it is in the file. Which > means, among other things, that Lisp_Objects haven't been changed to > point to where the dump is mmap'd or where the data segment of Emacs is > and so on. So I don't think that would work. 
> >> dump_do_all_dump_reloc_for_phase (header, dump_base, EARLY_RELOCS); >> @@ -6002,10 +6003,6 @@ pdumper_load (const char *dump_filename, char *argv0) >> dump_private.load_time = timespectod (load_timespec); >> dump_private.dump_filename = dump_filename_copy; >> >> -# ifdef HAVE_MPS >> - igc_on_pdump_loaded (hot_start, hot_end); >> -# endif >> - >> out: >> for (int i = 0; i < ARRAYELTS (sections); ++i) >> dump_mmap_release (&sections[i]); > > But, what if we park MPS while we are loading the dump? WDYT? That would be something like the below. Alas, it doesn't change the asserts in the native comp build. Unstaged modified lisp/emacs-lisp/comp.el @@ -54,7 +54,7 @@ comp "Emacs Lisp native compiler." :group 'lisp) -(defcustom native-comp-speed 2 +(defcustom native-comp-speed 0 "Optimization level for native compilation, a number between -1 and 3. -1 functions are kept in bytecode form and no native compilation is performed (but *.eln files are still produced, and include the compiled code in @@ -68,7 +68,7 @@ native-comp-speed :safe #'integerp :version "28.1") -(defcustom native-comp-debug 0 +(defcustom native-comp-debug 2 "Debug level for native compilation, a number between 0 and 3. This is intended for debugging the compiler itself. 0 no debug output. 
modified src/igc.c @@ -2281,6 +2281,18 @@ igc_park_arena (void) return count; } +void +igc_park (void) +{ + mps_arena_park (global_igc->arena); +} + +void +igc_release (void) +{ + mps_arena_release (global_igc->arena); +} + static igc_root_list * root_find (void *start) { modified src/igc.h @@ -121,6 +121,8 @@ #define EMACS_IGC_H size_t igc_hash (Lisp_Object key); void igc_create_charset_root (void *table, size_t size); specpdl_ref igc_park_arena (void); +void igc_park (void); +void igc_release (void); void igc_check_vector (const struct Lisp_Vector *v); void igc_postmortem (void); void igc_on_grow_specpdl (void); modified src/pdumper.c @@ -5809,6 +5809,10 @@ pdumper_load (const char *dump_filename, char *argv0) /* We can load only one dump. */ eassert (!dump_loaded_p ()); +#ifdef HAVE_MPS + igc_park (); +# endif + int err; int dump_fd = emacs_open_noquit (dump_filename, O_RDONLY, 0); if (dump_fd < 0) @@ -5980,6 +5984,9 @@ pdumper_load (const char *dump_filename, char *argv0) if (dump_fd >= 0) emacs_close (dump_fd); +#ifdef HAVE_MPS + igc_release (); +#endif return err; } ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-22 6:07 ` Gerd Möllmann 2024-05-22 6:27 ` Gerd Möllmann @ 2024-05-22 6:34 ` Helmut Eller 2024-05-22 6:56 ` Gerd Möllmann 1 sibling, 1 reply; 72+ messages in thread From: Helmut Eller @ 2024-05-22 6:34 UTC (permalink / raw) To: Gerd Möllmann; +Cc: Andrea Corallo, Emacs Devel, Eli Zaretskii On Wed, May 22 2024, Gerd Möllmann wrote: > Helmut Eller <eller.helmut@gmail.com> writes: > >>> I've checked that d_reloc is indeed scanned by fix_comp_unit. The >>> check gives me reasonable confidence that this "should work". But as >>> an alternative, I also made all the things like d_reloc in the .elns >>> ambiguous roots, so that they cannot possibly be moved, if all works as >>> expected. >> >> Registering the dump as root happens rather late. The relocation code >> allocates a hash table and stores a reference to it in >> comp_u->lambda_gc_guard_h. By that time the dump should already be a >> root. Can we register the dump earlier? AFAIU, the dumper writes zeros >> in the cells for to-be-relocated references and the scan code will >> ignore them. So I think this could work: >> > AFAIU, at the point above the necessary relocations haven't yet been > done, so the dump is still in its "raw" form as it is in the file. Which > means, among other things, that Lisp_Objects haven't been changed to > point to where the dump is mmap'd or where the data segment of Emacs is > and so on. So I don't think that would work. As I said: NULLs shouldn't cause any trouble. The scan code ignores that. And pdumper_next_object seems to do its own address calculations. Can you try the patch? (And perhaps delete ~/.emacs.d/eln-cache/* and remove write permissions. I've seen elns being loaded from there, which is not good for repeatability.) > But, what if we park MPS while we are loading the dump? WDYT? Is it possible to allocate from a parked arena? ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-22 6:34 ` Helmut Eller @ 2024-05-22 6:56 ` Gerd Möllmann 2024-05-22 7:59 ` Helmut Eller 0 siblings, 1 reply; 72+ messages in thread From: Gerd Möllmann @ 2024-05-22 6:56 UTC (permalink / raw) To: Helmut Eller; +Cc: Andrea Corallo, Emacs Devel, Eli Zaretskii Helmut Eller <eller.helmut@gmail.com> writes: > On Wed, May 22 2024, Gerd Möllmann wrote: > >> Helmut Eller <eller.helmut@gmail.com> writes: >> >>>> I've checked that d_reloc is indeed scanned by fix_comp_unit. The >>>> check gives me reasonable confidence that this "should work". But as >>>> an alternative, I also made all the things like d_reloc in the .elns >>>> ambiguous roots, so that they cannot possibly be moved, if all works as >>>> expected. >>> >>> Registering the dump as root happens rather late. The relocation code >>> allocates a hash table and stores a reference to it in >>> comp_u->lambda_gc_guard_h. By that time the dump should already be a >>> root. Can we register the dump earlier? AFAIU, the dumper writes zeros >>> in the cells for to-be-relocated references and the scan code will >>> ignore them. So I think this could work: >>> >> AFAIU, at the point above the necessary relocations haven't yet been >> done, so the dump is still in its "raw" form as it is in the file. Which >> means, among other things, that Lisp_Objects haven't been changed to >> point to where the dump is mmap'd or where the data segment of Emacs is >> and so on. So I don't think that would work. > > As I said: NULLs shouldn't cause any trouble. The scan code ignores > that. And pdumper_next_object seems to do its own address calculations. > > Can you try the patch? Yes, of course, sorry. > (And perhaps delete ~/.emacs.d/eln-cache/* and remove write > permissions. I've seen elns being loaded from there, which is not > good for repeatability.) 
So, clean build and alas no visible change igc.c:345: Emacs fatal error: assertion failed: h->obj_type != IGC_OBJ_FWD gmake[3]: *** [Makefile:285: ../lisp/files.elc] Abort trap: 6 gmake[2]: *** [Makefile:933: ../lisp/files.elc] Error 2 ELC+ELN ../lisp/emacs-lisp/syntax.elc ELC+ELN ../lisp/emacs-lisp/cconv.elc ELC+ELN ../lisp/emacs-lisp/tabulated-list.elc ELC+ELN ../lisp/faces.elc igc.c:345: Emacs fatal error: assertion failed: h->obj_type != IGC_OBJ_FWD gmake[3]: *** [Makefile:285: ../lisp/faces.elc] Abort trap: 6 >> But, what if we park MPS while we are loading the dump? WDYT? > > Is it possible to allocate from a parked arena? Yes, that works. ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-22 6:56 ` Gerd Möllmann @ 2024-05-22 7:59 ` Helmut Eller 2024-05-22 8:46 ` Gerd Möllmann 0 siblings, 1 reply; 72+ messages in thread From: Helmut Eller @ 2024-05-22 7:59 UTC (permalink / raw) To: Gerd Möllmann; +Cc: Andrea Corallo, Emacs Devel, Eli Zaretskii On Wed, May 22 2024, Gerd Möllmann wrote: > So, clean build and alas no visible change > > igc.c:345: Emacs fatal error: assertion failed: h->obj_type != IGC_OBJ_FWD > gmake[3]: *** [Makefile:285: ../lisp/files.elc] Abort trap: 6 > gmake[2]: *** [Makefile:933: ../lisp/files.elc] Error 2 > ELC+ELN ../lisp/emacs-lisp/syntax.elc > ELC+ELN ../lisp/emacs-lisp/cconv.elc > ELC+ELN ../lisp/emacs-lisp/tabulated-list.elc > ELC+ELN ../lisp/faces.elc > > igc.c:345: Emacs fatal error: assertion failed: h->obj_type != IGC_OBJ_FWD > gmake[3]: *** [Makefile:285: ../lisp/faces.elc] Abort trap: 6 I don't know why this works for me. Maybe because my machine is nice and slow enough :-) I've seen a different problem, later, after all .eln files were compiled, during pdumper_load, where a hash table in comp_u->lambda_gc_guard_h wasn't traced. But I called igc_collect at the end of load_comp_unit to produce earlier crashes. >>> But, what if we park MPS while we are loading the dump? WDYT? >> >> Is it possible to allocate from a parked arena? > > Yes, that works. It seems to work. But I had to remove the call to igc_collect. ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-22 7:59 ` Helmut Eller @ 2024-05-22 8:46 ` Gerd Möllmann 2024-05-22 18:03 ` Gerd Möllmann 0 siblings, 1 reply; 72+ messages in thread From: Gerd Möllmann @ 2024-05-22 8:46 UTC (permalink / raw) To: Helmut Eller; +Cc: Andrea Corallo, Emacs Devel, Eli Zaretskii Helmut Eller <eller.helmut@gmail.com> writes: > On Wed, May 22 2024, Gerd Möllmann wrote: > >> So, clean build and alas no visible change >> >> igc.c:345: Emacs fatal error: assertion failed: h->obj_type != IGC_OBJ_FWD >> gmake[3]: *** [Makefile:285: ../lisp/files.elc] Abort trap: 6 >> gmake[2]: *** [Makefile:933: ../lisp/files.elc] Error 2 >> ELC+ELN ../lisp/emacs-lisp/syntax.elc >> ELC+ELN ../lisp/emacs-lisp/cconv.elc >> ELC+ELN ../lisp/emacs-lisp/tabulated-list.elc >> ELC+ELN ../lisp/faces.elc >> >> igc.c:345: Emacs fatal error: assertion failed: h->obj_type != IGC_OBJ_FWD >> gmake[3]: *** [Makefile:285: ../lisp/faces.elc] Abort trap: 6 > > I don't know why this works for me. Maybe because my machine is nice > and slow enough :-) Quite possible. I see differences, for example, in where it asserts between my old x86 MB and the arm MB, and between different -jN settings. Which makes me believe there is something in a native compiled function that MPS doesn't understand, and when it kicks in at the perfect time bad things happen. Or something completely different, I actually have no idea :-). ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-22 8:46 ` Gerd Möllmann @ 2024-05-22 18:03 ` Gerd Möllmann 2024-05-23 6:12 ` Gerd Möllmann 0 siblings, 1 reply; 72+ messages in thread From: Gerd Möllmann @ 2024-05-22 18:03 UTC (permalink / raw) To: Helmut Eller; +Cc: Andrea Corallo, Emacs Devel, Eli Zaretskii Gerd Möllmann <gerd.moellmann@gmail.com> writes: > Or something completely different, I actually have no idea :-). I came to one conclusion this afternoon, namely that fix_comp_unit is 100% wrong: it may not scan d_reloc etc. in the shared object because there is no synchronization between threads reading/writing these. Remains the question why making them roots did not work for me. A root ensures synchronization by stopping the world while roots are scanned. I guess I'll have to double check if that was really the same error as w/o roots, or if it only looked like it was. I'm too old for this sh*t. ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-22 18:03 ` Gerd Möllmann @ 2024-05-23 6:12 ` Gerd Möllmann 2024-05-23 6:38 ` Helmut Eller ` (3 more replies) 0 siblings, 4 replies; 72+ messages in thread From: Gerd Möllmann @ 2024-05-23 6:12 UTC (permalink / raw) To: Helmut Eller; +Cc: Andrea Corallo, Emacs Devel, Eli Zaretskii Gerd Möllmann <gerd.moellmann@gmail.com> writes: > Gerd Möllmann <gerd.moellmann@gmail.com> writes: > >> Or something completely different, I actually have no idea :-). > > I came to one conclusion this afternoon, namely that fix_comp_unit is > 100% wrong: it may not scan d_reloc etc. in the shared object > because there is no synchronization between threads reading/writing > these. > > Remains the question why making them roots did not work for me. A root > ensures synchronization by stopping the world while roots are scanned. > > I guess I'll have to double check if that was really the same error as > w/o roots, or if it only looked like it was. I've now pushed something. This doesn't make the native comp build work, but the errors are different, and in the cases of IGC_OBJ_FWD assertions I think these don't have the same cause, at least they don't follow the patterns I've previously seen in LLDB. In summary, I think this is an improvement. Could anyone (of the currently n = 3 people reporting back (not disappointing because expected)) try this? ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-23 6:12 ` Gerd Möllmann @ 2024-05-23 6:38 ` Helmut Eller 2024-05-23 6:58 ` Gerd Möllmann 2024-05-23 6:40 ` Helmut Eller ` (2 subsequent siblings) 3 siblings, 1 reply; 72+ messages in thread From: Helmut Eller @ 2024-05-23 6:38 UTC (permalink / raw) To: Gerd Möllmann; +Cc: Andrea Corallo, Emacs Devel, Eli Zaretskii > Could anyone (of the currently n = 3 people reporting back > (not disappointing because expected)) try this? So I hit this very quickly: Loading international/charscript (native compiled elisp)... igc.c:2191: Emacs fatal error: assertion failed: n > 0 For the charscript eln, u->n_data_eph_relocs is 0 and root_create_exact_n doesn't like that. ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-23 6:38 ` Helmut Eller @ 2024-05-23 6:58 ` Gerd Möllmann 0 siblings, 0 replies; 72+ messages in thread From: Gerd Möllmann @ 2024-05-23 6:58 UTC (permalink / raw) To: Helmut Eller; +Cc: Andrea Corallo, Emacs Devel, Eli Zaretskii Helmut Eller <eller.helmut@gmail.com> writes: >> Could anyone (of the currently n = 3 people reporting back >> (not disappointing because expected) try this? > > Some I hit this very quickly: > > Loading international/charscript (native compiled elisp)... > > igc.c:2191: Emacs fatal error: assertion failed: n > 0 > > For the charscript eln, u->n_data_eph_relocs is 0 and > root_create_exact_n doesn't like that. You probably get further in the build than I do. Should be fixed now. ^ permalink raw reply [flat|nested] 72+ messages in thread
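The thread does not show how the n > 0 assertion was fixed. One common way to handle an empty relocation table is to skip root creation when the element count is zero; the sketch below shows that guard with stand-in types. Every name in it (toy_root, toy_root_create_exact_n, toy_root_create_if_nonempty) is hypothetical and unrelated to the real igc.c code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a root record; not the real igc.c type.  */
struct toy_root
{
  void *start;   /* first cell covered by the root */
  size_t n;      /* number of cells */
};

/* Stand-in for a creation function that, like the one mentioned in the
   thread, asserts that the table is not empty.  */
static struct toy_root *
toy_root_create_exact_n (void *start, size_t n)
{
  assert (n > 0);
  struct toy_root *r = malloc (sizeof *r);
  if (r != NULL)
    {
      r->start = start;
      r->n = n;
    }
  return r;
}

/* Guarded wrapper: an empty table needs no root, so return NULL
   instead of tripping the assertion.  */
struct toy_root *
toy_root_create_if_nonempty (void *start, size_t n)
{
  if (n == 0)
    return NULL;
  return toy_root_create_exact_n (start, n);
}
```

A caller written this way has to tolerate a NULL root when it later destroys the roots of a compilation unit.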
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-23 6:12 ` Gerd Möllmann 2024-05-23 6:38 ` Helmut Eller @ 2024-05-23 6:40 ` Helmut Eller 2024-05-23 6:58 ` Gerd Möllmann 2024-05-23 7:50 ` Andrea Corallo 2024-05-23 8:30 ` Ihor Radchenko 3 siblings, 1 reply; 72+ messages in thread From: Helmut Eller @ 2024-05-23 6:40 UTC (permalink / raw) To: Gerd Möllmann; +Cc: Andrea Corallo, Emacs Devel, Eli Zaretskii > In summary, I think this is an improvement. I also think that roots should be registered before top_level_run is executed. ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-23 6:40 ` Helmut Eller @ 2024-05-23 6:58 ` Gerd Möllmann 2024-05-23 7:12 ` Helmut Eller 0 siblings, 1 reply; 72+ messages in thread From: Gerd Möllmann @ 2024-05-23 6:58 UTC (permalink / raw) To: Helmut Eller; +Cc: Andrea Corallo, Emacs Devel, Eli Zaretskii Helmut Eller <eller.helmut@gmail.com> writes: >> In summary, I think this is an improvement. > > I also think that roots should be registered before top_level_run is > executed. Could you please elaborate? ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-23 6:58 ` Gerd Möllmann @ 2024-05-23 7:12 ` Helmut Eller 2024-05-23 7:35 ` Gerd Möllmann 2024-05-23 7:38 ` Andrea Corallo 0 siblings, 2 replies; 72+ messages in thread From: Helmut Eller @ 2024-05-23 7:12 UTC (permalink / raw) To: Gerd Möllmann; +Cc: Andrea Corallo, Emacs Devel, Eli Zaretskii On Thu, May 23 2024, Gerd Möllmann wrote: > Helmut Eller <eller.helmut@gmail.com> writes: > >>> In summary, I think this is an improvement. >> >> I also think that roots should be registered before top_level_run is >> executed. > > Could you please elaborate? AFAIU, top_level_run executes the top level code of a file. I.e. it usually creates a lot of functions and puts them into the symbol-function slots; for lambdas it puts the subr in d_reloc (see comp--register-lambda). In general, top_level_run can trigger GC flips because it calls make_subr a lot. It would be problematic if d_reloc were not traced during those GC flips. text_data_reloc_eph probably too, but I have no clue what that is for. ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-23 7:12 ` Helmut Eller @ 2024-05-23 7:35 ` Gerd Möllmann 2024-05-23 7:38 ` Andrea Corallo 1 sibling, 0 replies; 72+ messages in thread From: Gerd Möllmann @ 2024-05-23 7:35 UTC (permalink / raw) To: Helmut Eller; +Cc: Andrea Corallo, Emacs Devel, Eli Zaretskii Helmut Eller <eller.helmut@gmail.com> writes: > On Thu, May 23 2024, Gerd Möllmann wrote: > >> Helmut Eller <eller.helmut@gmail.com> writes: >> >>>> In summary, I think this is an improvement. >>> >>> I also think that roots should be registered before top_level_run is >>> executed. >> >> Could you please elaborate? > > AFAIU, top_level_run executes the top level code of a file. I.e. it > usually creates a lot of functions and puts them into the > symbol-function slots; for lambdas it puts the subr in d_reloc (see > comp--register-lambda). > > In general, top_level_run can trigger GC flips because it calls > make_subr a lot. It would be problematic if d_reloc where not traced > during those GC flips. text_data_reloc_eph probably too, but I have no > clue what that is for. Thanks, I'll take a look. ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-23 7:12 ` Helmut Eller 2024-05-23 7:35 ` Gerd Möllmann @ 2024-05-23 7:38 ` Andrea Corallo 1 sibling, 0 replies; 72+ messages in thread From: Andrea Corallo @ 2024-05-23 7:38 UTC (permalink / raw) To: Helmut Eller; +Cc: Gerd Möllmann, Emacs Devel, Eli Zaretskii Helmut Eller <eller.helmut@gmail.com> writes: > On Thu, May 23 2024, Gerd Möllmann wrote: > >> Helmut Eller <eller.helmut@gmail.com> writes: >> >>>> In summary, I think this is an improvement. >>> >>> I also think that roots should be registered before top_level_run is >>> executed. >> >> Could you please elaborate? > > AFAIU, top_level_run executes the top level code of a file. I.e. it > usually creates a lot of functions and puts them into the > symbol-function slots; for lambdas it puts the subr in d_reloc (see > comp--register-lambda). > > In general, top_level_run can trigger GC flips because it calls > make_subr a lot. It would be problematic if d_reloc where not traced > during those GC flips. +1 > text_data_reloc_eph probably too, but I have no > clue what that is for. 'd_reloc_eph' is data used only at load time and never again. Note that 'top_level_run' uses 'd_reloc' as well so yes I believe as well they need all to be tracked before 'top_level_run' runs. Andrea ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-23 6:12 ` Gerd Möllmann 2024-05-23 6:38 ` Helmut Eller 2024-05-23 6:40 ` Helmut Eller @ 2024-05-23 7:50 ` Andrea Corallo 2024-05-23 7:52 ` Gerd Möllmann 2024-05-23 8:30 ` Ihor Radchenko 3 siblings, 1 reply; 72+ messages in thread From: Andrea Corallo @ 2024-05-23 7:50 UTC (permalink / raw) To: Gerd Möllmann; +Cc: Helmut Eller, Emacs Devel, Eli Zaretskii Gerd Möllmann <gerd.moellmann@gmail.com> writes: > Gerd Möllmann <gerd.moellmann@gmail.com> writes: > >> Gerd Möllmann <gerd.moellmann@gmail.com> writes: >> >>> Or something completely different, I actually have no idea :-). >> >> I came to one conclusion this afternoon nemely that fix_comp_unit is >> 100% wrong: it may not scan d_reloc etc. in the shared object >> because there is no synchronization between threads reading/writing >> these. >> >> Remains the question why making them roots did not work for me. A root >> ensures synchronization by stopping the world while roots are scanned. >> >> I guess I'll have to double check if that was really the same error as >> w/o roots, or if it only looked like it was. > > I've now pushed something. > > This doesn't make the native comp build work, but the errors are > different, and in the cases of IGC_OBJ_FWD assertions I think these > don't have the same cause, at least they don't follow the patterns I've > previously seen in LLDB. > > In summary, I think this is an improvement. > > Could anyone (of the currently n = 3 people reporting back > (not disappointing because expected) try this? 
Hi Gerd, on I 2bd8ee79d93 I get: igc.c:2189: Emacs fatal error: assertion failed: n > 0 Backtrace: ../src/emacs(+0x226e87)[0x5e2f8ba24e87] ../src/emacs(+0x6047d)[0x5e2f8b85e47d] ../src/emacs(+0x34a2ac)[0x5e2f8bb482ac] ../src/emacs(+0x34bd88)[0x5e2f8bb49d88] ../src/emacs(+0x34d968)[0x5e2f8bb4b968] ../src/emacs(+0x3128c6)[0x5e2f8bb108c6] ../src/emacs(+0x312fd1)[0x5e2f8bb10fd1] ../src/emacs(+0x2a9bec)[0x5e2f8baa7bec] /home/andcor03/emacs4/native-lisp/30.0.50-74ae6c6c/comp-7672a6ed-f5e09f0c.eln(F636f6d702d2d6e61746976652d636f6d70696c65_comp__native_compile_0+0xb39)[0x7e83c178fcd9] ../src/emacs(+0x2a9bec)[0x5e2f8baa7bec] /home/andcor03/emacs4/native-lisp/30.0.50-74ae6c6c/comp-7672a6ed-f5e09f0c.eln(F636f6d702d7472616d706f6c696e652d636f6d70696c65_comp_trampoline_compile_0+0x232)[0x7e83c178e522] ../src/emacs(+0x2a9bec)[0x5e2f8baa7bec] /home/andcor03/emacs4/native-lisp/30.0.50-74ae6c6c/comp-run-a15747ee-f15895e9.eln(F636f6d702d737562722d7472616d706f6c696e652d696e7374616c6c_comp_subr_trampoline_install_0+0x1e7)[0x7e83c17160d7] ../src/emacs(+0x2a9bec)[0x5e2f8baa7bec] ../src/emacs(+0x28d7c5)[0x5e2f8ba8b7c5] ../src/emacs(+0x3060ef)[0x5e2f8bb040ef] ../src/emacs(+0x2a9bec)[0x5e2f8baa7bec] /home/andcor03/emacs4/src/../native-lisp/30.0.50-74ae6c6c/preloaded/nadvice-64630aaa-9efa993d.eln(F6164766963652d2d6164642d66756e6374696f6e_advice__add_function_0+0x217)[0x7e83c1a331e7] ../src/emacs(+0x2a9bec)[0x5e2f8baa7bec] /home/andcor03/emacs4/src/../native-lisp/30.0.50-74ae6c6c/preloaded/nadvice-64630aaa-9efa993d.eln(F6164766963652d616464_advice_add_0+0x19e)[0x7e83c1a3462e] ../src/emacs(+0x2ae310)[0x5e2f8baac310] ../src/emacs(+0x2ae6b4)[0x5e2f8baac6b4] [...] Andrea ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-23 7:50 ` Andrea Corallo @ 2024-05-23 7:52 ` Gerd Möllmann 2024-05-23 7:57 ` Gerd Möllmann 2024-05-23 8:12 ` Helmut Eller 0 siblings, 2 replies; 72+ messages in thread From: Gerd Möllmann @ 2024-05-23 7:52 UTC (permalink / raw) To: Andrea Corallo; +Cc: Helmut Eller, Emacs Devel, Eli Zaretskii Andrea Corallo <acorallo@gnu.org> writes: > on I 2bd8ee79d93 I get: > > igc.c:2189: Emacs fatal error: assertion failed: n > 0 Thanks, Andrea. I think that is already fixed, but if you wait a bit, I'll also push the root creation before running the toplevel forms. ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-23 7:52 ` Gerd Möllmann @ 2024-05-23 7:57 ` Gerd Möllmann 2024-05-23 8:12 ` Helmut Eller 1 sibling, 0 replies; 72+ messages in thread From: Gerd Möllmann @ 2024-05-23 7:57 UTC (permalink / raw) To: Andrea Corallo; +Cc: Helmut Eller, Emacs Devel, Eli Zaretskii Gerd Möllmann <gerd.moellmann@gmail.com> writes: > Andrea Corallo <acorallo@gnu.org> writes: > >> on I 2bd8ee79d93 I get: >> >> igc.c:2189: Emacs fatal error: assertion failed: n > 0 > > Thanks, Andrea. I think that is already fixed, but if you wait a bit, > I'll also push the root creation before running the toplevel forms. Now done. ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-23 7:52 ` Gerd Möllmann 2024-05-23 7:57 ` Gerd Möllmann @ 2024-05-23 8:12 ` Helmut Eller 2024-05-23 8:18 ` Gerd Möllmann 1 sibling, 1 reply; 72+ messages in thread From: Helmut Eller @ 2024-05-23 8:12 UTC (permalink / raw) To: Gerd Möllmann; +Cc: Andrea Corallo, Emacs Devel, Eli Zaretskii On Thu, May 23 2024, Gerd Möllmann wrote: > Andrea Corallo <acorallo@gnu.org> writes: > >> on I 2bd8ee79d93 I get: >> >> igc.c:2189: Emacs fatal error: assertion failed: n > 0 > > Thanks, Andrea. I think that is already fixed, but if you wait a bit, > I'll also push the root creation before running the toplevel forms. For me, it's the same assertion but this time because n_data_imp_relocs=0. In some subr--trampoline-6d6163726f657870616e64_macroexpand_0.eln. ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-23 8:12 ` Helmut Eller @ 2024-05-23 8:18 ` Gerd Möllmann 2024-05-23 8:42 ` Andrea Corallo 0 siblings, 1 reply; 72+ messages in thread From: Gerd Möllmann @ 2024-05-23 8:18 UTC (permalink / raw) To: Helmut Eller; +Cc: Andrea Corallo, Emacs Devel, Eli Zaretskii Helmut Eller <eller.helmut@gmail.com> writes: > On Thu, May 23 2024, Gerd Möllmann wrote: > >> Andrea Corallo <acorallo@gnu.org> writes: >> >>> on I 2bd8ee79d93 I get: >>> >>> igc.c:2189: Emacs fatal error: assertion failed: n > 0 >> >> Thanks, Andrea. I think that is already fixed, but if you wait a bit, >> I'll also push the root creation before running the toplevel forms. > > For me, it's the same assertion but this time because > n_data_imp_relocs=0. In some > subr--trampoline-6d6163726f657870616e64_macroexpand_0.eln. Sorry for that. Should be fixed now. ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-23 8:18 ` Gerd Möllmann @ 2024-05-23 8:42 ` Andrea Corallo 2024-05-23 9:06 ` Helmut Eller 0 siblings, 1 reply; 72+ messages in thread From: Andrea Corallo @ 2024-05-23 8:42 UTC (permalink / raw) To: Gerd Möllmann; +Cc: Helmut Eller, Emacs Devel, Eli Zaretskii Gerd Möllmann <gerd.moellmann@gmail.com> writes: > Helmut Eller <eller.helmut@gmail.com> writes: > >> On Thu, May 23 2024, Gerd Möllmann wrote: >> >>> Andrea Corallo <acorallo@gnu.org> writes: >>> >>>> on I 2bd8ee79d93 I get: >>>> >>>> igc.c:2189: Emacs fatal error: assertion failed: n > 0 >>> >>> Thanks, Andrea. I think that is already fixed, but if you wait a bit, >>> I'll also push the root creation before running the toplevel forms. >> >> For me, it's the same assertion but this time because >> n_data_imp_relocs=0. In some >> subr--trampoline-6d6163726f657870616e64_macroexpand_0.eln. > > Sorry, for that. Should be fixed now. 1b16c64669d just bootstrapped fine for me here (even with --with-native-compilation=aot). Congrats 🎉😀 Andrea PS I'll do a couple of more to be 100% sure ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-23 8:42 ` Andrea Corallo @ 2024-05-23 9:06 ` Helmut Eller 2024-05-23 9:14 ` Gerd Möllmann 0 siblings, 1 reply; 72+ messages in thread From: Helmut Eller @ 2024-05-23 9:06 UTC (permalink / raw) To: Andrea Corallo; +Cc: Gerd Möllmann, Emacs Devel, Eli Zaretskii On Thu, May 23 2024, Andrea Corallo wrote: > 1b16c64669d just bootstrapped fine for me here For me too. ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-23 9:06 ` Helmut Eller @ 2024-05-23 9:14 ` Gerd Möllmann 2024-05-23 9:39 ` Helmut Eller 2024-05-23 12:09 ` Andrea Corallo 0 siblings, 2 replies; 72+ messages in thread From: Gerd Möllmann @ 2024-05-23 9:14 UTC (permalink / raw) To: Helmut Eller; +Cc: Andrea Corallo, Emacs Devel, Eli Zaretskii Helmut Eller <eller.helmut@gmail.com> writes: > On Thu, May 23 2024, Andrea Corallo wrote: > >> 1b16c64669d just bootstrapped fine for me here > > For me too. Not for me :-(. ELC+ELN ../lisp/replace.elc igc.c:345: Emacs fatal error: assertion failed: h->obj_type != IGC_OBJ_FWD gmake[3]: *** [Makefile:285: ../lisp/replace.elc] Abort trap: 6 gmake[2]: *** [Makefile:925: ../lisp/replace.elc] Error 2 gmake[2]: *** Waiting for unfinished jobs.... ELC+ELN ../lisp/version.elc ELC+ELN ../lisp/vc/vc-hooks.elc ELC+ELN ../lisp/textmodes/fill.elc ELC+ELN ../lisp/w32-fns.elc ELC+ELN ../lisp/uniquify.elc ELC+ELN ../lisp/subr.elc igc.c:345: Emacs fatal error: assertion failed: h->obj_type != IGC_OBJ_FWD gmake[3]: *** [Makefile:285: ../lisp/subr.elc] Abort trap: 6 That's from git clean -xdf, $ /Users/gerd/emacs/savannah/igc/configure --cache-file /var/folders/1d/k_6t25f94sl83szqbf8gpkrh0000gn/T//config.cache.igc --enable-checking=all --with-native-compilation=aot --with-mps=debug CC=clang 'CFLAGS=-g -O0' On a 10 core M1 pro, -j10 Damn. ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-23 9:14 ` Gerd Möllmann @ 2024-05-23 9:39 ` Helmut Eller 2024-05-23 10:19 ` Gerd Möllmann 2024-05-23 12:09 ` Andrea Corallo 1 sibling, 1 reply; 72+ messages in thread From: Helmut Eller @ 2024-05-23 9:39 UTC (permalink / raw) To: Gerd Möllmann; +Cc: Andrea Corallo, Emacs Devel, Eli Zaretskii On Thu, May 23 2024, Gerd Möllmann wrote: > Not for me :-(. Can something like for (EMACS_INT i = 0; i < d_vec_len; i++) data_relocs[i] = AREF (comp_u->data_vec, i); trigger a GC flip in the middle of the array? Maybe only on macOS? Maybe we should be extra careful and register the roots before writing to data_relocs? ^ permalink raw reply [flat|nested] 72+ messages in thread
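The concern raised above (a flip happening while the reloc array is only partly filled) can be made concrete with a toy model. This is explicitly not MPS and makes no claim about whether MPS can actually flip at that point: slot_t, struct toy_gc and all toy_* functions are hypothetical, and the "flip" is simulated by hand after the first store.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model: an "object" is an int, a slot is a pointer to one.  */
typedef int *slot_t;

struct toy_gc
{
  slot_t *root_start;           /* registered range, or NULL if none */
  size_t root_len;
};

void
toy_register_root (struct toy_gc *gc, slot_t *start, size_t n)
{
  gc->root_start = start;
  gc->root_len = n;
}

/* Simulated flip: the object at FROM moves to TO.  Only slots inside
   the registered root are updated; any other slot keeps the stale
   address.  */
void
toy_flip (struct toy_gc *gc, int *from, int *to)
{
  *to = *from;
  for (size_t i = 0; i < gc->root_len; i++)
    if (gc->root_start[i] == from)
      gc->root_start[i] = to;
}

/* Copy SRC into DST with a simulated flip right after the first
   element.  Return how many DST slots still point at FROM.  */
size_t
toy_fill_with_flip (struct toy_gc *gc, slot_t *dst, const slot_t *src,
                    size_t n, int *from, int *to)
{
  for (size_t i = 0; i < n; i++)
    {
      dst[i] = src[i];
      if (i == 0)
        toy_flip (gc, from, to);
    }

  size_t stale = 0;
  for (size_t i = 0; i < n; i++)
    if (dst[i] == from)
      stale++;
  return stale;
}
```

In this model, registering the destination array before the fill loop leaves no stale slot, while filling an unregistered array leaves the first slot pointing at the old location. That is the ordering argument for creating the roots before writing to data_relocs.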
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-23 9:39 ` Helmut Eller @ 2024-05-23 10:19 ` Gerd Möllmann 2024-05-23 10:35 ` Ihor Radchenko ` (2 more replies) 0 siblings, 3 replies; 72+ messages in thread From: Gerd Möllmann @ 2024-05-23 10:19 UTC (permalink / raw) To: Helmut Eller; +Cc: Andrea Corallo, Emacs Devel, Eli Zaretskii Helmut Eller <eller.helmut@gmail.com> writes: > On Thu, May 23 2024, Gerd Möllmann wrote: > >> Not for me :-(. > > Can something like > > for (EMACS_INT i = 0; i < d_vec_len; i++) > data_relocs[i] = AREF (comp_u->data_vec, i); > > trigger a GC flip in the middle of the array? Hm. > Maybe only on macOS? Maybe we should be extra > careful and register the roots before writing to data_relocs? Could be. Even nicer would be if things like struct Lisp_X *d_reloc[123] were struct Lisp_X **d_reloc. Then one could just set a pointer, for example to v->contents of a vector that is currently there anyway. No idea why that isn't the case. I'll let it rest a bit now. Ihor will meanwhile take over ;-). ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-23 10:19 ` Gerd Möllmann @ 2024-05-23 10:35 ` Ihor Radchenko 2024-05-23 11:27 ` Gerd Möllmann 2024-05-23 12:15 ` Andrea Corallo 2024-05-23 14:46 ` Helmut Eller 2 siblings, 1 reply; 72+ messages in thread From: Ihor Radchenko @ 2024-05-23 10:35 UTC (permalink / raw) To: Gerd Möllmann Cc: Helmut Eller, Andrea Corallo, Emacs Devel, Eli Zaretskii Gerd Möllmann <gerd.moellmann@gmail.com> writes: > Ihor will meanwhile take over ;-). If you need specific help, let me know what is needed. A while ago, I have managed to compile the whole thing and run Emacs for some time with random crashes or messages like "memory exhausted", but did not want to report a pile of random problems while you are focusing on native compilation support. ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-23 10:35 ` Ihor Radchenko @ 2024-05-23 11:27 ` Gerd Möllmann 0 siblings, 0 replies; 72+ messages in thread From: Gerd Möllmann @ 2024-05-23 11:27 UTC (permalink / raw) To: Ihor Radchenko; +Cc: Helmut Eller, Andrea Corallo, Emacs Devel, Eli Zaretskii Ihor Radchenko <yantar92@posteo.net> writes: > Gerd Möllmann <gerd.moellmann@gmail.com> writes: > >> Ihor will meanwhile take over ;-). > > If you need specific help, let me know what is needed. > A while ago, I have managed to compile the whole thing and run Emacs for > some time with random crashes or messages like "memory exhausted", but > did not want to report a pile of random problems while you are focusing > on native compilation support. Don't take me too seriously, just a bit of teasing :-) ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-23 10:19 ` Gerd Möllmann 2024-05-23 10:35 ` Ihor Radchenko @ 2024-05-23 12:15 ` Andrea Corallo 2024-05-23 14:46 ` Helmut Eller 2 siblings, 0 replies; 72+ messages in thread From: Andrea Corallo @ 2024-05-23 12:15 UTC (permalink / raw) To: Gerd Möllmann; +Cc: Helmut Eller, Emacs Devel, Eli Zaretskii Gerd Möllmann <gerd.moellmann@gmail.com> writes: > Helmut Eller <eller.helmut@gmail.com> writes: > >> On Thu, May 23 2024, Gerd Möllmann wrote: >> >>> Not for me :-(. >> >> Can something like >> >> for (EMACS_INT i = 0; i < d_vec_len; i++) >> data_relocs[i] = AREF (comp_u->data_vec, i); >> >> trigger a GC flip in the middle of the array? > > Hm. > >> Maybe only on macOS? Maybe we should be extra >> careful and register the roots before writing to data_relocs? > > Could be. > > Even nices would we if things like struct Lisp_X *d_reloc|[123] were > struct Lisp_X **d_reloc. Then one could just set a pointer, for example > to v->contents of a vector that is currently there anyway. No idea why > that isn't the case. I don't think we want an indirection more there, generated code would be slower. Andrea ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-23 10:19 ` Gerd Möllmann 2024-05-23 10:35 ` Ihor Radchenko 2024-05-23 12:15 ` Andrea Corallo @ 2024-05-23 14:46 ` Helmut Eller 2024-05-23 16:40 ` Gerd Möllmann 2 siblings, 1 reply; 72+ messages in thread From: Helmut Eller @ 2024-05-23 14:46 UTC (permalink / raw) To: Gerd Möllmann; +Cc: Andrea Corallo, Emacs Devel, Eli Zaretskii [-- Attachment #1: Type: text/plain, Size: 230 bytes --] On Thu, May 23 2024, Gerd Möllmann wrote: > I'll let it rest a bit now. When you have some time to spare, could you try the patch below? I think that load_static_obj can trigger GC and we should be prepared for that. [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: 0001-Register-roots-for-comp-units-earlier.patch --] [-- Type: text/x-diff, Size: 6861 bytes --] From 000702e21a347d3bdd2b74d025ed9f358a2b6702 Mon Sep 17 00:00:00 2001 From: Helmut Eller <helmut@249.130.205.92.host.secureserver.net> Date: Thu, 23 May 2024 16:42:06 +0200 Subject: [PATCH] Register roots for comp units earlier --- src/comp.c | 23 +++++++++++------------ src/comp.h | 5 ++++- src/igc.c | 46 +++++++++++++++++++++++----------------------- src/igc.h | 3 +++ 4 files changed, 41 insertions(+), 36 deletions(-) diff --git a/src/comp.c b/src/comp.c index fdf99d51c78..a3756857205 100644 --- a/src/comp.c +++ b/src/comp.c @@ -5369,6 +5369,7 @@ load_comp_unit (struct Lisp_Native_Comp_Unit *comp_u, bool loading_dump, #ifdef HAVE_MPS comp_u->data_relocs = dynlib_sym (handle, DATA_RELOC_SYM); comp_u->comp_unit = dynlib_sym (handle, COMP_UNIT_SYM); + comp_u->comp_unit_root = igc_root_create_n (saved_cu, 1); #endif if (!comp_u->loaded_once) @@ -5418,6 +5419,8 @@ load_comp_unit (struct Lisp_Native_Comp_Unit *comp_u, bool loading_dump, EMACS_INT d_vec_len = XFIXNUM (Flength (comp_u->data_vec)); #ifdef HAVE_MPS comp_u->n_data_relocs = d_vec_len; + if (d_vec_len > 0) + comp_u->data_relocs_root = igc_root_create_n 
(data_relocs, d_vec_len); #endif for (EMACS_INT i = 0; i < d_vec_len; i++) data_relocs[i] = AREF (comp_u->data_vec, i); @@ -5425,6 +5428,9 @@ load_comp_unit (struct Lisp_Native_Comp_Unit *comp_u, bool loading_dump, d_vec_len = XFIXNUM (Flength (comp_u->data_impure_vec)); #ifdef HAVE_MPS comp_u->n_data_imp_relocs = d_vec_len; + if (d_vec_len > 0) + comp_u->data_imp_relocs_root + = igc_root_create_n (data_imp_relocs, d_vec_len); #endif for (EMACS_INT i = 0; i < d_vec_len; i++) data_imp_relocs[i] = AREF (comp_u->data_impure_vec, i); @@ -5452,22 +5458,19 @@ load_comp_unit (struct Lisp_Native_Comp_Unit *comp_u, bool loading_dump, load_static_obj (comp_u, TEXT_DATA_RELOC_EPHEMERAL_SYM); EMACS_INT d_vec_len = XFIXNUM (Flength (data_ephemeral_vec)); - for (EMACS_INT i = 0; i < d_vec_len; i++) - data_eph_relocs[i] = AREF (data_ephemeral_vec, i); # ifdef HAVE_MPS /* FIXME: If we want to get rid of these objects, stop tracing these references at some point. */ comp_u->data_eph_relocs = data_eph_relocs; comp_u->n_data_eph_relocs = d_vec_len; + if (d_vec_len > 0) + comp_u->data_eph_relocs_root + = igc_root_create_n (data_eph_relocs, d_vec_len); # endif + for (EMACS_INT i = 0; i < d_vec_len; i++) + data_eph_relocs[i] = AREF (data_ephemeral_vec, i); } - /* FIXME: Remvoe eph root once it is no longer needed. */ -# ifdef HAVE_MPS - if (comp_u->igc_info == NULL) - igc_root_create_comp_unit (comp_u); -# endif - /* Executing this will perform all the expected environment modifications. */ res = top_level_run (comp_u_lisp_obj); @@ -5477,10 +5480,6 @@ load_comp_unit (struct Lisp_Native_Comp_Unit *comp_u, bool loading_dump, eassert (check_comp_unit_relocs (comp_u)); } -# ifdef HAVE_MPS - if (comp_u->igc_info == NULL) - igc_root_create_comp_unit (comp_u); -# endif if (!recursive_load) /* Clean-up the load ongoing flag in case. 
*/ diff --git a/src/comp.h b/src/comp.h index e2ff7296722..c48e93a0242 100644 --- a/src/comp.h +++ b/src/comp.h @@ -54,7 +54,10 @@ #define COMP_H Lisp_Object *data_eph_relocs; size_t n_data_eph_relocs; Lisp_Object *comp_unit; - void *igc_info; + struct igc_root *data_relocs_root; + struct igc_root *data_imp_relocs_root; + struct igc_root *data_eph_relocs_root; + struct igc_root *comp_unit_root; # endif bool loaded_once; bool load_ongoing; diff --git a/src/igc.c b/src/igc.c index 7cb42722a85..dc4512713f7 100644 --- a/src/igc.c +++ b/src/igc.c @@ -1724,7 +1724,9 @@ fix_terminal (mps_ss_t ss, struct terminal *t) { IGC_FIX_CALL_FN (ss, struct Lisp_Vector, t, fix_vectorlike); IGC_FIX12_RAW (ss, &t->next_terminal); +#ifdef HAVE_WINDOW_SYSTEM IGC_FIX12_RAW (ss, &t->image_cache); +#endif // These are malloc'd, so they can be accessed. IGC_FIX_CALL_FN (ss, struct coding_system, t->keyboard_coding, fix_coding); IGC_FIX_CALL_FN (ss, struct coding_system, t->terminal_coding, fix_coding); @@ -2190,37 +2192,33 @@ root_create_exact_n (Lisp_Object *start, size_t n) return root_create_exact (global_igc, start, start + n, scan_exact); } +struct igc_root * +igc_root_create_n (Lisp_Object start[], size_t n) +{ + return (struct igc_root *)root_create_exact_n (start, n); +} + +void +igc_root_destroy (struct igc_root **root) +{ + destroy_root ((struct igc_root_list **)root); +} + void igc_root_create_comp_unit (struct Lisp_Native_Comp_Unit *u) { - struct igc_cu_roots *r = xzalloc (sizeof *r); - if (u->n_data_relocs) - r->data_relocs = root_create_exact_n (u->data_relocs, u->n_data_relocs); - if (u->n_data_imp_relocs) - r->data_imp_relocs - = root_create_exact_n (u->data_imp_relocs, u->n_data_imp_relocs); - if (u->n_data_eph_relocs) - r->data_eph_relocs - = root_create_exact_n (u->data_eph_relocs, u->n_data_eph_relocs); - r->comp_unit = root_create_exact_n (u->comp_unit, 1); - igc_assert (u->igc_info == NULL); - u->igc_info = r; } void igc_root_destroy_comp_unit (struct 
Lisp_Native_Comp_Unit *u) { - igc_assert (u->igc_info != NULL); - struct igc_cu_roots *r = u->igc_info; - if (r->data_relocs) - destroy_root (&r->data_relocs); - if (r->data_imp_relocs) - destroy_root (&r->data_imp_relocs); - if (r->data_eph_relocs) - destroy_root (&r->data_eph_relocs); - destroy_root (&r->comp_unit); - xfree (r); - u->igc_info = NULL; + if (u->data_relocs_root) + igc_root_destroy (&u->data_relocs_root); + if (u->data_imp_relocs_root) + igc_root_destroy (&u->data_imp_relocs_root); + if (u->data_eph_relocs_root) + igc_root_destroy (&u->data_eph_relocs_root); + igc_root_destroy (&u->comp_unit_root); } static mps_res_t @@ -3211,12 +3209,14 @@ igc_grow_ptr_vec (ptrdiff_t *n, ptrdiff_t n_incr_min, ptrdiff_t n_max) return new_vec; } +#ifdef HAVE_WINDOW_SYSTEM struct image_cache * igc_make_image_cache (void) { struct image_cache *c = alloc (sizeof *c, IGC_OBJ_IMAGE_CACHE); return c; } +#endif DEFUN ("igc-make-weak-ref", Figc_make_weak_ref, Sigc_make_weak_ref, 1, 1, 0, doc diff --git a/src/igc.h b/src/igc.h index 25b37a8ac41..fb37ce22e22 100644 --- a/src/igc.h +++ b/src/igc.h @@ -131,6 +131,9 @@ #define EMACS_IGC_H void igc_root_create_exact_ptr (void *var_addr); void igc_root_create_comp_unit (struct Lisp_Native_Comp_Unit *u); void igc_root_destroy_comp_unit (struct Lisp_Native_Comp_Unit *u); +struct igc_root; +struct igc_root *igc_root_create_n (Lisp_Object start[], size_t n); +void igc_root_destroy (struct igc_root **); struct Lisp_Weak_Ref; Lisp_Object igc_weak_ref_deref (struct Lisp_Weak_Ref *); -- 2.39.2 ^ permalink raw reply related [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-23 14:46 ` Helmut Eller @ 2024-05-23 16:40 ` Gerd Möllmann 2024-05-23 18:26 ` Helmut Eller 0 siblings, 1 reply; 72+ messages in thread From: Gerd Möllmann @ 2024-05-23 16:40 UTC (permalink / raw) To: Helmut Eller; +Cc: Andrea Corallo, Emacs Devel, Eli Zaretskii Helmut Eller <eller.helmut@gmail.com> writes: > On Thu, May 23 2024, Gerd Möllmann wrote: > >> I'll let it rest a bit now. > > When you have some time to spare, could you try the patch below? I > think that load_static_obj can trigger GC and we should be prepared for > that. Thanks, but it didn't help, I'm afraid. '../src/bootstrap-emacs' -batch --no-site-file --no-site-lisp --eval "(setq load-prefer-newer t byte-compile-warnings 'all)" --eval "(setq org--inhibit-version-check t)" \ -l comp -f byte-compile-refresh-preloaded \ -f batch-byte+native-compile ../lisp/files.el igc.c:345: Emacs fatal error: assertion failed: h->obj_type != IGC_OBJ_FWD I've pushed it anyway. ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-23 16:40 ` Gerd Möllmann @ 2024-05-23 18:26 ` Helmut Eller 2024-05-24 3:10 ` Gerd Möllmann 0 siblings, 1 reply; 72+ messages in thread From: Helmut Eller @ 2024-05-23 18:26 UTC (permalink / raw) To: Gerd Möllmann; +Cc: Andrea Corallo, Emacs Devel, Eli Zaretskii On Thu, May 23 2024, Gerd Möllmann wrote: > Thanks, but it didn't help, I'm afraid. There's some checking code in comp.c. Maybe you could add this void check_all_comp_units (void) { igc_collect (); struct Lisp_Hash_Table *h = XHASH_TABLE (Vcomp_loaded_comp_units_h); DOHASH (h, k, cu) { eassert (NATIVE_COMP_UNITP (cu)); fprintf (stderr, "key: %s\n", SSDATA (k)); check_comp_unit_relocs (XNATIVE_COMP_UNIT (cu)); } } and call it from the debugger. If this check passes then I think the relocs are in pretty good shape. ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-23 18:26 ` Helmut Eller @ 2024-05-24 3:10 ` Gerd Möllmann 2024-05-24 13:01 ` Gerd Möllmann 0 siblings, 1 reply; 72+ messages in thread From: Gerd Möllmann @ 2024-05-24 3:10 UTC (permalink / raw) To: Helmut Eller; +Cc: Andrea Corallo, Emacs Devel, Eli Zaretskii Helmut Eller <eller.helmut@gmail.com> writes: > On Thu, May 23 2024, Gerd Möllmann wrote: > >> Thanks, but it didn't help, I'm afraid. > > There's some checking code in comp.c. Maybe you could add this > > void > check_all_comp_units (void) > { > igc_collect (); > struct Lisp_Hash_Table *h = XHASH_TABLE (Vcomp_loaded_comp_units_h); > DOHASH (h, k, cu) > { > eassert (NATIVE_COMP_UNITP (cu)); > fprintf (stderr, "key: %s\n", SSDATA (k)); > check_comp_unit_relocs (XNATIVE_COMP_UNIT (cu)); > } > } > > and call it from the debugger. If this check passes then I think the > relocs are in pretty good shape. Thanks for helping me! Since I'm not sure if relocs are the culprit this time around, I'll also add the roots for a dylib very early, so that there is no "reasonable" doubt they exist. I'll use this fact: .../igc/native-lisp/30_0_50-5cce80dd % nm -g -s __DATA __common radix-tree-669a468d-316fbcdc.eln 00000000000050e0 S _comp_unit 00000000000050e8 S _current_thread_reloc 00000000000050f0 S _d_reloc 0000000000005220 S _d_reloc_eph 00000000000053d8 S _d_reloc_imp 0000000000005438 S _f_symbols_with_pos_enabled_reloc 0000000000005440 S _freloc_link_table 0000000000005448 S _pure_reloc IOW, I can compute the sizes of the reloc arrays from symbol addresses. (On macOS, the format is Mach-O, the segment/section names are probably different in ELF). Let's see. But I need to collect enough energy for doing this first ;-). ^ permalink raw reply [flat|nested] 72+ messages in thread
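The idea of deriving reloc array sizes from symbol addresses can be sketched as plain address arithmetic. The helper name slots_between is hypothetical, and the sketch assumes two things the thread does not guarantee: that the second symbol is laid out directly after the first, and that one slot (a Lisp_Object) is 8 bytes on the 64-bit targets discussed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of SLOT_SIZE-byte slots between two symbol addresses,
   assuming NEXT is the symbol placed directly after START.  The
   linker is free to order symbols differently, so this is a
   debugging aid at best.  */
size_t
slots_between (uintptr_t start, uintptr_t next, size_t slot_size)
{
  assert (slot_size > 0 && next >= start);
  assert ((next - start) % slot_size == 0);
  return (next - start) / slot_size;
}
```

Applied to the nm output above with 8-byte slots, this gives 38 slots for _d_reloc (0x50f0 to 0x5220), 55 for _d_reloc_eph (0x5220 to 0x53d8) and 12 for _d_reloc_imp (0x53d8 to 0x5438).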
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-24 3:10 ` Gerd Möllmann @ 2024-05-24 13:01 ` Gerd Möllmann 2024-05-24 14:19 ` Helmut Eller 0 siblings, 1 reply; 72+ messages in thread From: Gerd Möllmann @ 2024-05-24 13:01 UTC (permalink / raw) To: Helmut Eller; +Cc: Andrea Corallo, Emacs Devel, Eli Zaretskii [-- Attachment #1: Type: text/plain, Size: 1891 bytes --] Gerd Möllmann <gerd.moellmann@gmail.com> writes: > Helmut Eller <eller.helmut@gmail.com> writes: > >> On Thu, May 23 2024, Gerd Möllmann wrote: >> >>> Thanks, but it didn't help, I'm afraid. >> >> There's some checking code in comp.c. Maybe you could add this >> >> void >> check_all_comp_units (void) >> { >> igc_collect (); >> struct Lisp_Hash_Table *h = XHASH_TABLE (Vcomp_loaded_comp_units_h); >> DOHASH (h, k, cu) >> { >> eassert (NATIVE_COMP_UNITP (cu)); >> fprintf (stderr, "key: %s\n", SSDATA (k)); >> check_comp_unit_relocs (XNATIVE_COMP_UNIT (cu)); >> } >> } >> >> and call it from the debugger. If this check passes then I think the >> relocs are in pretty good shape. > > Thanks for helping me! > > Since I'm not sure if relocs are the culprit this time around, I'll also > add the roots for a dylib very early, so that there is no "reasonable" > doubt they exist. I'll use this fact: > > .../igc/native-lisp/30_0_50-5cce80dd % nm -g -s __DATA __common radix-tree-669a468d-316fbcdc.eln > 00000000000050e0 S _comp_unit > 00000000000050e8 S _current_thread_reloc > 00000000000050f0 S _d_reloc > 0000000000005220 S _d_reloc_eph > 00000000000053d8 S _d_reloc_imp > 0000000000005438 S _f_symbols_with_pos_enabled_reloc > 0000000000005440 S _freloc_link_table > 0000000000005448 S _pure_reloc > > IOW, I can compute the sizes of the reloc arrays from symbol addresses. > (On macOS, the format is Mach-O, the segment/section names are probably > different in ELF). > > Let's see. But I need to collect enough energy for doing this first ;-). 
Given the choice of either ironing or trying what I described above, I
evaded ironing. Diff at the end. Effect none. Which lets me update my
beliefs so that relocs are unlikely to be the cause this time.

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: roots --]
[-- Type: text/x-patch, Size: 3890 bytes --]

Unstaged
modified   lisp/emacs-lisp/comp.el
@@ -68,7 +68,7 @@ native-comp-speed
   :safe #'integerp
   :version "28.1")
 
-(defcustom native-comp-debug 0
+(defcustom native-comp-debug 2
   "Debug level for native compilation, a number between 0 and 3.
 This is intended for debugging the compiler itself.
   0 no debug output.
modified   src/comp.c
@@ -5322,6 +5322,32 @@ load_comp_unit (struct Lisp_Native_Comp_Unit *comp_u, bool loading_dump,
   Lisp_Object comp_u_lisp_obj;
   XSETNATIVE_COMP_UNIT (comp_u_lisp_obj, comp_u);
 
+# ifdef HAVE_MPS
+  {
+    comp_u->comp_unit = dynlib_sym (handle, COMP_UNIT_SYM);
+    comp_u->data_relocs = dynlib_sym (handle, DATA_RELOC_SYM);
+    comp_u->data_imp_relocs = dynlib_sym (handle, DATA_RELOC_IMPURE_SYM);
+    comp_u->data_eph_relocs = dynlib_sym (handle, DATA_RELOC_EPHEMERAL_SYM);
+    Lisp_Object *f_symbols_with_pos_enabled_reloc
+      = dynlib_sym (handle, F_SYMBOLS_WITH_POS_ENABLED_RELOC_SYM);
+
+    comp_u->n_data_relocs = comp_u->data_eph_relocs - comp_u->data_relocs;
+    comp_u->n_data_eph_relocs = comp_u->data_imp_relocs - comp_u->data_eph_relocs;
+    comp_u->n_data_imp_relocs = f_symbols_with_pos_enabled_reloc - comp_u->data_imp_relocs;
+
+    comp_u->comp_unit_root = igc_root_create_n (comp_u->comp_unit, 1);
+    if (comp_u->n_data_relocs)
+      comp_u->data_relocs_root
+        = igc_root_create_n (comp_u->data_relocs, comp_u->n_data_relocs);
+    if (comp_u->n_data_eph_relocs)
+      comp_u->data_eph_relocs_root
+        = igc_root_create_n (comp_u->data_eph_relocs, comp_u->n_data_eph_relocs);
+    if (comp_u->n_data_imp_relocs)
+      comp_u->data_imp_relocs_root
+        = igc_root_create_n (comp_u->data_imp_relocs, comp_u->n_data_imp_relocs);
+  }
+#endif
+
+
   Lisp_Object *saved_cu = dynlib_sym (handle, COMP_UNIT_SYM);
   if (!saved_cu)
     xsignal1 (Qnative_lisp_file_inconsistent, comp_u->file);
@@ -5366,11 +5392,6 @@ load_comp_unit (struct Lisp_Native_Comp_Unit *comp_u, bool loading_dump,
   /* Always set data_imp_relocs pointer in the compilation unit (in can be
      used in 'dump_do_dump_relocation'). */
   comp_u->data_imp_relocs = dynlib_sym (handle, DATA_RELOC_IMPURE_SYM);
-#ifdef HAVE_MPS
-  comp_u->data_relocs = dynlib_sym (handle, DATA_RELOC_SYM);
-  comp_u->comp_unit = dynlib_sym (handle, COMP_UNIT_SYM);
-  comp_u->comp_unit_root = igc_root_create_n (saved_cu, 1);
-#endif
 
   if (!comp_u->loaded_once)
     {
@@ -5418,19 +5439,14 @@ load_comp_unit (struct Lisp_Native_Comp_Unit *comp_u, bool loading_dump,
 
       EMACS_INT d_vec_len = XFIXNUM (Flength (comp_u->data_vec));
 #ifdef HAVE_MPS
-      comp_u->n_data_relocs = d_vec_len;
-      if (d_vec_len > 0)
-        comp_u->data_relocs_root = igc_root_create_n (data_relocs, d_vec_len);
+      eassert (comp_u->n_data_relocs == d_vec_len);
 #endif
       for (EMACS_INT i = 0; i < d_vec_len; i++)
         data_relocs[i] = AREF (comp_u->data_vec, i);
 
       d_vec_len = XFIXNUM (Flength (comp_u->data_impure_vec));
 #ifdef HAVE_MPS
-      comp_u->n_data_imp_relocs = d_vec_len;
-      if (d_vec_len > 0)
-        comp_u->data_imp_relocs_root
-          = igc_root_create_n (data_imp_relocs, d_vec_len);
+      eassert (comp_u->n_data_imp_relocs == d_vec_len);
 #endif
       for (EMACS_INT i = 0; i < d_vec_len; i++)
         data_imp_relocs[i] = AREF (comp_u->data_impure_vec, i);
@@ -5461,11 +5477,7 @@ load_comp_unit (struct Lisp_Native_Comp_Unit *comp_u, bool loading_dump,
 # ifdef HAVE_MPS
       /* FIXME: If we want to get rid of these objects, stop tracing
          these references at some point. */
-      comp_u->data_eph_relocs = data_eph_relocs;
-      comp_u->n_data_eph_relocs = d_vec_len;
-      if (d_vec_len > 0)
-        comp_u->data_eph_relocs_root
-          = igc_root_create_n (data_eph_relocs, d_vec_len);
+      eassert (comp_u->n_data_eph_relocs == d_vec_len);
 # endif
       for (EMACS_INT i = 0; i < d_vec_len; i++)
         data_eph_relocs[i] = AREF (data_ephemeral_vec, i);

^ permalink raw reply [flat|nested] 72+ messages in thread
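The idea behind this patch (derive the element counts from where each reloc array begins, then later cross-check them against the length of the corresponding Lisp vector) might be sketched in isolation like this. Everything here is a stand-in chosen for illustration: `slot`, `struct reloc_counts`, `counts_from_layout` and `count_matches` are hypothetical names, not part of Emacs, and the sketch assumes the arrays are contiguous in the order data, ephemeral, impure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for Lisp_Object; the real type lives in Emacs's lisp.h.  */
typedef void *slot;

/* Hypothetical mirror of the three counts kept in the compilation unit.  */
struct reloc_counts
{
  ptrdiff_t n_data, n_eph, n_imp;
};

/* Derive the counts from where each array begins.  Subtracting two
   pointers of the same type yields a number of elements, not bytes.
   Assumes the order DATA, EPH, IMP, END with no gaps between them and,
   for strictly conforming C, that all four point into one object.  */
static struct reloc_counts
counts_from_layout (slot *data, slot *eph, slot *imp, slot *end)
{
  assert (data <= eph && eph <= imp && imp <= end);
  struct reloc_counts c = { eph - data, imp - eph, end - imp };
  return c;
}

/* The later cross-check against a vector length is a comparison (==);
   an assignment in its place would silently overwrite the count.  */
static bool
count_matches (ptrdiff_t from_layout, ptrdiff_t vec_len)
{
  return from_layout == vec_len;
}
```

In the real code the four addresses come from separate `dynlib_sym` lookups, so the subtraction only gives the right answer as long as the linker keeps the symbols adjacent; the later assertions are what would catch a layout where that does not hold.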
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-24 13:01 ` Gerd Möllmann @ 2024-05-24 14:19 ` Helmut Eller 2024-05-25 7:37 ` Gerd Möllmann 0 siblings, 1 reply; 72+ messages in thread From: Helmut Eller @ 2024-05-24 14:19 UTC (permalink / raw) To: Gerd Möllmann; +Cc: Andrea Corallo, Emacs Devel, Eli Zaretskii On Fri, May 24 2024, Gerd Möllmann wrote: > Given the choice of either ironing or trying what I described above, I > evaded ironing. Diff at the end. Effect none. Which lets me update my > beliefs so that relocs are unlikely to be the cause this time. I also think that the problem is more likely somewhere else. ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-24 14:19 ` Helmut Eller @ 2024-05-25 7:37 ` Gerd Möllmann 2024-05-25 14:38 ` Gerd Möllmann 2024-05-25 15:02 ` Andrea Corallo 0 siblings, 2 replies; 72+ messages in thread From: Gerd Möllmann @ 2024-05-25 7:37 UTC (permalink / raw) To: Helmut Eller; +Cc: Andrea Corallo, Emacs Devel, Eli Zaretskii

Helmut Eller <eller.helmut@gmail.com> writes:

> On Fri, May 24 2024, Gerd Möllmann wrote:
>
>> Given the choice of either ironing or trying what I described above, I
>> evaded ironing. Diff at the end. Effect none. Which lets me update my
>> beliefs so that relocs are unlikely to be the cause this time.
>
> I also think that the problem more likely somewhere else.

FWIW, some more factoids:

- A native build on macOS x86_64 was successful.

- The native build with default speed == 2 fails with h->obj_type !=
  IGC_OBJ_FWD. It's a long story, let's just say that I could track that
  to a cons cell that is pushed onto current-load-list in an eval in
  byte-compile-eval, gets collected, and reappears again in a call to
  length. Heaven knows what's going on. The C generated by the native
  compiler looks correct to me. The arm64 assembly I don't want to read
  :-).

- A native build on arm64 with speed == 0 gets much further than the
  default build with speed == 2, but then gets an EXC_BAD_ACCESS in some
  pretty printing code when compiling char-code.el. It doesn't show the
  IGC_OBJ_FWD problem.

- In bug#70796 I found on arm64 that a native build shows the bug, an elc
  build does not.

I'm beginning to seriously take a libgccjit 14.1 bug on arm64 into
account.

^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-25 7:37 ` Gerd Möllmann @ 2024-05-25 14:38 ` Gerd Möllmann 2024-05-25 15:02 ` Andrea Corallo 1 sibling, 0 replies; 72+ messages in thread From: Gerd Möllmann @ 2024-05-25 14:38 UTC (permalink / raw) To: Helmut Eller; +Cc: Andrea Corallo, Emacs Devel, Eli Zaretskii Gerd Möllmann <gerd.moellmann@gmail.com> writes: > Helmut Eller <eller.helmut@gmail.com> writes: > >> On Fri, May 24 2024, Gerd Möllmann wrote: >> >>> Given the choice of either ironing or trying what I described above, I >>> evaded ironing. Diff at the end. Effect none. Which lets me update my >>> beliefs so that relocs are unlikely to be the cause this time. >> >> I also think that the problem more likely somewhere else. > > FWIW, some more factoids: > > - A native build on macos x64 was successful. > > - The native build with default speed == 2 fails with h->obj_type != > IGC_OBJ_FWD. It's a long story, let's just say that I could track that > to a cons cell that is pushed onto current-load-list in an eval in > byte-compile-eval, gets collected, and reappears again in a call to > length. Heaven knows what's going on. The C generated by the native > compiler looks correct to me. The arm64 assembly I don't want to read > :-). > > - A native build on arm64 with speed == 0 gets much further than the > default build with speed == 2, but then gets a EXC_BAD_ACCESS in some > pretty printing code when compiling char-code.el. It doesn't show the > IGC_OBJ_FWD problem. > > - In bug#70796 I found on arm64 that a native build shows the bug, a elc > build does not. > > I'm beginning to seriously take a libgccjit 14.1 bug on arm64 into > account. With the latest fix I pushed, a native build with speed == 0 also succeeds on arm64. Hm. ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-25 7:37 ` Gerd Möllmann 2024-05-25 14:38 ` Gerd Möllmann @ 2024-05-25 15:02 ` Andrea Corallo 2024-05-25 15:07 ` Gerd Möllmann 2024-05-25 15:11 ` Eli Zaretskii 1 sibling, 2 replies; 72+ messages in thread From: Andrea Corallo @ 2024-05-25 15:02 UTC (permalink / raw) To: Gerd Möllmann; +Cc: Helmut Eller, Emacs Devel, Eli Zaretskii Gerd Möllmann <gerd.moellmann@gmail.com> writes: > I'm beginning to seriously take a libgccjit 14.1 bug on arm64 into > account. You can just make a build from master to exclude or confirm that. Andrea ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-25 15:02 ` Andrea Corallo @ 2024-05-25 15:07 ` Gerd Möllmann 2024-05-25 15:26 ` Andrea Corallo 2024-05-25 15:11 ` Eli Zaretskii 1 sibling, 1 reply; 72+ messages in thread From: Gerd Möllmann @ 2024-05-25 15:07 UTC (permalink / raw) To: Andrea Corallo; +Cc: Helmut Eller, Emacs Devel, Eli Zaretskii Andrea Corallo <acorallo@gnu.org> writes: > Gerd Möllmann <gerd.moellmann@gmail.com> writes: > >> I'm beginning to seriously take a libgccjit 14.1 bug on arm64 into >> account. > > You can just make a build from master to exclude or confirm that. Sorry, I don't understand what you mean. ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-25 15:07 ` Gerd Möllmann @ 2024-05-25 15:26 ` Andrea Corallo 2024-05-25 15:29 ` Gerd Möllmann 0 siblings, 1 reply; 72+ messages in thread From: Andrea Corallo @ 2024-05-25 15:26 UTC (permalink / raw) To: Gerd Möllmann; +Cc: Helmut Eller, Emacs Devel, Eli Zaretskii Gerd Möllmann <gerd.moellmann@gmail.com> writes: > Andrea Corallo <acorallo@gnu.org> writes: > >> Gerd Möllmann <gerd.moellmann@gmail.com> writes: >> >>> I'm beginning to seriously take a libgccjit 14.1 bug on arm64 into >>> account. >> >> You can just make a build from master to exclude or confirm that. > > Sorry, I don't understand what you mean. Sorry, let me be clearer: I doubt that a libgccjit 14 bug on arm would show up only on the igc branch and not on master. I'd be inclined to think that if libgccjit 14 on arm works on master, then it is okay for our use. Of course I might be wrong. Andrea ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-25 15:26 ` Andrea Corallo @ 2024-05-25 15:29 ` Gerd Möllmann 2024-05-25 20:04 ` Andrea Corallo 0 siblings, 1 reply; 72+ messages in thread From: Gerd Möllmann @ 2024-05-25 15:29 UTC (permalink / raw) To: Andrea Corallo; +Cc: Helmut Eller, Emacs Devel, Eli Zaretskii Andrea Corallo <acorallo@gnu.org> writes: > Gerd Möllmann <gerd.moellmann@gmail.com> writes: > >> Andrea Corallo <acorallo@gnu.org> writes: >> >>> Gerd Möllmann <gerd.moellmann@gmail.com> writes: >>> >>>> I'm beginning to seriously take a libgccjit 14.1 bug on arm64 into >>>> account. >>> >>> You can just make a build from master to exclude or confirm that. >> >> Sorry, I don't understand what you mean. > > Sorry let me be more clear, I doubt that a libgccjit 14 bug on arm would > show up only on the igc branch and not on master. I'd be prone to think > that if libgccjit 14 bug on arm works on master then is okay for our > use. > > Of course I might be wrong. Well, the bug#70796 I mentioned is on master, and the behaviour is mysterious. ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-25 15:29 ` Gerd Möllmann @ 2024-05-25 20:04 ` Andrea Corallo 0 siblings, 0 replies; 72+ messages in thread From: Andrea Corallo @ 2024-05-25 20:04 UTC (permalink / raw) To: Gerd Möllmann; +Cc: Helmut Eller, Emacs Devel, Eli Zaretskii Gerd Möllmann <gerd.moellmann@gmail.com> writes: > Andrea Corallo <acorallo@gnu.org> writes: > >> Gerd Möllmann <gerd.moellmann@gmail.com> writes: >> >>> Andrea Corallo <acorallo@gnu.org> writes: >>> >>>> Gerd Möllmann <gerd.moellmann@gmail.com> writes: >>>> >>>>> I'm beginning to seriously take a libgccjit 14.1 bug on arm64 into >>>>> account. >>>> >>>> You can just make a build from master to exclude or confirm that. >>> >>> Sorry, I don't understand what you mean. >> >> Sorry let me be more clear, I doubt that a libgccjit 14 bug on arm would >> show up only on the igc branch and not on master. I'd be prone to think >> that if libgccjit 14 bug on arm works on master then is okay for our >> use. >> >> Of course I might be wrong. > > Well, the bug#70796 I mentioned is on master, and the behaviour is > mysterious. That's possible as well. My experience doesn't suggest to me that it's a libgccjit bug, neither here nor in bug#70796, but as mentioned I might be entirely wrong :) Andrea ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-25 15:02 ` Andrea Corallo 2024-05-25 15:07 ` Gerd Möllmann @ 2024-05-25 15:11 ` Eli Zaretskii 2024-05-26 7:39 ` Gerd Möllmann 1 sibling, 1 reply; 72+ messages in thread From: Eli Zaretskii @ 2024-05-25 15:11 UTC (permalink / raw) To: Andrea Corallo; +Cc: gerd.moellmann, eller.helmut, emacs-devel > From: Andrea Corallo <acorallo@gnu.org> > Cc: Helmut Eller <eller.helmut@gmail.com>, Emacs Devel > <emacs-devel@gnu.org>, Eli Zaretskii <eliz@gnu.org> > Date: Sat, 25 May 2024 11:02:34 -0400 > > Gerd Möllmann <gerd.moellmann@gmail.com> writes: > > > I'm beginning to seriously take a libgccjit 14.1 bug on arm64 into > > account. > > You can just make a build from master to exclude or confirm that. Which changeset(s) specifically? AFAIK, the MPS branch is not being synchronized with master, and is actually quite behind it in many parts. ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-25 15:11 ` Eli Zaretskii @ 2024-05-26 7:39 ` Gerd Möllmann 0 siblings, 0 replies; 72+ messages in thread From: Gerd Möllmann @ 2024-05-26 7:39 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Andrea Corallo, eller.helmut, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: >> From: Andrea Corallo <acorallo@gnu.org> >> Cc: Helmut Eller <eller.helmut@gmail.com>, Emacs Devel >> <emacs-devel@gnu.org>, Eli Zaretskii <eliz@gnu.org> >> Date: Sat, 25 May 2024 11:02:34 -0400 >> >> Gerd Möllmann <gerd.moellmann@gmail.com> writes: >> >> > I'm beginning to seriously take a libgccjit 14.1 bug on arm64 into >> > account. >> >> You can just make a build from master to exclude or confirm that. > > Which changeset(s) specifically? AFAIK, the MPS branch is not being > synchronized with master, and is actually quite behind it in many > parts. FWIW, I've now merged master and fixed the errors in the non-native builds with and without MPS. ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-23 9:14 ` Gerd Möllmann 2024-05-23 9:39 ` Helmut Eller @ 2024-05-23 12:09 ` Andrea Corallo 1 sibling, 0 replies; 72+ messages in thread From: Andrea Corallo @ 2024-05-23 12:09 UTC (permalink / raw) To: Gerd Möllmann; +Cc: Helmut Eller, Emacs Devel, Eli Zaretskii Gerd Möllmann <gerd.moellmann@gmail.com> writes: > Helmut Eller <eller.helmut@gmail.com> writes: > >> On Thu, May 23 2024, Andrea Corallo wrote: >> >>> 1b16c64669d just bootstrapped fine for me here >> >> For me too. > > Not for me :-(. Sorry :( But it is still a good step forward, because here the AOT build seems to work reliably. Andrea ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-23 6:12 ` Gerd Möllmann ` (2 preceding siblings ...) 2024-05-23 7:50 ` Andrea Corallo @ 2024-05-23 8:30 ` Ihor Radchenko 3 siblings, 0 replies; 72+ messages in thread From: Ihor Radchenko @ 2024-05-23 8:30 UTC (permalink / raw) To: Gerd Möllmann Cc: Helmut Eller, Andrea Corallo, Emacs Devel, Eli Zaretskii Gerd Möllmann <gerd.moellmann@gmail.com> writes: > Could anyone (of the currently n = 3 people reporting back > (not disappointing because expected) try this? For the record, at least I am also watching the progress and can help if necessary. But n = 3 appears to be sufficient to keep the work going, so I do not interfere :) -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-21 14:00 MPS: Please check if scratch/igc builds with native compilation Gerd Möllmann 2024-05-21 16:57 ` Andrea Corallo @ 2024-05-22 8:18 ` Helmut Eller 2024-05-22 8:39 ` Gerd Möllmann 2024-06-03 5:35 ` Gerd Möllmann 2 siblings, 1 reply; 72+ messages in thread From: Helmut Eller @ 2024-05-22 8:18 UTC (permalink / raw) To: Gerd Möllmann; +Cc: Emacs Devel, Eli Zaretskii

> I'm throwing the towel now wrt to native compilation + MPS on macOS. It
> fails here both on arm64 and x86_64 on macOS 14. It's a long story what
> all I tried to debug this, let's me just say I suspect, with the highest
> probability among all the possibilited, a bug in MPS, without me being
> able to point to it. Gut feeling. Anyway - it was an experiment.

I have a small change to MPS because gcc complained about an uninitialized
variable. But I doubt it has much effect:

diff --git a/code/arenavm.c b/code/arenavm.c
index 50708e956..1a789b959 100644
--- a/code/arenavm.c
+++ b/code/arenavm.c
@@ -600,7 +600,7 @@ static Res VMArenaCreate(Arena *arenaReturn, ArgList args)
   VM vm = &vmStruct;
   Chunk chunk;
   mps_arg_s arg;
-  char vmParams[VMParamSize];
+  char vmParams[VMParamSize] = { 0 };
 
   AVER(arenaReturn != NULL);
   AVERT(ArgList, args);

^ permalink raw reply related [flat|nested] 72+ messages in thread
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-22 8:18 ` Helmut Eller @ 2024-05-22 8:39 ` Gerd Möllmann 0 siblings, 0 replies; 72+ messages in thread From: Gerd Möllmann @ 2024-05-22 8:39 UTC (permalink / raw) To: Helmut Eller; +Cc: Emacs Devel, Eli Zaretskii Helmut Eller <eller.helmut@gmail.com> writes: >> I'm throwing the towel now wrt to native compilation + MPS on macOS. It >> fails here both on arm64 and x86_64 on macOS 14. It's a long story what >> all I tried to debug this, let's me just say I suspect, with the highest >> probability among all the possibilited, a bug in MPS, without me being >> able to point to it. Gut feeling. Anyway - it was an experiment. > > I have small change to MPS because gcc complained about an uninitialized > variable. But I doubt it has much effect: > > diff --git a/code/arenavm.c b/code/arenavm.c > index 50708e956..1a789b959 100644 > --- a/code/arenavm.c > +++ b/code/arenavm.c > @@ -600,7 +600,7 @@ static Res VMArenaCreate(Arena *arenaReturn, ArgList args) > VM vm = &vmStruct; > Chunk chunk; > mps_arg_s arg; > - char vmParams[VMParamSize]; > + char vmParams[VMParamSize] = { 0 }; > > AVER(arenaReturn != NULL); > AVERT(ArgList, args); Right, made no difference for me either. ^ permalink raw reply [flat|nested] 72+ messages in thread
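For readers unfamiliar with the `= { 0 }` idiom used in the MPS change discussed above, here is a small self-contained C sketch of what it does. The names `PARAM_SIZE`, `all_zero` and `zero_initialized_buffer_is_clean` are made up for this illustration and have nothing to do with MPS's actual `VMParamSize` or its arena code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical buffer size; unrelated to MPS's real VMParamSize.  */
enum { PARAM_SIZE = 32 };

/* Return true if all N bytes of BUF are zero.  */
static bool
all_zero (const char *buf, size_t n)
{
  assert (buf != NULL);
  for (size_t i = 0; i < n; i++)
    if (buf[i] != 0)
      return false;
  return true;
}

/* With `= { 0 }' the first element is set explicitly and the remaining
   elements are zero-initialized implicitly, so the whole array starts
   out as zeros.  Without the initializer, the contents of an automatic
   array are indeterminate, which is what the compiler warning is about.  */
static bool
zero_initialized_buffer_is_clean (void)
{
  char params[PARAM_SIZE] = { 0 };
  return all_zero (params, sizeof params);
}
```

Since the initializer only affects the starting contents of the buffer, it is plausible that it changes nothing when the code fills the buffer before reading it, which would be consistent with both reports above that it made no difference.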
* Re: MPS: Please check if scratch/igc builds with native compilation 2024-05-21 14:00 MPS: Please check if scratch/igc builds with native compilation Gerd Möllmann 2024-05-21 16:57 ` Andrea Corallo 2024-05-22 8:18 ` Helmut Eller @ 2024-06-03 5:35 ` Gerd Möllmann 2 siblings, 0 replies; 72+ messages in thread From: Gerd Möllmann @ 2024-06-03 5:35 UTC (permalink / raw) To: Emacs Devel; +Cc: Eli Zaretskii, Helmut Eller Gerd Möllmann <gerd.moellmann@gmail.com> writes: > I'm throwing the towel now wrt to native compilation + MPS on macOS. It > fails here both on arm64 and x86_64 on macOS 14. It's a long story what > all I tried to debug this, let's me just say I suspect, with the highest > probability among all the possibilited, a bug in MPS, without me being > able to point to it. Gut feeling. Anyway - it was an experiment. > > What I'd like to ask anyone who can is to try building scratch/igc with > native compilation (default) and --enable-checking=all. Please tell your > OS, and if you get assertion failures. Maybe do 2 or more builds. > > This could help to assess if scratch/igc is viable. > > I currently think it isn't on macOS, to be honest. An update on what's going on on my side. I had and still have problems building scratch/igc with native compilation. It builds, for example, with -lmps but not with -lmps-debug, and whether it builds depends on the setting of native-comp-speed and so on. I now consider the successful builds just a fluke. If I take such a build and use it interactively, it will crash in more or less the same way the unsuccessful builds crash. Even worse is that a build without native compilation also eventually fails when used interactively. It just takes a lot more time. At the moment, the crashes look pretty random and I haven't yet succeeded in making them reproducible and debuggable. ^ permalink raw reply [flat|nested] 72+ messages in thread