* Merging scratch/no-purespace to remove unexec and purespace @ 2024-12-17 10:47 Stefan Kangas 2024-12-17 13:12 ` Gerd Möllmann ` (2 more replies) 0 siblings, 3 replies; 65+ messages in thread From: Stefan Kangas @ 2024-12-17 10:47 UTC (permalink / raw) To: emacs-devel; +Cc: Pip Cet, Stefan Monnier In August, we decided to remove the unexec build along with purespace. The scratch/no-purespace branch removes both, and has been rebased on top of a recent master. We, the maintainers, believe that the scratch/no-purespace branch is now ready to merge, and would appreciate any final feedback, testing, and code reviews. Specifically, the branch has been primarily tested on GNU/Linux and macOS, so testing on other systems would be greatly appreciated. Unless we uncover any serious blocking issues, or significant or unexpected objections from the community, we plan to merge the branch on February 1, 2025. During our last discussion, we identified some issues with using pdumper on some legacy proprietary systems: MS-DOS, Windows 98, and Solaris 10 Zone. As we have explained previously, patches are welcome for these issues, but we do not currently consider them as blockers for this merge. Thanks! Stefan Kangas, on behalf of the Emacs maintainers ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-17 10:47 Merging scratch/no-purespace to remove unexec and purespace Stefan Kangas @ 2024-12-17 13:12 ` Gerd Möllmann 2024-12-17 14:20 ` Gerd Möllmann 2024-12-17 19:30 ` Helmut Eller 2024-12-18 0:50 ` Po Lu 2 siblings, 1 reply; 65+ messages in thread From: Gerd Möllmann @ 2024-12-17 13:12 UTC (permalink / raw) To: Stefan Kangas; +Cc: emacs-devel, Pip Cet, Stefan Monnier Stefan Kangas <stefankangas@gmail.com> writes: > In August, we decided to remove the unexec build along with purespace. > The scratch/no-purespace branch removes both, and has been rebased on > top of a recent master. > > We, the maintainers, believe that the scratch/no-purespace branch is > now ready to merge, and would appreciate any final feedback, testing, > and code reviews. Specifically, the branch has been primarily tested > on GNU/Linux and macOS, so testing on other systems would be greatly > appreciated. > > Unless we uncover any serious blocking issues, or significant or > unexpected objections from the community, we plan to merge the > branch on February 1, 2025. > > During our last discussion, we identified some issues with using > pdumper on some legacy proprietary systems: MS-DOS, Windows 98, and > Solaris 10 Zone. As we have explained previously, patches are welcome > for these issues, but we do not currently consider them as blockers > for this merge. > > Thanks! > > Stefan Kangas, on behalf of the Emacs maintainers Building on macOS with --enable-checking=all --with-native-compilation gives assertion violations. It builds without the --enable-checking. gmake -C ../lisp compile-first EMACS="../src/bootstrap-emacs" ELC+ELN emacs-lisp/comp-common.elc ELC+ELN emacs-lisp/comp-run.elc ELC+ELN emacs-lisp/radix-tree.elc ELC+ELN emacs-lisp/loaddefs-gen.elc ELC+ELN emacs-lisp/macroexp.elc ELC+ELN emacs-lisp/cconv.elc ELC+ELN emacs-lisp/comp-cstr.elc ELC+ELN emacs-lisp/bytecomp.elc ELC+ELN emacs-lisp/byte-opt.elc ELC+ELN emacs-lisp/comp.elc gmake -C ../lisp autoloads EMACS="../src/bootstrap-emacs" ELC+ELN ../lisp/cus-face.elc comp.c:5322: Emacs fatal error: assertion failed: check_comp_unit_relocs (comp_u) gmake[3]: *** [Makefile:285: ../lisp/cus-face.elc] Abort trap: 6 ELC+ELN international/titdic-cnv.elc comp.c:5322: Emacs fatal error: assertion failed: check_comp_unit_relocs (comp_u) Fatal error 6: Aborted gmake[3]: *** [Makefile:330: international/titdic-cnv.elc] Abort trap: 6 gmake[2]: *** [Makefile:901: ../lisp/cus-face.elc] Error 2 gmake[2]: *** Waiting for unfinished jobs.... gmake[2]: *** [Makefile:961: ../lisp/loaddefs.el] Error 2 ELC+ELN ../lisp/abbrev.elc comp.c:5322: Emacs fatal error: assertion failed: check_comp_unit_relocs (comp_u) gmake[3]: *** [Makefile:285: ../lisp/abbrev.elc] Abort trap: 6 ELC+ELN ../lisp/button.elc ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-17 13:12 ` Gerd Möllmann @ 2024-12-17 14:20 ` Gerd Möllmann 2024-12-17 14:30 ` Gerd Möllmann 0 siblings, 1 reply; 65+ messages in thread From: Gerd Möllmann @ 2024-12-17 14:20 UTC (permalink / raw) To: Stefan Kangas; +Cc: emacs-devel, Pip Cet, Stefan Monnier Gerd Möllmann <gerd.moellmann@gmail.com> writes: > Stefan Kangas <stefankangas@gmail.com> writes: > >> In August, we decided to remove the unexec build along with purespace. >> The scratch/no-purespace branch removes both, and has been rebased on >> top of a recent master. >> >> We, the maintainers, believe that the scratch/no-purespace branch is >> now ready to merge, and would appreciate any final feedback, testing, >> and code reviews. Specifically, the branch has been primarily tested >> on GNU/Linux and macOS, so testing on other systems would be greatly >> appreciated. >> >> Unless we uncover any serious blocking issues, or significant or >> unexpected objections from the community, we plan to merge the >> branch on February 1, 2025. >> >> During our last discussion, we identified some issues with using >> pdumper on some legacy proprietary systems: MS-DOS, Windows 98, and >> Solaris 10 Zone. As we have explained previously, patches are welcome >> for these issues, but we do not currently consider them as blockers >> for this merge. >> >> Thanks! >> >> Stefan Kangas, on behalf of the Emacs maintainers > > Building on macOS with --enable-checking=all --with-native-compilation > gives assertion violations. It builds without the --enable-checking. > > gmake -C ../lisp compile-first EMACS="../src/bootstrap-emacs" > ELC+ELN emacs-lisp/comp-common.elc > ELC+ELN emacs-lisp/comp-run.elc > ELC+ELN emacs-lisp/radix-tree.elc > ELC+ELN emacs-lisp/loaddefs-gen.elc > ELC+ELN emacs-lisp/macroexp.elc > ELC+ELN emacs-lisp/cconv.elc > ELC+ELN emacs-lisp/comp-cstr.elc > ELC+ELN emacs-lisp/bytecomp.elc > ELC+ELN emacs-lisp/byte-opt.elc > ELC+ELN emacs-lisp/comp.elc > gmake -C ../lisp autoloads EMACS="../src/bootstrap-emacs" > ELC+ELN ../lisp/cus-face.elc > > comp.c:5322: Emacs fatal error: assertion failed: check_comp_unit_relocs (comp_u) > gmake[3]: *** [Makefile:285: ../lisp/cus-face.elc] Abort trap: 6 > ELC+ELN international/titdic-cnv.elc > > comp.c:5322: Emacs fatal error: assertion failed: check_comp_unit_relocs (comp_u) > Fatal error 6: Aborted > gmake[3]: *** [Makefile:330: international/titdic-cnv.elc] Abort trap: 6 > gmake[2]: *** [Makefile:901: ../lisp/cus-face.elc] Error 2 > gmake[2]: *** Waiting for unfinished jobs.... > gmake[2]: *** [Makefile:961: ../lisp/loaddefs.el] Error 2 > ELC+ELN ../lisp/abbrev.elc > > comp.c:5322: Emacs fatal error: assertion failed: check_comp_unit_relocs (comp_u) > gmake[3]: *** [Makefile:285: ../lisp/abbrev.elc] Abort trap: 6 > ELC+ELN ../lisp/button.elc I think check_comp_unit_relocs should be removed in the branch. What's left of it the branch, checks in master if everything has been put in purespace which should be there. IIUC correctly, of course. ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-17 14:20 ` Gerd Möllmann @ 2024-12-17 14:30 ` Gerd Möllmann 2024-12-17 17:56 ` Gerd Möllmann 0 siblings, 1 reply; 65+ messages in thread From: Gerd Möllmann @ 2024-12-17 14:30 UTC (permalink / raw) To: Stefan Kangas; +Cc: emacs-devel, Pip Cet, Stefan Monnier [-- Attachment #1: Type: text/plain, Size: 284 bytes --] Gerd Möllmann <gerd.moellmann@gmail.com> writes: > I think check_comp_unit_relocs should be removed in the branch. What's > left of it the branch, checks in master if everything has been > put in purespace which should be there. IIUC correctly, of course. Patch attached. [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: 0001-Remove-check_comp_unit_relocs.patch --] [-- Type: text/x-patch, Size: 1722 bytes --] From 66e73f65966e40af67338cfdf18f89549d687935 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Gerd=20M=C3=B6llmann?= <gerd@gnu.org> Date: Tue, 17 Dec 2024 15:28:14 +0100 Subject: [PATCH] Remove check_comp_unit_relocs * src/comp.c (check_comp_unit_relocs): Removed. (load_comp_unit): Remove use. --- src/comp.c | 27 --------------------------- 1 file changed, 27 deletions(-) diff --git a/src/comp.c b/src/comp.c index f91f7c57f45..2c7f2b704f6 100644 --- a/src/comp.c +++ b/src/comp.c @@ -5164,32 +5164,6 @@ load_static_obj (struct Lisp_Native_Comp_Unit *comp_u, const char *name) } -/* Return false when something is wrong or true otherwise. */ - -static bool -check_comp_unit_relocs (struct Lisp_Native_Comp_Unit *comp_u) -{ - dynlib_handle_ptr handle = comp_u->handle; - Lisp_Object *data_relocs = dynlib_sym (handle, DATA_RELOC_SYM); - - EMACS_INT d_vec_len = XFIXNUM (Flength (comp_u->data_vec)); - - for (ptrdiff_t i = 0; i < d_vec_len; i++) - { - Lisp_Object x = data_relocs[i]; - if (EQ (x, Qlambda_fixup)) - return false; - else if (NATIVE_COMP_FUNCTIONP (x)) - { - if (NILP (Fgethash (x, comp_u->lambda_gc_guard_h, Qnil))) - return false; - } - else if (!EQ (x, AREF (comp_u->data_vec, i))) - return false; - } - return true; -} - static void unset_cu_load_ongoing (Lisp_Object comp_u) { @@ -5319,7 +5293,6 @@ load_comp_unit (struct Lisp_Native_Comp_Unit *comp_u, bool loading_dump, /* Make sure data_ephemeral_vec still exists after top_level_run has run. Guard against sibling call optimization (or any other). */ data_ephemeral_vec = data_ephemeral_vec; - eassert (check_comp_unit_relocs (comp_u)); } if (!recursive_load) -- 2.47.1 ^ permalink raw reply related [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-17 14:30 ` Gerd Möllmann @ 2024-12-17 17:56 ` Gerd Möllmann 2024-12-17 18:50 ` Eli Zaretskii 2024-12-18 0:18 ` Stefan Kangas 0 siblings, 2 replies; 65+ messages in thread From: Gerd Möllmann @ 2024-12-17 17:56 UTC (permalink / raw) To: Stefan Kangas; +Cc: emacs-devel, Pip Cet, Stefan Monnier Gerd Möllmann <gerd.moellmann@gmail.com> writes: > Gerd Möllmann <gerd.moellmann@gmail.com> writes: > >> I think check_comp_unit_relocs should be removed in the branch. What's >> left of it the branch, checks in master if everything has been >> put in purespace which should be there. IIUC correctly, of course. > > Patch attached. Pushed. Complaints to me please. ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-17 17:56 ` Gerd Möllmann @ 2024-12-17 18:50 ` Eli Zaretskii 2024-12-17 18:56 ` Gerd Möllmann 2024-12-18 0:18 ` Stefan Kangas 1 sibling, 1 reply; 65+ messages in thread From: Eli Zaretskii @ 2024-12-17 18:50 UTC (permalink / raw) To: Gerd Möllmann, Andrea Corallo Cc: stefankangas, emacs-devel, pipcet, monnier > From: Gerd Möllmann <gerd.moellmann@gmail.com> > Cc: emacs-devel@gnu.org, Pip Cet <pipcet@protonmail.com>, Stefan Monnier > <monnier@iro.umontreal.ca> > Date: Tue, 17 Dec 2024 18:56:05 +0100 > > Gerd Möllmann <gerd.moellmann@gmail.com> writes: > > > Gerd Möllmann <gerd.moellmann@gmail.com> writes: > > > >> I think check_comp_unit_relocs should be removed in the branch. What's > >> left of it the branch, checks in master if everything has been > >> put in purespace which should be there. IIUC correctly, of course. > > > > Patch attached. > > Pushed. Complaints to me please. I'd like at least Andrea to take a look and confirm. ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-17 18:50 ` Eli Zaretskii @ 2024-12-17 18:56 ` Gerd Möllmann 2024-12-18 12:55 ` Andrea Corallo 0 siblings, 1 reply; 65+ messages in thread From: Gerd Möllmann @ 2024-12-17 18:56 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Andrea Corallo, stefankangas, emacs-devel, pipcet, monnier Eli Zaretskii <eliz@gnu.org> writes: >> From: Gerd Möllmann <gerd.moellmann@gmail.com> >> Cc: emacs-devel@gnu.org, Pip Cet <pipcet@protonmail.com>, Stefan Monnier >> <monnier@iro.umontreal.ca> >> Date: Tue, 17 Dec 2024 18:56:05 +0100 >> >> Gerd Möllmann <gerd.moellmann@gmail.com> writes: >> >> > Gerd Möllmann <gerd.moellmann@gmail.com> writes: >> > >> >> I think check_comp_unit_relocs should be removed in the branch. What's >> >> left of it the branch, checks in master if everything has been >> >> put in purespace which should be there. IIUC correctly, of course. >> > >> > Patch attached. >> >> Pushed. Complaints to me please. > > I'd like at least Andrea to take a look and confirm. It's 81fc23b5d6a60ca4f3d269ab2c88eb9a850bac4c ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-17 18:56 ` Gerd Möllmann @ 2024-12-18 12:55 ` Andrea Corallo 2024-12-18 14:03 ` Gerd Möllmann 0 siblings, 1 reply; 65+ messages in thread From: Andrea Corallo @ 2024-12-18 12:55 UTC (permalink / raw) To: Gerd Möllmann Cc: Eli Zaretskii, stefankangas, emacs-devel, pipcet, monnier Gerd Möllmann <gerd.moellmann@gmail.com> writes: > Eli Zaretskii <eliz@gnu.org> writes: > >>> From: Gerd Möllmann <gerd.moellmann@gmail.com> >>> Cc: emacs-devel@gnu.org, Pip Cet <pipcet@protonmail.com>, Stefan Monnier >>> <monnier@iro.umontreal.ca> >>> Date: Tue, 17 Dec 2024 18:56:05 +0100 >>> >>> Gerd Möllmann <gerd.moellmann@gmail.com> writes: >>> >>> > Gerd Möllmann <gerd.moellmann@gmail.com> writes: >>> > >>> >> I think check_comp_unit_relocs should be removed in the branch. What's >>> >> left of it the branch, checks in master if everything has been >>> >> put in purespace which should be there. IIUC correctly, of course. >>> > >>> > Patch attached. >>> >>> Pushed. Complaints to me please. >> >> I'd like at least Andrea to take a look and confirm. > > It's 81fc23b5d6a60ca4f3d269ab2c88eb9a850bac4c Hi Gerd, looking at the commit now, why do you think 'check_comp_unit_relocs' should be removed? Even if now the situation is simpler 'check_comp_unit_relocs' is still performing some sanity checks like if lambdas are all been fixed-up and present in 'comp_u->lambda_gc_guard_h'. Andrea ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-18 12:55 ` Andrea Corallo @ 2024-12-18 14:03 ` Gerd Möllmann 2024-12-18 16:05 ` Pip Cet via Emacs development discussions. 2024-12-18 16:25 ` Pip Cet via Emacs development discussions. 0 siblings, 2 replies; 65+ messages in thread From: Gerd Möllmann @ 2024-12-18 14:03 UTC (permalink / raw) To: Andrea Corallo; +Cc: Eli Zaretskii, stefankangas, emacs-devel, pipcet, monnier Andrea Corallo <acorallo@gnu.org> writes: > Gerd Möllmann <gerd.moellmann@gmail.com> writes: > >> Eli Zaretskii <eliz@gnu.org> writes: >> >>>> From: Gerd Möllmann <gerd.moellmann@gmail.com> >>>> Cc: emacs-devel@gnu.org, Pip Cet <pipcet@protonmail.com>, Stefan Monnier >>>> <monnier@iro.umontreal.ca> >>>> Date: Tue, 17 Dec 2024 18:56:05 +0100 >>>> >>>> Gerd Möllmann <gerd.moellmann@gmail.com> writes: >>>> >>>> > Gerd Möllmann <gerd.moellmann@gmail.com> writes: >>>> > >>>> >> I think check_comp_unit_relocs should be removed in the branch. What's >>>> >> left of it the branch, checks in master if everything has been >>>> >> put in purespace which should be there. IIUC correctly, of course. >>>> > >>>> > Patch attached. >>>> >>>> Pushed. Complaints to me please. >>> >>> I'd like at least Andrea to take a look and confirm. >> >> It's 81fc23b5d6a60ca4f3d269ab2c88eb9a850bac4c > > Hi Gerd, > > looking at the commit now, why do you think 'check_comp_unit_relocs' > should be removed? > > Even if now the situation is simpler 'check_comp_unit_relocs' is still > performing some sanity checks like if lambdas are all been fixed-up and > present in 'comp_u->lambda_gc_guard_h'. > > Andrea Hi Andrea. The check you mention checks something that I don't see how it could happen. (With the usual disclaimers, because it's been some time since I was in that code for igc.) When we comp--register-lambda the code putting the result of make_subr into the hash table and putting it in data_relocs (in the branch) is just a handful of lines apart. I thereforeo thought we could do without. ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-18 14:03 ` Gerd Möllmann @ 2024-12-18 16:05 ` Pip Cet via Emacs development discussions. 2024-12-18 16:30 ` Gerd Möllmann 2024-12-18 16:25 ` Pip Cet via Emacs development discussions. 1 sibling, 1 reply; 65+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-18 16:05 UTC (permalink / raw) To: Gerd Möllmann Cc: Andrea Corallo, Eli Zaretskii, stefankangas, emacs-devel, monnier Gerd Möllmann <gerd.moellmann@gmail.com> writes: > Andrea Corallo <acorallo@gnu.org> writes: > >> Gerd Möllmann <gerd.moellmann@gmail.com> writes: >> >>> Eli Zaretskii <eliz@gnu.org> writes: >>> >>>>> From: Gerd Möllmann <gerd.moellmann@gmail.com> >>>>> Cc: emacs-devel@gnu.org, Pip Cet <pipcet@protonmail.com>, Stefan Monnier >>>>> <monnier@iro.umontreal.ca> >>>>> Date: Tue, 17 Dec 2024 18:56:05 +0100 >>>>> >>>>> Gerd Möllmann <gerd.moellmann@gmail.com> writes: >>>>> >>>>> > Gerd Möllmann <gerd.moellmann@gmail.com> writes: >>>>> > >>>>> >> I think check_comp_unit_relocs should be removed in the branch. What's >>>>> >> left of it the branch, checks in master if everything has been >>>>> >> put in purespace which should be there. IIUC correctly, of course. >>>>> > >>>>> > Patch attached. >>>>> >>>>> Pushed. Complaints to me please. >>>> >>>> I'd like at least Andrea to take a look and confirm. >>> >>> It's 81fc23b5d6a60ca4f3d269ab2c88eb9a850bac4c >> >> Hi Gerd, >> >> looking at the commit now, why do you think 'check_comp_unit_relocs' >> should be removed? >> >> Even if now the situation is simpler 'check_comp_unit_relocs' is still >> performing some sanity checks like if lambdas are all been fixed-up and >> present in 'comp_u->lambda_gc_guard_h'. >> >> Andrea > > Hi Andrea. > > The check you mention checks something that I don't see how it could > happen. (With the usual disclaimers, because it's been some time since I > was in that code for igc.) I'm confused, I thought you did hit the assertion? AFAICT, the problem is simply that comp.el uses the symbol lambda-fixup. So a relocation for that symbol is emitted. But no fixup is, because this isn't an actual lambda, it's merely the symbol. The debug code then sees "lambda-fixup", assumes it's a failed fixup, and asserts. IOW, the old code happened not to run into this problem because lambda-fixup was pure, and we never applied the sanity checks to the pure section. This "fix" appears to work: diff --git a/lisp/emacs-lisp/comp.el b/lisp/emacs-lisp/comp.el index dbd14b2740d..5d364e77e66 100644 --- a/lisp/emacs-lisp/comp.el +++ b/lisp/emacs-lisp/comp.el @@ -3254,7 +3254,7 @@ comp--finalize-container ;; from the corresponding m-var. collect (if (gethash obj (comp-ctxt-byte-func-to-func-h comp-ctxt)) - 'lambda-fixup + (intern (concat "lambda" (make-string 1 ?-) "fixup")) obj)))) (defun comp--finalize-relocs () My suggestion is to fix the "sanity check" on the master branch, change it to apply to pure relocs there, and restore the fixed check on scratch/no-purespace afterwards. Pip ^ permalink raw reply related [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-18 16:05 ` Pip Cet via Emacs development discussions. @ 2024-12-18 16:30 ` Gerd Möllmann 0 siblings, 0 replies; 65+ messages in thread From: Gerd Möllmann @ 2024-12-18 16:30 UTC (permalink / raw) To: Pip Cet; +Cc: Andrea Corallo, Eli Zaretskii, stefankangas, emacs-devel, monnier Pip Cet <pipcet@protonmail.com> writes: > Gerd Möllmann <gerd.moellmann@gmail.com> writes: > >> Andrea Corallo <acorallo@gnu.org> writes: >> >>> Gerd Möllmann <gerd.moellmann@gmail.com> writes: >>> >>>> Eli Zaretskii <eliz@gnu.org> writes: >>>> >>>>>> From: Gerd Möllmann <gerd.moellmann@gmail.com> >>>>>> Cc: emacs-devel@gnu.org, Pip Cet <pipcet@protonmail.com>, Stefan Monnier >>>>>> <monnier@iro.umontreal.ca> >>>>>> Date: Tue, 17 Dec 2024 18:56:05 +0100 >>>>>> >>>>>> Gerd Möllmann <gerd.moellmann@gmail.com> writes: >>>>>> >>>>>> > Gerd Möllmann <gerd.moellmann@gmail.com> writes: >>>>>> > >>>>>> >> I think check_comp_unit_relocs should be removed in the branch. What's >>>>>> >> left of it the branch, checks in master if everything has been >>>>>> >> put in purespace which should be there. IIUC correctly, of course. >>>>>> > >>>>>> > Patch attached. >>>>>> >>>>>> Pushed. Complaints to me please. >>>>> >>>>> I'd like at least Andrea to take a look and confirm. >>>> >>>> It's 81fc23b5d6a60ca4f3d269ab2c88eb9a850bac4c >>> >>> Hi Gerd, >>> >>> looking at the commit now, why do you think 'check_comp_unit_relocs' >>> should be removed? >>> >>> Even if now the situation is simpler 'check_comp_unit_relocs' is still >>> performing some sanity checks like if lambdas are all been fixed-up and >>> present in 'comp_u->lambda_gc_guard_h'. >>> >>> Andrea >> >> Hi Andrea. >> >> The check you mention checks something that I don't see how it could >> happen. (With the usual disclaimers, because it's been some time since I >> was in that code for igc.) > > I'm confused, I thought you did hit the assertion? Correct, building with --enable-checking failed. What confuses? ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-18 14:03 ` Gerd Möllmann 2024-12-18 16:05 ` Pip Cet via Emacs development discussions. @ 2024-12-18 16:25 ` Pip Cet via Emacs development discussions. 2024-12-18 22:27 ` Andrea Corallo 1 sibling, 1 reply; 65+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-18 16:25 UTC (permalink / raw) To: Gerd Möllmann Cc: Andrea Corallo, Eli Zaretskii, stefankangas, emacs-devel, monnier Pip Cet <pipcet@protonmail.com> writes: > IOW, the old code happened not to run into this problem because > lambda-fixup was pure, and we never applied the sanity checks to the > pure section. Just to be clear: the code on master is fine. I misunderstood it when modifying it for purespace removal, resulting in my bug which Gerd discovered and fixed. The code on no-purespace is also fine now, but it's Andrea's call whether he wants some of the checking code restored, and how. > My suggestion is to fix the "sanity check" on the master branch, change > it to apply to pure relocs there, and restore the fixed check on > scratch/no-purespace afterwards. Please ignore that. My suggestion is to EXTEND the sanity check on the master branch to cover pure and impure relocs, and restore the EXTENDED check before merging scratch/no-purespace. There is no bug to fix on master. Pip ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-18 16:25 ` Pip Cet via Emacs development discussions. @ 2024-12-18 22:27 ` Andrea Corallo 2024-12-19 9:28 ` Pip Cet via Emacs development discussions. 0 siblings, 1 reply; 65+ messages in thread From: Andrea Corallo @ 2024-12-18 22:27 UTC (permalink / raw) To: Pip Cet Cc: Gerd Möllmann, Eli Zaretskii, stefankangas, emacs-devel, monnier Pip Cet <pipcet@protonmail.com> writes: > Pip Cet <pipcet@protonmail.com> writes: >> IOW, the old code happened not to run into this problem because >> lambda-fixup was pure, and we never applied the sanity checks to the >> pure section. > > Just to be clear: the code on master is fine. I misunderstood it when > modifying it for purespace removal, resulting in my bug which Gerd > discovered and fixed. The code on no-purespace is also fine now, but > it's Andrea's call whether he wants some of the checking code restored, > and how. > >> My suggestion is to fix the "sanity check" on the master branch, change >> it to apply to pure relocs there, and restore the fixed check on >> scratch/no-purespace afterwards. > > Please ignore that. My suggestion is to EXTEND the sanity check on the > master branch to cover pure and impure relocs, and restore the EXTENDED > check before merging scratch/no-purespace. > > There is no bug to fix on master. Right your analysis is correct, the new code in the branch just made the symbol 'fixup-lambda' not compilable. I restored the check and applied a variant of your fix with a comment around. scratch/no-purespace work for me now. On master I don't think I see what we should do and the motivation for. ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-18 22:27 ` Andrea Corallo @ 2024-12-19 9:28 ` Pip Cet via Emacs development discussions. 2024-12-19 10:38 ` Andrea Corallo 2024-12-19 10:50 ` Stefan Kangas 0 siblings, 2 replies; 65+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-19 9:28 UTC (permalink / raw) To: Andrea Corallo Cc: Gerd Möllmann, Eli Zaretskii, stefankangas, emacs-devel, monnier "Andrea Corallo" <acorallo@gnu.org> writes: > Pip Cet <pipcet@protonmail.com> writes: > >> Pip Cet <pipcet@protonmail.com> writes: >>> IOW, the old code happened not to run into this problem because >>> lambda-fixup was pure, and we never applied the sanity checks to the >>> pure section. >> >> Just to be clear: the code on master is fine. I misunderstood it when >> modifying it for purespace removal, resulting in my bug which Gerd >> discovered and fixed. The code on no-purespace is also fine now, but >> it's Andrea's call whether he wants some of the checking code restored, >> and how. >> >>> My suggestion is to fix the "sanity check" on the master branch, change >>> it to apply to pure relocs there, and restore the fixed check on >>> scratch/no-purespace afterwards. >> >> Please ignore that. My suggestion is to EXTEND the sanity check on the >> master branch to cover pure and impure relocs, and restore the EXTENDED >> check before merging scratch/no-purespace. >> >> There is no bug to fix on master. > > Right your analysis is correct, the new code in the branch just made the > symbol 'fixup-lambda' not compilable. > > I restored the check and applied a variant of your fix with a comment > around. scratch/no-purespace work for me now. Just to summarize this: There's now a forbidden symbol, --lambda-fixup. If you use this symbol in your code and compile the code with nativecomp, that may appear to work, but loading the resulting object file into another Emacs will crash that Emacs, if that Emacs was built with checks enabled. > On master I don't think I see what we should do and the motivation for. The scratch/no-purespace branch now tests things more rigorously than the master branch does: master performs three checks on all impure relocations and a single check on pure ones, but scratch/no-purespace performs all three checks on all relocations. That means when we merge scratch/no-purespace, and hit one of the new assertions, it may be (and was, in Gerd's case) because the test would have failed on the master branch but was never performed there, and that would be unrelated to purespace removal. What I think we should do doesn't really matter, but it seems quite obvious to me that we should make the code on the master branch perform all three checks on all relocations, as the code on no-purespace does. Pip ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-19 9:28 ` Pip Cet via Emacs development discussions. @ 2024-12-19 10:38 ` Andrea Corallo 2024-12-19 10:50 ` Stefan Kangas 1 sibling, 0 replies; 65+ messages in thread From: Andrea Corallo @ 2024-12-19 10:38 UTC (permalink / raw) To: Pip Cet Cc: Gerd Möllmann, Eli Zaretskii, stefankangas, emacs-devel, monnier Pip Cet <pipcet@protonmail.com> writes: > "Andrea Corallo" <acorallo@gnu.org> writes: > >> Pip Cet <pipcet@protonmail.com> writes: >> >>> Pip Cet <pipcet@protonmail.com> writes: >>>> IOW, the old code happened not to run into this problem because >>>> lambda-fixup was pure, and we never applied the sanity checks to the >>>> pure section. >>> >>> Just to be clear: the code on master is fine. I misunderstood it when >>> modifying it for purespace removal, resulting in my bug which Gerd >>> discovered and fixed. The code on no-purespace is also fine now, but >>> it's Andrea's call whether he wants some of the checking code restored, >>> and how. >>> >>>> My suggestion is to fix the "sanity check" on the master branch, change >>>> it to apply to pure relocs there, and restore the fixed check on >>>> scratch/no-purespace afterwards. >>> >>> Please ignore that. My suggestion is to EXTEND the sanity check on the >>> master branch to cover pure and impure relocs, and restore the EXTENDED >>> check before merging scratch/no-purespace. >>> >>> There is no bug to fix on master. >> >> Right your analysis is correct, the new code in the branch just made the >> symbol 'fixup-lambda' not compilable. >> >> I restored the check and applied a variant of your fix with a comment >> around. scratch/no-purespace work for me now. > > Just to summarize this: > There's now a forbidden symbol, --lambda-fixup. If you use this symbol > in your code and compile the code with nativecomp, that may appear to work, > but loading the resulting object file into another Emacs will crash that > Emacs, if that Emacs was built with checks enabled. Correct, maybe we should use 'comp--lambda-fixup' so it's even more evidently private to the compiler, not sure if would be worth using a different kind of placeholder object. >> On master I don't think I see what we should do and the motivation for. > > The scratch/no-purespace branch now tests things more rigorously than > the master branch does: master performs three checks on all impure > relocations and a single check on pure ones, but scratch/no-purespace > performs all three checks on all relocations. > > That means when we merge scratch/no-purespace, and hit one of the new > assertions, it may be (and was, in Gerd's case) because the test would > have failed on the master branch but was never performed there, and that > would be unrelated to purespace removal. > > What I think we should do doesn't really matter, but it seems quite > obvious to me that we should make the code on the master branch perform > all three checks on all relocations, as the code on no-purespace does. master doesn't have forbidden symbols that can't be stored in data_vec so why should we test for that? ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-19 9:28 ` Pip Cet via Emacs development discussions. 2024-12-19 10:38 ` Andrea Corallo @ 2024-12-19 10:50 ` Stefan Kangas 2024-12-19 12:08 ` Pip Cet via Emacs development discussions. 1 sibling, 1 reply; 65+ messages in thread From: Stefan Kangas @ 2024-12-19 10:50 UTC (permalink / raw) To: Pip Cet, Andrea Corallo Cc: Gerd Möllmann, Eli Zaretskii, emacs-devel, monnier Pip Cet <pipcet@protonmail.com> writes: > What I think we should do doesn't really matter, but it seems quite > obvious to me that we should make the code on the master branch > perform all three checks on all relocations, as the code on > no-purespace does. Maybe. But won't we get those checks with no additional effort once we merge no-purespace, and if so, can't it wait until then? ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-19 10:50 ` Stefan Kangas @ 2024-12-19 12:08 ` Pip Cet via Emacs development discussions. 2024-12-19 17:55 ` Stefan Kangas 0 siblings, 1 reply; 65+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-19 12:08 UTC (permalink / raw) To: Stefan Kangas Cc: Andrea Corallo, Gerd Möllmann, Eli Zaretskii, emacs-devel, monnier "Stefan Kangas" <stefankangas@gmail.com> writes: > Pip Cet <pipcet@protonmail.com> writes: > >> What I think we should do doesn't really matter, but it seems quite >> obvious to me that we should make the code on the master branch >> perform all three checks on all relocations, as the code on >> no-purespace does. > > Maybe. But won't we get those checks with no additional effort once we > merge no-purespace, Yes, we will. (And the forbidden symbol; even if the forbidden symbol doesn't cause trouble, which I think it will, it's simply very poor programming practice to do things that way, particularly since the crash may happen a long time after the compilation. But, again, what I think obviously doesn't matter here. I'll just remember that --enable-checking causes false positive crashes and shouldn't be used). That's a problem, because if we run into problems there, we'll have no way of knowing whether the problem was present on the pre-merge master (where we didn't check) or not. But, again, as it's specific to --enable-checking, we can simply stop using that. > and if so, can't it wait until then? Of course, but changing two things at a time makes debugging harder. (And IIUC, we won't rename the symbol on master until we merge, so that's three changes which can cause trouble with the nativecomp code, all introduced at the same time). Pip ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-19 12:08 ` Pip Cet via Emacs development discussions. @ 2024-12-19 17:55 ` Stefan Kangas 2024-12-19 20:13 ` Pip Cet via Emacs development discussions. 2024-12-20 8:42 ` Pip Cet via Emacs development discussions. 0 siblings, 2 replies; 65+ messages in thread From: Stefan Kangas @ 2024-12-19 17:55 UTC (permalink / raw) To: Pip Cet Cc: Andrea Corallo, Gerd Möllmann, Eli Zaretskii, emacs-devel, monnier Pip Cet <pipcet@protonmail.com> writes: > "Stefan Kangas" <stefankangas@gmail.com> writes: > >> Pip Cet <pipcet@protonmail.com> writes: >> >>> What I think we should do doesn't really matter, but it seems quite >>> obvious to me that we should make the code on the master branch >>> perform all three checks on all relocations, as the code on >>> no-purespace does. >> >> Maybe. But won't we get those checks with no additional effort once we >> merge no-purespace, > > Yes, we will. (And the forbidden symbol; even if the forbidden symbol > doesn't cause trouble, which I think it will, it's simply very poor > programming practice to do things that way, particularly since the > crash may happen a long time after the compilation. But, again, what I > think obviously doesn't matter here. I'll just remember that > --enable-checking causes false positive crashes and shouldn't be used). I don't think the existence of one symbol that will crash Emacs in some situations means that --enable-checking should be completely avoided. It's still quite useful, and we're fine as long as we avoid using that one symbol, right? OTOH and IMHO, it would be preferable if that symbol could not crash Emacs. Can we come up with a good way to fix that, while preserving the check that Andrea wants to keep? > That's a problem, because if we run into problems there, we'll have no > way of knowing whether the problem was present on the pre-merge master > (where we didn't check) or not. But, again, as it's specific to > --enable-checking, we can simply stop using that. > >> and if so, can't it wait until then? > > Of course, but changing two things at a time makes debugging harder. > (And IIUC, we won't rename the symbol on master until we merge, so > that's three changes which can cause trouble with the nativecomp code, > all introduced at the same time). I still don't think I understand your argument here, sorry. The scratch/no-purespace branch contains several different changes, all of which have had to pass through review, testing and verification. Why is it particularly important to "backport" this change to master, in advance of the merge, but not any of the other changes on that branch? What am I missing? ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-19 17:55 ` Stefan Kangas @ 2024-12-19 20:13 ` Pip Cet via Emacs development discussions. 2024-12-20 15:59 ` Stefan Monnier 2025-01-03 22:36 ` Pip Cet via Emacs development discussions. 2024-12-20 8:42 ` Pip Cet via Emacs development discussions. 1 sibling, 2 replies; 65+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-19 20:13 UTC (permalink / raw) To: Stefan Kangas Cc: Andrea Corallo, Gerd Möllmann, Eli Zaretskii, emacs-devel, monnier "Stefan Kangas" <stefankangas@gmail.com> writes: > Pip Cet <pipcet@protonmail.com> writes: > >> "Stefan Kangas" <stefankangas@gmail.com> writes: >> >>> Pip Cet <pipcet@protonmail.com> writes: >>> >>>> What I think we should do doesn't really matter, but it seems quite >>>> obvious to me that we should make the code on the master branch >>>> perform all three checks on all relocations, as the code on >>>> no-purespace does. >>> >>> Maybe. But won't we get those checks with no additional effort once we >>> merge no-purespace, >> >> Yes, we will. (And the forbidden symbol; even if the forbidden symbol >> doesn't cause trouble, which I think it will, it's simply very poor >> programming practice to do things that way, particularly since the >> crash may happen a long time after the compilation. But, again, what I >> think obviously doesn't matter here. I'll just remember that >> --enable-checking causes false positive crashes and shouldn't be used). > > I don't think the existence of one symbol that will crash Emacs in some > situations means that --enable-checking should be completely avoided. > It's still quite useful, and we're fine as long as we avoid using that > one symbol, right? > > OTOH and IMHO, it would be preferable if that symbol could not crash > Emacs. Can we come up with a good way to fix that, while preserving the > check that Andrea wants to keep? That sounds like a good thing to focus on, yes. We need to have a value in a vector that we Fread that is distinguishable from all other values. One cheap way of doing that is to make the structure circular, and use references to the object being read as a placeholder value. So instead of having "[a b lambda-fixup c]" in the blob, we'd have "#1=[a b #1# c]" Note that that makes element 2 of the vector EQ to the vector being read, which only comes into existence during reading, so there's no way to create such a reference in any other way. We could also make this more explicit by using a more complex expression involving #1# instead of #1# directly. For example, "#1=[a b (unresolved-lambda-fixup #1#) c]" One side effect is that we need to make sure read-circle is t before reading the string (which is, IIUC, already produced with print-circle bound to t, so reading it with read-circle seems like the correct thing to do). For example, right now this code doesn't work: (let ((print-circle t) (read-circle nil)) (message "%S" (funcall (native-compile (lambda () #1=[#1#]))))) (read, of course, with read-circle bound to t) but this does: (let ((print-circle t) (read-circle nil)) (message "%S" (funcall (lambda () #1=[#1#])))) So this seems like a cheap drive-by fix. A more complicated approach would be to add a special read syntax (if there isn't one that I missed) producing references to objects defined outside of the read string: in essence, we'd call Fread in a special way that makes #s(magic-cookie x) evaluate to be eq to something passed in as a "magic cookie". This might be generally useful and avoids the problems of circular data structures. But I'll try to come up with a patch doing the simple #1# thing first. >> That's a problem, because if we run into problems there, we'll have no >> way of knowing whether the problem was present on the pre-merge master >> (where we didn't check) or not. But, again, as it's specific to >> --enable-checking, we can simply stop using that. >> >>> and if so, can't it wait until then? >> >> Of course, but changing two things at a time makes debugging harder. >> (And IIUC, we won't rename the symbol on master until we merge, so >> that's three changes which can cause trouble with the nativecomp code, >> all introduced at the same time). > > I still don't think I understand your argument here, sorry. > > The scratch/no-purespace branch contains several different changes, all > of which have had to pass through review, testing and verification. > Why is it particularly important to "backport" this change to master, in > advance of the merge, but not any of the other changes on that branch? > What am I missing? On reflection, it was I who was missing something, that no-purespace already has moved from "the minimum set of changes necessary for getting rid of purespace and unexec" to "a coherent set of changes including purespace and unexec removal and quite a few other things". I think that's a good change to have happened, it just took me a while to catch up with it. So no continued objections there. Pip ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-19 20:13 ` Pip Cet via Emacs development discussions. @ 2024-12-20 15:59 ` Stefan Monnier 2024-12-20 16:22 ` Pip Cet via Emacs development discussions. 2025-01-03 22:36 ` Pip Cet via Emacs development discussions. 1 sibling, 1 reply; 65+ messages in thread From: Stefan Monnier @ 2024-12-20 15:59 UTC (permalink / raw) To: Pip Cet Cc: Stefan Kangas, Andrea Corallo, Gerd Möllmann, Eli Zaretskii, emacs-devel >> OTOH and IMHO, it would be preferable if that symbol could not crash >> Emacs. Can we come up with a good way to fix that, while preserving the >> check that Andrea wants to keep? > > That sounds like a good thing to focus on, yes. We need to have a value > in a vector that we Fread that is distinguishable from all other values. How 'bout an uninterned symbol `#:foo`? Stefan ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-20 15:59 ` Stefan Monnier @ 2024-12-20 16:22 ` Pip Cet via Emacs development discussions. 2024-12-20 17:25 ` Gerd Möllmann 0 siblings, 1 reply; 65+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-20 16:22 UTC (permalink / raw) To: Stefan Monnier Cc: Stefan Kangas, Andrea Corallo, Gerd Möllmann, Eli Zaretskii, emacs-devel "Stefan Monnier" <monnier@iro.umontreal.ca> writes: >>> OTOH and IMHO, it would be preferable if that symbol could not crash >>> Emacs. Can we come up with a good way to fix that, while preserving the >>> check that Andrea wants to keep? >> >> That sounds like a good thing to focus on, yes. We need to have a value >> in a vector that we Fread that is distinguishable from all other values. > > How 'bout an uninterned symbol `#:foo`? I think those are legal for native-compiled subrs to use (there's a comment about it, at least), so that wouldn't do us any good, would it? Pip ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-20 16:22 ` Pip Cet via Emacs development discussions. @ 2024-12-20 17:25 ` Gerd Möllmann 2024-12-20 20:35 ` Andrea Corallo 2024-12-20 20:38 ` Pip Cet via Emacs development discussions. 0 siblings, 2 replies; 65+ messages in thread From: Gerd Möllmann @ 2024-12-20 17:25 UTC (permalink / raw) To: Pip Cet via Emacs development discussions. Cc: Stefan Monnier, Pip Cet, Stefan Kangas, Andrea Corallo, Eli Zaretskii Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> writes: > "Stefan Monnier" <monnier@iro.umontreal.ca> writes: > >>>> OTOH and IMHO, it would be preferable if that symbol could not crash >>>> Emacs. Can we come up with a good way to fix that, while preserving the >>>> check that Andrea wants to keep? >>> >>> That sounds like a good thing to focus on, yes. We need to have a value >>> in a vector that we Fread that is distinguishable from all other values. >> >> How 'bout an uninterned symbol `#:foo`? > > I think those are legal for native-compiled subrs to use (there's a > comment about it, at least), so that wouldn't do us any good, would it? > > Pip Uninterned symbols are unique, i.e. (eq '#:a '#:a) => nil. Wouldn't that help? ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-20 17:25 ` Gerd Möllmann @ 2024-12-20 20:35 ` Andrea Corallo 2024-12-20 20:39 ` Pip Cet via Emacs development discussions. 2024-12-20 20:38 ` Pip Cet via Emacs development discussions. 1 sibling, 1 reply; 65+ messages in thread From: Andrea Corallo @ 2024-12-20 20:35 UTC (permalink / raw) To: Gerd Möllmann Cc: Pip Cet via Emacs development discussions., Stefan Monnier, Pip Cet, Stefan Kangas, Eli Zaretskii Gerd Möllmann <gerd.moellmann@gmail.com> writes: > Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> > writes: > >> "Stefan Monnier" <monnier@iro.umontreal.ca> writes: >> >>>>> OTOH and IMHO, it would be preferable if that symbol could not crash >>>>> Emacs. Can we come up with a good way to fix that, while preserving the >>>>> check that Andrea wants to keep? >>>> >>>> That sounds like a good thing to focus on, yes. We need to have a value >>>> in a vector that we Fread that is distinguishable from all other values. >>> >>> How 'bout an uninterned symbol `#:foo`? >> >> I think those are legal for native-compiled subrs to use (there's a >> comment about it, at least), so that wouldn't do us any good, would it? >> >> Pip > > Uninterned symbols are unique, i.e. (eq '#:a '#:a) => nil. > Wouldn't that help? Yes. we want and still can use eq in the test, I guess the best is to create and store an uninterned symbol somewhere, so we can use it to mark and test those locations in the relocation vector. ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-20 20:35 ` Andrea Corallo @ 2024-12-20 20:39 ` Pip Cet via Emacs development discussions. 2024-12-21 6:33 ` Gerd Möllmann 2024-12-21 6:56 ` Andrea Corallo 0 siblings, 2 replies; 65+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-20 20:39 UTC (permalink / raw) To: Andrea Corallo Cc: Gerd Möllmann, Pip Cet via "Emacs development discussions.", Stefan Monnier, Stefan Kangas, Eli Zaretskii "Andrea Corallo" <acorallo@gnu.org> writes: >> Uninterned symbols are unique, i.e. (eq '#:a '#:a) => nil. >> Wouldn't that help? > > Yes. we want and still can use eq in the test, I guess the best is to > create and store an uninterned symbol somewhere, so we can use it to > mark and test those locations in the relocation vector. I still don't see how, sorry. We can put a new uninterned symbol into the relocation vector, but there's no read syntax for "put this specific uninterned symbol here", is there? Pip ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-20 20:39 ` Pip Cet via Emacs development discussions. @ 2024-12-21 6:33 ` Gerd Möllmann 2024-12-21 6:56 ` Andrea Corallo 1 sibling, 0 replies; 65+ messages in thread From: Gerd Möllmann @ 2024-12-21 6:33 UTC (permalink / raw) To: Pip Cet via Emacs development discussions. Cc: Andrea Corallo, Pip Cet, Stefan Monnier, Stefan Kangas, Eli Zaretskii Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> writes: > "Andrea Corallo" <acorallo@gnu.org> writes: > >>> Uninterned symbols are unique, i.e. (eq '#:a '#:a) => nil. >>> Wouldn't that help? >> >> Yes. we want and still can use eq in the test, I guess the best is to >> create and store an uninterned symbol somewhere, so we can use it to >> mark and test those locations in the relocation vector. > > I still don't see how, sorry. We can put a new uninterned symbol into > the relocation vector, but there's no read syntax for "put this specific > uninterned symbol here", is there? No, there isn't except for the #1=, #1# thing I mentioned. ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-20 20:39 ` Pip Cet via Emacs development discussions. 2024-12-21 6:33 ` Gerd Möllmann @ 2024-12-21 6:56 ` Andrea Corallo 1 sibling, 0 replies; 65+ messages in thread From: Andrea Corallo @ 2024-12-21 6:56 UTC (permalink / raw) To: Pip Cet Cc: Gerd Möllmann, Pip Cet via "Emacs development discussions.", Stefan Monnier, Stefan Kangas, Eli Zaretskii Pip Cet <pipcet@protonmail.com> writes: > "Andrea Corallo" <acorallo@gnu.org> writes: > >>> Uninterned symbols are unique, i.e. (eq '#:a '#:a) => nil. >>> Wouldn't that help? >> >> Yes. we want and still can use eq in the test, I guess the best is to >> create and store an uninterned symbol somewhere, so we can use it to >> mark and test those locations in the relocation vector. > > I still don't see how, sorry. We can put a new uninterned symbol into > the relocation vector, but there's no read syntax for "put this specific > uninterned symbol here", is there? Ops you are totally right. Mmhhh... ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-20 17:25 ` Gerd Möllmann 2024-12-20 20:35 ` Andrea Corallo @ 2024-12-20 20:38 ` Pip Cet via Emacs development discussions. 2024-12-20 20:57 ` Gerd Möllmann 1 sibling, 1 reply; 65+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-20 20:38 UTC (permalink / raw) To: Gerd Möllmann Cc: Pip Cet via "Emacs development discussions.", Stefan Monnier, Stefan Kangas, Andrea Corallo, Eli Zaretskii Gerd Möllmann <gerd.moellmann@gmail.com> writes: > Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> > writes: > >> "Stefan Monnier" <monnier@iro.umontreal.ca> writes: >> >>>>> OTOH and IMHO, it would be preferable if that symbol could not crash >>>>> Emacs. Can we come up with a good way to fix that, while preserving the >>>>> check that Andrea wants to keep? >>>> >>>> That sounds like a good thing to focus on, yes. We need to have a value >>>> in a vector that we Fread that is distinguishable from all other values. >>> >>> How 'bout an uninterned symbol `#:foo`? >> >> I think those are legal for native-compiled subrs to use (there's a >> comment about it, at least), so that wouldn't do us any good, would it? >> >> Pip > > Uninterned symbols are unique, i.e. (eq '#:a '#:a) => nil. > Wouldn't that help? I don't see how, sorry. Nativecomp produces a string in the .eln file that is passed to make_string(), then to Fread(). The string can include source code for uninterned symbols, but there's no way for us to tell whether a #:lambda-fixup in there was put in by us or by the user. What we could do is expand read so this: (let* ((cookie (cons nil nil)) (read-table (list (cons "cookie" cookie))) (result (read "#s(cookie cookie)"))) (eq result cookie)) to return t. Then we'd have an expressive syntax for this kind of thing (obviously, "cookie" should be something useful in the #s(...) syntax. A slightly different option is to do this: (let* ((cookie (cons nil nil)) (read-circle (list (cons 1 cookie))) (result (read "#1#"))) (eq result cookie)) in effect allowing us to use prime the read-circle table with externally-provided objects. Pip ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-20 20:38 ` Pip Cet via Emacs development discussions. @ 2024-12-20 20:57 ` Gerd Möllmann 0 siblings, 0 replies; 65+ messages in thread From: Gerd Möllmann @ 2024-12-20 20:57 UTC (permalink / raw) To: Pip Cet Cc: Pip Cet via "Emacs development discussions.", Stefan Monnier, Stefan Kangas, Andrea Corallo, Eli Zaretskii Pip Cet <pipcet@protonmail.com> writes: > Gerd Möllmann <gerd.moellmann@gmail.com> writes: > >> Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> >> writes: >> >>> "Stefan Monnier" <monnier@iro.umontreal.ca> writes: >>> >>>>>> OTOH and IMHO, it would be preferable if that symbol could not crash >>>>>> Emacs. Can we come up with a good way to fix that, while preserving the >>>>>> check that Andrea wants to keep? >>>>> >>>>> That sounds like a good thing to focus on, yes. We need to have a value >>>>> in a vector that we Fread that is distinguishable from all other values. >>>> >>>> How 'bout an uninterned symbol `#:foo`? >>> >>> I think those are legal for native-compiled subrs to use (there's a >>> comment about it, at least), so that wouldn't do us any good, would it? >>> >>> Pip >> >> Uninterned symbols are unique, i.e. (eq '#:a '#:a) => nil. >> Wouldn't that help? > > I don't see how, sorry. Nativecomp produces a string in the .eln file > that is passed to make_string(), then to Fread(). The string can > include source code for uninterned symbols, but there's no way for us to > tell whether a #:lambda-fixup in there was put in by us or by the user. > > What we could do is expand read so this: > > (let* ((cookie (cons nil nil)) > (read-table (list (cons "cookie" cookie))) > (result (read "#s(cookie cookie)"))) > (eq result cookie)) > > to return t. Then we'd have an expressive syntax for this kind of > thing (obviously, "cookie" should be something useful in the #s(...) > syntax. > > A slightly different option is to do this: > > (let* ((cookie (cons nil nil)) > (read-circle (list (cons 1 cookie))) > (result (read "#1#"))) > (eq result cookie)) > > in effect allowing us to use prime the read-circle table with > externally-provided objects. > I was thinking of using something like (read-from-string "[#1=#:a #1# #1#]") => ([a a a] . 16) in which case the first element of the result vector is the uninterned symbol that, only in this read result, would serve as a marker. The marker would be different with each read. Don't know if that's good, though. Just what I think Stefan's idea was. ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-19 20:13 ` Pip Cet via Emacs development discussions. 2024-12-20 15:59 ` Stefan Monnier @ 2025-01-03 22:36 ` Pip Cet via Emacs development discussions. 2025-01-03 23:50 ` Stefan Monnier 2025-01-04 7:26 ` Eli Zaretskii 1 sibling, 2 replies; 65+ messages in thread From: Pip Cet via Emacs development discussions. @ 2025-01-03 22:36 UTC (permalink / raw) To: Pip Cet via "Emacs development discussions." Cc: Stefan Kangas, Andrea Corallo, Gerd Möllmann, Eli Zaretskii, monnier "Pip Cet via \"Emacs development discussions.\"" <emacs-devel@gnu.org> writes: > "Stefan Kangas" <stefankangas@gmail.com> writes: > >> Pip Cet <pipcet@protonmail.com> writes: >> >>> "Stefan Kangas" <stefankangas@gmail.com> writes: >>> >>>> Pip Cet <pipcet@protonmail.com> writes: >>>> >>>>> What I think we should do doesn't really matter, but it seems quite >>>>> obvious to me that we should make the code on the master branch >>>>> perform all three checks on all relocations, as the code on >>>>> no-purespace does. >>>> >>>> Maybe. But won't we get those checks with no additional effort once we >>>> merge no-purespace, >>> >>> Yes, we will. (And the forbidden symbol; even if the forbidden symbol >>> doesn't cause trouble, which I think it will, it's simply very poor >>> programming practice to do things that way, particularly since the >>> crash may happen a long time after the compilation. But, again, what I >>> think obviously doesn't matter here. I'll just remember that >>> --enable-checking causes false positive crashes and shouldn't be used). >> >> I don't think the existence of one symbol that will crash Emacs in some >> situations means that --enable-checking should be completely avoided. >> It's still quite useful, and we're fine as long as we avoid using that >> one symbol, right? >> >> OTOH and IMHO, it would be preferable if that symbol could not crash >> Emacs. Can we come up with a good way to fix that, while preserving the >> check that Andrea wants to keep? > > That sounds like a good thing to focus on, yes. We need to have a value > in a vector that we Fread that is distinguishable from all other values. I just reread the code, and #$ may be what we're looking for. It's a unique value that we can pass in to Fread (let-binding load-file-name), and it already exists. OTOH, it's used for docstrings normally, so it may be cleaner to invent new read syntax. I'd really like to fix this. > For example, right now this code doesn't work: > > (let ((print-circle t) (read-circle nil)) > (message "%S" (funcall (native-compile (lambda () #1=[#1#]))))) > > (read, of course, with read-circle bound to t) > > but this does: > > (let ((print-circle t) (read-circle nil)) > (message "%S" (funcall (lambda () #1=[#1#])))) > > So this seems like a cheap drive-by fix. The non-nativecomp code also breaks if read-circle is nil (I assume this is related to autoload), so it isn't just the nativecomp code that should be fixed. Pip ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2025-01-03 22:36 ` Pip Cet via Emacs development discussions. @ 2025-01-03 23:50 ` Stefan Monnier 2025-01-04 7:26 ` Eli Zaretskii 1 sibling, 0 replies; 65+ messages in thread From: Stefan Monnier @ 2025-01-03 23:50 UTC (permalink / raw) To: Pip Cet Cc: Pip Cet via "Emacs development discussions.", Stefan Kangas, Andrea Corallo, Gerd Möllmann, Eli Zaretskii > I just reread the code, and #$ may be what we're looking for. It's a > unique value that we can pass in to Fread (let-binding load-file-name), > and it already exists. OTOH, it's used for docstrings normally, so it > may be cleaner to invent new read syntax. I still don't understand what this is going to be used for, so I'm not sure how hackish this would be, but indeed #$ can be used to have the reader return a known, unique object that we can check with `eq`. Stefan ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2025-01-03 22:36 ` Pip Cet via Emacs development discussions. 2025-01-03 23:50 ` Stefan Monnier @ 2025-01-04 7:26 ` Eli Zaretskii 2025-01-04 11:12 ` Pip Cet via Emacs development discussions. 1 sibling, 1 reply; 65+ messages in thread From: Eli Zaretskii @ 2025-01-04 7:26 UTC (permalink / raw) To: Pip Cet; +Cc: emacs-devel, stefankangas, acorallo, gerd.moellmann, monnier > Date: Fri, 03 Jan 2025 22:36:42 +0000 > From: Pip Cet <pipcet@protonmail.com> > Cc: Stefan Kangas <stefankangas@gmail.com>, Andrea Corallo <acorallo@gnu.org>, Gerd Möllmann <gerd.moellmann@gmail.com>, Eli Zaretskii <eliz@gnu.org>, monnier@iro.umontreal.ca > > "Pip Cet via \"Emacs development discussions.\"" <emacs-devel@gnu.org> writes: > > > "Stefan Kangas" <stefankangas@gmail.com> writes: > > > >> Pip Cet <pipcet@protonmail.com> writes: > >> > >>> "Stefan Kangas" <stefankangas@gmail.com> writes: > >>> > >>>> Pip Cet <pipcet@protonmail.com> writes: > >>>> > >>>>> What I think we should do doesn't really matter, but it seems quite > >>>>> obvious to me that we should make the code on the master branch > >>>>> perform all three checks on all relocations, as the code on > >>>>> no-purespace does. > >>>> > >>>> Maybe. But won't we get those checks with no additional effort once we > >>>> merge no-purespace, > >>> > >>> Yes, we will. (And the forbidden symbol; even if the forbidden symbol > >>> doesn't cause trouble, which I think it will, it's simply very poor > >>> programming practice to do things that way, particularly since the > >>> crash may happen a long time after the compilation. But, again, what I > >>> think obviously doesn't matter here. I'll just remember that > >>> --enable-checking causes false positive crashes and shouldn't be used). > >> > >> I don't think the existence of one symbol that will crash Emacs in some > >> situations means that --enable-checking should be completely avoided. > >> It's still quite useful, and we're fine as long as we avoid using that > >> one symbol, right? > >> > >> OTOH and IMHO, it would be preferable if that symbol could not crash > >> Emacs. Can we come up with a good way to fix that, while preserving the > >> check that Andrea wants to keep? > > > > That sounds like a good thing to focus on, yes. We need to have a value > > in a vector that we Fread that is distinguishable from all other values. > > I just reread the code, and #$ may be what we're looking for. It's a > unique value that we can pass in to Fread (let-binding load-file-name), > and it already exists. OTOH, it's used for docstrings normally, so it > may be cleaner to invent new read syntax. > > I'd really like to fix this. > > > For example, right now this code doesn't work: > > > > (let ((print-circle t) (read-circle nil)) > > (message "%S" (funcall (native-compile (lambda () #1=[#1#]))))) > > > > (read, of course, with read-circle bound to t) > > > > but this does: > > > > (let ((print-circle t) (read-circle nil)) > > (message "%S" (funcall (lambda () #1=[#1#])))) > > > > So this seems like a cheap drive-by fix. > > The non-nativecomp code also breaks if read-circle is nil (I assume this > is related to autoload), so it isn't just the nativecomp code that > should be fixed. Is this about bug#74966? If so, let's not discuss the same issue here. Andrea suggested a change that he prefers, and from where I stand, we should wait for him to publish such a patch. Then we can discuss the patch if there are comments or objections. But let's do all that in the bug thread, not here. Apologies if I misunderstood what the above is about. ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2025-01-04 7:26 ` Eli Zaretskii @ 2025-01-04 11:12 ` Pip Cet via Emacs development discussions. 2025-01-04 13:50 ` Eli Zaretskii 0 siblings, 1 reply; 65+ messages in thread From: Pip Cet via Emacs development discussions. @ 2025-01-04 11:12 UTC (permalink / raw) To: Eli Zaretskii Cc: emacs-devel, stefankangas, acorallo, gerd.moellmann, monnier "Eli Zaretskii" <eliz@gnu.org> writes: >> Date: Fri, 03 Jan 2025 22:36:42 +0000 >> From: Pip Cet <pipcet@protonmail.com> >> Cc: Stefan Kangas <stefankangas@gmail.com>, Andrea Corallo >> <acorallo@gnu.org>, Gerd Möllmann <gerd.moellmann@gmail.com>, Eli >> Zaretskii <eliz@gnu.org>, monnier@iro.umontreal.ca >> >> "Pip Cet via \"Emacs development discussions.\"" <emacs-devel@gnu.org> writes: >> >> > "Stefan Kangas" <stefankangas@gmail.com> writes: >> > >> >> Pip Cet <pipcet@protonmail.com> writes: >> >> >> >>> "Stefan Kangas" <stefankangas@gmail.com> writes: >> >>> >> >>>> Pip Cet <pipcet@protonmail.com> writes: >> >>>> >> >>>>> What I think we should do doesn't really matter, but it seems quite >> >>>>> obvious to me that we should make the code on the master branch >> >>>>> perform all three checks on all relocations, as the code on >> >>>>> no-purespace does. >> >>>> >> >>>> Maybe. But won't we get those checks with no additional effort once we >> >>>> merge no-purespace, >> >>> >> >>> Yes, we will. (And the forbidden symbol; even if the forbidden symbol >> >>> doesn't cause trouble, which I think it will, it's simply very poor >> >>> programming practice to do things that way, particularly since the >> >>> crash may happen a long time after the compilation. But, again, what I >> >>> think obviously doesn't matter here. I'll just remember that >> >>> --enable-checking causes false positive crashes and shouldn't be used). >> >> >> >> I don't think the existence of one symbol that will crash Emacs in some >> >> situations means that --enable-checking should be completely avoided. >> >> It's still quite useful, and we're fine as long as we avoid using that >> >> one symbol, right? >> >> >> >> OTOH and IMHO, it would be preferable if that symbol could not crash >> >> Emacs. Can we come up with a good way to fix that, while preserving the >> >> check that Andrea wants to keep? >> > >> > That sounds like a good thing to focus on, yes. We need to have a value >> > in a vector that we Fread that is distinguishable from all other values. >> >> I just reread the code, and #$ may be what we're looking for. It's a >> unique value that we can pass in to Fread (let-binding load-file-name), >> and it already exists. OTOH, it's used for docstrings normally, so it >> may be cleaner to invent new read syntax. >> >> I'd really like to fix this. >> >> > For example, right now this code doesn't work: >> > >> > (let ((print-circle t) (read-circle nil)) >> > (message "%S" (funcall (native-compile (lambda () #1=[#1#]))))) >> > >> > (read, of course, with read-circle bound to t) >> > >> > but this does: >> > >> > (let ((print-circle t) (read-circle nil)) >> > (message "%S" (funcall (lambda () #1=[#1#])))) >> > >> > So this seems like a cheap drive-by fix. >> >> The non-nativecomp code also breaks if read-circle is nil (I assume this >> is related to autoload), so it isn't just the nativecomp code that >> should be fixed. > > Is this about bug#74966? If so, let's not discuss the same issue No. This is about the --enable-checking code treating --lambda-fixup as a forbidden symbol and crashing if it is used in the obvious fashion by any code that happens to be native-compiled, which is the current status on scratch/no-purespace. Pip ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2025-01-04 11:12 ` Pip Cet via Emacs development discussions. @ 2025-01-04 13:50 ` Eli Zaretskii 2025-01-06 19:05 ` Andrea Corallo 0 siblings, 1 reply; 65+ messages in thread From: Eli Zaretskii @ 2025-01-04 13:50 UTC (permalink / raw) To: Pip Cet, acorallo; +Cc: emacs-devel, stefankangas, gerd.moellmann, monnier > Date: Sat, 04 Jan 2025 11:12:48 +0000 > From: Pip Cet <pipcet@protonmail.com> > Cc: emacs-devel@gnu.org, stefankangas@gmail.com, acorallo@gnu.org, gerd.moellmann@gmail.com, monnier@iro.umontreal.ca > > "Eli Zaretskii" <eliz@gnu.org> writes: > > > Is this about bug#74966? If so, let's not discuss the same issue > > No. This is about the --enable-checking code treating --lambda-fixup as > a forbidden symbol and crashing if it is used in the obvious fashion by > any code that happens to be native-compiled, which is the current status > on scratch/no-purespace. Then I'd like to hear Andrea's opinion on this. ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2025-01-04 13:50 ` Eli Zaretskii @ 2025-01-06 19:05 ` Andrea Corallo 2025-01-07 12:46 ` Pip Cet via Emacs development discussions. 0 siblings, 1 reply; 65+ messages in thread From: Andrea Corallo @ 2025-01-06 19:05 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Pip Cet, emacs-devel, stefankangas, gerd.moellmann, monnier Eli Zaretskii <eliz@gnu.org> writes: >> Date: Sat, 04 Jan 2025 11:12:48 +0000 >> From: Pip Cet <pipcet@protonmail.com> >> Cc: emacs-devel@gnu.org, stefankangas@gmail.com, acorallo@gnu.org, gerd.moellmann@gmail.com, monnier@iro.umontreal.ca >> >> "Eli Zaretskii" <eliz@gnu.org> writes: >> >> > Is this about bug#74966? If so, let's not discuss the same issue >> >> No. This is about the --enable-checking code treating --lambda-fixup as >> a forbidden symbol and crashing if it is used in the obvious fashion by >> any code that happens to be native-compiled, which is the current status >> on scratch/no-purespace. > > Then I'd like to hear Andrea's opinion on this. My opinion is that, to fix the scratch/no-purespace bug, I'm okay with the #$ solution if it works. Honestly I'd be fine also using a comp specific symbol like comp--something which is documented not to be compilable, maybe less nice but probably less convoluted. Anyway if Pip prefers #$ and is not too complex I'm okay with that. Andrea ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2025-01-06 19:05 ` Andrea Corallo @ 2025-01-07 12:46 ` Pip Cet via Emacs development discussions. 2025-01-18 9:40 ` Eli Zaretskii 2025-01-18 19:52 ` Andrea Corallo 0 siblings, 2 replies; 65+ messages in thread From: Pip Cet via Emacs development discussions. @ 2025-01-07 12:46 UTC (permalink / raw) To: Andrea Corallo Cc: Eli Zaretskii, emacs-devel, stefankangas, gerd.moellmann, monnier "Andrea Corallo" <acorallo@gnu.org> writes: > Eli Zaretskii <eliz@gnu.org> writes: > >>> Date: Sat, 04 Jan 2025 11:12:48 +0000 >>> From: Pip Cet <pipcet@protonmail.com> >>> Cc: emacs-devel@gnu.org, stefankangas@gmail.com, acorallo@gnu.org, gerd.moellmann@gmail.com, monnier@iro.umontreal.ca >>> >>> "Eli Zaretskii" <eliz@gnu.org> writes: >>> >>> > Is this about bug#74966? If so, let's not discuss the same issue >>> >>> No. This is about the --enable-checking code treating --lambda-fixup as >>> a forbidden symbol and crashing if it is used in the obvious fashion by >>> any code that happens to be native-compiled, which is the current status >>> on scratch/no-purespace. >> >> Then I'd like to hear Andrea's opinion on this. > > My opinion is that, to fix the scratch/no-purespace bug, I'm okay with > the #$ solution if it works. Honestly I'd be fine also using a comp > specific symbol like comp--something which is documented not to be > compilable, maybe less nice but probably less convoluted. Anyway if Pip > prefers #$ and is not too complex I'm okay with that. Here's the patch. Okay to push? (As with all the DEFVARs in comp.c, space for the new symbol is reserved in globals, and the documentation ends up in etc/DOC, even in non-nativecomp builds. The same is true for DEFSYM: make-docfile.c doesn't care about #ifdef. If someone fixes this so non-nativecomp builds don't pay a price for nativecomp, this code will need fixing, too.) commit a0abd3033414a011125e815d30890ed22b80a84e (HEAD) Author: Pip Cet <pipcet@protonmail.com> Date: Tue Jan 7 12:13:40 2025 +0000 Use #$ for lambda fixups in native compilation data vectors * lisp/emacs-lisp/comp.el (comp--#$): New defvar. (comp--finalize-container): Use it. * src/comp.c (ABI_VERSION): Bump. (emit_static_object): Ensure 'comp--#$' prints as "#$". (load_static_obj): Ensure '#$' reads as Vcomp__hashdollar. (check_comp_unit_relocs): Adjust assertion. (syms_of_comp): Define 'comp--#$'. * src/pdumper.c (dump_do_dump_relocation): Adjust assertion. diff --git a/lisp/emacs-lisp/comp.el b/lisp/emacs-lisp/comp.el index ab6fd77f11a..a8698ef6454 100644 --- a/lisp/emacs-lisp/comp.el +++ b/lisp/emacs-lisp/comp.el @@ -42,6 +42,7 @@ comp-native-version-dir (defvar comp-subr-arities-h) (defvar native-comp-eln-load-path) (defvar native-comp-enable-subr-trampolines) +(defvar comp--\#$) (declare-function comp--compile-ctxt-to-file0 "comp.c") (declare-function comp--init-ctxt "comp.c") @@ -3254,10 +3255,9 @@ comp--finalize-container ;; from the corresponding m-var. collect (if (gethash obj (comp-ctxt-byte-func-to-func-h comp-ctxt)) - ;; Hack not to have `--lambda-fixup' in - ;; data relocations as it would trigger the - ;; check in 'check_comp_unit_relocs'. - (intern (concat (make-string 1 ?-) "-lambda-fixup")) + ;; This prints as #$, so we can assert this + ;; value does not remain in the data vector + comp--\#$ obj)))) (defun comp--finalize-relocs () diff --git a/src/comp.c b/src/comp.c index 8b38adec252..ecdf01b91cb 100644 --- a/src/comp.c +++ b/src/comp.c @@ -468,7 +468,7 @@ load_gccjit_if_necessary (bool mandatory) \f /* Increase this number to force a new Vcomp_abi_hash to be generated. */ -#define ABI_VERSION "9" +#define ABI_VERSION "10" /* Length of the hashes used for eln file naming. */ #define HASH_LENGTH 8 @@ -2682,6 +2682,12 @@ emit_static_object (const char *name, Lisp_Object obj) specbind (intern_c_string ("print-quoted"), Qt); specbind (intern_c_string ("print-gensym"), Qt); specbind (intern_c_string ("print-circle"), Qt); + /* Bind print-number-table and print-continuous-numbering so comp--#$ + prints as #$. */ + Lisp_Object print_number_table = CALLN (Fmake_hash_table, QCtest, Qeq); + Fputhash (Vcomp__hashdollar, build_string ("#$") , print_number_table); + specbind (intern_c_string ("print-number-table"), print_number_table); + specbind (intern_c_string ("print-continuous-numbering"), Qt); Lisp_Object str = Fprin1_to_string (obj, Qnil, Qnil); unbind_to (count, Qnil); @@ -5145,18 +5151,22 @@ fixup_eln_load_path (Lisp_Object eln_filename) static Lisp_Object load_static_obj (struct Lisp_Native_Comp_Unit *comp_u, const char *name) { + specpdl_ref count = SPECPDL_INDEX (); static_obj_t *blob = dynlib_sym (comp_u->handle, format_string ("%s_blob", name)); + /* Special value so we can recognize #$. */ + specbind (intern_c_string ("load-file-name"), + Vcomp__hashdollar); if (blob) /* New blob format. */ - return Fread (make_string (blob->data, blob->len)); + return unbind_to (count, Fread (make_string (blob->data, blob->len))); static_obj_t *(*f)(void) = dynlib_sym (comp_u->handle, name); if (!f) xsignal1 (Qnative_lisp_file_inconsistent, comp_u->file); blob = f (); - return Fread (make_string (blob->data, blob->len)); + return unbind_to (count, Fread (make_string (blob->data, blob->len))); } @@ -5173,7 +5183,7 @@ check_comp_unit_relocs (struct Lisp_Native_Comp_Unit *comp_u) for (ptrdiff_t i = 0; i < d_vec_len; i++) { Lisp_Object x = data_relocs[i]; - if (EQ (x, Q__lambda_fixup)) + if (EQ (x, Vcomp__hashdollar)) return false; else if (NATIVE_COMP_FUNCTIONP (x)) { @@ -5622,7 +5632,6 @@ syms_of_comp (void) DEFSYM (Qfixnum, "fixnum"); DEFSYM (Qscratch, "scratch"); DEFSYM (Qlate, "late"); - DEFSYM (Q__lambda_fixup, "--lambda-fixup"); DEFSYM (Qgccjit, "gccjit"); DEFSYM (Qcomp_subr_trampoline_install, "comp-subr-trampoline-install"); DEFSYM (Qnative_comp_warning_on_missing_source, @@ -5804,6 +5813,10 @@ syms_of_comp (void) verification of the native compiler. */); comp_sanitizer_active = false; + DEFVAR_LISP ("comp--#$", Vcomp__hashdollar, + doc: /* Special value which will print as "#$". */); + Vcomp__hashdollar = build_string ("#$"); + Fprovide (intern_c_string ("native-compile"), Qnil); #endif /* #ifdef HAVE_NATIVE_COMP */ diff --git a/src/pdumper.c b/src/pdumper.c index d45bbc84bba..a27df8d96d4 100644 --- a/src/pdumper.c +++ b/src/pdumper.c @@ -5504,7 +5504,7 @@ dump_do_dump_relocation (const uintptr_t dump_base, XSETSUBR (tem, subr); Lisp_Object *fixup = &(comp_u->data_relocs[XFIXNUM (lambda_data_idx)]); - eassert (EQ (*fixup, Q__lambda_fixup)); + eassert (EQ (*fixup, Vcomp__hashdollar)); *fixup = tem; Fputhash (tem, Qt, comp_u->lambda_gc_guard_h); } ^ permalink raw reply related [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2025-01-07 12:46 ` Pip Cet via Emacs development discussions. @ 2025-01-18 9:40 ` Eli Zaretskii 2025-01-18 19:52 ` Andrea Corallo 1 sibling, 0 replies; 65+ messages in thread From: Eli Zaretskii @ 2025-01-18 9:40 UTC (permalink / raw) To: acorallo, Pip Cet; +Cc: emacs-devel, stefankangas, gerd.moellmann, monnier Ping! Andrea, any comments to the patch? > Date: Tue, 07 Jan 2025 12:46:16 +0000 > From: Pip Cet <pipcet@protonmail.com> > Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org, stefankangas@gmail.com, gerd.moellmann@gmail.com, monnier@iro.umontreal.ca > > "Andrea Corallo" <acorallo@gnu.org> writes: > > > Eli Zaretskii <eliz@gnu.org> writes: > > > >>> Date: Sat, 04 Jan 2025 11:12:48 +0000 > >>> From: Pip Cet <pipcet@protonmail.com> > >>> Cc: emacs-devel@gnu.org, stefankangas@gmail.com, acorallo@gnu.org, gerd.moellmann@gmail.com, monnier@iro.umontreal.ca > >>> > >>> "Eli Zaretskii" <eliz@gnu.org> writes: > >>> > >>> > Is this about bug#74966? If so, let's not discuss the same issue > >>> > >>> No. This is about the --enable-checking code treating --lambda-fixup as > >>> a forbidden symbol and crashing if it is used in the obvious fashion by > >>> any code that happens to be native-compiled, which is the current status > >>> on scratch/no-purespace. > >> > >> Then I'd like to hear Andrea's opinion on this. > > > > My opinion is that, to fix the scratch/no-purespace bug, I'm okay with > > the #$ solution if it works. Honestly I'd be fine also using a comp > > specific symbol like comp--something which is documented not to be > > compilable, maybe less nice but probably less convoluted. Anyway if Pip > > prefers #$ and is not too complex I'm okay with that. > > Here's the patch. Okay to push? > > (As with all the DEFVARs in comp.c, space for the new symbol is reserved > in globals, and the documentation ends up in etc/DOC, even in > non-nativecomp builds. The same is true for DEFSYM: make-docfile.c > doesn't care about #ifdef. If someone fixes this so non-nativecomp > builds don't pay a price for nativecomp, this code will need fixing, > too.) > > commit a0abd3033414a011125e815d30890ed22b80a84e (HEAD) > Author: Pip Cet <pipcet@protonmail.com> > Date: Tue Jan 7 12:13:40 2025 +0000 > > Use #$ for lambda fixups in native compilation data vectors > > * lisp/emacs-lisp/comp.el (comp--#$): New defvar. > (comp--finalize-container): Use it. > * src/comp.c (ABI_VERSION): Bump. > (emit_static_object): Ensure 'comp--#$' prints as "#$". > (load_static_obj): Ensure '#$' reads as Vcomp__hashdollar. > (check_comp_unit_relocs): Adjust assertion. > (syms_of_comp): Define 'comp--#$'. > * src/pdumper.c (dump_do_dump_relocation): Adjust assertion. > > diff --git a/lisp/emacs-lisp/comp.el b/lisp/emacs-lisp/comp.el > index ab6fd77f11a..a8698ef6454 100644 > --- a/lisp/emacs-lisp/comp.el > +++ b/lisp/emacs-lisp/comp.el > @@ -42,6 +42,7 @@ comp-native-version-dir > (defvar comp-subr-arities-h) > (defvar native-comp-eln-load-path) > (defvar native-comp-enable-subr-trampolines) > +(defvar comp--\#$) > > (declare-function comp--compile-ctxt-to-file0 "comp.c") > (declare-function comp--init-ctxt "comp.c") > @@ -3254,10 +3255,9 @@ comp--finalize-container > ;; from the corresponding m-var. > collect (if (gethash obj > (comp-ctxt-byte-func-to-func-h comp-ctxt)) > - ;; Hack not to have `--lambda-fixup' in > - ;; data relocations as it would trigger the > - ;; check in 'check_comp_unit_relocs'. > - (intern (concat (make-string 1 ?-) "-lambda-fixup")) > + ;; This prints as #$, so we can assert this > + ;; value does not remain in the data vector > + comp--\#$ > obj)))) > > (defun comp--finalize-relocs () > diff --git a/src/comp.c b/src/comp.c > index 8b38adec252..ecdf01b91cb 100644 > --- a/src/comp.c > +++ b/src/comp.c > @@ -468,7 +468,7 @@ load_gccjit_if_necessary (bool mandatory) > > \f > /* Increase this number to force a new Vcomp_abi_hash to be generated. */ > -#define ABI_VERSION "9" > +#define ABI_VERSION "10" > > /* Length of the hashes used for eln file naming. */ > #define HASH_LENGTH 8 > @@ -2682,6 +2682,12 @@ emit_static_object (const char *name, Lisp_Object obj) > specbind (intern_c_string ("print-quoted"), Qt); > specbind (intern_c_string ("print-gensym"), Qt); > specbind (intern_c_string ("print-circle"), Qt); > + /* Bind print-number-table and print-continuous-numbering so comp--#$ > + prints as #$. */ > + Lisp_Object print_number_table = CALLN (Fmake_hash_table, QCtest, Qeq); > + Fputhash (Vcomp__hashdollar, build_string ("#$") , print_number_table); > + specbind (intern_c_string ("print-number-table"), print_number_table); > + specbind (intern_c_string ("print-continuous-numbering"), Qt); > Lisp_Object str = Fprin1_to_string (obj, Qnil, Qnil); > unbind_to (count, Qnil); > > @@ -5145,18 +5151,22 @@ fixup_eln_load_path (Lisp_Object eln_filename) > static Lisp_Object > load_static_obj (struct Lisp_Native_Comp_Unit *comp_u, const char *name) > { > + specpdl_ref count = SPECPDL_INDEX (); > static_obj_t *blob = > dynlib_sym (comp_u->handle, format_string ("%s_blob", name)); > + /* Special value so we can recognize #$. */ > + specbind (intern_c_string ("load-file-name"), > + Vcomp__hashdollar); > if (blob) > /* New blob format. */ > - return Fread (make_string (blob->data, blob->len)); > + return unbind_to (count, Fread (make_string (blob->data, blob->len))); > > static_obj_t *(*f)(void) = dynlib_sym (comp_u->handle, name); > if (!f) > xsignal1 (Qnative_lisp_file_inconsistent, comp_u->file); > > blob = f (); > - return Fread (make_string (blob->data, blob->len)); > + return unbind_to (count, Fread (make_string (blob->data, blob->len))); > > } > > @@ -5173,7 +5183,7 @@ check_comp_unit_relocs (struct Lisp_Native_Comp_Unit *comp_u) > for (ptrdiff_t i = 0; i < d_vec_len; i++) > { > Lisp_Object x = data_relocs[i]; > - if (EQ (x, Q__lambda_fixup)) > + if (EQ (x, Vcomp__hashdollar)) > return false; > else if (NATIVE_COMP_FUNCTIONP (x)) > { > @@ -5622,7 +5632,6 @@ syms_of_comp (void) > DEFSYM (Qfixnum, "fixnum"); > DEFSYM (Qscratch, "scratch"); > DEFSYM (Qlate, "late"); > - DEFSYM (Q__lambda_fixup, "--lambda-fixup"); > DEFSYM (Qgccjit, "gccjit"); > DEFSYM (Qcomp_subr_trampoline_install, "comp-subr-trampoline-install"); > DEFSYM (Qnative_comp_warning_on_missing_source, > @@ -5804,6 +5813,10 @@ syms_of_comp (void) > verification of the native compiler. */); > comp_sanitizer_active = false; > > + DEFVAR_LISP ("comp--#$", Vcomp__hashdollar, > + doc: /* Special value which will print as "#$". */); > + Vcomp__hashdollar = build_string ("#$"); > + > Fprovide (intern_c_string ("native-compile"), Qnil); > #endif /* #ifdef HAVE_NATIVE_COMP */ > > diff --git a/src/pdumper.c b/src/pdumper.c > index d45bbc84bba..a27df8d96d4 100644 > --- a/src/pdumper.c > +++ b/src/pdumper.c > @@ -5504,7 +5504,7 @@ dump_do_dump_relocation (const uintptr_t dump_base, > XSETSUBR (tem, subr); > Lisp_Object *fixup = > &(comp_u->data_relocs[XFIXNUM (lambda_data_idx)]); > - eassert (EQ (*fixup, Q__lambda_fixup)); > + eassert (EQ (*fixup, Vcomp__hashdollar)); > *fixup = tem; > Fputhash (tem, Qt, comp_u->lambda_gc_guard_h); > } > > ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2025-01-07 12:46 ` Pip Cet via Emacs development discussions. 2025-01-18 9:40 ` Eli Zaretskii @ 2025-01-18 19:52 ` Andrea Corallo 2025-01-18 20:20 ` Pip Cet via Emacs development discussions. 1 sibling, 1 reply; 65+ messages in thread From: Andrea Corallo @ 2025-01-18 19:52 UTC (permalink / raw) To: Pip Cet; +Cc: Eli Zaretskii, emacs-devel, stefankangas, gerd.moellmann, monnier Pip Cet <pipcet@protonmail.com> writes: > "Andrea Corallo" <acorallo@gnu.org> writes: > >> Eli Zaretskii <eliz@gnu.org> writes: >> >>>> Date: Sat, 04 Jan 2025 11:12:48 +0000 >>>> From: Pip Cet <pipcet@protonmail.com> >>>> Cc: emacs-devel@gnu.org, stefankangas@gmail.com, acorallo@gnu.org, gerd.moellmann@gmail.com, monnier@iro.umontreal.ca >>>> >>>> "Eli Zaretskii" <eliz@gnu.org> writes: >>>> >>>> > Is this about bug#74966? If so, let's not discuss the same issue >>>> >>>> No. This is about the --enable-checking code treating --lambda-fixup as >>>> a forbidden symbol and crashing if it is used in the obvious fashion by >>>> any code that happens to be native-compiled, which is the current status >>>> on scratch/no-purespace. >>> >>> Then I'd like to hear Andrea's opinion on this. >> >> My opinion is that, to fix the scratch/no-purespace bug, I'm okay with >> the #$ solution if it works. Honestly I'd be fine also using a comp >> specific symbol like comp--something which is documented not to be >> compilable, maybe less nice but probably less convoluted. Anyway if Pip >> prefers #$ and is not too complex I'm okay with that. > > Here's the patch. Okay to push? > > (As with all the DEFVARs in comp.c, space for the new symbol is reserved > in globals, and the documentation ends up in etc/DOC, even in > non-nativecomp builds. The same is true for DEFSYM: make-docfile.c > doesn't care about #ifdef. If someone fixes this so non-nativecomp > builds don't pay a price for nativecomp, this code will need fixing, > too.) > > commit a0abd3033414a011125e815d30890ed22b80a84e (HEAD) > Author: Pip Cet <pipcet@protonmail.com> > Date: Tue Jan 7 12:13:40 2025 +0000 > > Use #$ for lambda fixups in native compilation data vectors > > * lisp/emacs-lisp/comp.el (comp--#$): New defvar. > (comp--finalize-container): Use it. > * src/comp.c (ABI_VERSION): Bump. > (emit_static_object): Ensure 'comp--#$' prints as "#$". > (load_static_obj): Ensure '#$' reads as Vcomp__hashdollar. > (check_comp_unit_relocs): Adjust assertion. > (syms_of_comp): Define 'comp--#$'. > * src/pdumper.c (dump_do_dump_relocation): Adjust assertion. The code LGTM, I'd only expand the commit message with a quick explaination of the mechanism and a link to this discussion. > + specpdl_ref count = SPECPDL_INDEX (); > static_obj_t *blob = > dynlib_sym (comp_u->handle, format_string ("%s_blob", name)); > + /* Special value so we can recognize #$. */ Probably this comment should be a little more generous. Otherwise okay for me for scratch/no-purespace. Thanks! Andrea ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2025-01-18 19:52 ` Andrea Corallo @ 2025-01-18 20:20 ` Pip Cet via Emacs development discussions. 2025-01-18 20:36 ` Andrea Corallo 2025-01-19 5:18 ` Eli Zaretskii 0 siblings, 2 replies; 65+ messages in thread From: Pip Cet via Emacs development discussions. @ 2025-01-18 20:20 UTC (permalink / raw) To: Andrea Corallo Cc: Eli Zaretskii, emacs-devel, stefankangas, gerd.moellmann, monnier "Andrea Corallo" <acorallo@gnu.org> writes: >> commit a0abd3033414a011125e815d30890ed22b80a84e (HEAD) >> Author: Pip Cet <pipcet@protonmail.com> >> Date: Tue Jan 7 12:13:40 2025 +0000 >> >> Use #$ for lambda fixups in native compilation data vectors >> >> * lisp/emacs-lisp/comp.el (comp--#$): New defvar. >> (comp--finalize-container): Use it. >> * src/comp.c (ABI_VERSION): Bump. >> (emit_static_object): Ensure 'comp--#$' prints as "#$". >> (load_static_obj): Ensure '#$' reads as Vcomp__hashdollar. >> (check_comp_unit_relocs): Adjust assertion. >> (syms_of_comp): Define 'comp--#$'. >> * src/pdumper.c (dump_do_dump_relocation): Adjust assertion. > > The code LGTM, I'd only expand the commit message with a quick > explaination of the mechanism and a link to this discussion. I tried to, but the relevant message never made it to the emacs-devel archives, it seems. Is this a known issue? >> + specpdl_ref count = SPECPDL_INDEX (); >> static_obj_t *blob = >> dynlib_sym (comp_u->handle, format_string ("%s_blob", name)); >> + /* Special value so we can recognize #$. */ > > Probably this comment should be a little more generous. Agreed. > Otherwise okay for me for scratch/no-purespace. I'll push without the link in a bit, unless we find a solution for that problem. Pip ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2025-01-18 20:20 ` Pip Cet via Emacs development discussions. @ 2025-01-18 20:36 ` Andrea Corallo 2025-01-19 5:18 ` Eli Zaretskii 1 sibling, 0 replies; 65+ messages in thread From: Andrea Corallo @ 2025-01-18 20:36 UTC (permalink / raw) To: Pip Cet; +Cc: Eli Zaretskii, emacs-devel, stefankangas, gerd.moellmann, monnier Pip Cet <pipcet@protonmail.com> writes: > "Andrea Corallo" <acorallo@gnu.org> writes: > >>> commit a0abd3033414a011125e815d30890ed22b80a84e (HEAD) >>> Author: Pip Cet <pipcet@protonmail.com> >>> Date: Tue Jan 7 12:13:40 2025 +0000 >>> >>> Use #$ for lambda fixups in native compilation data vectors >>> >>> * lisp/emacs-lisp/comp.el (comp--#$): New defvar. >>> (comp--finalize-container): Use it. >>> * src/comp.c (ABI_VERSION): Bump. >>> (emit_static_object): Ensure 'comp--#$' prints as "#$". >>> (load_static_obj): Ensure '#$' reads as Vcomp__hashdollar. >>> (check_comp_unit_relocs): Adjust assertion. >>> (syms_of_comp): Define 'comp--#$'. >>> * src/pdumper.c (dump_do_dump_relocation): Adjust assertion. >> >> The code LGTM, I'd only expand the commit message with a quick >> explaination of the mechanism and a link to this discussion. > > I tried to, but the relevant message never made it to the emacs-devel > archives, it seems. Is this a known issue? Not to me at least :( >>> + specpdl_ref count = SPECPDL_INDEX (); >>> static_obj_t *blob = >>> dynlib_sym (comp_u->handle, format_string ("%s_blob", name)); >>> + /* Special value so we can recognize #$. */ >> >> Probably this comment should be a little more generous. > > Agreed. > >> Otherwise okay for me for scratch/no-purespace. > > I'll push without the link in a bit, unless we find a solution for that > problem. Okay, with some explaination in the comment + commit message is sufficient. Thanks Andrea ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2025-01-18 20:20 ` Pip Cet via Emacs development discussions. 2025-01-18 20:36 ` Andrea Corallo @ 2025-01-19 5:18 ` Eli Zaretskii 1 sibling, 0 replies; 65+ messages in thread From: Eli Zaretskii @ 2025-01-19 5:18 UTC (permalink / raw) To: Pip Cet; +Cc: acorallo, emacs-devel, stefankangas, gerd.moellmann, monnier > Date: Sat, 18 Jan 2025 20:20:42 +0000 > From: Pip Cet <pipcet@protonmail.com> > Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org, stefankangas@gmail.com, gerd.moellmann@gmail.com, monnier@iro.umontreal.ca > > "Andrea Corallo" <acorallo@gnu.org> writes: > > > The code LGTM, I'd only expand the commit message with a quick > > explaination of the mechanism and a link to this discussion. > > I tried to, but the relevant message never made it to the emacs-devel > archives, it seems. Is this a known issue? I did not subscribe to the list, so each message from you to the list must be manually approved by the list moderators. And humans can make mistakes. If you want to avoid that, subscribe to the list. ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-19 17:55 ` Stefan Kangas 2024-12-19 20:13 ` Pip Cet via Emacs development discussions. @ 2024-12-20 8:42 ` Pip Cet via Emacs development discussions. 1 sibling, 0 replies; 65+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-20 8:42 UTC (permalink / raw) To: Stefan Kangas Cc: Andrea Corallo, Gerd Möllmann, Eli Zaretskii, emacs-devel, monnier Pip Cet <pipcet@protonmail.com> writes: > But I'll try to come up with a patch doing the simple #1# thing first. Easy enough to do, but it doesn't solve the read-circle problem (which appears to affect autoloaded .elc files as well as .eln files, so it's probably best to discuss that in a separate bug report once this is merged). From 6f4133b54edc890993d6198940800cc9a19df85c Mon Sep 17 00:00:00 2001 From: Pip Cet <pipcet@protonmail.com> Date: Fri, 20 Dec 2024 07:53:38 +0000 Subject: [PATCH] Use circular references as placeholders in nativecomp relocs * lisp/emacs-lisp/comp.el (comp--finalize-container): Use temporary placeholder rather than `--lambda-fixup'. * src/comp.c (declare_imported_data_relocs): Replace temporary placeholders by permanent placeholders. (check_comp_unit_relocs): Check for circular reference instead of `--lambda-fixup'. (syms_of_comp): Remove `--lambda-fixup'. * src/pdumper.c (dump_do_dump_relocation): Check for circular reference instead of `--lambda-fixup'. --- lisp/emacs-lisp/comp.el | 8 ++++---- src/comp.c | 14 ++++++++++---- src/pdumper.c | 2 +- 3 files changed, 15 insertions(+), 9 deletions(-) diff --git a/lisp/emacs-lisp/comp.el b/lisp/emacs-lisp/comp.el index ab6fd77f11a..d5351fe440d 100644 --- a/lisp/emacs-lisp/comp.el +++ b/lisp/emacs-lisp/comp.el @@ -3254,10 +3254,10 @@ comp--finalize-container ;; from the corresponding m-var. collect (if (gethash obj (comp-ctxt-byte-func-to-func-h comp-ctxt)) - ;; Hack not to have `--lambda-fixup' in - ;; data relocations as it would trigger the - ;; check in 'check_comp_unit_relocs'. - (intern (concat (make-string 1 ?-) "-lambda-fixup")) + ;; Temporary placeholder, to be replaced by + ;; a circular reference to the vector + ;; containing the element + (comp-data-container-idx cont) obj)))) (defun comp--finalize-relocs () diff --git a/src/comp.c b/src/comp.c index 8b38adec252..3b3dfa58602 100644 --- a/src/comp.c +++ b/src/comp.c @@ -2854,13 +2854,20 @@ emit_static_object (const char *name, Lisp_Object obj) declare_imported_data_relocs (Lisp_Object container, const char *code_symbol, const char *text_symbol) { + Lisp_Object index_table = + CALL1I (comp-data-container-idx, container); /* Imported objects. */ reloc_array_t res; res.len = - XFIXNUM (CALL1I (hash-table-count, - CALL1I (comp-data-container-idx, container))); + XFIXNUM (CALL1I (hash-table-count, index_table)); Lisp_Object d_reloc = CALL1I (comp-data-container-l, container); d_reloc = Fvconcat (1, &d_reloc); + /* Replace temporary placeholders, turning them into circular + * references to the vector itself. */ + EMACS_INT n = XFIXNUM (Flength (d_reloc)); + for (EMACS_INT i = 0; i < n; i++) + if (EQ (AREF (d_reloc, i), index_table)) + ASET (d_reloc, i, d_reloc); res.r_val = gcc_jit_lvalue_as_rvalue ( @@ -5173,7 +5180,7 @@ check_comp_unit_relocs (struct Lisp_Native_Comp_Unit *comp_u) for (ptrdiff_t i = 0; i < d_vec_len; i++) { Lisp_Object x = data_relocs[i]; - if (EQ (x, Q__lambda_fixup)) + if (EQ (x, comp_u->data_vec)) return false; else if (NATIVE_COMP_FUNCTIONP (x)) { @@ -5622,7 +5629,6 @@ syms_of_comp (void) DEFSYM (Qfixnum, "fixnum"); DEFSYM (Qscratch, "scratch"); DEFSYM (Qlate, "late"); - DEFSYM (Q__lambda_fixup, "--lambda-fixup"); DEFSYM (Qgccjit, "gccjit"); DEFSYM (Qcomp_subr_trampoline_install, "comp-subr-trampoline-install"); DEFSYM (Qnative_comp_warning_on_missing_source, diff --git a/src/pdumper.c b/src/pdumper.c index d45bbc84bba..956da4d8c72 100644 --- a/src/pdumper.c +++ b/src/pdumper.c @@ -5504,7 +5504,7 @@ dump_do_dump_relocation (const uintptr_t dump_base, XSETSUBR (tem, subr); Lisp_Object *fixup = &(comp_u->data_relocs[XFIXNUM (lambda_data_idx)]); - eassert (EQ (*fixup, Q__lambda_fixup)); + eassert (EQ (*fixup, comp_u->data_vec)); *fixup = tem; Fputhash (tem, Qt, comp_u->lambda_gc_guard_h); } -- 2.47.0 ^ permalink raw reply related [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-17 17:56 ` Gerd Möllmann 2024-12-17 18:50 ` Eli Zaretskii @ 2024-12-18 0:18 ` Stefan Kangas 1 sibling, 0 replies; 65+ messages in thread From: Stefan Kangas @ 2024-12-18 0:18 UTC (permalink / raw) To: Gerd Möllmann; +Cc: emacs-devel, Pip Cet, Stefan Monnier Gerd Möllmann <gerd.moellmann@gmail.com> writes: > Gerd Möllmann <gerd.moellmann@gmail.com> writes: > >> Gerd Möllmann <gerd.moellmann@gmail.com> writes: >> >>> I think check_comp_unit_relocs should be removed in the branch. What's >>> left of it the branch, checks in master if everything has been >>> put in purespace which should be there. IIUC correctly, of course. >> >> Patch attached. > > Pushed. Complaints to me please. Thanks! ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-17 10:47 Merging scratch/no-purespace to remove unexec and purespace Stefan Kangas 2024-12-17 13:12 ` Gerd Möllmann @ 2024-12-17 19:30 ` Helmut Eller 2024-12-17 20:47 ` Stefan Monnier 2024-12-18 9:30 ` Pip Cet via Emacs development discussions. 2024-12-18 0:50 ` Po Lu 2 siblings, 2 replies; 65+ messages in thread From: Helmut Eller @ 2024-12-17 19:30 UTC (permalink / raw) To: Stefan Kangas; +Cc: emacs-devel, Pip Cet, Stefan Monnier On Tue, Dec 17 2024, Stefan Kangas wrote: [...] > We, the maintainers, believe that the scratch/no-purespace branch is > now ready to merge, and would appreciate any final feedback, testing, > and code reviews. Specifically, the branch has been primarily tested > on GNU/Linux and macOS, so testing on other systems would be greatly > appreciated. Do you have an estimate what removing purespace will cost in terms of GC time? I mean something like "1ms per collection". Or perhaps a suggestion how I could measure it? ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-17 19:30 ` Helmut Eller @ 2024-12-17 20:47 ` Stefan Monnier 2024-12-18 2:15 ` Stefan Kangas 2024-12-18 6:56 ` Helmut Eller 2024-12-18 9:30 ` Pip Cet via Emacs development discussions. 1 sibling, 2 replies; 65+ messages in thread From: Stefan Monnier @ 2024-12-17 20:47 UTC (permalink / raw) To: Helmut Eller; +Cc: Stefan Kangas, emacs-devel, Pip Cet >> We, the maintainers, believe that the scratch/no-purespace branch is >> now ready to merge, and would appreciate any final feedback, testing, >> and code reviews. Specifically, the branch has been primarily tested >> on GNU/Linux and macOS, so testing on other systems would be greatly >> appreciated. > > Do you have an estimate what removing purespace will cost in terms of GC > time? I mean something like "1ms per collection". Or perhaps a > suggestion how I could measure it? In the pdump case it should have no effect at all, or maybe even a slight *speedup*. That's because the pdump already fails to take advantage of the purespace (i.e. the GC traces through the purespace like the rest of the heap). Stefan ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-17 20:47 ` Stefan Monnier @ 2024-12-18 2:15 ` Stefan Kangas 2024-12-18 7:11 ` Helmut Eller 2024-12-18 6:56 ` Helmut Eller 1 sibling, 1 reply; 65+ messages in thread From: Stefan Kangas @ 2024-12-18 2:15 UTC (permalink / raw) To: Stefan Monnier, Helmut Eller; +Cc: emacs-devel, Pip Cet Stefan Monnier <monnier@iro.umontreal.ca> writes: >>> We, the maintainers, believe that the scratch/no-purespace branch is >>> now ready to merge, and would appreciate any final feedback, testing, >>> and code reviews. Specifically, the branch has been primarily tested >>> on GNU/Linux and macOS, so testing on other systems would be greatly >>> appreciated. >> >> Do you have an estimate what removing purespace will cost in terms of GC >> time? I mean something like "1ms per collection". Or perhaps a >> suggestion how I could measure it? > > In the pdump case it should have no effect at all, or maybe even > a slight *speedup*. > > That's because the pdump already fails to take advantage of the > purespace (i.e. the GC traces through the purespace like the rest of > the heap). I'll note that the best solution to that is to have a generational GC instead. Simple, right? It's not entirely unrelated though: among other things, one reason why merging this would be good is that it would reportedly simplify the work on the igc branch. And indeed any GC-related work now or in future. ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-18 2:15 ` Stefan Kangas @ 2024-12-18 7:11 ` Helmut Eller 2024-12-18 13:35 ` Pip Cet via Emacs development discussions. 0 siblings, 1 reply; 65+ messages in thread From: Helmut Eller @ 2024-12-18 7:11 UTC (permalink / raw) To: Stefan Kangas; +Cc: Stefan Monnier, emacs-devel, Pip Cet On Tue, Dec 17 2024, Stefan Kangas wrote: > Stefan Monnier <monnier@iro.umontreal.ca> writes: [...] >> That's because the pdump already fails to take advantage of the >> purespace (i.e. the GC traces through the purespace like the rest of >> the heap). > > I'll note that the best solution to that is to have a generational GC > instead. Simple, right? A generational GC is definitely simpler. Whether it's the "best" solution is not so clear: A copying GC, like MPS, still needs to trace and copy pure objects whenever the oldest generation is in the condemned set. Moving pure objects to a non-moving pool might be better. > It's not entirely unrelated though: among other things, one reason why > merging this would be good is that it would reportedly simplify the work > on the igc branch. And indeed any GC-related work now or in future. Objects in purespace are immutable and immortal. That's potentially useful information for the GC. Removing purespace also removes that information. Of course, if the pdumper already throws away this information, then purespace just adds useless complexity. Helmut ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-18 7:11 ` Helmut Eller @ 2024-12-18 13:35 ` Pip Cet via Emacs development discussions. 0 siblings, 0 replies; 65+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-18 13:35 UTC (permalink / raw) To: Helmut Eller; +Cc: Stefan Kangas, Stefan Monnier, emacs-devel "Helmut Eller" <eller.helmut@gmail.com> writes: > On Tue, Dec 17 2024, Stefan Kangas wrote: > >> Stefan Monnier <monnier@iro.umontreal.ca> writes: > [...] >>> That's because the pdump already fails to take advantage of the >>> purespace (i.e. the GC traces through the purespace like the rest of >>> the heap). >> >> I'll note that the best solution to that is to have a generational GC >> instead. Simple, right? > > A generational GC is definitely simpler. Whether it's the "best" > solution is not so clear: A copying GC, like MPS, still needs to trace > and copy pure objects whenever the oldest generation is in the condemned > set. > Moving pure objects to a non-moving pool might be better. I think we should rephrase that without presupposing the existence of "pure" objects: Hinting to MPS that an object is expected to be immutable and have a very long lifetime may have advantages, and that's a potential reason for introducing (and maintaining) a hinting mechanism. On the other hand, it's very common for objects to have those properties without us knowing in advance that they will, so it's important MPS works well in the absence of such hints. >> It's not entirely unrelated though: among other things, one reason why >> merging this would be good is that it would reportedly simplify the work >> on the igc branch. And indeed any GC-related work now or in future. > > Objects in purespace are immutable and immortal. That's potentially > useful information for the GC. I may be misunderstanding what you mean by "immutable", but the most important property "pure" objects had was that they only referenced other pure objects or static objects, so GC didn't need to look at them (but, IIUC, this optimization never worked in pdumper builds). This required us to make such objects read-only (which caused problems) and immortal. Immortality was an undesirable, but necessary, side effect of the "pure" optimization, not a feature. IMHO, so was immutability, but some people consider it a feature not to be able to modify certain objects. For example, on the no-purespace branch, you can execute (aset (symbol-name nil) 0 ?N) and rename nil to Nil, which will make the rest of your Emacs session unusable. On master, this code should (and did at one point, IIRC) throw a CHECK_IMPURE exception. Right now, it segfaults, which demonstrates that the current purespace code has suffered from some bit rot and removing purespace will fix some bugs for that reason alone. So the two major features of pure space are broken right now. Fixing that is an option, and it's the only way we'd ever see a fair comparison of purespace and no-purespace performance, but I hope it's not going to happen. So what's left is a weak hint to the GC that this object is likely to remain reachable, and not to be modified, but for MPS to set up a "pure" space of such objects based on this hint seems to me to be an unrealistic expectation. > Removing purespace also removes that information. Removing purespace does leave us with very few read-only objects. We can consider introducing a useful set of read-only objects (and agree on what that even means) after purespace is removed. However, I would be surprised if the lost information about whether an object used to be pure turns out to be very useful. > Of course, if the pdumper already throws away this > information, then purespace just adds useless complexity. I believe that's the case. In fact, once we remove purespace, we can look at improving how pdumper dumps are handled during GC. Maybe we can make it so the pdumper dump looks even more like an ordinary MPS segment in memory, and then we don't need to treat pdumper objects specially. Pip ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-17 20:47 ` Stefan Monnier 2024-12-18 2:15 ` Stefan Kangas @ 2024-12-18 6:56 ` Helmut Eller 2024-12-21 17:41 ` Helmut Eller 1 sibling, 1 reply; 65+ messages in thread From: Helmut Eller @ 2024-12-18 6:56 UTC (permalink / raw) To: Stefan Monnier; +Cc: Stefan Kangas, emacs-devel, Pip Cet On Tue, Dec 17 2024, Stefan Monnier wrote: >> Do you have an estimate what removing purespace will cost in terms of GC >> time? I mean something like "1ms per collection". Or perhaps a >> suggestion how I could measure it? > > In the pdump case it should have no effect at all, or maybe even > a slight *speedup*. > > That's because the pdump already fails to take advantage of the > purespace (i.e. the GC traces through the purespace like the rest of > the heap). So the pdumper copies objects from purespace to the dump like normal objects; when loading the dump, purespace stays empty. I had (wrongly) assumed that the pdumper creates a separate section for pure objects. Creating such a section sounds easy enough (hmm, maybe not so easy because of hashtables). Still not sure if it would be worth the effort. Helmut ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-18 6:56 ` Helmut Eller @ 2024-12-21 17:41 ` Helmut Eller 2024-12-21 18:32 ` Gerd Möllmann ` (2 more replies) 0 siblings, 3 replies; 65+ messages in thread From: Helmut Eller @ 2024-12-21 17:41 UTC (permalink / raw) To: Stefan Monnier; +Cc: Stefan Kangas, emacs-devel, Pip Cet [-- Attachment #1: Type: text/plain, Size: 889 bytes --] On Wed, Dec 18 2024, Helmut Eller wrote: [...] > So the pdumper copies objects from purespace to the dump like normal > objects; when loading the dump, purespace stays empty. > > I had (wrongly) assumed that the pdumper creates a separate section for > pure objects. Creating such a section sounds easy enough (hmm, maybe > not so easy because of hashtables). Still not sure if it would be worth > the effort. Out of curiosity, I implemented such a section with the attached patch. It seems that it would save ~2ms per collection cycle. To measure this, I compared the output of ./src/emacs -Q --batch --eval \ '(let* ((stats (benchmark-run 1000 (garbage-collect)))) (message "%s" (/ (elt stats 2) (elt stats 1))))' between the versions with and without the patch. The results was: without-pure-section: 0.006251480181 with-pure-section: 0.003986384857 Helmut [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: 0001-WIP-create-pure-space-in-dump.patch --] [-- Type: text/x-diff, Size: 10755 bytes --] From ad64bcdfa2d91e59e64adbcf896ddf66bba725cb Mon Sep 17 00:00:00 2001 From: Helmut Eller <eller.helmut@gmail.com> Date: Sat, 21 Dec 2024 16:06:45 +0100 Subject: [PATCH] WIP create pure space in dump --- src/alloc.c | 9 +++++ src/frame.c | 7 +++- src/pdumper.c | 108 +++++++++++++++++++++++++++++++++++++++++++------ src/puresize.h | 8 +++- 4 files changed, 118 insertions(+), 14 deletions(-) diff --git a/src/alloc.c b/src/alloc.c index 4fab0d54248..6f57563d80c 100644 --- a/src/alloc.c +++ b/src/alloc.c @@ -386,7 +386,12 @@ static char *spare_memory[7]; remapping on more recent systems because this is less important nowadays than in the days of small memories and timesharing. */ +#if PDUMPER_PURE == 1 +EMACS_INT *pure; +#else EMACS_INT pure[(PURESIZE + sizeof (EMACS_INT) - 1) / sizeof (EMACS_INT)] = {1,}; +#endif + #define PUREBEG (char *) pure /* Pointer to the pure area, and its size. */ @@ -8111,6 +8116,10 @@ init_alloc_once (void) static void init_alloc_once_for_pdumper (void) { +#ifdef PDUMPER_PURE + pure = xzalloc ((PURESIZE + sizeof (EMACS_INT) - 1)); + pure[0] = 2; +#endif purebeg = PUREBEG; pure_size = PURESIZE; mem_init (); diff --git a/src/frame.c b/src/frame.c index f22bd501a8d..0bef8ba8661 100644 --- a/src/frame.c +++ b/src/frame.c @@ -53,6 +53,7 @@ along with GNU Emacs. If not, see <https://www.gnu.org/licenses/>. */ #include "widget.h" #endif #include "pdumper.h" +#include "puresize.h" /* The currently selected frame. */ Lisp_Object selected_frame; @@ -1201,7 +1202,11 @@ make_initial_frame (void) Vframe_list = Fcons (frame, Vframe_list); tty_frame_count = 1; - fset_name (f, build_pure_c_string ("F1")); + + if (PDUMPER_PURE) + fset_name (f, build_string ("F1")); + else + fset_name (f, build_pure_c_string ("F1")); SET_FRAME_VISIBLE (f, 1); diff --git a/src/pdumper.c b/src/pdumper.c index c8baa311854..47fa979da8c 100644 --- a/src/pdumper.c +++ b/src/pdumper.c @@ -45,6 +45,7 @@ along with GNU Emacs. If not, see <https://www.gnu.org/licenses/>. */ #include "thread.h" #include "bignum.h" #include "treesit.h" +#include "puresize.h" #ifdef CHECK_STRUCTS # include "dmpstruct.h" @@ -399,6 +400,9 @@ struct dump_header /* Offset of a vector of the dumped hash tables. */ dump_off hash_list; + + dump_off pure_start; + dump_off pure_end; }; /* Double-ended singly linked list. */ @@ -480,6 +484,7 @@ struct dump_flags bool_bf defer_cold_objects : 1; /* Punt on copied objects: defer them to ctx->copied_queue. */ bool_bf defer_copied_objects : 1; + bool_bf defer_pure_objects : 1; }; /* Information we use while we dump. Note that we're not the garbage @@ -546,6 +551,7 @@ struct dump_context Lisp_Object copied_queue; /* Queue of cold objects to dump. */ Lisp_Object cold_queue; + Lisp_Object pure_queue; /* Relocations in the dump. */ Lisp_Object dump_relocs[RELOC_NUM_PHASES]; @@ -576,7 +582,8 @@ struct dump_context are physical dump offsets. */ enum dump_object_special_offset { - DUMP_OBJECT_IS_RUNTIME_MAGIC = -6, + DUMP_OBJECT_IS_RUNTIME_MAGIC = -7, + DUMP_OBJECT_ON_PURE_QUEUE = -6, DUMP_OBJECT_ON_COPIED_QUEUE = -5, DUMP_OBJECT_ON_HASH_TABLE_QUEUE = -4, DUMP_OBJECT_ON_SYMBOL_QUEUE = -3, @@ -2620,15 +2627,18 @@ dump_vectorlike_generic (struct dump_context *ctx, skip = 0; } - /* We may have written a non-Lisp vector prefix above. If we have, - pad to the lisp content start with zero, and make sure we didn't - scribble beyond that start. */ - dump_off prefix_size = ctx->offset - prefix_start_offset; - eassert (prefix_size > 0); - dump_off skip_start = ptrdiff_t_to_dump_off ((char *) &v->contents[skip] - - (char *) v); - eassert (skip_start >= prefix_size); - dump_write_zero (ctx, skip_start - prefix_size); + if (ctx->flags.dump_object_contents) + { + /* We may have written a non-Lisp vector prefix above. If we have, + pad to the lisp content start with zero, and make sure we didn't + scribble beyond that start. */ + dump_off prefix_size = ctx->offset - prefix_start_offset; + eassert (prefix_size > 0); + dump_off skip_start = ptrdiff_t_to_dump_off ((char *) &v->contents[skip] + - (char *) v); + eassert (skip_start >= prefix_size); + dump_write_zero (ctx, skip_start - prefix_size); + } /* dump_object_start isn't what records conservative-GC object starts --- dump_object_1 does --- so the hack below of using @@ -3148,6 +3158,24 @@ dump_vectorlike (struct dump_context *ctx, error_unsupported_dump_object (ctx, lv, msg); } +static bool +is_pure_object (Lisp_Object x) +{ + switch (XTYPE (x)) + { + case Lisp_Symbol: return PURE_P (XSYMBOL (x)); + case Lisp_String: return PURE_P (XSTRING (x)); + case Lisp_Cons: return PURE_P (XCONS (x)); + case Lisp_Vectorlike: return PURE_P (XVECTOR (x)); + //case Lisp_Float: return PURE_P (XFLOAT (x)); + case Lisp_Float: return false; + case Lisp_Int0: return false; + case Lisp_Int1: return false; + case Lisp_Type_Unused0: emacs_abort (); + } + emacs_abort (); +} + /* Add an object to the dump. CTX is the dump context; OBJECT is the object to add. Normally, @@ -3218,6 +3246,25 @@ dump_object (struct dump_context *ctx, Lisp_Object object) return offset; } + if (is_pure_object (object) && ctx->flags.defer_pure_objects) + { + if (offset != DUMP_OBJECT_ON_PURE_QUEUE) + { + eassert (offset == DUMP_OBJECT_ON_NORMAL_QUEUE + || offset == DUMP_OBJECT_NOT_SEEN); + dump_push (&ctx->pure_queue, object); + offset = DUMP_OBJECT_ON_PURE_QUEUE; + dump_remember_object (ctx, object, offset); + + struct dump_flags old_flags = ctx->flags; + ctx->flags.dump_object_contents = false; + ctx->flags.defer_pure_objects = false; + dump_object (ctx, object); + ctx->flags = old_flags; + } + return offset; + } + /* Object needs to be dumped. */ if (dump_set_referrer (ctx)) ctx->current_referrer = object; @@ -3606,6 +3653,25 @@ dump_drain_cold_data (struct dump_context *ctx) ctx->flags = old_flags; } +static void +dump_drain_pure_data (struct dump_context *ctx) +{ + Lisp_Object pure_queue = Fnreverse (ctx->pure_queue); + ctx->pure_queue = Qnil; + + struct dump_flags old_flags = ctx->flags; + + /* Actually dump pure objects instead of deferring them. */ + ctx->flags.defer_pure_objects = false; + + while (!NILP (pure_queue)) + { + Lisp_Object item = dump_pop (&pure_queue); + dump_object (ctx, item); + } + ctx->flags = old_flags; +} + static void read_ptr_raw_and_lv (const void *mem, enum Lisp_Type type, @@ -4195,6 +4261,7 @@ types. */) ctx->symbol_aux = Qnil; ctx->copied_queue = Qnil; ctx->cold_queue = Qnil; + ctx->pure_queue = Qnil; for (int i = 0; i < RELOC_NUM_PHASES; ++i) ctx->dump_relocs[i] = Qnil; ctx->object_starts = Qnil; @@ -4216,6 +4283,7 @@ types. */) /* These objects go into special sections. */ ctx->flags.defer_cold_objects = true; ctx->flags.defer_copied_objects = true; + ctx->flags.defer_pure_objects = true; ctx->current_referrer = Qnil; if (!NILP (track_referrers)) @@ -4297,6 +4365,11 @@ types. */) ctx->header.hash_list = ctx->offset; dump_hash_table_list (ctx); + ctx->header.pure_start = ctx->offset; + dump_drain_pure_data (ctx); + ctx->header.pure_end = ctx->offset; + dump_write_zero (ctx, PURESIZE - (ctx->offset - ctx->header.pure_start)); + /* dump_hash_table_list just adds a new vector to the dump but all its content should already have been in the dump, so it doesn't add anything to any queue. */ @@ -4378,6 +4451,7 @@ types. */) eassert (dump_queue_empty_p (&ctx->dump_queue)); eassert (NILP (ctx->copied_queue)); eassert (NILP (ctx->cold_queue)); + eassert (NILP (ctx->pure_queue)); eassert (NILP (ctx->deferred_symbols)); eassert (NILP (ctx->deferred_hash_tables)); eassert (NILP (ctx->fixups)); @@ -4401,13 +4475,16 @@ types. */) header_bytes = header_end - header_start, hot_bytes = hot_end - hot_start, discardable_bytes = discardable_end - ctx->header.discardable_start, - cold_bytes = cold_end - ctx->header.cold_start; + cold_bytes = cold_end - ctx->header.cold_start, + pure_bytes = ctx->header.pure_end - ctx->header.pure_start; fprintf (stderr, ("Dump complete\n" "Byte counts: header=%"PRIdDUMP_OFF" hot=%"PRIdDUMP_OFF " discardable=%"PRIdDUMP_OFF" cold=%"PRIdDUMP_OFF"\n" + " pure=%"PRIdDUMP_OFF"\n" "Reloc counts: hot=%"PRIdDUMP_OFF" discardable=%"PRIdDUMP_OFF"\n"), header_bytes, hot_bytes, discardable_bytes, cold_bytes, + pure_bytes, number_hot_relocations, number_discardable_relocations); @@ -5270,7 +5347,8 @@ pdumper_find_object_type_impl (const void *obj) return PDUMPER_NO_OBJECT; ptrdiff_t bitno = offset / DUMP_ALIGNMENT; if (offset < dump_private.header.discardable_start - && !dump_bitset_bit_set_p (&dump_private.last_mark_bits, bitno)) + && !dump_bitset_bit_set_p (&dump_private.last_mark_bits, bitno) + && !PURE_P (obj)) return PDUMPER_NO_OBJECT; const struct dump_reloc *reloc = dump_find_relocation (&dump_private.header.object_starts, offset); @@ -5814,6 +5892,12 @@ pdumper_load (const char *dump_filename, char *argv0) dump_do_all_dump_reloc_for_phase (header, dump_base, LATE_RELOCS); dump_do_all_dump_reloc_for_phase (header, dump_base, VERY_LATE_RELOCS); + if (PDUMPER_PURE) + { + xfree (pure); + pure = (EMACS_INT *)(dump_base + header->pure_start); + } + /* Run the functions Emacs registered for doing post-dump-load initialization. */ for (int i = 0; i < nr_dump_late_hooks; ++i) diff --git a/src/puresize.h b/src/puresize.h index d7d8f0b4eec..1c906536eec 100644 --- a/src/puresize.h +++ b/src/puresize.h @@ -23,6 +23,8 @@ along with GNU Emacs. If not, see <https://www.gnu.org/licenses/>. */ INLINE_HEADER_BEGIN +#define PDUMPER_PURE 1 + /* Define PURESIZE, the number of bytes of pure Lisp code to leave space for. At one point, this was defined in config.h, meaning that changing @@ -79,13 +81,17 @@ INLINE_HEADER_BEGIN extern AVOID pure_write_error (Lisp_Object); +#if PDUMPER_PURE == 1 +extern EMACS_INT *pure; +#else extern EMACS_INT pure[]; +#endif /* The puresize_h_* macros are private to this include file. */ /* True if PTR is pure. */ -#define puresize_h_PURE_P(ptr) \ +#define puresize_h_PURE_P(ptr) \ ((uintptr_t) (ptr) - (uintptr_t) pure <= PURESIZE) INLINE bool -- 2.39.5 ^ permalink raw reply related [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-21 17:41 ` Helmut Eller @ 2024-12-21 18:32 ` Gerd Möllmann 2024-12-21 22:19 ` Pip Cet via Emacs development discussions. 2024-12-22 1:28 ` Stefan Kangas 2 siblings, 0 replies; 65+ messages in thread From: Gerd Möllmann @ 2024-12-21 18:32 UTC (permalink / raw) To: Helmut Eller; +Cc: Stefan Monnier, Stefan Kangas, emacs-devel, Pip Cet Helmut Eller <eller.helmut@gmail.com> writes: > On Wed, Dec 18 2024, Helmut Eller wrote: > > [...] >> So the pdumper copies objects from purespace to the dump like normal >> objects; when loading the dump, purespace stays empty. >> >> I had (wrongly) assumed that the pdumper creates a separate section for >> pure objects. Creating such a section sounds easy enough (hmm, maybe >> not so easy because of hashtables). Still not sure if it would be worth >> the effort. > > Out of curiosity, I implemented such a section with the attached patch. > It seems that it would save ~2ms per collection cycle. > > To measure this, I compared the output of > > ./src/emacs -Q --batch --eval \ > '(let* ((stats (benchmark-run 1000 (garbage-collect)))) > (message "%s" (/ (elt stats 2) (elt stats 1))))' > > between the versions with and without the patch. The results was: > > without-pure-section: 0.006251480181 > with-pure-section: 0.003986384857 > > Helmut Interesting! And impressive, that's ca. 1/3 less. ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-21 17:41 ` Helmut Eller 2024-12-21 18:32 ` Gerd Möllmann @ 2024-12-21 22:19 ` Pip Cet via Emacs development discussions. 2024-12-22 1:28 ` Stefan Kangas 2 siblings, 0 replies; 65+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-21 22:19 UTC (permalink / raw) To: Helmut Eller; +Cc: Stefan Monnier, Stefan Kangas, emacs-devel "Helmut Eller" <eller.helmut@gmail.com> writes: > On Wed, Dec 18 2024, Helmut Eller wrote: > > [...] >> So the pdumper copies objects from purespace to the dump like normal >> objects; when loading the dump, purespace stays empty. >> >> I had (wrongly) assumed that the pdumper creates a separate section for >> pure objects. Creating such a section sounds easy enough (hmm, maybe >> not so easy because of hashtables). Still not sure if it would be worth >> the effort. > > Out of curiosity, I implemented such a section with the attached patch. > It seems that it would save ~2ms per collection cycle. Very interesting, thank you for sharing! I tried everything I could think of to disprove your findings, but while I did find a minor optimization opportunity in the pdumper mark bits code (a missing eassume (offset >= 0) / eassume (word_number >= 0)), this affects both branches equally. I can only conclude that your observation is accurate, and we now have a number that is still close to zero, but measurable. The effect seems to be mostly constant (1-2 ms / GC cycle), but I did see some slightly larger differences when more objects were on the heap. Cache effects, probably (PURE_P doesn't have to wait for memory, pdumper_marked_p_impl does). (I did run into crashes when running emacs without --batch, but those are easily fixed and did not affect the performance measurements at all. I only mention this so that others running tests don't get their hopes (or fears) up). Pip ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-21 17:41 ` Helmut Eller 2024-12-21 18:32 ` Gerd Möllmann 2024-12-21 22:19 ` Pip Cet via Emacs development discussions. @ 2024-12-22 1:28 ` Stefan Kangas 2024-12-22 11:12 ` Pip Cet via Emacs development discussions. ` (2 more replies) 2 siblings, 3 replies; 65+ messages in thread From: Stefan Kangas @ 2024-12-22 1:28 UTC (permalink / raw) To: Helmut Eller, Stefan Monnier; +Cc: emacs-devel, Pip Cet Helmut Eller <eller.helmut@gmail.com> writes: > On Wed, Dec 18 2024, Helmut Eller wrote: > > [...] >> So the pdumper copies objects from purespace to the dump like normal >> objects; when loading the dump, purespace stays empty. >> >> I had (wrongly) assumed that the pdumper creates a separate section for >> pure objects. Creating such a section sounds easy enough (hmm, maybe >> not so easy because of hashtables). Still not sure if it would be worth >> the effort. > > Out of curiosity, I implemented such a section with the attached patch. > It seems that it would save ~2ms per collection cycle. > > To measure this, I compared the output of > > ./src/emacs -Q --batch --eval \ > '(let* ((stats (benchmark-run 1000 (garbage-collect)))) > (message "%s" (/ (elt stats 2) (elt stats 1))))' > > between the versions with and without the patch. The results was: > > without-pure-section: 0.006251480181 > with-pure-section: 0.003986384857 This is interesting, thanks. Would this experiment easily transfer to and be relevant to the MPS branch? ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-22 1:28 ` Stefan Kangas @ 2024-12-22 11:12 ` Pip Cet via Emacs development discussions. 2024-12-22 13:07 ` Eli Zaretskii 2024-12-22 15:51 ` Stefan Monnier 2024-12-22 13:13 ` Pip Cet via Emacs development discussions. 2024-12-22 14:16 ` Helmut Eller 2 siblings, 2 replies; 65+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-22 11:12 UTC (permalink / raw) To: Stefan Kangas; +Cc: Helmut Eller, Stefan Monnier, emacs-devel "Stefan Kangas" <stefankangas@gmail.com> writes: > Helmut Eller <eller.helmut@gmail.com> writes: > >> On Wed, Dec 18 2024, Helmut Eller wrote: >> >> [...] >>> So the pdumper copies objects from purespace to the dump like normal >>> objects; when loading the dump, purespace stays empty. >>> >>> I had (wrongly) assumed that the pdumper creates a separate section for >>> pure objects. Creating such a section sounds easy enough (hmm, maybe >>> not so easy because of hashtables). Still not sure if it would be worth >>> the effort. >> >> Out of curiosity, I implemented such a section with the attached patch. >> It seems that it would save ~2ms per collection cycle. >> >> To measure this, I compared the output of >> >> ./src/emacs -Q --batch --eval \ >> '(let* ((stats (benchmark-run 1000 (garbage-collect)))) >> (message "%s" (/ (elt stats 2) (elt stats 1))))' >> >> between the versions with and without the patch. The results was: >> >> without-pure-section: 0.006251480181 >> with-pure-section: 0.003986384857 > > This is interesting, thanks. Would this experiment easily transfer to > and be relevant to the MPS branch? TL;DR: let's add check_writable to replace CHECK_IMPURE rather than dropping the calls entirely. It'd be a nop in the initial merge, but it would facilitate experiments for regaining the advantages of purespace without the ugliness. If we want to perform an optimization such as this one, we should think of ways to do so without pure space and all its restrictions. I think I have one. My experiments suggest that the main performance impact is that when we reach a pure object, we stop scanning, in effect treating it as though it were already marked. Checking and setting the mark bit of the object doesn't affect the performance much, in my experiments. My idea is this: we add an extra mark bit area to the pdumper file for objects which we know to be "tenured": i.e. objects that we'll treat as immortal, but for which we also know that all referenced objects will also be "tenured", or static. Such "tenured" objects aren't scanned during GC: they're already marked when we start the GC cycle. If we write to such an object, we clear the bit, and put it on a special set to maintain its tenure (it'd be nicer to simply set another bit, but non-MPS pdumper cannot do so). This should happen rarely, but it's better than the current CHECK_IMPURE thing. Some housekeeping would be required to adjust the "tenured" set, but the important point is that "tenured" is no longer a user-visible property: correctness would be maintained even for dumb choices of the "tenured" set. The problems are: 1. choose a "tenured" set. I would hazard a guess that most objects in the pdumper file are usually not actually written to, ever, so maybe we could even get away with starting with a universally-filled "tenured" set. We could use, in essence, PGO to do so, by identifying call stacks for objects that are modified and excluding them from the "tenured" set based on the call stacks when the real Emacs is built. 2. keep the immortal set small. In particular, even if an object was written to, it's quite possible that it can be removed from the set again, because the new references are also "tenured". In extreme cases, we could get away with clearing the entire "pure" set and continuing without this optimization. However, if we find an object cannot have its tenured bit set anymore, we can recursively scan the remaining "tenured" objects which reference this object and revoke their tenure, and remove the problematic object from the immortal set. In essence, all this would turn the user-visible "pure" status, available only to explicitly-specified objects, into a user-invisible "tenured" status, awarded automatically to some objects without programmer interaction. The benefits may well be greater because the automatic detection of "tenured" objects might very easily be superior to the current "pure" selection. "Tenured" status would be available to all pdumped objects, but for MPS builds it would be possible to expand this so even objects created after loading the dump can become "tenured". (I realize "tenure" isn't a great term because a sensible implementation would probably revoke tenure in some cases. "GC-skippable" or "skippable" might be better.) The main implementation problem I see right now is that to correct our tenured set, and for the immortal set, we'd need an efficient "set" structure. In particular, while ordinary GC would only need to use the "immortal" set, analysis and fixing the "tenured" set would require knowing about all "tenured" objects. For the MPS build, this is comparatively easy as we can simply set a bit and linear-scan the pdumper area (all pdumped objects have igc headers). For non-MPS, a full hash table would probably be required, and it would have to be dumped. My important points are: 1. It's probably possible to do better performance-wise than Helmut's experiment. 2. This potential approach would not require the reintroduction of a user-visible "pure" status. 3. In particular, there's (almost) nothing to stop us from delaying this experiment until after pure space is removed. 4. The exception to the "almost" is: We should probably introduce dummy check_writable (Lisp_Object) calls rather than dropping CHECK_IMPURE () entirely. We'll need them for experiments, even if the standard implementation would be a nop. 5. While the details of the implementation would differ, this optimization could apply to the MPS build, too, and even be generalized (on no-purespace, we can't move an object into the pdumper are, but on scratch/igc, we may be able to hack MPS to move an object into the immovable "tenured" pool). However, I realize that (1) is currently a sheer guess. I haven't decided whether it's worth it to get an upper bound on the saved GC time by implementing a universal "tenured" set and performing a GC right after loading (which should be very fast, not marking any pdumped objects). The problem is that it's an upper bound: the effect will become less significant as the immortal set would grow over time; at some point, it's no longer worth it, and we could clear all "tenured" bits and continue without this optimization. While my experiments suggest it's not relevant, this would reduce some potential benefits of Helmut's approach: locality wouldn't be as good, we'd need to check a bit rather than doing a bounds check to identify skippable objects, and of course there's the extra overhead from another bits table and the set implementation. If anyone tries this, please also try whether increasing the pdumper alignment to 16 or further improves performance; that would mean some wasted space, but it would also reduce the sheer number of bits to check. If we want dynamic adjustment of the skippable/"tenured" set, we'd also need to duplicate a lot of the GC code to trace references for analysis rather than setting their mark bits. We might be able to avoid the code duplication with a trick, but realistically we'd be looking at two code paths: the "fast" GC path which performs a hardcoded action, and the "slow" analysis path which merely collects references and returns them to the caller. Sorry this is a bit long. Pip ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-22 11:12 ` Pip Cet via Emacs development discussions. @ 2024-12-22 13:07 ` Eli Zaretskii 2024-12-22 14:12 ` Pip Cet via Emacs development discussions. 2024-12-22 15:51 ` Stefan Monnier 1 sibling, 1 reply; 65+ messages in thread From: Eli Zaretskii @ 2024-12-22 13:07 UTC (permalink / raw) To: Pip Cet; +Cc: stefankangas, eller.helmut, monnier, emacs-devel > Date: Sun, 22 Dec 2024 11:12:30 +0000 > Cc: Helmut Eller <eller.helmut@gmail.com>, > Stefan Monnier <monnier@iro.umontreal.ca>, emacs-devel@gnu.org > From: Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> > > 1. choose a "tenured" set. I would hazard a guess that most objects in > the pdumper file are usually not actually written to, ever, so maybe we > could even get away with starting with a universally-filled "tenured" > set. We could use, in essence, PGO to do so, by identifying call stacks > for objects that are modified and excluding them from the "tenured" set > based on the call stacks when the real Emacs is built. I don't think your guess is correct. There are definitely objects in the pdumper file that are written to. Example: the *scratch* buffer. Historically, the purecopy calls that created pure objects have their legacy from the unexec era, when any Lisp object generated in temacs was dumped into the emacs binary, and originally ended up in the BSS section (and was shared between different emacs instances running on the same system). It is thus possible that quite a few of pure objects were there "by accident", and there's no need to keep them from temacs stage. We already started to get rid of some of those, but meanwhile did that only where pdumper caused trouble. IOW, we are now in an interim state where some dumped objects don't need to be dumped at all, but should be recreated anew when Emacs starts. So going by dumped objects is IMO following the wrong lead. It would make sense to have a special API for defining objects that need not be scanned by GC, in effect reintroducing pure space (but with a different name and only for objects that are truly immutable). However, it sounds like this adds back one reason why we wanted to get rid of purespace: the tedious and error-prone job of identifying such objects and marking them in the sources. OTOH, what other reliable ways do we have? Whether a given object is immutable is a programmer-level decision. I really don't see how this decision could be made automatically by Emacs. ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-22 13:07 ` Eli Zaretskii @ 2024-12-22 14:12 ` Pip Cet via Emacs development discussions. 0 siblings, 0 replies; 65+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-22 14:12 UTC (permalink / raw) To: Eli Zaretskii; +Cc: stefankangas, eller.helmut, monnier, emacs-devel "Eli Zaretskii" <eliz@gnu.org> writes: >> Date: Sun, 22 Dec 2024 11:12:30 +0000 >> Cc: Helmut Eller <eller.helmut@gmail.com>, >> Stefan Monnier <monnier@iro.umontreal.ca>, emacs-devel@gnu.org >> From: Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> >> >> 1. choose a "tenured" set. I would hazard a guess that most objects in >> the pdumper file are usually not actually written to, ever, so maybe we >> could even get away with starting with a universally-filled "tenured" >> set. We could use, in essence, PGO to do so, by identifying call stacks >> for objects that are modified and excluding them from the "tenured" set >> based on the call stacks when the real Emacs is built. > > I don't think your guess is correct. My proposal was about hints (or guesses), not about hard promises. I see no reason to prefer the latter to the former: on the C code side of things, it causes complexity, because we need to deal with broken promises somehow (currently, we just crash, in some cases). I'm convinced it's worse for performance, because there are many mostly-immutable-but-someone-might objects which are still worth optimizing for. As for my guess, it is correct. I guessed that few objects are written to, so we can give a blanket *hint* that an object in the pdump is probably not going to be mutated. It's obviously true that some objects are, so inevitably some of the hints are going to be incorrect, but that's why they're just hints. > There are definitely objects in the pdumper file that are written to. Of course, but that wasn't the question. > Historically, the purecopy calls that created pure objects have their > legacy from the unexec era, when any Lisp object generated in temacs > was dumped into the emacs binary, and originally ended up in the BSS > section (and was shared between different emacs instances running on > the same system). It is thus possible that quite a few of pure > objects were there "by accident", and there's no need to keep them > from temacs stage. We already started to get rid of some of those, > but meanwhile did that only where pdumper caused trouble. IOW, we are > now in an interim state where some dumped objects don't need to be > dumped at all, but should be recreated anew when Emacs starts. > > So going by dumped objects is IMO following the wrong lead. I agree. > It would make sense to have a special API for defining objects that > need not be scanned by GC, in effect reintroducing pure space (but > with a different name and only for objects that are truly immutable). I disagree. I don't see the need for the "hard promise" character of that API. Hints seem to be sufficient, they cause much less trouble, and we can generate them (semi-)automatically because an incorrect hint would cost some performance but wouldn't crash Emacs. Most importantly, whether an object needs to be scanned by GC or not is very difficult to decide, and liable to change, and then we have to check all the API calls, and if we make a mistake it'll result in crashes. We know that would happen because we already did it, with purespace, and now master-branch Emacs is crashable. > However, it sounds like this adds back one reason why we wanted to get > rid of purespace: the tedious and error-prone job of identifying such > objects and marking them in the sources. That's why I think hints/guesses are the only option: then we can just ignore them, or lose some performance because of wrong hints, or correct them automatically, or whatever, without ever crashing Emacs. > OTOH, what other reliable ways do we have? Whether a given object is With hints, we don't need to be reliable, we just need an "oops, turns out not to be read-only" fire escape, which isn't that hard to do. > immutable is a programmer-level decision. I really don't see how this Why encourage programmers to even consider the question, though? I think of it as analogous to deciding the right size of a C integer: Lisp avoids forcing the programmer to make that decision, and it's a better language because of that. Guesses or hints are sufficient for performance, and they maintain the flexibility of simply mutating an object when we need to. They're also better for performance. (Char tables can be "sealed" so we don't have to scan the entire table, just the values used in it (and each value only needs to be scanned once). Mutation would mean unsealing them, and then we could re-seal them after another GC cycle or two without modfications. That's the kind of approach we should generalize, not "I promise not to modify this object".) > decision could be made automatically by Emacs. Hints (or guesses) can be done automatically. Hard promises, no. I'm comfortable giving up on hard promises for good. Pip ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-22 11:12 ` Pip Cet via Emacs development discussions. 2024-12-22 13:07 ` Eli Zaretskii @ 2024-12-22 15:51 ` Stefan Monnier 2024-12-22 17:09 ` Gerd Möllmann 2024-12-22 17:10 ` Pip Cet via Emacs development discussions. 1 sibling, 2 replies; 65+ messages in thread From: Stefan Monnier @ 2024-12-22 15:51 UTC (permalink / raw) To: Pip Cet; +Cc: Stefan Kangas, Helmut Eller, emacs-devel > My idea is this: we add an extra mark bit area to the pdumper file for > objects which we know to be "tenured": i.e. objects that we'll treat as > immortal, but for which we also know that all referenced objects will > also be "tenured", or static. IIUC this sounds like a kind of generational GC, except that promotion to the "tenured" set is made somewhat visible instead of being 100% internal. > If we write to such an object, we clear the bit, and put it on a special > set to maintain its tenure (it'd be nicer to simply set another bit, but > non-MPS pdumper cannot do so). This should happen rarely, but it's > better than the current CHECK_IMPURE thing. If my understanding above is correct, then the `CHECK_IMPURE/check_writable` is what we usually call "write barrier", and the "special set" above is what we usually call the "remembered set". Stefan ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-22 15:51 ` Stefan Monnier @ 2024-12-22 17:09 ` Gerd Möllmann 2024-12-22 17:10 ` Pip Cet via Emacs development discussions. 1 sibling, 0 replies; 65+ messages in thread From: Gerd Möllmann @ 2024-12-22 17:09 UTC (permalink / raw) To: Stefan Monnier; +Cc: Pip Cet, Stefan Kangas, Helmut Eller, emacs-devel Stefan Monnier <monnier@iro.umontreal.ca> writes: >> My idea is this: we add an extra mark bit area to the pdumper file for >> objects which we know to be "tenured": i.e. objects that we'll treat as >> immortal, but for which we also know that all referenced objects will >> also be "tenured", or static. > > IIUC this sounds like a kind of generational GC, except that promotion > to the "tenured" set is made somewhat visible instead of being 100% internal. > >> If we write to such an object, we clear the bit, and put it on a special >> set to maintain its tenure (it'd be nicer to simply set another bit, but >> non-MPS pdumper cannot do so). This should happen rarely, but it's >> better than the current CHECK_IMPURE thing. > > If my understanding above is correct, then the > `CHECK_IMPURE/check_writable` is what we usually call "write barrier", > and the "special set" above is what we usually call the "remembered set". > Same understanding here. ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-22 15:51 ` Stefan Monnier 2024-12-22 17:09 ` Gerd Möllmann @ 2024-12-22 17:10 ` Pip Cet via Emacs development discussions. 2025-01-04 0:09 ` Stefan Monnier 1 sibling, 1 reply; 65+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-22 17:10 UTC (permalink / raw) To: Stefan Monnier; +Cc: Stefan Kangas, Helmut Eller, emacs-devel "Stefan Monnier" <monnier@iro.umontreal.ca> writes: >> My idea is this: we add an extra mark bit area to the pdumper file for >> objects which we know to be "tenured": i.e. objects that we'll treat as >> immortal, but for which we also know that all referenced objects will >> also be "tenured", or static. > > IIUC this sounds like a kind of generational GC, except that promotion > to the "tenured" set is made somewhat visible instead of being 100% internal. Kind of a very special case of it, but there's no major GC and we accept that objects in the "old" generation are simply immortal. I think it's easier to get to the concept by starting from "here's a quick hack" rather than "start with generational GC, then remove bits from it until all that's left is a quick hack". I was intending for the promotion to be invisible to "normal" Lisp users, though. Sorry if that didn't become clear. >> If we write to such an object, we clear the bit, and put it on a special >> set to maintain its tenure (it'd be nicer to simply set another bit, but >> non-MPS pdumper cannot do so). This should happen rarely, but it's >> better than the current CHECK_IMPURE thing. > > If my understanding above is correct, then the > `CHECK_IMPURE/check_writable` is what we usually call "write barrier", Yes. I was trying to avoid the term because there are other uses for check_writable than clearing a write barrier, particularly if we pair each call to check_writable with one to no_longer_care_whether_the_object_is_writable. > and the "special set" above is what we usually call the "remembered set". TBH, I'm not entirely sure about that one, and I see I misspoke there; while the object remains immortal (and thus "tenured"), it's no longer considered skippable; in effect, the object combines the disadvantages of being young and being old, permanently, rather than clearly being one of the two. I guess you could say we turn it into an additional root? It becomes much closer to generational GC if you reintroduce "major" GCs which would recalculate the tenure of all objects, but that's where we hit the limits of "quick hack" territory, and I don't see a way of detecting when we would want to do so automatically. Anyway, leaving those very general questions aside, char table GC is inefficient and fixing it will speed up the first few GCs considerably, and we can do so without making anything "pure". Pip ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-22 17:10 ` Pip Cet via Emacs development discussions. @ 2025-01-04 0:09 ` Stefan Monnier 0 siblings, 0 replies; 65+ messages in thread From: Stefan Monnier @ 2025-01-04 0:09 UTC (permalink / raw) To: Pip Cet; +Cc: Stefan Kangas, Helmut Eller, emacs-devel >> and the "special set" above is what we usually call the "remembered set". > > TBH, I'm not entirely sure about that one, and I see I misspoke there; > while the object remains immortal (and thus "tenured"), it's no longer > considered skippable; in effect, the object combines the disadvantages > of being young and being old, permanently, rather than clearly being one > of the two. I guess you could say we turn it into an additional root? AFAIK in the context of generational GC, when a write-barrier records a mutation in the old/tenured object, different GCs make different choices w.r.t what is recorded in the "remembered set": the set may record the new target of the pointer that's assigned (assuming it's a pointer), or it record the address of the field that's mutated, or it can record the actual object that's mutated, or it records a "card" (i.e. a somewhat arbitrary memory region that contains the field, usually sized and aligned on a power of 2), ... So IMO what you describe still sits comfortably within the scope of "remembered set". > It becomes much closer to generational GC if you reintroduce "major" GCs > which would recalculate the tenure of all objects, but that's where we > hit the limits of "quick hack" territory, and I don't see a way of > detecting when we would want to do so automatically. Generational GCs don't have "a way of detecting when we would want to do [a major GC] automatically" either. They just use heuristics. Regarding "quick hack", I think the main issue is the write barrier. Stefan ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-22 1:28 ` Stefan Kangas 2024-12-22 11:12 ` Pip Cet via Emacs development discussions. @ 2024-12-22 13:13 ` Pip Cet via Emacs development discussions. 2024-12-22 14:16 ` Helmut Eller 2 siblings, 0 replies; 65+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-22 13:13 UTC (permalink / raw) To: Stefan Kangas; +Cc: Helmut Eller, Stefan Monnier, emacs-devel Pip Cet <pipcet@protonmail.com> writes: > However, I realize that (1) is currently a sheer guess. I haven't > decided whether it's worth it to get an upper bound on the saved GC time > by implementing a universal "tenured" set and performing a GC right > after loading (which should be very fast, not marking any pdumped > objects). I did. This got long again. That's because I wanted to be really sure that merging no-purespace isn't going to prevent worthwhile optimizations in the future, and I am now. Feel free to skip the rest :-) My initial results are that simply "tenuring" the char tables in the pdump seems to have such a drastic effect that it's hard to perform a fair measurement: process_mark_stack is called (in emacs -Q, no --batch) 21384 times if we "tenure" the char tables, and 135345 times if we don't. (This suggests that char tables may be worth optimizing for the "old" GC: simply keep a set of GC-relevant values in the char table, and scan that rather than scanning the entire char table. However, we can't do that with MPS, so I'm not overly interested in it. Also, I doubt the optimization decisions required for char tables would be made the same way if they were reimplemented today, so it may be more productive to start over from scratch, with a particular focus on reducing the time needed for GC rather than ordinary performance) Also, we need to add a few check_writable calls to avoid segfaults. I should have expected that, I guess. The good news is that few pdumped objects (256 once a non-batched Emacs is started) actually appear to be written to, so it's not entirely hopeless to identify those in one run and mark them non-tenured in the real Emacs. IOW, my tentative conclusion is that it's possible to perform such optimizations after pure space is dropped, and there's no reason to delay the merge. Optimizing based on a *hint* that an object probably won't be mutated is a potential way forward. Optimizing based on a hard promise that an object won't be mutated, as the old purespace code does, not so much. Even the old purespace code, with the years of development it's seen, ended up losing the optimization and causing preventable segfaults for valid-looking Elisp code. I must confess I'm fundamentally opposed to having objects come in a "read-only" and a "read-write" flavor. Either they should always be immutable, such as bignums and floats are now, or we should go to the trouble of supporting the rare cases in which an object hinted or guessed to be read-only turned out not to be. (This is independent of the question of whether the characters in a string can be changed or not.) It's very hard even to define what constitutes mutation of an object and what doesn't. Setting a symbol's global value is clearly a mutation in the current code, but what if we keep those global values in a hash table instead, and the struct Lisp_Symbol is never written to? Does lexically (or dynamically) binding a symbol mean the entire symbol is no longer read-only? If we ever implement hash-collision workarounds by randomizing hash seeds, would re-seeding count as a mutation of the hash table? What about (aset v 0 (aref v 0))? Hash table resizing? Removing dead keys from Weak hash tables? Pinning a string to use it in a byte code object? Wouldn't it make sense to protect hash table (or obarray) keys from mutation if that may result in irretrievable entries? Most of these questions have two good answers, one which aids in optimization, and one which Lisp programmers would expect. They're often different. To get back to the no-purespace branch, I think we should consider reintroducing check_writable () calls (which would currently be no-ops on the master branch) after the merge, if we can agree on precisely when this macro should be called and how. The old locations of CHECK_IMPURE can serve as a hint, but no more, so let's drop CHECK_IMPURE first and start with a clean slate there. Pip ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-22 1:28 ` Stefan Kangas 2024-12-22 11:12 ` Pip Cet via Emacs development discussions. 2024-12-22 13:13 ` Pip Cet via Emacs development discussions. @ 2024-12-22 14:16 ` Helmut Eller 2 siblings, 0 replies; 65+ messages in thread From: Helmut Eller @ 2024-12-22 14:16 UTC (permalink / raw) To: Stefan Kangas; +Cc: Stefan Monnier, emacs-devel, Pip Cet On Sun, Dec 22 2024, Stefan Kangas wrote: > Helmut Eller <eller.helmut@gmail.com> writes: [...] >> between the versions with and without the patch. The results was: >> >> without-pure-section: 0.006251480181 >> with-pure-section: 0.003986384857 > > This is interesting, thanks. Would this experiment easily transfer to > and be relevant to the MPS branch? To be useful for MPS, we would probably need sections that can be loaded to different/non-contiguous memory blocks that are allocated by MPS. (One section per MPS pool). That would require some work on the pdumper. It's less relevant for MPS because MPS is incremental and generational and so pure objects contribute much less to the average GC pause time. Helmut ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-17 19:30 ` Helmut Eller 2024-12-17 20:47 ` Stefan Monnier @ 2024-12-18 9:30 ` Pip Cet via Emacs development discussions. 1 sibling, 0 replies; 65+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-18 9:30 UTC (permalink / raw) To: Helmut Eller; +Cc: Stefan Kangas, emacs-devel, Stefan Monnier "Helmut Eller" <eller.helmut@gmail.com> writes: > On Tue, Dec 17 2024, Stefan Kangas wrote: > > [...] >> We, the maintainers, believe that the scratch/no-purespace branch is >> now ready to merge, and would appreciate any final feedback, testing, >> and code reviews. Specifically, the branch has been primarily tested >> on GNU/Linux and macOS, so testing on other systems would be greatly >> appreciated. > > Do you have an estimate what removing purespace will cost in terms of GC > time? I mean something like "1ms per collection". Or perhaps a > suggestion how I could measure it? "Close to zero" is the best I can do at this time. In particular, that implies "not catastrophically worse", which is all that is relevant right now, IMHO. > I mean something like "1ms per collection". Or perhaps a > suggestion how I could measure it? I think that all the previously-pure data should be available right away after starting an Emacs session, so maybe something like: perf record -e cycles ./src/emacs -Q --batch --eval '(dotimes (i 1000) (garbage-collect)))' would be one data point. Some effects would only be visible in GCs with deep call stacks, or in large sessions, but that's hard to measure. Pip ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-17 10:47 Merging scratch/no-purespace to remove unexec and purespace Stefan Kangas 2024-12-17 13:12 ` Gerd Möllmann 2024-12-17 19:30 ` Helmut Eller @ 2024-12-18 0:50 ` Po Lu 2024-12-18 2:12 ` Stefan Kangas 2024-12-18 21:26 ` Stefan Monnier 2 siblings, 2 replies; 65+ messages in thread From: Po Lu @ 2024-12-18 0:50 UTC (permalink / raw) To: Stefan Kangas; +Cc: emacs-devel, Pip Cet, Stefan Monnier Stefan Kangas <stefankangas@gmail.com> writes: > In August, we decided to remove the unexec build along with purespace. > The scratch/no-purespace branch removes both, and has been rebased on > top of a recent master. > > We, the maintainers, believe that the scratch/no-purespace branch is > now ready to merge, and would appreciate any final feedback, testing, > and code reviews. Specifically, the branch has been primarily tested > on GNU/Linux and macOS, so testing on other systems would be greatly > appreciated. > > Unless we uncover any serious blocking issues, or significant or > unexpected objections from the community, we plan to merge the > branch on February 1, 2025. > > During our last discussion, we identified some issues with using > pdumper on some legacy proprietary systems: MS-DOS, Windows 98, and > Solaris 10 Zone. As we have explained previously, patches are welcome > for these issues, but we do not currently consider them as blockers > for this merge. Please don't merge this until Emacs 30 is released, whether that be before the 1st of February or after. ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-18 0:50 ` Po Lu @ 2024-12-18 2:12 ` Stefan Kangas 2024-12-18 21:26 ` Stefan Monnier 1 sibling, 0 replies; 65+ messages in thread From: Stefan Kangas @ 2024-12-18 2:12 UTC (permalink / raw) To: Po Lu; +Cc: emacs-devel, Pip Cet, Stefan Monnier Po Lu <luangruo@yahoo.com> writes: > Stefan Kangas <stefankangas@gmail.com> writes: > >> In August, we decided to remove the unexec build along with purespace. >> The scratch/no-purespace branch removes both, and has been rebased on >> top of a recent master. >> >> We, the maintainers, believe that the scratch/no-purespace branch is >> now ready to merge, and would appreciate any final feedback, testing, >> and code reviews. Specifically, the branch has been primarily tested >> on GNU/Linux and macOS, so testing on other systems would be greatly >> appreciated. >> >> Unless we uncover any serious blocking issues, or significant or >> unexpected objections from the community, we plan to merge the >> branch on February 1, 2025. >> >> During our last discussion, we identified some issues with using >> pdumper on some legacy proprietary systems: MS-DOS, Windows 98, and >> Solaris 10 Zone. As we have explained previously, patches are welcome >> for these issues, but we do not currently consider them as blockers >> for this merge. > > Please don't merge this until Emacs 30 is released, whether that be > before the 1st of February or after. I'm guessing here, but I take that to mean: 1. You intend to work on resolving the problems with pdumper on one or more legacy proprietary systems. 2. You will not have much time before Emacs 30 is out, as you have previously indicated. Is that a correct understanding? If not, may I ask that you elaborate? ^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: Merging scratch/no-purespace to remove unexec and purespace 2024-12-18 0:50 ` Po Lu 2024-12-18 2:12 ` Stefan Kangas @ 2024-12-18 21:26 ` Stefan Monnier 1 sibling, 0 replies; 65+ messages in thread From: Stefan Monnier @ 2024-12-18 21:26 UTC (permalink / raw) To: Po Lu; +Cc: Stefan Kangas, emacs-devel, Pip Cet > Please don't merge this until Emacs 30 is released, whether that be > before the 1st of February or after. Why? Stefan ^ permalink raw reply [flat|nested] 65+ messages in thread
end of thread, other threads:[~2025-01-19 5:18 UTC | newest] Thread overview: 65+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-12-17 10:47 Merging scratch/no-purespace to remove unexec and purespace Stefan Kangas 2024-12-17 13:12 ` Gerd Möllmann 2024-12-17 14:20 ` Gerd Möllmann 2024-12-17 14:30 ` Gerd Möllmann 2024-12-17 17:56 ` Gerd Möllmann 2024-12-17 18:50 ` Eli Zaretskii 2024-12-17 18:56 ` Gerd Möllmann 2024-12-18 12:55 ` Andrea Corallo 2024-12-18 14:03 ` Gerd Möllmann 2024-12-18 16:05 ` Pip Cet via Emacs development discussions. 2024-12-18 16:30 ` Gerd Möllmann 2024-12-18 16:25 ` Pip Cet via Emacs development discussions. 2024-12-18 22:27 ` Andrea Corallo 2024-12-19 9:28 ` Pip Cet via Emacs development discussions. 2024-12-19 10:38 ` Andrea Corallo 2024-12-19 10:50 ` Stefan Kangas 2024-12-19 12:08 ` Pip Cet via Emacs development discussions. 2024-12-19 17:55 ` Stefan Kangas 2024-12-19 20:13 ` Pip Cet via Emacs development discussions. 2024-12-20 15:59 ` Stefan Monnier 2024-12-20 16:22 ` Pip Cet via Emacs development discussions. 2024-12-20 17:25 ` Gerd Möllmann 2024-12-20 20:35 ` Andrea Corallo 2024-12-20 20:39 ` Pip Cet via Emacs development discussions. 2024-12-21 6:33 ` Gerd Möllmann 2024-12-21 6:56 ` Andrea Corallo 2024-12-20 20:38 ` Pip Cet via Emacs development discussions. 2024-12-20 20:57 ` Gerd Möllmann 2025-01-03 22:36 ` Pip Cet via Emacs development discussions. 2025-01-03 23:50 ` Stefan Monnier 2025-01-04 7:26 ` Eli Zaretskii 2025-01-04 11:12 ` Pip Cet via Emacs development discussions. 2025-01-04 13:50 ` Eli Zaretskii 2025-01-06 19:05 ` Andrea Corallo 2025-01-07 12:46 ` Pip Cet via Emacs development discussions. 2025-01-18 9:40 ` Eli Zaretskii 2025-01-18 19:52 ` Andrea Corallo 2025-01-18 20:20 ` Pip Cet via Emacs development discussions. 2025-01-18 20:36 ` Andrea Corallo 2025-01-19 5:18 ` Eli Zaretskii 2024-12-20 8:42 ` Pip Cet via Emacs development discussions. 2024-12-18 0:18 ` Stefan Kangas 2024-12-17 19:30 ` Helmut Eller 2024-12-17 20:47 ` Stefan Monnier 2024-12-18 2:15 ` Stefan Kangas 2024-12-18 7:11 ` Helmut Eller 2024-12-18 13:35 ` Pip Cet via Emacs development discussions. 2024-12-18 6:56 ` Helmut Eller 2024-12-21 17:41 ` Helmut Eller 2024-12-21 18:32 ` Gerd Möllmann 2024-12-21 22:19 ` Pip Cet via Emacs development discussions. 2024-12-22 1:28 ` Stefan Kangas 2024-12-22 11:12 ` Pip Cet via Emacs development discussions. 2024-12-22 13:07 ` Eli Zaretskii 2024-12-22 14:12 ` Pip Cet via Emacs development discussions. 2024-12-22 15:51 ` Stefan Monnier 2024-12-22 17:09 ` Gerd Möllmann 2024-12-22 17:10 ` Pip Cet via Emacs development discussions. 2025-01-04 0:09 ` Stefan Monnier 2024-12-22 13:13 ` Pip Cet via Emacs development discussions. 2024-12-22 14:16 ` Helmut Eller 2024-12-18 9:30 ` Pip Cet via Emacs development discussions. 2024-12-18 0:50 ` Po Lu 2024-12-18 2:12 ` Stefan Kangas 2024-12-18 21:26 ` Stefan Monnier
Code repositories for project(s) associated with this public inbox https://git.savannah.gnu.org/cgit/emacs.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).