* Improving EQ @ 2024-12-11 22:37 Pip Cet via Emacs development discussions. 2024-12-12 6:36 ` Eli Zaretskii 2024-12-12 10:42 ` Improving EQ Óscar Fuentes 0 siblings, 2 replies; 25+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-11 22:37 UTC (permalink / raw) To: emacs-devel I looked at the "new" code generated for our EQ macro, and decided that a fix was in order. I'm therefore sending a first proposal to explain what I think should be done, and some numbers. This patch: * moves the "slow path" of EQ into a NO_INLINE function * exits early if the arguments to EQ are actually BASE_EQ * returns quickly (after a single memory access which cannot be avoided until we fix our tagging scheme to distinguish exotic objects from ordinary ones) when symbols_with_pos_enabled isn't true. The effect on the code size of the stripped emacs binary is small, but significant: 8906336 bytes instead of 8955488 bytes on this machine. (The effect on the code size of the emacs binary with debugging information is much larger, reducing it from 32182000 bytes to 31125832 bytes on this system.) There is no effect on the size of the .pdmp file, which is expected. What's missing here is a benchmark, but unless there's a really nasty surprise when that happens, I'm quite confident that we can improve the code here. The proposed code doesn't use __builtin_expect anymore. I've deliberately written slow_eq so it returns the same value as EQ, even if the slow code path is disabled. Pip commit 2c807f7320bcb9654e0f148d64c92053b1a47b42 (HEAD -> faster-eq) Author: Pip Cet <pipcet@protonmail.com> Date: Wed Dec 11 22:31:07 2024 +0000 Change EQ to move slow code path into a separate function * src/data.c (slow_eq): New function. * src/lisp.h (EQ): Call it. diff --git a/src/data.c b/src/data.c index 66cf34c1e60..5ee383d2f48 100644 --- a/src/data.c +++ b/src/data.c @@ -162,6 +162,15 @@ circular_list (Lisp_Object list) \f /* Data type predicates. */ +/* NO_INLINE to avoid excessive code growth when LTO is in use. */ +NO_INLINE bool slow_eq (Lisp_Object x, Lisp_Object y) +{ + return BASE_EQ ((symbols_with_pos_enabled && SYMBOL_WITH_POS_P (x) ? + XSYMBOL_WITH_POS_SYM (x) : x), + (symbols_with_pos_enabled && SYMBOL_WITH_POS_P (y) ? + XSYMBOL_WITH_POS_SYM (y) : y)); +} + DEFUN ("eq", Feq, Seq, 2, 2, 0, doc: /* Return t if the two args are the same Lisp object. */ attributes: const) diff --git a/src/lisp.h b/src/lisp.h index 832a1755c04..64d4835a499 100644 --- a/src/lisp.h +++ b/src/lisp.h @@ -618,6 +618,7 @@ #define ENUM_BF(TYPE) enum TYPE extern Lisp_Object default_value (Lisp_Object symbol); extern void defalias (Lisp_Object symbol, Lisp_Object definition); extern char *fixnum_to_string (EMACS_INT number, char *buffer, char *end); +extern bool slow_eq (Lisp_Object x, Lisp_Object y); /* Defined in emacs.c. */ @@ -1353,10 +1354,12 @@ make_fixed_natnum (EMACS_INT n) INLINE bool EQ (Lisp_Object x, Lisp_Object y) { - return BASE_EQ ((__builtin_expect (symbols_with_pos_enabled, false) - && SYMBOL_WITH_POS_P (x) ? XSYMBOL_WITH_POS_SYM (x) : x), - (__builtin_expect (symbols_with_pos_enabled, false) - && SYMBOL_WITH_POS_P (y) ? XSYMBOL_WITH_POS_SYM (y) : y)); + if (BASE_EQ (x, y)) + return true; + else if (!symbols_with_pos_enabled) + return false; + else + return slow_eq (x, y); } INLINE intmax_t ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: Improving EQ 2024-12-11 22:37 Improving EQ Pip Cet via Emacs development discussions. @ 2024-12-12 6:36 ` Eli Zaretskii 2024-12-12 8:23 ` Andrea Corallo 2024-12-12 8:36 ` Pip Cet via Emacs development discussions. 2024-12-12 10:42 ` Improving EQ Óscar Fuentes 1 sibling, 2 replies; 25+ messages in thread From: Eli Zaretskii @ 2024-12-12 6:36 UTC (permalink / raw) To: Pip Cet, Mattias Engdegård, Paul Eggert; +Cc: emacs-devel > Date: Wed, 11 Dec 2024 22:37:04 +0000 > From: Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> > > What's missing here is a benchmark, but unless there's a really nasty > surprise when that happens, I'm quite confident that we can improve the > code here. The usual easy benchmark is to byte-compile all the *.el files in the source tree. That is, remove all the *.elc files, then say "make" and time that. There was also some Emacs benchmark suite that someone posted, but I cannot find it now, maybe someone else will. Adding Mattias and Paul to this discussion. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Improving EQ 2024-12-12 6:36 ` Eli Zaretskii @ 2024-12-12 8:23 ` Andrea Corallo 2024-12-12 8:36 ` Pip Cet via Emacs development discussions. 1 sibling, 0 replies; 25+ messages in thread From: Andrea Corallo @ 2024-12-12 8:23 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Pip Cet, Mattias Engdegård, Paul Eggert, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: >> Date: Wed, 11 Dec 2024 22:37:04 +0000 >> From: Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> >> >> What's missing here is a benchmark, but unless there's a really nasty >> surprise when that happens, I'm quite confident that we can improve the >> code here. > > The usual easy benchmark is to byte-compile all the *.el files in the > source tree. That is, remove all the *.elc files, then say "make" and > time that. Agree, considering that tests the non zero 'symbols_with_pos_enabled' case. > There was also some Emacs benchmark suite that someone posted, but I > cannot find it now, maybe someone else will. <https://elpa.gnu.org/packages/elisp-benchmarks.html> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Improving EQ 2024-12-12 6:36 ` Eli Zaretskii 2024-12-12 8:23 ` Andrea Corallo @ 2024-12-12 8:36 ` Pip Cet via Emacs development discussions. 2024-12-12 9:18 ` Eli Zaretskii ` (3 more replies) 1 sibling, 4 replies; 25+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-12 8:36 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Mattias Engdegård, Paul Eggert, emacs-devel "Eli Zaretskii" <eliz@gnu.org> writes: >> Date: Wed, 11 Dec 2024 22:37:04 +0000 >> From: Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> >> >> What's missing here is a benchmark, but unless there's a really nasty >> surprise when that happens, I'm quite confident that we can improve the >> code here. > > The usual easy benchmark is to byte-compile all the *.el files in the > source tree. That is, remove all the *.elc files, then say "make" and > time that. Considering the point of the optimization was to make compilation (when symbols_with_pos_enabled is true) slower, but speed up non-compilation use cases, I think that may be the opposite of what we want :-) Furthermore, the master branch doesn't currently build after deleting all the *.elc files, because recompilation exceeds max-lisp-eval-depth in that scenario (together with the known purespace issue, this pretty much means "make bootstrap" is the only way I can rebuild an emacs tree right now. It'd be great if Someone could look into this, but I've failed to understand the native-compilation code (and been told off for trying to) too often for that Someone to be me. Plus, of course, I fully understand that native compilation currently has wrong code generation bugs which obviously have to take priority over build issues...) > There was also some Emacs benchmark suite that someone posted, but I > cannot find it now, maybe someone else will. https://elpa.gnu.org/packages/elisp-benchmarks.html ? It'd be great if we could agree on a benchmark, and even better if there were a way to reliably run it from emacs -Q :-) In fact, I would suggest to move a reduced benchmark suite to the emacs repo itself, and run it using "make benchmark". Also, just to let everyone know, I'm planning to make the "exotic" property (this object must or can use the slow_eq path) part (probably the LSB) of the tag rather than accessing it via a global variable and the PVEC type. This should reduce code size further, should speed up things, and has some other advantages which I'll go into when I have working code. Pip ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Improving EQ 2024-12-12 8:36 ` Pip Cet via Emacs development discussions. @ 2024-12-12 9:18 ` Eli Zaretskii 2024-12-12 9:35 ` Visuwesh ` (2 subsequent siblings) 3 siblings, 0 replies; 25+ messages in thread From: Eli Zaretskii @ 2024-12-12 9:18 UTC (permalink / raw) To: Pip Cet; +Cc: mattiase, eggert, emacs-devel > Date: Thu, 12 Dec 2024 08:36:50 +0000 > From: Pip Cet <pipcet@protonmail.com> > Cc: Mattias Engdegård <mattiase@acm.org>, Paul Eggert <eggert@cs.ucla.edu>, emacs-devel@gnu.org > > "Eli Zaretskii" <eliz@gnu.org> writes: > > > The usual easy benchmark is to byte-compile all the *.el files in the > > source tree. That is, remove all the *.elc files, then say "make" and > > time that. > > Considering the point of the optimization was to make compilation (when > symbols_with_pos_enabled is true) slower, but speed up non-compilation > use cases, I think that may be the opposite of what we want :-) That's fine, because knowing where this slows us down and by how much is also important. > Furthermore, the master branch doesn't currently build after deleting > all the *.elc files, because recompilation exceeds max-lisp-eval-depth > in that scenario (together with the known purespace issue, this pretty > much means "make bootstrap" is the only way I can rebuild an emacs tree > right now. It'd be great if Someone could look into this, but I've > failed to understand the native-compilation code (and been told off for > trying to) too often for that Someone to be me. Plus, of course, I fully > understand that native compilation currently has wrong code generation > bugs which obviously have to take priority over build issues...) If this is with native-compilation, how about trying without? Also, enlarging max-lisp-eval-depth (assuming you don't somehow hit infinite recursion) locally should be easy: just add that to the relevant Makefiles. > https://elpa.gnu.org/packages/elisp-benchmarks.html ? It'd be great if > we could agree on a benchmark, and even better if there were a way to > reliably run it from emacs -Q :-) Our benchmark facilities are very rudimentary, so agreement is not an issue: we just use whatever is available. > In fact, I would suggest to move a reduced benchmark suite to the emacs > repo itself, and run it using "make benchmark". Working on a better benchmark is very useful, but maybe we should try solving one problem at a time? > Also, just to let everyone know, I'm planning to make the "exotic" > property (this object must or can use the slow_eq path) part (probably > the LSB) of the tag rather than accessing it via a global variable and > the PVEC type. This should reduce code size further, should speed up > things, and has some other advantages which I'll go into when I have > working code. Whenever you change something in the tags, please remember to update .gdbinit, otherwise we lose debugging support. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Improving EQ 2024-12-12 8:36 ` Pip Cet via Emacs development discussions. 2024-12-12 9:18 ` Eli Zaretskii @ 2024-12-12 9:35 ` Visuwesh 2024-12-12 10:40 ` Andrea Corallo 2024-12-12 10:53 ` New "make benchmark" target Stefan Kangas 3 siblings, 0 replies; 25+ messages in thread From: Visuwesh @ 2024-12-12 9:35 UTC (permalink / raw) To: Pip Cet via Emacs development discussions. Cc: Eli Zaretskii, Pip Cet, Mattias Engdegård, Paul Eggert [வியாழன் டிசம்பர் 12, 2024] Pip Cet via "Emacs development discussions." wrote: >> There was also some Emacs benchmark suite that someone posted, but I >> cannot find it now, maybe someone else will. > > https://elpa.gnu.org/packages/elisp-benchmarks.html ? It'd be great if > we could agree on a benchmark, and even better if there were a way to > reliably run it from emacs -Q :-) Will the command package-isolate help in this scenario? ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Improving EQ 2024-12-12 8:36 ` Pip Cet via Emacs development discussions. 2024-12-12 9:18 ` Eli Zaretskii 2024-12-12 9:35 ` Visuwesh @ 2024-12-12 10:40 ` Andrea Corallo 2024-12-12 17:46 ` Pip Cet via Emacs development discussions. 2024-12-12 10:53 ` New "make benchmark" target Stefan Kangas 3 siblings, 1 reply; 25+ messages in thread From: Andrea Corallo @ 2024-12-12 10:40 UTC (permalink / raw) To: Pip Cet via Emacs development discussions. Cc: Eli Zaretskii, Pip Cet, Mattias Engdegård, Paul Eggert Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> writes: > "Eli Zaretskii" <eliz@gnu.org> writes: > >>> Date: Wed, 11 Dec 2024 22:37:04 +0000 >>> From: Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> >>> >>> What's missing here is a benchmark, but unless there's a really nasty >>> surprise when that happens, I'm quite confident that we can improve the >>> code here. >> >> The usual easy benchmark is to byte-compile all the *.el files in the >> source tree. That is, remove all the *.elc files, then say "make" and >> time that. > > Considering the point of the optimization was to make compilation (when > symbols_with_pos_enabled is true) slower, but speed up non-compilation > use cases, I think that may be the opposite of what we want :-) Glad you finally agree on the goal of the optimization. > Furthermore, the master branch doesn't currently build after deleting > all the *.elc files, because recompilation exceeds max-lisp-eval-depth > in that scenario (together with the known purespace issue, this pretty > much means "make bootstrap" is the only way I can rebuild an emacs tree > right now. It'd be great if Someone could look into this, but I've > failed to understand the native-compilation code (and been told off for > trying to) too often for that Someone to be me. Plus, of course, I fully > understand that native compilation currently has wrong code generation > bugs which obviously have to take priority over build issues...) > >> There was also some Emacs benchmark suite that someone posted, but I >> cannot find it now, maybe someone else will. > > https://elpa.gnu.org/packages/elisp-benchmarks.html ? It'd be great if > we could agree on a benchmark, and even better if there were a way to > reliably run it from emacs -Q :-) What is not reliable in the elisp-benchmarks invocation suggested in the instructions in it? > In fact, I would suggest to move a reduced benchmark suite to the emacs > repo itself, and run it using "make benchmark". That would be nice. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Improving EQ 2024-12-12 10:40 ` Andrea Corallo @ 2024-12-12 17:46 ` Pip Cet via Emacs development discussions. 2024-12-12 19:09 ` Eli Zaretskii 0 siblings, 1 reply; 25+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-12 17:46 UTC (permalink / raw) To: Andrea Corallo Cc: Pip Cet via "Emacs development discussions.", Eli Zaretskii, Mattias Engdegård, Paul Eggert "Andrea Corallo" <acorallo@gnu.org> writes: > Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> > writes: > >> "Eli Zaretskii" <eliz@gnu.org> writes: >> >>>> Date: Wed, 11 Dec 2024 22:37:04 +0000 >>>> From: Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> >>>> >>>> What's missing here is a benchmark, but unless there's a really nasty >>>> surprise when that happens, I'm quite confident that we can improve the >>>> code here. >>> >>> The usual easy benchmark is to byte-compile all the *.el files in the >>> source tree. That is, remove all the *.elc files, then say "make" and >>> time that. >> >> Considering the point of the optimization was to make compilation (when >> symbols_with_pos_enabled is true) slower, but speed up non-compilation >> use cases, I think that may be the opposite of what we want :-) > > Glad you finally agree on the goal of the optimization. I find that statement to be offensive, because you know it to be factually incorrect; but even if it weren't, gloating like that is extremely poor form for a maintainer. Pip ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Improving EQ 2024-12-12 17:46 ` Pip Cet via Emacs development discussions. @ 2024-12-12 19:09 ` Eli Zaretskii 0 siblings, 0 replies; 25+ messages in thread From: Eli Zaretskii @ 2024-12-12 19:09 UTC (permalink / raw) To: Pip Cet; +Cc: acorallo, emacs-devel, mattiase, eggert > Date: Thu, 12 Dec 2024 17:46:55 +0000 > From: Pip Cet <pipcet@protonmail.com> > Cc: "Pip Cet via \"Emacs development discussions.\"" <emacs-devel@gnu.org>, Eli Zaretskii <eliz@gnu.org>, Mattias Engdegård <mattiase@acm.org>, Paul Eggert <eggert@cs.ucla.edu> > > "Andrea Corallo" <acorallo@gnu.org> writes: > > >> Considering the point of the optimization was to make compilation (when > >> symbols_with_pos_enabled is true) slower, but speed up non-compilation > >> use cases, I think that may be the opposite of what we want :-) > > > > Glad you finally agree on the goal of the optimization. > > I find that statement to be offensive, because you know it to be > factually incorrect; but even if it weren't, gloating like that is > extremely poor form for a maintainer. I'm quite sure Andrea meant it as tongue-in-cheek, nowhere near gloating. Please keep in mind that for Andrea and others here, English is not their first language, and their choice of words could reflect that. ^ permalink raw reply [flat|nested] 25+ messages in thread
* New "make benchmark" target 2024-12-12 8:36 ` Pip Cet via Emacs development discussions. ` (2 preceding siblings ...) 2024-12-12 10:40 ` Andrea Corallo @ 2024-12-12 10:53 ` Stefan Kangas 2024-12-12 10:59 ` Andrea Corallo 3 siblings, 1 reply; 25+ messages in thread From: Stefan Kangas @ 2024-12-12 10:53 UTC (permalink / raw) To: Pip Cet, Eli Zaretskii Cc: Mattias Engdegård, Paul Eggert, emacs-devel, Andrea Corallo Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> writes: > https://elpa.gnu.org/packages/elisp-benchmarks.html ? It'd be great if > we could agree on a benchmark, and even better if there were a way to > reliably run it from emacs -Q :-) > > In fact, I would suggest to move a reduced benchmark suite to the emacs > repo itself, and run it using "make benchmark". SGTM, but why a reduced suite and not just the whole thing? ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: New "make benchmark" target 2024-12-12 10:53 ` New "make benchmark" target Stefan Kangas @ 2024-12-12 10:59 ` Andrea Corallo 2024-12-12 16:53 ` Pip Cet via Emacs development discussions. 0 siblings, 1 reply; 25+ messages in thread From: Andrea Corallo @ 2024-12-12 10:59 UTC (permalink / raw) To: Stefan Kangas Cc: Pip Cet, Eli Zaretskii, Mattias Engdegård, Paul Eggert, emacs-devel Stefan Kangas <stefankangas@gmail.com> writes: > Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> > writes: > >> https://elpa.gnu.org/packages/elisp-benchmarks.html ? It'd be great if >> we could agree on a benchmark, and even better if there were a way to >> reliably run it from emacs -Q :-) >> >> In fact, I would suggest to move a reduced benchmark suite to the emacs >> repo itself, and run it using "make benchmark". > > SGTM, but why a reduced suite and not just the whole thing? My fear is that if we start going into the rabbit hole of which benchmark of elisp-benchmarks should or should not be included, we will never agree and as a consequence succeed. So I guess I'd favor as well including all elisp-benchmarks. Andrea ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: New "make benchmark" target 2024-12-12 10:59 ` Andrea Corallo @ 2024-12-12 16:53 ` Pip Cet via Emacs development discussions. 2024-12-13 0:49 ` Stefan Kangas 0 siblings, 1 reply; 25+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-12 16:53 UTC (permalink / raw) To: Andrea Corallo Cc: Stefan Kangas, Eli Zaretskii, Mattias Engdegård, Paul Eggert, emacs-devel "Andrea Corallo" <acorallo@gnu.org> writes: > Stefan Kangas <stefankangas@gmail.com> writes: > >> Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> >> writes: >> >>> https://elpa.gnu.org/packages/elisp-benchmarks.html ? It'd be great if >>> we could agree on a benchmark, and even better if there were a way to >>> reliably run it from emacs -Q :-) >>> >>> In fact, I would suggest to move a reduced benchmark suite to the emacs >>> repo itself, and run it using "make benchmark". >> >> SGTM, but why a reduced suite and not just the whole thing? > > My fear is that if we start going into the rabbit hole of which > benchmark of elisp-benchmarks should or should not be included, we will > never agree and as a consequence succeed. So I guess I'd favor as well > including all elisp-benchmarks. I agree a full benchmark suite would be even better, I don't recall why I typed "reduced" there. So let's do that? Pip ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: New "make benchmark" target 2024-12-12 16:53 ` Pip Cet via Emacs development discussions. @ 2024-12-13 0:49 ` Stefan Kangas 2024-12-13 7:37 ` Andrea Corallo 2024-12-14 11:34 ` Pip Cet via Emacs development discussions. 0 siblings, 2 replies; 25+ messages in thread From: Stefan Kangas @ 2024-12-13 0:49 UTC (permalink / raw) To: Pip Cet, Andrea Corallo Cc: Eli Zaretskii, Mattias Engdegård, Paul Eggert, emacs-devel Pip Cet <pipcet@protonmail.com> writes: > I agree a full benchmark suite would be even better, I don't recall why > I typed "reduced" there. So let's do that? Please go ahead, thanks. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: New "make benchmark" target 2024-12-13 0:49 ` Stefan Kangas @ 2024-12-13 7:37 ` Andrea Corallo 2024-12-14 12:00 ` Stefan Kangas 2024-12-14 11:34 ` Pip Cet via Emacs development discussions. 1 sibling, 1 reply; 25+ messages in thread From: Andrea Corallo @ 2024-12-13 7:37 UTC (permalink / raw) To: Stefan Kangas Cc: Pip Cet, Eli Zaretskii, Mattias Engdegård, Paul Eggert, emacs-devel Stefan Kangas <stefankangas@gmail.com> writes: > Pip Cet <pipcet@protonmail.com> writes: > >> I agree a full benchmark suite would be even better, I don't recall why >> I typed "reduced" there. So let's do that? > > Please go ahead, thanks. Asking as elisp-benchmark author/maintainer, the way to move an external package in core is to copy the files and keep them manually in sync or there are other ways? ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: New "make benchmark" target 2024-12-13 7:37 ` Andrea Corallo @ 2024-12-14 12:00 ` Stefan Kangas 2024-12-14 14:06 ` Stefan Monnier 0 siblings, 1 reply; 25+ messages in thread From: Stefan Kangas @ 2024-12-14 12:00 UTC (permalink / raw) To: Andrea Corallo Cc: Pip Cet, Eli Zaretskii, Mattias Engdegård, Paul Eggert, emacs-devel, Stefan Monnier Andrea Corallo <acorallo@gnu.org> writes: > Asking as elisp-benchmark author/maintainer, the way to move an external > package in core is to copy the files and keep them manually in sync or > there are other ways? We can make it into a :core package, which means that you copy the file to emacs.git, and when you update the "Version" header on Emacs master, the GNU ELPA scripts make a release based on that commit. Stefan Monnier (in CC) will know if there are any other adjustments that are needed, but one thing that needs doing is a change to the GNU ELPA `elpa-packages` file, something like this: - (elisp-benchmarks :url nil) + (elisp-benchmarks :core ("lisp/emacs-lisp/elisp-benchmarks.el")) ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: New "make benchmark" target 2024-12-14 12:00 ` Stefan Kangas @ 2024-12-14 14:06 ` Stefan Monnier 0 siblings, 0 replies; 25+ messages in thread From: Stefan Monnier @ 2024-12-14 14:06 UTC (permalink / raw) To: Stefan Kangas Cc: Andrea Corallo, Pip Cet, Eli Zaretskii, Mattias Engdegård, Paul Eggert, emacs-devel >> Asking as elisp-benchmark author/maintainer, the way to move an external >> package in core is to copy the files and keep them manually in sync or >> there are other ways? Well, there's the all too famous "bundled ELPA packages" feature we never finished, of course (see `git branch -a | grep elpa` for various approaches Philip Lord proposed and that we never integrated 🙁). [ Stepping down before installing such a feature is my biggest regret w.r.t my time as Emacs maintainer. ] > - (elisp-benchmarks :url nil) > + (elisp-benchmarks :core ("lisp/emacs-lisp/elisp-benchmarks.el")) I suspect the above would need to be completed with more files/dirs in the list. Stefan ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: New "make benchmark" target 2024-12-13 0:49 ` Stefan Kangas 2024-12-13 7:37 ` Andrea Corallo @ 2024-12-14 11:34 ` Pip Cet via Emacs development discussions. 2024-12-14 11:58 ` Stefan Kangas 1 sibling, 1 reply; 25+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-14 11:34 UTC (permalink / raw) To: Stefan Kangas Cc: Andrea Corallo, Eli Zaretskii, Mattias Engdegård, Paul Eggert, emacs-devel "Stefan Kangas" <stefankangas@gmail.com> writes: > Pip Cet <pipcet@protonmail.com> writes: > >> I agree a full benchmark suite would be even better, I don't recall why >> I typed "reduced" there. So let's do that? > > Please go ahead, thanks. Here's my proposal: Expand ERT to handle the benchmark case. Copy ALL benchmarks from elisp-benchmarks, but use the ERT framework instead of elisp-benchmarks.el. Keep things minimal for now, try it out, and add complexity only if we get the impression this would make a useful addition; otherwise, revert the changes and go back to using elisp-benchmarks.el. This is what would work: 1. add a :benchmark tag to ert tests 2. create a new directory test/benchmark/ for benchmarks 3. modify test/Makefile not to run benchmark tests by default 4. add make targets to run the benchmark tests only I think the mechanism used by elisp-benchmarks.el to select tests is very ad-hoc and less powerful than the ert tagging mechanism. It also doesn't work for me: executing (progn (elisp-benchmarks-run "" t 1) (elisp-benchmarks-run "bubble" t 1)) means all tests are run twice, but I intended to run all tests once, as a warm up, then run only the bubble test again. However, I suspect this is merely a bug which can easily be fixed (maybe it's intentional, though?). I'm also seeing problems with a "Window size not as stipulated by the benchmark" error message, but I'll have to investigate that one... The mathematical part of elisp-benchmarks.el is questionable: it's built around the idea of using an arithmetic average of several test runs (which is equivalent to a single test run with more iterations); I believe a median/percentile-based approach to selecting a "typical" run would yield more useful numbers. So I'm not proposing to reuse the avg-based code. I tried to resist the temptation of making ert.el overly general; for example, instead of defining a new defstruct to determine HOW tests are run, I merely added a benchmark flag. Maybe we can revisit this if the approach is adopted and the need for more detailed benchmark specifications (inhibit GC? warmup? iteration counts? interact with "perf"?) becomes apparent. However, I did fail and give in to the temptation to allow an inhibit-gc mode specifier, which should probably be removed again... The main problems I see are that "make benchmark" starts emacs instances for all files in test/, which takes a lot of time (but that's a general ERT problem that should be fixed for pass-or-fail testing, too). A minor problem is how to copy the elisp-benchmark tests and keep them in sync. This would very much depend on how much work Andrea is willing to do. Finally, elisp-benchmarks has a very useful feature, somewhat hidden, that this code lacks: while calculating the arithmetic average of several benchmark runs isn't useful, calculating the standard deviation from that average is, because it gives us an indication of how scattered the results are; scattered test results (i.e. high numbers reported in the "tot avg err" column) are a sufficient, but not a necessary, condition for knowing when to discard the test results because something unexpected happened (most likely system load issues or CPU clock games). Benchmarking is, of course, very hard. I understand Paul Eggert is using an ancient machine for benchmarking because it avoids many of the issues that have arisen in the past decade. With a modern machine, we have a heterogeneous set of CPU cores (energy/performance), each of which can be configured in different ways (energy-performance preference for each core) in addition to running at a variable and unpredictable clock speed (cpufreq/boost); CPU caches are now large enough to persist across benchmark runs, and the system memory clock is also variable. This is a very rough start which would allow us to detect many, but not all, unexpected catastrophic performance reductions due to code changes. Finally, if Someone is willing to work on this, we should look into finding a set of benchmarks representative of "typical" Emacs usage so we can use PGO when building Emacs. While I'd prefer doing neither of the two, playing with PGO is a much more promising and maintainable approach than adding __builtin_expect noise to our C source code. Pip From 4217a5b8f990760775709392b24e0205041cfed3 Mon Sep 17 00:00:00 2001 From: Pip Cet <pipcet@protonmail.com> Date: Sat, 14 Dec 2024 10:45:42 +0000 Subject: [PATCH 1/3] Add a benchmark from elisp-benchmarks DO NOT MERGE FIXME BEFORE MERGING: Should we add a link to https://git.savannah.gnu.org/gitweb/?p=emacs/elpa.git;a=history;\ f=benchmarks/bubble.el;h=d7101b1b99b60a3bd6945909d1f0125215f4ce1c;\ hb=refs/heads/externals/elisp-benchmarks here? Losing git history because we copy a file from elpa to emacs seems suboptimal... * test/benchmark/bubble.el: New file. --- test/benchmark/bubble.el | 52 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 52 insertions(+) create mode 100644 test/benchmark/bubble.el diff --git a/test/benchmark/bubble.el b/test/benchmark/bubble.el new file mode 100644 index 00000000000..0c38cdbce39 --- /dev/null +++ b/test/benchmark/bubble.el @@ -0,0 +1,52 @@ +;; -*- lexical-binding: t; -*- + +;; Copyright (C) 2019 Free Software Foundation, Inc. + +;; This file is part of GNU Emacs. + +;; GNU Emacs is free software: you can redistribute it and/or modify +;; it under the terms of the GNU General Public License as published by +;; the Free Software Foundation, either version 3 of the License, or +;; (at your option) any later version. + +;; GNU Emacs is distributed in the hope that it will be useful, +;; but WITHOUT ANY WARRANTY; without even the implied warranty of +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +;; GNU General Public License for more details. + +;; You should have received a copy of the GNU General Public License +;; along with GNU Emacs. If not, see <https://www.gnu.org/licenses/>. + +;;; Commentary: + +;; From: +;; https://www.emacswiki.org/emacs/EmacsLispBenchmark + +(require 'ert) + +(require 'cl-lib) + +(defvar elb-bubble-len 1000) +(defvar elb-bubble-list (mapcar #'random (make-list elb-bubble-len + most-positive-fixnum))) + +(defun elb-bubble (list) + (let ((i (length list))) + (while (> i 1) + (let ((b list)) + (while (cdr b) + (when (< (cadr b) (car b)) + (setcar b (prog1 (cadr b) + (setcdr b (cons (car b) (cddr b)))))) + (setq b (cdr b)))) + (setq i (1- i))) + list)) + +(defun elb-bubble-entry () + (cl-loop repeat 100 + for l = (copy-sequence elb-bubble-list) + do (elb-bubble l))) + +(ert-deftest benchmark-bubble () + :tags '(:benchmark) + (elb-bubble-entry)) -- 2.47.0 From df31e19452dff0fe804af2fd3c73f4cee84b6d16 Mon Sep 17 00:00:00 2001 From: Pip Cet <pipcet@protonmail.com> Date: Sat, 14 Dec 2024 10:56:19 +0000 Subject: [PATCH 2/3] Expand the ERT framework to allow for benchmarks * lisp/emacs-lisp/ert.el (ert-test-result, ert--test-execution-info): Expand structs to include "benchmark" field. (ert--run-benchmark-test-internal): New function. (ert-run-test, ert-run-or-rerun-test): (ert-run-tests-batch-and-exit): (ert-run-tests): Add benchmark argument. (ert-run-tests-batch): Include GC information when running benchmarks. (ert-summarize-tests-batch-and-exit): Handle 1.0e+INF. --- lisp/emacs-lisp/ert.el | 91 +++++++++++++++++++++++++++++++++--------- 1 file changed, 73 insertions(+), 18 deletions(-) diff --git a/lisp/emacs-lisp/ert.el b/lisp/emacs-lisp/ert.el index 97aa233f6e2..76365ed8152 100644 --- a/lisp/emacs-lisp/ert.el +++ b/lisp/emacs-lisp/ert.el @@ -698,6 +698,7 @@ ert-test-result (messages nil) (should-forms nil) (duration 0) + (benchmark nil) ) (cl-defstruct (ert-test-passed (:include ert-test-result))) (cl-defstruct (ert-test-result-with-condition (:include ert-test-result)) @@ -723,7 +724,8 @@ ert--test-execution-info ;; execution of the current test. We store it to avoid being ;; affected by any new bindings the test itself may establish. (I ;; don't remember whether this feature is important.) - ert-debug-on-error) + ert-debug-on-error + benchmark) (defun ert--run-test-debugger (info condition debugfun) "Error handler used during the test run. @@ -801,6 +803,39 @@ ert--run-test-internal (make-ert-test-passed)) nil) +(defun ert--run-benchmark-test-internal (test-execution-info benchmark) + (setf (ert--test-execution-info-ert-debug-on-error test-execution-info) + ert-debug-on-error) + (catch 'ert--pass + ;; For now, each test gets its own temp buffer and its own + ;; window excursion, just to be safe. If this turns out to be + ;; too expensive, we can remove it. + (with-temp-buffer + (save-window-excursion + (let ((lexical-binding t) ;;FIXME: Why? + (ert--infos '()) + time) + (letrec ((debugfun (lambda (err) + (ert--run-test-debugger test-execution-info + err debugfun)))) + (handler-bind (((error quit) debugfun)) + (garbage-collect) + (let ((gc-cons-threshold (if (eq benchmark 'inhibit-gc) + most-positive-fixnum + gc-cons-threshold))) + (setq time (benchmark-run nil + (funcall (ert-test-body (ert--test-execution-info-test + test-execution-info)))))) + (and (eq benchmark 'inhibit-gc) + (not (= (cadr time) 0)) + (warn "failed to inhibit gc; time %S" time)) + (setf (ert--test-execution-info-benchmark test-execution-info) + time)))))) + (ert-pass)) + (setf (ert--test-execution-info-result test-execution-info) + (make-ert-test-passed)) + nil) + (defun ert--force-message-log-buffer-truncation () "Immediately truncate *Messages* buffer according to `message-log-max'. @@ -832,7 +867,7 @@ ert--running-tests The elements are of type `ert-test'.") -(defun ert-run-test (ert-test) +(defun ert-run-test (ert-test &optional benchmark) "Run ERT-TEST. Returns the result and stores it in ERT-TEST's `most-recent-result' slot." @@ -855,8 +890,12 @@ ert-run-test (push form-description should-form-accu))) (message-log-max t) (ert--running-tests (cons ert-test ert--running-tests))) - (ert--run-test-internal info)) + (if benchmark + (ert--run-benchmark-test-internal info benchmark) + (ert--run-test-internal info))) (let ((result (ert--test-execution-info-result info))) + (setf (ert-test-result-benchmark result) + (ert--test-execution-info-benchmark info)) (setf (ert-test-result-messages result) (with-current-buffer (messages-buffer) (buffer-substring begin-marker (point-max)))) @@ -1206,7 +1245,7 @@ ert--make-stats :test-start-times (make-vector (length tests) nil) :test-end-times (make-vector (length tests) nil)))) -(defun ert-run-or-rerun-test (stats test listener) +(defun ert-run-or-rerun-test (stats test listener &optional benchmark) ;; checkdoc-order: nil "Run the single test TEST and record the result using STATS and LISTENER." (let ((ert--current-run-stats stats) @@ -1221,19 +1260,26 @@ ert-run-or-rerun-test (setf (ert-test-most-recent-result test) nil) (setf (aref (ert--stats-test-start-times stats) pos) (current-time)) (unwind-protect - (ert-run-test test) + (ert-run-test test benchmark) (setf (aref (ert--stats-test-end-times stats) pos) (current-time)) (let ((result (ert-test-most-recent-result test))) - (setf (ert-test-result-duration result) - (float-time - (time-subtract - (aref (ert--stats-test-end-times stats) pos) - (aref (ert--stats-test-start-times stats) pos)))) + (cond ((ert-test-result-benchmark result) + (setf (ert-test-result-duration result) + (if (memq benchmark '(no-gc inhibit-gc)) + (- (car (ert-test-result-benchmark result)) + (caddr (ert-test-result-benchmark result))) + (car (ert-test-result-benchmark result))))) + (t + (setf (ert-test-result-duration result) + (float-time + (time-subtract + (aref (ert--stats-test-end-times stats) pos) + (aref (ert--stats-test-start-times stats) pos)))))) (ert--stats-set-test-and-result stats pos test result) (funcall listener 'test-ended stats test result)) (setf (ert--stats-current-test stats) nil)))) -(defun ert-run-tests (selector listener &optional interactively) +(defun ert-run-tests (selector listener &optional interactively benchmark) "Run the tests specified by SELECTOR, sending progress updates to LISTENER." (let* ((tests (ert-select-tests selector t)) (stats (ert--make-stats tests selector))) @@ -1245,7 +1291,7 @@ ert-run-tests (force-mode-line-update) (unwind-protect (cl-loop for test in tests do - (ert-run-or-rerun-test stats test listener) + (ert-run-or-rerun-test stats test listener benchmark) (when (and interactively (ert-test-quit-p (ert-test-most-recent-result test)) @@ -1367,7 +1413,7 @@ ert-batch-backtrace-right-margin "The maximum line length for printing backtraces in `ert-run-tests-batch'.") ;;;###autoload -(defun ert-run-tests-batch (&optional selector) +(defun ert-run-tests-batch (&optional selector benchmark) "Run the tests specified by SELECTOR, printing results to the terminal. SELECTOR selects which tests to run as described in `ert-select-tests' when @@ -1493,7 +1539,7 @@ ert-run-tests-batch (let* ((max (prin1-to-string (length (ert--stats-tests stats)))) (format-string (concat "%9s %" (prin1-to-string (length max)) - "s/" max " %S (%f sec)%s"))) + "s/" max " %S (%f sec%s)%s"))) (message format-string (ert-string-for-test-result result (ert-test-result-expected-p @@ -1501,13 +1547,19 @@ ert-run-tests-batch (1+ (ert--stats-test-pos stats test)) (ert-test-name test) (ert-test-result-duration result) + (if (ert-test-result-benchmark result) + (format ", %f sec in GC, %d GCs" + (caddr (ert-test-result-benchmark result)) + (cadr (ert-test-result-benchmark result))) + "") (if (ert-test-result-expected-p test result) "" (concat " " (ert-test-location test)))))))))) - nil)) + nil + benchmark)) ;;;###autoload -(defun ert-run-tests-batch-and-exit (&optional selector) +(defun ert-run-tests-batch-and-exit (&optional selector benchmark) "Like `ert-run-tests-batch', but exits Emacs when done. The exit status will be 0 if all test results were as expected, 1 @@ -1525,7 +1577,7 @@ ert-run-tests-batch-and-exit (setq attempt-stack-overflow-recovery nil attempt-orderly-shutdown-on-fatal-signal nil) (unwind-protect - (let ((stats (ert-run-tests-batch selector))) + (let ((stats (ert-run-tests-batch selector benchmark))) (when eln-dir (ignore-errors (delete-directory eln-dir t))) @@ -1726,7 +1778,10 @@ ert-summarize-tests-batch-and-exit If HIGH is a natural number, the HIGH long lasting tests are summarized." (or noninteractive (user-error "This function is only for use in batch mode")) - (or (natnump high) (setq high 0)) + (cond + ;; FIXME: ntake doesn't allow an infinity argument + ((eql high 1.0e+INF) (setq high most-positive-fixnum)) + ((not (natnump high)) (setq high 0))) ;; Better crash loudly than attempting to recover from undefined ;; behavior. (setq attempt-stack-overflow-recovery nil -- 2.47.0 From 8cd4053967a0aa6521039ba887c911daa13b0cf0 Mon Sep 17 00:00:00 2001 From: Pip Cet <pipcet@protonmail.com> Date: Sat, 14 Dec 2024 10:59:38 +0000 Subject: [PATCH 3/3] Add "make benchmark" rule * Makefile.in (benchmark): New recipe. * test/Makefile.in (SELECTOR_BENCHMARK): New selector. (SELECTOR_ALL, SELECTOR_EXPENSIVE, SELECTOR_DEFAULT): Modify selectors not to include benchmark tests. (check-benchmark): New recipe. --- Makefile.in | 7 +++++++ test/Makefile.in | 25 +++++++++++++++++-------- 2 files changed, 24 insertions(+), 8 deletions(-) diff --git a/Makefile.in b/Makefile.in index 30a762ed03b..13a55452da2 100644 --- a/Makefile.in +++ b/Makefile.in @@ -69,6 +69,9 @@ # check-expensive includes additional tests that can be slow. # check-all runs all tests, including ones that can be slow, or # fail unpredictably +# +# make benchmark +# Run the Emacs benchmark suite. SHELL = @SHELL@ @@ -1138,6 +1141,10 @@ $(CHECK_TARGETS): test/%: $(MAKE) -C test $* +BENCHMARK_TARGETS = benchmark +.PHONY: $(BENCHMARK_TARGETS) +$(BENCHMARK_TARGETS): all + $(MAKE) SUMMARIZE_TESTS="1.0e+INF" BENCHMARKP="t" -C test check-benchmark dist: cd ${srcdir}; ./make-dist diff --git a/test/Makefile.in b/test/Makefile.in index 7a3178546a1..18a478b3e6c 100644 --- a/test/Makefile.in +++ b/test/Makefile.in @@ -78,9 +78,9 @@ TEST_TIMEOUT = TEST_INTERACTIVE ?= no ifeq ($(TEST_INTERACTIVE),yes) -TEST_RUN_ERT = --eval '(ert (quote ${SELECTOR_ACTUAL}))' +TEST_RUN_ERT = --eval '(ert (quote ${SELECTOR_ACTUAL}) ${BENCHMARKP})' else -TEST_RUN_ERT = --batch --eval '(ert-run-tests-batch-and-exit (quote ${SELECTOR_ACTUAL}))' ${WRITE_LOG} +TEST_RUN_ERT = --batch --eval '(ert-run-tests-batch-and-exit (quote ${SELECTOR_ACTUAL}) ${BENCHMARKP})' ${WRITE_LOG} endif # Whether to run tests from .el files in preference to .elc, we do @@ -136,13 +136,15 @@ TEST_NATIVE_COMP = TEST_NATIVE_COMP = no endif ifeq ($(TEST_NATIVE_COMP),yes) -SELECTOR_DEFAULT = (not (or (tag :expensive-test) (tag :unstable))) -SELECTOR_EXPENSIVE = (not (tag :unstable)) -SELECTOR_ALL = t +SELECTOR_DEFAULT = (not (or (tag :expensive-test) (tag :unstable) (tag :benchmark))) +SELECTOR_EXPENSIVE = (not (or (tag :unstable) (tag :benchmark))) +SELECTOR_ALL = (not (tag :benchmark)) +SELECTOR_BENCHMARK = (tag :benchmark) else -SELECTOR_DEFAULT = (not (or (tag :expensive-test) (tag :unstable) (tag :nativecomp))) -SELECTOR_EXPENSIVE = (not (or (tag :unstable) (tag :nativecomp))) -SELECTOR_ALL = (not (tag :nativecomp)) +SELECTOR_DEFAULT = (not (or (tag :expensive-test) (tag :unstable) (tag :nativecomp) (tag :benchmark))) +SELECTOR_EXPENSIVE = (not (or (tag :unstable) (tag :nativecomp) (tag :benchmark))) +SELECTOR_ALL = (not (or (tag :nativecomp) (tag :benchmark))) +SELECTOR_BENCHMARK = (and (tag :benchmark) (not (tag :nativecomp))) endif ifdef SELECTOR SELECTOR_ACTUAL=$(SELECTOR) @@ -154,6 +156,8 @@ SELECTOR_ACTUAL= SELECTOR_ACTUAL=$(SELECTOR_DEFAULT) else ifeq ($(MAKECMDGOALS),check-maybe) SELECTOR_ACTUAL=$(SELECTOR_DEFAULT) +else ifeq ($(MAKECMDGOALS),check-benchmark) +SELECTOR_ACTUAL=$(SELECTOR_BENCHMARK) else SELECTOR_ACTUAL=$(SELECTOR_EXPENSIVE) endif @@ -323,6 +327,11 @@ .PHONY: check-all: mostlyclean check-no-automated-subdir @${MAKE} check-doit SELECTOR="${SELECTOR_ALL}" +## Run all benchmark tests, regardless of tag. +.PHONY: check-benchmark +check-benchmark: mostlyclean check-no-automated-subdir + @${MAKE} check-doit SELECTOR="${SELECTOR_BENCHMARK}" + ## Re-run all tests which are outdated. A test is outdated if its ## logfile is out-of-date with either the test file, or the source ## files that the tests depend on. See test_template. -- 2.47.0 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: New "make benchmark" target 2024-12-14 11:34 ` Pip Cet via Emacs development discussions. @ 2024-12-14 11:58 ` Stefan Kangas [not found] ` <875xnmf2qp.fsf@protonmail.com> 0 siblings, 1 reply; 25+ messages in thread From: Stefan Kangas @ 2024-12-14 11:58 UTC (permalink / raw) To: Pip Cet Cc: Andrea Corallo, Eli Zaretskii, Mattias Engdegård, Paul Eggert, emacs-devel Pip Cet <pipcet@protonmail.com> writes: > FIXME BEFORE MERGING: Should we add a link to > https://git.savannah.gnu.org/gitweb/?p=emacs/elpa.git;a=history;\ > f=benchmarks/bubble.el;h=d7101b1b99b60a3bd6945909d1f0125215f4ce1c;\ > hb=refs/heads/externals/elisp-benchmarks > here? Losing git history because we copy a file from elpa to emacs > seems suboptimal... Instead of copying the file, it might be preferable to import the entire git history into emacs.git, like we did for use-package and eglot. Then the old branch on GNU ELPA can be dropped, as we won't lose any history. João has some scripts that he used for eglot, and I adapted them for use-package. Note that he also had some copyright assignment issues to take care of, so it could probably be simplified. Please take a look here: https://gist.github.com/joaotavora/2ed97f2ec85958986983d5cb78202770 ^ permalink raw reply [flat|nested] 25+ messages in thread
[parent not found: <875xnmf2qp.fsf@protonmail.com>]
* Re: New "make benchmark" target [not found] ` <875xnmf2qp.fsf@protonmail.com> @ 2024-12-14 20:20 ` João Távora 0 siblings, 0 replies; 25+ messages in thread From: João Távora @ 2024-12-14 20:20 UTC (permalink / raw) To: Pip Cet Cc: Stefan Kangas, Andrea Corallo, Eli Zaretskii, Mattias Engdegård, Paul Eggert, emacs-devel On Sat, Dec 14, 2024 at 8:07 PM Pip Cet <pipcet@protonmail.com> wrote: > Joao, I think it would be a good idea to keep the modified script in > admin/ or somewhere for future reference, but I don't know whether you > consider it an Emacs contribution (and, thus, covered by your copyright > assignment). Sure, use it. But there's a catch that I'm fairly sure I didn't write part of that code. Someone on the emacs-devel list helped me and wrote the first versions of it (following my requirements), and I'm very sorry but I can't remember who it was. Github handle is 'bhankas. Good luck! ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Improving EQ 2024-12-11 22:37 Improving EQ Pip Cet via Emacs development discussions. 2024-12-12 6:36 ` Eli Zaretskii @ 2024-12-12 10:42 ` Óscar Fuentes 2024-12-12 10:50 ` Andrea Corallo 1 sibling, 1 reply; 25+ messages in thread From: Óscar Fuentes @ 2024-12-12 10:42 UTC (permalink / raw) To: emacs-devel; +Cc: Pip Cet Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> writes: > I looked at the "new" code generated for our EQ macro, and decided that > a fix was in order. I'm therefore sending a first proposal to explain > what I think should be done, and some numbers. > > This patch: > * moves the "slow path" of EQ into a NO_INLINE function > * exits early if the arguments to EQ are actually BASE_EQ > * returns quickly (after a single memory access which cannot be avoided > until we fix our tagging scheme to distinguish exotic objects from > ordinary ones) when symbols_with_pos_enabled isn't true. > > The effect on the code size of the stripped emacs binary is small, but > significant: 8906336 bytes instead of 8955488 bytes on this machine. > (The effect on the code size of the emacs binary with debugging > information is much larger, reducing it from 32182000 bytes to 31125832 > bytes on this system.) There is no effect on the size of the .pdmp > file, which is expected. > > What's missing here is a benchmark, but unless there's a really nasty > surprise when that happens, I'm quite confident that we can improve the > code here. I've seen too many cases where *removing* instructions (mind you, literally removing, not changing!) made the code significantly slower. Modern CPUs are insanely complex and combined with compilers make intuition-based predictions even more futile. But reading your message makes me wonder if EQ and some other "simple" fundamental functions are not lowered by nativecomp? If not, maybe that's a significant opportunity for improvement. As for your patch, one thing that would be easy to do and might save quite a lot of head scratching is to count the fraction of the calls to EQ that benefit from the fast path on a "representative" Emacs run. Then you would have hard data to decide if fighting the compiler/CPU on that case is a worthy cause. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Improving EQ 2024-12-12 10:42 ` Improving EQ Óscar Fuentes @ 2024-12-12 10:50 ` Andrea Corallo 2024-12-12 11:21 ` Óscar Fuentes ` (2 more replies) 0 siblings, 3 replies; 25+ messages in thread From: Andrea Corallo @ 2024-12-12 10:50 UTC (permalink / raw) To: Óscar Fuentes; +Cc: emacs-devel, Pip Cet Óscar Fuentes <ofv@wanadoo.es> writes: > Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> > writes: > >> I looked at the "new" code generated for our EQ macro, and decided that >> a fix was in order. I'm therefore sending a first proposal to explain >> what I think should be done, and some numbers. >> >> This patch: >> * moves the "slow path" of EQ into a NO_INLINE function >> * exits early if the arguments to EQ are actually BASE_EQ >> * returns quickly (after a single memory access which cannot be avoided >> until we fix our tagging scheme to distinguish exotic objects from >> ordinary ones) when symbols_with_pos_enabled isn't true. >> >> The effect on the code size of the stripped emacs binary is small, but >> significant: 8906336 bytes instead of 8955488 bytes on this machine. >> (The effect on the code size of the emacs binary with debugging >> information is much larger, reducing it from 32182000 bytes to 31125832 >> bytes on this system.) There is no effect on the size of the .pdmp >> file, which is expected. >> >> What's missing here is a benchmark, but unless there's a really nasty >> surprise when that happens, I'm quite confident that we can improve the >> code here. > > I've seen too many cases where *removing* instructions (mind you, > literally removing, not changing!) made the code significantly slower. > > Modern CPUs are insanely complex and combined with compilers make > intuition-based predictions even more futile. That's why the patch needs to be benchmarked anyway. > But reading your message makes me wonder if EQ and some other "simple" > fundamental functions are not lowered by nativecomp? If not, maybe > that's a significant opportunity for improvement. Nativecomp only compiles eq for Lisp code, the one discussed here is the eq used in C (and bytecode). BTW ATM nativecomp generates code with the same layout of the eq we had in C till my last change of few weeks ago. When eq will be stable in C I guess I'll replicate the layout for generated code for Lisp as well. Andrea ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Improving EQ 2024-12-12 10:50 ` Andrea Corallo @ 2024-12-12 11:21 ` Óscar Fuentes 2024-12-13 12:24 ` Pip Cet via Emacs development discussions. 2024-12-12 17:05 ` Pip Cet via Emacs development discussions. 2024-12-12 18:10 ` John ff 2 siblings, 1 reply; 25+ messages in thread From: Óscar Fuentes @ 2024-12-12 11:21 UTC (permalink / raw) To: emacs-devel Andrea Corallo <acorallo@gnu.org> writes: >> But reading your message makes me wonder if EQ and some other "simple" >> fundamental functions are not lowered by nativecomp? If not, maybe >> that's a significant opportunity for improvement. > > Nativecomp only compiles eq for Lisp code, the one discussed here is the > eq used in C (and bytecode). Ok, thanks. Of course this change also affects Emacs running with nativecomp, as many calls to EQ are made by C functions not lowered by nativecomp. My guess is that nativecomp's performance would benefit quite a bit from the general approach of this patch, as every point where nativecomp calls C is a pessimization spot, but that's another topic. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Improving EQ 2024-12-12 11:21 ` Óscar Fuentes @ 2024-12-13 12:24 ` Pip Cet via Emacs development discussions. 0 siblings, 0 replies; 25+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-13 12:24 UTC (permalink / raw) To: Óscar Fuentes; +Cc: emacs-devel Óscar Fuentes <ofv@wanadoo.es> writes: > Andrea Corallo <acorallo@gnu.org> writes: > >>> But reading your message makes me wonder if EQ and some other "simple" >>> fundamental functions are not lowered by nativecomp? If not, maybe >>> that's a significant opportunity for improvement. >> >> Nativecomp only compiles eq for Lisp code, the one discussed here is the >> eq used in C (and bytecode). Does nativecomp actually call emit_EQ for anything but lowering ELC jump tables into a sequence of conditional branches? I don't see any code to do so, and Emacs builds fine without emit_EQ if byte-compile-cond-use-jump-table is disabled. > Of course this change also affects Emacs running with nativecomp, as > many calls to EQ are made by C functions not lowered by nativecomp. My impression is that nativecomp usually ends up calling Feq for eq-based conditions. My point is that emit_EQ is used very rarely, when emitting a switch statement (and switch statements should usually use maybe_remove_pos_from_symbol + BASE_EQ rather than EQ). So I won't bother doing anything with emit_EQ. Pip ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Improving EQ 2024-12-12 10:50 ` Andrea Corallo 2024-12-12 11:21 ` Óscar Fuentes @ 2024-12-12 17:05 ` Pip Cet via Emacs development discussions. 2024-12-12 18:10 ` John ff 2 siblings, 0 replies; 25+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-12 17:05 UTC (permalink / raw) To: Andrea Corallo; +Cc: Óscar Fuentes, emacs-devel "Andrea Corallo" <acorallo@gnu.org> writes: > Óscar Fuentes <ofv@wanadoo.es> writes: >> I've seen too many cases where *removing* instructions (mind you, >> literally removing, not changing!) made the code significantly slower. Yes, there are many ways in which that can happen. Removing prefetches or branch hints is the most obvious example, but I don't claim to know all the ways, and ultimately if an expected performance improvement does not materialize we might have to decide this one on code size reasons alone (of course, if performance is drastically worse, we shouldn't apply the patch). >> Modern CPUs are insanely complex and combined with compilers make >> intuition-based predictions even more futile. The compiler isn't the issue here, since I checked the assembly code that was generated. Totally agree about CPUs. For example, moving code out of line will change many conditional branch locations to a single one (the one in the out-of-line function), which may help or hurt branch prediction, and that's just one of many ways in which inline functions often lose. So we should also benchmark whether this might be one of those cases, in which case we'd want to move all of EQ to a non-inlined function... > That's why the patch needs to be benchmarked anyway. Absolute agreement there. I tried some initial benchmarks and it's lost in the noise, but that was while running on battery on a laptop, and I need to test on a machine with a proper fixed clock rate. >> But reading your message makes me wonder if EQ and some other "simple" >> fundamental functions are not lowered by nativecomp? If not, maybe >> that's a significant opportunity for improvement. > > Nativecomp only compiles eq for Lisp code, the one discussed here is the > eq used in C (and bytecode). > > BTW ATM nativecomp generates code with the same layout of the eq we had > in C till my last change of few weeks ago. When eq will be stable in C > I guess I'll replicate the layout for generated code for Lisp as well. Thanks, that would be great. Yes, it makes most sense to test with and without nativecomp, expecting improvement to be more significant in the latter case (but as EQ is used by C code used by native-compiled Lisp, I expect a small improvement there, too). Pip ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Improving EQ 2024-12-12 10:50 ` Andrea Corallo 2024-12-12 11:21 ` Óscar Fuentes 2024-12-12 17:05 ` Pip Cet via Emacs development discussions. @ 2024-12-12 18:10 ` John ff 2 siblings, 0 replies; 25+ messages in thread From: John ff @ 2024-12-12 18:10 UTC (permalink / raw) To: Andrea Corallo; +Cc: Óscar Fuentes, emacs-devel, Pip Cet [-- Attachment #1: Type: text/plain, Size: 2352 bytes --] -------- Original Message -------- From: Andrea Corallo <acorallo@gnu.org> Sent: Thu Dec 12 10:50:04 GMT 2024 To: "Óscar Fuentes" <ofv@wanadoo.es> Cc: emacs-devel@gnu.org, Pip Cet <pipcet@protonmail.com> Subject: Re: Improving EQ Óscar Fuentes <ofv@wanadoo.es> writes: > Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> > writes: > >> I looked at the "new" code generated for our EQ macro, and decided that >> a fix was in order. I'm therefore sending a first proposal to explain >> what I think should be done, and some numbers. >> >> This patch: >> * moves the "slow path" of EQ into a NO_INLINE function >> * exits early if the arguments to EQ are actually BASE_EQ >> * returns quickly (after a single memory access which cannot be avoided >> until we fix our tagging scheme to distinguish exotic objects from >> ordinary ones) when symbols_with_pos_enabled isn't true. >> >> The effect on the code size of the stripped emacs binary is small, but >> significant: 8906336 bytes instead of 8955488 bytes on this machine. >> (The effect on the code size of the emacs binary with debugging >> information is much larger, reducing it from 32182000 bytes to 31125832 >> bytes on this system.) There is no effect on the size of the .pdmp >> file, which is expected. >> >> What's missing here is a benchmark, but unless there's a really nasty >> surprise when that happens, I'm quite confident that we can improve the >> code here. > > I've seen too many cases where *removing* instructions (mind you, > literally removing, not changing!) made the code significantly slower. > > Modern CPUs are insanely complex and combined with compilers make > intuition-based predictions even more futile. That's why the patch needs to be benchmarked anyway. > But reading your message makes me wonder if EQ and some other "simple" > fundamental functions are not lowered by nativecomp? If not, maybe > that's a significant opportunity for improvement. Nativecomp only compiles eq for Lisp code, the one discussed here is the eq used in C (and bytecode). BTW ATM nativecomp generates code with the same layout of the eq we had in C till my last change of few weeks ago. When eq will be stable in C I guess I'll replicate the layout for generated code for Lisp as well. Andrea [-- Attachment #2: Type: text/html, Size: 3135 bytes --] ^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2024-12-14 20:20 UTC | newest] Thread overview: 25+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-12-11 22:37 Improving EQ Pip Cet via Emacs development discussions. 2024-12-12 6:36 ` Eli Zaretskii 2024-12-12 8:23 ` Andrea Corallo 2024-12-12 8:36 ` Pip Cet via Emacs development discussions. 2024-12-12 9:18 ` Eli Zaretskii 2024-12-12 9:35 ` Visuwesh 2024-12-12 10:40 ` Andrea Corallo 2024-12-12 17:46 ` Pip Cet via Emacs development discussions. 2024-12-12 19:09 ` Eli Zaretskii 2024-12-12 10:53 ` New "make benchmark" target Stefan Kangas 2024-12-12 10:59 ` Andrea Corallo 2024-12-12 16:53 ` Pip Cet via Emacs development discussions. 2024-12-13 0:49 ` Stefan Kangas 2024-12-13 7:37 ` Andrea Corallo 2024-12-14 12:00 ` Stefan Kangas 2024-12-14 14:06 ` Stefan Monnier 2024-12-14 11:34 ` Pip Cet via Emacs development discussions. 2024-12-14 11:58 ` Stefan Kangas [not found] ` <875xnmf2qp.fsf@protonmail.com> 2024-12-14 20:20 ` João Távora 2024-12-12 10:42 ` Improving EQ Óscar Fuentes 2024-12-12 10:50 ` Andrea Corallo 2024-12-12 11:21 ` Óscar Fuentes 2024-12-13 12:24 ` Pip Cet via Emacs development discussions. 2024-12-12 17:05 ` Pip Cet via Emacs development discussions. 2024-12-12 18:10 ` John ff
Code repositories for project(s) associated with this public inbox https://git.savannah.gnu.org/cgit/emacs.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).