* Improving EQ
@ 2024-12-11 22:37 Pip Cet via Emacs development discussions.
  2024-12-12  6:36 ` Eli Zaretskii
  2024-12-12 10:42 ` Improving EQ Óscar Fuentes
  0 siblings, 2 replies; 58+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-11 22:37 UTC (permalink / raw)
  To: emacs-devel

I looked at the "new" code generated for our EQ macro, and decided that
a fix was in order.

I'm therefore sending a first proposal to explain what I think should be
done, and some numbers.

This patch:
* moves the "slow path" of EQ into a NO_INLINE function
* exits early if the arguments to EQ are actually BASE_EQ
* returns quickly (after a single memory access which cannot be avoided
  until we fix our tagging scheme to distinguish exotic objects from
  ordinary ones) when symbols_with_pos_enabled isn't true.

The effect on the code size of the stripped emacs binary is small, but
significant: 8906336 bytes instead of 8955488 bytes on this machine.

(The effect on the code size of the emacs binary with debugging
information is much larger, reducing it from 32182000 bytes to 31125832
bytes on this system.)

There is no effect on the size of the .pdmp file, which is expected.

What's missing here is a benchmark, but unless there's a really nasty
surprise when that happens, I'm quite confident that we can improve the
code here.

The proposed code doesn't use __builtin_expect anymore. I've
deliberately written slow_eq so it returns the same value as EQ, even if
the slow code path is disabled.

Pip

commit 2c807f7320bcb9654e0f148d64c92053b1a47b42 (HEAD -> faster-eq)
Author: Pip Cet <pipcet@protonmail.com>
Date:   Wed Dec 11 22:31:07 2024 +0000

    Change EQ to move slow code path into a separate function

    * src/data.c (slow_eq): New function.
    * src/lisp.h (EQ): Call it.

diff --git a/src/data.c b/src/data.c
index 66cf34c1e60..5ee383d2f48 100644
--- a/src/data.c
+++ b/src/data.c
@@ -162,6 +162,15 @@ circular_list (Lisp_Object list)
 \f
 /* Data type predicates. */
 
+/* NO_INLINE to avoid excessive code growth when LTO is in use. */
+NO_INLINE bool slow_eq (Lisp_Object x, Lisp_Object y)
+{
+  return BASE_EQ ((symbols_with_pos_enabled && SYMBOL_WITH_POS_P (x) ?
+                   XSYMBOL_WITH_POS_SYM (x) : x),
+                  (symbols_with_pos_enabled && SYMBOL_WITH_POS_P (y) ?
+                   XSYMBOL_WITH_POS_SYM (y) : y));
+}
+
 DEFUN ("eq", Feq, Seq, 2, 2, 0,
        doc: /* Return t if the two args are the same Lisp object. */
        attributes: const)
diff --git a/src/lisp.h b/src/lisp.h
index 832a1755c04..64d4835a499 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -618,6 +618,7 @@ #define ENUM_BF(TYPE) enum TYPE
 extern Lisp_Object default_value (Lisp_Object symbol);
 extern void defalias (Lisp_Object symbol, Lisp_Object definition);
 extern char *fixnum_to_string (EMACS_INT number, char *buffer, char *end);
+extern bool slow_eq (Lisp_Object x, Lisp_Object y);
 
 /* Defined in emacs.c. */
 
@@ -1353,10 +1354,12 @@ make_fixed_natnum (EMACS_INT n)
 INLINE bool
 EQ (Lisp_Object x, Lisp_Object y)
 {
-  return BASE_EQ ((__builtin_expect (symbols_with_pos_enabled, false)
-                   && SYMBOL_WITH_POS_P (x) ? XSYMBOL_WITH_POS_SYM (x) : x),
-                  (__builtin_expect (symbols_with_pos_enabled, false)
-                   && SYMBOL_WITH_POS_P (y) ? XSYMBOL_WITH_POS_SYM (y) : y));
+  if (BASE_EQ (x, y))
+    return true;
+  else if (!symbols_with_pos_enabled)
+    return false;
+  else
+    return slow_eq (x, y);
 }
 
 INLINE intmax_t

^ permalink raw reply related [flat|nested] 58+ messages in thread
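[Illustration] The shape of the patch above (inline fast path, one flag
load, out-of-line slow path) can be tried outside Emacs with a toy model.
This is only a sketch under stated assumptions: toy_obj, TOY_WRAPPED_BIT,
toy_wrappers_enabled and every toy_* function are made-up stand-ins, not
the real Lisp_Object, SYMBOL_WITH_POS_P or symbols_with_pos_enabled. A
"wrapped" object is simply a word with its low bit set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins only; none of this is the real lisp.h.  */
typedef uintptr_t toy_obj;
#define TOY_WRAPPED_BIT ((toy_obj) 1)   /* hypothetical "symbol with position" mark */

#if defined (__GNUC__)
# define TOY_NO_INLINE __attribute__ ((noinline))
#else
# define TOY_NO_INLINE
#endif

/* Plays the role of symbols_with_pos_enabled.  */
static bool toy_wrappers_enabled;

/* Out-of-line slow path: unwrap each argument if wrappers are enabled,
   then compare.  Like slow_eq in the patch, it gives the same answer as
   the inline function even when the flag is off.  */
static TOY_NO_INLINE bool
toy_slow_eq (toy_obj x, toy_obj y)
{
  toy_obj a = (toy_wrappers_enabled && (x & TOY_WRAPPED_BIT))
              ? (x & ~TOY_WRAPPED_BIT) : x;
  toy_obj b = (toy_wrappers_enabled && (y & TOY_WRAPPED_BIT))
              ? (y & ~TOY_WRAPPED_BIT) : y;
  return a == b;
}

/* Inline fast path: identical words are equal; otherwise a single flag
   load decides whether the slow path can matter at all.  */
static inline bool
toy_eq (toy_obj x, toy_obj y)
{
  if (x == y)
    return true;
  else if (!toy_wrappers_enabled)
    return false;
  else
    return toy_slow_eq (x, y);
}

/* Check that fast and slow paths agree for both flag settings.  */
void
toy_check_agreement (void)
{
  const toy_obj samples[] = { 8, 9, 16, 17 };  /* 9 wraps 8, 17 wraps 16 */
  const size_t n = sizeof samples / sizeof samples[0];
  for (int flag = 0; flag < 2; flag++)
    {
      toy_wrappers_enabled = (flag != 0);
      for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
          assert (toy_eq (samples[i], samples[j])
                  == toy_slow_eq (samples[i], samples[j]));
    }
}
```

Calling toy_check_agreement () from a main function exercises both flag
settings. The toy says nothing about code size or speed; those still need
the measurements asked for in the message above.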
* Re: Improving EQ 2024-12-11 22:37 Improving EQ Pip Cet via Emacs development discussions. @ 2024-12-12 6:36 ` Eli Zaretskii 2024-12-12 8:23 ` Andrea Corallo 2024-12-12 8:36 ` Pip Cet via Emacs development discussions. 2024-12-12 10:42 ` Improving EQ Óscar Fuentes 1 sibling, 2 replies; 58+ messages in thread From: Eli Zaretskii @ 2024-12-12 6:36 UTC (permalink / raw) To: Pip Cet, Mattias Engdegård, Paul Eggert; +Cc: emacs-devel > Date: Wed, 11 Dec 2024 22:37:04 +0000 > From: Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> > > What's missing here is a benchmark, but unless there's a really nasty > surprise when that happens, I'm quite confident that we can improve the > code here. The usual easy benchmark is to byte-compile all the *.el files in the source tree. That is, remove all the *.elc files, then say "make" and time that. There was also some Emacs benchmark suite that someone posted, but I cannot find it now, maybe someone else will. Adding Mattias and Paul to this discussion. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: Improving EQ 2024-12-12 6:36 ` Eli Zaretskii @ 2024-12-12 8:23 ` Andrea Corallo 2024-12-12 8:36 ` Pip Cet via Emacs development discussions. 1 sibling, 0 replies; 58+ messages in thread From: Andrea Corallo @ 2024-12-12 8:23 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Pip Cet, Mattias Engdegård, Paul Eggert, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: >> Date: Wed, 11 Dec 2024 22:37:04 +0000 >> From: Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> >> >> What's missing here is a benchmark, but unless there's a really nasty >> surprise when that happens, I'm quite confident that we can improve the >> code here. > > The usual easy benchmark is to byte-compile all the *.el files in the > source tree. That is, remove all the *.elc files, then say "make" and > time that. Agree, considering that tests the non zero 'symbols_with_pos_enabled' case. > There was also some Emacs benchmark suite that someone posted, but I > cannot find it now, maybe someone else will. <https://elpa.gnu.org/packages/elisp-benchmarks.html> ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: Improving EQ 2024-12-12 6:36 ` Eli Zaretskii 2024-12-12 8:23 ` Andrea Corallo @ 2024-12-12 8:36 ` Pip Cet via Emacs development discussions. 2024-12-12 9:18 ` Eli Zaretskii ` (3 more replies) 1 sibling, 4 replies; 58+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-12 8:36 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Mattias Engdegård, Paul Eggert, emacs-devel "Eli Zaretskii" <eliz@gnu.org> writes: >> Date: Wed, 11 Dec 2024 22:37:04 +0000 >> From: Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> >> >> What's missing here is a benchmark, but unless there's a really nasty >> surprise when that happens, I'm quite confident that we can improve the >> code here. > > The usual easy benchmark is to byte-compile all the *.el files in the > source tree. That is, remove all the *.elc files, then say "make" and > time that. Considering the point of the optimization was to make compilation (when symbols_with_pos_enabled is true) slower, but speed up non-compilation use cases, I think that may be the opposite of what we want :-) Furthermore, the master branch doesn't currently build after deleting all the *.elc files, because recompilation exceeds max-lisp-eval-depth in that scenario (together with the known purespace issue, this pretty much means "make bootstrap" is the only way I can rebuild an emacs tree right now. It'd be great if Someone could look into this, but I've failed to understand the native-compilation code (and been told off for trying to) too often for that Someone to be me. Plus, of course, I fully understand that native compilation currently has wrong code generation bugs which obviously have to take priority over build issues...) > There was also some Emacs benchmark suite that someone posted, but I > cannot find it now, maybe someone else will. https://elpa.gnu.org/packages/elisp-benchmarks.html ? 
It'd be great if we could agree on a benchmark, and even better if there were a way to reliably run it from emacs -Q :-) In fact, I would suggest to move a reduced benchmark suite to the emacs repo itself, and run it using "make benchmark". Also, just to let everyone know, I'm planning to make the "exotic" property (this object must or can use the slow_eq path) part (probably the LSB) of the tag rather than accessing it via a global variable and the PVEC type. This should reduce code size further, should speed up things, and has some other advantages which I'll go into when I have working code. Pip ^ permalink raw reply [flat|nested] 58+ messages in thread
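[Illustration] One possible reading of the "exotic bit in the tag" plan
mentioned above, again as a toy model rather than anything from the Emacs
sources. The layout is an assumption made up for this sketch: the low bit
of the word marks an object that may need the slow path, and toy_word,
TOY_EXOTIC, toy_pos_enabled and the toy_* functions are hypothetical
names. The point is only that the fast path can then reject ordinary
objects by looking at the two words, without loading a global.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: the low bit says "this object may need the slow path".  */
typedef uintptr_t toy_word;
#define TOY_EXOTIC ((toy_word) 1)

/* Stand-in for the global flag; only the slow path reads it here.  */
static bool toy_pos_enabled;

/* Slow path: with the flag on, ignore the exotic bit when comparing.  */
static bool
toy_exotic_eq (toy_word x, toy_word y)
{
  if (!toy_pos_enabled)
    return x == y;
  return (x & ~TOY_EXOTIC) == (y & ~TOY_EXOTIC);
}

/* Fast path: if neither word carries the exotic bit, two different words
   can never be equal, so no memory access is needed to say "false".  */
static inline bool
toy_tagged_eq (toy_word x, toy_word y)
{
  if (x == y)
    return true;
  if (!((x | y) & TOY_EXOTIC))
    return false;
  return toy_exotic_eq (x, y);
}
```

The test uses (x | y) rather than (x & y) because in this model a wrapped
object compares equal to its bare counterpart, so one exotic argument is
enough to require the slow path. Whether the real tagging change would
look anything like this is not stated in the thread.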
* Re: Improving EQ 2024-12-12 8:36 ` Pip Cet via Emacs development discussions. @ 2024-12-12 9:18 ` Eli Zaretskii 2024-12-12 9:35 ` Visuwesh ` (2 subsequent siblings) 3 siblings, 0 replies; 58+ messages in thread From: Eli Zaretskii @ 2024-12-12 9:18 UTC (permalink / raw) To: Pip Cet; +Cc: mattiase, eggert, emacs-devel > Date: Thu, 12 Dec 2024 08:36:50 +0000 > From: Pip Cet <pipcet@protonmail.com> > Cc: Mattias Engdegård <mattiase@acm.org>, Paul Eggert <eggert@cs.ucla.edu>, emacs-devel@gnu.org > > "Eli Zaretskii" <eliz@gnu.org> writes: > > > The usual easy benchmark is to byte-compile all the *.el files in the > > source tree. That is, remove all the *.elc files, then say "make" and > > time that. > > Considering the point of the optimization was to make compilation (when > symbols_with_pos_enabled is true) slower, but speed up non-compilation > use cases, I think that may be the opposite of what we want :-) That's fine, because knowing where this slows us down and by how much is also important. > Furthermore, the master branch doesn't currently build after deleting > all the *.elc files, because recompilation exceeds max-lisp-eval-depth > in that scenario (together with the known purespace issue, this pretty > much means "make bootstrap" is the only way I can rebuild an emacs tree > right now. It'd be great if Someone could look into this, but I've > failed to understand the native-compilation code (and been told off for > trying to) too often for that Someone to be me. Plus, of course, I fully > understand that native compilation currently has wrong code generation > bugs which obviously have to take priority over build issues...) If this is with native-compilation, how about trying without? Also, enlarging max-lisp-eval-depth (assuming you don't somehow hit infinite recursion) locally should be easy: just add that to the relevant Makefiles. > https://elpa.gnu.org/packages/elisp-benchmarks.html ? 
It'd be great if > we could agree on a benchmark, and even better if there were a way to > reliably run it from emacs -Q :-) Our benchmark facilities are very rudimentary, so agreement is not an issue: we just use whatever is available. > In fact, I would suggest to move a reduced benchmark suite to the emacs > repo itself, and run it using "make benchmark". Working on a better benchmark is very useful, but maybe we should try solving one problem at a time? > Also, just to let everyone know, I'm planning to make the "exotic" > property (this object must or can use the slow_eq path) part (probably > the LSB) of the tag rather than accessing it via a global variable and > the PVEC type. This should reduce code size further, should speed up > things, and has some other advantages which I'll go into when I have > working code. Whenever you change something in the tags, please remember to update .gdbinit, otherwise we lose debugging support. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: Improving EQ 2024-12-12 8:36 ` Pip Cet via Emacs development discussions. 2024-12-12 9:18 ` Eli Zaretskii @ 2024-12-12 9:35 ` Visuwesh 2024-12-12 10:40 ` Andrea Corallo 2024-12-12 10:53 ` New "make benchmark" target Stefan Kangas 3 siblings, 0 replies; 58+ messages in thread From: Visuwesh @ 2024-12-12 9:35 UTC (permalink / raw) To: Pip Cet via Emacs development discussions. Cc: Eli Zaretskii, Pip Cet, Mattias Engdegård, Paul Eggert [வியாழன் டிசம்பர் 12, 2024] Pip Cet via "Emacs development discussions." wrote: >> There was also some Emacs benchmark suite that someone posted, but I >> cannot find it now, maybe someone else will. > > https://elpa.gnu.org/packages/elisp-benchmarks.html ? It'd be great if > we could agree on a benchmark, and even better if there were a way to > reliably run it from emacs -Q :-) Will the command package-isolate help in this scenario? ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: Improving EQ 2024-12-12 8:36 ` Pip Cet via Emacs development discussions. 2024-12-12 9:18 ` Eli Zaretskii 2024-12-12 9:35 ` Visuwesh @ 2024-12-12 10:40 ` Andrea Corallo 2024-12-12 17:46 ` Pip Cet via Emacs development discussions. 2024-12-12 10:53 ` New "make benchmark" target Stefan Kangas 3 siblings, 1 reply; 58+ messages in thread From: Andrea Corallo @ 2024-12-12 10:40 UTC (permalink / raw) To: Pip Cet via Emacs development discussions. Cc: Eli Zaretskii, Pip Cet, Mattias Engdegård, Paul Eggert Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> writes: > "Eli Zaretskii" <eliz@gnu.org> writes: > >>> Date: Wed, 11 Dec 2024 22:37:04 +0000 >>> From: Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> >>> >>> What's missing here is a benchmark, but unless there's a really nasty >>> surprise when that happens, I'm quite confident that we can improve the >>> code here. >> >> The usual easy benchmark is to byte-compile all the *.el files in the >> source tree. That is, remove all the *.elc files, then say "make" and >> time that. > > Considering the point of the optimization was to make compilation (when > symbols_with_pos_enabled is true) slower, but speed up non-compilation > use cases, I think that may be the opposite of what we want :-) Glad you finally agree on the goal of the optimization. > Furthermore, the master branch doesn't currently build after deleting > all the *.elc files, because recompilation exceeds max-lisp-eval-depth > in that scenario (together with the known purespace issue, this pretty > much means "make bootstrap" is the only way I can rebuild an emacs tree > right now. It'd be great if Someone could look into this, but I've > failed to understand the native-compilation code (and been told off for > trying to) too often for that Someone to be me. Plus, of course, I fully > understand that native compilation currently has wrong code generation > bugs which obviously have to take priority over build issues...) 
> >> There was also some Emacs benchmark suite that someone posted, but I >> cannot find it now, maybe someone else will. > > https://elpa.gnu.org/packages/elisp-benchmarks.html ? It'd be great if > we could agree on a benchmark, and even better if there were a way to > reliably run it from emacs -Q :-) What is not reliable in the elisp-benchmarks invocation suggested in the instructions in it? > In fact, I would suggest to move a reduced benchmark suite to the emacs > repo itself, and run it using "make benchmark". That would be nice. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: Improving EQ 2024-12-12 10:40 ` Andrea Corallo @ 2024-12-12 17:46 ` Pip Cet via Emacs development discussions. 2024-12-12 19:09 ` Eli Zaretskii 0 siblings, 1 reply; 58+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-12 17:46 UTC (permalink / raw) To: Andrea Corallo Cc: Pip Cet via "Emacs development discussions.", Eli Zaretskii, Mattias Engdegård, Paul Eggert "Andrea Corallo" <acorallo@gnu.org> writes: > Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> > writes: > >> "Eli Zaretskii" <eliz@gnu.org> writes: >> >>>> Date: Wed, 11 Dec 2024 22:37:04 +0000 >>>> From: Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> >>>> >>>> What's missing here is a benchmark, but unless there's a really nasty >>>> surprise when that happens, I'm quite confident that we can improve the >>>> code here. >>> >>> The usual easy benchmark is to byte-compile all the *.el files in the >>> source tree. That is, remove all the *.elc files, then say "make" and >>> time that. >> >> Considering the point of the optimization was to make compilation (when >> symbols_with_pos_enabled is true) slower, but speed up non-compilation >> use cases, I think that may be the opposite of what we want :-) > > Glad you finally agree on the goal of the optimization. I find that statement to be offensive, because you know it to be factually incorrect; but even if it weren't, gloating like that is extremely poor form for a maintainer. Pip ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: Improving EQ 2024-12-12 17:46 ` Pip Cet via Emacs development discussions. @ 2024-12-12 19:09 ` Eli Zaretskii 0 siblings, 0 replies; 58+ messages in thread From: Eli Zaretskii @ 2024-12-12 19:09 UTC (permalink / raw) To: Pip Cet; +Cc: acorallo, emacs-devel, mattiase, eggert > Date: Thu, 12 Dec 2024 17:46:55 +0000 > From: Pip Cet <pipcet@protonmail.com> > Cc: "Pip Cet via \"Emacs development discussions.\"" <emacs-devel@gnu.org>, Eli Zaretskii <eliz@gnu.org>, Mattias Engdegård <mattiase@acm.org>, Paul Eggert <eggert@cs.ucla.edu> > > "Andrea Corallo" <acorallo@gnu.org> writes: > > >> Considering the point of the optimization was to make compilation (when > >> symbols_with_pos_enabled is true) slower, but speed up non-compilation > >> use cases, I think that may be the opposite of what we want :-) > > > > Glad you finally agree on the goal of the optimization. > > I find that statement to be offensive, because you know it to be > factually incorrect; but even if it weren't, gloating like that is > extremely poor form for a maintainer. I'm quite sure Andrea meant it as tongue-in-cheek, nowhere near gloating. Please keep in mind that for Andrea and others here, English is not their first language, and their choice of words could reflect that. ^ permalink raw reply [flat|nested] 58+ messages in thread
* New "make benchmark" target 2024-12-12 8:36 ` Pip Cet via Emacs development discussions. ` (2 preceding siblings ...) 2024-12-12 10:40 ` Andrea Corallo @ 2024-12-12 10:53 ` Stefan Kangas 2024-12-12 10:59 ` Andrea Corallo 3 siblings, 1 reply; 58+ messages in thread From: Stefan Kangas @ 2024-12-12 10:53 UTC (permalink / raw) To: Pip Cet, Eli Zaretskii Cc: Mattias Engdegård, Paul Eggert, emacs-devel, Andrea Corallo Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> writes: > https://elpa.gnu.org/packages/elisp-benchmarks.html ? It'd be great if > we could agree on a benchmark, and even better if there were a way to > reliably run it from emacs -Q :-) > > In fact, I would suggest to move a reduced benchmark suite to the emacs > repo itself, and run it using "make benchmark". SGTM, but why a reduced suite and not just the whole thing? ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2024-12-12 10:53 ` New "make benchmark" target Stefan Kangas @ 2024-12-12 10:59 ` Andrea Corallo 2024-12-12 16:53 ` Pip Cet via Emacs development discussions. 0 siblings, 1 reply; 58+ messages in thread From: Andrea Corallo @ 2024-12-12 10:59 UTC (permalink / raw) To: Stefan Kangas Cc: Pip Cet, Eli Zaretskii, Mattias Engdegård, Paul Eggert, emacs-devel Stefan Kangas <stefankangas@gmail.com> writes: > Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> > writes: > >> https://elpa.gnu.org/packages/elisp-benchmarks.html ? It'd be great if >> we could agree on a benchmark, and even better if there were a way to >> reliably run it from emacs -Q :-) >> >> In fact, I would suggest to move a reduced benchmark suite to the emacs >> repo itself, and run it using "make benchmark". > > SGTM, but why a reduced suite and not just the whole thing? My fear is that if we start going into the rabbit hole of which benchmark of elisp-benchmarks should or should not be included, we will never agree and as a consequence succeed. So I guess I'd favor as well including all elisp-benchmarks. Andrea ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2024-12-12 10:59 ` Andrea Corallo @ 2024-12-12 16:53 ` Pip Cet via Emacs development discussions. 2024-12-13 0:49 ` Stefan Kangas 0 siblings, 1 reply; 58+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-12 16:53 UTC (permalink / raw) To: Andrea Corallo Cc: Stefan Kangas, Eli Zaretskii, Mattias Engdegård, Paul Eggert, emacs-devel "Andrea Corallo" <acorallo@gnu.org> writes: > Stefan Kangas <stefankangas@gmail.com> writes: > >> Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org> >> writes: >> >>> https://elpa.gnu.org/packages/elisp-benchmarks.html ? It'd be great if >>> we could agree on a benchmark, and even better if there were a way to >>> reliably run it from emacs -Q :-) >>> >>> In fact, I would suggest to move a reduced benchmark suite to the emacs >>> repo itself, and run it using "make benchmark". >> >> SGTM, but why a reduced suite and not just the whole thing? > > My fear is that if we start going into the rabbit hole of which > benchmark of elisp-benchmarks should or should not be included, we will > never agree and as a consequence succeed. So I guess I'd favor as well > including all elisp-benchmarks. I agree a full benchmark suite would be even better, I don't recall why I typed "reduced" there. So let's do that? Pip ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2024-12-12 16:53 ` Pip Cet via Emacs development discussions. @ 2024-12-13 0:49 ` Stefan Kangas 2024-12-13 7:37 ` Andrea Corallo 2024-12-14 11:34 ` Pip Cet via Emacs development discussions. 0 siblings, 2 replies; 58+ messages in thread From: Stefan Kangas @ 2024-12-13 0:49 UTC (permalink / raw) To: Pip Cet, Andrea Corallo Cc: Eli Zaretskii, Mattias Engdegård, Paul Eggert, emacs-devel Pip Cet <pipcet@protonmail.com> writes: > I agree a full benchmark suite would be even better, I don't recall why > I typed "reduced" there. So let's do that? Please go ahead, thanks. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2024-12-13 0:49 ` Stefan Kangas @ 2024-12-13 7:37 ` Andrea Corallo 2024-12-14 12:00 ` Stefan Kangas 2024-12-14 11:34 ` Pip Cet via Emacs development discussions. 1 sibling, 1 reply; 58+ messages in thread From: Andrea Corallo @ 2024-12-13 7:37 UTC (permalink / raw) To: Stefan Kangas Cc: Pip Cet, Eli Zaretskii, Mattias Engdegård, Paul Eggert, emacs-devel Stefan Kangas <stefankangas@gmail.com> writes: > Pip Cet <pipcet@protonmail.com> writes: > >> I agree a full benchmark suite would be even better, I don't recall why >> I typed "reduced" there. So let's do that? > > Please go ahead, thanks. Asking as elisp-benchmark author/maintainer, the way to move an external package in core is to copy the files and keep them manually in sync or there are other ways? ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target
  2024-12-13  7:37 ` Andrea Corallo
@ 2024-12-14 12:00 ` Stefan Kangas
  2024-12-14 14:06 ` Stefan Monnier
  0 siblings, 1 reply; 58+ messages in thread
From: Stefan Kangas @ 2024-12-14 12:00 UTC (permalink / raw)
  To: Andrea Corallo
  Cc: Pip Cet, Eli Zaretskii, Mattias Engdegård, Paul Eggert, emacs-devel, Stefan Monnier

Andrea Corallo <acorallo@gnu.org> writes:

> Asking as elisp-benchmark author/maintainer, the way to move an external
> package in core is to copy the files and keep them manually in sync or
> there are other ways?

We can make it into a :core package, which means that you copy the file
to emacs.git, and when you update the "Version" header on Emacs master,
the GNU ELPA scripts make a release based on that commit.

Stefan Monnier (in CC) will know if there are any other adjustments that
are needed, but one thing that needs doing is a change to the GNU ELPA
`elpa-packages` file, something like this:

- (elisp-benchmarks :url nil)
+ (elisp-benchmarks :core ("lisp/emacs-lisp/elisp-benchmarks.el"))

^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target
  2024-12-14 12:00 ` Stefan Kangas
@ 2024-12-14 14:06 ` Stefan Monnier
  0 siblings, 0 replies; 58+ messages in thread
From: Stefan Monnier @ 2024-12-14 14:06 UTC (permalink / raw)
  To: Stefan Kangas
  Cc: Andrea Corallo, Pip Cet, Eli Zaretskii, Mattias Engdegård, Paul Eggert, emacs-devel

>> Asking as elisp-benchmark author/maintainer, the way to move an external
>> package in core is to copy the files and keep them manually in sync or
>> there are other ways?

Well, there's the all too famous "bundled ELPA packages" feature we
never finished, of course (see `git branch -a | grep elpa` for various
approaches Philip Lord proposed and that we never integrated 🙁).
[ Stepping down before installing such a feature is my biggest regret
w.r.t my time as Emacs maintainer. ]

> - (elisp-benchmarks :url nil)
> + (elisp-benchmarks :core ("lisp/emacs-lisp/elisp-benchmarks.el"))

I suspect the above would need to be completed with more files/dirs in
the list.

Stefan

^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target
  2024-12-13  0:49 ` Stefan Kangas
  2024-12-13  7:37 ` Andrea Corallo
@ 2024-12-14 11:34 ` Pip Cet via Emacs development discussions.
  2024-12-14 11:58 ` Stefan Kangas
  1 sibling, 1 reply; 58+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-14 11:34 UTC (permalink / raw)
  To: Stefan Kangas
  Cc: Andrea Corallo, Eli Zaretskii, Mattias Engdegård, Paul Eggert, emacs-devel

"Stefan Kangas" <stefankangas@gmail.com> writes:

> Pip Cet <pipcet@protonmail.com> writes:
>
>> I agree a full benchmark suite would be even better, I don't recall why
>> I typed "reduced" there. So let's do that?
>
> Please go ahead, thanks.

Here's my proposal:

Expand ERT to handle the benchmark case. Copy ALL benchmarks from
elisp-benchmarks, but use the ERT framework instead of
elisp-benchmarks.el. Keep things minimal for now, try it out, and add
complexity only if we get the impression this would make a useful
addition; otherwise, revert the changes and go back to using
elisp-benchmarks.el.

This is what would work:

1. add a :benchmark tag to ert tests
2. create a new directory test/benchmark/ for benchmarks
3. modify test/Makefile not to run benchmark tests by default
4. add make targets to run the benchmark tests only

I think the mechanism used by elisp-benchmarks.el to select tests is
very ad-hoc and less powerful than the ert tagging mechanism. It also
doesn't work for me: executing

  (progn
    (elisp-benchmarks-run "" t 1)
    (elisp-benchmarks-run "bubble" t 1))

means all tests are run twice, but I intended to run all tests once, as
a warm up, then run only the bubble test again. However, I suspect this
is merely a bug which can easily be fixed (maybe it's intentional,
though?).

I'm also seeing problems with a "Window size not as stipulated by the
benchmark" error message, but I'll have to investigate that one...

The mathematical part of elisp-benchmarks.el is questionable: it's built
around the idea of using an arithmetic average of several test runs
(which is equivalent to a single test run with more iterations); I
believe a median/percentile-based approach to selecting a "typical" run
would yield more useful numbers. So I'm not proposing to reuse the
avg-based code.

I tried to resist the temptation of making ert.el overly general; for
example, instead of defining a new defstruct to determine HOW tests are
run, I merely added a benchmark flag. Maybe we can revisit this if the
approach is adopted and the need for more detailed benchmark
specifications (inhibit GC? warmup? iteration counts? interact with
"perf"?) becomes apparent. However, I did fail and give in to the
temptation to allow an inhibit-gc mode specifier, which should probably
be removed again...

The main problems I see are that "make benchmark" starts emacs instances
for all files in test/, which takes a lot of time (but that's a general
ERT problem that should be fixed for pass-or-fail testing, too).

A minor problem is how to copy the elisp-benchmark tests and keep them
in sync. This would very much depend on how much work Andrea is willing
to do.

Finally, elisp-benchmarks has a very useful feature, somewhat hidden,
that this code lacks: while calculating the arithmetic average of
several benchmark runs isn't useful, calculating the standard deviation
from that average is, because it gives us an indication of how scattered
the results are; scattered test results (i.e. high numbers reported in
the "tot avg err" column) are a sufficient, but not a necessary,
condition for knowing when to discard the test results because something
unexpected happened (most likely system load issues or CPU clock games).

Benchmarking is, of course, very hard. I understand Paul Eggert is using
an ancient machine for benchmarking because it avoids many of the issues
that have arisen in the past decade.
With a modern machine, we have a heterogeneous set of CPU cores
(energy/performance), each of which can be configured in different ways
(energy-performance preference for each core) in addition to running at
a variable and unpredictable clock speed (cpufreq/boost); CPU caches are
now large enough to persist across benchmark runs, and the system memory
clock is also variable.

This is a very rough start which would allow us to detect many, but not
all, unexpected catastrophic performance reductions due to code changes.

Finally, if Someone is willing to work on this, we should look into
finding a set of benchmarks representative of "typical" Emacs usage so
we can use PGO when building Emacs. While I'd prefer doing neither of
the two, playing with PGO is a much more promising and maintainable
approach than adding __builtin_expect noise to our C source code.

Pip

From 4217a5b8f990760775709392b24e0205041cfed3 Mon Sep 17 00:00:00 2001
From: Pip Cet <pipcet@protonmail.com>
Date: Sat, 14 Dec 2024 10:45:42 +0000
Subject: [PATCH 1/3] Add a benchmark from elisp-benchmarks

DO NOT MERGE

FIXME BEFORE MERGING: Should we add a link to
https://git.savannah.gnu.org/gitweb/?p=emacs/elpa.git;a=history;\
f=benchmarks/bubble.el;h=d7101b1b99b60a3bd6945909d1f0125215f4ce1c;\
hb=refs/heads/externals/elisp-benchmarks
here? Losing git history because we copy a file from elpa to emacs seems
suboptimal...

* test/benchmark/bubble.el: New file.
---
 test/benchmark/bubble.el | 52 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 52 insertions(+)
 create mode 100644 test/benchmark/bubble.el

diff --git a/test/benchmark/bubble.el b/test/benchmark/bubble.el
new file mode 100644
index 00000000000..0c38cdbce39
--- /dev/null
+++ b/test/benchmark/bubble.el
@@ -0,0 +1,52 @@
+;; -*- lexical-binding: t; -*-
+
+;; Copyright (C) 2019 Free Software Foundation, Inc.
+
+;; This file is part of GNU Emacs.
+
+;; GNU Emacs is free software: you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; GNU Emacs is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GNU Emacs. If not, see <https://www.gnu.org/licenses/>.
+
+;;; Commentary:
+
+;; From:
+;; https://www.emacswiki.org/emacs/EmacsLispBenchmark
+
+(require 'ert)
+
+(require 'cl-lib)
+
+(defvar elb-bubble-len 1000)
+(defvar elb-bubble-list (mapcar #'random (make-list elb-bubble-len
+                                                    most-positive-fixnum)))
+
+(defun elb-bubble (list)
+  (let ((i (length list)))
+    (while (> i 1)
+      (let ((b list))
+        (while (cdr b)
+          (when (< (cadr b) (car b))
+            (setcar b (prog1 (cadr b)
+                        (setcdr b (cons (car b) (cddr b))))))
+          (setq b (cdr b))))
+      (setq i (1- i)))
+    list))
+
+(defun elb-bubble-entry ()
+  (cl-loop repeat 100
+           for l = (copy-sequence elb-bubble-list)
+           do (elb-bubble l)))
+
+(ert-deftest benchmark-bubble ()
+  :tags '(:benchmark)
+  (elb-bubble-entry))
-- 
2.47.0

From df31e19452dff0fe804af2fd3c73f4cee84b6d16 Mon Sep 17 00:00:00 2001
From: Pip Cet <pipcet@protonmail.com>
Date: Sat, 14 Dec 2024 10:56:19 +0000
Subject: [PATCH 2/3] Expand the ERT framework to allow for benchmarks

* lisp/emacs-lisp/ert.el (ert-test-result, ert--test-execution-info):
Expand structs to include "benchmark" field.
(ert--run-benchmark-test-internal): New function.
(ert-run-test, ert-run-or-rerun-test):
(ert-run-tests-batch-and-exit):
(ert-run-tests): Add benchmark argument.
(ert-run-tests-batch): Include GC information when running benchmarks.
(ert-summarize-tests-batch-and-exit): Handle 1.0e+INF.
---
 lisp/emacs-lisp/ert.el | 91 +++++++++++++++++++++++++++++++++---------
 1 file changed, 73 insertions(+), 18 deletions(-)

diff --git a/lisp/emacs-lisp/ert.el b/lisp/emacs-lisp/ert.el
index 97aa233f6e2..76365ed8152 100644
--- a/lisp/emacs-lisp/ert.el
+++ b/lisp/emacs-lisp/ert.el
@@ -698,6 +698,7 @@ ert-test-result
   (messages nil)
   (should-forms nil)
   (duration 0)
+  (benchmark nil)
   )
 (cl-defstruct (ert-test-passed (:include ert-test-result)))
 (cl-defstruct (ert-test-result-with-condition (:include ert-test-result))
@@ -723,7 +724,8 @@ ert--test-execution-info
   ;; execution of the current test. We store it to avoid being
   ;; affected by any new bindings the test itself may establish. (I
   ;; don't remember whether this feature is important.)
-  ert-debug-on-error)
+  ert-debug-on-error
+  benchmark)
 
 (defun ert--run-test-debugger (info condition debugfun)
   "Error handler used during the test run.
@@ -801,6 +803,39 @@ ert--run-test-internal
         (make-ert-test-passed))
   nil)
 
+(defun ert--run-benchmark-test-internal (test-execution-info benchmark)
+  (setf (ert--test-execution-info-ert-debug-on-error test-execution-info)
+        ert-debug-on-error)
+  (catch 'ert--pass
+    ;; For now, each test gets its own temp buffer and its own
+    ;; window excursion, just to be safe. If this turns out to be
+    ;; too expensive, we can remove it.
+    (with-temp-buffer
+      (save-window-excursion
+        (let ((lexical-binding t) ;;FIXME: Why?
+              (ert--infos '())
+              time)
+          (letrec ((debugfun (lambda (err)
+                               (ert--run-test-debugger test-execution-info
+                                                       err debugfun))))
+            (handler-bind (((error quit) debugfun))
+              (garbage-collect)
+              (let ((gc-cons-threshold (if (eq benchmark 'inhibit-gc)
+                                           most-positive-fixnum
+                                         gc-cons-threshold)))
+                (setq time (benchmark-run nil
+                             (funcall (ert-test-body (ert--test-execution-info-test
+                                                      test-execution-info))))))
+              (and (eq benchmark 'inhibit-gc)
+                   (not (= (cadr time) 0))
+                   (warn "failed to inhibit gc; time %S" time))
+              (setf (ert--test-execution-info-benchmark test-execution-info)
+                    time))))))
+    (ert-pass))
+  (setf (ert--test-execution-info-result test-execution-info)
+        (make-ert-test-passed))
+  nil)
+
 (defun ert--force-message-log-buffer-truncation ()
   "Immediately truncate *Messages* buffer according to `message-log-max'.
 
@@ -832,7 +867,7 @@ ert--running-tests
 
 The elements are of type `ert-test'.")
 
-(defun ert-run-test (ert-test)
+(defun ert-run-test (ert-test &optional benchmark)
   "Run ERT-TEST.
 
 Returns the result and stores it in ERT-TEST's `most-recent-result' slot."
@@ -855,8 +890,12 @@ ert-run-test (push form-description should-form-accu))) (message-log-max t) (ert--running-tests (cons ert-test ert--running-tests))) - (ert--run-test-internal info)) + (if benchmark + (ert--run-benchmark-test-internal info benchmark) + (ert--run-test-internal info))) (let ((result (ert--test-execution-info-result info))) + (setf (ert-test-result-benchmark result) + (ert--test-execution-info-benchmark info)) (setf (ert-test-result-messages result) (with-current-buffer (messages-buffer) (buffer-substring begin-marker (point-max)))) @@ -1206,7 +1245,7 @@ ert--make-stats :test-start-times (make-vector (length tests) nil) :test-end-times (make-vector (length tests) nil)))) -(defun ert-run-or-rerun-test (stats test listener) +(defun ert-run-or-rerun-test (stats test listener &optional benchmark) ;; checkdoc-order: nil "Run the single test TEST and record the result using STATS and LISTENER." (let ((ert--current-run-stats stats) @@ -1221,19 +1260,26 @@ ert-run-or-rerun-test (setf (ert-test-most-recent-result test) nil) (setf (aref (ert--stats-test-start-times stats) pos) (current-time)) (unwind-protect - (ert-run-test test) + (ert-run-test test benchmark) (setf (aref (ert--stats-test-end-times stats) pos) (current-time)) (let ((result (ert-test-most-recent-result test))) - (setf (ert-test-result-duration result) - (float-time - (time-subtract - (aref (ert--stats-test-end-times stats) pos) - (aref (ert--stats-test-start-times stats) pos)))) + (cond ((ert-test-result-benchmark result) + (setf (ert-test-result-duration result) + (if (memq benchmark '(no-gc inhibit-gc)) + (- (car (ert-test-result-benchmark result)) + (caddr (ert-test-result-benchmark result))) + (car (ert-test-result-benchmark result))))) + (t + (setf (ert-test-result-duration result) + (float-time + (time-subtract + (aref (ert--stats-test-end-times stats) pos) + (aref (ert--stats-test-start-times stats) pos)))))) (ert--stats-set-test-and-result stats pos test result) (funcall listener 
'test-ended stats test result)) (setf (ert--stats-current-test stats) nil)))) -(defun ert-run-tests (selector listener &optional interactively) +(defun ert-run-tests (selector listener &optional interactively benchmark) "Run the tests specified by SELECTOR, sending progress updates to LISTENER." (let* ((tests (ert-select-tests selector t)) (stats (ert--make-stats tests selector))) @@ -1245,7 +1291,7 @@ ert-run-tests (force-mode-line-update) (unwind-protect (cl-loop for test in tests do - (ert-run-or-rerun-test stats test listener) + (ert-run-or-rerun-test stats test listener benchmark) (when (and interactively (ert-test-quit-p (ert-test-most-recent-result test)) @@ -1367,7 +1413,7 @@ ert-batch-backtrace-right-margin "The maximum line length for printing backtraces in `ert-run-tests-batch'.") ;;;###autoload -(defun ert-run-tests-batch (&optional selector) +(defun ert-run-tests-batch (&optional selector benchmark) "Run the tests specified by SELECTOR, printing results to the terminal. SELECTOR selects which tests to run as described in `ert-select-tests' when @@ -1493,7 +1539,7 @@ ert-run-tests-batch (let* ((max (prin1-to-string (length (ert--stats-tests stats)))) (format-string (concat "%9s %" (prin1-to-string (length max)) - "s/" max " %S (%f sec)%s"))) + "s/" max " %S (%f sec%s)%s"))) (message format-string (ert-string-for-test-result result (ert-test-result-expected-p @@ -1501,13 +1547,19 @@ ert-run-tests-batch (1+ (ert--stats-test-pos stats test)) (ert-test-name test) (ert-test-result-duration result) + (if (ert-test-result-benchmark result) + (format ", %f sec in GC, %d GCs" + (caddr (ert-test-result-benchmark result)) + (cadr (ert-test-result-benchmark result))) + "") (if (ert-test-result-expected-p test result) "" (concat " " (ert-test-location test)))))))))) - nil)) + nil + benchmark)) ;;;###autoload -(defun ert-run-tests-batch-and-exit (&optional selector) +(defun ert-run-tests-batch-and-exit (&optional selector benchmark) "Like `ert-run-tests-batch', but 
exits Emacs when done. The exit status will be 0 if all test results were as expected, 1 @@ -1525,7 +1577,7 @@ ert-run-tests-batch-and-exit (setq attempt-stack-overflow-recovery nil attempt-orderly-shutdown-on-fatal-signal nil) (unwind-protect - (let ((stats (ert-run-tests-batch selector))) + (let ((stats (ert-run-tests-batch selector benchmark))) (when eln-dir (ignore-errors (delete-directory eln-dir t))) @@ -1726,7 +1778,10 @@ ert-summarize-tests-batch-and-exit If HIGH is a natural number, the HIGH long lasting tests are summarized." (or noninteractive (user-error "This function is only for use in batch mode")) - (or (natnump high) (setq high 0)) + (cond + ;; FIXME: ntake doesn't allow an infinity argument + ((eql high 1.0e+INF) (setq high most-positive-fixnum)) + ((not (natnump high)) (setq high 0))) ;; Better crash loudly than attempting to recover from undefined ;; behavior. (setq attempt-stack-overflow-recovery nil -- 2.47.0 From 8cd4053967a0aa6521039ba887c911daa13b0cf0 Mon Sep 17 00:00:00 2001 From: Pip Cet <pipcet@protonmail.com> Date: Sat, 14 Dec 2024 10:59:38 +0000 Subject: [PATCH 3/3] Add "make benchmark" rule * Makefile.in (benchmark): New recipe. * test/Makefile.in (SELECTOR_BENCHMARK): New selector. (SELECTOR_ALL, SELECTOR_EXPENSIVE, SELECTOR_DEFAULT): Modify selectors not to include benchmark tests. (check-benchmark): New recipe. --- Makefile.in | 7 +++++++ test/Makefile.in | 25 +++++++++++++++++-------- 2 files changed, 24 insertions(+), 8 deletions(-) diff --git a/Makefile.in b/Makefile.in index 30a762ed03b..13a55452da2 100644 --- a/Makefile.in +++ b/Makefile.in @@ -69,6 +69,9 @@ # check-expensive includes additional tests that can be slow. # check-all runs all tests, including ones that can be slow, or # fail unpredictably +# +# make benchmark +# Run the Emacs benchmark suite. 
SHELL = @SHELL@ @@ -1138,6 +1141,10 @@ $(CHECK_TARGETS): test/%: $(MAKE) -C test $* +BENCHMARK_TARGETS = benchmark +.PHONY: $(BENCHMARK_TARGETS) +$(BENCHMARK_TARGETS): all + $(MAKE) SUMMARIZE_TESTS="1.0e+INF" BENCHMARKP="t" -C test check-benchmark dist: cd ${srcdir}; ./make-dist diff --git a/test/Makefile.in b/test/Makefile.in index 7a3178546a1..18a478b3e6c 100644 --- a/test/Makefile.in +++ b/test/Makefile.in @@ -78,9 +78,9 @@ TEST_TIMEOUT = TEST_INTERACTIVE ?= no ifeq ($(TEST_INTERACTIVE),yes) -TEST_RUN_ERT = --eval '(ert (quote ${SELECTOR_ACTUAL}))' +TEST_RUN_ERT = --eval '(ert (quote ${SELECTOR_ACTUAL}) ${BENCHMARKP})' else -TEST_RUN_ERT = --batch --eval '(ert-run-tests-batch-and-exit (quote ${SELECTOR_ACTUAL}))' ${WRITE_LOG} +TEST_RUN_ERT = --batch --eval '(ert-run-tests-batch-and-exit (quote ${SELECTOR_ACTUAL}) ${BENCHMARKP})' ${WRITE_LOG} endif # Whether to run tests from .el files in preference to .elc, we do @@ -136,13 +136,15 @@ TEST_NATIVE_COMP = TEST_NATIVE_COMP = no endif ifeq ($(TEST_NATIVE_COMP),yes) -SELECTOR_DEFAULT = (not (or (tag :expensive-test) (tag :unstable))) -SELECTOR_EXPENSIVE = (not (tag :unstable)) -SELECTOR_ALL = t +SELECTOR_DEFAULT = (not (or (tag :expensive-test) (tag :unstable) (tag :benchmark))) +SELECTOR_EXPENSIVE = (not (or (tag :unstable) (tag :benchmark))) +SELECTOR_ALL = (not (tag :benchmark)) +SELECTOR_BENCHMARK = (tag :benchmark) else -SELECTOR_DEFAULT = (not (or (tag :expensive-test) (tag :unstable) (tag :nativecomp))) -SELECTOR_EXPENSIVE = (not (or (tag :unstable) (tag :nativecomp))) -SELECTOR_ALL = (not (tag :nativecomp)) +SELECTOR_DEFAULT = (not (or (tag :expensive-test) (tag :unstable) (tag :nativecomp) (tag :benchmark))) +SELECTOR_EXPENSIVE = (not (or (tag :unstable) (tag :nativecomp) (tag :benchmark))) +SELECTOR_ALL = (not (or (tag :nativecomp) (tag :benchmark))) +SELECTOR_BENCHMARK = (and (tag :benchmark) (not (tag :nativecomp))) endif ifdef SELECTOR SELECTOR_ACTUAL=$(SELECTOR) @@ -154,6 +156,8 @@ SELECTOR_ACTUAL= 
SELECTOR_ACTUAL=$(SELECTOR_DEFAULT) else ifeq ($(MAKECMDGOALS),check-maybe) SELECTOR_ACTUAL=$(SELECTOR_DEFAULT) +else ifeq ($(MAKECMDGOALS),check-benchmark) +SELECTOR_ACTUAL=$(SELECTOR_BENCHMARK) else SELECTOR_ACTUAL=$(SELECTOR_EXPENSIVE) endif @@ -323,6 +327,11 @@ .PHONY: check-all: mostlyclean check-no-automated-subdir @${MAKE} check-doit SELECTOR="${SELECTOR_ALL}" +## Run all benchmark tests, regardless of tag. +.PHONY: check-benchmark +check-benchmark: mostlyclean check-no-automated-subdir + @${MAKE} check-doit SELECTOR="${SELECTOR_BENCHMARK}" + ## Re-run all tests which are outdated. A test is outdated if its ## logfile is out-of-date with either the test file, or the source ## files that the tests depend on. See test_template. -- 2.47.0 ^ permalink raw reply related [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2024-12-14 11:34 ` Pip Cet via Emacs development discussions. @ 2024-12-14 11:58 ` Stefan Kangas 2024-12-14 20:07 ` Pip Cet via Emacs development discussions. 0 siblings, 1 reply; 58+ messages in thread From: Stefan Kangas @ 2024-12-14 11:58 UTC (permalink / raw) To: Pip Cet Cc: Andrea Corallo, Eli Zaretskii, Mattias Engdegård, Paul Eggert, emacs-devel Pip Cet <pipcet@protonmail.com> writes: > FIXME BEFORE MERGING: Should we add a link to > https://git.savannah.gnu.org/gitweb/?p=emacs/elpa.git;a=history;\ > f=benchmarks/bubble.el;h=d7101b1b99b60a3bd6945909d1f0125215f4ce1c;\ > hb=refs/heads/externals/elisp-benchmarks > here? Losing git history because we copy a file from elpa to emacs > seems suboptimal... Instead of copying the file, it might be preferable to import the entire git history into emacs.git, like we did for use-package and eglot. Then the old branch on GNU ELPA can be dropped, as we won't lose any history. João has some scripts that he used for eglot, and I adapted them for use-package. Note that he also had some copyright assignment issues to take care of, so it could probably be simplified. Please take a look here: https://gist.github.com/joaotavora/2ed97f2ec85958986983d5cb78202770 ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2024-12-14 11:58 ` Stefan Kangas @ 2024-12-14 20:07 ` Pip Cet via Emacs development discussions. 2024-12-14 20:20 ` João Távora ` (2 more replies) 0 siblings, 3 replies; 58+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-14 20:07 UTC (permalink / raw) To: Stefan Kangas Cc: Andrea Corallo, Eli Zaretskii, Mattias Engdegård, Paul Eggert, emacs-devel, João Távora "Stefan Kangas" <stefankangas@gmail.com> writes: > Pip Cet <pipcet@protonmail.com> writes: > >> FIXME BEFORE MERGING: Should we add a link to >> https://git.savannah.gnu.org/gitweb/?p=emacs/elpa.git;a=history;\ >> f=benchmarks/bubble.el;h=d7101b1b99b60a3bd6945909d1f0125215f4ce1c;\ >> hb=refs/heads/externals/elisp-benchmarks >> here? Losing git history because we copy a file from elpa to emacs >> seems suboptimal... > > Instead of copying the file, it might be preferable to import the entire > git history into emacs.git, like we did for use-package and eglot. Then > the old branch on GNU ELPA can be dropped, as we won't lose any history. Just to be clear, dropping the branch in GNU ELPA wouldn't mean that the package would no longer be available, just that it would build signed packages from the main Emacs repo? > João has some scripts that he used for eglot, and I adapted them for > use-package. Note that he also had some copyright assignment issues to > take care of, so it could probably be simplified. > > Please take a look here: > https://gist.github.com/joaotavora/2ed97f2ec85958986983d5cb78202770 Thanks for the pointer! I tried getting that to work, and finally succeeded in creating a (local) merged branch, but then I noticed that the commit messages will need editing to conform to the ChangeLog style.
We also need to decide on the directory structure; right now, I've created a lisp/emacs-lisp/benchmarks/ directory; I'd prefer lisp/benchmarks (which would make it easier to exclude the benchmark files from compilation), but I don't have a strong preference and others should make that decision. (I haven't included the lisp/emacs-lisp/subdirs.el file, but if we decide to keep the benchmarks in lisp/emacs-lisp/benchmarks/, we'll need to gitignore that, too). There are some byte compiler warnings, which I guess we should fix. I'm not sure how to proceed here. Since there aren't that many commits, I can offer to change the commit messages myself, but I fully understand if someone else (Andrea or another volunteer) wants to do it. I can push the merged branch to a scratch branch of the emacs repo, or you can access the merged branch (without any edits to the commit messages) at https://codeberg.org/pipcet/emacs/src/branch/merge-elisp-benchmarks Joao, I think it would be a good idea to keep the modified script in admin/ or somewhere for future reference, but I don't know whether you consider it an Emacs contribution (and, thus, covered by your copyright assignment). [Political/Software Freedom note: I know that some people on the list don't care, but others do: at least occasionally, GitHub won't let you see a "gist" unless you use a GitHub account. This surprised me (and isn't reproducible now, so it's possible this was a temporary misconfiguration on GitHub's side, but who knows?) and it's another reason not to use GitHub for code sharing.] Pip ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2024-12-14 20:07 ` Pip Cet via Emacs development discussions. @ 2024-12-14 20:20 ` João Távora 2024-12-15 0:57 ` Stefan Kangas 2024-12-15 0:58 ` Stefan Kangas 2 siblings, 0 replies; 58+ messages in thread From: João Távora @ 2024-12-14 20:20 UTC (permalink / raw) To: Pip Cet Cc: Stefan Kangas, Andrea Corallo, Eli Zaretskii, Mattias Engdegård, Paul Eggert, emacs-devel On Sat, Dec 14, 2024 at 8:07 PM Pip Cet <pipcet@protonmail.com> wrote: > Joao, I think it would be a good idea to keep the modified script in > admin/ or somewhere for future reference, but I don't know whether you > consider it an Emacs contribution (and, thus, covered by your copyright > assignment). Sure, use it. But there's a catch: I'm fairly sure I didn't write part of that code. Someone on the emacs-devel list helped me and wrote the first versions of it (following my requirements), and I'm very sorry but I can't remember who it was. The GitHub handle is 'bhankas. Good luck! ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2024-12-14 20:07 ` Pip Cet via Emacs development discussions. 2024-12-14 20:20 ` João Távora @ 2024-12-15 0:57 ` Stefan Kangas 2024-12-22 16:04 ` Pip Cet via Emacs development discussions. 2024-12-15 0:58 ` Stefan Kangas 2 siblings, 1 reply; 58+ messages in thread From: Stefan Kangas @ 2024-12-15 0:57 UTC (permalink / raw) To: Pip Cet Cc: Andrea Corallo, Eli Zaretskii, Mattias Engdegård, Paul Eggert, emacs-devel, João Távora Pip Cet <pipcet@protonmail.com> writes: > Just to be clear, dropping the branch in GNU ELPA wouldn't mean that the > package would no longer be available, just that it would build signed > packages from the main Emacs repo? Yes. >> João has some scripts that he used for eglot, and I adapted them for >> use-package. Note that he also had some copyright assignment issues to >> take care of, so it could probably be simplified. >> >> Please take a look here: >> https://gist.github.com/joaotavora/2ed97f2ec85958986983d5cb78202770 > > Thanks for the pointer! I tried getting that to work, and finally > succeeded in creating a (local) merged branch, but then I noticed that > the commit messages will need editing to conform to the ChangeLog style. I guess there's no clear drawback to creating the ChangeLog entries for each commit, but it's not required. Only the final merge commit needs a ChangeLog entry. I guess that entry will look something like this: * lisp/emacs-lisp/elisp-benchmarks.el: * lisp/emacs-lisp/benchmarks/bubble.el: * lisp/emacs-lisp/benchmarks/pidigits.el: New files. (Incidentally, this is the same ChangeLog entry we would use if we just copied the files without preserving history.) > We also need to decide on the directory structure; right now, I've > created a lisp/emacs-lisp/benchmarks/ directory; I'd prefer > lisp/benchmarks (which would make it easier to exclude the benchmark > files from compilation), but I don't have a strong preference and others > should make that decision.
(I haven't included the > lisp/emacs-lisp/subdirs.el file, but if we decide to keep the benchmarks > in lisp/emacs-lisp/benchmarks/, we'll need to gitignore that, too). I don't have a strong opinion here, but maybe this stuff belongs under test/ even? > I'm not sure how to proceed here. Since there aren't that many commits, > I can offer to change the commit messages myself, but I fully understand > if someone else (Andrea or another volunteer) wants to do it. FWIW, I'd just go ahead without waiting for someone else. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2024-12-15 0:57 ` Stefan Kangas @ 2024-12-22 16:04 ` Pip Cet via Emacs development discussions. 2024-12-29 10:47 ` Andrea Corallo 0 siblings, 1 reply; 58+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-22 16:04 UTC (permalink / raw) To: Stefan Kangas Cc: Andrea Corallo, Eli Zaretskii, Mattias Engdegård, Paul Eggert, emacs-devel, João Távora "Stefan Kangas" <stefankangas@gmail.com> writes: > Pip Cet <pipcet@protonmail.com> writes: >> We also need to decide on the directory structure; right now, I've >> created a lisp/emacs-lisp/benchmarks/ directory; I'd prefer >> lisp/benchmarks (which would make it easier to exclude the benchmark >> files from compilation), but I don't have a strong preference and others >> should make that decision. (I haven't included the >> lisp/emacs-lisp/subdirs.el file, but if we decide to keep the benchmarks >> in lisp/emacs-lisp/benchmarks/, we'll need to gitignore that, too). > > I don't have a strong opinion here, but maybe this stuff belongs under > test/ even? I'm still working on this, but it turns out it's harder than I thought to turn the .el files for the benchmarks into something that's usable both with ERT and with the existing elisp-benchmarks.el infrastructure. For example, there's the use of elb-bench-directory to locate resource files; ERT has its own function for that, but it turns out one of the resources one benchmark uses is the source file for another benchmark. Usually I'd just use letf around the benchmark call, but that may affect performance too much for the benchmarks to be comparable between the ERT and elisp-benchmarks invocations. I just don't know whether I'd feel comfortable invoking the benchmarks in such different ways and presenting the results in a way that would make people compare them. 
The rest of the issues are trivial: whitespace issues, two different files calling Fprovide with the same feature, elb-scroll.el merged into elb-smie.el rather than maintaining them as two separate files. These are very definitely not deficiencies in the current elisp-benchmarks package, just different conventions. However, that amounts to significant changes to the benchmark .el files overall; rather than copies of the elisp-benchmarks files, we now have modified versions and would have to port any changes between the two different sets of files. Ultimately, my current benchmark branch doesn't do what I set out to do, which is to share the elisp-benchmarks suite between an unmodified elisp-benchmarks and the new ERT framework, yielding comparable results. Getting it to work isn't the main problem, comparability of results is. So it is with some trepidation that I suggest that the best remaining option may be to fork or "freeze"/archive elisp-benchmarks and move development of benchmarks for current Emacs builds entirely to the ERT framework. Forking causes a lot of extra synchronization work. Archiving the package means we will never add new benchmarks for pre-make-benchmark Emacs builds. I'm convinced a "make benchmark" target is worth it. I also think that we should use the ERT framework, because benchmarks and pass-or-fail tests are quite similar. Maybe I'm missing an obvious solution here? Pip ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2024-12-22 16:04 ` Pip Cet via Emacs development discussions. @ 2024-12-29 10:47 ` Andrea Corallo 2024-12-30 11:45 ` Pip Cet via Emacs development discussions. 0 siblings, 1 reply; 58+ messages in thread From: Andrea Corallo @ 2024-12-29 10:47 UTC (permalink / raw) To: Pip Cet Cc: Stefan Kangas, Eli Zaretskii, Mattias Engdegård, Paul Eggert, emacs-devel, João Távora Pip Cet <pipcet@protonmail.com> writes: > "Stefan Kangas" <stefankangas@gmail.com> writes: > >> Pip Cet <pipcet@protonmail.com> writes: > >>> We also need to decide on the directory structure; right now, I've >>> created a lisp/emacs-lisp/benchmarks/ directory; I'd prefer >>> lisp/benchmarks (which would make it easier to exclude the benchmark >>> files from compilation), but I don't have a strong preference and others >>> should make that decision. (I haven't included the >>> lisp/emacs-lisp/subdirs.el file, but if we decide to keep the benchmarks >>> in lisp/emacs-lisp/benchmarks/, we'll need to gitignore that, too). >> >> I don't have a strong opinion here, but maybe this stuff belongs under >> test/ even? > > I'm still working on this, but it turns out it's harder than I thought > to turn the .el files for the benchmarks into something that's usable > both with ERT and with the existing elisp-benchmarks.el infrastructure. > > For example, there's the use of elb-bench-directory to locate resource > files; ERT has its own function for that, but it turns out one of the > resources one benchmark uses is the source file for another benchmark. > Usually I'd just use letf around the benchmark call, but that may affect > performance too much for the benchmarks to be comparable between the ERT > and elisp-benchmarks invocations. > > I just don't know whether I'd feel comfortable invoking the benchmarks > in such different ways and presenting the results in a way that would > make people compare them. 
> > The rest of the issues are trivial: whitespace issues, two different > files calling Fprovide with the same feature, elb-scroll.el merged into > elb-smie.el rather than maintaining them as two separate files. These > are very definitely not deficiencies in the current elisp-benchmarks > package, just different conventions. However, that amounts to > significant changes to the benchmark .el files overall; rather than > copies of the elisp-benchmarks files, we now have modified versions and > would have to port any changes between the two different sets of files. > > Ultimately, my current benchmark branch doesn't do what I set out to do, > which is to share the elisp-benchmarks suite between an unmodified > elisp-benchmarks and the new ERT framework, yielding comparable results. > Getting it to work isn't the main problem, comparability of results is. > > So it is with some trepidation that I suggest that the best remaining > option may be to fork or "freeze"/archive elisp-benchmarks and move > development of benchmarks for current Emacs builds entirely to the ERT > framework. Forking causes a lot of extra synchronization work. > Archiving the package means we will never add new benchmarks for > pre-make-benchmark Emacs builds. > > I'm convinced a "make benchmark" target is worth it. I also think that > we should use the ERT framework, because benchmarks and pass-or-fail > tests are quite similar. > > Maybe I'm missing an obvious solution here? I'd personally drop the requirement of using ERT as a framework for benchmarks, I'd just move elisp-benchmarks code in emacs core and add the target. My 2 cents. Andrea ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2024-12-29 10:47 ` Andrea Corallo @ 2024-12-30 11:45 ` Pip Cet via Emacs development discussions. 2024-12-30 14:15 ` Eli Zaretskii 0 siblings, 1 reply; 58+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-30 11:45 UTC (permalink / raw) To: Andrea Corallo Cc: Stefan Kangas, Eli Zaretskii, Mattias Engdegård, Paul Eggert, emacs-devel, João Távora "Andrea Corallo" <acorallo@gnu.org> writes: > Pip Cet <pipcet@protonmail.com> writes: >> I'm convinced a "make benchmark" target is worth it. I also think that >> we should use the ERT framework, because benchmarks and pass-or-fail >> tests are quite similar. >> >> Maybe I'm missing an obvious solution here? > > I'd personally drop the requirement of using ERT as a framework for > benchmarks, I'd just move elisp-benchmarks code in emacs core and add > the target. Well, as is obvious from the quoted paragraph, I disagree. I've stated why in the thread; if someone wants a summary, I can provide one, but I won't do so now because it might sound too much like an attack on elisp-benchmarks or its author. It seems likely that the consequence of my suggestion to add a make target to do something useful is that the make target is permanently reserved for something much less useful, blocking the way for future developments. Pip ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2024-12-30 11:45 ` Pip Cet via Emacs development discussions. @ 2024-12-30 14:15 ` Eli Zaretskii 2024-12-30 15:00 ` Pip Cet via Emacs development discussions. 0 siblings, 1 reply; 58+ messages in thread From: Eli Zaretskii @ 2024-12-30 14:15 UTC (permalink / raw) To: Pip Cet; +Cc: acorallo, stefankangas, mattiase, eggert, emacs-devel, joaotavora > Date: Mon, 30 Dec 2024 11:45:36 +0000 > From: Pip Cet <pipcet@protonmail.com> > Cc: Stefan Kangas <stefankangas@gmail.com>, Eli Zaretskii <eliz@gnu.org>, Mattias Engdegård <mattiase@acm.org>, Paul Eggert <eggert@cs.ucla.edu>, emacs-devel@gnu.org, João Távora <joaotavora@gmail.com> > > "Andrea Corallo" <acorallo@gnu.org> writes: > > > I'd personally drop the requirement of using ERT as a framework for > > benchmarks, I'd just move elisp-benchmarks code in emacs core and add > > the target. > > Well, as is obvious from the quoted paragraph, I disagree. I've stated > why in the thread; if someone wants a summary, I can provide one Can you point to the message where you explained your rationale for using ERT for this? I've scanned the discussion, but couldn't find such a message. Thanks. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2024-12-30 14:15 ` Eli Zaretskii @ 2024-12-30 15:00 ` Pip Cet via Emacs development discussions. 2024-12-30 15:21 ` Eli Zaretskii 0 siblings, 1 reply; 58+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-30 15:00 UTC (permalink / raw) To: Eli Zaretskii Cc: acorallo, stefankangas, mattiase, eggert, emacs-devel, joaotavora "Eli Zaretskii" <eliz@gnu.org> writes: >> Date: Mon, 30 Dec 2024 11:45:36 +0000 >> From: Pip Cet <pipcet@protonmail.com> >> Cc: Stefan Kangas <stefankangas@gmail.com>, Eli Zaretskii >> <eliz@gnu.org>, Mattias Engdegård <mattiase@acm.org>, Paul Eggert >> <eggert@cs.ucla.edu>, emacs-devel@gnu.org, João Távora >> <joaotavora@gmail.com> >> >> "Andrea Corallo" <acorallo@gnu.org> writes: >> >> > I'd personally drop the requirement of using ERT as a framework for >> > benchmarks, I'd just move elisp-benchmarks code in emacs core and add >> > the target. >> >> Well, as is obvious from the quoted paragraph, I disagree. I've stated >> why in the thread; if someone wants a summary, I can provide one > > Can you point to the message where you explained your rationale for > using ERT for this? I've scanned the discussion, but couldn't find > such a message. The best I can find is this: https://lists.gnu.org/archive/html/emacs-devel/2024-12/msg00595.html I can try to provide a more detailed/structured rationale if that's helpful. (Is it, though? Reusing someone's code in a way which reduces their user base and might cause them more work isn't something we should do lightly.) Pip ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2024-12-30 15:00 ` Pip Cet via Emacs development discussions. @ 2024-12-30 15:21 ` Eli Zaretskii 2024-12-30 15:49 ` Pip Cet via Emacs development discussions. 0 siblings, 1 reply; 58+ messages in thread From: Eli Zaretskii @ 2024-12-30 15:21 UTC (permalink / raw) To: Pip Cet; +Cc: acorallo, stefankangas, mattiase, eggert, emacs-devel, joaotavora > Date: Mon, 30 Dec 2024 15:00:29 +0000 > From: Pip Cet <pipcet@protonmail.com> > Cc: acorallo@gnu.org, stefankangas@gmail.com, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org, joaotavora@gmail.com > > "Eli Zaretskii" <eliz@gnu.org> writes: > > >> Date: Mon, 30 Dec 2024 11:45:36 +0000 > >> From: Pip Cet <pipcet@protonmail.com> > >> Cc: Stefan Kangas <stefankangas@gmail.com>, Eli Zaretskii > >> <eliz@gnu.org>, Mattias Engdegård <mattiase@acm.org>, Paul Eggert > >> <eggert@cs.ucla.edu>, emacs-devel@gnu.org, João Távora > >> <joaotavora@gmail.com> > >> > >> "Andrea Corallo" <acorallo@gnu.org> writes: > >> > >> > I'd personally drop the requirement of using ERT as a framework for > >> > benchmarks, I'd just move elisp-benchmarks code in emacs core and add > >> > the target. > >> > >> Well, as is obvious from the quoted paragraph, I disagree. I've stated > >> why in the thread; if someone wants a summary, I can provide one > > > > Can you point to the message where you explained your rationale for > > using ERT for this? I've scanned the discussion, but couldn't find > > such a message. > > The best I can find is this: > > https://lists.gnu.org/archive/html/emacs-devel/2024-12/msg00595.html Thanks, but AFAICT this just says that you intended to use/extend ERT to run this benchmark suite, but doesn't explain why you think using ERT would be an advantage worthy of keeping. > I can try to provide a more detailed/structured rationale if that's > helpful. (Is it, though? 
Reusing someone's code in a way which reduces > their user base and might cause them more work isn't something we should > do lightly.) I'm not sure I follow. Andrea suggests to move elisp-benchmarks into the repository, and add a target to the test/ Makefile to run it. AFAIU he suggested that because it should be less work, not more. Why do you think it is wrong to do the (AFAIU) simple change that Andrea proposed? The reason I'm asking is because I think we want this suite to be part of our test, but don't necessarily want the addition of the benchmarks to the test suite be a large job that complicates the benchmarks and the test suite alike. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2024-12-30 15:21 ` Eli Zaretskii @ 2024-12-30 15:49 ` Pip Cet via Emacs development discussions. 2024-12-30 15:53 ` João Távora 2024-12-30 16:40 ` Eli Zaretskii 0 siblings, 2 replies; 58+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-30 15:49 UTC (permalink / raw) To: Eli Zaretskii Cc: acorallo, stefankangas, mattiase, eggert, emacs-devel, joaotavora "Eli Zaretskii" <eliz@gnu.org> writes: >> Date: Mon, 30 Dec 2024 15:00:29 +0000 >> From: Pip Cet <pipcet@protonmail.com> >> Cc: acorallo@gnu.org, stefankangas@gmail.com, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org, joaotavora@gmail.com >> >> "Eli Zaretskii" <eliz@gnu.org> writes: >> >> >> Date: Mon, 30 Dec 2024 11:45:36 +0000 >> >> From: Pip Cet <pipcet@protonmail.com> >> >> Cc: Stefan Kangas <stefankangas@gmail.com>, Eli Zaretskii >> >> <eliz@gnu.org>, Mattias Engdegård <mattiase@acm.org>, Paul Eggert >> >> <eggert@cs.ucla.edu>, emacs-devel@gnu.org, João Távora >> >> <joaotavora@gmail.com> >> >> >> >> "Andrea Corallo" <acorallo@gnu.org> writes: >> >> >> >> > I'd personally drop the requirement of using ERT as a framework for >> >> > benchmarks, I'd just move elisp-benchmarks code in emacs core and add >> >> > the target. >> >> >> >> Well, as is obvious from the quoted paragraph, I disagree. I've stated >> >> why in the thread; if someone wants a summary, I can provide one >> > >> > Can you point to the message where you explained your rationale for >> > using ERT for this? I've scanned the discussion, but couldn't find >> > such a message. >> >> The best I can find is this: >> >> https://lists.gnu.org/archive/html/emacs-devel/2024-12/msg00595.html > > Thanks, but AFAICT this just says that you intended to use/extend ERT > to run this benchmark suite, but doesn't explain why you think using > ERT would be an advantage worthy of keeping. 
I think some advantages are stated in that email: the ERT tagging mechanism is more general, works, and can be extended (I describe one such extension). All that isn't currently true for elisp-benchmarks. The other big difference is resource management, which elisp-benchmarks does via a global variable, reusing one test as data for another. ERT has a somewhat better mechanism. >> I can try to provide a more detailed/structured rationale if that's >> helpful. (Is it, though? Reusing someone's code in a way which reduces >> their user base and might cause them more work isn't something we should >> do lightly.) > > I'm not sure I follow. Andrea suggests to move elisp-benchmarks into > the repository, and add a target to the test/ Makefile to run it. > AFAIU he suggested that because it should be less work, not more. I was saying that my proposal would cause Andrea more work, because we'd reuse his code in a way which reduces (splits) the elisp-benchmarks user base. It would also cause me work, which I've done, but that shouldn't really count against it :-) > Why do you think it is wrong to do the (AFAIU) simple change that > Andrea proposed? Because it's a de facto commitment to not doing it in ERT. Having two parallel benchmark suites isn't something I think would happen (and which one would we use for our make target?) > The reason I'm asking is because I think we want this suite to be part > of our test, but don't necessarily want the addition of the benchmarks > to the test suite be a large job that complicates the benchmarks and > the test suite alike. It's not a large job if we make it a clean split. However, it would be work for those preferring elisp-benchmarks conventions. It might be relevant that elisp-benchmarks hasn't seen very active development lately. I think switching to ERT might help there, if only because of the mailing list traffic. Pip ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2024-12-30 15:49 ` Pip Cet via Emacs development discussions. @ 2024-12-30 15:53 ` João Távora 2024-12-30 16:40 ` Eli Zaretskii 1 sibling, 0 replies; 58+ messages in thread From: João Távora @ 2024-12-30 15:53 UTC (permalink / raw) To: Pip Cet Cc: Eli Zaretskii, Andrea Corallo, Stefan Kangas, Mattias Engdegård, Paul Eggert, emacs-devel

On Mon, Dec 30, 2024, 15:49 Pip Cet <pipcet@protonmail.com> wrote:

> > Why do you think it is wrong to do the (AFAIU) simple change that
> > Andrea proposed?
>
> Because it's a de facto commitment to not doing it in ERT.  Having two
> parallel benchmark suites isn't something I think would happen

I don't subscribe to this mailing list anymore, but since I'm being CC'd in these emails (why?) I might as well +1 this particular point of Pip's.

I wish you a happy new year.

João

^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2024-12-30 15:49 ` Pip Cet via Emacs development discussions. 2024-12-30 15:53 ` João Távora @ 2024-12-30 16:40 ` Eli Zaretskii 2024-12-30 17:25 ` Pip Cet via Emacs development discussions. 1 sibling, 1 reply; 58+ messages in thread From: Eli Zaretskii @ 2024-12-30 16:40 UTC (permalink / raw) To: Pip Cet; +Cc: acorallo, stefankangas, mattiase, eggert, emacs-devel > Date: Mon, 30 Dec 2024 15:49:30 +0000 > From: Pip Cet <pipcet@protonmail.com> > Cc: acorallo@gnu.org, stefankangas@gmail.com, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org, joaotavora@gmail.com > > >> https://lists.gnu.org/archive/html/emacs-devel/2024-12/msg00595.html > > > > Thanks, but AFAICT this just says that you intended to use/extend ERT > > to run this benchmark suite, but doesn't explain why you think using > > ERT would be an advantage worthy of keeping. > > I think some advantages are stated in that email: the ERT tagging > mechanism is more general, works, and can be extended (I describe one > such extension). All that isn't currently true for elisp-benchmarks. Unlike the rest of the test suite, where we need a way to be able to run individual tests, a benchmark suite is much more likely to be run as a whole, because benchmarking a single kind of job in Emacs is much less useful than producing a benchmark of a representative sample of jobs. So I'm not sure this particular aspect is such a serious problem, certainly if it makes the job of adding the suite much harder. > The other big difference is resource management, which elisp-benchmarks > does via a global variable, reusing one test as data for another. ERT > has a somewhat better mechanism. This is just an argument for cleaner, more elegant code, right? If so, I think we can live with the issue. Having a ready-to-run benchmark suite is so much more important that I'm prepared to make compromises. > > Why do you think it is wrong to do the (AFAIU) simple change that > > Andrea proposed?
> > Because it's a de facto commitment to not doing it in ERT. Probably, but that in itself is not a catastrophe, surely? > It might be relevant that elisp-benchmarks hasn't seen very active > development lately. I think switching to ERT might help there, if only > because of the mailing list traffic. My vote is to get the job done the fastest way possible. Tests are not Emacs itself, they are means towards a certain end, and so can use less clean code and designs, at least IMO. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2024-12-30 16:40 ` Eli Zaretskii @ 2024-12-30 17:25 ` Pip Cet via Emacs development discussions. 2024-12-30 18:16 ` Eli Zaretskii 2024-12-30 18:26 ` Andrea Corallo 0 siblings, 2 replies; 58+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-30 17:25 UTC (permalink / raw) To: Eli Zaretskii; +Cc: acorallo, stefankangas, mattiase, eggert, emacs-devel "Eli Zaretskii" <eliz@gnu.org> writes: Top-posted TL;DR: let's call Andrea's code "make elisp-benchmarks" and include it now? That would preserve the Git history and importantly (to me) reserve the name for now. >> Date: Mon, 30 Dec 2024 15:49:30 +0000 >> From: Pip Cet <pipcet@protonmail.com> >> Cc: acorallo@gnu.org, stefankangas@gmail.com, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org, joaotavora@gmail.com >> >> >> https://lists.gnu.org/archive/html/emacs-devel/2024-12/msg00595.html >> > >> > Thanks, but AFAICT this just says that you intended to use/extend ERT >> > to run this benchmark suite, but doesn't explain why you think using >> > ERT would be an advantage worthy of keeping. >> >> I think some advantages are stated in that email: the ERT tagging >> mechanism is more general, works, and can be extended (I describe one >> such extension). All that isn't currently true for elisp-benchmarks. > > Unlike the rest of the test suite, where we need a way to be able to > run individual tests, a benchmark suite is much more likely to run as > a whole, because benchmarking a single kind of jobs in Emacs is much > less useful than producing a benchmark of a representative sample of > jobs. So I'm not sure this particular aspect is such a serious Not my experience. 
Running the entire suite is much more likely not to produce usable data due to such issues as CPU thermal management (for example: the first few tests are run at full clock speed and heat up the system so much that thermal throttling is activated; the next few tests are run at a reduced rate while the fan is running; eventually we run out of amperes that we're allowed to drain the battery by and reduce clock speed even further; this results in reduced temperature, so the fan speed is reduced, which means we will eventually decide to try a higher clock speed again, which will work for a while only before repeating the cycle. The whole thing will appear regular enough we won't notice the data is bad, but it will be, until we rerun the test on the same system in a different room and get wildly different results). A single-second test run in a loop produces the occasional mid-stream result which is actually useful (and promptly lost to the averaging mechanism of elisp-benchmarks). Benchmarking is hard, and I wouldn't have provided this very verbose example if I hadn't seen "paradoxical" results that can only be explained by such mechanisms. We need to move away from average run times either way, and that requires code changes. And I don't usually run ERT tests individually, while I'm trying to get in the habit of running the (non-expensive) test suite before I push. > problem, certainly if it makes the job of adding the suite much > harder. I don't think time-to-Emacs is very different for the two approaches. The difference is the post-merge work. >> The other big difference is resource management, which elisp-benchmarks >> does via a global variable, reusing one test as data for another. ERT >> has a somewhat better mechanism. > > This is just an argument for cleaner, more elegant code, right? If > so, I think we can live with the issue. Having a ready-t-run > benchmark suite is so much more important that I'm prepared to make > compromises. 
> >> > Why do you think it is wrong to do the (AFAIU) simple change that >> > Andrea proposed? >> >> Because it's a de facto commitment to not doing it in ERT. > > Probably, but that in itself is not a catastrophe, surely? Hmm. I disagree, but I think the real catastrophe would be wasting the make target name, not the inclusion of another directory. My suggestion is a compromise: add "make elisp-benchmarks" now, using Andrea's code, then consider more complicated ERT-based approaches without being in any hurry to do so. But, also, let's agree that the ERT-based approaches are "allowed" to reuse the elisp-benchmarks code without providing comparable results or a porting mechanism, and keep the "make benchmark" name reserved for a while. My prediction is that it will turn out "make elisp-benchmarks" doesn't usually provide very useful results, and expansion of the test framework to produce useful results is easier reusing the ERT framework. >> It might be relevant that elisp-benchmarks hasn't seen very active >> development lately. I think switching to ERT might help there, if only >> because of the mailing list traffic. > > My vote is to get the job done the fastest way possible. Tests are Well, the ERT patch is ready "right now" (meaning it needs rebasing). The elisp-benchmarks code would require, at least, whitespace fixes ;-) > not Emacs itself, they are means towards a certain end, and so can use > less clean code and designs, at least IMO. To be perfectly honest, I was worried about the commit history because in the Good Old CVS Days, file renames were more of a problem than they are with git. With git, I would prefer one "this is elisp-benchmarks" commit even if the subsequent history modifies and moves those files. So no reason not to do that now? Thanks for your patience. I can prepare a git branch doing that. 
As for the git history rewriting magic code, I've heard from Joao and the other developer involved; he doesn't have a copyright assignment, but also thinks this is small enough to be paperwork-exempt, so we can (probably) include a script in admin/ to migrate elpa packages to core. My preference would be a top-level directory called "elisp-benchmarks", but ultimately that's a minor question, so just let me know the preferred destination. Pip ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2024-12-30 17:25 ` Pip Cet via Emacs development discussions. @ 2024-12-30 18:16 ` Eli Zaretskii 2024-12-31 4:00 ` Pip Cet via Emacs development discussions. 2024-12-30 18:26 ` Andrea Corallo 1 sibling, 1 reply; 58+ messages in thread From: Eli Zaretskii @ 2024-12-30 18:16 UTC (permalink / raw) To: Pip Cet; +Cc: acorallo, stefankangas, mattiase, eggert, emacs-devel > Date: Mon, 30 Dec 2024 17:25:44 +0000 > From: Pip Cet <pipcet@protonmail.com> > Cc: acorallo@gnu.org, stefankangas@gmail.com, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org > > "Eli Zaretskii" <eliz@gnu.org> writes: > > Top-posted TL;DR: let's call Andrea's code "make elisp-benchmarks" and > include it now? That would preserve the Git history and importantly (to > me) reserve the name for now. Fine by me. > And I don't usually run ERT tests individually, while I'm trying to get > in the habit of running the (non-expensive) test suite before I push. I do it all the time, when I install some change and want to make sure the related tests still pass. > My suggestion is a compromise: add "make elisp-benchmarks" now, using > Andrea's code, then consider more complicated ERT-based approaches > without being in any hurry to do so. But, also, let's agree that the > ERT-based approaches are "allowed" to reuse the elisp-benchmarks code > without providing comparable results or a porting mechanism, and keep > the "make benchmark" name reserved for a while. Of course ERT-based approaches are allowed. I only chimed into this discussion because that approach bumped into some difficulties. > My prediction is that it will turn out "make elisp-benchmarks" doesn't > usually provide very useful results, and expansion of the test framework > to produce useful results is easier reusing the ERT framework. We'll see, and if you are right, we will work on improving the benchmarks. 
> My preference would be a top-level directory called "elisp-benchmarks", > but ultimately that's a minor question, so just let me know the > preferred destination. I thought we wanted it under test/ ? But I'm also okay with having a directory that is sibling to test/ if there are no objections from Andrea and others. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2024-12-30 18:16 ` Eli Zaretskii @ 2024-12-31 4:00 ` Pip Cet via Emacs development discussions. 2024-12-31 5:26 ` Stefan Kangas 2024-12-31 12:53 ` Eli Zaretskii 0 siblings, 2 replies; 58+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-31 4:00 UTC (permalink / raw) To: Eli Zaretskii; +Cc: acorallo, stefankangas, mattiase, eggert, emacs-devel "Eli Zaretskii" <eliz@gnu.org> writes: >> Date: Mon, 30 Dec 2024 17:25:44 +0000 >> From: Pip Cet <pipcet@protonmail.com> >> Cc: acorallo@gnu.org, stefankangas@gmail.com, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org >> >> "Eli Zaretskii" <eliz@gnu.org> writes: >> >> Top-posted TL;DR: let's call Andrea's code "make elisp-benchmarks" and >> include it now? That would preserve the Git history and importantly (to >> me) reserve the name for now. > > Fine by me. Pushed to scratch/elisp-benchmarks. Sorry for all the noise; we should really find a way to reduce emacs-diffs notifications when many commits are made at once. This branch: 1. contains whitespace errors, because it contains the files precisely as they are in the elisp-benchmarks repo 2. contains the admin/elpa2emacs script, based on the assessment by the one author who doesn't have a copyright assignment that less than 15 lines of his original code remain. I made it the last commit in the series so it's easy to drop that specific commit and merge the rest. 3. creates a top-level elisp-benchmarks directory >> And I don't usually run ERT tests individually, while I'm trying to get >> in the habit of running the (non-expensive) test suite before I push. > > I do it all the time, when I install some change and want to make sure > the related tests still pass. Interesting: I find it very hard to localize my changes to specific tests, usually. 
>> My preference would be a top-level directory called "elisp-benchmarks", >> but ultimately that's a minor question, so just let me know the >> preferred destination. > > I thought we wanted it under test/ ? > > But I'm also okay with having a directory that is sibling to test/ if > there are no objections from Andrea and others. Obviously, it's not too late for such objections. That's why I wanted to include the script: redoing the merge that way is much easier than applying git filter-repo to a "live" Emacs repo (which is destroyed in the process). Pip ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2024-12-31 4:00 ` Pip Cet via Emacs development discussions. @ 2024-12-31 5:26 ` Stefan Kangas 2024-12-31 13:05 ` Eli Zaretskii 2024-12-31 12:53 ` Eli Zaretskii 1 sibling, 1 reply; 58+ messages in thread From: Stefan Kangas @ 2024-12-31 5:26 UTC (permalink / raw) To: Pip Cet, Eli Zaretskii; +Cc: acorallo, mattiase, eggert, emacs-devel Pip Cet <pipcet@protonmail.com> writes: > Pushed to scratch/elisp-benchmarks. Sorry for all the noise; we should > really find a way to reduce emacs-diffs notifications when many commits > are made at once. Thanks. AFAIU, the purpose of the emacs-diffs list is to detail the full history of all commits. We have enough disk and bandwidth these days, and threads are easy to skip, so I wouldn't worry about it. >>> My preference would be a top-level directory called "elisp-benchmarks", >>> but ultimately that's a minor question, so just let me know the >>> preferred destination. >> >> I thought we wanted it under test/ ? >> >> But I'm also okay with having a directory that is sibling to test/ if >> there are no objections from Andrea and others. > > Obviously, it's not too late for such objections. That's why I wanted > to include the script: redoing the merge that way is much easier than > applying git filter-repo to a "live" Emacs repo (which is destroyed in > the process). My two cents: I'd put this in a directory named "benchmarks" instead. The "elisp-" part is relevant only because that was the name of the old GNU ELPA package, but seems redundant as part of a directory name in emacs.git. We also don't have "elisp-src" and "elisp-test", for example. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2024-12-31 5:26 ` Stefan Kangas @ 2024-12-31 13:05 ` Eli Zaretskii 2024-12-31 14:14 ` Pip Cet via Emacs development discussions. 0 siblings, 1 reply; 58+ messages in thread From: Eli Zaretskii @ 2024-12-31 13:05 UTC (permalink / raw) To: Stefan Kangas; +Cc: pipcet, acorallo, mattiase, eggert, emacs-devel > From: Stefan Kangas <stefankangas@gmail.com> > Date: Mon, 30 Dec 2024 23:26:05 -0600 > Cc: acorallo@gnu.org, mattiase@acm.org, eggert@cs.ucla.edu, > emacs-devel@gnu.org > > I'd put this in a directory named "benchmarks" instead. I think this name is indeed better. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2024-12-31 13:05 ` Eli Zaretskii @ 2024-12-31 14:14 ` Pip Cet via Emacs development discussions. 2024-12-31 14:22 ` Eli Zaretskii 0 siblings, 1 reply; 58+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-31 14:14 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Stefan Kangas, acorallo, mattiase, eggert, emacs-devel "Eli Zaretskii" <eliz@gnu.org> writes: >> From: Stefan Kangas <stefankangas@gmail.com> >> Date: Mon, 30 Dec 2024 23:26:05 -0600 >> Cc: acorallo@gnu.org, mattiase@acm.org, eggert@cs.ucla.edu, >> emacs-devel@gnu.org >> >> I'd put this in a directory named "benchmarks" instead. > > I think this name is indeed better. Is this about the directory or the make target? Absolutely no objections on the directory name, as it won't conflict with putting new ERT-based benchmarks in test/. As for the make target, we should decide whether "make benchmark" means "run all benchmarks no matter which framework" or "run the elisp-benchmarks specifically". I don't think benchmark: elisp-benchmarks check-benchmark is a bad rule at all. Pip ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2024-12-31 14:14 ` Pip Cet via Emacs development discussions. @ 2024-12-31 14:22 ` Eli Zaretskii 0 siblings, 0 replies; 58+ messages in thread From: Eli Zaretskii @ 2024-12-31 14:22 UTC (permalink / raw) To: Pip Cet; +Cc: stefankangas, acorallo, mattiase, eggert, emacs-devel

> Date: Tue, 31 Dec 2024 14:14:42 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: Stefan Kangas <stefankangas@gmail.com>, acorallo@gnu.org, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org
>
> "Eli Zaretskii" <eliz@gnu.org> writes:
>
> >> From: Stefan Kangas <stefankangas@gmail.com>
> >> Date: Mon, 30 Dec 2024 23:26:05 -0600
> >> Cc: acorallo@gnu.org, mattiase@acm.org, eggert@cs.ucla.edu,
> >>  emacs-devel@gnu.org
> >>
> >> I'd put this in a directory named "benchmarks" instead.
> >
> > I think this name is indeed better.
>
> Is this about the directory or the make target?

The former.

> As for the make target, we should decide whether "make benchmark" means
> "run all benchmarks no matter which framework" or "run the
> elisp-benchmarks specifically".  I don't think

"make elisp-benchmarks" for the target is OK, IMO.

^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2024-12-31 4:00 ` Pip Cet via Emacs development discussions. 2024-12-31 5:26 ` Stefan Kangas @ 2024-12-31 12:53 ` Eli Zaretskii 2024-12-31 14:34 ` Andrea Corallo 1 sibling, 1 reply; 58+ messages in thread From: Eli Zaretskii @ 2024-12-31 12:53 UTC (permalink / raw) To: Pip Cet; +Cc: acorallo, stefankangas, mattiase, eggert, emacs-devel > Date: Tue, 31 Dec 2024 04:00:05 +0000 > From: Pip Cet <pipcet@protonmail.com> > Cc: acorallo@gnu.org, stefankangas@gmail.com, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org > > "Eli Zaretskii" <eliz@gnu.org> writes: > > >> Date: Mon, 30 Dec 2024 17:25:44 +0000 > >> From: Pip Cet <pipcet@protonmail.com> > >> Cc: acorallo@gnu.org, stefankangas@gmail.com, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org > >> > >> "Eli Zaretskii" <eliz@gnu.org> writes: > >> > >> Top-posted TL;DR: let's call Andrea's code "make elisp-benchmarks" and > >> include it now? That would preserve the Git history and importantly (to > >> me) reserve the name for now. > > > > Fine by me. > > Pushed to scratch/elisp-benchmarks. Thanks. > Sorry for all the noise; we should really find a way to reduce > emacs-diffs notifications when many commits are made at once. I see no reason: people who subscribe to that list should be prepared for floods from time to time. > This branch: > > 1. contains whitespace errors, because it contains the files precisely > as they are in the elisp-benchmarks repo We should probably clean that up, unless Andrea says that those whitespace are somehow needed. > >> And I don't usually run ERT tests individually, while I'm trying to get > >> in the habit of running the (non-expensive) test suite before I push. > > > > I do it all the time, when I install some change and want to make sure > > the related tests still pass. > > Interesting: I find it very hard to localize my changes to specific > tests, usually. 
It is not a 100% reliable technique, but in 99% of the cases, running the tests corresponding to the Lisp file(s) modified by a changeset, followed by additional tests where Grep finds the functions in which the changes were made, provide good coverage (if we have tests for those features at all). ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2024-12-31 12:53 ` Eli Zaretskii @ 2024-12-31 14:34 ` Andrea Corallo 0 siblings, 0 replies; 58+ messages in thread From: Andrea Corallo @ 2024-12-31 14:34 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Pip Cet, stefankangas, mattiase, eggert, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: >> Date: Tue, 31 Dec 2024 04:00:05 +0000 >> From: Pip Cet <pipcet@protonmail.com> >> Cc: acorallo@gnu.org, stefankangas@gmail.com, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org >> >> "Eli Zaretskii" <eliz@gnu.org> writes: >> >> >> Date: Mon, 30 Dec 2024 17:25:44 +0000 >> >> From: Pip Cet <pipcet@protonmail.com> >> >> Cc: acorallo@gnu.org, stefankangas@gmail.com, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org >> >> >> >> "Eli Zaretskii" <eliz@gnu.org> writes: >> >> >> >> Top-posted TL;DR: let's call Andrea's code "make elisp-benchmarks" and >> >> include it now? That would preserve the Git history and importantly (to >> >> me) reserve the name for now. >> > >> > Fine by me. >> >> Pushed to scratch/elisp-benchmarks. > > Thanks. > >> Sorry for all the noise; we should really find a way to reduce >> emacs-diffs notifications when many commits are made at once. > > I see no reason: people who subscribe to that list should be prepared > for floods from time to time. > >> This branch: >> >> 1. contains whitespace errors, because it contains the files precisely >> as they are in the elisp-benchmarks repo > > We should probably clean that up, unless Andrea says that those > whitespace are somehow needed. Yep, I guess the best for simplicity is to push changes to elisp-benchmarks and later merge them into emacs-core. AFAIR it's also the way other packages in a similar situation operate. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2024-12-30 17:25 ` Pip Cet via Emacs development discussions. 2024-12-30 18:16 ` Eli Zaretskii @ 2024-12-30 18:26 ` Andrea Corallo 2024-12-30 18:58 ` Stefan Kangas 2024-12-30 21:34 ` Pip Cet via Emacs development discussions. 1 sibling, 2 replies; 58+ messages in thread From: Andrea Corallo @ 2024-12-30 18:26 UTC (permalink / raw) To: Pip Cet; +Cc: Eli Zaretskii, stefankangas, mattiase, eggert, emacs-devel Pip Cet <pipcet@protonmail.com> writes: > "Eli Zaretskii" <eliz@gnu.org> writes: > > Top-posted TL;DR: let's call Andrea's code "make elisp-benchmarks" and > include it now? That would preserve the Git history and importantly (to > me) reserve the name for now. > >>> Date: Mon, 30 Dec 2024 15:49:30 +0000 >>> From: Pip Cet <pipcet@protonmail.com> >>> Cc: acorallo@gnu.org, stefankangas@gmail.com, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org, joaotavora@gmail.com >>> >>> >> https://lists.gnu.org/archive/html/emacs-devel/2024-12/msg00595.html >>> > >>> > Thanks, but AFAICT this just says that you intended to use/extend ERT >>> > to run this benchmark suite, but doesn't explain why you think using >>> > ERT would be an advantage worthy of keeping. >>> >>> I think some advantages are stated in that email: the ERT tagging >>> mechanism is more general, works, and can be extended (I describe one >>> such extension). All that isn't currently true for elisp-benchmarks. >> >> Unlike the rest of the test suite, where we need a way to be able to >> run individual tests, a benchmark suite is much more likely to run as >> a whole, because benchmarking a single kind of jobs in Emacs is much >> less useful than producing a benchmark of a representative sample of >> jobs. So I'm not sure this particular aspect is such a serious > > Not my experience. 
Running the entire suite is much more likely not to > produce usable data due to such issues as CPU thermal management (for > example: the first few tests are run at full clock speed and heat up the > system so much that thermal throttling is activated; the next few tests > are run at a reduced rate while the fan is running; eventually we run > out of amperes that we're allowed to drain the battery by and reduce > clock speed even further; this results in reduced temperature, so the > fan speed is reduced, which means we will eventually decide to try a > higher clock speed again, which will work for a while only before > repeating the cycle. The whole thing will appear regular enough we > won't notice the data is bad, but it will be, until we rerun the test on > the same system in a different room and get wildly different results). > A single-second test run in a loop produces the occasional mid-stream > result which is actually useful (and promptly lost to the averaging > mechanism of elisp-benchmarks). Yes, elisp-benchmarks runs all the selected benchmarks at each iteration, so that a single one cannot take advantage of the initial cool CPU state. If unstable throttling on a specific system is a problem, this will show up as computed error for that test. If a system is throttling, the right (and only) thing to do is to measure it; this is in my experience what benchmarks do. That said, Eli is right: the typical use of a benchmark suite is to run it as a whole and look at the total results; this indeed accounts for average throttling as well. > Benchmarking is hard, and I wouldn't have provided this very verbose > example if I hadn't seen "paradoxical" results that can only be > explained by such mechanisms. We need to move away from average run > times either way, and that requires code changes. I'm not sure I understand what you mean; if we prefer something like geo-mean in elisp-benchmarks we can change to that, it should be easy.
I'm open to patches to elisp-benchmarks (and to its hypothetical copy in emacs-core). My opinion is that something can potentially be improved in it (why not), but I personally ATM don't understand the need for ERT. ^ permalink raw reply [flat|nested] 58+ messages in thread
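For readers following the average versus geo-mean point, here is a minimal sketch of the difference. It is written in Python purely for illustration and is not elisp-benchmarks code; the function names and the sample ratios are hypothetical.

```python
import math


def arithmetic_mean(values):
    """Plain average; large values dominate the result."""
    return sum(values) / len(values)


def geometric_mean(values):
    """N-th root of the product, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical per-benchmark ratios (new time / old time):
# one benchmark became twice as fast, another twice as slow.
ratios = [0.5, 2.0]

print("arithmetic mean:", arithmetic_mean(ratios))
print("geometric mean: ", geometric_mean(ratios))
```

On ratios like these, the geometric mean comes out at (approximately) 1, treating the 2x speedup and the 2x slowdown as cancelling, while the arithmetic mean reports 1.25 and so suggests an overall slowdown.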
* Re: New "make benchmark" target 2024-12-30 18:26 ` Andrea Corallo @ 2024-12-30 18:58 ` Stefan Kangas 2024-12-30 21:34 ` Pip Cet via Emacs development discussions. 1 sibling, 0 replies; 58+ messages in thread From: Stefan Kangas @ 2024-12-30 18:58 UTC (permalink / raw) To: Andrea Corallo, Pip Cet; +Cc: Eli Zaretskii, mattiase, eggert, emacs-devel Andrea Corallo <acorallo@gnu.org> writes: > Yes, elisp-benchmarks runs all the selected benchmarks at each > iteration, so that a single one cannot take advantage of the initial > cool CPU state. If unstable throttling on a specific system is a > problem, this will show up as computed error for that test. If a system > is throttling, the right (and only) thing to do is to measure it; this is > in my experience what benchmarks do. > > That said, Eli is right: the typical use of a benchmark suite > is to run it as a whole and look at the total results; this indeed > accounts for average throttling as well. May I propose writing down some general usage notes along the lines of the above somewhere? ^ permalink raw reply [flat|nested] 58+ messages in thread
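The iteration scheme described in the quoted paragraph could be sketched roughly as follows. This is a Python illustration under assumptions, not the elisp-benchmarks implementation: `run_interleaved` and `summarize` are made-up names, the two workloads are placeholders, and the "computed error" is assumed here to be the sample standard deviation.

```python
import time
from statistics import mean, stdev


def run_interleaved(benchmarks, iterations):
    """Run every benchmark once per iteration, so no single benchmark
    gets all of its runs while the CPU is still cool."""
    timings = {name: [] for name in benchmarks}
    for _ in range(iterations):
        for name, func in benchmarks.items():
            start = time.perf_counter()
            func()
            timings[name].append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Map each benchmark name to (mean seconds, spread across runs)."""
    return {
        name: (mean(runs), stdev(runs) if len(runs) > 1 else 0.0)
        for name, runs in timings.items()
    }


# Placeholder workloads, only here to make the sketch runnable.
benchmarks = {
    "sum": lambda: sum(range(10_000)),
    "sort": lambda: sorted(range(10_000, 0, -1)),
}
timings = run_interleaved(benchmarks, iterations=3)
for name, (avg, err) in summarize(timings).items():
    print(f"{name}: avg={avg:.6f}s err={err:.6f}s")
```

With this ordering, throttling that sets in partway through a run affects later iterations of every benchmark rather than only the benchmarks that happen to come last, and it shows up as a larger spread per benchmark.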
* Re: New "make benchmark" target 2024-12-30 18:26 ` Andrea Corallo 2024-12-30 18:58 ` Stefan Kangas @ 2024-12-30 21:34 ` Pip Cet via Emacs development discussions. 2024-12-31 9:55 ` Andrea Corallo 2024-12-31 12:43 ` Eli Zaretskii 1 sibling, 2 replies; 58+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-30 21:34 UTC (permalink / raw) To: Andrea Corallo; +Cc: Eli Zaretskii, stefankangas, mattiase, eggert, emacs-devel "Andrea Corallo" <acorallo@gnu.org> writes: >> Benchmarking is hard, and I wouldn't have provided this very verbose >> example if I hadn't seen "paradoxical" results that can only be >> explained by such mechanisms. We need to move away from average run >> times either way, and that requires code changes. > > I'm not sure I understand what you mean; if we prefer something like > geo-mean in elisp-benchmarks we can change to that, should be easy. In such situations (machines that don't allow reasonable benchmarks; this has become the standard situation for me) I've usually found it necessary to store a bucket histogram (or full history) across many benchmark runs; this clearly allows you to see the different throttling levels as separate peaks. If we must use a single number, we want the fastest actual run; so, in practice, discard a few percentiles to account for possible rare errors. > I'm open to patches to elisp-benchmarks (and to its hypothetical copy in > emacs-core). My opinion that something can potentially be improved in What's the best way to report the need for such improvements? I'm currently aware of four "bugs" we should definitely fix; one of them, ideally, before merging. > it (why not), but I personally ATM don't understand the need for ERT. Let's focus on the basics right now: people know how to write ERT tests. We have hundreds of them. Some of them could be benchmarks, and we want to make that as easy as possible. ERT provides a way to do that, in the same file if we want to: just add a tag. 
It provides a way to locate and properly identify resources (five "bugs": reusing test A as input for test B means we don't have separation of tests in elisp-benchmarks, and that's something we should strive for). It also allows a third class of tests: stress tests which we want to execute more often than once per test run, which identify occasional failures in code that needs to be executed very often to establish stability (think bug#75105: (cl-random 1.0e+INF) produces an incorrect result once every 8 million runs). IIRC, right now ERT uses ad-hoc loops for such tests, but it'd be nicer to expose the repetition count in the framework (I'm not going to run the non-expensive testsuite on FreeDOS if that means waiting for a million iterations on an emulated machine). (I also think we should introduce an ert-how structure that describes how a test is to be run: do we want to inhibit GC or allow it? Run some warm-up test runs or not? What's the expected time, and when should we time out? We can't run the complete matrix for all tests, so we need some hints in the test, and the lack of a test declaration in elisp-benchmarks hurts us there). Pip ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2024-12-30 21:34 ` Pip Cet via Emacs development discussions. @ 2024-12-31 9:55 ` Andrea Corallo 2024-12-31 12:43 ` Eli Zaretskii 1 sibling, 0 replies; 58+ messages in thread From: Andrea Corallo @ 2024-12-31 9:55 UTC (permalink / raw) To: Pip Cet; +Cc: Eli Zaretskii, stefankangas, mattiase, eggert, emacs-devel Pip Cet <pipcet@protonmail.com> writes: > "Andrea Corallo" <acorallo@gnu.org> writes: >>> Benchmarking is hard, and I wouldn't have provided this very verbose >>> example if I hadn't seen "paradoxical" results that can only be >>> explained by such mechanisms. We need to move away from average run >>> times either way, and that requires code changes. >> >> I'm not sure I understand what you mean; if we prefer something like >> geo-mean in elisp-benchmarks we can change to that, should be easy. > > In such situations (machines that don't allow reasonable benchmarks; > this has become the standard situation for me) I've usually found it > necessary to store a bucket histogram (or full history) across many > benchmark runs; this clearly allows you to see the different throttling > levels as separate peaks. If we must use a single number, we want the > fastest actual run This is not how, in my professional experience at least, benchmarks are made/used. If the CPU is throttling during the execution of a test, this has to be measured and reported in the final score, as it reflects how the system behaves. Considering only "best scores" is artificial; I see no reason for further complications in this area. >> I'm open to patches to elisp-benchmarks (and to its hypothetical copy in >> emacs-core). My opinion that something can potentially be improved in > > What's the best way to report the need for such improvements? I'm > currently aware of four "bugs" we should definitely fix; one of them, > ideally, before merging. It's an ELPA package, so AFAIK the process is the same as for emacs-core. 
>> it (why not), but I personally ATM don't understand the need for ERT. > > Let's focus on the basics right now: people know how to write ERT tests. > We have hundreds of them. Some of them could be benchmarks, and we want > to make that as easy as possible. Which ones? > ERT provides a way to do that, in the same file if we want to: just add > a tag. > > It provides a way to locate and properly identify resources (five > "bugs": reusing test A as input for test B means we don't have > separation of tests in elisp-benchmarks, and that's something we should > strive for). That (if it's the case) sounds like a very simple fix. > It also allows a third class of tests: stress tests which we want to > execute more often than once per test run, which identify occasional > failures in code that needs to be executed very often to establish > stability (think bug#75105: (cl-random 1.0e+INF) produces an incorrect > result once every 8 million runs). IIRC, right now ERT uses ad-hoc > loops for such tests, but it'd be nicer to expose the repetition count > in the framework (I'm not going to run the non-expensive testsuite on > FreeDOS if that means waiting for a million iterations on an emulated > machine). > > (I also think we should introduce an ert-how structure that describes how > a test is to be run: do we want to inhibit GC or allow it? We definitely don't want to inhibit GC while running benchmarks. Why should we? > Run some > warm-up test runs or not? Of course we should; measuring a fresh state is not realistic. elisp-benchmarks runs an iteration of all tests as warm-up, and I think this is good enough. > What's the expected time, and when should we > time out? Benchmark tests are not testsuite tests: they are not supposed to hang or have long execution times. But anyway, we can easily introduce a time-out that all benchmarks have to stay within if we want to be on the safe side. 
> We can't run the complete matrix for all tests, so we need > some hints in the test, and the lack of a test declaration in > elisp-benchmarks hurts us there). As Eli mentioned, I don't think the goal is to be able to select/run complex matrices of tests here; I believe there are two typical use cases: 1- A user runs the whole suite to get the final score (typical use). 2- A developer runs a single benchmark (probably to profile or micro-optimize it). ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2024-12-30 21:34 ` Pip Cet via Emacs development discussions. 2024-12-31 9:55 ` Andrea Corallo @ 2024-12-31 12:43 ` Eli Zaretskii 2024-12-31 14:01 ` Pip Cet via Emacs development discussions. 2025-01-04 16:34 ` Pip Cet via Emacs development discussions. 1 sibling, 2 replies; 58+ messages in thread From: Eli Zaretskii @ 2024-12-31 12:43 UTC (permalink / raw) To: Pip Cet; +Cc: acorallo, stefankangas, mattiase, eggert, emacs-devel > Date: Mon, 30 Dec 2024 21:34:55 +0000 > From: Pip Cet <pipcet@protonmail.com> > Cc: Eli Zaretskii <eliz@gnu.org>, stefankangas@gmail.com, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org > > > I'm open to patches to elisp-benchmarks (and to its hypothetical copy in > > emacs-core). My opinion that something can potentially be improved in > > What's the best way to report the need for such improvements? Since you've pushed that to a branch, I suggest to submit bug reports about these issues, using "[scratch/elisp-benchmarks]" in the Subject of the bug. > > it (why not), but I personally ATM don't understand the need for ERT. > > Let's focus on the basics right now: people know how to write ERT tests. > We have hundreds of them. Some of them could be benchmarks, and we want > to make that as easy as possible. We can later add more benchmarks using ERT. There's no contradiction. > It also allows a third class of tests: stress tests which we want to > execute more often than once per test run, which identify occasional > failures in code that needs to be executed very often to establish > stability (think bug#75105: (cl-random 1.0e+INF) produces an incorrect > result once every 8 million runs). IIRC, right now ERT uses ad-hoc > loops for such tests, but it'd be nicer to expose the repetition count > in the framework (I'm not going to run the non-expensive testsuite on > FreeDOS if that means waiting for a million iterations on an emulated > machine). 
> > (I also think we should introduce an ert-how structure that describes how > a test is to be run: do we want to inhibit GC or allow it? Run some > warm-up test runs or not? What's the expected time, and when should we > time out? We can't run the complete matrix for all tests, so we need > some hints in the test, and the lack of a test declaration in > elisp-benchmarks hurts us there). These seem to be long-term goals of improving the benchmark suite. They are fine by me, but I don't see why they should preclude installing the benchmarks we have without first converting them to ERT. We can do that later, if we decide it's worth the effort. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2024-12-31 12:43 ` Eli Zaretskii @ 2024-12-31 14:01 ` Pip Cet via Emacs development discussions. 2025-01-04 16:34 ` Pip Cet via Emacs development discussions. 1 sibling, 0 replies; 58+ messages in thread From: Pip Cet via Emacs development discussions. @ 2024-12-31 14:01 UTC (permalink / raw) To: Eli Zaretskii; +Cc: acorallo, stefankangas, mattiase, eggert, emacs-devel "Eli Zaretskii" <eliz@gnu.org> writes: >> Date: Mon, 30 Dec 2024 21:34:55 +0000 >> From: Pip Cet <pipcet@protonmail.com> >> Cc: Eli Zaretskii <eliz@gnu.org>, stefankangas@gmail.com, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org >> >> > I'm open to patches to elisp-benchmarks (and to its hypothetical copy in >> > emacs-core). My opinion that something can potentially be improved in >> >> What's the best way to report the need for such improvements? > > Since you've pushed that to a branch, I suggest to submit bug reports > about these issues, using "[scratch/elisp-benchmarks]" in the Subject > of the bug. Okay. >> > it (why not), but I personally ATM don't understand the need for ERT. >> >> Let's focus on the basics right now: people know how to write ERT tests. >> We have hundreds of them. Some of them could be benchmarks, and we want >> to make that as easy as possible. > > We can later add more benchmarks using ERT. There's no contradiction. I agree. There's definitely no "right now" need for ERT, I was explaining why it's the change I'll be investigating. >> It also allows a third class of tests: stress tests which we want to >> execute more often than once per test run, which identify occasional >> failures in code that needs to be executed very often to establish >> stability (think bug#75105: (cl-random 1.0e+INF) produces an incorrect >> result once every 8 million runs). 
IIRC, right now ERT uses ad-hoc >> loops for such tests, but it'd be nicer to expose the repetition count >> in the framework (I'm not going to run the non-expensive testsuite on >> FreeDOS if that means waiting for a million iterations on an emulated >> machine). >> >> (I also think we should introduce an ert-how structure that describes how >> a test is to be run: do we want to inhibit GC or allow it? Run some >> warm-up test runs or not? What's the expected time, and when should we >> time out? We can't run the complete matrix for all tests, so we need >> some hints in the test, and the lack of a test declaration in >> elisp-benchmarks hurts us there). > > These seem to be long-term goals of improving the benchmark suite. > They are fine by me, but I don't see why they should preclude > installing the benchmarks we have without first converting them to > ERT. We can do that later, if we decide it's worth the effort. We seem to agree here: my intention, too, is to merge the elisp-benchmarks branch ASAP. Let's establish which changes are required on that branch, then do a synchronized rebase-merge to preserve history? Pip ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2024-12-31 12:43 ` Eli Zaretskii 2024-12-31 14:01 ` Pip Cet via Emacs development discussions. @ 2025-01-04 16:34 ` Pip Cet via Emacs development discussions. 2025-01-04 18:33 ` Eli Zaretskii 2025-01-06 11:23 ` Andrea Corallo 1 sibling, 2 replies; 58+ messages in thread From: Pip Cet via Emacs development discussions. @ 2025-01-04 16:34 UTC (permalink / raw) To: Eli Zaretskii; +Cc: acorallo, stefankangas, mattiase, eggert, emacs-devel "Eli Zaretskii" <eliz@gnu.org> writes: >> Date: Mon, 30 Dec 2024 21:34:55 +0000 >> From: Pip Cet <pipcet@protonmail.com> >> Cc: Eli Zaretskii <eliz@gnu.org>, stefankangas@gmail.com, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org >> >> > I'm open to patches to elisp-benchmarks (and to its hypothetical copy in >> > emacs-core). My opinion that something can potentially be improved in >> >> What's the best way to report the need for such improvements? > > Since you've pushed that to a branch, I suggest to submit bug reports > about these issues, using "[scratch/elisp-benchmarks]" in the Subject > of the bug. I've studied the issues a bit more. This is a bit long, but in summary, I think improving elisp-benchmarks.el (the specific file, not the entire package, which I still intend to reuse) would take more time than starting from ERT, so I'll look into the latter a bit more (maybe I'll run into unforeseen difficulties and change my mind). >> > it (why not), but I personally ATM don't understand the need for ERT. >> >> Let's focus on the basics right now: people know how to write ERT tests. >> We have hundreds of them. Some of them could be benchmarks, and we want >> to make that as easy as possible. > > We can later add more benchmarks using ERT. There's no contradiction. Before describing the issues I found, let me agree with this. If only for the sake of having a better git history, we should merge the elisp-benchmarks branch ASAP after changing the directory name as discussed. 
I'll force-push a fixed branch after filing the reports; I still think doing a synchronized rebase-and-merge would be worth it since it would result in a cleaner git history than a merge-with-conflicts of a branch based on a previous commit on the master branch. >> It also allows a third class of tests: stress tests which we want to >> execute more often than once per test run, which identify occasional >> failures in code that needs to be executed very often to establish >> stability (think bug#75105: (cl-random 1.0e+INF) produces an incorrect >> result once every 8 million runs). IIRC, right now ERT uses ad-hoc >> loops for such tests, but it'd be nicer to expose the repetition count >> in the framework (I'm not going to run the non-expensive testsuite on >> FreeDOS if that means waiting for a million iterations on an emulated >> machine). >> >> (I also think we should introduce an ert-how structure that describes how >> a test is to be run: do we want to inhibit GC or allow it? Run some >> warm-up test runs or not? What's the expected time, and when should we >> time out? We can't run the complete matrix for all tests, so we need >> some hints in the test, and the lack of a test declaration in >> elisp-benchmarks hurts us there). > > These seem to be long-term goals of improving the benchmark suite. > They are fine by me, but I don't see why they should preclude > installing the benchmarks we have without first converting them to > ERT. We can do that later, if we decide it's worth the effort. I agree again. Please read this message as an explanation for why I, personally, think that it is worth the effort. It's not meant as an attack, and it doesn't contradict what you said above in any way. I'm reporting a small number of elisp-benchmarks "bugs" (I think the term is likely to be contentious; I use it because that's what the mailing list is called). All of them are fixable. Most of them are easily fixable by moving to ERT. 
In my opinion, fixing them in elisp-benchmarks.el is not, as far as Emacs development is concerned, necessary or helpful: we should spend our time improving ERT rather than discussing which parts of it need to be reimplemented (the answer, of course, is that all parts of ERT are needed and none need to be reimplemented: let's just use it). I'm not saying elisp-benchmarks.el is bad software: if the goal is to produce a new benchmarking framework, without using existing code, for use in Emacs, it's a good early start. Continuing the effort would be a significant time investment, and the remaining time to reach the goal of a generally useful benchmarking framework is greater than what we need to do if we start with ERT (and reuse the benchmarks, of course; the issues are overwhelmingly in elisp-benchmarks.el). elisp-benchmarks.el has not defined the circumstances in which it is meant to be used: many of the issues can be avoided by running elisp-benchmarks in a clean session which is terminated immediately after running the benchmark. However, if this limitation is meant to be permanent, it is inappropriate to declare elisp-benchmarks-run interactive, and it would mean elisp-benchmarks-run should enforce that it is run just once per session (either by terminating it or by setting a flag). In reporting the issues, I worked under the assumption that elisp-benchmarks can usefully be run in existing Emacs sessions, interactive or not, as well as new ones, interactive or not. If this is considered out of scope for elisp-benchmarks, this would limit its usefulness massively, and we would still need to declare this limitation precisely. Unfortunately, I decided to stop at some point: the issues, IMHO, need to be addressed before we can consider the question of whether the numbers produced by elisp-benchmarks are useful enough. 
In particular, as you (Andrea) correctly pointed out, it is sometimes appropriate to use an average run time (or, non-equivalently, an average speed) for reporting test results; the assumptions needed for this are very significant and need to be spelled out explicitly. The vast majority of "make benchmark" uses which I think should happen cannot meet these stringent requirements. To put things simply, it is better to discard outliers (test runs which take significantly longer than the rest). Averaging doesn't do that: it simply ruins your entire test run if there is a significant outlier. IOW, running the benchmarks with a large repetition count is very likely to result in useful data being discarded, and a useless result. elisp-benchmarks.el makes an attempt to detect outliers by reporting the (modified) standard deviation of test times. This is, again, okay for some use cases, but for others, not so much. In particular, while a large standard deviation is a sufficient criterion for discarding a test, a large repetition count can produce a small standard deviation while reporting an unreliable average. IMHO, reporting the minimum and maximum run time would be more useful than the current result (the minimum time for a successful benchmark run is a very useful number. If there was no system malfunction and the repetition count was large enough, I still think this is almost always the number we want). It would mean increasing the repetition count would improve the data, while in the current implementation, it mostly increases the risk of reporting unreliable data. My conclusion is that elisp-benchmarks.el (again, the benchmarks are fine) isn't the right way forward. I'm happy to change the scratch/elisp-benchmarks branch in the ways we've discussed, and it should be merged, but if someone decides to incrementally solve some of the issues, that, while not very harmful, would be an inefficient use of resources. Benchmarks need a test framework. 
The options are reimplementing ERT or using it. I prefer the second approach and will investigate it further. Pip ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2025-01-04 16:34 ` Pip Cet via Emacs development discussions. @ 2025-01-04 18:33 ` Eli Zaretskii 2025-01-05 10:18 ` Pip Cet via Emacs development discussions. 2025-01-06 11:23 ` Andrea Corallo 1 sibling, 1 reply; 58+ messages in thread From: Eli Zaretskii @ 2025-01-04 18:33 UTC (permalink / raw) To: Pip Cet; +Cc: acorallo, stefankangas, mattiase, eggert, emacs-devel > Date: Sat, 04 Jan 2025 16:34:24 +0000 > From: Pip Cet <pipcet@protonmail.com> > Cc: acorallo@gnu.org, stefankangas@gmail.com, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org > > "Eli Zaretskii" <eliz@gnu.org> writes: > > > Since you've pushed that to a branch, I suggest to submit bug reports > > about these issues, using "[scratch/elisp-benchmarks]" in the Subject > > of the bug. > > I've studied the issues a bit more. This is a bit long, but in summary, > I think improving elisp-benchmarks.el (the specific file, not the entire > package, which I still intend to reuse) would take more time than > starting from ERT You mean, starting from scratch?? How can this be less work than fixing whatever bugs you found in the benchmarks (assuming that we want to fix all of them)? > I'm reporting a small number of elisp-benchmarks "bugs" (I think the > term is likely to be contentious; I use it because that's what the > mailing list is called). All of them are fixable. Most of them are > easily fixable by moving to ERT. You mean, if we move to ERT, which by itself is a significant job, then some or most of these bugs could be fixed as a side effect or with much less work, is that it? That could be so, but then the move to ERT itself is not a small job, so we need to take that into consideration when deciding whether to move to ERT right now. > My conclusion is that elisp-benchmarks.el (again, the benchmarks are > fine) isn't the right way forward. Well, you thought that to begin with, so forgive me if I say that I'd like a second independent opinion in this case. 
> I'm happy to change the scratch/elisp-benchmarks branch in the ways > we've discussed, and it should be merged, but if someone decides to > incrementally solve some of the issues, that, while not very harmful, > would be an inefficient use of resources. Let's hear what Andrea thinks about the issues you reported (let's please discuss them on the bug tracker, not here), and let's take it from there. > Benchmarks need a test framework. I don't necessarily agree. A benchmark doesn't have to have a "correct" or "expected" result, so a test framework is not necessarily justified or needed. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2025-01-04 18:33 ` Eli Zaretskii @ 2025-01-05 10:18 ` Pip Cet via Emacs development discussions. 0 siblings, 0 replies; 58+ messages in thread From: Pip Cet via Emacs development discussions. @ 2025-01-05 10:18 UTC (permalink / raw) To: Eli Zaretskii; +Cc: acorallo, stefankangas, mattiase, eggert, emacs-devel "Eli Zaretskii" <eliz@gnu.org> writes: >> I'm happy to change the scratch/elisp-benchmarks branch in the ways >> we've discussed, and it should be merged, but if someone decides to >> incrementally solve some of the issues, that, while not very harmful, >> would be an inefficient use of resources. > > Let's hear what Andrea thinks about the issues you reported (let's > please discuss them on the bug tracker, not here), and let's take it > from there. Sure, just let me know when we can merge the branch (assuming you meant to say that the merge, too, should wait for Andrea's thoughts) or which other changes are required! Pip ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2025-01-04 16:34 ` Pip Cet via Emacs development discussions. 2025-01-04 18:33 ` Eli Zaretskii @ 2025-01-06 11:23 ` Andrea Corallo 2025-01-06 14:46 ` Eli Zaretskii 1 sibling, 1 reply; 58+ messages in thread From: Andrea Corallo @ 2025-01-06 11:23 UTC (permalink / raw) To: Pip Cet; +Cc: Eli Zaretskii, stefankangas, mattiase, eggert, emacs-devel Pip Cet <pipcet@protonmail.com> writes: > In particular, as you (Andrea) correctly pointed out, it is sometimes > appropriate to use an average run time (or, non-equivalently, an average > speed) for reporting test results; the assumptions needed for this are > very significant and need to be spelled out explicitly. The vast > majority of "make benchmark" uses which I think should happen cannot > meet these stringent requirements. > > To put things simply, it is better to discard outliers (test runs which > take significantly longer than the rest). Averaging doesn't do that: it > simply ruins your entire test run if there is a significant outlier. > IOW, running the benchmarks with a large repetition count is very likely > to result in useful data being discarded, and a useless result. As mentioned, I disagree with putting some logic in place to arbitrarily decide which values are worth considering and which should be discarded. If a system is producing noisy measurements, this has to be reported as the error of the measurement. Those numbers are there for some real reason and have to be accounted for. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2025-01-06 11:23 ` Andrea Corallo @ 2025-01-06 14:46 ` Eli Zaretskii 2025-01-06 18:41 ` Andrea Corallo 0 siblings, 1 reply; 58+ messages in thread From: Eli Zaretskii @ 2025-01-06 14:46 UTC (permalink / raw) To: Andrea Corallo; +Cc: pipcet, stefankangas, mattiase, eggert, emacs-devel > From: Andrea Corallo <acorallo@gnu.org> > Cc: Eli Zaretskii <eliz@gnu.org>, stefankangas@gmail.com, > mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org > Date: Mon, 06 Jan 2025 06:23:22 -0500 > > Pip Cet <pipcet@protonmail.com> writes: > > > In particular, as you (Andrea) correctly pointed out, it is sometimes > > appropriate to use an average run time (or, non-equivalently, an average > > speed) for reporting test results; the assumptions needed for this are > > very significant and need to be spelled out explicitly. The vast > > majority of "make benchmark" uses which I think should happen cannot > > meet these stringent requirements. > > > > To put things simply, it is better to discard outliers (test runs which > > take significantly longer than the rest). Averaging doesn't do that: it > > simply ruins your entire test run if there is a significant outlier. > > IOW, running the benchmarks with a large repetition count is very likely > > to result in useful data being discarded, and a useless result. > > As mentioned, I disagree with having some logic put in place to > arbitrarily decide which value is worth to be considered and which value > should be discarded. If a system is producing noisy measures this has > to be reported as error of the measure. Those numbers are there for > some real reason and have to be accounted. Without too deep understanding of the underlying issue: IME, if some sample can include outliers, it is always better to use robust estimators, rather than attempt to detect and discard outliers. 
That's because detection of outliers can decide that a valid measurement is an outlier, and then the estimation becomes biased. In practical terms, for estimating the mean, I can suggest to use the sample median instead of the sample average. The median is very robust to outliers, and only slightly less efficient (i.e., converges a bit slower) than the sample average. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2025-01-06 14:46 ` Eli Zaretskii @ 2025-01-06 18:41 ` Andrea Corallo 0 siblings, 0 replies; 58+ messages in thread From: Andrea Corallo @ 2025-01-06 18:41 UTC (permalink / raw) To: Eli Zaretskii; +Cc: pipcet, stefankangas, mattiase, eggert, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: >> From: Andrea Corallo <acorallo@gnu.org> >> Cc: Eli Zaretskii <eliz@gnu.org>, stefankangas@gmail.com, >> mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org >> Date: Mon, 06 Jan 2025 06:23:22 -0500 >> >> Pip Cet <pipcet@protonmail.com> writes: >> >> > In particular, as you (Andrea) correctly pointed out, it is sometimes >> > appropriate to use an average run time (or, non-equivalently, an average >> > speed) for reporting test results; the assumptions needed for this are >> > very significant and need to be spelled out explicitly. The vast >> > majority of "make benchmark" uses which I think should happen cannot >> > meet these stringent requirements. >> > >> > To put things simply, it is better to discard outliers (test runs which >> > take significantly longer than the rest). Averaging doesn't do that: it >> > simply ruins your entire test run if there is a significant outlier. >> > IOW, running the benchmarks with a large repetition count is very likely >> > to result in useful data being discarded, and a useless result. >> >> As mentioned, I disagree with having some logic put in place to >> arbitrarily decide which value is worth to be considered and which value >> should be discarded. If a system is producing noisy measures this has >> to be reported as error of the measure. Those numbers are there for >> some real reason and have to be accounted. > > Without too deep understanding of the underlying issue: IME, if some > sample can include outliers, it is always better to use robust > estimators, rather than attempt to detect and discard outliers. 
> That's because detection of outliers can decide that a valid > measurement is an outlier, and then the estimation becomes biased. 100% agreed. > In practical terms, for estimating the mean, I can suggest to use the > sample median instead of the sample average. The median is very > robust to outliers, and only slightly less efficient (i.e., converges > a bit slower) than the sample average. In my experience, benchmarks typically use geo-mean; there's quite a lot of information around on why that is, e.g. [1]. The use of the arithmetic mean in elisp-benchmarks is an error of youth (which I'm responsible for) that I think should be fixed. Andrea [1] <https://dl.acm.org/doi/pdf/10.1145/5666.5673> ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: New "make benchmark" target 2024-12-14 20:07 ` Pip Cet via Emacs development discussions. 2024-12-14 20:20 ` João Távora 2024-12-15 0:57 ` Stefan Kangas @ 2024-12-15 0:58 ` Stefan Kangas 2 siblings, 0 replies; 58+ messages in thread From: Stefan Kangas @ 2024-12-15 0:58 UTC (permalink / raw) To: Pip Cet Cc: Andrea Corallo, Eli Zaretskii, Mattias Engdegård, Paul Eggert, emacs-devel, João Távora Pip Cet <pipcet@protonmail.com> writes: > I can push the merged branch to a scratch branch of the emacs repo, or > you can access the merged branch (without any edits to the commit > messages) at > https://codeberg.org/pipcet/emacs/src/branch/merge-elisp-benchmarks (I missed replying to this part.) Pushing this to a scratch branch is suitable here, indeed. Thanks for working on this. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: Improving EQ
  2024-12-11 22:37 Improving EQ Pip Cet via Emacs development discussions.
  2024-12-12 6:36 ` Eli Zaretskii
@ 2024-12-12 10:42 ` Óscar Fuentes
  2024-12-12 10:50 ` Andrea Corallo
  1 sibling, 1 reply; 58+ messages in thread
From: Óscar Fuentes @ 2024-12-12 10:42 UTC
To: emacs-devel; +Cc: Pip Cet

Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
writes:

> I looked at the "new" code generated for our EQ macro, and decided that
> a fix was in order. I'm therefore sending a first proposal to explain
> what I think should be done, and some numbers.
>
> This patch:
> * moves the "slow path" of EQ into a NO_INLINE function
> * exits early if the arguments to EQ are actually BASE_EQ
> * returns quickly (after a single memory access which cannot be avoided
>   until we fix our tagging scheme to distinguish exotic objects from
>   ordinary ones) when symbols_with_pos_enabled isn't true.
>
> The effect on the code size of the stripped emacs binary is small, but
> significant: 8906336 bytes instead of 8955488 bytes on this machine.
> (The effect on the code size of the emacs binary with debugging
> information is much larger, reducing it from 32182000 bytes to 31125832
> bytes on this system.) There is no effect on the size of the .pdmp
> file, which is expected.
>
> What's missing here is a benchmark, but unless there's a really nasty
> surprise when that happens, I'm quite confident that we can improve the
> code here.

I've seen too many cases where *removing* instructions (mind you,
literally removing, not changing!) made the code significantly slower.

Modern CPUs are insanely complex and combined with compilers make
intuition-based predictions even more futile.

But reading your message makes me wonder if EQ and some other "simple"
fundamental functions are not lowered by nativecomp? If not, maybe
that's a significant opportunity for improvement.

As for your patch, one thing that would be easy to do and might save
quite a lot of head scratching is to count the fraction of the calls to
EQ that benefit from the fast path on a "representative" Emacs run. Then
you would have hard data to decide if fighting the compiler/CPU on that
case is a worthy cause.
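One way to collect the numbers suggested above is to instrument the comparison with counters. The following is a self-contained sketch, not Emacs source: `obj`, `pos_enabled`, `slow_compare`, `counted_eq`, `eq_stats` and `report_eq_stats` are hypothetical stand-ins, and the "low bit marks a symbol with position" encoding is a toy model invented for this example. Only the three-way control flow mirrors the proposed EQ.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef intptr_t obj;   /* toy stand-in for Lisp_Object */
bool pos_enabled;       /* toy stand-in for symbols_with_pos_enabled */

/* One counter per exit of the comparison.  */
struct eq_counters
{
  unsigned long calls;      /* total number of comparisons */
  unsigned long identical;  /* arguments were bit-identical */
  unsigned long flag_off;   /* differed, but the flag was off */
  unsigned long slow;       /* had to take the out-of-line path */
};
struct eq_counters eq_stats;

/* Toy slow path: in this model the low bit marks a wrapper object,
   and clearing it yields the wrapped value.  */
bool
slow_compare (obj x, obj y)
{
  return (x & ~(obj) 1) == (y & ~(obj) 1);
}

/* Same shape as the proposed EQ, with a counter bumped on each exit.  */
bool
counted_eq (obj x, obj y)
{
  eq_stats.calls++;
  if (x == y)
    {
      eq_stats.identical++;
      return true;
    }
  if (!pos_enabled)
    {
      eq_stats.flag_off++;
      return false;
    }
  eq_stats.slow++;
  return slow_compare (x, y);
}

/* Print each exit as a percentage of all calls.  */
void
report_eq_stats (FILE *out)
{
  if (eq_stats.calls == 0)
    return;
  double total = eq_stats.calls;
  fprintf (out, "identical: %.1f%%  flag off: %.1f%%  slow path: %.1f%%\n",
           100.0 * eq_stats.identical / total,
           100.0 * eq_stats.flag_off / total,
           100.0 * eq_stats.slow / total);
}
```

In a real experiment the counters would presumably be dumped at exit after a "representative" session; note that the counting itself perturbs timing, so such a build is useful for the call distribution only, not for measuring speed.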
* Re: Improving EQ
  2024-12-12 10:42 ` Improving EQ Óscar Fuentes
@ 2024-12-12 10:50 ` Andrea Corallo
  2024-12-12 11:21 ` Óscar Fuentes
  ` (2 more replies)
  0 siblings, 3 replies; 58+ messages in thread
From: Andrea Corallo @ 2024-12-12 10:50 UTC
To: Óscar Fuentes; +Cc: emacs-devel, Pip Cet

Óscar Fuentes <ofv@wanadoo.es> writes:

> Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
> writes:
>
>> I looked at the "new" code generated for our EQ macro, and decided that
>> a fix was in order. I'm therefore sending a first proposal to explain
>> what I think should be done, and some numbers.
>>
>> This patch:
>> * moves the "slow path" of EQ into a NO_INLINE function
>> * exits early if the arguments to EQ are actually BASE_EQ
>> * returns quickly (after a single memory access which cannot be avoided
>>   until we fix our tagging scheme to distinguish exotic objects from
>>   ordinary ones) when symbols_with_pos_enabled isn't true.
>>
>> The effect on the code size of the stripped emacs binary is small, but
>> significant: 8906336 bytes instead of 8955488 bytes on this machine.
>> (The effect on the code size of the emacs binary with debugging
>> information is much larger, reducing it from 32182000 bytes to 31125832
>> bytes on this system.) There is no effect on the size of the .pdmp
>> file, which is expected.
>>
>> What's missing here is a benchmark, but unless there's a really nasty
>> surprise when that happens, I'm quite confident that we can improve the
>> code here.
>
> I've seen too many cases where *removing* instructions (mind you,
> literally removing, not changing!) made the code significantly slower.
>
> Modern CPUs are insanely complex and combined with compilers make
> intuition-based predictions even more futile.

That's why the patch needs to be benchmarked anyway.

> But reading your message makes me wonder if EQ and some other "simple"
> fundamental functions are not lowered by nativecomp? If not, maybe
> that's a significant opportunity for improvement.

Nativecomp only compiles eq for Lisp code; the one discussed here is the
eq used in C (and bytecode).

BTW, ATM nativecomp generates code with the same layout as the eq we had
in C until my last change a few weeks ago. Once eq is stable in C, I
guess I'll replicate the layout in the code generated for Lisp as well.

  Andrea
* Re: Improving EQ
  2024-12-12 10:50 ` Andrea Corallo
@ 2024-12-12 11:21 ` Óscar Fuentes
  2024-12-13 12:24 ` Pip Cet via Emacs development discussions.
  2024-12-12 17:05 ` Pip Cet via Emacs development discussions.
  2024-12-12 18:10 ` John ff
  2 siblings, 1 reply; 58+ messages in thread
From: Óscar Fuentes @ 2024-12-12 11:21 UTC
To: emacs-devel

Andrea Corallo <acorallo@gnu.org> writes:

>> But reading your message makes me wonder if EQ and some other "simple"
>> fundamental functions are not lowered by nativecomp? If not, maybe
>> that's a significant opportunity for improvement.
>
> Nativecomp only compiles eq for Lisp code, the one discussed here is the
> eq used in C (and bytecode).

Ok, thanks.

Of course this change also affects Emacs running with nativecomp, as
many calls to EQ are made by C functions not lowered by nativecomp.

My guess is that nativecomp's performance would benefit quite a bit from
the general approach of this patch, as every point where nativecomp
calls C is a pessimization spot, but that's another topic.
* Re: Improving EQ
  2024-12-12 11:21 ` Óscar Fuentes
@ 2024-12-13 12:24 ` Pip Cet via Emacs development discussions.
  0 siblings, 0 replies; 58+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-13 12:24 UTC
To: Óscar Fuentes; +Cc: emacs-devel

Óscar Fuentes <ofv@wanadoo.es> writes:

> Andrea Corallo <acorallo@gnu.org> writes:
>
>>> But reading your message makes me wonder if EQ and some other "simple"
>>> fundamental functions are not lowered by nativecomp? If not, maybe
>>> that's a significant opportunity for improvement.
>>
>> Nativecomp only compiles eq for Lisp code, the one discussed here is the
>> eq used in C (and bytecode).

Does nativecomp actually call emit_EQ for anything but lowering ELC jump
tables into a sequence of conditional branches? I don't see any code to
do so, and Emacs builds fine without emit_EQ if
byte-compile-cond-use-jump-table is disabled.

> Of course this change also affects Emacs running with nativecomp, as
> many calls to EQ are made by C functions not lowered by nativecomp.

My impression is that nativecomp usually ends up calling Feq for
eq-based conditions. My point is that emit_EQ is used very rarely, when
emitting a switch statement (and switch statements should usually use
maybe_remove_pos_from_symbol + BASE_EQ rather than EQ). So I won't
bother doing anything with emit_EQ.

Pip
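The remark about switch statements can be illustrated with a small standalone sketch. This is not Emacs or nativecomp code: `struct object`, `pos_enabled`, `strip_position` and `dispatch` are hypothetical stand-ins, with `strip_position` playing the role the message assigns to maybe_remove_pos_from_symbol and a plain pointer comparison playing the role of BASE_EQ. The sketch assumes the case keys are already bare symbols.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy object model: a non-null BARE pointer marks a "symbol with
   position" wrapper around the symbol it points to.  */
struct object
{
  const struct object *bare;
};

bool pos_enabled;   /* toy stand-in for symbols_with_pos_enabled */

/* Return the wrapped symbol if O is a wrapper and the flag is on,
   otherwise O itself.  */
const struct object *
strip_position (const struct object *o)
{
  return pos_enabled && o->bare ? o->bare : o;
}

/* Return the index of the key identical to VALUE, or -1 if there is
   none.  The position is stripped once, before the loop, so that
   each case needs only an identity comparison.  */
int
dispatch (const struct object *value,
          const struct object *const *keys, int nkeys)
{
  const struct object *bare = strip_position (value);
  for (int i = 0; i < nkeys; i++)
    if (bare == keys[i])
      return i;
  return -1;
}
```

Compared with calling a full EQ for every case, this performs the flag test and the wrapper test once per dispatch rather than once per comparison.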
* Re: Improving EQ
  2024-12-12 10:50 ` Andrea Corallo
  2024-12-12 11:21 ` Óscar Fuentes
@ 2024-12-12 17:05 ` Pip Cet via Emacs development discussions.
  2024-12-12 18:10 ` John ff
  2 siblings, 0 replies; 58+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-12 17:05 UTC
To: Andrea Corallo; +Cc: Óscar Fuentes, emacs-devel

"Andrea Corallo" <acorallo@gnu.org> writes:

> Óscar Fuentes <ofv@wanadoo.es> writes:
>> I've seen too many cases where *removing* instructions (mind you,
>> literally removing, not changing!) made the code significantly slower.

Yes, there are many ways in which that can happen. Removing prefetches
or branch hints is the most obvious example, but I don't claim to know
all the ways, and ultimately if an expected performance improvement does
not materialize we might have to decide this one on code size reasons
alone (of course, if performance is drastically worse, we shouldn't
apply the patch).

>> Modern CPUs are insanely complex and combined with compilers make
>> intuition-based predictions even more futile.

The compiler isn't the issue here, since I checked the assembly code
that was generated. Totally agree about CPUs.

For example, moving code out of line will change many conditional branch
locations to a single one (the one in the out-of-line function), which
may help or hurt branch prediction, and that's just one of many ways in
which inline functions often lose. So we should also benchmark whether
this might be one of those cases, in which case we'd want to move all of
EQ to a non-inlined function...

> That's why the patch needs to be benchmarked anyway.

Absolute agreement there. I tried some initial benchmarks and it's lost
in the noise, but that was while running on battery on a laptop, and I
need to test on a machine with a proper fixed clock rate.

>> But reading your message makes me wonder if EQ and some other "simple"
>> fundamental functions are not lowered by nativecomp? If not, maybe
>> that's a significant opportunity for improvement.
>
> Nativecomp only compiles eq for Lisp code, the one discussed here is the
> eq used in C (and bytecode).
>
> BTW ATM nativecomp generates code with the same layout of the eq we had
> in C till my last change of few weeks ago. When eq will be stable in C
> I guess I'll replicate the layout for generated code for Lisp as well.

Thanks, that would be great. Yes, it makes most sense to test with and
without nativecomp, expecting improvement to be more significant in the
latter case (but as EQ is used by C code used by native-compiled Lisp, I
expect a small improvement there, too).

Pip
* Re: Improving EQ
  2024-12-12 10:50 ` Andrea Corallo
  2024-12-12 11:21 ` Óscar Fuentes
  2024-12-12 17:05 ` Pip Cet via Emacs development discussions.
@ 2024-12-12 18:10 ` John ff
  2 siblings, 0 replies; 58+ messages in thread
From: John ff @ 2024-12-12 18:10 UTC
To: Andrea Corallo; +Cc: Óscar Fuentes, emacs-devel, Pip Cet

[-- Attachment #1: Type: text/plain, Size: 2352 bytes --]

-------- Original Message --------
From: Andrea Corallo <acorallo@gnu.org>
Sent: Thu Dec 12 10:50:04 GMT 2024
To: "Óscar Fuentes" <ofv@wanadoo.es>
Cc: emacs-devel@gnu.org, Pip Cet <pipcet@protonmail.com>
Subject: Re: Improving EQ

Óscar Fuentes <ofv@wanadoo.es> writes:

> Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
> writes:
>
>> I looked at the "new" code generated for our EQ macro, and decided that
>> a fix was in order. I'm therefore sending a first proposal to explain
>> what I think should be done, and some numbers.
>>
>> This patch:
>> * moves the "slow path" of EQ into a NO_INLINE function
>> * exits early if the arguments to EQ are actually BASE_EQ
>> * returns quickly (after a single memory access which cannot be avoided
>>   until we fix our tagging scheme to distinguish exotic objects from
>>   ordinary ones) when symbols_with_pos_enabled isn't true.
>>
>> The effect on the code size of the stripped emacs binary is small, but
>> significant: 8906336 bytes instead of 8955488 bytes on this machine.
>> (The effect on the code size of the emacs binary with debugging
>> information is much larger, reducing it from 32182000 bytes to 31125832
>> bytes on this system.) There is no effect on the size of the .pdmp
>> file, which is expected.
>>
>> What's missing here is a benchmark, but unless there's a really nasty
>> surprise when that happens, I'm quite confident that we can improve the
>> code here.
>
> I've seen too many cases where *removing* instructions (mind you,
> literally removing, not changing!) made the code significantly slower.
>
> Modern CPUs are insanely complex and combined with compilers make
> intuition-based predictions even more futile.

That's why the patch needs to be benchmarked anyway.

> But reading your message makes me wonder if EQ and some other "simple"
> fundamental functions are not lowered by nativecomp? If not, maybe
> that's a significant opportunity for improvement.

Nativecomp only compiles eq for Lisp code, the one discussed here is the
eq used in C (and bytecode).

BTW ATM nativecomp generates code with the same layout of the eq we had
in C till my last change of few weeks ago. When eq will be stable in C
I guess I'll replicate the layout for generated code for Lisp as well.

  Andrea

[-- Attachment #2: Type: text/html, Size: 3135 bytes --]