unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
* Improving EQ
@ 2024-12-11 22:37 Pip Cet via Emacs development discussions.
  2024-12-12  6:36 ` Eli Zaretskii
  2024-12-12 10:42 ` Improving EQ Óscar Fuentes
  0 siblings, 2 replies; 64+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-11 22:37 UTC (permalink / raw)
  To: emacs-devel

I looked at the "new" code generated for our EQ macro, and decided that
a fix was in order.  I'm therefore sending a first proposal to explain
what I think should be done, and some numbers.

This patch:
* moves the "slow path" of EQ into a NO_INLINE function
* exits early if the arguments to EQ are actually BASE_EQ
* returns quickly (after a single memory access which cannot be avoided
until we fix our tagging scheme to distinguish exotic objects from
ordinary ones) when symbols_with_pos_enabled isn't true.

The effect on the code size of the stripped emacs binary is small, but
significant: 8906336 bytes instead of 8955488 bytes on this machine.
(The effect on the code size of the emacs binary with debugging
information is much larger, reducing it from 32182000 bytes to 31125832
bytes on this system.)  There is no effect on the size of the .pdmp
file, which is expected.

What's missing here is a benchmark, but unless there's a really nasty
surprise when that happens, I'm quite confident that we can improve the
code here.

The proposed code doesn't use __builtin_expect anymore.

I've deliberately written slow_eq so it returns the same value as EQ,
even if the slow code path is disabled.

Pip

commit 2c807f7320bcb9654e0f148d64c92053b1a47b42 (HEAD -> faster-eq)
Author: Pip Cet <pipcet@protonmail.com>
Date:   Wed Dec 11 22:31:07 2024 +0000

    Change EQ to move slow code path into a separate function
    
    * src/data.c (slow_eq): New function.
    * src/lisp.h (EQ): Call it.

diff --git a/src/data.c b/src/data.c
index 66cf34c1e60..5ee383d2f48 100644
--- a/src/data.c
+++ b/src/data.c
@@ -162,6 +162,15 @@ circular_list (Lisp_Object list)
 \f
 /* Data type predicates.  */
 
+/* NO_INLINE to avoid excessive code growth when LTO is in use.  */
+NO_INLINE bool slow_eq (Lisp_Object x, Lisp_Object y)
+{
+  return BASE_EQ ((symbols_with_pos_enabled && SYMBOL_WITH_POS_P (x) ?
+		   XSYMBOL_WITH_POS_SYM (x) : x),
+		  (symbols_with_pos_enabled && SYMBOL_WITH_POS_P (y) ?
+		   XSYMBOL_WITH_POS_SYM (y) : y));
+}
+
 DEFUN ("eq", Feq, Seq, 2, 2, 0,
        doc: /* Return t if the two args are the same Lisp object.  */
        attributes: const)
diff --git a/src/lisp.h b/src/lisp.h
index 832a1755c04..64d4835a499 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -618,6 +618,7 @@ #define ENUM_BF(TYPE) enum TYPE
 extern Lisp_Object default_value (Lisp_Object symbol);
 extern void defalias (Lisp_Object symbol, Lisp_Object definition);
 extern char *fixnum_to_string (EMACS_INT number, char *buffer, char *end);
+extern bool slow_eq (Lisp_Object x, Lisp_Object y);
 
 
 /* Defined in emacs.c.  */
@@ -1353,10 +1354,12 @@ make_fixed_natnum (EMACS_INT n)
 INLINE bool
 EQ (Lisp_Object x, Lisp_Object y)
 {
-  return BASE_EQ ((__builtin_expect (symbols_with_pos_enabled, false)
-		   && SYMBOL_WITH_POS_P (x) ? XSYMBOL_WITH_POS_SYM (x) : x),
-		  (__builtin_expect (symbols_with_pos_enabled, false)
-		   && SYMBOL_WITH_POS_P (y) ? XSYMBOL_WITH_POS_SYM (y) : y));
+  if (BASE_EQ (x, y))
+    return true;
+  else if (!symbols_with_pos_enabled)
+    return false;
+  else
+    return slow_eq (x, y);
 }
 
 INLINE intmax_t




^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: Improving EQ
  2024-12-11 22:37 Improving EQ Pip Cet via Emacs development discussions.
@ 2024-12-12  6:36 ` Eli Zaretskii
  2024-12-12  8:23   ` Andrea Corallo
  2024-12-12  8:36   ` Pip Cet via Emacs development discussions.
  2024-12-12 10:42 ` Improving EQ Óscar Fuentes
  1 sibling, 2 replies; 64+ messages in thread
From: Eli Zaretskii @ 2024-12-12  6:36 UTC (permalink / raw)
  To: Pip Cet, Mattias Engdegård, Paul Eggert; +Cc: emacs-devel

> Date: Wed, 11 Dec 2024 22:37:04 +0000
> From:  Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
> 
> What's missing here is a benchmark, but unless there's a really nasty
> surprise when that happens, I'm quite confident that we can improve the
> code here.

The usual easy benchmark is to byte-compile all the *.el files in the
source tree.  That is, remove all the *.elc files, then say "make" and
time that.

There was also some Emacs benchmark suite that someone posted, but I
cannot find it now, maybe someone else will.

Adding Mattias and Paul to this discussion.



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: Improving EQ
  2024-12-12  6:36 ` Eli Zaretskii
@ 2024-12-12  8:23   ` Andrea Corallo
  2024-12-12  8:36   ` Pip Cet via Emacs development discussions.
  1 sibling, 0 replies; 64+ messages in thread
From: Andrea Corallo @ 2024-12-12  8:23 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Pip Cet, Mattias Engdegård, Paul Eggert, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> Date: Wed, 11 Dec 2024 22:37:04 +0000
>> From:  Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
>> 
>> What's missing here is a benchmark, but unless there's a really nasty
>> surprise when that happens, I'm quite confident that we can improve the
>> code here.
>
> The usual easy benchmark is to byte-compile all the *.el files in the
> source tree.  That is, remove all the *.elc files, then say "make" and
> time that.

Agree, considering that tests the non zero 'symbols_with_pos_enabled'
case.

> There was also some Emacs benchmark suite that someone posted, but I
> cannot find it now, maybe someone else will.

<https://elpa.gnu.org/packages/elisp-benchmarks.html>



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: Improving EQ
  2024-12-12  6:36 ` Eli Zaretskii
  2024-12-12  8:23   ` Andrea Corallo
@ 2024-12-12  8:36   ` Pip Cet via Emacs development discussions.
  2024-12-12  9:18     ` Eli Zaretskii
                       ` (3 more replies)
  1 sibling, 4 replies; 64+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-12  8:36 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Mattias Engdegård, Paul Eggert, emacs-devel

"Eli Zaretskii" <eliz@gnu.org> writes:

>> Date: Wed, 11 Dec 2024 22:37:04 +0000
>> From:  Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
>>
>> What's missing here is a benchmark, but unless there's a really nasty
>> surprise when that happens, I'm quite confident that we can improve the
>> code here.
>
> The usual easy benchmark is to byte-compile all the *.el files in the
> source tree.  That is, remove all the *.elc files, then say "make" and
> time that.

Considering the point of the optimization was to make compilation (when
symbols_with_pos_enabled is true) slower, but speed up non-compilation
use cases, I think that may be the opposite of what we want :-)

Furthermore, the master branch doesn't currently build after deleting
all the *.elc files, because recompilation exceeds max-lisp-eval-depth
in that scenario (together with the known purespace issue, this pretty
much means "make bootstrap" is the only way I can rebuild an emacs tree
right now. It'd be great if Someone could look into this, but I've
failed to understand the native-compilation code (and been told off for
trying to) too often for that Someone to be me. Plus, of course, I fully
understand that native compilation currently has wrong code generation
bugs which obviously have to take priority over build issues...)

> There was also some Emacs benchmark suite that someone posted, but I
> cannot find it now, maybe someone else will.

https://elpa.gnu.org/packages/elisp-benchmarks.html ? It'd be great if
we could agree on a benchmark, and even better if there were a way to
reliably run it from emacs -Q :-)

In fact, I would suggest to move a reduced benchmark suite to the emacs
repo itself, and run it using "make benchmark".

Also, just to let everyone know, I'm planning to make the "exotic"
property (this object must or can use the slow_eq path) part (probably
the LSB) of the tag rather than accessing it via a global variable and
the PVEC type. This should reduce code size further, should speed up
things, and has some other advantages which I'll go into when I have
working code.

Pip




^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: Improving EQ
  2024-12-12  8:36   ` Pip Cet via Emacs development discussions.
@ 2024-12-12  9:18     ` Eli Zaretskii
  2024-12-12  9:35     ` Visuwesh
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 64+ messages in thread
From: Eli Zaretskii @ 2024-12-12  9:18 UTC (permalink / raw)
  To: Pip Cet; +Cc: mattiase, eggert, emacs-devel

> Date: Thu, 12 Dec 2024 08:36:50 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: Mattias Engdegård <mattiase@acm.org>, Paul Eggert <eggert@cs.ucla.edu>, emacs-devel@gnu.org
> 
> "Eli Zaretskii" <eliz@gnu.org> writes:
> 
> > The usual easy benchmark is to byte-compile all the *.el files in the
> > source tree.  That is, remove all the *.elc files, then say "make" and
> > time that.
> 
> Considering the point of the optimization was to make compilation (when
> symbols_with_pos_enabled is true) slower, but speed up non-compilation
> use cases, I think that may be the opposite of what we want :-)

That's fine, because knowing where this slows us down and by how much
is also important.

> Furthermore, the master branch doesn't currently build after deleting
> all the *.elc files, because recompilation exceeds max-lisp-eval-depth
> in that scenario (together with the known purespace issue, this pretty
> much means "make bootstrap" is the only way I can rebuild an emacs tree
> right now. It'd be great if Someone could look into this, but I've
> failed to understand the native-compilation code (and been told off for
> trying to) too often for that Someone to be me. Plus, of course, I fully
> understand that native compilation currently has wrong code generation
> bugs which obviously have to take priority over build issues...)

If this is with native-compilation, how about trying without?

Also, enlarging max-lisp-eval-depth (assuming you don't somehow hit
infinite recursion) locally should be easy: just add that to the
relevant Makefiles.

> https://elpa.gnu.org/packages/elisp-benchmarks.html ? It'd be great if
> we could agree on a benchmark, and even better if there were a way to
> reliably run it from emacs -Q :-)

Our benchmark facilities are very rudimentary, so agreement is not an
issue: we just use whatever is available.

> In fact, I would suggest to move a reduced benchmark suite to the emacs
> repo itself, and run it using "make benchmark".

Working on a better benchmark is very useful, but maybe we should try
solving one problem at a time?

> Also, just to let everyone know, I'm planning to make the "exotic"
> property (this object must or can use the slow_eq path) part (probably
> the LSB) of the tag rather than accessing it via a global variable and
> the PVEC type. This should reduce code size further, should speed up
> things, and has some other advantages which I'll go into when I have
> working code.

Whenever you change something in the tags, please remember to update
.gdbinit, otherwise we lose debugging support.



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: Improving EQ
  2024-12-12  8:36   ` Pip Cet via Emacs development discussions.
  2024-12-12  9:18     ` Eli Zaretskii
@ 2024-12-12  9:35     ` Visuwesh
  2024-12-12 10:40     ` Andrea Corallo
  2024-12-12 10:53     ` New "make benchmark" target Stefan Kangas
  3 siblings, 0 replies; 64+ messages in thread
From: Visuwesh @ 2024-12-12  9:35 UTC (permalink / raw)
  To: Pip Cet via Emacs development discussions.
  Cc: Eli Zaretskii, Pip Cet, Mattias Engdegård, Paul Eggert

[வியாழன் டிசம்பர் 12, 2024] Pip Cet via "Emacs development discussions." wrote:

>> There was also some Emacs benchmark suite that someone posted, but I
>> cannot find it now, maybe someone else will.
>
> https://elpa.gnu.org/packages/elisp-benchmarks.html ? It'd be great if
> we could agree on a benchmark, and even better if there were a way to
> reliably run it from emacs -Q :-)

Will the command package-isolate help in this scenario?



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: Improving EQ
  2024-12-12  8:36   ` Pip Cet via Emacs development discussions.
  2024-12-12  9:18     ` Eli Zaretskii
  2024-12-12  9:35     ` Visuwesh
@ 2024-12-12 10:40     ` Andrea Corallo
  2024-12-12 17:46       ` Pip Cet via Emacs development discussions.
  2024-12-12 10:53     ` New "make benchmark" target Stefan Kangas
  3 siblings, 1 reply; 64+ messages in thread
From: Andrea Corallo @ 2024-12-12 10:40 UTC (permalink / raw)
  To: Pip Cet via Emacs development discussions.
  Cc: Eli Zaretskii, Pip Cet, Mattias Engdegård, Paul Eggert

Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
writes:

> "Eli Zaretskii" <eliz@gnu.org> writes:
>
>>> Date: Wed, 11 Dec 2024 22:37:04 +0000
>>> From:  Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
>>>
>>> What's missing here is a benchmark, but unless there's a really nasty
>>> surprise when that happens, I'm quite confident that we can improve the
>>> code here.
>>
>> The usual easy benchmark is to byte-compile all the *.el files in the
>> source tree.  That is, remove all the *.elc files, then say "make" and
>> time that.
>
> Considering the point of the optimization was to make compilation (when
> symbols_with_pos_enabled is true) slower, but speed up non-compilation
> use cases, I think that may be the opposite of what we want :-)

Glad you finally agree on the goal of the optimization.

> Furthermore, the master branch doesn't currently build after deleting
> all the *.elc files, because recompilation exceeds max-lisp-eval-depth
> in that scenario (together with the known purespace issue, this pretty
> much means "make bootstrap" is the only way I can rebuild an emacs tree
> right now. It'd be great if Someone could look into this, but I've
> failed to understand the native-compilation code (and been told off for
> trying to) too often for that Someone to be me. Plus, of course, I fully
> understand that native compilation currently has wrong code generation
> bugs which obviously have to take priority over build issues...)
>
>> There was also some Emacs benchmark suite that someone posted, but I
>> cannot find it now, maybe someone else will.
>
> https://elpa.gnu.org/packages/elisp-benchmarks.html ? It'd be great if
> we could agree on a benchmark, and even better if there were a way to
> reliably run it from emacs -Q :-)

What is not reliable in the elisp-benchmarks invocation suggested in the
instructions in it?

> In fact, I would suggest to move a reduced benchmark suite to the emacs
> repo itself, and run it using "make benchmark".

That would be nice.



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: Improving EQ
  2024-12-11 22:37 Improving EQ Pip Cet via Emacs development discussions.
  2024-12-12  6:36 ` Eli Zaretskii
@ 2024-12-12 10:42 ` Óscar Fuentes
  2024-12-12 10:50   ` Andrea Corallo
  1 sibling, 1 reply; 64+ messages in thread
From: Óscar Fuentes @ 2024-12-12 10:42 UTC (permalink / raw)
  To: emacs-devel; +Cc: Pip Cet

Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
writes:

> I looked at the "new" code generated for our EQ macro, and decided that
> a fix was in order.  I'm therefore sending a first proposal to explain
> what I think should be done, and some numbers.
>
> This patch:
> * moves the "slow path" of EQ into a NO_INLINE function
> * exits early if the arguments to EQ are actually BASE_EQ
> * returns quickly (after a single memory access which cannot be avoided
> until we fix our tagging scheme to distinguish exotic objects from
> ordinary ones) when symbols_with_pos_enabled isn't true.
>
> The effect on the code size of the stripped emacs binary is small, but
> significant: 8906336 bytes instead of 8955488 bytes on this machine.
> (The effect on the code size of the emacs binary with debugging
> information is much larger, reducing it from 32182000 bytes to 31125832
> bytes on this system.)  There is no effect on the size of the .pdmp
> file, which is expected.
>
> What's missing here is a benchmark, but unless there's a really nasty
> surprise when that happens, I'm quite confident that we can improve the
> code here.

I've seen too many cases where *removing* instructions (mind you,
literally removing, not changing!) made the code significantly slower.

Modern CPUs are insanely complex and combined with compilers make
intuition-based predictions even more futile.

But reading your message makes me wonder if EQ and some other "simple"
fundamental functions are not lowered by nativecomp? If not, maybe
that's a significant opportunity for improvement.

As for your patch, one thing that would be easy to do and might save
quite a lot of head scratching is to count the fraction of the calls to
EQ that benefit from the fast path on a "representative" Emacs run. Then
you would have hard data to decide if fighting the compiler/CPU on that
case is a worthy cause.



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: Improving EQ
  2024-12-12 10:42 ` Improving EQ Óscar Fuentes
@ 2024-12-12 10:50   ` Andrea Corallo
  2024-12-12 11:21     ` Óscar Fuentes
                       ` (2 more replies)
  0 siblings, 3 replies; 64+ messages in thread
From: Andrea Corallo @ 2024-12-12 10:50 UTC (permalink / raw)
  To: Óscar Fuentes; +Cc: emacs-devel, Pip Cet

Óscar Fuentes <ofv@wanadoo.es> writes:

> Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
> writes:
>
>> I looked at the "new" code generated for our EQ macro, and decided that
>> a fix was in order.  I'm therefore sending a first proposal to explain
>> what I think should be done, and some numbers.
>>
>> This patch:
>> * moves the "slow path" of EQ into a NO_INLINE function
>> * exits early if the arguments to EQ are actually BASE_EQ
>> * returns quickly (after a single memory access which cannot be avoided
>> until we fix our tagging scheme to distinguish exotic objects from
>> ordinary ones) when symbols_with_pos_enabled isn't true.
>>
>> The effect on the code size of the stripped emacs binary is small, but
>> significant: 8906336 bytes instead of 8955488 bytes on this machine.
>> (The effect on the code size of the emacs binary with debugging
>> information is much larger, reducing it from 32182000 bytes to 31125832
>> bytes on this system.)  There is no effect on the size of the .pdmp
>> file, which is expected.
>>
>> What's missing here is a benchmark, but unless there's a really nasty
>> surprise when that happens, I'm quite confident that we can improve the
>> code here.
>
> I've seen too many cases where *removing* instructions (mind you,
> literally removing, not changing!) made the code significantly slower.
>
> Modern CPUs are insanely complex and combined with compilers make
> intuition-based predictions even more futile.

That's why the patch needs to be benchmarked anyway.

> But reading your message makes me wonder if EQ and some other "simple"
> fundamental functions are not lowered by nativecomp? If not, maybe
> that's a significant opportunity for improvement.

Nativecomp only compiles eq for Lisp code, the one discussed here is the
eq used in C (and bytecode).

BTW ATM nativecomp generates code with the same layout of the eq we had
in C till my last change of few weeks ago.  When eq will be stable in C
I guess I'll replicate the layout for generated code for Lisp as well.

  Andrea



^ permalink raw reply	[flat|nested] 64+ messages in thread

* New "make benchmark" target
  2024-12-12  8:36   ` Pip Cet via Emacs development discussions.
                       ` (2 preceding siblings ...)
  2024-12-12 10:40     ` Andrea Corallo
@ 2024-12-12 10:53     ` Stefan Kangas
  2024-12-12 10:59       ` Andrea Corallo
  3 siblings, 1 reply; 64+ messages in thread
From: Stefan Kangas @ 2024-12-12 10:53 UTC (permalink / raw)
  To: Pip Cet, Eli Zaretskii
  Cc: Mattias Engdegård, Paul Eggert, emacs-devel, Andrea Corallo

Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
writes:

> https://elpa.gnu.org/packages/elisp-benchmarks.html ? It'd be great if
> we could agree on a benchmark, and even better if there were a way to
> reliably run it from emacs -Q :-)
>
> In fact, I would suggest to move a reduced benchmark suite to the emacs
> repo itself, and run it using "make benchmark".

SGTM, but why a reduced suite and not just the whole thing?



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-12 10:53     ` New "make benchmark" target Stefan Kangas
@ 2024-12-12 10:59       ` Andrea Corallo
  2024-12-12 16:53         ` Pip Cet via Emacs development discussions.
  0 siblings, 1 reply; 64+ messages in thread
From: Andrea Corallo @ 2024-12-12 10:59 UTC (permalink / raw)
  To: Stefan Kangas
  Cc: Pip Cet, Eli Zaretskii, Mattias Engdegård, Paul Eggert,
	emacs-devel

Stefan Kangas <stefankangas@gmail.com> writes:

> Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
> writes:
>
>> https://elpa.gnu.org/packages/elisp-benchmarks.html ? It'd be great if
>> we could agree on a benchmark, and even better if there were a way to
>> reliably run it from emacs -Q :-)
>>
>> In fact, I would suggest to move a reduced benchmark suite to the emacs
>> repo itself, and run it using "make benchmark".
>
> SGTM, but why a reduced suite and not just the whole thing?

My fear is that if we start going into the rabbit hole of which
benchmark of elisp-benchmarks should or should not be included, we will
never agree and as a consequence succeed.  So I guess I'd favor as well
including all elisp-benchmarks.

  Andrea



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: Improving EQ
  2024-12-12 10:50   ` Andrea Corallo
@ 2024-12-12 11:21     ` Óscar Fuentes
  2024-12-13 12:24       ` Pip Cet via Emacs development discussions.
  2024-12-12 17:05     ` Pip Cet via Emacs development discussions.
  2024-12-12 18:10     ` John ff
  2 siblings, 1 reply; 64+ messages in thread
From: Óscar Fuentes @ 2024-12-12 11:21 UTC (permalink / raw)
  To: emacs-devel

Andrea Corallo <acorallo@gnu.org> writes:

>> But reading your message makes me wonder if EQ and some other "simple"
>> fundamental functions are not lowered by nativecomp? If not, maybe
>> that's a significant opportunity for improvement.
>
> Nativecomp only compiles eq for Lisp code, the one discussed here is the
> eq used in C (and bytecode).

Ok, thanks.

Of course this change also affects Emacs running with nativecomp, as
many calls to EQ are made by C functions not lowered by nativecomp.

My guess is that nativecomp's performance would benefit quite a bit from
the general approach of this patch, as every point where nativecomp
calls C is a pessimization spot, but that's another topic.




^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-12 10:59       ` Andrea Corallo
@ 2024-12-12 16:53         ` Pip Cet via Emacs development discussions.
  2024-12-13  0:49           ` Stefan Kangas
  0 siblings, 1 reply; 64+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-12 16:53 UTC (permalink / raw)
  To: Andrea Corallo
  Cc: Stefan Kangas, Eli Zaretskii, Mattias Engdegård, Paul Eggert,
	emacs-devel

"Andrea Corallo" <acorallo@gnu.org> writes:

> Stefan Kangas <stefankangas@gmail.com> writes:
>
>> Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
>> writes:
>>
>>> https://elpa.gnu.org/packages/elisp-benchmarks.html ? It'd be great if
>>> we could agree on a benchmark, and even better if there were a way to
>>> reliably run it from emacs -Q :-)
>>>
>>> In fact, I would suggest to move a reduced benchmark suite to the emacs
>>> repo itself, and run it using "make benchmark".
>>
>> SGTM, but why a reduced suite and not just the whole thing?
>
> My fear is that if we start going into the rabbit hole of which
> benchmark of elisp-benchmarks should or should not be included, we will
> never agree and as a consequence succeed.  So I guess I'd favor as well
> including all elisp-benchmarks.

I agree a full benchmark suite would be even better, I don't recall why
I typed "reduced" there. So let's do that?

Pip




^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: Improving EQ
  2024-12-12 10:50   ` Andrea Corallo
  2024-12-12 11:21     ` Óscar Fuentes
@ 2024-12-12 17:05     ` Pip Cet via Emacs development discussions.
  2024-12-12 18:10     ` John ff
  2 siblings, 0 replies; 64+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-12 17:05 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: Óscar Fuentes, emacs-devel

"Andrea Corallo" <acorallo@gnu.org> writes:
> Óscar Fuentes <ofv@wanadoo.es> writes:
>> I've seen too many cases where *removing* instructions (mind you,
>> literally removing, not changing!) made the code significantly slower.

Yes, there are many ways in which that can happen. Removing prefetches
or branch hints is the most obvious example, but I don't claim to know
all the ways, and ultimately if an expected performance improvement does
not materialize we might have to decide this one on code size reasons
alone (of course, if performance is drastically worse, we shouldn't
apply the patch).

>> Modern CPUs are insanely complex and combined with compilers make
>> intuition-based predictions even more futile.

The compiler isn't the issue here, since I checked the assembly code
that was generated. Totally agree about CPUs. For example, moving code
out of line will change many conditional branch locations to a single
one (the one in the out-of-line function), which may help or hurt branch
prediction, and that's just one of many ways in which inline functions
often lose.  So we should also benchmark whether this might be one of
those cases, in which case we'd want to move all of EQ to a non-inlined
function...

> That's why the patch needs to be benchmarked anyway.

Absolute agreement there. I tried some initial benchmarks and it's lost
in the noise, but that was while running on battery on a laptop, and I
need to test on a machine with a proper fixed clock rate.

>> But reading your message makes me wonder if EQ and some other "simple"
>> fundamental functions are not lowered by nativecomp? If not, maybe
>> that's a significant opportunity for improvement.
>
> Nativecomp only compiles eq for Lisp code, the one discussed here is the
> eq used in C (and bytecode).
>
> BTW ATM nativecomp generates code with the same layout of the eq we had
> in C till my last change of few weeks ago.  When eq will be stable in C
> I guess I'll replicate the layout for generated code for Lisp as well.

Thanks, that would be great. Yes, it makes most sense to test with and
without nativecomp, expecting improvement to be more significant in the
latter case (but as EQ is used by C code used by native-compiled Lisp, I
expect a small improvement there, too).

Pip




^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: Improving EQ
  2024-12-12 10:40     ` Andrea Corallo
@ 2024-12-12 17:46       ` Pip Cet via Emacs development discussions.
  2024-12-12 19:09         ` Eli Zaretskii
  0 siblings, 1 reply; 64+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-12 17:46 UTC (permalink / raw)
  To: Andrea Corallo
  Cc: Pip Cet via "Emacs development discussions.",
	Eli Zaretskii, Mattias Engdegård, Paul Eggert

"Andrea Corallo" <acorallo@gnu.org> writes:

> Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
> writes:
>
>> "Eli Zaretskii" <eliz@gnu.org> writes:
>>
>>>> Date: Wed, 11 Dec 2024 22:37:04 +0000
>>>> From:  Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
>>>>
>>>> What's missing here is a benchmark, but unless there's a really nasty
>>>> surprise when that happens, I'm quite confident that we can improve the
>>>> code here.
>>>
>>> The usual easy benchmark is to byte-compile all the *.el files in the
>>> source tree.  That is, remove all the *.elc files, then say "make" and
>>> time that.
>>
>> Considering the point of the optimization was to make compilation (when
>> symbols_with_pos_enabled is true) slower, but speed up non-compilation
>> use cases, I think that may be the opposite of what we want :-)
>
> Glad you finally agree on the goal of the optimization.

I find that statement to be offensive, because you know it to be
factually incorrect; but even if it weren't, gloating like that is
extremely poor form for a maintainer.

Pip




^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: Improving EQ
  2024-12-12 10:50   ` Andrea Corallo
  2024-12-12 11:21     ` Óscar Fuentes
  2024-12-12 17:05     ` Pip Cet via Emacs development discussions.
@ 2024-12-12 18:10     ` John ff
  2 siblings, 0 replies; 64+ messages in thread
From: John ff @ 2024-12-12 18:10 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: Óscar Fuentes, emacs-devel, Pip Cet

[-- Attachment #1: Type: text/plain, Size: 2352 bytes --]




-------- Original Message --------
From: Andrea Corallo <acorallo@gnu.org>
Sent: Thu Dec 12 10:50:04 GMT 2024
To: "Óscar Fuentes" <ofv@wanadoo.es>
Cc: emacs-devel@gnu.org, Pip Cet <pipcet@protonmail.com>
Subject: Re: Improving EQ

Óscar Fuentes <ofv@wanadoo.es> writes:

> Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
> writes:
>
>> I looked at the "new" code generated for our EQ macro, and decided that
>> a fix was in order.  I'm therefore sending a first proposal to explain
>> what I think should be done, and some numbers.
>>
>> This patch:
>> * moves the "slow path" of EQ into a NO_INLINE function
>> * exits early if the arguments to EQ are actually BASE_EQ
>> * returns quickly (after a single memory access which cannot be avoided
>> until we fix our tagging scheme to distinguish exotic objects from
>> ordinary ones) when symbols_with_pos_enabled isn't true.
>>
>> The effect on the code size of the stripped emacs binary is small, but
>> significant: 8906336 bytes instead of 8955488 bytes on this machine.
>> (The effect on the code size of the emacs binary with debugging
>> information is much larger, reducing it from 32182000 bytes to 31125832
>> bytes on this system.)  There is no effect on the size of the .pdmp
>> file, which is expected.
>>
>> What's missing here is a benchmark, but unless there's a really nasty
>> surprise when that happens, I'm quite confident that we can improve the
>> code here.
>
> I've seen too many cases where *removing* instructions (mind you,
> literally removing, not changing!) made the code significantly slower.
>
> Modern CPUs are insanely complex and combined with compilers make
> intuition-based predictions even more futile.

That's why the patch needs to be benchmarked anyway.

> But reading your message makes me wonder if EQ and some other "simple"
> fundamental functions are not lowered by nativecomp? If not, maybe
> that's a significant opportunity for improvement.

Nativecomp only compiles eq for Lisp code, the one discussed here is the
eq used in C (and bytecode).

BTW ATM nativecomp generates code with the same layout of the eq we had
in C till my last change of few weeks ago.  When eq will be stable in C
I guess I'll replicate the layout for generated code for Lisp as well.

  Andrea


[-- Attachment #2: Type: text/html, Size: 3135 bytes --]

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: Improving EQ
  2024-12-12 17:46       ` Pip Cet via Emacs development discussions.
@ 2024-12-12 19:09         ` Eli Zaretskii
  0 siblings, 0 replies; 64+ messages in thread
From: Eli Zaretskii @ 2024-12-12 19:09 UTC (permalink / raw)
  To: Pip Cet; +Cc: acorallo, emacs-devel, mattiase, eggert

> Date: Thu, 12 Dec 2024 17:46:55 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: "Pip Cet via \"Emacs development discussions.\"" <emacs-devel@gnu.org>, Eli Zaretskii <eliz@gnu.org>, Mattias Engdegård <mattiase@acm.org>, Paul Eggert <eggert@cs.ucla.edu>
> 
> "Andrea Corallo" <acorallo@gnu.org> writes:
> 
> >> Considering the point of the optimization was to make compilation (when
> >> symbols_with_pos_enabled is true) slower, but speed up non-compilation
> >> use cases, I think that may be the opposite of what we want :-)
> >
> > Glad you finally agree on the goal of the optimization.
> 
> I find that statement to be offensive, because you know it to be
> factually incorrect; but even if it weren't, gloating like that is
> extremely poor form for a maintainer.

I'm quite sure Andrea meant it as tongue-in-cheek, nowhere near
gloating.  Please keep in mind that for Andrea and others here,
English is not their first language, and their choice of words could
reflect that.



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-12 16:53         ` Pip Cet via Emacs development discussions.
@ 2024-12-13  0:49           ` Stefan Kangas
  2024-12-13  7:37             ` Andrea Corallo
  2024-12-14 11:34             ` Pip Cet via Emacs development discussions.
  0 siblings, 2 replies; 64+ messages in thread
From: Stefan Kangas @ 2024-12-13  0:49 UTC (permalink / raw)
  To: Pip Cet, Andrea Corallo
  Cc: Eli Zaretskii, Mattias Engdegård, Paul Eggert, emacs-devel

Pip Cet <pipcet@protonmail.com> writes:

> I agree a full benchmark suite would be even better, I don't recall why
> I typed "reduced" there. So let's do that?

Please go ahead, thanks.



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-13  0:49           ` Stefan Kangas
@ 2024-12-13  7:37             ` Andrea Corallo
  2024-12-14 12:00               ` Stefan Kangas
  2024-12-14 11:34             ` Pip Cet via Emacs development discussions.
  1 sibling, 1 reply; 64+ messages in thread
From: Andrea Corallo @ 2024-12-13  7:37 UTC (permalink / raw)
  To: Stefan Kangas
  Cc: Pip Cet, Eli Zaretskii, Mattias Engdegård, Paul Eggert,
	emacs-devel

Stefan Kangas <stefankangas@gmail.com> writes:

> Pip Cet <pipcet@protonmail.com> writes:
>
>> I agree a full benchmark suite would be even better, I don't recall why
>> I typed "reduced" there. So let's do that?
>
> Please go ahead, thanks.

Asking as elisp-benchmark author/maintainer, the way to move an external
package in core is to copy the files and keep them manually in sync or
there are other ways?



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: Improving EQ
  2024-12-12 11:21     ` Óscar Fuentes
@ 2024-12-13 12:24       ` Pip Cet via Emacs development discussions.
  0 siblings, 0 replies; 64+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-13 12:24 UTC (permalink / raw)
  To: Óscar Fuentes; +Cc: emacs-devel

Óscar Fuentes <ofv@wanadoo.es> writes:

> Andrea Corallo <acorallo@gnu.org> writes:
>
>>> But reading your message makes me wonder if EQ and some other "simple"
>>> fundamental functions are not lowered by nativecomp? If not, maybe
>>> that's a significant opportunity for improvement.
>>
>> Nativecomp only compiles eq for Lisp code, the one discussed here is the
>> eq used in C (and bytecode).

Does nativecomp actually call emit_EQ for anything but lowering ELC jump
tables into a sequence of conditional branches?  I don't see any code to
do so, and Emacs builds fine without emit_EQ if
byte-compile-cond-use-jump-table is disabled.

> Of course this change also affects Emacs running with nativecomp, as
> many calls to EQ are made by C functions not lowered by nativecomp.

My impression is that nativecomp usually ends up calling Feq for
eq-based conditions.

My point is that emit_EQ is used very rarely, when emitting a switch
statement (and switch statements should usually use
maybe_remove_pos_from_symbol + BASE_EQ rather than EQ).

So I won't bother doing anything with emit_EQ.

Pip




^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-13  0:49           ` Stefan Kangas
  2024-12-13  7:37             ` Andrea Corallo
@ 2024-12-14 11:34             ` Pip Cet via Emacs development discussions.
  2024-12-14 11:58               ` Stefan Kangas
  1 sibling, 1 reply; 64+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-14 11:34 UTC (permalink / raw)
  To: Stefan Kangas
  Cc: Andrea Corallo, Eli Zaretskii, Mattias Engdegård,
	Paul Eggert, emacs-devel

"Stefan Kangas" <stefankangas@gmail.com> writes:

> Pip Cet <pipcet@protonmail.com> writes:
>
>> I agree a full benchmark suite would be even better, I don't recall why
>> I typed "reduced" there. So let's do that?
>
> Please go ahead, thanks.

Here's my proposal:

Expand ERT to handle the benchmark case. Copy ALL benchmarks from
elisp-benchmarks, but use the ERT framework instead of
elisp-benchmarks.el.  Keep things minimal for now, try it out, and add
complexity only if we get the impression this would make a useful
addition; otherwise, revert the changes and go back to using
elisp-benchmarks.el.

This is what would work:
1. add a :benchmark tag to ert tests
2. create a new directory test/benchmark/ for benchmarks
3. modify test/Makefile not to run benchmark tests by default
4. add make targets to run the benchmark tests only

I think the mechanism used by elisp-benchmarks.el to select tests is
very ad-hoc and less powerful than the ert tagging mechanism. It also
doesn't work for me: executing

(progn (elisp-benchmarks-run "" t 1)
       (elisp-benchmarks-run "bubble" t 1))

means all tests are run twice, but I intended to run all tests once, as
a warm up, then run only the bubble test again.  However, I suspect this
is merely a bug which can easily be fixed (maybe it's intentional,
though?).  I'm also seeing problems with a "Window size not as
stipulated by the benchmark" error message, but I'll have to investigate
that one...

The mathematical part of elisp-benchmarks.el is questionable: it's built
around the idea of using an arithmetic average of several test runs
(which is equivalent to a single test run with more iterations); I
believe a median/percentile-based approach to selecting a "typical" run
would yield more useful numbers.  So I'm not proposing to reuse the
avg-based code.

I tried to resist the temptation of making ert.el overly general; for
example, instead of defining a new defstruct to determine HOW tests are
run, I merely added a benchmark flag. Maybe we can revisit this if the
approach is adopted and the need for more detailed benchmark
specifications (inhibit GC? warmup? iteration counts? interact with
"perf"?) becomes apparent.  However, I did fail and give in to the
temptation to allow an inhibit-gc mode specifier, which should probably
be removed again...

The main problems I see are that "make benchmark" starts emacs instances
for all files in test/, which takes a lot of time (but that's a general
ERT problem that should be fixed for pass-or-fail testing, too).

A minor problem is how to copy the elisp-benchmark tests and keep them
in sync.  This would very much depend on how much work Andrea is willing
to do.

Finally, elisp-benchmarks has a very useful feature, somewhat hidden,
that this code lacks: while calculating the arithmetic average of
several benchmark runs isn't useful, calculating the standard deviation
from that average is, because it gives us an indication of how scattered
the results are; scattered test results (i.e. high numbers reported in
the "tot avg err" column) are a sufficient, but not a necessary,
condition for knowing when to discard the test results because something
unexpected happened (most likely system load issues or CPU clock games).

Benchmarking is, of course, very hard. I understand Paul Eggert is using
an ancient machine for benchmarking because it avoids many of the issues
that have arisen in the past decade. With a modern machine, we have a
heterogeneous set of CPU cores (energy/performance), each of which can
be configured in different ways (energy-performance preference for each
core) in addition to running at a variable and unpredictable clock speed
(cpufreq/boost); CPU caches are now large enough to persist across
benchmark runs, and the system memory clock is also variable.

This is a very rough start which would allow us to detect many, but not
all, unexpected catastrophic performance reductions due to code changes.

Finally, if Someone is willing to work on this, we should look into
finding a set of benchmarks representative of "typical" Emacs usage so
we can use PGO when building Emacs. While I'd prefer doing neither of
the two, playing with PGO is a much more promising and maintainable
approach than adding __builtin_expect noise to our C source code.

Pip

From 4217a5b8f990760775709392b24e0205041cfed3 Mon Sep 17 00:00:00 2001
From: Pip Cet <pipcet@protonmail.com>
Date: Sat, 14 Dec 2024 10:45:42 +0000
Subject: [PATCH 1/3] Add a benchmark from elisp-benchmarks

DO NOT MERGE

FIXME BEFORE MERGING: Should we add a link to
https://git.savannah.gnu.org/gitweb/?p=emacs/elpa.git;a=history;\
f=benchmarks/bubble.el;h=d7101b1b99b60a3bd6945909d1f0125215f4ce1c;\
hb=refs/heads/externals/elisp-benchmarks
here?  Losing git history because we copy a file from elpa to emacs
seems suboptimal...

* test/benchmark/bubble.el: New file.
---
 test/benchmark/bubble.el | 52 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 52 insertions(+)
 create mode 100644 test/benchmark/bubble.el

diff --git a/test/benchmark/bubble.el b/test/benchmark/bubble.el
new file mode 100644
index 00000000000..0c38cdbce39
--- /dev/null
+++ b/test/benchmark/bubble.el
@@ -0,0 +1,52 @@
+;; -*- lexical-binding: t; -*-
+
+;; Copyright (C) 2019 Free Software Foundation, Inc.
+
+;; This file is part of GNU Emacs.
+
+;; GNU Emacs is free software: you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; GNU Emacs is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GNU Emacs.  If not, see <https://www.gnu.org/licenses/>.
+
+;;; Commentary:
+
+;; From:
+;; https://www.emacswiki.org/emacs/EmacsLispBenchmark
+
+(require 'ert)
+
+(require 'cl-lib)
+
+(defvar elb-bubble-len 1000)
+(defvar elb-bubble-list (mapcar #'random (make-list elb-bubble-len
+						    most-positive-fixnum)))
+
+(defun elb-bubble (list)
+  (let ((i (length list)))
+    (while (> i 1)
+      (let ((b list))
+        (while (cdr b)
+          (when (< (cadr b) (car b))
+            (setcar b (prog1 (cadr b)
+                        (setcdr b (cons (car b) (cddr b))))))
+          (setq b (cdr b))))
+      (setq i (1- i)))
+    list))
+
+(defun elb-bubble-entry ()
+  (cl-loop repeat 100
+	   for l = (copy-sequence elb-bubble-list)
+	   do (elb-bubble l)))
+
+(ert-deftest benchmark-bubble ()
+  :tags '(:benchmark)
+  (elb-bubble-entry))
-- 
2.47.0

From df31e19452dff0fe804af2fd3c73f4cee84b6d16 Mon Sep 17 00:00:00 2001
From: Pip Cet <pipcet@protonmail.com>
Date: Sat, 14 Dec 2024 10:56:19 +0000
Subject: [PATCH 2/3] Expand the ERT framework to allow for benchmarks

* lisp/emacs-lisp/ert.el (ert-test-result, ert--test-execution-info):
Expand structs to include "benchmark" field.
(ert--run-benchmark-test-internal): New function.
(ert-run-test, ert-run-or-rerun-test):
(ert-run-tests-batch-and-exit):
(ert-run-tests): Add benchmark argument.
(ert-run-tests-batch): Include GC information when running benchmarks.
(ert-summarize-tests-batch-and-exit): Handle 1.0e+INF.
---
 lisp/emacs-lisp/ert.el | 91 +++++++++++++++++++++++++++++++++---------
 1 file changed, 73 insertions(+), 18 deletions(-)

diff --git a/lisp/emacs-lisp/ert.el b/lisp/emacs-lisp/ert.el
index 97aa233f6e2..76365ed8152 100644
--- a/lisp/emacs-lisp/ert.el
+++ b/lisp/emacs-lisp/ert.el
@@ -698,6 +698,7 @@ ert-test-result
   (messages nil)
   (should-forms nil)
   (duration 0)
+  (benchmark nil)
   )
 (cl-defstruct (ert-test-passed (:include ert-test-result)))
 (cl-defstruct (ert-test-result-with-condition (:include ert-test-result))
@@ -723,7 +724,8 @@ ert--test-execution-info
   ;; execution of the current test.  We store it to avoid being
   ;; affected by any new bindings the test itself may establish.  (I
   ;; don't remember whether this feature is important.)
-  ert-debug-on-error)
+  ert-debug-on-error
+  benchmark)
 
 (defun ert--run-test-debugger (info condition debugfun)
   "Error handler used during the test run.
@@ -801,6 +803,39 @@ ert--run-test-internal
         (make-ert-test-passed))
   nil)
 
+(defun ert--run-benchmark-test-internal (test-execution-info benchmark)
+  (setf (ert--test-execution-info-ert-debug-on-error test-execution-info)
+        ert-debug-on-error)
+  (catch 'ert--pass
+    ;; For now, each test gets its own temp buffer and its own
+    ;; window excursion, just to be safe.  If this turns out to be
+    ;; too expensive, we can remove it.
+    (with-temp-buffer
+      (save-window-excursion
+        (let ((lexical-binding t) ;;FIXME: Why?
+              (ert--infos '())
+              time)
+          (letrec ((debugfun (lambda (err)
+                               (ert--run-test-debugger test-execution-info
+                                                       err debugfun))))
+            (handler-bind (((error quit) debugfun))
+              (garbage-collect)
+              (let ((gc-cons-threshold (if (eq benchmark 'inhibit-gc)
+                                           most-positive-fixnum
+                                         gc-cons-threshold)))
+                (setq time (benchmark-run nil
+                             (funcall (ert-test-body (ert--test-execution-info-test
+                                                      test-execution-info))))))
+              (and (eq benchmark 'inhibit-gc)
+                   (not (= (cadr time) 0))
+                   (warn "failed to inhibit gc; time %S" time))
+              (setf (ert--test-execution-info-benchmark test-execution-info)
+                    time))))))
+    (ert-pass))
+  (setf (ert--test-execution-info-result test-execution-info)
+        (make-ert-test-passed))
+  nil)
+
 (defun ert--force-message-log-buffer-truncation ()
   "Immediately truncate *Messages* buffer according to `message-log-max'.
 
@@ -832,7 +867,7 @@ ert--running-tests
 
 The elements are of type `ert-test'.")
 
-(defun ert-run-test (ert-test)
+(defun ert-run-test (ert-test &optional benchmark)
   "Run ERT-TEST.
 
 Returns the result and stores it in ERT-TEST's `most-recent-result' slot."
@@ -855,8 +890,12 @@ ert-run-test
                          (push form-description should-form-accu)))
                       (message-log-max t)
                       (ert--running-tests (cons ert-test ert--running-tests)))
-                  (ert--run-test-internal info))
+                  (if benchmark
+                      (ert--run-benchmark-test-internal info benchmark)
+                    (ert--run-test-internal info)))
               (let ((result (ert--test-execution-info-result info)))
+                (setf (ert-test-result-benchmark result)
+                      (ert--test-execution-info-benchmark info))
                 (setf (ert-test-result-messages result)
                       (with-current-buffer (messages-buffer)
                         (buffer-substring begin-marker (point-max))))
@@ -1206,7 +1245,7 @@ ert--make-stats
                      :test-start-times (make-vector (length tests) nil)
                      :test-end-times (make-vector (length tests) nil))))
 
-(defun ert-run-or-rerun-test (stats test listener)
+(defun ert-run-or-rerun-test (stats test listener &optional benchmark)
   ;; checkdoc-order: nil
   "Run the single test TEST and record the result using STATS and LISTENER."
   (let ((ert--current-run-stats stats)
@@ -1221,19 +1260,26 @@ ert-run-or-rerun-test
     (setf (ert-test-most-recent-result test) nil)
     (setf (aref (ert--stats-test-start-times stats) pos) (current-time))
     (unwind-protect
-        (ert-run-test test)
+        (ert-run-test test benchmark)
       (setf (aref (ert--stats-test-end-times stats) pos) (current-time))
       (let ((result (ert-test-most-recent-result test)))
-        (setf (ert-test-result-duration result)
-              (float-time
-               (time-subtract
-                (aref (ert--stats-test-end-times stats) pos)
-                (aref (ert--stats-test-start-times stats) pos))))
+        (cond ((ert-test-result-benchmark result)
+               (setf (ert-test-result-duration result)
+                     (if (memq benchmark '(no-gc inhibit-gc))
+                         (- (car (ert-test-result-benchmark result))
+                            (caddr (ert-test-result-benchmark result)))
+                       (car (ert-test-result-benchmark result)))))
+              (t
+               (setf (ert-test-result-duration result)
+                (float-time
+                 (time-subtract
+                  (aref (ert--stats-test-end-times stats) pos)
+                  (aref (ert--stats-test-start-times stats) pos))))))
         (ert--stats-set-test-and-result stats pos test result)
         (funcall listener 'test-ended stats test result))
       (setf (ert--stats-current-test stats) nil))))
 
-(defun ert-run-tests (selector listener &optional interactively)
+(defun ert-run-tests (selector listener &optional interactively benchmark)
   "Run the tests specified by SELECTOR, sending progress updates to LISTENER."
   (let* ((tests (ert-select-tests selector t))
          (stats (ert--make-stats tests selector)))
@@ -1245,7 +1291,7 @@ ert-run-tests
             (force-mode-line-update)
             (unwind-protect
 		(cl-loop for test in tests do
-			 (ert-run-or-rerun-test stats test listener)
+			 (ert-run-or-rerun-test stats test listener benchmark)
 			 (when (and interactively
 				    (ert-test-quit-p
 				     (ert-test-most-recent-result test))
@@ -1367,7 +1413,7 @@ ert-batch-backtrace-right-margin
   "The maximum line length for printing backtraces in `ert-run-tests-batch'.")
 
 ;;;###autoload
-(defun ert-run-tests-batch (&optional selector)
+(defun ert-run-tests-batch (&optional selector benchmark)
   "Run the tests specified by SELECTOR, printing results to the terminal.
 
 SELECTOR selects which tests to run as described in `ert-select-tests' when
@@ -1493,7 +1539,7 @@ ert-run-tests-batch
             (let* ((max (prin1-to-string (length (ert--stats-tests stats))))
                    (format-string (concat "%9s  %"
                                           (prin1-to-string (length max))
-                                          "s/" max "  %S (%f sec)%s")))
+                                          "s/" max "  %S (%f sec%s)%s")))
               (message format-string
                        (ert-string-for-test-result result
                                                    (ert-test-result-expected-p
@@ -1501,13 +1547,19 @@ ert-run-tests-batch
                        (1+ (ert--stats-test-pos stats test))
                        (ert-test-name test)
                        (ert-test-result-duration result)
+                       (if (ert-test-result-benchmark result)
+                           (format ", %f sec in GC, %d GCs"
+                                   (caddr (ert-test-result-benchmark result))
+                                   (cadr (ert-test-result-benchmark result)))
+                         "")
                        (if (ert-test-result-expected-p test result)
                            ""
                          (concat " " (ert-test-location test))))))))))
-   nil))
+   nil
+   benchmark))
 
 ;;;###autoload
-(defun ert-run-tests-batch-and-exit (&optional selector)
+(defun ert-run-tests-batch-and-exit (&optional selector benchmark)
   "Like `ert-run-tests-batch', but exits Emacs when done.
 
 The exit status will be 0 if all test results were as expected, 1
@@ -1525,7 +1577,7 @@ ert-run-tests-batch-and-exit
     (setq attempt-stack-overflow-recovery nil
           attempt-orderly-shutdown-on-fatal-signal nil)
     (unwind-protect
-        (let ((stats (ert-run-tests-batch selector)))
+        (let ((stats (ert-run-tests-batch selector benchmark)))
           (when eln-dir
             (ignore-errors
               (delete-directory eln-dir t)))
@@ -1726,7 +1778,10 @@ ert-summarize-tests-batch-and-exit
 If HIGH is a natural number, the HIGH long lasting tests are summarized."
   (or noninteractive
       (user-error "This function is only for use in batch mode"))
-  (or (natnump high) (setq high 0))
+  (cond
+   ;; FIXME: ntake doesn't allow an infinity argument
+   ((eql high 1.0e+INF) (setq high most-positive-fixnum))
+   ((not (natnump high)) (setq high 0)))
   ;; Better crash loudly than attempting to recover from undefined
   ;; behavior.
   (setq attempt-stack-overflow-recovery nil
-- 
2.47.0

From 8cd4053967a0aa6521039ba887c911daa13b0cf0 Mon Sep 17 00:00:00 2001
From: Pip Cet <pipcet@protonmail.com>
Date: Sat, 14 Dec 2024 10:59:38 +0000
Subject: [PATCH 3/3] Add "make benchmark" rule

* Makefile.in (benchmark): New recipe.
* test/Makefile.in (SELECTOR_BENCHMARK): New selector.
(SELECTOR_ALL, SELECTOR_EXPENSIVE, SELECTOR_DEFAULT): Modify selectors
not to include benchmark tests.
(check-benchmark): New recipe.
---
 Makefile.in      |  7 +++++++
 test/Makefile.in | 25 +++++++++++++++++--------
 2 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/Makefile.in b/Makefile.in
index 30a762ed03b..13a55452da2 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -69,6 +69,9 @@
 #      check-expensive includes additional tests that can be slow.
 #      check-all runs all tests, including ones that can be slow, or
 #        fail unpredictably
+#
+# make benchmark
+#      Run the Emacs benchmark suite.
 
 SHELL = @SHELL@
 
@@ -1138,6 +1141,10 @@ $(CHECK_TARGETS):
 test/%:
 	$(MAKE) -C test $*
 
+BENCHMARK_TARGETS = benchmark
+.PHONY: $(BENCHMARK_TARGETS)
+$(BENCHMARK_TARGETS): all
+	$(MAKE) SUMMARIZE_TESTS="1.0e+INF" BENCHMARKP="t" -C test check-benchmark
 
 dist:
 	cd ${srcdir}; ./make-dist
diff --git a/test/Makefile.in b/test/Makefile.in
index 7a3178546a1..18a478b3e6c 100644
--- a/test/Makefile.in
+++ b/test/Makefile.in
@@ -78,9 +78,9 @@ TEST_TIMEOUT =
 TEST_INTERACTIVE ?= no
 
 ifeq ($(TEST_INTERACTIVE),yes)
-TEST_RUN_ERT = --eval '(ert (quote ${SELECTOR_ACTUAL}))'
+TEST_RUN_ERT = --eval '(ert (quote ${SELECTOR_ACTUAL}) ${BENCHMARKP})'
 else
-TEST_RUN_ERT = --batch --eval '(ert-run-tests-batch-and-exit (quote ${SELECTOR_ACTUAL}))' ${WRITE_LOG}
+TEST_RUN_ERT = --batch --eval '(ert-run-tests-batch-and-exit (quote ${SELECTOR_ACTUAL}) ${BENCHMARKP})' ${WRITE_LOG}
 endif
 
 # Whether to run tests from .el files in preference to .elc, we do
@@ -136,13 +136,15 @@ TEST_NATIVE_COMP =
 TEST_NATIVE_COMP = no
 endif
 ifeq ($(TEST_NATIVE_COMP),yes)
-SELECTOR_DEFAULT = (not (or (tag :expensive-test) (tag :unstable)))
-SELECTOR_EXPENSIVE = (not (tag :unstable))
-SELECTOR_ALL = t
+SELECTOR_DEFAULT = (not (or (tag :expensive-test) (tag :unstable) (tag :benchmark)))
+SELECTOR_EXPENSIVE = (not (or (tag :unstable) (tag :benchmark)))
+SELECTOR_ALL = (not (tag :benchmark))
+SELECTOR_BENCHMARK = (tag :benchmark)
 else
-SELECTOR_DEFAULT = (not (or (tag :expensive-test) (tag :unstable) (tag :nativecomp)))
-SELECTOR_EXPENSIVE = (not (or (tag :unstable) (tag :nativecomp)))
-SELECTOR_ALL = (not (tag :nativecomp))
+SELECTOR_DEFAULT = (not (or (tag :expensive-test) (tag :unstable) (tag :nativecomp) (tag :benchmark)))
+SELECTOR_EXPENSIVE = (not (or (tag :unstable) (tag :nativecomp) (tag :benchmark)))
+SELECTOR_ALL = (not (or (tag :nativecomp) (tag :benchmark)))
+SELECTOR_BENCHMARK = (and (tag :benchmark) (not (tag :nativecomp)))
 endif
 ifdef SELECTOR
 SELECTOR_ACTUAL=$(SELECTOR)
@@ -154,6 +156,8 @@ SELECTOR_ACTUAL=
 SELECTOR_ACTUAL=$(SELECTOR_DEFAULT)
 else ifeq ($(MAKECMDGOALS),check-maybe)
 SELECTOR_ACTUAL=$(SELECTOR_DEFAULT)
+else ifeq ($(MAKECMDGOALS),check-benchmark)
+SELECTOR_ACTUAL=$(SELECTOR_BENCHMARK)
 else
 SELECTOR_ACTUAL=$(SELECTOR_EXPENSIVE)
 endif
@@ -323,6 +327,11 @@ .PHONY:
 check-all: mostlyclean check-no-automated-subdir
 	@${MAKE} check-doit SELECTOR="${SELECTOR_ALL}"
 
+## Run all benchmark tests, regardless of tag.
+.PHONY: check-benchmark
+check-benchmark: mostlyclean check-no-automated-subdir
+	@${MAKE} check-doit SELECTOR="${SELECTOR_BENCHMARK}"
+
 ## Re-run all tests which are outdated. A test is outdated if its
 ## logfile is out-of-date with either the test file, or the source
 ## files that the tests depend on.  See test_template.
-- 
2.47.0





^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-14 11:34             ` Pip Cet via Emacs development discussions.
@ 2024-12-14 11:58               ` Stefan Kangas
  2024-12-14 20:07                 ` Pip Cet via Emacs development discussions.
  0 siblings, 1 reply; 64+ messages in thread
From: Stefan Kangas @ 2024-12-14 11:58 UTC (permalink / raw)
  To: Pip Cet
  Cc: Andrea Corallo, Eli Zaretskii, Mattias Engdegård,
	Paul Eggert, emacs-devel

Pip Cet <pipcet@protonmail.com> writes:

> FIXME BEFORE MERGING: Should we add a link to
> https://git.savannah.gnu.org/gitweb/?p=emacs/elpa.git;a=history;\
> f=benchmarks/bubble.el;h=d7101b1b99b60a3bd6945909d1f0125215f4ce1c;\
> hb=refs/heads/externals/elisp-benchmarks
> here?  Losing git history because we copy a file from elpa to emacs
> seems suboptimal...

Instead of copying the file, it might be preferable to import the entire
git history into emacs.git, like we did for use-package and eglot.  Then
the old branch on GNU ELPA can be dropped, as we won't lose any history.

João has some scripts that he used for eglot, and I adapted them for
use-package.  Note that he also had some copyright assignment issues to
take care of, so it could probably be simplified.

Please take a look here:
https://gist.github.com/joaotavora/2ed97f2ec85958986983d5cb78202770



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-13  7:37             ` Andrea Corallo
@ 2024-12-14 12:00               ` Stefan Kangas
  2024-12-14 14:06                 ` Stefan Monnier
  0 siblings, 1 reply; 64+ messages in thread
From: Stefan Kangas @ 2024-12-14 12:00 UTC (permalink / raw)
  To: Andrea Corallo
  Cc: Pip Cet, Eli Zaretskii, Mattias Engdegård, Paul Eggert,
	emacs-devel, Stefan Monnier

Andrea Corallo <acorallo@gnu.org> writes:

> Asking as elisp-benchmark author/maintainer, the way to move an external
> package in core is to copy the files and keep them manually in sync or
> there are other ways?

We can make it into a :core package, which means that you copy the file
to emacs.git, and when you update the "Version" header on Emacs master,
the GNU ELPA scripts make a release based on that commit.

Stefan Monnier (in CC) will know if there are any other adjustments that
are needed, but one thing that needs doing is a change to the GNU ELPA
`elpa-packages` file, something like this:

- (elisp-benchmarks	:url nil)
+ (elisp-benchmarks	:core ("lisp/emacs-lisp/elisp-benchmarks.el"))



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-14 12:00               ` Stefan Kangas
@ 2024-12-14 14:06                 ` Stefan Monnier
  0 siblings, 0 replies; 64+ messages in thread
From: Stefan Monnier @ 2024-12-14 14:06 UTC (permalink / raw)
  To: Stefan Kangas
  Cc: Andrea Corallo, Pip Cet, Eli Zaretskii, Mattias Engdegård,
	Paul Eggert, emacs-devel

>> Asking as elisp-benchmark author/maintainer, the way to move an external
>> package in core is to copy the files and keep them manually in sync or
>> there are other ways?

Well, there's the all too famous "bundled ELPA packages" feature we
never finished, of course (see `git branch -a | grep elpa` for
various approaches Philip Lord proposed and that we never integrated 🙁).
[ Stepping down before installing such a feature is my biggest regret
  w.r.t my time as Emacs maintainer.  ]

> - (elisp-benchmarks	:url nil)
> + (elisp-benchmarks	:core ("lisp/emacs-lisp/elisp-benchmarks.el"))

I suspect the above would need to be completed with more files/dirs in
the list.


        Stefan




^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-14 11:58               ` Stefan Kangas
@ 2024-12-14 20:07                 ` Pip Cet via Emacs development discussions.
  2024-12-14 20:20                   ` João Távora
                                     ` (2 more replies)
  0 siblings, 3 replies; 64+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-14 20:07 UTC (permalink / raw)
  To: Stefan Kangas
  Cc: Andrea Corallo, Eli Zaretskii, Mattias Engdegård,
	Paul Eggert, emacs-devel, João Távora

"Stefan Kangas" <stefankangas@gmail.com> writes:

> Pip Cet <pipcet@protonmail.com> writes:
>
>> FIXME BEFORE MERGING: Should we add a link to
>> https://git.savannah.gnu.org/gitweb/?p=emacs/elpa.git;a=history;\
>> f=benchmarks/bubble.el;h=d7101b1b99b60a3bd6945909d1f0125215f4ce1c;\
>> hb=refs/heads/externals/elisp-benchmarks
>> here?  Losing git history because we copy a file from elpa to emacs
>> seems suboptimal...
>
> Instead of copying the file, it might be preferable to import the entire
> git history into emacs.git, like we did for use-package and eglot.  Then
> the old branch on GNU ELPA can be dropped, as we won't lose any history.

Just to be clear, dropping the branch in GNU ELPA wouldn't mean that the
package would no longer be available, just that it would build signed
packages from the main Emacs repo?

> João has some scripts that he used for eglot, and I adapted them for
> use-package.  Note that he also had some copyright assignment issues to
> take care of, so it could probably be simplified.
>
> Please take a look here:
> https://gist.github.com/joaotavora/2ed97f2ec85958986983d5cb78202770

Thanks for the pointer!  I tried getting that to work, and finally
succeeded in creating a (local) merged brach, but then I noticed that
the commit messages will need editing to conform to the ChangeLog style.

We also need to decide on the directory structure; right now, I've
created a lisp/emacs-lisp/benchmarks/ directory; I'd prefer
lisp/benchmarks (which would make it easier to exclude the benchmark
files from compilation), but I don't have a strong preference and others
should make that decision.  (I haven't included the
lisp/emacs-lisp/subdirs.el file, but if we decide to keep the benchmarks
in lisp/emacs-lisp/benchmarks/, we'll need to gitignore that, too).

There are some byte compiler warnings, which I guess we should fix.

I'm not sure how to proceed here.  Since there aren't that many commits,
I can offer to change the commit messages myself, but I fully understand
if someone else (Andrea or another volunteer) wants to do it.

I can push the merged branch to a scratch branch of the emacs repo, or
you can access the merged branch (without any edits to the commit
messages) at
https://codeberg.org/pipcet/emacs/src/branch/merge-elisp-benchmarks

Joao, I think it would be a good idea to keep the modified script in
admin/ or somewhere for future reference, but I don't know whether you
consider it an Emacs contribution (and, thus, covered by your copyright
assignment).

[Political/Software Freedom note: I know that some people on the list
don't care, but others do: at least occasionally, GitHub won't let you
see a "gist" unless you use a GitHub account.  This surprised me (and
isn't reproducible now, so it's possible this was a temporary
misconfiguration on GitHub's side, but who knows?) and it's another
reason not to use GitHub for code sharing.]

Pip




^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-14 20:07                 ` Pip Cet via Emacs development discussions.
@ 2024-12-14 20:20                   ` João Távora
  2024-12-15  0:57                   ` Stefan Kangas
  2024-12-15  0:58                   ` Stefan Kangas
  2 siblings, 0 replies; 64+ messages in thread
From: João Távora @ 2024-12-14 20:20 UTC (permalink / raw)
  To: Pip Cet
  Cc: Stefan Kangas, Andrea Corallo, Eli Zaretskii,
	Mattias Engdegård, Paul Eggert, emacs-devel

On Sat, Dec 14, 2024 at 8:07 PM Pip Cet <pipcet@protonmail.com> wrote:

> Joao, I think it would be a good idea to keep the modified script in
> admin/ or somewhere for future reference, but I don't know whether you
> consider it an Emacs contribution (and, thus, covered by your copyright
> assignment).

Sure, use it.  But there's a catch that I'm fairly sure I didn't write part
of that code.  Someone on the emacs-devel list helped me and wrote
the first versions of it (following my requirements), and I'm very sorry
but I can't remember who it was. Github handle is 'bhankas.  Good luck!



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-14 20:07                 ` Pip Cet via Emacs development discussions.
  2024-12-14 20:20                   ` João Távora
@ 2024-12-15  0:57                   ` Stefan Kangas
  2024-12-22 16:04                     ` Pip Cet via Emacs development discussions.
  2024-12-15  0:58                   ` Stefan Kangas
  2 siblings, 1 reply; 64+ messages in thread
From: Stefan Kangas @ 2024-12-15  0:57 UTC (permalink / raw)
  To: Pip Cet
  Cc: Andrea Corallo, Eli Zaretskii, Mattias Engdegård,
	Paul Eggert, emacs-devel, João Távora

Pip Cet <pipcet@protonmail.com> writes:

> Just to be clear, dropping the branch in GNU ELPA wouldn't mean that the
> package would no longer be available, just that it would build signed
> packages from the main Emacs repo?

Yes.

>> João has some scripts that he used for eglot, and I adapted them for
>> use-package.  Note that he also had some copyright assignment issues to
>> take care of, so it could probably be simplified.
>>
>> Please take a look here:
>> https://gist.github.com/joaotavora/2ed97f2ec85958986983d5cb78202770
>
> Thanks for the pointer!  I tried getting that to work, and finally
> succeeded in creating a (local) merged brach, but then I noticed that
> the commit messages will need editing to conform to the ChangeLog style.

I guess there's no clear drawback creating the ChangeLog entries for
each commit, but it's not required.  Only the final merge commit needs a
ChangeLog entry.

I guess that entry will look something like this:

    * lisp/emacs-lisp/elisp-benchmarks.el:
    * lisp/emacs-lisp/benchmarks/bubble.el:
    * lisp/emacs-lisp/benchmarks/pidigits.el: New files.

(Incidentally, this is the same ChangeLog entry we would use if we just
copied the files without preserving history.)

> We also need to decide on the directory structure; right now, I've
> created a lisp/emacs-lisp/benchmarks/ directory; I'd prefer
> lisp/benchmarks (which would make it easier to exclude the benchmark
> files from compilation), but I don't have a strong preference and others
> should make that decision.  (I haven't included the
> lisp/emacs-lisp/subdirs.el file, but if we decide to keep the benchmarks
> in lisp/emacs-lisp/benchmarks/, we'll need to gitignore that, too).

I don't have a strong opinion here, but maybe this stuff belongs under
test/ even?

> I'm not sure how to proceed here.  Since there aren't that many commits,
> I can offer to change the commit messages myself, but I fully understand
> if someone else (Andrea or another volunteer) wants to do it.

FWIW, I'd just go ahead without waiting for someone else.



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-14 20:07                 ` Pip Cet via Emacs development discussions.
  2024-12-14 20:20                   ` João Távora
  2024-12-15  0:57                   ` Stefan Kangas
@ 2024-12-15  0:58                   ` Stefan Kangas
  2 siblings, 0 replies; 64+ messages in thread
From: Stefan Kangas @ 2024-12-15  0:58 UTC (permalink / raw)
  To: Pip Cet
  Cc: Andrea Corallo, Eli Zaretskii, Mattias Engdegård,
	Paul Eggert, emacs-devel, João Távora

Pip Cet <pipcet@protonmail.com> writes:

> I can push the merged branch to a scratch branch of the emacs repo, or
> you can access the merged branch (without any edits to the commit
> messages) at
> https://codeberg.org/pipcet/emacs/src/branch/merge-elisp-benchmarks

(I missed replying to this part.)

Pushing this to a scratch branch is suitable here, indeed.

Thanks for working on this.



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-15  0:57                   ` Stefan Kangas
@ 2024-12-22 16:04                     ` Pip Cet via Emacs development discussions.
  2024-12-29 10:47                       ` Andrea Corallo
  0 siblings, 1 reply; 64+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-22 16:04 UTC (permalink / raw)
  To: Stefan Kangas
  Cc: Andrea Corallo, Eli Zaretskii, Mattias Engdegård,
	Paul Eggert, emacs-devel, João Távora

"Stefan Kangas" <stefankangas@gmail.com> writes:

> Pip Cet <pipcet@protonmail.com> writes:

>> We also need to decide on the directory structure; right now, I've
>> created a lisp/emacs-lisp/benchmarks/ directory; I'd prefer
>> lisp/benchmarks (which would make it easier to exclude the benchmark
>> files from compilation), but I don't have a strong preference and others
>> should make that decision.  (I haven't included the
>> lisp/emacs-lisp/subdirs.el file, but if we decide to keep the benchmarks
>> in lisp/emacs-lisp/benchmarks/, we'll need to gitignore that, too).
>
> I don't have a strong opinion here, but maybe this stuff belongs under
> test/ even?

I'm still working on this, but it turns out it's harder than I thought
to turn the .el files for the benchmarks into something that's usable
both with ERT and with the existing elisp-benchmarks.el infrastructure.

For example, there's the use of elb-bench-directory to locate resource
files; ERT has its own function for that, but it turns out one of the
resources one benchmark uses is the source file for another benchmark.
Usually I'd just use letf around the benchmark call, but that may affect
performance too much for the benchmarks to be comparable between the ERT
and elisp-benchmarks invocations.

I just don't know whether I'd feel comfortable invoking the benchmarks
in such different ways and presenting the results in a way that would
make people compare them.

The rest of the issues are trivial: whitespace issues, two different
files calling Fprovide with the same feature, elb-scroll.el merged into
elb-smie.el rather than maintaining them as two separate files.  These
are very definitely not deficiencies in the current elisp-benchmarks
package, just different conventions.  However, that amounts to
significant changes to the benchmark .el files overall; rather than
copies of the elisp-benchmarks files, we now have modified versions and
would have to port any changes between the two different sets of files.

Ultimately, my current benchmark branch doesn't do what I set out to do,
which is to share the elisp-benchmarks suite between an unmodified
elisp-benchmarks and the new ERT framework, yielding comparable results.
Getting it to work isn't the main problem, comparability of results is.

So it is with some trepidation that I suggest that the best remaining
option may be to fork or "freeze"/archive elisp-benchmarks and move
development of benchmarks for current Emacs builds entirely to the ERT
framework.  Forking causes a lot of extra synchronization work.
Archiving the package means we will never add new benchmarks for
pre-make-benchmark Emacs builds.

I'm convinced a "make benchmark" target is worth it.  I also think that
we should use the ERT framework, because benchmarks and pass-or-fail
tests are quite similar.

Maybe I'm missing an obvious solution here?

Pip




^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-22 16:04                     ` Pip Cet via Emacs development discussions.
@ 2024-12-29 10:47                       ` Andrea Corallo
  2024-12-30 11:45                         ` Pip Cet via Emacs development discussions.
  0 siblings, 1 reply; 64+ messages in thread
From: Andrea Corallo @ 2024-12-29 10:47 UTC (permalink / raw)
  To: Pip Cet
  Cc: Stefan Kangas, Eli Zaretskii, Mattias Engdegård, Paul Eggert,
	emacs-devel, João Távora

Pip Cet <pipcet@protonmail.com> writes:

> "Stefan Kangas" <stefankangas@gmail.com> writes:
>
>> Pip Cet <pipcet@protonmail.com> writes:
>
>>> We also need to decide on the directory structure; right now, I've
>>> created a lisp/emacs-lisp/benchmarks/ directory; I'd prefer
>>> lisp/benchmarks (which would make it easier to exclude the benchmark
>>> files from compilation), but I don't have a strong preference and others
>>> should make that decision.  (I haven't included the
>>> lisp/emacs-lisp/subdirs.el file, but if we decide to keep the benchmarks
>>> in lisp/emacs-lisp/benchmarks/, we'll need to gitignore that, too).
>>
>> I don't have a strong opinion here, but maybe this stuff belongs under
>> test/ even?
>
> I'm still working on this, but it turns out it's harder than I thought
> to turn the .el files for the benchmarks into something that's usable
> both with ERT and with the existing elisp-benchmarks.el infrastructure.
>
> For example, there's the use of elb-bench-directory to locate resource
> files; ERT has its own function for that, but it turns out one of the
> resources one benchmark uses is the source file for another benchmark.
> Usually I'd just use letf around the benchmark call, but that may affect
> performance too much for the benchmarks to be comparable between the ERT
> and elisp-benchmarks invocations.
>
> I just don't know whether I'd feel comfortable invoking the benchmarks
> in such different ways and presenting the results in a way that would
> make people compare them.
>
> The rest of the issues are trivial: whitespace issues, two different
> files calling Fprovide with the same feature, elb-scroll.el merged into
> elb-smie.el rather than maintaining them as two separate files.  These
> are very definitely not deficiencies in the current elisp-benchmarks
> package, just different conventions.  However, that amounts to
> significant changes to the benchmark .el files overall; rather than
> copies of the elisp-benchmarks files, we now have modified versions and
> would have to port any changes between the two different sets of files.
>
> Ultimately, my current benchmark branch doesn't do what I set out to do,
> which is to share the elisp-benchmarks suite between an unmodified
> elisp-benchmarks and the new ERT framework, yielding comparable results.
> Getting it to work isn't the main problem, comparability of results is.
>
> So it is with some trepidation that I suggest that the best remaining
> option may be to fork or "freeze"/archive elisp-benchmarks and move
> development of benchmarks for current Emacs builds entirely to the ERT
> framework.  Forking causes a lot of extra synchronization work.
> Archiving the package means we will never add new benchmarks for
> pre-make-benchmark Emacs builds.
>
> I'm convinced a "make benchmark" target is worth it.  I also think that
> we should use the ERT framework, because benchmarks and pass-or-fail
> tests are quite similar.
>
> Maybe I'm missing an obvious solution here?

I'd personally drop the requirement of using ERT as a framework for
benchmarks, I'd just move elisp-benchmarks code in emacs core and add
the target.

My 2 cents.

  Andrea



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-29 10:47                       ` Andrea Corallo
@ 2024-12-30 11:45                         ` Pip Cet via Emacs development discussions.
  2024-12-30 14:15                           ` Eli Zaretskii
  0 siblings, 1 reply; 64+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-30 11:45 UTC (permalink / raw)
  To: Andrea Corallo
  Cc: Stefan Kangas, Eli Zaretskii, Mattias Engdegård, Paul Eggert,
	emacs-devel, João Távora

"Andrea Corallo" <acorallo@gnu.org> writes:

> Pip Cet <pipcet@protonmail.com> writes:

>> I'm convinced a "make benchmark" target is worth it.  I also think that
>> we should use the ERT framework, because benchmarks and pass-or-fail
>> tests are quite similar.
>>
>> Maybe I'm missing an obvious solution here?
>
> I'd personally drop the requirement of using ERT as a framework for
> benchmarks, I'd just move elisp-benchmarks code in emacs core and add
> the target.

Well, as is obvious from the quoted paragraph, I disagree.  I've stated
why in the thread; if someone wants a summary, I can provide one, but I
won't do so now because it might sound too much like an attack on
elisp-benchmarks or its author.

It seems likely that the consequence of my suggestion to add a make
target to do something useful is that the make target is permanently
reserved for something much less useful, blocking the way for future
developments.

Pip




^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-30 11:45                         ` Pip Cet via Emacs development discussions.
@ 2024-12-30 14:15                           ` Eli Zaretskii
  2024-12-30 15:00                             ` Pip Cet via Emacs development discussions.
  0 siblings, 1 reply; 64+ messages in thread
From: Eli Zaretskii @ 2024-12-30 14:15 UTC (permalink / raw)
  To: Pip Cet; +Cc: acorallo, stefankangas, mattiase, eggert, emacs-devel, joaotavora

> Date: Mon, 30 Dec 2024 11:45:36 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: Stefan Kangas <stefankangas@gmail.com>, Eli Zaretskii <eliz@gnu.org>, Mattias Engdegård <mattiase@acm.org>, Paul Eggert <eggert@cs.ucla.edu>, emacs-devel@gnu.org, João Távora <joaotavora@gmail.com>
> 
> "Andrea Corallo" <acorallo@gnu.org> writes:
> 
> > I'd personally drop the requirement of using ERT as a framework for
> > benchmarks, I'd just move elisp-benchmarks code in emacs core and add
> > the target.
> 
> Well, as is obvious from the quoted paragraph, I disagree.  I've stated
> why in the thread; if someone wants a summary, I can provide one

Can you point to the message where you explained your rationale for
using ERT for this?  I've scanned the discussion, but couldn't find
such a message.

Thanks.



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-30 14:15                           ` Eli Zaretskii
@ 2024-12-30 15:00                             ` Pip Cet via Emacs development discussions.
  2024-12-30 15:21                               ` Eli Zaretskii
  0 siblings, 1 reply; 64+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-30 15:00 UTC (permalink / raw)
  To: Eli Zaretskii
  Cc: acorallo, stefankangas, mattiase, eggert, emacs-devel, joaotavora

"Eli Zaretskii" <eliz@gnu.org> writes:

>> Date: Mon, 30 Dec 2024 11:45:36 +0000
>> From: Pip Cet <pipcet@protonmail.com>
>> Cc: Stefan Kangas <stefankangas@gmail.com>, Eli Zaretskii
>> <eliz@gnu.org>, Mattias Engdegård <mattiase@acm.org>, Paul Eggert
>> <eggert@cs.ucla.edu>, emacs-devel@gnu.org, João Távora
>> <joaotavora@gmail.com>
>>
>> "Andrea Corallo" <acorallo@gnu.org> writes:
>>
>> > I'd personally drop the requirement of using ERT as a framework for
>> > benchmarks, I'd just move elisp-benchmarks code in emacs core and add
>> > the target.
>>
>> Well, as is obvious from the quoted paragraph, I disagree.  I've stated
>> why in the thread; if someone wants a summary, I can provide one
>
> Can you point to the message where you explained your rationale for
> using ERT for this?  I've scanned the discussion, but couldn't find
> such a message.

The best I can find is this:

https://lists.gnu.org/archive/html/emacs-devel/2024-12/msg00595.html

I can try to provide a more detailed/structured rationale if that's
helpful.  (Is it, though?  Reusing someone's code in a way which reduces
their user base and might cause them more work isn't something we should
do lightly.)

Pip




^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-30 15:00                             ` Pip Cet via Emacs development discussions.
@ 2024-12-30 15:21                               ` Eli Zaretskii
  2024-12-30 15:49                                 ` Pip Cet via Emacs development discussions.
  0 siblings, 1 reply; 64+ messages in thread
From: Eli Zaretskii @ 2024-12-30 15:21 UTC (permalink / raw)
  To: Pip Cet; +Cc: acorallo, stefankangas, mattiase, eggert, emacs-devel, joaotavora

> Date: Mon, 30 Dec 2024 15:00:29 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: acorallo@gnu.org, stefankangas@gmail.com, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org, joaotavora@gmail.com
> 
> "Eli Zaretskii" <eliz@gnu.org> writes:
> 
> >> Date: Mon, 30 Dec 2024 11:45:36 +0000
> >> From: Pip Cet <pipcet@protonmail.com>
> >> Cc: Stefan Kangas <stefankangas@gmail.com>, Eli Zaretskii
> >> <eliz@gnu.org>, Mattias Engdegård <mattiase@acm.org>, Paul Eggert
> >> <eggert@cs.ucla.edu>, emacs-devel@gnu.org, João Távora
> >> <joaotavora@gmail.com>
> >>
> >> "Andrea Corallo" <acorallo@gnu.org> writes:
> >>
> >> > I'd personally drop the requirement of using ERT as a framework for
> >> > benchmarks, I'd just move elisp-benchmarks code in emacs core and add
> >> > the target.
> >>
> >> Well, as is obvious from the quoted paragraph, I disagree.  I've stated
> >> why in the thread; if someone wants a summary, I can provide one
> >
> > Can you point to the message where you explained your rationale for
> > using ERT for this?  I've scanned the discussion, but couldn't find
> > such a message.
> 
> The best I can find is this:
> 
> https://lists.gnu.org/archive/html/emacs-devel/2024-12/msg00595.html

Thanks, but AFAICT this just says that you intended to use/extend ERT
to run this benchmark suite, but doesn't explain why you think using
ERT would be an advantage worthy of keeping.

> I can try to provide a more detailed/structured rationale if that's
> helpful.  (Is it, though?  Reusing someone's code in a way which reduces
> their user base and might cause them more work isn't something we should
> do lightly.)

I'm not sure I follow.  Andrea suggests to move elisp-benchmarks into
the repository, and add a target to the test/ Makefile to run it.
AFAIU he suggested that because it should be less work, not more.

Why do you think it is wrong to do the (AFAIU) simple change that
Andrea proposed?

The reason I'm asking is because I think we want this suite to be part
of our test, but don't necessarily want the addition of the benchmarks
to the test suite be a large job that complicates the benchmarks and
the test suite alike.



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-30 15:21                               ` Eli Zaretskii
@ 2024-12-30 15:49                                 ` Pip Cet via Emacs development discussions.
  2024-12-30 15:53                                   ` João Távora
  2024-12-30 16:40                                   ` Eli Zaretskii
  0 siblings, 2 replies; 64+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-30 15:49 UTC (permalink / raw)
  To: Eli Zaretskii
  Cc: acorallo, stefankangas, mattiase, eggert, emacs-devel, joaotavora

"Eli Zaretskii" <eliz@gnu.org> writes:

>> Date: Mon, 30 Dec 2024 15:00:29 +0000
>> From: Pip Cet <pipcet@protonmail.com>
>> Cc: acorallo@gnu.org, stefankangas@gmail.com, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org, joaotavora@gmail.com
>>
>> "Eli Zaretskii" <eliz@gnu.org> writes:
>>
>> >> Date: Mon, 30 Dec 2024 11:45:36 +0000
>> >> From: Pip Cet <pipcet@protonmail.com>
>> >> Cc: Stefan Kangas <stefankangas@gmail.com>, Eli Zaretskii
>> >> <eliz@gnu.org>, Mattias Engdegård <mattiase@acm.org>, Paul Eggert
>> >> <eggert@cs.ucla.edu>, emacs-devel@gnu.org, João Távora
>> >> <joaotavora@gmail.com>
>> >>
>> >> "Andrea Corallo" <acorallo@gnu.org> writes:
>> >>
>> >> > I'd personally drop the requirement of using ERT as a framework for
>> >> > benchmarks, I'd just move elisp-benchmarks code in emacs core and add
>> >> > the target.
>> >>
>> >> Well, as is obvious from the quoted paragraph, I disagree.  I've stated
>> >> why in the thread; if someone wants a summary, I can provide one
>> >
>> > Can you point to the message where you explained your rationale for
>> > using ERT for this?  I've scanned the discussion, but couldn't find
>> > such a message.
>>
>> The best I can find is this:
>>
>> https://lists.gnu.org/archive/html/emacs-devel/2024-12/msg00595.html
>
> Thanks, but AFAICT this just says that you intended to use/extend ERT
> to run this benchmark suite, but doesn't explain why you think using
> ERT would be an advantage worthy of keeping.

I think some advantages are stated in that email: the ERT tagging
mechanism is more general, works, and can be extended (I describe one
such extension).  All that isn't currently true for elisp-benchmarks.

The other big difference is resource management, which elisp-benchmarks
does via a global variable, reusing one test as data for another.  ERT
has a somewhat better mechanism.

>> I can try to provide a more detailed/structured rationale if that's
>> helpful.  (Is it, though?  Reusing someone's code in a way which reduces
>> their user base and might cause them more work isn't something we should
>> do lightly.)
>
> I'm not sure I follow.  Andrea suggests to move elisp-benchmarks into
> the repository, and add a target to the test/ Makefile to run it.
> AFAIU he suggested that because it should be less work, not more.

I was saying that my proposal would cause Andrea more work, because we'd
reuse his code in a way which reduces (splits) the elisp-benchmarks user
base.  It would also cause me work, which I've done, but that shouldn't
really count against it :-)

> Why do you think it is wrong to do the (AFAIU) simple change that
> Andrea proposed?

Because it's a de facto commitment to not doing it in ERT.  Having two
parallel benchmark suites isn't something I think would happen (and
which one would we use for our make target?)

> The reason I'm asking is because I think we want this suite to be part
> of our test, but don't necessarily want the addition of the benchmarks
> to the test suite be a large job that complicates the benchmarks and
> the test suite alike.

It's not a large job if we make it a clean split.  However, it would be
work for those preferring elisp-benchmarks conventions.

It might be relevant that elisp-benchmarks hasn't seen very active
development lately.  I think switching to ERT might help there, if only
because of the mailing list traffic.

Pip




^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-30 15:49                                 ` Pip Cet via Emacs development discussions.
@ 2024-12-30 15:53                                   ` João Távora
  2024-12-30 16:40                                   ` Eli Zaretskii
  1 sibling, 0 replies; 64+ messages in thread
From: João Távora @ 2024-12-30 15:53 UTC (permalink / raw)
  To: Pip Cet
  Cc: Eli Zaretskii, Andrea Corallo, Stefan Kangas,
	Mattias Engdegård, Paul Eggert, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 497 bytes --]

On Mon, Dec 30, 2024, 15:49 Pip Cet <pipcet@protonmail.com> wrote:

> Why do you think it is wrong to do the (AFAIU) simple change that
> > Andrea proposed?
>
> Because it's a de facto commitment to not doing it in ERT.  Having two
> parallel benchmark suites isn't something I think would happen
>

I don't subscribe to this mailing list anymore, but since I'm being CC'd in
these emails (why?) I might as well +1 this particular point of Pip's. I
wish you a happy new year.

João

[-- Attachment #2: Type: text/html, Size: 983 bytes --]

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-30 15:49                                 ` Pip Cet via Emacs development discussions.
  2024-12-30 15:53                                   ` João Távora
@ 2024-12-30 16:40                                   ` Eli Zaretskii
  2024-12-30 17:25                                     ` Pip Cet via Emacs development discussions.
  1 sibling, 1 reply; 64+ messages in thread
From: Eli Zaretskii @ 2024-12-30 16:40 UTC (permalink / raw)
  To: Pip Cet; +Cc: acorallo, stefankangas, mattiase, eggert, emacs-devel

> Date: Mon, 30 Dec 2024 15:49:30 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: acorallo@gnu.org, stefankangas@gmail.com, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org, joaotavora@gmail.com
> 
> >> https://lists.gnu.org/archive/html/emacs-devel/2024-12/msg00595.html
> >
> > Thanks, but AFAICT this just says that you intended to use/extend ERT
> > to run this benchmark suite, but doesn't explain why you think using
> > ERT would be an advantage worthy of keeping.
> 
> I think some advantages are stated in that email: the ERT tagging
> mechanism is more general, works, and can be extended (I describe one
> such extension).  All that isn't currently true for elisp-benchmarks.

Unlike the rest of the test suite, where we need a way to be able to
run individual tests, a benchmark suite is much more likely to run as
a whole, because benchmarking a single kind of jobs in Emacs is much
less useful than producing a benchmark of a representative sample of
jobs.  So I'm not sure this particular aspect is such a serious
problem, certainly if it makes the job of adding the suite much
harder.

> The other big difference is resource management, which elisp-benchmarks
> does via a global variable, reusing one test as data for another.  ERT
> has a somewhat better mechanism.

This is just an argument for cleaner, more elegant code, right?  If
so, I think we can live with the issue.  Having a ready-t-run
benchmark suite is so much more important that I'm prepared to make
compromises.

> > Why do you think it is wrong to do the (AFAIU) simple change that
> > Andrea proposed?
> 
> Because it's a de facto commitment to not doing it in ERT.

Probably, but that in itself is not a catastrophe, surely?

> It might be relevant that elisp-benchmarks hasn't seen very active
> development lately.  I think switching to ERT might help there, if only
> because of the mailing list traffic.

My vote is to get the job done the fastest way possible.  Tests are
not Emacs itself, they are means towards a certain end, and so can use
less clean code and designs, at least IMO.



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-30 16:40                                   ` Eli Zaretskii
@ 2024-12-30 17:25                                     ` Pip Cet via Emacs development discussions.
  2024-12-30 18:16                                       ` Eli Zaretskii
  2024-12-30 18:26                                       ` Andrea Corallo
  0 siblings, 2 replies; 64+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-30 17:25 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: acorallo, stefankangas, mattiase, eggert, emacs-devel

"Eli Zaretskii" <eliz@gnu.org> writes:

Top-posted TL;DR: let's call Andrea's code "make elisp-benchmarks" and
include it now?  That would preserve the Git history and importantly (to
me) reserve the name for now.

>> Date: Mon, 30 Dec 2024 15:49:30 +0000
>> From: Pip Cet <pipcet@protonmail.com>
>> Cc: acorallo@gnu.org, stefankangas@gmail.com, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org, joaotavora@gmail.com
>>
>> >> https://lists.gnu.org/archive/html/emacs-devel/2024-12/msg00595.html
>> >
>> > Thanks, but AFAICT this just says that you intended to use/extend ERT
>> > to run this benchmark suite, but doesn't explain why you think using
>> > ERT would be an advantage worthy of keeping.
>>
>> I think some advantages are stated in that email: the ERT tagging
>> mechanism is more general, works, and can be extended (I describe one
>> such extension).  All that isn't currently true for elisp-benchmarks.
>
> Unlike the rest of the test suite, where we need a way to be able to
> run individual tests, a benchmark suite is much more likely to run as
> a whole, because benchmarking a single kind of jobs in Emacs is much
> less useful than producing a benchmark of a representative sample of
> jobs.  So I'm not sure this particular aspect is such a serious

Not my experience.  Running the entire suite is much more likely not to
produce usable data due to such issues as CPU thermal management (for
example: the first few tests are run at full clock speed and heat up the
system so much that thermal throttling is activated; the next few tests
are run at a reduced rate while the fan is running; eventually we run
out of amperes that we're allowed to drain the battery by and reduce
clock speed even further; this results in reduced temperature, so the
fan speed is reduced, which means we will eventually decide to try a
higher clock speed again, which will work for a while only before
repeating the cycle.  The whole thing will appear regular enough we
won't notice the data is bad, but it will be, until we rerun the test on
the same system in a different room and get wildly different results).
A single-second test run in a loop produces the occasional mid-stream
result which is actually useful (and promptly lost to the averaging
mechanism of elisp-benchmarks).

Benchmarking is hard, and I wouldn't have provided this very verbose
example if I hadn't seen "paradoxical" results that can only be
explained by such mechanisms.  We need to move away from average run
times either way, and that requires code changes.

And I don't usually run ERT tests individually, while I'm trying to get
in the habit of running the (non-expensive) test suite before I push.

> problem, certainly if it makes the job of adding the suite much
> harder.

I don't think time-to-Emacs is very different for the two approaches.
The difference is the post-merge work.

>> The other big difference is resource management, which elisp-benchmarks
>> does via a global variable, reusing one test as data for another.  ERT
>> has a somewhat better mechanism.
>
> This is just an argument for cleaner, more elegant code, right?  If
> so, I think we can live with the issue.  Having a ready-t-run
> benchmark suite is so much more important that I'm prepared to make
> compromises.
>
>> > Why do you think it is wrong to do the (AFAIU) simple change that
>> > Andrea proposed?
>>
>> Because it's a de facto commitment to not doing it in ERT.
>
> Probably, but that in itself is not a catastrophe, surely?

Hmm.  I disagree, but I think the real catastrophe would be wasting the
make target name, not the inclusion of another directory.

My suggestion is a compromise: add "make elisp-benchmarks" now, using
Andrea's code, then consider more complicated ERT-based approaches
without being in any hurry to do so.  But, also, let's agree that the
ERT-based approaches are "allowed" to reuse the elisp-benchmarks code
without providing comparable results or a porting mechanism, and keep
the "make benchmark" name reserved for a while.

My prediction is that it will turn out "make elisp-benchmarks" doesn't
usually provide very useful results, and expansion of the test framework
to produce useful results is easier reusing the ERT framework.

>> It might be relevant that elisp-benchmarks hasn't seen very active
>> development lately.  I think switching to ERT might help there, if only
>> because of the mailing list traffic.
>
> My vote is to get the job done the fastest way possible.  Tests are

Well, the ERT patch is ready "right now" (meaning it needs rebasing).
The elisp-benchmarks code would require, at least, whitespace fixes ;-)

> not Emacs itself, they are means towards a certain end, and so can use
> less clean code and designs, at least IMO.

To be perfectly honest, I was worried about the commit history because
in the Good Old CVS Days, file renames were more of a problem than they
are with git.  With git, I would prefer one "this is elisp-benchmarks"
commit even if the subsequent history modifies and moves those files.
So no reason not to do that now?  Thanks for your patience.

I can prepare a git branch doing that.  As for the git history rewriting
magic code, I've heard from Joao and the other developer involved; he
doesn't have a copyright assignment, but also thinks this is small
enough to be paperwork-exempt, so we can (probably) include a script in
admin/ to migrate elpa packages to core.

My preference would be a top-level directory called "elisp-benchmarks",
but ultimately that's a minor question, so just let me know the
preferred destination.

Pip




^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-30 17:25                                     ` Pip Cet via Emacs development discussions.
@ 2024-12-30 18:16                                       ` Eli Zaretskii
  2024-12-31  4:00                                         ` Pip Cet via Emacs development discussions.
  2024-12-30 18:26                                       ` Andrea Corallo
  1 sibling, 1 reply; 64+ messages in thread
From: Eli Zaretskii @ 2024-12-30 18:16 UTC (permalink / raw)
  To: Pip Cet; +Cc: acorallo, stefankangas, mattiase, eggert, emacs-devel

> Date: Mon, 30 Dec 2024 17:25:44 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: acorallo@gnu.org, stefankangas@gmail.com, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org
> 
> "Eli Zaretskii" <eliz@gnu.org> writes:
> 
> Top-posted TL;DR: let's call Andrea's code "make elisp-benchmarks" and
> include it now?  That would preserve the Git history and importantly (to
> me) reserve the name for now.

Fine by me.

> And I don't usually run ERT tests individually, while I'm trying to get
> in the habit of running the (non-expensive) test suite before I push.

I do it all the time, when I install some change and want to make sure
the related tests still pass.

> My suggestion is a compromise: add "make elisp-benchmarks" now, using
> Andrea's code, then consider more complicated ERT-based approaches
> without being in any hurry to do so.  But, also, let's agree that the
> ERT-based approaches are "allowed" to reuse the elisp-benchmarks code
> without providing comparable results or a porting mechanism, and keep
> the "make benchmark" name reserved for a while.

Of course ERT-based approaches are allowed.  I only chimed into this
discussion because that approach bumped into some difficulties.

> My prediction is that it will turn out "make elisp-benchmarks" doesn't
> usually provide very useful results, and expansion of the test framework
> to produce useful results is easier reusing the ERT framework.

We'll see, and if you are right, we will work on improving the
benchmarks.

> My preference would be a top-level directory called "elisp-benchmarks",
> but ultimately that's a minor question, so just let me know the
> preferred destination.

I thought we wanted it under test/ ?

But I'm also okay with having a directory that is sibling to test/ if
there are no objections from Andrea and others.



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-30 17:25                                     ` Pip Cet via Emacs development discussions.
  2024-12-30 18:16                                       ` Eli Zaretskii
@ 2024-12-30 18:26                                       ` Andrea Corallo
  2024-12-30 18:58                                         ` Stefan Kangas
  2024-12-30 21:34                                         ` Pip Cet via Emacs development discussions.
  1 sibling, 2 replies; 64+ messages in thread
From: Andrea Corallo @ 2024-12-30 18:26 UTC (permalink / raw)
  To: Pip Cet; +Cc: Eli Zaretskii, stefankangas, mattiase, eggert, emacs-devel

Pip Cet <pipcet@protonmail.com> writes:

> "Eli Zaretskii" <eliz@gnu.org> writes:
>
> Top-posted TL;DR: let's call Andrea's code "make elisp-benchmarks" and
> include it now?  That would preserve the Git history and importantly (to
> me) reserve the name for now.
>
>>> Date: Mon, 30 Dec 2024 15:49:30 +0000
>>> From: Pip Cet <pipcet@protonmail.com>
>>> Cc: acorallo@gnu.org, stefankangas@gmail.com, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org, joaotavora@gmail.com
>>>
>>> >> https://lists.gnu.org/archive/html/emacs-devel/2024-12/msg00595.html
>>> >
>>> > Thanks, but AFAICT this just says that you intended to use/extend ERT
>>> > to run this benchmark suite, but doesn't explain why you think using
>>> > ERT would be an advantage worthy of keeping.
>>>
>>> I think some advantages are stated in that email: the ERT tagging
>>> mechanism is more general, works, and can be extended (I describe one
>>> such extension).  All that isn't currently true for elisp-benchmarks.
>>
>> Unlike the rest of the test suite, where we need a way to be able to
>> run individual tests, a benchmark suite is much more likely to run as
>> a whole, because benchmarking a single kind of jobs in Emacs is much
>> less useful than producing a benchmark of a representative sample of
>> jobs.  So I'm not sure this particular aspect is such a serious
>
> Not my experience.  Running the entire suite is much more likely not to
> produce usable data due to such issues as CPU thermal management (for
> example: the first few tests are run at full clock speed and heat up the
> system so much that thermal throttling is activated; the next few tests
> are run at a reduced rate while the fan is running; eventually we run
> out of amperes that we're allowed to drain the battery by and reduce
> clock speed even further; this results in reduced temperature, so the
> fan speed is reduced, which means we will eventually decide to try a
> higher clock speed again, which will work for a while only before
> repeating the cycle.  The whole thing will appear regular enough we
> won't notice the data is bad, but it will be, until we rerun the test on
> the same system in a different room and get wildly different results).
> A single-second test run in a loop produces the occasional mid-stream
> result which is actually useful (and promptly lost to the averaging
> mechanism of elisp-benchmarks).

Yes, elisp-benchmark is running all the selected benchmarks at each
iteration, so that a single one cannot take advantaged of the initial
cool CPU state.  If unstable throttling on a specific system is a
problem this will show up as computed error for that test.  If a system
is throttling the right (and only) thing to do is to measure it, this is
in my experience what benchmarks do.

That said tipically Eli is right, the typical use of a benchmark suite
is to run it as a whole and look at the total results, this indeed
accounts for avg throttling as well.

> Benchmarking is hard, and I wouldn't have provided this very verbose
> example if I hadn't seen "paradoxical" results that can only be
> explained by such mechanisms.  We need to move away from average run
> times either way, and that requires code changes.

I'm not sure I understand what you mean, if we prefer something like
geo-mean in elisp-beanhcmarks we can change for that, should be easy.
I'm open to patches to elisp-benchmarks (and to its hypothetical copy in
emacs-core).  My opinion that something can potentially be improved in
it (why not), but I personally ATM don't understand the need for ERT.



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-30 18:26                                       ` Andrea Corallo
@ 2024-12-30 18:58                                         ` Stefan Kangas
  2024-12-30 21:34                                         ` Pip Cet via Emacs development discussions.
  1 sibling, 0 replies; 64+ messages in thread
From: Stefan Kangas @ 2024-12-30 18:58 UTC (permalink / raw)
  To: Andrea Corallo, Pip Cet; +Cc: Eli Zaretskii, mattiase, eggert, emacs-devel

Andrea Corallo <acorallo@gnu.org> writes:

> Yes, elisp-benchmark is running all the selected benchmarks at each
> iteration, so that a single one cannot take advantaged of the initial
> cool CPU state.  If unstable throttling on a specific system is a
> problem this will show up as computed error for that test.  If a system
> is throttling the right (and only) thing to do is to measure it, this is
> in my experience what benchmarks do.
>
> That said tipically Eli is right, the typical use of a benchmark suite
> is to run it as a whole and look at the total results, this indeed
> accounts for avg throttling as well.

May I propose writing down some general usage notes along the lines of
the above somewhere?



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-30 18:26                                       ` Andrea Corallo
  2024-12-30 18:58                                         ` Stefan Kangas
@ 2024-12-30 21:34                                         ` Pip Cet via Emacs development discussions.
  2024-12-31  9:55                                           ` Andrea Corallo
  2024-12-31 12:43                                           ` Eli Zaretskii
  1 sibling, 2 replies; 64+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-30 21:34 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: Eli Zaretskii, stefankangas, mattiase, eggert, emacs-devel

"Andrea Corallo" <acorallo@gnu.org> writes:
>> Benchmarking is hard, and I wouldn't have provided this very verbose
>> example if I hadn't seen "paradoxical" results that can only be
>> explained by such mechanisms.  We need to move away from average run
>> times either way, and that requires code changes.
>
> I'm not sure I understand what you mean, if we prefer something like
> geo-mean in elisp-beanhcmarks we can change for that, should be easy.

In such situations (machines that don't allow reasonable benchmarks;
this has become the standard situation for me) I've usually found it
necessary to store a bucket histogram (or full history) across many
benchmark runs; this clearly allows you to see the different throttling
levels as separate peaks.  If we must use a single number, we want the
fastest actual run; so, in practice, discard a few percentiles to
account for possible rare errors.

> I'm open to patches to elisp-benchmarks (and to its hypothetical copy in
> emacs-core).  My opinion that something can potentially be improved in

What's the best way to report the need for such improvements?  I'm
currently aware of four "bugs" we should definitely fix; one of them,
ideally, before merging.

> it (why not), but I personally ATM don't understand the need for ERT.

Let's focus on the basics right now: people know how to write ERT tests.
We have hundreds of them.  Some of them could be benchmarks, and we want
to make that as easy as possible.

ERT provides a way to do that, in the same file if we want to: just add
a tag.

It provides a way to locate and properly identify resources (five
"bugs": reusing test A as input for test B means we don't have
separation of tests in elisp-benchmarks, and that's something we should
strive for).

It also allows a third class of tests: stress tests which we want to
execute more often than once per test run, which identify occasional
failures in code that needs to be executed very often to establish
stability (think bug#75105: (cl-random 1.0e+INF) produces an incorrect
result once every 8 million runs).  IIRC, right now ERT uses ad-hoc
loops for such tests, but it'd be nicer to expose the repetition count
in the framework (I'm not going to run the non-expensive testsuite on
FreeDOS if that means waiting for a million iterations on an emulated
machine).

(I also think we should introduce an ert-how structure that describes how
a test is to be run: do we want to inhibit GC or allow it?  Run some
warm-up test runs or not? What's the expected time, and when should we
time out? We can't run the complete matrix for all tests, so we need
some hints in the test, and the lack of a test declaration in
elisp-benchmarks hurts us there).

Pip




^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-30 18:16                                       ` Eli Zaretskii
@ 2024-12-31  4:00                                         ` Pip Cet via Emacs development discussions.
  2024-12-31  5:26                                           ` Stefan Kangas
  2024-12-31 12:53                                           ` Eli Zaretskii
  0 siblings, 2 replies; 64+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-31  4:00 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: acorallo, stefankangas, mattiase, eggert, emacs-devel

"Eli Zaretskii" <eliz@gnu.org> writes:

>> Date: Mon, 30 Dec 2024 17:25:44 +0000
>> From: Pip Cet <pipcet@protonmail.com>
>> Cc: acorallo@gnu.org, stefankangas@gmail.com, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org
>>
>> "Eli Zaretskii" <eliz@gnu.org> writes:
>>
>> Top-posted TL;DR: let's call Andrea's code "make elisp-benchmarks" and
>> include it now?  That would preserve the Git history and importantly (to
>> me) reserve the name for now.
>
> Fine by me.

Pushed to scratch/elisp-benchmarks.  Sorry for all the noise; we should
really find a way to reduce emacs-diffs notifications when many commits
are made at once.

This branch:

1. contains whitespace errors, because it contains the files precisely
as they are in the elisp-benchmarks repo

2. contains the admin/elpa2emacs script, based on the assessment by the
one author who doesn't have a copyright assignment that less than 15
lines of his original code remain.  I made it the last commit in the
series so it's easy to drop that specific commit and merge the rest.

3. creates a top-level elisp-benchmarks directory

>> And I don't usually run ERT tests individually, while I'm trying to get
>> in the habit of running the (non-expensive) test suite before I push.
>
> I do it all the time, when I install some change and want to make sure
> the related tests still pass.

Interesting: I find it very hard to localize my changes to specific
tests, usually.

>> My preference would be a top-level directory called "elisp-benchmarks",
>> but ultimately that's a minor question, so just let me know the
>> preferred destination.
>
> I thought we wanted it under test/ ?
>
> But I'm also okay with having a directory that is sibling to test/ if
> there are no objections from Andrea and others.

Obviously, it's not too late for such objections.  That's why I wanted
to include the script: redoing the merge that way is much easier than
applying git filter-repo to a "live" Emacs repo (which is destroyed in
the process).

Pip




^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-31  4:00                                         ` Pip Cet via Emacs development discussions.
@ 2024-12-31  5:26                                           ` Stefan Kangas
  2024-12-31 13:05                                             ` Eli Zaretskii
  2024-12-31 12:53                                           ` Eli Zaretskii
  1 sibling, 1 reply; 64+ messages in thread
From: Stefan Kangas @ 2024-12-31  5:26 UTC (permalink / raw)
  To: Pip Cet, Eli Zaretskii; +Cc: acorallo, mattiase, eggert, emacs-devel

Pip Cet <pipcet@protonmail.com> writes:

> Pushed to scratch/elisp-benchmarks.  Sorry for all the noise; we should
> really find a way to reduce emacs-diffs notifications when many commits
> are made at once.

Thanks.

AFAIU, the purpose of the emacs-diffs list is to detail the full history
of all commits.  We have enough disk and bandwidth these days, and
threads are easy to skip, so I wouldn't worry about it.

>>> My preference would be a top-level directory called "elisp-benchmarks",
>>> but ultimately that's a minor question, so just let me know the
>>> preferred destination.
>>
>> I thought we wanted it under test/ ?
>>
>> But I'm also okay with having a directory that is sibling to test/ if
>> there are no objections from Andrea and others.
>
> Obviously, it's not too late for such objections.  That's why I wanted
> to include the script: redoing the merge that way is much easier than
> applying git filter-repo to a "live" Emacs repo (which is destroyed in
> the process).

My two cents:

I'd put this in a directory named "benchmarks" instead.  The "elisp-"
part is relevant only because that was the name of the old GNU ELPA
package, but seems redundant as part of a directory name in emacs.git.
We also don't have "elisp-src" and "elisp-test", for example.



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-30 21:34                                         ` Pip Cet via Emacs development discussions.
@ 2024-12-31  9:55                                           ` Andrea Corallo
  2024-12-31 12:43                                           ` Eli Zaretskii
  1 sibling, 0 replies; 64+ messages in thread
From: Andrea Corallo @ 2024-12-31  9:55 UTC (permalink / raw)
  To: Pip Cet; +Cc: Eli Zaretskii, stefankangas, mattiase, eggert, emacs-devel

Pip Cet <pipcet@protonmail.com> writes:

> "Andrea Corallo" <acorallo@gnu.org> writes:
>>> Benchmarking is hard, and I wouldn't have provided this very verbose
>>> example if I hadn't seen "paradoxical" results that can only be
>>> explained by such mechanisms.  We need to move away from average run
>>> times either way, and that requires code changes.
>>
>> I'm not sure I understand what you mean, if we prefer something like
>> geo-mean in elisp-beanhcmarks we can change for that, should be easy.
>
> In such situations (machines that don't allow reasonable benchmarks;
> this has become the standard situation for me) I've usually found it
> necessary to store a bucket histogram (or full history) across many
> benchmark runs; this clearly allows you to see the different throttling
> levels as separate peaks.  If we must use a single number, we want the
> fastest actual run

This is not how, in my professional experience at least, benchmarks are
made/used.  If the CPU is throttoling during the execution of a test
this has to be measured and reported in the final score as it reflects
how the system behaves.  Considering only "best scores" is artificial, I
see no reason for further complications in this area.

>> I'm open to patches to elisp-benchmarks (and to its hypothetical copy in
>> emacs-core).  My opinion that something can potentially be improved in
>
> What's the best way to report the need for such improvements?  I'm
> currently aware of four "bugs" we should definitely fix; one of them,
> ideally, before merging.

It's an ELPA package so AFAIK the process is the same than for
emacs-core.

>> it (why not), but I personally ATM don't understand the need for ERT.
>
> Let's focus on the basics right now: people know how to write ERT tests.
> We have hundreds of them.  Some of them could be benchmarks, and we want
> to make that as easy as possible.

Which ones?

> ERT provides a way to do that, in the same file if we want to: just add
> a tag.
>
> It provides a way to locate and properly identify resources (five
> "bugs": reusing test A as input for test B means we don't have
> separation of tests in elisp-benchmarks, and that's something we should
> strive for).

That (if it's the case) sounds like a very simple fix.

> It also allows a third class of tests: stress tests which we want to
> execute more often than once per test run, which identify occasional
> failures in code that needs to be executed very often to establish
> stability (think bug#75105: (cl-random 1.0e+INF) produces an incorrect
> result once every 8 million runs).  IIRC, right now ERT uses ad-hoc
> loops for such tests, but it'd be nicer to expose the repetition count
> in the framework (I'm not going to run the non-expensive testsuite on
> FreeDOS if that means waiting for a million iterations on an emulated
> machine).
>
> (I also think we should introduce an ert-how structure that describes how
> a test is to be run: do we want to inhibit GC or allow it?

We definitely don't want to inhibit GC while running benchmarks.  Why
should we?

> Run some
> warm-up test runs or not?

Of course we should, measuring a fresh state is not realistic,
elisp-benchmarks is running an iterations of all tests as warm-up, I
think this is good enough.

> What's the expected time, and when should we
> time out?

Bechmark tests are not testsuite tests, they are not supposed to hang
nor have long execution time, but anyway we can easily introduce a
time-out which all benchmarks has to stay in if we want to be on the
safe side.

> We can't run the complete matrix for all tests, so we need
> some hints in the test, and the lack of a test declaration in
> elisp-benchmarks hurts us there).

As Eli mentioned, I don't think the goal is to be able to select/run
complex matrices of tests here, I believe the typical use cases are two:

1- A user is running all the suite to get the final score (typical use).

2- A developer is running a single benchmark (probably to profile or
micro optimize it).



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-30 21:34                                         ` Pip Cet via Emacs development discussions.
  2024-12-31  9:55                                           ` Andrea Corallo
@ 2024-12-31 12:43                                           ` Eli Zaretskii
  2024-12-31 14:01                                             ` Pip Cet via Emacs development discussions.
  2025-01-04 16:34                                             ` Pip Cet via Emacs development discussions.
  1 sibling, 2 replies; 64+ messages in thread
From: Eli Zaretskii @ 2024-12-31 12:43 UTC (permalink / raw)
  To: Pip Cet; +Cc: acorallo, stefankangas, mattiase, eggert, emacs-devel

> Date: Mon, 30 Dec 2024 21:34:55 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: Eli Zaretskii <eliz@gnu.org>, stefankangas@gmail.com, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org
> 
> > I'm open to patches to elisp-benchmarks (and to its hypothetical copy in
> > emacs-core).  My opinion that something can potentially be improved in
> 
> What's the best way to report the need for such improvements?

Since you've pushed that to a branch, I suggest to submit bug reports
about these issues, using "[scratch/elisp-benchmarks]" in the Subject
of the bug.

> > it (why not), but I personally ATM don't understand the need for ERT.
> 
> Let's focus on the basics right now: people know how to write ERT tests.
> We have hundreds of them.  Some of them could be benchmarks, and we want
> to make that as easy as possible.

We can later add more benchmarks using ERT.  There's no contradiction.

> It also allows a third class of tests: stress tests which we want to
> execute more often than once per test run, which identify occasional
> failures in code that needs to be executed very often to establish
> stability (think bug#75105: (cl-random 1.0e+INF) produces an incorrect
> result once every 8 million runs).  IIRC, right now ERT uses ad-hoc
> loops for such tests, but it'd be nicer to expose the repetition count
> in the framework (I'm not going to run the non-expensive testsuite on
> FreeDOS if that means waiting for a million iterations on an emulated
> machine).
> 
> (I also think we should introduce an ert-how structure that describes how
> a test is to be run: do we want to inhibit GC or allow it?  Run some
> warm-up test runs or not? What's the expected time, and when should we
> time out? We can't run the complete matrix for all tests, so we need
> some hints in the test, and the lack of a test declaration in
> elisp-benchmarks hurts us there).

These seem to be long-term goals of improving the benchmark suite.
They are fine by me, but I don't see why they should preclude
installing the benchmarks we have without first converting them to
ERT.  We can do that later, if we decide it's worth the effort.



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-31  4:00                                         ` Pip Cet via Emacs development discussions.
  2024-12-31  5:26                                           ` Stefan Kangas
@ 2024-12-31 12:53                                           ` Eli Zaretskii
  2024-12-31 14:34                                             ` Andrea Corallo
  1 sibling, 1 reply; 64+ messages in thread
From: Eli Zaretskii @ 2024-12-31 12:53 UTC (permalink / raw)
  To: Pip Cet; +Cc: acorallo, stefankangas, mattiase, eggert, emacs-devel

> Date: Tue, 31 Dec 2024 04:00:05 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: acorallo@gnu.org, stefankangas@gmail.com, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org
> 
> "Eli Zaretskii" <eliz@gnu.org> writes:
> 
> >> Date: Mon, 30 Dec 2024 17:25:44 +0000
> >> From: Pip Cet <pipcet@protonmail.com>
> >> Cc: acorallo@gnu.org, stefankangas@gmail.com, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org
> >>
> >> "Eli Zaretskii" <eliz@gnu.org> writes:
> >>
> >> Top-posted TL;DR: let's call Andrea's code "make elisp-benchmarks" and
> >> include it now?  That would preserve the Git history and importantly (to
> >> me) reserve the name for now.
> >
> > Fine by me.
> 
> Pushed to scratch/elisp-benchmarks.

Thanks.

> Sorry for all the noise; we should really find a way to reduce
> emacs-diffs notifications when many commits are made at once.

I see no reason: people who subscribe to that list should be prepared
for floods from time to time.

> This branch:
> 
> 1. contains whitespace errors, because it contains the files precisely
> as they are in the elisp-benchmarks repo

We should probably clean that up, unless Andrea says that those
whitespace are somehow needed.

> >> And I don't usually run ERT tests individually, while I'm trying to get
> >> in the habit of running the (non-expensive) test suite before I push.
> >
> > I do it all the time, when I install some change and want to make sure
> > the related tests still pass.
> 
> Interesting: I find it very hard to localize my changes to specific
> tests, usually.

It is not a 100% reliable technique, but in 99% of the cases, running
the tests corresponding to the Lisp file(s) modified by a changeset,
followed by additional tests where Grep finds the functions in which
the changes were made, provide good coverage (if we have tests for
those features at all).



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-31  5:26                                           ` Stefan Kangas
@ 2024-12-31 13:05                                             ` Eli Zaretskii
  2024-12-31 14:14                                               ` Pip Cet via Emacs development discussions.
  0 siblings, 1 reply; 64+ messages in thread
From: Eli Zaretskii @ 2024-12-31 13:05 UTC (permalink / raw)
  To: Stefan Kangas; +Cc: pipcet, acorallo, mattiase, eggert, emacs-devel

> From: Stefan Kangas <stefankangas@gmail.com>
> Date: Mon, 30 Dec 2024 23:26:05 -0600
> Cc: acorallo@gnu.org, mattiase@acm.org, eggert@cs.ucla.edu, 
> 	emacs-devel@gnu.org
> 
> I'd put this in a directory named "benchmarks" instead.

I think this name is indeed better.



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-31 12:43                                           ` Eli Zaretskii
@ 2024-12-31 14:01                                             ` Pip Cet via Emacs development discussions.
  2025-01-04 16:34                                             ` Pip Cet via Emacs development discussions.
  1 sibling, 0 replies; 64+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-31 14:01 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: acorallo, stefankangas, mattiase, eggert, emacs-devel

"Eli Zaretskii" <eliz@gnu.org> writes:

>> Date: Mon, 30 Dec 2024 21:34:55 +0000
>> From: Pip Cet <pipcet@protonmail.com>
>> Cc: Eli Zaretskii <eliz@gnu.org>, stefankangas@gmail.com, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org
>>
>> > I'm open to patches to elisp-benchmarks (and to its hypothetical copy in
>> > emacs-core).  My opinion that something can potentially be improved in
>>
>> What's the best way to report the need for such improvements?
>
> Since you've pushed that to a branch, I suggest to submit bug reports
> about these issues, using "[scratch/elisp-benchmarks]" in the Subject
> of the bug.

Okay.

>> > it (why not), but I personally ATM don't understand the need for ERT.
>>
>> Let's focus on the basics right now: people know how to write ERT tests.
>> We have hundreds of them.  Some of them could be benchmarks, and we want
>> to make that as easy as possible.
>
> We can later add more benchmarks using ERT.  There's no contradiction.

I agree.  There's definitely no "right now" need for ERT, I was
explaining why it's the change I'll be investigating.

>> It also allows a third class of tests: stress tests which we want to
>> execute more often than once per test run, which identify occasional
>> failures in code that needs to be executed very often to establish
>> stability (think bug#75105: (cl-random 1.0e+INF) produces an incorrect
>> result once every 8 million runs).  IIRC, right now ERT uses ad-hoc
>> loops for such tests, but it'd be nicer to expose the repetition count
>> in the framework (I'm not going to run the non-expensive testsuite on
>> FreeDOS if that means waiting for a million iterations on an emulated
>> machine).
>>
>> (I also think we should introduce an ert-how structure that describes how
>> a test is to be run: do we want to inhibit GC or allow it?  Run some
>> warm-up test runs or not? What's the expected time, and when should we
>> time out? We can't run the complete matrix for all tests, so we need
>> some hints in the test, and the lack of a test declaration in
>> elisp-benchmarks hurts us there).
>
> These seem to be long-term goals of improving the benchmark suite.
> They are fine by me, but I don't see why they should preclude
> installing the benchmarks we have without first converting them to
> ERT.  We can do that later, if we decide it's worth the effort.

We seem to agree here: my intention, too, is to merge the
elisp-benchmarks branch ASAP.  Let's establish which changes are
required on that branch, then do a synchronized rebase-merge to preserve
history?

Pip




^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-31 13:05                                             ` Eli Zaretskii
@ 2024-12-31 14:14                                               ` Pip Cet via Emacs development discussions.
  2024-12-31 14:22                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 64+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-31 14:14 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Stefan Kangas, acorallo, mattiase, eggert, emacs-devel

"Eli Zaretskii" <eliz@gnu.org> writes:

>> From: Stefan Kangas <stefankangas@gmail.com>
>> Date: Mon, 30 Dec 2024 23:26:05 -0600
>> Cc: acorallo@gnu.org, mattiase@acm.org, eggert@cs.ucla.edu,
>> 	emacs-devel@gnu.org
>>
>> I'd put this in a directory named "benchmarks" instead.
>
> I think this name is indeed better.

Is this about the directory or the make target?  Absolutely no
objections on the directory name, as it won't conflict with putting new
ERT-based benchmarks in test/.

As for the make target, we should decide whether "make benchmark" means
"run all benchmarks no matter which framework" or "run the
elisp-benchmarks specifically".  I don't think

benchmark: elisp-benchmarks check-benchmark

is a bad rule at all.

Pip




^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-31 14:14                                               ` Pip Cet via Emacs development discussions.
@ 2024-12-31 14:22                                                 ` Eli Zaretskii
  0 siblings, 0 replies; 64+ messages in thread
From: Eli Zaretskii @ 2024-12-31 14:22 UTC (permalink / raw)
  To: Pip Cet; +Cc: stefankangas, acorallo, mattiase, eggert, emacs-devel

> Date: Tue, 31 Dec 2024 14:14:42 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: Stefan Kangas <stefankangas@gmail.com>, acorallo@gnu.org, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org
> 
> "Eli Zaretskii" <eliz@gnu.org> writes:
> 
> >> From: Stefan Kangas <stefankangas@gmail.com>
> >> Date: Mon, 30 Dec 2024 23:26:05 -0600
> >> Cc: acorallo@gnu.org, mattiase@acm.org, eggert@cs.ucla.edu,
> >> 	emacs-devel@gnu.org
> >>
> >> I'd put this in a directory named "benchmarks" instead.
> >
> > I think this name is indeed better.
> 
> Is this about the directory or the make target?

The former.

> As for the make target, we should decide whether "make benchmark" means
> "run all benchmarks no matter which framework" or "run the
> elisp-benchmarks specifically".  I don't think

"make elisp-benchmarks" for the target is OK, IMO.



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-31 12:53                                           ` Eli Zaretskii
@ 2024-12-31 14:34                                             ` Andrea Corallo
  0 siblings, 0 replies; 64+ messages in thread
From: Andrea Corallo @ 2024-12-31 14:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Pip Cet, stefankangas, mattiase, eggert, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> Date: Tue, 31 Dec 2024 04:00:05 +0000
>> From: Pip Cet <pipcet@protonmail.com>
>> Cc: acorallo@gnu.org, stefankangas@gmail.com, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org
>> 
>> "Eli Zaretskii" <eliz@gnu.org> writes:
>> 
>> >> Date: Mon, 30 Dec 2024 17:25:44 +0000
>> >> From: Pip Cet <pipcet@protonmail.com>
>> >> Cc: acorallo@gnu.org, stefankangas@gmail.com, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org
>> >>
>> >> "Eli Zaretskii" <eliz@gnu.org> writes:
>> >>
>> >> Top-posted TL;DR: let's call Andrea's code "make elisp-benchmarks" and
>> >> include it now?  That would preserve the Git history and importantly (to
>> >> me) reserve the name for now.
>> >
>> > Fine by me.
>> 
>> Pushed to scratch/elisp-benchmarks.
>
> Thanks.
>
>> Sorry for all the noise; we should really find a way to reduce
>> emacs-diffs notifications when many commits are made at once.
>
> I see no reason: people who subscribe to that list should be prepared
> for floods from time to time.
>
>> This branch:
>> 
>> 1. contains whitespace errors, because it contains the files precisely
>> as they are in the elisp-benchmarks repo
>
> We should probably clean that up, unless Andrea says that those
> whitespace are somehow needed.

Yep, I guess the best for simplicity is to push changes to
elisp-benchmarks and later to merge them into emacs-core.  AFAIR it's
also the way other packages in a similar situation operates.



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2024-12-31 12:43                                           ` Eli Zaretskii
  2024-12-31 14:01                                             ` Pip Cet via Emacs development discussions.
@ 2025-01-04 16:34                                             ` Pip Cet via Emacs development discussions.
  2025-01-04 18:33                                               ` Eli Zaretskii
  2025-01-06 11:23                                               ` Andrea Corallo
  1 sibling, 2 replies; 64+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2025-01-04 16:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: acorallo, stefankangas, mattiase, eggert, emacs-devel

"Eli Zaretskii" <eliz@gnu.org> writes:

>> Date: Mon, 30 Dec 2024 21:34:55 +0000
>> From: Pip Cet <pipcet@protonmail.com>
>> Cc: Eli Zaretskii <eliz@gnu.org>, stefankangas@gmail.com, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org
>>
>> > I'm open to patches to elisp-benchmarks (and to its hypothetical copy in
>> > emacs-core).  My opinion that something can potentially be improved in
>>
>> What's the best way to report the need for such improvements?
>
> Since you've pushed that to a branch, I suggest to submit bug reports
> about these issues, using "[scratch/elisp-benchmarks]" in the Subject
> of the bug.

I've studied the issues a bit more.  This is a bit long, but in summary,
I think improving elisp-benchmarks.el (the specific file, not the entire
package, which I still intend to reuse) would take more time than
starting from ERT, so I'll look into the latter a bit more (maybe I'll
run into unforeseen difficulties and change my mind).

>> > it (why not), but I personally ATM don't understand the need for ERT.
>>
>> Let's focus on the basics right now: people know how to write ERT tests.
>> We have hundreds of them.  Some of them could be benchmarks, and we want
>> to make that as easy as possible.
>
> We can later add more benchmarks using ERT.  There's no contradiction.

Before describing the issues I found, let me agree with this.  If only
for the sake of having a better git history, we should merge the
elisp-benchmarks branch ASAP after changing the directory name as
discussed.  I'll force-push a fixed branch after filing the reports; I
still think doing a synchronized rebase-and-merge would be worth it
since it would result in a cleaner git history than a
merge-with-conflicts of a branch based on a previous commit on the
master branch.

>> It also allows a third class of tests: stress tests which we want to
>> execute more often than once per test run, which identify occasional
>> failures in code that needs to be executed very often to establish
>> stability (think bug#75105: (cl-random 1.0e+INF) produces an incorrect
>> result once every 8 million runs).  IIRC, right now ERT uses ad-hoc
>> loops for such tests, but it'd be nicer to expose the repetition count
>> in the framework (I'm not going to run the non-expensive testsuite on
>> FreeDOS if that means waiting for a million iterations on an emulated
>> machine).
>>
>> (I also think we should introduce an ert-how structure that describes how
>> a test is to be run: do we want to inhibit GC or allow it?  Run some
>> warm-up test runs or not? What's the expected time, and when should we
>> time out? We can't run the complete matrix for all tests, so we need
>> some hints in the test, and the lack of a test declaration in
>> elisp-benchmarks hurts us there).
>
> These seem to be long-term goals of improving the benchmark suite.
> They are fine by me, but I don't see why they should preclude
> installing the benchmarks we have without first converting them to
> ERT.  We can do that later, if we decide it's worth the effort.

I agree again.

Please read this message as an explanation for why I, personally, think
that it is worth the effort.  It's not meant as an attack, and it
doesn't contradict what you said above in any way.

I'm reporting a small number of elisp-benchmarks "bugs" (I think the
term is likely to be contentious; I use it because that's what the
mailing list is called).  All of them are fixable.  Most of them are
easily fixable by moving to ERT.

In my opinion, fixing them in elisp-bechmarks.el is not, as far as Emacs
development is concerned, necessary or helpful: we should spend our time
improving ERT rather than discussing which parts of it need to be
reimplemented (the answer, of course, is that all parts of ERT are
needed and none need to be reimplemented: let's just use it).

I'm not saying elisp-benchmarks.el is bad software: if the goal is to
produce a new benchmarking framework, without using existing code, for
use in Emacs, it's a good early start.  Continuing the effort would be a
significant time investment, and the remaining time to reaching the goal
of a generally useful benchmarking framework is greater than what we
need to do if we start with ERT (and reuse the benchmarks, of course;
the issues are overwhelmingly in elisp-benchmarks.el).

elisp-benchmarks.el has not defined the circumstances in which it is
meant to be used: many of the issues can be avoided by running
elisp-benchmarks in a clean session which is terminated immediately
after running the benchmark.  However, if this limitation is meant to be
permanent, it is inappropriate to declare elisp-benchmarks-run
interactive, and it would mean elisp-benchmarks-run should enforce that
it is run just once per session (either by terminating it or by setting
a flag).

In reporting the issues, I worked under the assumption that
elisp-benchmarks can usefully be run in existing Emacs sessions,
interactive or not, as well as new ones, interactive or not.  If this is
considered out of scope for elisp-benchmarks, this would limit its
usefulness massively, and we would still need to declare this limitation
precisely.

Unfortunately, I decided to stop at some point: the issues, IMHO, need
to be addressed before we can consider the question of whether the
numbers produced by elisp-benchmarks are useful enough.

In particular, as you (Andrea) correctly pointed out, it is sometimes
appropriate to use an average run time (or, non-equivalently, an average
speed) for reporting test results; the assumptions needed for this are
very significant and need to be spelled out explicitly.  The vast
majority of "make benchmark" uses which I think should happen cannot
meet these stringent requirements.

To put things simply, it is better to discard outliers (test runs which
take significantly longer than the rest).  Averaging doesn't do that: it
simply ruins your entire test run if there is a significant outlier.
IOW, running the benchmarks with a large repetition count is very likely
to result in useful data being discarded, and a useless result.

elisp-benchmarks.el makes an attempt to detect outliers by reporting the
(modified) standard deviation of test times.  This is, again, okay for
some use cases, but for others, not so much.  In particular, while a
large standard deviation is a sufficient criterion for discarding a
test, a large repetition count can produce a small standard deviation
while reporting an unreliable average.  IMHO, reporting the minimum and
maximum run time would be more useful than the current result (the
minimum time for a successful benchmark run is a very useful number.  If
there was no system malfunction and the repetition count was large
enough, I still thisk this is almost always the number we want).  It
would mean increasing the repetition count would improve the data, while
in the current implementation, it mostly increases the risk of reporting
unreliable data.

My conclusion is that elisp-benchmarks.el (again, the benchmarks are
fine) isn't the right way forward.

I'm happy to change the scratch/elisp-benchmarks branch in the ways
we've discussed, and it should be merged, but if someone decides to
incrementally solve some of the issues, that, while not very harmful,
would be an inefficient use of resources.

Benchmarks need a test framework.  The options are reimplementing ERT or
using it.  I prefer the second approach and will investigate it further.

Pip




^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2025-01-04 16:34                                             ` Pip Cet via Emacs development discussions.
@ 2025-01-04 18:33                                               ` Eli Zaretskii
  2025-01-05 10:18                                                 ` Pip Cet via Emacs development discussions.
  2025-01-06 11:23                                               ` Andrea Corallo
  1 sibling, 1 reply; 64+ messages in thread
From: Eli Zaretskii @ 2025-01-04 18:33 UTC (permalink / raw)
  To: Pip Cet; +Cc: acorallo, stefankangas, mattiase, eggert, emacs-devel

> Date: Sat, 04 Jan 2025 16:34:24 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: acorallo@gnu.org, stefankangas@gmail.com, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org
> 
> "Eli Zaretskii" <eliz@gnu.org> writes:
> 
> > Since you've pushed that to a branch, I suggest to submit bug reports
> > about these issues, using "[scratch/elisp-benchmarks]" in the Subject
> > of the bug.
> 
> I've studied the issues a bit more.  This is a bit long, but in summary,
> I think improving elisp-benchmarks.el (the specific file, not the entire
> package, which I still intend to reuse) would take more time than
> starting from ERT

You mean, starting from scratch??  How can this be less work than
fixing whatever bugs you found in the benchmarks (assuming that we
want to fix all of them)?

> I'm reporting a small number of elisp-benchmarks "bugs" (I think the
> term is likely to be contentious; I use it because that's what the
> mailing list is called).  All of them are fixable.  Most of them are
> easily fixable by moving to ERT.

You mean, if we move to ERT, which by itself is a significant job,
then some or most of these bugs could be fixed as a side effect or
with much less work, is that it?  That could be so, but then the move
to ERT itself is not a small job, so we need to take that into
consideration when deciding whether to move to ERT right now.

> My conclusion is that elisp-benchmarks.el (again, the benchmarks are
> fine) isn't the right way forward.

Well, you though that to begin with, so forgive me if I say that I'd
like a second independent opinion in this case.

> I'm happy to change the scratch/elisp-benchmarks branch in the ways
> we've discussed, and it should be merged, but if someone decides to
> incrementally solve some of the issues, that, while not very harmful,
> would be an inefficient use of resources.

Let's hear what Andrea thinks about the issues you reported (let's
please discuss them on the bug tracker, not here), and let's take it
from there.

> Benchmarks need a test framework.

I don't necessarily agree.  A benchmark doesn't have to have a
"correct" or "expected" result, so a test framework is not necessarily
justified or needed.



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2025-01-04 18:33                                               ` Eli Zaretskii
@ 2025-01-05 10:18                                                 ` Pip Cet via Emacs development discussions.
  2025-01-15 22:20                                                   ` Stefan Kangas
  0 siblings, 1 reply; 64+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2025-01-05 10:18 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: acorallo, stefankangas, mattiase, eggert, emacs-devel

"Eli Zaretskii" <eliz@gnu.org> writes:

>> I'm happy to change the scratch/elisp-benchmarks branch in the ways
>> we've discussed, and it should be merged, but if someone decides to
>> incrementally solve some of the issues, that, while not very harmful,
>> would be an inefficient use of resources.
>
> Let's hear what Andrea thinks about the issues you reported (let's
> please discuss them on the bug tracker, not here), and let's take it
> from there.

Sure, just let me know when we can merge the branch (assuming you meant
to say that the merge, too, should wait for Andrea's thoughts) or which
other changes are required!

Pip




^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2025-01-04 16:34                                             ` Pip Cet via Emacs development discussions.
  2025-01-04 18:33                                               ` Eli Zaretskii
@ 2025-01-06 11:23                                               ` Andrea Corallo
  2025-01-06 14:46                                                 ` Eli Zaretskii
  1 sibling, 1 reply; 64+ messages in thread
From: Andrea Corallo @ 2025-01-06 11:23 UTC (permalink / raw)
  To: Pip Cet; +Cc: Eli Zaretskii, stefankangas, mattiase, eggert, emacs-devel

Pip Cet <pipcet@protonmail.com> writes:

> In particular, as you (Andrea) correctly pointed out, it is sometimes
> appropriate to use an average run time (or, non-equivalently, an average
> speed) for reporting test results; the assumptions needed for this are
> very significant and need to be spelled out explicitly.  The vast
> majority of "make benchmark" uses which I think should happen cannot
> meet these stringent requirements.
>
> To put things simply, it is better to discard outliers (test runs which
> take significantly longer than the rest).  Averaging doesn't do that: it
> simply ruins your entire test run if there is a significant outlier.
> IOW, running the benchmarks with a large repetition count is very likely
> to result in useful data being discarded, and a useless result.

As mentioned, I disagree with having some logic put in place to
arbitrarily decide which value is worth to be considered and which value
should be discarded.  If a system is producing noisy measures this has
to be reported as error of the measure.  Those numbers are there for
some real reason and have to be accounted.



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2025-01-06 11:23                                               ` Andrea Corallo
@ 2025-01-06 14:46                                                 ` Eli Zaretskii
  2025-01-06 18:41                                                   ` Andrea Corallo
  0 siblings, 1 reply; 64+ messages in thread
From: Eli Zaretskii @ 2025-01-06 14:46 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: pipcet, stefankangas, mattiase, eggert, emacs-devel

> From: Andrea Corallo <acorallo@gnu.org>
> Cc: Eli Zaretskii <eliz@gnu.org>,  stefankangas@gmail.com,
>   mattiase@acm.org,  eggert@cs.ucla.edu,  emacs-devel@gnu.org
> Date: Mon, 06 Jan 2025 06:23:22 -0500
> 
> Pip Cet <pipcet@protonmail.com> writes:
> 
> > In particular, as you (Andrea) correctly pointed out, it is sometimes
> > appropriate to use an average run time (or, non-equivalently, an average
> > speed) for reporting test results; the assumptions needed for this are
> > very significant and need to be spelled out explicitly.  The vast
> > majority of "make benchmark" uses which I think should happen cannot
> > meet these stringent requirements.
> >
> > To put things simply, it is better to discard outliers (test runs which
> > take significantly longer than the rest).  Averaging doesn't do that: it
> > simply ruins your entire test run if there is a significant outlier.
> > IOW, running the benchmarks with a large repetition count is very likely
> > to result in useful data being discarded, and a useless result.
> 
> As mentioned, I disagree with having some logic put in place to
> arbitrarily decide which value is worth to be considered and which value
> should be discarded.  If a system is producing noisy measures this has
> to be reported as error of the measure.  Those numbers are there for
> some real reason and have to be accounted.

Without too deep understanding of the underlying issue: IME, if some
sample can include outliers, it is always better to use robust
estimators, rather than attempt to detect and discard outliers.
That's because detection of outliers can decide that a valid
measurement is an outlier, and then the estimation becomes biased.

In practical terms, for estimating the mean, I can suggest to use the
sample median instead of the sample average.  The median is very
robust to outliers, and only slightly less efficient (i.e., converges
a bit slower) than the sample average.



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2025-01-06 14:46                                                 ` Eli Zaretskii
@ 2025-01-06 18:41                                                   ` Andrea Corallo
  0 siblings, 0 replies; 64+ messages in thread
From: Andrea Corallo @ 2025-01-06 18:41 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: pipcet, stefankangas, mattiase, eggert, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Andrea Corallo <acorallo@gnu.org>
>> Cc: Eli Zaretskii <eliz@gnu.org>,  stefankangas@gmail.com,
>>   mattiase@acm.org,  eggert@cs.ucla.edu,  emacs-devel@gnu.org
>> Date: Mon, 06 Jan 2025 06:23:22 -0500
>> 
>> Pip Cet <pipcet@protonmail.com> writes:
>> 
>> > In particular, as you (Andrea) correctly pointed out, it is sometimes
>> > appropriate to use an average run time (or, non-equivalently, an average
>> > speed) for reporting test results; the assumptions needed for this are
>> > very significant and need to be spelled out explicitly.  The vast
>> > majority of "make benchmark" uses which I think should happen cannot
>> > meet these stringent requirements.
>> >
>> > To put things simply, it is better to discard outliers (test runs which
>> > take significantly longer than the rest).  Averaging doesn't do that: it
>> > simply ruins your entire test run if there is a significant outlier.
>> > IOW, running the benchmarks with a large repetition count is very likely
>> > to result in useful data being discarded, and a useless result.
>> 
>> As mentioned, I disagree with having some logic put in place to
>> arbitrarily decide which value is worth to be considered and which value
>> should be discarded.  If a system is producing noisy measures this has
>> to be reported as error of the measure.  Those numbers are there for
>> some real reason and have to be accounted.
>
> Without too deep understanding of the underlying issue: IME, if some
> sample can include outliers, it is always better to use robust
> estimators, rather than attempt to detect and discard outliers.
> That's because detection of outliers can decide that a valid
> measurement is an outlier, and then the estimation becomes biased.

100% agreed

> In practical terms, for estimating the mean, I can suggest to use the
> sample median instead of the sample average.  The median is very
> robust to outliers, and only slightly less efficient (i.e., converges
> a bit slower) than the sample average.

For my experience benchmarks typically use geo-mean, there's quite some
info around on why is that, ex [1].  The use of arithmetic mean in
elisp-benchmarks is an error of youth (I'm responsible of) which I think
should be fixed.

  Andrea

[1] <https://dl.acm.org/doi/pdf/10.1145/5666.5673>



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2025-01-05 10:18                                                 ` Pip Cet via Emacs development discussions.
@ 2025-01-15 22:20                                                   ` Stefan Kangas
  2025-01-16  6:42                                                     ` Eli Zaretskii
  0 siblings, 1 reply; 64+ messages in thread
From: Stefan Kangas @ 2025-01-15 22:20 UTC (permalink / raw)
  To: Pip Cet, Eli Zaretskii; +Cc: acorallo, mattiase, eggert, emacs-devel

Pip Cet <pipcet@protonmail.com> writes:

> "Eli Zaretskii" <eliz@gnu.org> writes:
>
>>> I'm happy to change the scratch/elisp-benchmarks branch in the ways
>>> we've discussed, and it should be merged, but if someone decides to
>>> incrementally solve some of the issues, that, while not very harmful,
>>> would be an inefficient use of resources.
>>
>> Let's hear what Andrea thinks about the issues you reported (let's
>> please discuss them on the bug tracker, not here), and let's take it
>> from there.
>
> Sure, just let me know when we can merge the branch (assuming you meant
> to say that the merge, too, should wait for Andrea's thoughts) or which
> other changes are required!

Is there anything left to do here that is blocking the merge?

> Top-posted TL;DR: let's call Andrea's code "make elisp-benchmarks" and
> include it now?  That would preserve the Git history and importantly (to
> me) reserve the name for now.

Sounds good to me.  We can discuss changing things further after the merge.



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2025-01-15 22:20                                                   ` Stefan Kangas
@ 2025-01-16  6:42                                                     ` Eli Zaretskii
  2025-01-17 13:59                                                       ` Andrea Corallo
  0 siblings, 1 reply; 64+ messages in thread
From: Eli Zaretskii @ 2025-01-16  6:42 UTC (permalink / raw)
  To: Stefan Kangas, acorallo; +Cc: pipcet, mattiase, eggert, emacs-devel

> From: Stefan Kangas <stefankangas@gmail.com>
> Date: Wed, 15 Jan 2025 14:20:42 -0800
> Cc: acorallo@gnu.org, mattiase@acm.org, eggert@cs.ucla.edu, 
> 	emacs-devel@gnu.org
> 
> Pip Cet <pipcet@protonmail.com> writes:
> 
> > "Eli Zaretskii" <eliz@gnu.org> writes:
> >
> >>> I'm happy to change the scratch/elisp-benchmarks branch in the ways
> >>> we've discussed, and it should be merged, but if someone decides to
> >>> incrementally solve some of the issues, that, while not very harmful,
> >>> would be an inefficient use of resources.
> >>
> >> Let's hear what Andrea thinks about the issues you reported (let's
> >> please discuss them on the bug tracker, not here), and let's take it
> >> from there.
> >
> > Sure, just let me know when we can merge the branch (assuming you meant
> > to say that the merge, too, should wait for Andrea's thoughts) or which
> > other changes are required!
> 
> Is there anything left to do here that is blocking the merge?

I didn't follow the discussions too closely, but if there are no
unresolved issues left between Pip and Andrea, I think this can land.

Andrea, any objections or comments?



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2025-01-16  6:42                                                     ` Eli Zaretskii
@ 2025-01-17 13:59                                                       ` Andrea Corallo
  2025-01-17 14:37                                                         ` Pip Cet via Emacs development discussions.
  0 siblings, 1 reply; 64+ messages in thread
From: Andrea Corallo @ 2025-01-17 13:59 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Stefan Kangas, pipcet, mattiase, eggert, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Stefan Kangas <stefankangas@gmail.com>
>> Date: Wed, 15 Jan 2025 14:20:42 -0800
>> Cc: acorallo@gnu.org, mattiase@acm.org, eggert@cs.ucla.edu, 
>> 	emacs-devel@gnu.org
>> 
>> Pip Cet <pipcet@protonmail.com> writes:
>> 
>> > "Eli Zaretskii" <eliz@gnu.org> writes:
>> >
>> >>> I'm happy to change the scratch/elisp-benchmarks branch in the ways
>> >>> we've discussed, and it should be merged, but if someone decides to
>> >>> incrementally solve some of the issues, that, while not very harmful,
>> >>> would be an inefficient use of resources.
>> >>
>> >> Let's hear what Andrea thinks about the issues you reported (let's
>> >> please discuss them on the bug tracker, not here), and let's take it
>> >> from there.
>> >
>> > Sure, just let me know when we can merge the branch (assuming you meant
>> > to say that the merge, too, should wait for Andrea's thoughts) or which
>> > other changes are required!
>> 
>> Is there anything left to do here that is blocking the merge?
>
> I didn't follow the discussions too closely, but if there are no
> unresolved issues left between Pip and Andrea, I think this can land.
>
> Andrea, any objections or comments?

I've no objections, only two questions:

1- admin/elpa2emacs.sh looks elisp-benchmarks specific, if that's the
   case perhaps should have a better name?  Otherwise we should probably
   make it general?

2- What is the workflow for importing into emacs.git existing changes in
   elisp-benchmarks?  I tried to run elpa2emacs.sh as suggested but I do
   get an error (not sure if the suggested invocation is for the initial
   import, for merging incremental changes or both).


  Andrea



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2025-01-17 13:59                                                       ` Andrea Corallo
@ 2025-01-17 14:37                                                         ` Pip Cet via Emacs development discussions.
  2025-01-17 20:48                                                           ` Andrea Corallo
  0 siblings, 1 reply; 64+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2025-01-17 14:37 UTC (permalink / raw)
  To: Andrea Corallo
  Cc: Eli Zaretskii, Stefan Kangas, mattiase, eggert, emacs-devel

"Andrea Corallo" <acorallo@gnu.org> writes:

> Eli Zaretskii <eliz@gnu.org> writes:
>
>>> From: Stefan Kangas <stefankangas@gmail.com>
>>> Date: Wed, 15 Jan 2025 14:20:42 -0800
>>> Cc: acorallo@gnu.org, mattiase@acm.org, eggert@cs.ucla.edu,
>>> 	emacs-devel@gnu.org
>>>
>>> Pip Cet <pipcet@protonmail.com> writes:
>>>
>>> > "Eli Zaretskii" <eliz@gnu.org> writes:
>>> >
>>> >>> I'm happy to change the scratch/elisp-benchmarks branch in the ways
>>> >>> we've discussed, and it should be merged, but if someone decides to
>>> >>> incrementally solve some of the issues, that, while not very harmful,
>>> >>> would be an inefficient use of resources.
>>> >>
>>> >> Let's hear what Andrea thinks about the issues you reported (let's
>>> >> please discuss them on the bug tracker, not here), and let's take it
>>> >> from there.
>>> >
>>> > Sure, just let me know when we can merge the branch (assuming you meant
>>> > to say that the merge, too, should wait for Andrea's thoughts) or which
>>> > other changes are required!
>>>
>>> Is there anything left to do here that is blocking the merge?
>>
>> I didn't follow the discussions too closely, but if there are no
>> unresolved issues left between Pip and Andrea, I think this can land.
>>
>> Andrea, any objections or comments?
>
> I've no objections, only two questions:

Thanks!  I'll merge without the admin/elpa2emacs.sh script (which the
questions are about), then?

> 1- admin/elpa2emacs.sh looks elisp-benchmarks specific, if that's the
>    case perhaps should have a better name?  Otherwise we should probably
>    make it general?

You're right, it currently hardcodes the "benchmarks/" destination
directory.  My reasoning for this was that other elpa modules would most
likely require different files to be put into different directories, and
so the script would need modifying anyway.  (As git filter-repo isn't
exactly well-behaved, it would probably be easiest to run the script
once for each destination directory).

> 2- What is the workflow for importing into emacs.git existing changes in
>    elisp-benchmarks?  I tried to run elpa2emacs.sh as suggested but I do
>    get an error (not sure if the suggested invocation is for the initial
>    import, for merging incremental changes or both).

What kind of error?  There are two problems here:

1. we add a remote called elpa2emacs-filtered-elpa, but we don't remove
it automatically.  Re-running elpa2emacs.sh may use the old remote, with
undesirable results.  But removing a remote means interfering with an
existing git repository even more (I destroyed several emacs repos while
adjusting this script).

2. we hardcode origin/master as the Emacs branch we're working on.  I
believe this is forgivable, but it means we can't currently test the
merge case, because origin/master doesn't contain the elisp benchmarks
yet.

However, with these changes, merging an incremental change appears to
work fine here.  It's certainly supposed to work!

Pip




^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2025-01-17 14:37                                                         ` Pip Cet via Emacs development discussions.
@ 2025-01-17 20:48                                                           ` Andrea Corallo
  2025-01-17 21:00                                                             ` Pip Cet via Emacs development discussions.
  0 siblings, 1 reply; 64+ messages in thread
From: Andrea Corallo @ 2025-01-17 20:48 UTC (permalink / raw)
  To: Pip Cet; +Cc: Eli Zaretskii, Stefan Kangas, mattiase, eggert, emacs-devel

Pip Cet <pipcet@protonmail.com> writes:

> "Andrea Corallo" <acorallo@gnu.org> writes:
>
>> Eli Zaretskii <eliz@gnu.org> writes:
>>
>>>> From: Stefan Kangas <stefankangas@gmail.com>
>>>> Date: Wed, 15 Jan 2025 14:20:42 -0800
>>>> Cc: acorallo@gnu.org, mattiase@acm.org, eggert@cs.ucla.edu,
>>>> 	emacs-devel@gnu.org
>>>>
>>>> Pip Cet <pipcet@protonmail.com> writes:
>>>>
>>>> > "Eli Zaretskii" <eliz@gnu.org> writes:
>>>> >
>>>> >>> I'm happy to change the scratch/elisp-benchmarks branch in the ways
>>>> >>> we've discussed, and it should be merged, but if someone decides to
>>>> >>> incrementally solve some of the issues, that, while not very harmful,
>>>> >>> would be an inefficient use of resources.
>>>> >>
>>>> >> Let's hear what Andrea thinks about the issues you reported (let's
>>>> >> please discuss them on the bug tracker, not here), and let's take it
>>>> >> from there.
>>>> >
>>>> > Sure, just let me know when we can merge the branch (assuming you meant
>>>> > to say that the merge, too, should wait for Andrea's thoughts) or which
>>>> > other changes are required!
>>>>
>>>> Is there anything left to do here that is blocking the merge?
>>>
>>> I didn't follow the discussions too closely, but if there are no
>>> unresolved issues left between Pip and Andrea, I think this can land.
>>>
>>> Andrea, any objections or comments?
>>
>> I've no objections, only two questions:
>
> Thanks!  I'll merge without the admin/elpa2emacs.sh script (which the
> questions are about), then?

Hi Pip,

I'd prefer we finalize the workflow first, otherwise I don't know how to
merge in emacs.git the fixes/improvements we make in ELPA.

>> 1- admin/elpa2emacs.sh looks elisp-benchmarks specific, if that's the
>>    case perhaps should have a better name?  Otherwise we should probably
>>    make it general?
>
> You're right, it currently hardcodes the "benchmarks/" destination
> directory.  My reasoning for this was that other elpa modules would most
> likely require different files to be put into different directories, and
> so the script would need modifying anyway.  (As git filter-repo isn't
> exactly well-behaved, it would probably be easiest to run the script
> once for each destination directory).
>
>> 2- What is the workflow for importing into emacs.git existing changes in
>>    elisp-benchmarks?  I tried to run elpa2emacs.sh as suggested but I do
>>    get an error (not sure if the suggested invocation is for the initial
>>    import, for merging incremental changes or both).
>
> What kind of error?  There are two problems here:
>
> 1. we add a remote called elpa2emacs-filtered-elpa, but we don't remove
> it automatically.  Re-running elpa2emacs.sh may use the old remote, with
> undesirable results.  But removing a remote means interfering with an
> existing git repository even more (I destroyed several emacs repos while
> adjusting this script).

Okay but I guess we can assume elpa2emacs-filtered-elpa is a name that
would not be choosen by a developer no?  Otherwise IMO we have to add in
the comment instructions on how to re-run elpa2emacs.sh.

> 2. we hardcode origin/master as the Emacs branch we're working on.  I
> believe this is forgivable, but it means we can't currently test the
> merge case, because origin/master doesn't contain the elisp benchmarks
> yet.

I've git.sv.gnu.org/srv/git/emacs.git called 'savannah' instead of
'origin' as I've other remotes.  I think we should have a way to specify
the remote name.

Also I think the script should stop if any of the commands fails while I
see now it keep on running.

Thanks

  Andrea



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: New "make benchmark" target
  2025-01-17 20:48                                                           ` Andrea Corallo
@ 2025-01-17 21:00                                                             ` Pip Cet via Emacs development discussions.
  0 siblings, 0 replies; 64+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2025-01-17 21:00 UTC (permalink / raw)
  To: Andrea Corallo
  Cc: Eli Zaretskii, Stefan Kangas, mattiase, eggert, emacs-devel

"Andrea Corallo" <acorallo@gnu.org> writes:

> Pip Cet <pipcet@protonmail.com> writes:
>
>> "Andrea Corallo" <acorallo@gnu.org> writes:
>>
>>> Eli Zaretskii <eliz@gnu.org> writes:
>>>
>>>>> From: Stefan Kangas <stefankangas@gmail.com>
>>>>> Date: Wed, 15 Jan 2025 14:20:42 -0800
>>>>> Cc: acorallo@gnu.org, mattiase@acm.org, eggert@cs.ucla.edu,
>>>>> 	emacs-devel@gnu.org
>>>>>
>>>>> Pip Cet <pipcet@protonmail.com> writes:
>>>>>
>>>>> > "Eli Zaretskii" <eliz@gnu.org> writes:
>>>>> >
>>>>> >>> I'm happy to change the scratch/elisp-benchmarks branch in the ways
>>>>> >>> we've discussed, and it should be merged, but if someone decides to
>>>>> >>> incrementally solve some of the issues, that, while not very harmful,
>>>>> >>> would be an inefficient use of resources.
>>>>> >>
>>>>> >> Let's hear what Andrea thinks about the issues you reported (let's
>>>>> >> please discuss them on the bug tracker, not here), and let's take it
>>>>> >> from there.
>>>>> >
>>>>> > Sure, just let me know when we can merge the branch (assuming you meant
>>>>> > to say that the merge, too, should wait for Andrea's thoughts) or which
>>>>> > other changes are required!
>>>>>
>>>>> Is there anything left to do here that is blocking the merge?
>>>>
>>>> I didn't follow the discussions too closely, but if there are no
>>>> unresolved issues left between Pip and Andrea, I think this can land.
>>>>
>>>> Andrea, any objections or comments?
>>>
>>> I've no objections, only two questions:
>>
>> Thanks!  I'll merge without the admin/elpa2emacs.sh script (which the
>> questions are about), then?
>
> Hi Pip,
>
> I'd prefer we finalize the workflow first, otherwise I don't know how to
> merge in emacs.git the fixes/improvements we make in ELPA.

Okay!

>
>>> 1- admin/elpa2emacs.sh looks elisp-benchmarks specific, if that's the
>>>    case perhaps should have a better name?  Otherwise we should probably
>>>    make it general?
>>
>> You're right, it currently hardcodes the "benchmarks/" destination
>> directory.  My reasoning for this was that other elpa modules would most
>> likely require different files to be put into different directories, and
>> so the script would need modifying anyway.  (As git filter-repo isn't
>> exactly well-behaved, it would probably be easiest to run the script
>> once for each destination directory).
>>
>>> 2- What is the workflow for importing into emacs.git existing changes in
>>>    elisp-benchmarks?  I tried to run elpa2emacs.sh as suggested but I do
>>>    get an error (not sure if the suggested invocation is for the initial
>>>    import, for merging incremental changes or both).
>>
>> What kind of error?  There are two problems here:
>>
>> 1. we add a remote called elpa2emacs-filtered-elpa, but we don't remove
>> it automatically.  Re-running elpa2emacs.sh may use the old remote, with
>> undesirable results.  But removing a remote means interfering with an
>> existing git repository even more (I destroyed several emacs repos while
>> adjusting this script).
>
> Okay but I guess we can assume elpa2emacs-filtered-elpa is a name that
> would not be choosen by a developer no?  Otherwise IMO we have to add in
> the comment instructions on how to re-run elpa2emacs.sh.

You're right.  I'll add the code to remove the remote again.

>> 2. we hardcode origin/master as the Emacs branch we're working on.  I
>> believe this is forgivable, but it means we can't currently test the
>> merge case, because origin/master doesn't contain the elisp benchmarks
>> yet.
>
> I've git.sv.gnu.org/srv/git/emacs.git called 'savannah' instead of
> 'origin' as I've other remotes.  I think we should have a way to specify
> the remote name.

Hmm.  Wouldn't that be confusing in the case where people use a fresh
checkout rather than a new worktree?

Let me think about it: I think it might be best to write the script so
it's run in an existing repo, adds the worktree itself, then removes it
when it's done.

> Also I think the script should stop if any of the commands fails while I
> see now it keep on running.

Agreed.

Would it be okay to make this script include a safety prompt?  It
modifies the git repository in potentially destructive ways,
particularly if we change it to be run in the main emacs repo...

Pip




^ permalink raw reply	[flat|nested] 64+ messages in thread

end of thread, other threads:[~2025-01-17 21:00 UTC | newest]

Thread overview: 64+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-12-11 22:37 Improving EQ Pip Cet via Emacs development discussions.
2024-12-12  6:36 ` Eli Zaretskii
2024-12-12  8:23   ` Andrea Corallo
2024-12-12  8:36   ` Pip Cet via Emacs development discussions.
2024-12-12  9:18     ` Eli Zaretskii
2024-12-12  9:35     ` Visuwesh
2024-12-12 10:40     ` Andrea Corallo
2024-12-12 17:46       ` Pip Cet via Emacs development discussions.
2024-12-12 19:09         ` Eli Zaretskii
2024-12-12 10:53     ` New "make benchmark" target Stefan Kangas
2024-12-12 10:59       ` Andrea Corallo
2024-12-12 16:53         ` Pip Cet via Emacs development discussions.
2024-12-13  0:49           ` Stefan Kangas
2024-12-13  7:37             ` Andrea Corallo
2024-12-14 12:00               ` Stefan Kangas
2024-12-14 14:06                 ` Stefan Monnier
2024-12-14 11:34             ` Pip Cet via Emacs development discussions.
2024-12-14 11:58               ` Stefan Kangas
2024-12-14 20:07                 ` Pip Cet via Emacs development discussions.
2024-12-14 20:20                   ` João Távora
2024-12-15  0:57                   ` Stefan Kangas
2024-12-22 16:04                     ` Pip Cet via Emacs development discussions.
2024-12-29 10:47                       ` Andrea Corallo
2024-12-30 11:45                         ` Pip Cet via Emacs development discussions.
2024-12-30 14:15                           ` Eli Zaretskii
2024-12-30 15:00                             ` Pip Cet via Emacs development discussions.
2024-12-30 15:21                               ` Eli Zaretskii
2024-12-30 15:49                                 ` Pip Cet via Emacs development discussions.
2024-12-30 15:53                                   ` João Távora
2024-12-30 16:40                                   ` Eli Zaretskii
2024-12-30 17:25                                     ` Pip Cet via Emacs development discussions.
2024-12-30 18:16                                       ` Eli Zaretskii
2024-12-31  4:00                                         ` Pip Cet via Emacs development discussions.
2024-12-31  5:26                                           ` Stefan Kangas
2024-12-31 13:05                                             ` Eli Zaretskii
2024-12-31 14:14                                               ` Pip Cet via Emacs development discussions.
2024-12-31 14:22                                                 ` Eli Zaretskii
2024-12-31 12:53                                           ` Eli Zaretskii
2024-12-31 14:34                                             ` Andrea Corallo
2024-12-30 18:26                                       ` Andrea Corallo
2024-12-30 18:58                                         ` Stefan Kangas
2024-12-30 21:34                                         ` Pip Cet via Emacs development discussions.
2024-12-31  9:55                                           ` Andrea Corallo
2024-12-31 12:43                                           ` Eli Zaretskii
2024-12-31 14:01                                             ` Pip Cet via Emacs development discussions.
2025-01-04 16:34                                             ` Pip Cet via Emacs development discussions.
2025-01-04 18:33                                               ` Eli Zaretskii
2025-01-05 10:18                                                 ` Pip Cet via Emacs development discussions.
2025-01-15 22:20                                                   ` Stefan Kangas
2025-01-16  6:42                                                     ` Eli Zaretskii
2025-01-17 13:59                                                       ` Andrea Corallo
2025-01-17 14:37                                                         ` Pip Cet via Emacs development discussions.
2025-01-17 20:48                                                           ` Andrea Corallo
2025-01-17 21:00                                                             ` Pip Cet via Emacs development discussions.
2025-01-06 11:23                                               ` Andrea Corallo
2025-01-06 14:46                                                 ` Eli Zaretskii
2025-01-06 18:41                                                   ` Andrea Corallo
2024-12-15  0:58                   ` Stefan Kangas
2024-12-12 10:42 ` Improving EQ Óscar Fuentes
2024-12-12 10:50   ` Andrea Corallo
2024-12-12 11:21     ` Óscar Fuentes
2024-12-13 12:24       ` Pip Cet via Emacs development discussions.
2024-12-12 17:05     ` Pip Cet via Emacs development discussions.
2024-12-12 18:10     ` John ff

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).