Improving EQ

unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed

* Improving EQ
@ 2024-12-11 22:37 Pip Cet via Emacs development discussions.
  2024-12-12  6:36 ` Eli Zaretskii
  2024-12-12 10:42 ` Improving EQ Óscar Fuentes
  0 siblings, 2 replies; 17+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-11 22:37 UTC (permalink / raw)
  To: emacs-devel

I looked at the "new" code generated for our EQ macro, and decided that
a fix was in order.  I'm therefore sending a first proposal to explain
what I think should be done, and some numbers.

This patch:
* moves the "slow path" of EQ into a NO_INLINE function
* exits early if the arguments to EQ are actually BASE_EQ
* returns quickly (after a single memory access which cannot be avoided
until we fix our tagging scheme to distinguish exotic objects from
ordinary ones) when symbols_with_pos_enabled isn't true.

The effect on the code size of the stripped emacs binary is small, but
significant: 8906336 bytes instead of 8955488 bytes on this machine.
(The effect on the code size of the emacs binary with debugging
information is much larger, reducing it from 32182000 bytes to 31125832
bytes on this system.)  There is no effect on the size of the .pdmp
file, which is expected.

What's missing here is a benchmark, but unless there's a really nasty
surprise when that happens, I'm quite confident that we can improve the
code here.

The proposed code doesn't use __builtin_expect anymore.

I've deliberately written slow_eq so it returns the same value as EQ,
even if the slow code path is disabled.

Pip

commit 2c807f7320bcb9654e0f148d64c92053b1a47b42 (HEAD -> faster-eq)
Author: Pip Cet <pipcet@protonmail.com>
Date:   Wed Dec 11 22:31:07 2024 +0000

    Change EQ to move slow code path into a separate function

    * src/data.c (slow_eq): New function.
    * src/lisp.h (EQ): Call it.

diff --git a/src/data.c b/src/data.c
index 66cf34c1e60..5ee383d2f48 100644
--- a/src/data.c
+++ b/src/data.c
@@ -162,6 +162,15 @@ circular_list (Lisp_Object list)
 \f
 /* Data type predicates.  */

+/* NO_INLINE to avoid excessive code growth when LTO is in use.  */
+NO_INLINE bool slow_eq (Lisp_Object x, Lisp_Object y)
+{
+  return BASE_EQ ((symbols_with_pos_enabled && SYMBOL_WITH_POS_P (x) ?
+		   XSYMBOL_WITH_POS_SYM (x) : x),
+		  (symbols_with_pos_enabled && SYMBOL_WITH_POS_P (y) ?
+		   XSYMBOL_WITH_POS_SYM (y) : y));
+}
+
 DEFUN ("eq", Feq, Seq, 2, 2, 0,
        doc: /* Return t if the two args are the same Lisp object.  */
        attributes: const)
diff --git a/src/lisp.h b/src/lisp.h
index 832a1755c04..64d4835a499 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -618,6 +618,7 @@ #define ENUM_BF(TYPE) enum TYPE
 extern Lisp_Object default_value (Lisp_Object symbol);
 extern void defalias (Lisp_Object symbol, Lisp_Object definition);
 extern char *fixnum_to_string (EMACS_INT number, char *buffer, char *end);
+extern bool slow_eq (Lisp_Object x, Lisp_Object y);

 /* Defined in emacs.c.  */
@@ -1353,10 +1354,12 @@ make_fixed_natnum (EMACS_INT n)
 INLINE bool
 EQ (Lisp_Object x, Lisp_Object y)
 {
-  return BASE_EQ ((__builtin_expect (symbols_with_pos_enabled, false)
-		   && SYMBOL_WITH_POS_P (x) ? XSYMBOL_WITH_POS_SYM (x) : x),
-		  (__builtin_expect (symbols_with_pos_enabled, false)
-		   && SYMBOL_WITH_POS_P (y) ? XSYMBOL_WITH_POS_SYM (y) : y));
+  if (BASE_EQ (x, y))
+    return true;
+  else if (!symbols_with_pos_enabled)
+    return false;
+  else
+    return slow_eq (x, y);
 }

 INLINE intmax_t

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: Improving EQ
  2024-12-11 22:37 Improving EQ Pip Cet via Emacs development discussions.
@ 2024-12-12  6:36 ` Eli Zaretskii
  2024-12-12  8:23   ` Andrea Corallo
  2024-12-12  8:36   ` Pip Cet via Emacs development discussions.
  2024-12-12 10:42 ` Improving EQ Óscar Fuentes
  1 sibling, 2 replies; 17+ messages in thread
From: Eli Zaretskii @ 2024-12-12  6:36 UTC (permalink / raw)
  To: Pip Cet, Mattias Engdegård, Paul Eggert; +Cc: emacs-devel

> Date: Wed, 11 Dec 2024 22:37:04 +0000
> From:  Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
> 
> What's missing here is a benchmark, but unless there's a really nasty
> surprise when that happens, I'm quite confident that we can improve the
> code here.

The usual easy benchmark is to byte-compile all the *.el files in the
source tree.  That is, remove all the *.elc files, then say "make" and
time that.

There was also some Emacs benchmark suite that someone posted, but I
cannot find it now, maybe someone else will.

Adding Mattias and Paul to this discussion.



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Improving EQ
  2024-12-12  6:36 ` Eli Zaretskii
@ 2024-12-12  8:23   ` Andrea Corallo
  2024-12-12  8:36   ` Pip Cet via Emacs development discussions.
  1 sibling, 0 replies; 17+ messages in thread
From: Andrea Corallo @ 2024-12-12  8:23 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Pip Cet, Mattias Engdegård, Paul Eggert, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> Date: Wed, 11 Dec 2024 22:37:04 +0000
>> From:  Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
>> 
>> What's missing here is a benchmark, but unless there's a really nasty
>> surprise when that happens, I'm quite confident that we can improve the
>> code here.
>
> The usual easy benchmark is to byte-compile all the *.el files in the
> source tree.  That is, remove all the *.elc files, then say "make" and
> time that.

Agree, considering that tests the non zero 'symbols_with_pos_enabled'
case.

> There was also some Emacs benchmark suite that someone posted, but I
> cannot find it now, maybe someone else will.

<https://elpa.gnu.org/packages/elisp-benchmarks.html>



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Improving EQ
  2024-12-12  6:36 ` Eli Zaretskii
  2024-12-12  8:23   ` Andrea Corallo
@ 2024-12-12  8:36   ` Pip Cet via Emacs development discussions.
  2024-12-12  9:18     ` Eli Zaretskii
                       ` (3 more replies)
  1 sibling, 4 replies; 17+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-12  8:36 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Mattias Engdegård, Paul Eggert, emacs-devel

"Eli Zaretskii" <eliz@gnu.org> writes:

>> Date: Wed, 11 Dec 2024 22:37:04 +0000
>> From:  Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
>>
>> What's missing here is a benchmark, but unless there's a really nasty
>> surprise when that happens, I'm quite confident that we can improve the
>> code here.
>
> The usual easy benchmark is to byte-compile all the *.el files in the
> source tree.  That is, remove all the *.elc files, then say "make" and
> time that.

Considering the point of the optimization was to make compilation (when
symbols_with_pos_enabled is true) slower, but speed up non-compilation
use cases, I think that may be the opposite of what we want :-)

Furthermore, the master branch doesn't currently build after deleting
all the *.elc files, because recompilation exceeds max-lisp-eval-depth
in that scenario (together with the known purespace issue, this pretty
much means "make bootstrap" is the only way I can rebuild an emacs tree
right now. It'd be great if Someone could look into this, but I've
failed to understand the native-compilation code (and been told off for
trying to) too often for that Someone to be me. Plus, of course, I fully
understand that native compilation currently has wrong code generation
bugs which obviously have to take priority over build issues...)

> There was also some Emacs benchmark suite that someone posted, but I
> cannot find it now, maybe someone else will.

https://elpa.gnu.org/packages/elisp-benchmarks.html ? It'd be great if
we could agree on a benchmark, and even better if there were a way to
reliably run it from emacs -Q :-)

In fact, I would suggest to move a reduced benchmark suite to the emacs
repo itself, and run it using "make benchmark".

Also, just to let everyone know, I'm planning to make the "exotic"
property (this object must or can use the slow_eq path) part (probably
the LSB) of the tag rather than accessing it via a global variable and
the PVEC type. This should reduce code size further, should speed up
things, and has some other advantages which I'll go into when I have
working code.

Pip

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Improving EQ
  2024-12-12  8:36   ` Pip Cet via Emacs development discussions.
@ 2024-12-12  9:18     ` Eli Zaretskii
  2024-12-12  9:35     ` Visuwesh
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 17+ messages in thread
From: Eli Zaretskii @ 2024-12-12  9:18 UTC (permalink / raw)
  To: Pip Cet; +Cc: mattiase, eggert, emacs-devel

> Date: Thu, 12 Dec 2024 08:36:50 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: Mattias Engdegård <mattiase@acm.org>, Paul Eggert <eggert@cs.ucla.edu>, emacs-devel@gnu.org
> 
> "Eli Zaretskii" <eliz@gnu.org> writes:
> 
> > The usual easy benchmark is to byte-compile all the *.el files in the
> > source tree.  That is, remove all the *.elc files, then say "make" and
> > time that.
> 
> Considering the point of the optimization was to make compilation (when
> symbols_with_pos_enabled is true) slower, but speed up non-compilation
> use cases, I think that may be the opposite of what we want :-)

That's fine, because knowing where this slows us down and by how much
is also important.

> Furthermore, the master branch doesn't currently build after deleting
> all the *.elc files, because recompilation exceeds max-lisp-eval-depth
> in that scenario (together with the known purespace issue, this pretty
> much means "make bootstrap" is the only way I can rebuild an emacs tree
> right now. It'd be great if Someone could look into this, but I've
> failed to understand the native-compilation code (and been told off for
> trying to) too often for that Someone to be me. Plus, of course, I fully
> understand that native compilation currently has wrong code generation
> bugs which obviously have to take priority over build issues...)

If this is with native-compilation, how about trying without?

Also, enlarging max-lisp-eval-depth (assuming you don't somehow hit
infinite recursion) locally should be easy: just add that to the
relevant Makefiles.

> https://elpa.gnu.org/packages/elisp-benchmarks.html ? It'd be great if
> we could agree on a benchmark, and even better if there were a way to
> reliably run it from emacs -Q :-)

Our benchmark facilities are very rudimentary, so agreement is not an
issue: we just use whatever is available.

> In fact, I would suggest to move a reduced benchmark suite to the emacs
> repo itself, and run it using "make benchmark".

Working on a better benchmark is very useful, but maybe we should try
solving one problem at a time?

> Also, just to let everyone know, I'm planning to make the "exotic"
> property (this object must or can use the slow_eq path) part (probably
> the LSB) of the tag rather than accessing it via a global variable and
> the PVEC type. This should reduce code size further, should speed up
> things, and has some other advantages which I'll go into when I have
> working code.

Whenever you change something in the tags, please remember to update
.gdbinit, otherwise we lose debugging support.



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Improving EQ
  2024-12-12  8:36   ` Pip Cet via Emacs development discussions.
  2024-12-12  9:18     ` Eli Zaretskii
@ 2024-12-12  9:35     ` Visuwesh
  2024-12-12 10:40     ` Andrea Corallo
  2024-12-12 10:53     ` New "make benchmark" target Stefan Kangas
  3 siblings, 0 replies; 17+ messages in thread
From: Visuwesh @ 2024-12-12  9:35 UTC (permalink / raw)
  To: Pip Cet via Emacs development discussions.
  Cc: Eli Zaretskii, Pip Cet, Mattias Engdegård, Paul Eggert

[வியாழன் டிசம்பர் 12, 2024] Pip Cet via "Emacs development discussions." wrote:

>> There was also some Emacs benchmark suite that someone posted, but I
>> cannot find it now, maybe someone else will.
>
> https://elpa.gnu.org/packages/elisp-benchmarks.html ? It'd be great if
> we could agree on a benchmark, and even better if there were a way to
> reliably run it from emacs -Q :-)

Will the command package-isolate help in this scenario?



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Improving EQ
  2024-12-12  8:36   ` Pip Cet via Emacs development discussions.
  2024-12-12  9:18     ` Eli Zaretskii
  2024-12-12  9:35     ` Visuwesh
@ 2024-12-12 10:40     ` Andrea Corallo
  2024-12-12 17:46       ` Pip Cet via Emacs development discussions.
  2024-12-12 10:53     ` New "make benchmark" target Stefan Kangas
  3 siblings, 1 reply; 17+ messages in thread
From: Andrea Corallo @ 2024-12-12 10:40 UTC (permalink / raw)
  To: Pip Cet via Emacs development discussions.
  Cc: Eli Zaretskii, Pip Cet, Mattias Engdegård, Paul Eggert

Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
writes:

> "Eli Zaretskii" <eliz@gnu.org> writes:
>
>>> Date: Wed, 11 Dec 2024 22:37:04 +0000
>>> From:  Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
>>>
>>> What's missing here is a benchmark, but unless there's a really nasty
>>> surprise when that happens, I'm quite confident that we can improve the
>>> code here.
>>
>> The usual easy benchmark is to byte-compile all the *.el files in the
>> source tree.  That is, remove all the *.elc files, then say "make" and
>> time that.
>
> Considering the point of the optimization was to make compilation (when
> symbols_with_pos_enabled is true) slower, but speed up non-compilation
> use cases, I think that may be the opposite of what we want :-)

Glad you finally agree on the goal of the optimization.

> Furthermore, the master branch doesn't currently build after deleting
> all the *.elc files, because recompilation exceeds max-lisp-eval-depth
> in that scenario (together with the known purespace issue, this pretty
> much means "make bootstrap" is the only way I can rebuild an emacs tree
> right now. It'd be great if Someone could look into this, but I've
> failed to understand the native-compilation code (and been told off for
> trying to) too often for that Someone to be me. Plus, of course, I fully
> understand that native compilation currently has wrong code generation
> bugs which obviously have to take priority over build issues...)
>
>> There was also some Emacs benchmark suite that someone posted, but I
>> cannot find it now, maybe someone else will.
>
> https://elpa.gnu.org/packages/elisp-benchmarks.html ? It'd be great if
> we could agree on a benchmark, and even better if there were a way to
> reliably run it from emacs -Q :-)

What is not reliable in the elisp-benchmarks invocation suggested in the
instructions in it?

> In fact, I would suggest to move a reduced benchmark suite to the emacs
> repo itself, and run it using "make benchmark".

That would be nice.



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Improving EQ
  2024-12-11 22:37 Improving EQ Pip Cet via Emacs development discussions.
  2024-12-12  6:36 ` Eli Zaretskii
@ 2024-12-12 10:42 ` Óscar Fuentes
  2024-12-12 10:50   ` Andrea Corallo
  1 sibling, 1 reply; 17+ messages in thread
From: Óscar Fuentes @ 2024-12-12 10:42 UTC (permalink / raw)
  To: emacs-devel; +Cc: Pip Cet

Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
writes:

> I looked at the "new" code generated for our EQ macro, and decided that
> a fix was in order.  I'm therefore sending a first proposal to explain
> what I think should be done, and some numbers.
>
> This patch:
> * moves the "slow path" of EQ into a NO_INLINE function
> * exits early if the arguments to EQ are actually BASE_EQ
> * returns quickly (after a single memory access which cannot be avoided
> until we fix our tagging scheme to distinguish exotic objects from
> ordinary ones) when symbols_with_pos_enabled isn't true.
>
> The effect on the code size of the stripped emacs binary is small, but
> significant: 8906336 bytes instead of 8955488 bytes on this machine.
> (The effect on the code size of the emacs binary with debugging
> information is much larger, reducing it from 32182000 bytes to 31125832
> bytes on this system.)  There is no effect on the size of the .pdmp
> file, which is expected.
>
> What's missing here is a benchmark, but unless there's a really nasty
> surprise when that happens, I'm quite confident that we can improve the
> code here.

I've seen too many cases where *removing* instructions (mind you,
literally removing, not changing!) made the code significantly slower.

Modern CPUs are insanely complex and combined with compilers make
intuition-based predictions even more futile.

But reading your message makes me wonder if EQ and some other "simple"
fundamental functions are not lowered by nativecomp? If not, maybe
that's a significant opportunity for improvement.

As for your patch, one thing that would be easy to do and might save
quite a lot of head scratching is to count the fraction of the calls to
EQ that benefit from the fast path on a "representative" Emacs run. Then
you would have hard data to decide if fighting the compiler/CPU on that
case is a worthy cause.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Improving EQ
  2024-12-12 10:42 ` Improving EQ Óscar Fuentes
@ 2024-12-12 10:50   ` Andrea Corallo
  2024-12-12 11:21     ` Óscar Fuentes
                       ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Andrea Corallo @ 2024-12-12 10:50 UTC (permalink / raw)
  To: Óscar Fuentes; +Cc: emacs-devel, Pip Cet

Óscar Fuentes <ofv@wanadoo.es> writes:

> Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
> writes:
>
>> I looked at the "new" code generated for our EQ macro, and decided that
>> a fix was in order.  I'm therefore sending a first proposal to explain
>> what I think should be done, and some numbers.
>>
>> This patch:
>> * moves the "slow path" of EQ into a NO_INLINE function
>> * exits early if the arguments to EQ are actually BASE_EQ
>> * returns quickly (after a single memory access which cannot be avoided
>> until we fix our tagging scheme to distinguish exotic objects from
>> ordinary ones) when symbols_with_pos_enabled isn't true.
>>
>> The effect on the code size of the stripped emacs binary is small, but
>> significant: 8906336 bytes instead of 8955488 bytes on this machine.
>> (The effect on the code size of the emacs binary with debugging
>> information is much larger, reducing it from 32182000 bytes to 31125832
>> bytes on this system.)  There is no effect on the size of the .pdmp
>> file, which is expected.
>>
>> What's missing here is a benchmark, but unless there's a really nasty
>> surprise when that happens, I'm quite confident that we can improve the
>> code here.
>
> I've seen too many cases where *removing* instructions (mind you,
> literally removing, not changing!) made the code significantly slower.
>
> Modern CPUs are insanely complex and combined with compilers make
> intuition-based predictions even more futile.

That's why the patch needs to be benchmarked anyway.

> But reading your message makes me wonder if EQ and some other "simple"
> fundamental functions are not lowered by nativecomp? If not, maybe
> that's a significant opportunity for improvement.

Nativecomp only compiles eq for Lisp code, the one discussed here is the
eq used in C (and bytecode).

BTW ATM nativecomp generates code with the same layout of the eq we had
in C till my last change of few weeks ago.  When eq will be stable in C
I guess I'll replicate the layout for generated code for Lisp as well.

  Andrea



^ permalink raw reply	[flat|nested] 17+ messages in thread

* New "make benchmark" target
  2024-12-12  8:36   ` Pip Cet via Emacs development discussions.
                       ` (2 preceding siblings ...)
  2024-12-12 10:40     ` Andrea Corallo
@ 2024-12-12 10:53     ` Stefan Kangas
  2024-12-12 10:59       ` Andrea Corallo
  3 siblings, 1 reply; 17+ messages in thread
From: Stefan Kangas @ 2024-12-12 10:53 UTC (permalink / raw)
  To: Pip Cet, Eli Zaretskii
  Cc: Mattias Engdegård, Paul Eggert, emacs-devel, Andrea Corallo

Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
writes:

> https://elpa.gnu.org/packages/elisp-benchmarks.html ? It'd be great if
> we could agree on a benchmark, and even better if there were a way to
> reliably run it from emacs -Q :-)
>
> In fact, I would suggest to move a reduced benchmark suite to the emacs
> repo itself, and run it using "make benchmark".

SGTM, but why a reduced suite and not just the whole thing?



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: New "make benchmark" target
  2024-12-12 10:53     ` New "make benchmark" target Stefan Kangas
@ 2024-12-12 10:59       ` Andrea Corallo
  2024-12-12 16:53         ` Pip Cet via Emacs development discussions.
  0 siblings, 1 reply; 17+ messages in thread
From: Andrea Corallo @ 2024-12-12 10:59 UTC (permalink / raw)
  To: Stefan Kangas
  Cc: Pip Cet, Eli Zaretskii, Mattias Engdegård, Paul Eggert,
	emacs-devel

Stefan Kangas <stefankangas@gmail.com> writes:

> Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
> writes:
>
>> https://elpa.gnu.org/packages/elisp-benchmarks.html ? It'd be great if
>> we could agree on a benchmark, and even better if there were a way to
>> reliably run it from emacs -Q :-)
>>
>> In fact, I would suggest to move a reduced benchmark suite to the emacs
>> repo itself, and run it using "make benchmark".
>
> SGTM, but why a reduced suite and not just the whole thing?

My fear is that if we start going into the rabbit hole of which
benchmark of elisp-benchmarks should or should not be included, we will
never agree and as a consequence succeed.  So I guess I'd favor as well
including all elisp-benchmarks.

  Andrea



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Improving EQ
  2024-12-12 10:50   ` Andrea Corallo
@ 2024-12-12 11:21     ` Óscar Fuentes
  2024-12-12 17:05     ` Pip Cet via Emacs development discussions.
  2024-12-12 18:10     ` John ff
  2 siblings, 0 replies; 17+ messages in thread
From: Óscar Fuentes @ 2024-12-12 11:21 UTC (permalink / raw)
  To: emacs-devel

Andrea Corallo <acorallo@gnu.org> writes:

>> But reading your message makes me wonder if EQ and some other "simple"
>> fundamental functions are not lowered by nativecomp? If not, maybe
>> that's a significant opportunity for improvement.
>
> Nativecomp only compiles eq for Lisp code, the one discussed here is the
> eq used in C (and bytecode).

Ok, thanks.

Of course this change also affects Emacs running with nativecomp, as
many calls to EQ are made by C functions not lowered by nativecomp.

My guess is that nativecomp's performance would benefit quite a bit from
the general approach of this patch, as every point where nativecomp
calls C is a pessimization spot, but that's another topic.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: New "make benchmark" target
  2024-12-12 10:59       ` Andrea Corallo
@ 2024-12-12 16:53         ` Pip Cet via Emacs development discussions.
  0 siblings, 0 replies; 17+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-12 16:53 UTC (permalink / raw)
  To: Andrea Corallo
  Cc: Stefan Kangas, Eli Zaretskii, Mattias Engdegård, Paul Eggert,
	emacs-devel

"Andrea Corallo" <acorallo@gnu.org> writes:

> Stefan Kangas <stefankangas@gmail.com> writes:
>
>> Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
>> writes:
>>
>>> https://elpa.gnu.org/packages/elisp-benchmarks.html ? It'd be great if
>>> we could agree on a benchmark, and even better if there were a way to
>>> reliably run it from emacs -Q :-)
>>>
>>> In fact, I would suggest to move a reduced benchmark suite to the emacs
>>> repo itself, and run it using "make benchmark".
>>
>> SGTM, but why a reduced suite and not just the whole thing?
>
> My fear is that if we start going into the rabbit hole of which
> benchmark of elisp-benchmarks should or should not be included, we will
> never agree and as a consequence succeed.  So I guess I'd favor as well
> including all elisp-benchmarks.

I agree a full benchmark suite would be even better, I don't recall why
I typed "reduced" there. So let's do that?

Pip




^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Improving EQ
  2024-12-12 10:50   ` Andrea Corallo
  2024-12-12 11:21     ` Óscar Fuentes
@ 2024-12-12 17:05     ` Pip Cet via Emacs development discussions.
  2024-12-12 18:10     ` John ff
  2 siblings, 0 replies; 17+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-12 17:05 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: Óscar Fuentes, emacs-devel

"Andrea Corallo" <acorallo@gnu.org> writes:
> Óscar Fuentes <ofv@wanadoo.es> writes:
>> I've seen too many cases where *removing* instructions (mind you,
>> literally removing, not changing!) made the code significantly slower.

Yes, there are many ways in which that can happen. Removing prefetches
or branch hints is the most obvious example, but I don't claim to know
all the ways, and ultimately if an expected performance improvement does
not materialize we might have to decide this one on code size reasons
alone (of course, if performance is drastically worse, we shouldn't
apply the patch).

>> Modern CPUs are insanely complex and combined with compilers make
>> intuition-based predictions even more futile.

The compiler isn't the issue here, since I checked the assembly code
that was generated. Totally agree about CPUs. For example, moving code
out of line will change many conditional branch locations to a single
one (the one in the out-of-line function), which may help or hurt branch
prediction, and that's just one of many ways in which inline functions
often lose.  So we should also benchmark whether this might be one of
those cases, in which case we'd want to move all of EQ to a non-inlined
function...

> That's why the patch needs to be benchmarked anyway.

Absolute agreement there. I tried some initial benchmarks and it's lost
in the noise, but that was while running on battery on a laptop, and I
need to test on a machine with a proper fixed clock rate.

>> But reading your message makes me wonder if EQ and some other "simple"
>> fundamental functions are not lowered by nativecomp? If not, maybe
>> that's a significant opportunity for improvement.
>
> Nativecomp only compiles eq for Lisp code, the one discussed here is the
> eq used in C (and bytecode).
>
> BTW ATM nativecomp generates code with the same layout of the eq we had
> in C till my last change of few weeks ago.  When eq will be stable in C
> I guess I'll replicate the layout for generated code for Lisp as well.

Thanks, that would be great. Yes, it makes most sense to test with and
without nativecomp, expecting improvement to be more significant in the
latter case (but as EQ is used by C code used by native-compiled Lisp, I
expect a small improvement there, too).

Pip

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Improving EQ
  2024-12-12 10:40     ` Andrea Corallo
@ 2024-12-12 17:46       ` Pip Cet via Emacs development discussions.
  2024-12-12 19:09         ` Eli Zaretskii
  0 siblings, 1 reply; 17+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-12 17:46 UTC (permalink / raw)
  To: Andrea Corallo
  Cc: Pip Cet via "Emacs development discussions.",
	Eli Zaretskii, Mattias Engdegård, Paul Eggert

"Andrea Corallo" <acorallo@gnu.org> writes:

> Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
> writes:
>
>> "Eli Zaretskii" <eliz@gnu.org> writes:
>>
>>>> Date: Wed, 11 Dec 2024 22:37:04 +0000
>>>> From:  Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
>>>>
>>>> What's missing here is a benchmark, but unless there's a really nasty
>>>> surprise when that happens, I'm quite confident that we can improve the
>>>> code here.
>>>
>>> The usual easy benchmark is to byte-compile all the *.el files in the
>>> source tree.  That is, remove all the *.elc files, then say "make" and
>>> time that.
>>
>> Considering the point of the optimization was to make compilation (when
>> symbols_with_pos_enabled is true) slower, but speed up non-compilation
>> use cases, I think that may be the opposite of what we want :-)
>
> Glad you finally agree on the goal of the optimization.

I find that statement to be offensive, because you know it to be
factually incorrect; but even if it weren't, gloating like that is
extremely poor form for a maintainer.

Pip




^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Improving EQ
  2024-12-12 10:50   ` Andrea Corallo
  2024-12-12 11:21     ` Óscar Fuentes
  2024-12-12 17:05     ` Pip Cet via Emacs development discussions.
@ 2024-12-12 18:10     ` John ff
  2 siblings, 0 replies; 17+ messages in thread
From: John ff @ 2024-12-12 18:10 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: Óscar Fuentes, emacs-devel, Pip Cet

[-- Attachment #1: Type: text/plain, Size: 2352 bytes --]




-------- Original Message --------
From: Andrea Corallo <acorallo@gnu.org>
Sent: Thu Dec 12 10:50:04 GMT 2024
To: "Óscar Fuentes" <ofv@wanadoo.es>
Cc: emacs-devel@gnu.org, Pip Cet <pipcet@protonmail.com>
Subject: Re: Improving EQ

Óscar Fuentes <ofv@wanadoo.es> writes:

> Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
> writes:
>
>> I looked at the "new" code generated for our EQ macro, and decided that
>> a fix was in order.  I'm therefore sending a first proposal to explain
>> what I think should be done, and some numbers.
>>
>> This patch:
>> * moves the "slow path" of EQ into a NO_INLINE function
>> * exits early if the arguments to EQ are actually BASE_EQ
>> * returns quickly (after a single memory access which cannot be avoided
>> until we fix our tagging scheme to distinguish exotic objects from
>> ordinary ones) when symbols_with_pos_enabled isn't true.
>>
>> The effect on the code size of the stripped emacs binary is small, but
>> significant: 8906336 bytes instead of 8955488 bytes on this machine.
>> (The effect on the code size of the emacs binary with debugging
>> information is much larger, reducing it from 32182000 bytes to 31125832
>> bytes on this system.)  There is no effect on the size of the .pdmp
>> file, which is expected.
>>
>> What's missing here is a benchmark, but unless there's a really nasty
>> surprise when that happens, I'm quite confident that we can improve the
>> code here.
>
> I've seen too many cases where *removing* instructions (mind you,
> literally removing, not changing!) made the code significantly slower.
>
> Modern CPUs are insanely complex and combined with compilers make
> intuition-based predictions even more futile.

That's why the patch needs to be benchmarked anyway.

> But reading your message makes me wonder if EQ and some other "simple"
> fundamental functions are not lowered by nativecomp? If not, maybe
> that's a significant opportunity for improvement.

Nativecomp only compiles eq for Lisp code, the one discussed here is the
eq used in C (and bytecode).

BTW ATM nativecomp generates code with the same layout of the eq we had
in C till my last change of few weeks ago.  When eq will be stable in C
I guess I'll replicate the layout for generated code for Lisp as well.

  Andrea


[-- Attachment #2: Type: text/html, Size: 3135 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Improving EQ
  2024-12-12 17:46       ` Pip Cet via Emacs development discussions.
@ 2024-12-12 19:09         ` Eli Zaretskii
  0 siblings, 0 replies; 17+ messages in thread
From: Eli Zaretskii @ 2024-12-12 19:09 UTC (permalink / raw)
  To: Pip Cet; +Cc: acorallo, emacs-devel, mattiase, eggert

> Date: Thu, 12 Dec 2024 17:46:55 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: "Pip Cet via \"Emacs development discussions.\"" <emacs-devel@gnu.org>, Eli Zaretskii <eliz@gnu.org>, Mattias Engdegård <mattiase@acm.org>, Paul Eggert <eggert@cs.ucla.edu>
> 
> "Andrea Corallo" <acorallo@gnu.org> writes:
> 
> >> Considering the point of the optimization was to make compilation (when
> >> symbols_with_pos_enabled is true) slower, but speed up non-compilation
> >> use cases, I think that may be the opposite of what we want :-)
> >
> > Glad you finally agree on the goal of the optimization.
> 
> I find that statement to be offensive, because you know it to be
> factually incorrect; but even if it weren't, gloating like that is
> extremely poor form for a maintainer.

I'm quite sure Andrea meant it as tongue-in-cheek, nowhere near
gloating.  Please keep in mind that for Andrea and others here,
English is not their first language, and their choice of words could
reflect that.



^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2024-12-12 19:09 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-12-11 22:37 Improving EQ Pip Cet via Emacs development discussions.
2024-12-12  6:36 ` Eli Zaretskii
2024-12-12  8:23   ` Andrea Corallo
2024-12-12  8:36   ` Pip Cet via Emacs development discussions.
2024-12-12  9:18     ` Eli Zaretskii
2024-12-12  9:35     ` Visuwesh
2024-12-12 10:40     ` Andrea Corallo
2024-12-12 17:46       ` Pip Cet via Emacs development discussions.
2024-12-12 19:09         ` Eli Zaretskii
2024-12-12 10:53     ` New "make benchmark" target Stefan Kangas
2024-12-12 10:59       ` Andrea Corallo
2024-12-12 16:53         ` Pip Cet via Emacs development discussions.
2024-12-12 10:42 ` Improving EQ Óscar Fuentes
2024-12-12 10:50   ` Andrea Corallo
2024-12-12 11:21     ` Óscar Fuentes
2024-12-12 17:05     ` Pip Cet via Emacs development discussions.
2024-12-12 18:10     ` John ff

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).