* Improving EQ
@ 2024-12-11 22:37 Pip Cet via Emacs development discussions.
2024-12-12 6:36 ` Eli Zaretskii
2024-12-12 10:42 ` Improving EQ Óscar Fuentes
0 siblings, 2 replies; 12+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-11 22:37 UTC
To: emacs-devel
I looked at the "new" code generated for our EQ macro, and decided that
a fix was in order. I'm therefore sending a first proposal to explain
what I think should be done, and some numbers.
This patch:
* moves the "slow path" of EQ into a NO_INLINE function
* exits early if the arguments to EQ are actually BASE_EQ
* returns quickly (after a single memory access which cannot be avoided
until we fix our tagging scheme to distinguish exotic objects from
ordinary ones) when symbols_with_pos_enabled isn't true.
The effect on the code size of the stripped emacs binary is small, but
significant: 8906336 bytes instead of 8955488 bytes on this machine.
(The effect on the code size of the emacs binary with debugging
information is much larger, reducing it from 32182000 bytes to 31125832
bytes on this system.) There is no effect on the size of the .pdmp
file, which is expected.
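For reference, the size figures above work out to a saving of 49152 bytes (about 0.55%) for the stripped binary and 1056168 bytes (about 3.28%) for the binary with debugging information. A minimal sketch of that arithmetic follows; the helper names size_saved and size_saved_bp are made up for this illustration and are not part of Emacs.

```c
/* Illustrative helpers only; these names do not exist in Emacs.  */

/* Bytes saved when a file shrinks from BEFORE to AFTER bytes.  */
long long
size_saved (long long before, long long after)
{
  return before - after;
}

/* The same saving in hundredths of a percent of BEFORE,
   truncated toward zero (so 54 means roughly 0.54%).  */
long long
size_saved_bp (long long before, long long after)
{
  return (before - after) * 10000 / before;
}
```

Calling size_saved (8955488, 8906336) gives the stripped-binary saving, and size_saved (32182000, 31125832) gives the saving with debugging information.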
What's missing here is a benchmark, but unless there's a really nasty
surprise when that happens, I'm quite confident that we can improve the
code here.
The proposed code doesn't use __builtin_expect anymore.
I've deliberately written slow_eq so it returns the same value as EQ,
even if the slow code path is disabled.
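To make the "same value" claim concrete, here is a minimal standalone model, not Emacs code. All toy_* names and the struct layout are invented for illustration; they only mimic the roles of BASE_EQ, SYMBOL_WITH_POS_P, XSYMBOL_WITH_POS_SYM and symbols_with_pos_enabled, and the real Lisp_Object representation is different. Under those assumptions, the sketch checks that the old single-expression form and the new early-exit form agree for every pair from a small object set, with the flag both off and on.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for Lisp_Object; NOT the real Emacs representation.  */
struct toy_object
{
  bool is_symbol_with_pos;       /* plays the role of SYMBOL_WITH_POS_P */
  const struct toy_object *sym;  /* plays the role of XSYMBOL_WITH_POS_SYM */
};
typedef const struct toy_object *toy_obj;

/* Plays the role of the global symbols_with_pos_enabled.  */
bool toy_symbols_with_pos_enabled;

bool
toy_base_eq (toy_obj x, toy_obj y)
{
  return x == y;
}

/* Strip the position wrapper, but only when the feature is enabled.  */
toy_obj
toy_strip (toy_obj x)
{
  return (toy_symbols_with_pos_enabled && x->is_symbol_with_pos) ? x->sym : x;
}

/* Shape of EQ before the patch: always goes through the stripping.  */
bool
toy_eq_old (toy_obj x, toy_obj y)
{
  return toy_base_eq (toy_strip (x), toy_strip (y));
}

/* Out-of-line slow path; returns the same value as toy_eq_old.  */
bool
toy_slow_eq (toy_obj x, toy_obj y)
{
  return toy_base_eq (toy_strip (x), toy_strip (y));
}

/* Shape of EQ after the patch: two early exits, then the slow path.  */
bool
toy_eq_new (toy_obj x, toy_obj y)
{
  if (toy_base_eq (x, y))
    return true;
  else if (!toy_symbols_with_pos_enabled)
    return false;
  else
    return toy_slow_eq (x, y);
}

/* Return true if both versions agree on every pair of a small object
   set, with the flag off and with the flag on.  */
bool
toy_eq_versions_agree (void)
{
  static const struct toy_object sym_a = { false, NULL };
  static const struct toy_object sym_b = { false, NULL };
  static const struct toy_object pos_a1 = { true, &sym_a };
  static const struct toy_object pos_a2 = { true, &sym_a };
  static const struct toy_object pos_b = { true, &sym_b };
  toy_obj objs[] = { &sym_a, &sym_b, &pos_a1, &pos_a2, &pos_b };
  size_t n = sizeof objs / sizeof objs[0];

  for (int flag = 0; flag < 2; flag++)
    {
      toy_symbols_with_pos_enabled = flag != 0;
      for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
          if (toy_eq_old (objs[i], objs[j]) != toy_eq_new (objs[i], objs[j]))
            return false;
    }
  return true;
}
```

The early BASE_EQ exit is safe in this model because identical arguments stay identical after stripping, and the flag check is safe because stripping is the identity when the flag is off.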
Pip
commit 2c807f7320bcb9654e0f148d64c92053b1a47b42 (HEAD -> faster-eq)
Author: Pip Cet <pipcet@protonmail.com>
Date: Wed Dec 11 22:31:07 2024 +0000
Change EQ to move slow code path into a separate function
* src/data.c (slow_eq): New function.
* src/lisp.h (EQ): Call it.
diff --git a/src/data.c b/src/data.c
index 66cf34c1e60..5ee383d2f48 100644
--- a/src/data.c
+++ b/src/data.c
@@ -162,6 +162,15 @@ circular_list (Lisp_Object list)
\f
/* Data type predicates. */
+/* NO_INLINE to avoid excessive code growth when LTO is in use.  */
+NO_INLINE bool slow_eq (Lisp_Object x, Lisp_Object y)
+{
+  return BASE_EQ ((symbols_with_pos_enabled && SYMBOL_WITH_POS_P (x) ?
+                   XSYMBOL_WITH_POS_SYM (x) : x),
+                  (symbols_with_pos_enabled && SYMBOL_WITH_POS_P (y) ?
+                   XSYMBOL_WITH_POS_SYM (y) : y));
+}
+
DEFUN ("eq", Feq, Seq, 2, 2, 0,
doc: /* Return t if the two args are the same Lisp object. */
attributes: const)
diff --git a/src/lisp.h b/src/lisp.h
index 832a1755c04..64d4835a499 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -618,6 +618,7 @@ #define ENUM_BF(TYPE) enum TYPE
extern Lisp_Object default_value (Lisp_Object symbol);
extern void defalias (Lisp_Object symbol, Lisp_Object definition);
extern char *fixnum_to_string (EMACS_INT number, char *buffer, char *end);
+extern bool slow_eq (Lisp_Object x, Lisp_Object y);
/* Defined in emacs.c. */
@@ -1353,10 +1354,12 @@ make_fixed_natnum (EMACS_INT n)
INLINE bool
EQ (Lisp_Object x, Lisp_Object y)
{
-  return BASE_EQ ((__builtin_expect (symbols_with_pos_enabled, false)
-                   && SYMBOL_WITH_POS_P (x) ? XSYMBOL_WITH_POS_SYM (x) : x),
-                  (__builtin_expect (symbols_with_pos_enabled, false)
-                   && SYMBOL_WITH_POS_P (y) ? XSYMBOL_WITH_POS_SYM (y) : y));
+  if (BASE_EQ (x, y))
+    return true;
+  else if (!symbols_with_pos_enabled)
+    return false;
+  else
+    return slow_eq (x, y);
}
INLINE intmax_t
* Re: Improving EQ
2024-12-11 22:37 Improving EQ Pip Cet via Emacs development discussions.
@ 2024-12-12 6:36 ` Eli Zaretskii
2024-12-12 8:23 ` Andrea Corallo
2024-12-12 8:36 ` Pip Cet via Emacs development discussions.
2024-12-12 10:42 ` Improving EQ Óscar Fuentes
1 sibling, 2 replies; 12+ messages in thread
From: Eli Zaretskii @ 2024-12-12 6:36 UTC
To: Pip Cet, Mattias Engdegård, Paul Eggert; +Cc: emacs-devel
> Date: Wed, 11 Dec 2024 22:37:04 +0000
> From: Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
>
> What's missing here is a benchmark, but unless there's a really nasty
> surprise when that happens, I'm quite confident that we can improve the
> code here.
The usual easy benchmark is to byte-compile all the *.el files in the
source tree. That is, remove all the *.elc files, then say "make" and
time that.
There was also some Emacs benchmark suite that someone posted, but I
cannot find it now, maybe someone else will.
Adding Mattias and Paul to this discussion.
* Re: Improving EQ
2024-12-12 6:36 ` Eli Zaretskii
@ 2024-12-12 8:23 ` Andrea Corallo
2024-12-12 8:36 ` Pip Cet via Emacs development discussions.
1 sibling, 0 replies; 12+ messages in thread
From: Andrea Corallo @ 2024-12-12 8:23 UTC
To: Eli Zaretskii; +Cc: Pip Cet, Mattias Engdegård, Paul Eggert, emacs-devel
Eli Zaretskii <eliz@gnu.org> writes:
>> Date: Wed, 11 Dec 2024 22:37:04 +0000
>> From: Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
>>
>> What's missing here is a benchmark, but unless there's a really nasty
>> surprise when that happens, I'm quite confident that we can improve the
>> code here.
>
> The usual easy benchmark is to byte-compile all the *.el files in the
> source tree. That is, remove all the *.elc files, then say "make" and
> time that.
Agreed, considering that this tests the nonzero 'symbols_with_pos_enabled'
case.
> There was also some Emacs benchmark suite that someone posted, but I
> cannot find it now, maybe someone else will.
<https://elpa.gnu.org/packages/elisp-benchmarks.html>
* Re: Improving EQ
2024-12-12 6:36 ` Eli Zaretskii
2024-12-12 8:23 ` Andrea Corallo
@ 2024-12-12 8:36 ` Pip Cet via Emacs development discussions.
2024-12-12 9:18 ` Eli Zaretskii
` (3 more replies)
1 sibling, 4 replies; 12+ messages in thread
From: Pip Cet via Emacs development discussions. @ 2024-12-12 8:36 UTC
To: Eli Zaretskii; +Cc: Mattias Engdegård, Paul Eggert, emacs-devel
"Eli Zaretskii" <eliz@gnu.org> writes:
>> Date: Wed, 11 Dec 2024 22:37:04 +0000
>> From: Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
>>
>> What's missing here is a benchmark, but unless there's a really nasty
>> surprise when that happens, I'm quite confident that we can improve the
>> code here.
>
> The usual easy benchmark is to byte-compile all the *.el files in the
> source tree. That is, remove all the *.elc files, then say "make" and
> time that.
Considering the point of the optimization was to make compilation (when
symbols_with_pos_enabled is true) slower, but speed up non-compilation
use cases, I think that may be the opposite of what we want :-)
Furthermore, the master branch doesn't currently build after deleting
all the *.elc files, because recompilation exceeds max-lisp-eval-depth
in that scenario (together with the known purespace issue, this pretty
much means "make bootstrap" is the only way I can rebuild an emacs tree
right now. It'd be great if Someone could look into this, but I've
failed to understand the native-compilation code (and been told off for
trying to) too often for that Someone to be me. Plus, of course, I fully
understand that native compilation currently has wrong code generation
bugs which obviously have to take priority over build issues...)
> There was also some Emacs benchmark suite that someone posted, but I
> cannot find it now, maybe someone else will.
https://elpa.gnu.org/packages/elisp-benchmarks.html ? It'd be great if
we could agree on a benchmark, and even better if there were a way to
reliably run it from emacs -Q :-)
In fact, I would suggest to move a reduced benchmark suite to the emacs
repo itself, and run it using "make benchmark".
Also, just to let everyone know, I'm planning to make the "exotic"
property (this object must or can use the slow_eq path) part (probably
the LSB) of the tag rather than accessing it via a global variable and
the PVEC type. This should reduce code size further, should speed up
things, and has some other advantages which I'll go into when I have
working code.
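As a purely hypothetical illustration of what such a tag bit could buy, here is a standalone sketch. Everything in it is an assumption: the tag_* names, the choice of bit 0, and the rule that an "exotic" word stands for the ordinary word with that bit cleared are all invented here and do not describe Emacs's current or planned tagging scheme. In particular, this toy always unwraps exotic words, and how a real scheme would honor symbols_with_pos_enabled is not specified above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tagged word: bit 0 set means "exotic", i.e. the word
   may need the slow comparison path.  Invented for illustration.  */
typedef uintptr_t tag_word;
#define TAG_EXOTIC_BIT ((tag_word) 1)

bool
tag_exotic_p (tag_word w)
{
  return (w & TAG_EXOTIC_BIT) != 0;
}

/* Toy unwrapping rule: an exotic word stands for the ordinary word
   obtained by clearing the exotic bit.  */
tag_word
tag_unwrap (tag_word w)
{
  return w & ~TAG_EXOTIC_BIT;
}

/* Slow path: compare after unwrapping both operands.  */
bool
tag_slow_eq (tag_word x, tag_word y)
{
  return tag_unwrap (x) == tag_unwrap (y);
}

/* Fast path decided from the two words alone: no global variable and
   no load from the objects is needed to rule out the slow path.  */
bool
tag_eq (tag_word x, tag_word y)
{
  if (x == y)
    return true;
  if (((x | y) & TAG_EXOTIC_BIT) == 0)
    return false;
  return tag_slow_eq (x, y);
}
```

The point of the sketch is only that the "can this possibly need the slow path?" test becomes a register-only bit test on the two arguments.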
Pip
* Re: Improving EQ
2024-12-12 8:36 ` Pip Cet via Emacs development discussions.
@ 2024-12-12 9:18 ` Eli Zaretskii
2024-12-12 9:35 ` Visuwesh
` (2 subsequent siblings)
3 siblings, 0 replies; 12+ messages in thread
From: Eli Zaretskii @ 2024-12-12 9:18 UTC
To: Pip Cet; +Cc: mattiase, eggert, emacs-devel
> Date: Thu, 12 Dec 2024 08:36:50 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: Mattias Engdegård <mattiase@acm.org>, Paul Eggert <eggert@cs.ucla.edu>, emacs-devel@gnu.org
>
> "Eli Zaretskii" <eliz@gnu.org> writes:
>
> > The usual easy benchmark is to byte-compile all the *.el files in the
> > source tree. That is, remove all the *.elc files, then say "make" and
> > time that.
>
> Considering the point of the optimization was to make compilation (when
> symbols_with_pos_enabled is true) slower, but speed up non-compilation
> use cases, I think that may be the opposite of what we want :-)
That's fine, because knowing where this slows us down and by how much
is also important.
> Furthermore, the master branch doesn't currently build after deleting
> all the *.elc files, because recompilation exceeds max-lisp-eval-depth
> in that scenario (together with the known purespace issue, this pretty
> much means "make bootstrap" is the only way I can rebuild an emacs tree
> right now. It'd be great if Someone could look into this, but I've
> failed to understand the native-compilation code (and been told off for
> trying to) too often for that Someone to be me. Plus, of course, I fully
> understand that native compilation currently has wrong code generation
> bugs which obviously have to take priority over build issues...)
If this is with native-compilation, how about trying without?
Also, enlarging max-lisp-eval-depth (assuming you don't somehow hit
infinite recursion) locally should be easy: just add that to the
relevant Makefiles.
> https://elpa.gnu.org/packages/elisp-benchmarks.html ? It'd be great if
> we could agree on a benchmark, and even better if there were a way to
> reliably run it from emacs -Q :-)
Our benchmark facilities are very rudimentary, so agreement is not an
issue: we just use whatever is available.
> In fact, I would suggest to move a reduced benchmark suite to the emacs
> repo itself, and run it using "make benchmark".
Working on a better benchmark is very useful, but maybe we should try
solving one problem at a time?
> Also, just to let everyone know, I'm planning to make the "exotic"
> property (this object must or can use the slow_eq path) part (probably
> the LSB) of the tag rather than accessing it via a global variable and
> the PVEC type. This should reduce code size further, should speed up
> things, and has some other advantages which I'll go into when I have
> working code.
Whenever you change something in the tags, please remember to update
.gdbinit, otherwise we lose debugging support.
* Re: Improving EQ
2024-12-12 8:36 ` Pip Cet via Emacs development discussions.
2024-12-12 9:18 ` Eli Zaretskii
@ 2024-12-12 9:35 ` Visuwesh
2024-12-12 10:40 ` Andrea Corallo
2024-12-12 10:53 ` New "make benchmark" target Stefan Kangas
3 siblings, 0 replies; 12+ messages in thread
From: Visuwesh @ 2024-12-12 9:35 UTC
To: Pip Cet via Emacs development discussions.
Cc: Eli Zaretskii, Pip Cet, Mattias Engdegård, Paul Eggert
[Thursday, December 12, 2024] Pip Cet via "Emacs development discussions." wrote:
>> There was also some Emacs benchmark suite that someone posted, but I
>> cannot find it now, maybe someone else will.
>
> https://elpa.gnu.org/packages/elisp-benchmarks.html ? It'd be great if
> we could agree on a benchmark, and even better if there were a way to
> reliably run it from emacs -Q :-)
Will the command package-isolate help in this scenario?
* Re: Improving EQ
2024-12-12 8:36 ` Pip Cet via Emacs development discussions.
2024-12-12 9:18 ` Eli Zaretskii
2024-12-12 9:35 ` Visuwesh
@ 2024-12-12 10:40 ` Andrea Corallo
2024-12-12 10:53 ` New "make benchmark" target Stefan Kangas
3 siblings, 0 replies; 12+ messages in thread
From: Andrea Corallo @ 2024-12-12 10:40 UTC
To: Pip Cet via Emacs development discussions.
Cc: Eli Zaretskii, Pip Cet, Mattias Engdegård, Paul Eggert
Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
writes:
> "Eli Zaretskii" <eliz@gnu.org> writes:
>
>>> Date: Wed, 11 Dec 2024 22:37:04 +0000
>>> From: Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
>>>
>>> What's missing here is a benchmark, but unless there's a really nasty
>>> surprise when that happens, I'm quite confident that we can improve the
>>> code here.
>>
>> The usual easy benchmark is to byte-compile all the *.el files in the
>> source tree. That is, remove all the *.elc files, then say "make" and
>> time that.
>
> Considering the point of the optimization was to make compilation (when
> symbols_with_pos_enabled is true) slower, but speed up non-compilation
> use cases, I think that may be the opposite of what we want :-)
Glad you finally agree on the goal of the optimization.
> Furthermore, the master branch doesn't currently build after deleting
> all the *.elc files, because recompilation exceeds max-lisp-eval-depth
> in that scenario (together with the known purespace issue, this pretty
> much means "make bootstrap" is the only way I can rebuild an emacs tree
> right now. It'd be great if Someone could look into this, but I've
> failed to understand the native-compilation code (and been told off for
> trying to) too often for that Someone to be me. Plus, of course, I fully
> understand that native compilation currently has wrong code generation
> bugs which obviously have to take priority over build issues...)
>
>> There was also some Emacs benchmark suite that someone posted, but I
>> cannot find it now, maybe someone else will.
>
> https://elpa.gnu.org/packages/elisp-benchmarks.html ? It'd be great if
> we could agree on a benchmark, and even better if there were a way to
> reliably run it from emacs -Q :-)
What is unreliable about the elisp-benchmarks invocation suggested in
its instructions?
> In fact, I would suggest to move a reduced benchmark suite to the emacs
> repo itself, and run it using "make benchmark".
That would be nice.
* New "make benchmark" target
2024-12-12 8:36 ` Pip Cet via Emacs development discussions.
` (2 preceding siblings ...)
2024-12-12 10:40 ` Andrea Corallo
@ 2024-12-12 10:53 ` Stefan Kangas
2024-12-12 10:59 ` Andrea Corallo
3 siblings, 1 reply; 12+ messages in thread
From: Stefan Kangas @ 2024-12-12 10:53 UTC
To: Pip Cet, Eli Zaretskii
Cc: Mattias Engdegård, Paul Eggert, emacs-devel, Andrea Corallo
Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
writes:
> https://elpa.gnu.org/packages/elisp-benchmarks.html ? It'd be great if
> we could agree on a benchmark, and even better if there were a way to
> reliably run it from emacs -Q :-)
>
> In fact, I would suggest to move a reduced benchmark suite to the emacs
> repo itself, and run it using "make benchmark".
SGTM, but why a reduced suite and not just the whole thing?
* Re: New "make benchmark" target
2024-12-12 10:53 ` New "make benchmark" target Stefan Kangas
@ 2024-12-12 10:59 ` Andrea Corallo
0 siblings, 0 replies; 12+ messages in thread
From: Andrea Corallo @ 2024-12-12 10:59 UTC
To: Stefan Kangas
Cc: Pip Cet, Eli Zaretskii, Mattias Engdegård, Paul Eggert,
emacs-devel
Stefan Kangas <stefankangas@gmail.com> writes:
> Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
> writes:
>
>> https://elpa.gnu.org/packages/elisp-benchmarks.html ? It'd be great if
>> we could agree on a benchmark, and even better if there were a way to
>> reliably run it from emacs -Q :-)
>>
>> In fact, I would suggest to move a reduced benchmark suite to the emacs
>> repo itself, and run it using "make benchmark".
>
> SGTM, but why a reduced suite and not just the whole thing?
My fear is that if we start going down the rabbit hole of which
benchmarks from elisp-benchmarks should or should not be included, we
will never agree and, as a consequence, never succeed. So I guess I'd
also favor including all of elisp-benchmarks.
Andrea
* Re: Improving EQ
2024-12-11 22:37 Improving EQ Pip Cet via Emacs development discussions.
2024-12-12 6:36 ` Eli Zaretskii
@ 2024-12-12 10:42 ` Óscar Fuentes
2024-12-12 10:50 ` Andrea Corallo
1 sibling, 1 reply; 12+ messages in thread
From: Óscar Fuentes @ 2024-12-12 10:42 UTC
To: emacs-devel; +Cc: Pip Cet
Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
writes:
> I looked at the "new" code generated for our EQ macro, and decided that
> a fix was in order. I'm therefore sending a first proposal to explain
> what I think should be done, and some numbers.
>
> This patch:
> * moves the "slow path" of EQ into a NO_INLINE function
> * exits early if the arguments to EQ are actually BASE_EQ
> * returns quickly (after a single memory access which cannot be avoided
> until we fix our tagging scheme to distinguish exotic objects from
> ordinary ones) when symbols_with_pos_enabled isn't true.
>
> The effect on the code size of the stripped emacs binary is small, but
> significant: 8906336 bytes instead of 8955488 bytes on this machine.
> (The effect on the code size of the emacs binary with debugging
> information is much larger, reducing it from 32182000 bytes to 31125832
> bytes on this system.) There is no effect on the size of the .pdmp
> file, which is expected.
>
> What's missing here is a benchmark, but unless there's a really nasty
> surprise when that happens, I'm quite confident that we can improve the
> code here.
I've seen too many cases where *removing* instructions (mind you,
literally removing, not changing!) made the code significantly slower.
Modern CPUs are insanely complex and combined with compilers make
intuition-based predictions even more futile.
But reading your message makes me wonder if EQ and some other "simple"
fundamental functions are not lowered by nativecomp? If not, maybe
that's a significant opportunity for improvement.
As for your patch, one thing that would be easy to do and might save
quite a lot of head scratching is to count the fraction of the calls to
EQ that benefit from the fast path on a "representative" Emacs run. Then
you would have hard data to decide if fighting the compiler/CPU on that
case is a worthy cause.
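A rough sketch of the kind of counting meant here, as a standalone model rather than a patch: the counters, the counted_eq wrapper and the model_symbols_with_pos_enabled flag are all hypothetical names, and the comparison works on plain pointers instead of Lisp_Object. In real Emacs the counters would have to live inside EQ itself and be reported at exit or from a debugging command.

```c
#include <stdbool.h>

/* Hypothetical instrumentation counters; nothing like this exists in
   Emacs under these names.  */
struct eq_stats
{
  unsigned long long total;          /* all calls */
  unsigned long long base_eq_hits;   /* exited because x == y */
  unsigned long long flag_off_exits; /* exited because the flag was off */
  unsigned long long slow_calls;     /* fell through to the slow path */
};

struct eq_stats eq_counters;

/* Stand-in for symbols_with_pos_enabled.  */
bool model_symbols_with_pos_enabled;

/* Same control flow as the patched EQ, with one counter per exit.
   The slow path is a stub: this toy has no symbols with position, so
   two distinct pointers are never equal.  */
bool
counted_eq (const void *x, const void *y)
{
  eq_counters.total++;
  if (x == y)
    {
      eq_counters.base_eq_hits++;
      return true;
    }
  if (!model_symbols_with_pos_enabled)
    {
      eq_counters.flag_off_exits++;
      return false;
    }
  eq_counters.slow_calls++;
  return false;
}

/* Fraction of calls that never reached the slow path.  */
double
eq_fast_path_fraction (void)
{
  if (eq_counters.total == 0)
    return 0.0;
  return (double) (eq_counters.base_eq_hits + eq_counters.flag_off_exits)
         / (double) eq_counters.total;
}
```

Keeping a separate counter per exit also shows how much of the benefit comes from the BASE_EQ check and how much from the flag check.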
* Re: Improving EQ
2024-12-12 10:42 ` Improving EQ Óscar Fuentes
@ 2024-12-12 10:50 ` Andrea Corallo
2024-12-12 11:21 ` Óscar Fuentes
0 siblings, 1 reply; 12+ messages in thread
From: Andrea Corallo @ 2024-12-12 10:50 UTC
To: Óscar Fuentes; +Cc: emacs-devel, Pip Cet
Óscar Fuentes <ofv@wanadoo.es> writes:
> Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
> writes:
>
>> I looked at the "new" code generated for our EQ macro, and decided that
>> a fix was in order. I'm therefore sending a first proposal to explain
>> what I think should be done, and some numbers.
>>
>> This patch:
>> * moves the "slow path" of EQ into a NO_INLINE function
>> * exits early if the arguments to EQ are actually BASE_EQ
>> * returns quickly (after a single memory access which cannot be avoided
>> until we fix our tagging scheme to distinguish exotic objects from
>> ordinary ones) when symbols_with_pos_enabled isn't true.
>>
>> The effect on the code size of the stripped emacs binary is small, but
>> significant: 8906336 bytes instead of 8955488 bytes on this machine.
>> (The effect on the code size of the emacs binary with debugging
>> information is much larger, reducing it from 32182000 bytes to 31125832
>> bytes on this system.) There is no effect on the size of the .pdmp
>> file, which is expected.
>>
>> What's missing here is a benchmark, but unless there's a really nasty
>> surprise when that happens, I'm quite confident that we can improve the
>> code here.
>
> I've seen too many cases where *removing* instructions (mind you,
> literally removing, not changing!) made the code significantly slower.
>
> Modern CPUs are insanely complex and combined with compilers make
> intuition-based predictions even more futile.
That's why the patch needs to be benchmarked anyway.
> But reading your message makes me wonder if EQ and some other "simple"
> fundamental functions are not lowered by nativecomp? If not, maybe
> that's a significant opportunity for improvement.
Nativecomp only compiles eq for Lisp code, the one discussed here is the
eq used in C (and bytecode).
BTW ATM nativecomp generates code with the same layout as the eq we had
in C until my last change a few weeks ago. Once eq is stable in C,
I guess I'll replicate the layout in the code generated for Lisp as well.
Andrea
* Re: Improving EQ
2024-12-12 10:50 ` Andrea Corallo
@ 2024-12-12 11:21 ` Óscar Fuentes
0 siblings, 0 replies; 12+ messages in thread
From: Óscar Fuentes @ 2024-12-12 11:21 UTC
To: emacs-devel
Andrea Corallo <acorallo@gnu.org> writes:
>> But reading your message makes me wonder if EQ and some other "simple"
>> fundamental functions are not lowered by nativecomp? If not, maybe
>> that's a significant opportunity for improvement.
>
> Nativecomp only compiles eq for Lisp code, the one discussed here is the
> eq used in C (and bytecode).
Ok, thanks.
Of course this change also affects Emacs running with nativecomp, as
many calls to EQ are made by C functions not lowered by nativecomp.
My guess is that nativecomp's performance would benefit quite a bit from
the general approach of this patch, as every point where nativecomp
calls C is a pessimization spot, but that's another topic.
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs.git