Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)

unofficial mirror of emacs-devel@gnu.org 

* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
@ 2023-08-14  6:28 Gerd Möllmann
  2023-08-14  6:56 ` Gerd Möllmann
  0 siblings, 1 reply; 41+ messages in thread
From: Gerd Möllmann @ 2023-08-14  6:28 UTC (permalink / raw)
  To: incal; +Cc: emacs-devel

>> It is, but it is also tangent to comparison between Elisp
>> and CL. The main (AFAIU) difference between Elisp and CL is
>> in how the bignums are stored. Elisp uses its own internal
>> object type while CL uses GMP's native format.

Just for the record, SBCL/CMUCL don't use GMP.




* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
  2023-08-14  6:28 [PATCH] Re: Bignum performance (was: Shrinking the C core) Gerd Möllmann
@ 2023-08-14  6:56 ` Gerd Möllmann
  2023-08-14  7:04   ` Ihor Radchenko
  0 siblings, 1 reply; 41+ messages in thread
From: Gerd Möllmann @ 2023-08-14  6:56 UTC (permalink / raw)
  To: incal; +Cc: emacs-devel

> Just for the record, SBCL/CMUCL don't use GMP.

Hm, thinking of this - did someone measure how much time is spent in 
malloc/realloc/free in the benchmarks?  That is what GMP uses, and SBCL 
doesn't.




* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
  2023-08-14  6:56 ` Gerd Möllmann
@ 2023-08-14  7:04   ` Ihor Radchenko
  2023-08-14  7:35     ` Gerd Möllmann
  0 siblings, 1 reply; 41+ messages in thread
From: Ihor Radchenko @ 2023-08-14  7:04 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: incal, emacs-devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

>> Just for the record, SBCL/CMUCL don't use GMP.
>
> Hm, thinking of this - did someone measure how much time is spent in 
> malloc/realloc/free in the benchmarks?  That is what GMP uses, and SBCL 
> doesn't.

https://yhetil.org/emacs-devel/87bkfdsmde.fsf@localhost/

We can further get rid of the GC by temporarily disabling it (just for
demonstration):

(let ((beg (float-time)))
  (setq gc-cons-threshold most-positive-fixnum)
  (fib 10000 1000)
  (message "%.3f s" (- (float-time) beg)) )

perf record  ~/Git/emacs/src/emacs -Q -batch -l /tmp/fib.eln
0.739 s

    17.11%  emacs    libgmp.so.10.5.0      [.] __gmpz_sizeinbase
     7.35%  emacs    libgmp.so.10.5.0      [.] __gmpz_add
     6.51%  emacs    emacs                 [.] arith_driver
     6.03%  emacs    libc.so.6             [.] malloc
     5.57%  emacs    emacs                 [.] allocate_vectorlike
     5.20%  emacs    [unknown]             [k] 0xffffffffaae01857
     4.16%  emacs    libgmp.so.10.5.0      [.] __gmpn_add_n_coreisbr
     3.72%  emacs    emacs                 [.] check_number_coerce_marker
     3.35%  emacs    fib.eln               [.] F666962_fib_0
     3.29%  emacs    emacs                 [.] allocate_pseudovector
     2.30%  emacs    emacs                 [.] Flss
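
The snippet above sets gc-cons-threshold with setq, so the raised
threshold outlives the measurement.  A minimal sketch of the same
measurement with the threshold let-bound and the timing taken by
`benchmark-run' from the standard benchmark library.  `my-fib-bench' is
a hypothetical stand-in for the `fib' in /tmp/fib.eln (whose source is
not shown here), and `my-time-without-gc' is likewise an invented name.

```elisp
(require 'benchmark)

(defun my-fib-bench (reps num)
  "Hypothetical stand-in for the benchmarked `fib': REPS runs of fib(NUM)."
  (let ((z 0))
    (dotimes (_ reps)
      (let ((p1 1) (p2 1))
        (dotimes (_ (- num 2))
          (setq z (+ p1 p2)
                p2 p1
                p1 z))))
    z))

(defun my-time-without-gc (reps num)
  "Time `my-fib-bench' with the GC threshold raised only for this call.
Returns the list produced by `benchmark-run': elapsed seconds, number
of garbage collections, and seconds spent in garbage collection."
  (let ((gc-cons-threshold most-positive-fixnum))
    (benchmark-run 1 (my-fib-bench reps num))))
```

Calling (my-time-without-gc 10000 1000) would mirror the workload used
in this message; the second element of the result shows whether any
collection still ran.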
-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
  2023-08-14  7:04   ` Ihor Radchenko
@ 2023-08-14  7:35     ` Gerd Möllmann
  2023-08-14  8:09       ` Ihor Radchenko
  0 siblings, 1 reply; 41+ messages in thread
From: Gerd Möllmann @ 2023-08-14  7:35 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: incal, emacs-devel

On 14.08.23 09:04, Ihor Radchenko wrote:
> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
> 
>>> Just for the record, SBCL/CMUCL don't use GMP.
>>
>> Hm, thinking of this - did someone measure how much time is spent in
>> malloc/realloc/free in the benchmarks?  That is what GMP uses, and SBCL
>> doesn't.
> 
> https://yhetil.org/emacs-devel/87bkfdsmde.fsf@localhost/
> 
> We can further get rid of the GC by temporarily disabling it (just for
> demonstration):
> 
> (let ((beg (float-time)))
>    (setq gc-cons-threshold most-positive-fixnum)
>    (fib 10000 1000)
>    (message "%.3f s" (- (float-time) beg)) )
> 
> perf record  ~/Git/emacs/src/emacs -Q -batch -l /tmp/fib.eln
> 0.739 s
> 
>      17.11%  emacs    libgmp.so.10.5.0      [.] __gmpz_sizeinbase
>       7.35%  emacs    libgmp.so.10.5.0      [.] __gmpz_add
>       6.51%  emacs    emacs                 [.] arith_driver
>       6.03%  emacs    libc.so.6             [.] malloc
>       5.57%  emacs    emacs                 [.] allocate_vectorlike
>       5.20%  emacs    [unknown]             [k] 0xffffffffaae01857
>       4.16%  emacs    libgmp.so.10.5.0      [.] __gmpn_add_n_coreisbr
>       3.72%  emacs    emacs                 [.] check_number_coerce_marker
>       3.35%  emacs    fib.eln               [.] F666962_fib_0
>       3.29%  emacs    emacs                 [.] allocate_pseudovector
>       2.30%  emacs    emacs                 [.] Flss

Hm, then maybe we can look at the disassembly of the benchmark in SBCL?
  Could it be that, in the end, the compiler is so smart that it optimizes
a lot of stuff away simply because it can prove that the result of the
computations cannot possibly be observed?




* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
  2023-08-14  7:35     ` Gerd Möllmann
@ 2023-08-14  8:09       ` Ihor Radchenko
  2023-08-14  9:28         ` Gerd Möllmann
  0 siblings, 1 reply; 41+ messages in thread
From: Ihor Radchenko @ 2023-08-14  8:09 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: incal, emacs-devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Hm, then maybe we can look at the disassembly of the benchmark in SBCL?
>   Could it be that, in the end, the compiler is so smart that it optimizes
> a lot of stuff away simply because it can prove that the result of the
> computations cannot possibly be observed?

Sorry, but I do not know how to do it. Not familiar with CL.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
  2023-08-14  8:09       ` Ihor Radchenko
@ 2023-08-14  9:28         ` Gerd Möllmann
  2023-08-14  9:42           ` Ihor Radchenko
  2023-08-14 16:51           ` Emanuel Berg
  0 siblings, 2 replies; 41+ messages in thread
From: Gerd Möllmann @ 2023-08-14  9:28 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: incal, emacs-devel

On 14.08.23 10:09, Ihor Radchenko wrote:
> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
> 
>> Hm, then maybe we can look at the disassembly of the benchmark in SBCL?
>>    Could it be that, in the end, the compiler is so smart that it optimizes
>> a lot of stuff away simply because it can prove that the result of the
>> computations cannot possibly be observed?
> 
> Sorry, but I do not know how to do it. Not familiar with CL.
> 
Ok.  I used the code from https://dataswamp.org/~incal/cl/fib.cl/fib.cl.

And the first thing that stares at me in this function:

(defun fib (reps num)
   (declare (optimize speed (safety 0) (debug 0)))
   (let ((z 0))
     (declare (type (unsigned-byte 53) reps num z))
     (dotimes (r reps)
       (let*((p1 1)
             (p2 1))
         (dotimes (i (- num 2))
           (setf z (+ p1 p2)
                 p2 p1
                 p1 z))))
     z))

is the declaration (unsigned-byte 53).

The declaration means we are lying to the compiler because Z gets bigger 
than 53 bits eventually.  And all bets are off because of the OPTIMIZE 
declaration.  The result is that everything is done in fixnums on 64-bit 
machines.

; disassembly for FIB
; Size: 92 bytes. Origin: #x700530086C                        ; FIB
; 6C:       030080D2         MOVZ NL3, #0
; 70:       040080D2         MOVZ NL4, #0
; 74:       0E000014         B L3
; 78: L0:   410080D2         MOVZ NL1, #2
; 7C:       E20301AA         MOV NL2, NL1
; 80:       EB030CAA         MOV R1, R2
; 84:       651100D1         SUB NL5, R1, #4
; 88:       000080D2         MOVZ NL0, #0
; 8C:       05000014         B L2
; 90: L1:   2300028B         ADD NL3, NL1, NL2
; 94:       E20301AA         MOV NL2, NL1
; 98:       E10303AA         MOV NL1, NL3
; 9C:       00080091         ADD NL0, NL0, #2
; A0: L2:   1F0005EB         CMP NL0, NL5
; A4:       6BFFFF54         BLT L1
; A8:       84080091         ADD NL4, NL4, #2
; AC: L3:   9F000AEB         CMP NL4, R0
; B0:       4BFEFF54         BLT L0
; B4:       EA0303AA         MOV R0, NL3
; B8:       FB031AAA         MOV CSP, CFP
; BC:       5A7B40A9         LDP CFP, LR, [CFP]
; C0:       BF0300F1         CMP NULL, #0
; C4:       C0035FD6         RET

Tada!
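
To see how early the (unsigned-byte 53) promise is broken, here is a
minimal Emacs Lisp sketch that finds the index of the first Fibonacci
number needing more than a given number of bits.  The helper name
`my-fib-first-overflow-index' is hypothetical, and widths beyond the
fixnum range assume Emacs 27 or later for bignum support.

```elisp
(defun my-fib-first-overflow-index (bits)
  "Return the smallest N such that fib(N) does not fit in BITS unsigned bits.
Uses the same convention as the benchmark: fib(1) = fib(2) = 1."
  (let ((limit (expt 2 bits))
        (p1 1)                          ; invariant: p1 = fib(n)
        (p2 1)                          ; invariant: p2 = fib(n - 1)
        (n 2))
    (while (< p1 limit)
      (let ((z (+ p1 p2)))
        (setq p2 p1
              p1 z
              n (1+ n))))
    n))

(my-fib-first-overflow-index 53)        ; => 79
```

So with num = 1000, as in the benchmark call, the declared type stops
being true after only a small fraction of the inner loop's iterations.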





* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
  2023-08-14  9:28         ` Gerd Möllmann
@ 2023-08-14  9:42           ` Ihor Radchenko
  2023-08-15 14:03             ` Emanuel Berg
  2023-08-14 16:51           ` Emanuel Berg
  1 sibling, 1 reply; 41+ messages in thread
From: Ihor Radchenko @ 2023-08-14  9:42 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: incal, emacs-devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Ok.  I used the code from https://dataswamp.org/~incal/cl/fib.cl/fib.cl.
> ...
> The declaration means we are lying to the compiler because Z gets bigger 
> than 53 bits eventually.  And all bets are off because of the OPTIMIZE 
> declaration.  The result is that everything is done in fixnums on 64-bit 
> machines.

That explains a lot :)

I now tried

(defun fib (reps num)
  (declare (optimize speed (safety 0) (debug 0)))
  (let ((z 0))
    ;; (declare (type (unsigned-byte 53) reps num z))
    (dotimes (r reps)
      (let*((p1 1)
            (p2 1))
        (dotimes (i (- num 2))
          (setf z (+ p1 p2)
                p2 p1
                p1 z))))
    z))

and got

$ SBCL_HOME=/usr/lib64/sbcl perf record sbcl --load /tmp/fib.cl

;;;  0.263333 s real time
;;;  0.263641 s run time

$ ~/Git/emacs/src/emacs -Q -batch -l /tmp/fib.eln
0.739 s

Still ~3x faster compared to Elisp, but not orders of magnitude.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
  2023-08-14  9:28         ` Gerd Möllmann
  2023-08-14  9:42           ` Ihor Radchenko
@ 2023-08-14 16:51           ` Emanuel Berg
  2023-08-15  4:58             ` Gerd Möllmann
  2023-08-15  6:26             ` [PATCH] Re: Bignum performance Po Lu
  1 sibling, 2 replies; 41+ messages in thread
From: Emanuel Berg @ 2023-08-14 16:51 UTC (permalink / raw)
  To: emacs-devel

Gerd Möllmann wrote:

>> Sorry, but I do not know how to do it. Not familiar
>> with CL.
>> 
> Ok. I used the code from
> https://dataswamp.org/~incal/cl/fib.cl/fib.cl

Yikes, how did that happen, some slip involving symbolic
links ...

Here it is: 
  https://dataswamp.org/~incal/cl/bench/fib.cl

And timing is done with this:
  https://dataswamp.org/~incal/cl/bench/timing.cl

Note the ugly absolute path in fib.cl BTW, otherwise you get
the path not of the file but of SBCL or Slime, maybe. I think
one is supposed to use ASDF, but surely there must be some easy
way to just load a file using a relative path to the
current file?

(load "~/public_html/cl/bench/timing.cl")

> is the declaration (unsigned-byte 53).
>
> The declaration means we are lying to the compiler because
> Z gets bigger than 53 bits eventually. And all bets are off
> because of the OPTIMIZE declaration. The result is that
> everything is done in fixnums on 64-bit machines.

A very impressive optimization indeed, and expressed in
a cryptic way.

> ; disassembly for FIB
> ; Size: 92 bytes. Origin: #x700530086C                        ; FIB
> ; 6C:       030080D2         MOVZ NL3, #0
> ; 70:       040080D2         MOVZ NL4, #0
> ; 74:       0E000014         B L3
> ; 78: L0:   410080D2         MOVZ NL1, #2
> ; 7C:       E20301AA         MOV NL2, NL1
> ; 80:       EB030CAA         MOV R1, R2
> ; 84:       651100D1         SUB NL5, R1, #4
> ; 88:       000080D2         MOVZ NL0, #0
> ; 8C:       05000014         B L2
> ; 90: L1:   2300028B         ADD NL3, NL1, NL2
> ; 94:       E20301AA         MOV NL2, NL1
> ; 98:       E10303AA         MOV NL1, NL3
> ; 9C:       00080091         ADD NL0, NL0, #2
> ; A0: L2:   1F0005EB         CMP NL0, NL5
> ; A4:       6BFFFF54         BLT L1
> ; A8:       84080091         ADD NL4, NL4, #2
> ; AC: L3:   9F000AEB         CMP NL4, R0
> ; B0:       4BFEFF54         BLT L0
> ; B4:       EA0303AA         MOV R0, NL3
> ; B8:       FB031AAA         MOV CSP, CFP
> ; BC:       5A7B40A9         LDP CFP, LR, [CFP]
> ; C0:       BF0300F1         CMP NULL, #0
> ; C4:       C0035FD6         RET
>
> Tada!

How do you see only fixnums are used?

Are we talking 1 word = 2 bytes = 16 bits here, s2c?

If so, the range of fixnums are -32 768 to 32 767 inclusive,
so those are hardly huge numbers.

-- 
underground experts united
https://dataswamp.org/~incal





* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
  2023-08-14 16:51           ` Emanuel Berg
@ 2023-08-15  4:58             ` Gerd Möllmann
  2023-08-15 14:20               ` Emanuel Berg
  2023-08-15  6:26             ` [PATCH] Re: Bignum performance Po Lu
  1 sibling, 1 reply; 41+ messages in thread
From: Gerd Möllmann @ 2023-08-15  4:58 UTC (permalink / raw)
  To: incal; +Cc: emacs-devel

> How do you see only fixnums are used?

By reading the assembly, and remembering a thing or two from my times as 
a CMUCL contributor, e.g. its fixnum representation.  SBCL is a fork of 
CMUCL.

> Are we talking 1 word = 2 bytes = 16 bits here, s2c?
> 
> If so, the range of fixnums are -32 768 to 32 767 inclusive,
> so those are hardly huge numbers.

It's a 64-bit machine, more specifically an M1, i.e. arm64.  How you get 
from there to these 16 bits escapes me.




* Re: [PATCH] Re: Bignum performance
  2023-08-14 16:51           ` Emanuel Berg
  2023-08-15  4:58             ` Gerd Möllmann
@ 2023-08-15  6:26             ` Po Lu
  2023-08-15 14:33               ` Emanuel Berg
  1 sibling, 1 reply; 41+ messages in thread
From: Po Lu @ 2023-08-15  6:26 UTC (permalink / raw)
  To: emacs-devel

Emanuel Berg <incal@dataswamp.org> writes:

> Are we talking 1 word = 2 bytes = 16 bits here, s2c?
>
> If so, the range of fixnums are -32 768 to 32 767 inclusive,
> so those are hardly huge numbers.

Under Arm64, general purpose integer registers are 64 bits wide.  That
is also the word size of said machine.

I agree that vague terminology such as ``word'' is confusing, because of
its use by the Unix assembler.


* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
  2023-08-14  9:42           ` Ihor Radchenko
@ 2023-08-15 14:03             ` Emanuel Berg
  2023-08-15 15:01               ` Ihor Radchenko
  0 siblings, 1 reply; 41+ messages in thread
From: Emanuel Berg @ 2023-08-15 14:03 UTC (permalink / raw)
  To: emacs-devel

Ihor Radchenko wrote:

>> The declaration means we are lying to the compiler because
>> Z gets bigger than 53 bits eventually. And all bets are off
>> because of the OPTIMIZE declaration. The result is that
>> everything is done in fixnums on 64-bit machines.
>
> That explains a lot :)
>
> I now tried
>
> (defun fib (reps num)
>   (declare (optimize speed (safety 0) (debug 0)))
>   (let ((z 0))
>     ;; (declare (type (unsigned-byte 53) reps num z))
>     (dotimes (r reps)
>       (let*((p1 1)
>             (p2 1))
>         (dotimes (i (- num 2))
>           (setf z (+ p1 p2)
>                 p2 p1
>                 p1 z))))
>     z))
>
> and got
>
> $ SBCL_HOME=/usr/lib64/sbcl perf record sbcl --load /tmp/fib.cl
>
> ;;;  0.263333 s real time
> ;;;  0.263641 s run time
>
> $ ~/Git/emacs/src/emacs -Q -batch -l /tmp/fib.eln
> 0.739 s
>
> Still ~3x faster compared to Elisp, but not orders
> of magnitude.

A pretty good optimization! :O

But what kind of optimization is it?

Also, what happens if you remove the OPTIMIZE declaration
as well?

Still, isn't the rule of the "beat the benchmark" game to beat
it as fast as possible?

-- 
underground experts united
https://dataswamp.org/~incal





* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
  2023-08-15  4:58             ` Gerd Möllmann
@ 2023-08-15 14:20               ` Emanuel Berg
  0 siblings, 0 replies; 41+ messages in thread
From: Emanuel Berg @ 2023-08-15 14:20 UTC (permalink / raw)
  To: emacs-devel

Gerd Möllmann wrote:

>> Are we talking 1 word = 2 bytes = 16 bits here, s2c? If so,
>> the range of fixnums are -32 768 to 32 767 inclusive, so
>> those are hardly huge numbers.
>
> It's a 64-bit machine, more specifically an M1, i.e. arm64.
> How you get from there to these 16 bits escapes me.

Here [1] it says a word for ARM is 32 bits.

So to store zero in a fixnum word it will look like this

  00000000 00000000 00000000 00000000

If one bit, the MSB, is used as the sign bit, the interval,
inclusive, is

  (list (* -1 (expt 2 (1- 32)))
        (1-   (expt 2 (1- 32))) )

  (-2147483648 2147483647)

Okay, now we are talking some pretty big numbers, although I have
seen even bigger ...

[1] https://modexp.wordpress.com/2018/10/30/arm64-assembly/

-- 
underground experts united
https://dataswamp.org/~incal





* Re: [PATCH] Re: Bignum performance
  2023-08-15  6:26             ` [PATCH] Re: Bignum performance Po Lu
@ 2023-08-15 14:33               ` Emanuel Berg
  2023-08-15 17:07                 ` tomas
  2023-08-16  1:31                 ` Po Lu
  0 siblings, 2 replies; 41+ messages in thread
From: Emanuel Berg @ 2023-08-15 14:33 UTC (permalink / raw)
  To: emacs-devel

Po Lu wrote:

>> Are we talking 1 word = 2 bytes = 16 bits here, s2c?
>>
>> If so, the range of fixnums are -32 768 to 32 767
>> inclusive, so those are hardly huge numbers.
>
> Under Arm64, general purpose integer registers are 64 bits
> wide. That is also the word size of said machine.

If they are, the range for fixnums is

(list (* -1 (expt 2 (1- 64)))
      (1-   (expt 2 (1- 64))) )

(-9223372036854775808 9223372036854775807)

Only after that it gets slower :P

-- 
underground experts united
https://dataswamp.org/~incal





* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
  2023-08-15 14:03             ` Emanuel Berg
@ 2023-08-15 15:01               ` Ihor Radchenko
  2023-08-15 22:21                 ` Emanuel Berg
  2023-08-15 22:33                 ` Emanuel Berg
  0 siblings, 2 replies; 41+ messages in thread
From: Ihor Radchenko @ 2023-08-15 15:01 UTC (permalink / raw)
  To: Emanuel Berg; +Cc: emacs-devel

Emanuel Berg <incal@dataswamp.org> writes:

>>     ;; (declare (type (unsigned-byte 53) reps num z))
>> ...
>> Still ~3x faster compared to Elisp, but not orders
>> of magnitude.
>
> A pretty good optimization! :O
>
> But what kind of optimization is it?

The commented "optimization" is: "Hey, SBCL, do not use bignums. If ints
overflow, so be it".

> Also, what happens if you remove the OPTIMIZE declaration
> as well?

No difference.

> Still, isn't the rule of the "beat the benchmark" game to beat
> it as fast as possible?

Yes, but when SBCL is orders of magnitude faster, it indicates something
conceptually wrong in the algo. 3x is a matter of variation in the
internal details (like extra type checking in Elisp that Po Lu outlined).

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




* Re: [PATCH] Re: Bignum performance
  2023-08-15 14:33               ` Emanuel Berg
@ 2023-08-15 17:07                 ` tomas
  2023-08-15 22:46                   ` Emanuel Berg
  2023-08-16  1:31                 ` Po Lu
  1 sibling, 1 reply; 41+ messages in thread
From: tomas @ 2023-08-15 17:07 UTC (permalink / raw)
  To: emacs-devel


On Tue, Aug 15, 2023 at 04:33:04PM +0200, Emanuel Berg wrote:
> Po Lu wrote:
> 
> >> Are we talking 1 word = 2 bytes = 16 bits here, s2c?
> >>
> >> If so, the range of fixnums are -32 768 to 32 767
> >> inclusive, so those are hardly huge numbers.
> >
> > Under Arm64, general purpose integer registers are 64 bits
> > wide. That is also the word size of said machine.
> 
> If they are, the range for fixnums is
> 
> (list (* -1 (expt 2 (1- 64)))
>       (1-   (expt 2 (1- 64))) )
> 
> (-9223372036854775808 9223372036854775807)
> 
> Only after that it gets slower :P

Unless SBCL uses tagged object representation. Unless the
compiler can prove that some "thing" is going to be an
int. Unless...

Cheers
-- 
t



* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
  2023-08-15 15:01               ` Ihor Radchenko
@ 2023-08-15 22:21                 ` Emanuel Berg
  2023-08-15 22:33                 ` Emanuel Berg
  1 sibling, 0 replies; 41+ messages in thread
From: Emanuel Berg @ 2023-08-15 22:21 UTC (permalink / raw)
  To: emacs-devel

Ihor Radchenko wrote:

>> A pretty good optimization! :O
>>
>> But what kind of optimization is it?
>
> The commented "optimization" is: "Hey, SBCL, do not use
> bignums. If ints overflow, so be it".

?

But then how can the algorithm execute correctly?

>> Still, isn't the rule of the "beat the benchmark" game to
>> beat it as fast as possible?
>
> Yes, but when SBCL is orders of magnitude faster, it
> indicates something conceptually wrong in the algo. 3x is
> a matter of variation in the internal details (like extra
> type checking in Elisp that Po Lu outlined).

If you are saying the algorithm doesn't output correct data
for the conventional conception of the Fibonacci algorithm,
then that optimization and whatever time it makes isn't valid,
I'll remove it this instant.

Hm, maybe we need something like unit testing to confirm that
the algorithms perform not just fast, but also as intended to
solve whatever problem they were designed to ...
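
A minimal sketch of such a check using ERT, the test library that ships
with Emacs.  `my-fib' is a hypothetical helper written for this
example, not the code from the benchmark files, and the second
assertion assumes Emacs 27 or later so that the expected value is read
as a bignum.

```elisp
(require 'ert)

(defun my-fib (num)
  "Return the NUMth Fibonacci number, with fib(1) = fib(2) = 1."
  (let ((p1 1) (p2 1))
    (dotimes (_ (- num 2))
      (let ((z (+ p1 p2)))
        (setq p2 p1
              p1 z)))
    p1))

(ert-deftest my-fib-known-values ()
  "Compare against known values, including one wider than 64 bits."
  (should (= (my-fib 10) 55))
  ;; fib(100) does not fit in a 64-bit register, so an implementation
  ;; that silently wraps around cannot produce this value.
  (should (= (my-fib 100) 354224848179261915075)))
```

The test can be run interactively with M-x ert, or in batch mode by
loading the file and calling `ert-run-tests-batch-and-exit'.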

-- 
underground experts united
https://dataswamp.org/~incal


* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
  2023-08-15 15:01               ` Ihor Radchenko
  2023-08-15 22:21                 ` Emanuel Berg
@ 2023-08-15 22:33                 ` Emanuel Berg
  2023-08-16  4:36                   ` tomas
  1 sibling, 1 reply; 41+ messages in thread
From: Emanuel Berg @ 2023-08-15 22:33 UTC (permalink / raw)
  To: emacs-devel

Ihor Radchenko wrote:

> Yes, but when SBCL is orders of magnitude faster, it
> indicates something conceptually wrong in the algo.

Indeed, I'll remove it, thanks.

But my CL skills aren't at that level so someone else added
it. A strange optimization indeed, that breaks the code. Hm,
maybe not that unusual when I think about it. But that is for
normal code, not supposed benchmarks ...

So this is the explanation for the +78 875% speed disadvantage
for Elisp! As reported a long time ago when comparing Elisp
and CL. I.e., what is documented in this file

  https://dataswamp.org/~incal/emacs-init/fib.el

and discussed here (or maybe it was on gnu.emacs.help) several
months ago.

\o/

Fires up a cigar!

Always a pleasure when a mystery gets solved ... but TBH
I actually believed Elisp was that much slower. Turns out, the
CL implementation wasn't even correct. Bummer, but ultimately
good for us as it turned out.

> 3x is a matter of variation in the internal details (like
> extra type checking in Elisp that Po Lu outlined).

I'll remove the supposed optimization, and we'll take it from
there. We (you) have already improved the bignum object
allocation reuse and Mr. Möllmann solved this issue, so we
have a positive trajectory already. But 78 875% or 3x doesn't
matter in principle, we do it until we are done regardless ...

-- 
underground experts united
https://dataswamp.org/~incal


* Re: [PATCH] Re: Bignum performance
  2023-08-15 17:07                 ` tomas
@ 2023-08-15 22:46                   ` Emanuel Berg
  0 siblings, 0 replies; 41+ messages in thread
From: Emanuel Berg @ 2023-08-15 22:46 UTC (permalink / raw)
  To: emacs-devel

tomas wrote:

>>>> Are we talking 1 word = 2 bytes = 16 bits here, s2c?
>>>>
>>>> If so, the range of fixnums are -32 768 to 32 767
>>>> inclusive, so those are hardly huge numbers.
>>>
>>> Under Arm64, general purpose integer registers are 64 bits
>>> wide. That is also the word size of said machine.
>> 
>> If they are, the range for fixnums is
>> 
>> (list (* -1 (expt 2 (1- 64)))
>>       (1-   (expt 2 (1- 64))) )
>> 
>> (-9223372036854775808 9223372036854775807)
>> 
>> Only after that it gets slower :P
>
> Unless SBCL uses tagged object representation. Unless the
> compiler can prove that some "thing" is going to be an int.
> Unless...

Actually the rules are quite simple. As long as the expected
execution and correct return value is observed and ultimately
achieved by the algorithm, all optimizations are fair.

-- 
underground experts united
https://dataswamp.org/~incal





* Re: [PATCH] Re: Bignum performance
  2023-08-15 14:33               ` Emanuel Berg
  2023-08-15 17:07                 ` tomas
@ 2023-08-16  1:31                 ` Po Lu
  2023-08-16  1:37                   ` Emanuel Berg
  1 sibling, 1 reply; 41+ messages in thread
From: Po Lu @ 2023-08-16  1:31 UTC (permalink / raw)
  To: emacs-devel

Emanuel Berg <incal@dataswamp.org> writes:

> Po Lu wrote:
>
>>> Are we talking 1 word = 2 bytes = 16 bits here, s2c?
>>>
>>> If so, the range of fixnums are -32 768 to 32 767
>>> inclusive, so those are hardly huge numbers.
>>
>> Under Arm64, general purpose integer registers are 64 bits
>> wide. That is also the word size of said machine.
>
> If they are, the range for fixnums is
>
> (list (* -1 (expt 2 (1- 64)))
>       (1-   (expt 2 (1- 64))) )
>
> (-9223372036854775808 9223372036854775807)
>
> Only after that it gets slower :P

Lisp systems normally set aside several of the high or low bits of a
register as a tag linking a type to the object represented.




* Re: [PATCH] Re: Bignum performance
  2023-08-16  1:31                 ` Po Lu
@ 2023-08-16  1:37                   ` Emanuel Berg
  2023-08-16  3:17                     ` Po Lu
  0 siblings, 1 reply; 41+ messages in thread
From: Emanuel Berg @ 2023-08-16  1:37 UTC (permalink / raw)
  To: emacs-devel

Po Lu wrote:

>>> Under Arm64, general purpose integer registers are 64 bits
>>> wide. That is also the word size of said machine.
>>
>> If they are, the range for fixnums is
>>
>> (list (* -1 (expt 2 (1- 64)))
>>       (1-   (expt 2 (1- 64))) )
>>
>> (-9223372036854775808 9223372036854775807)
>>
>> Only after that it gets slower :P
>
> Lisp systems normally set aside several of the high or low
> bits of a register as a tag linking a type to the
> object represented.

But here we are at the CPU architecture level (register
length), surely Lisp don't meddle with that?

No, I sense that it is, actually. So please explain, then, how
it works. And in particular, how many bits do we (Elisp and
CL) actually have for our fixnums?

Or, are we talking bignums now?

-- 
underground experts united
https://dataswamp.org/~incal





* Re: [PATCH] Re: Bignum performance
  2023-08-16  1:37                   ` Emanuel Berg
@ 2023-08-16  3:17                     ` Po Lu
  2023-08-16  4:44                       ` tomas
  2023-08-16  5:18                       ` Gerd Möllmann
  0 siblings, 2 replies; 41+ messages in thread
From: Po Lu @ 2023-08-16  3:17 UTC (permalink / raw)
  To: emacs-devel

Emanuel Berg <incal@dataswamp.org> writes:

> Po Lu wrote:
>
>>>> Under Arm64, general purpose integer registers are 64 bits
>>>> wide. That is also the word size of said machine.
>>>
>>> If they are, the range for fixnums is
>>>
>>> (list (* -1 (expt 2 (1- 64)))
>>>       (1-   (expt 2 (1- 64))) )
>>>
>>> (-9223372036854775808 9223372036854775807)
>>>
>>> Only after that it gets slower :P
>>
>> Lisp systems normally set aside several of the high or low
>> bits of a register as a tag linking a type to the
>> object represented.
>
> But here we are at the CPU architecture level (register
> length), surely Lisp don't meddle with that?
>
> No, I sense that it is, actually. So please explain, then, how
> it works. And in particular, how many bits do we (Elisp and
> CL) actually have for our fixnums?

I don't know about SBCL, but as for Emacs, refer to the definition of
VALBITS in lisp.h (maybe also the right files among m/*.h and s/*.h, but
I have no idea where they've disappeared to.)




* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
  2023-08-15 22:33                 ` Emanuel Berg
@ 2023-08-16  4:36                   ` tomas
  2023-08-16  5:23                     ` Emanuel Berg
  0 siblings, 1 reply; 41+ messages in thread
From: tomas @ 2023-08-16  4:36 UTC (permalink / raw)
  To: emacs-devel


On Wed, Aug 16, 2023 at 12:33:33AM +0200, Emanuel Berg wrote:
> Ihor Radchenko wrote:
> 
> > Yes, but when SBCL is orders of magnitude faster, it
> > indicates something conceptually wrong in the algo.
> 
> Indeed, I'll remove it, thanks.
> 
> But my CL skills aren't at that level so someone else added
> it. A strange optimization indeed, that breaks the code.

It only breaks the code if you "don't know what you are doing".

See, without the optimization the code will have, at each and
every arithmetic operation, to check "Hmm... Is this thing going
to overflow? Hm. It might, so better use bignums. Phew, it didn't,
so back to fixnums".

Now we know that modern CPU architectures have a hard time with
conditional statements (pipeline stalls, branch mispredictions,
all that nasty stuff). So this "Hmm..." above is costing real
money. Even in cases you won't need it, because things ain't
gonna overflow.

The compiler tries to do a good job of looking into calculations
and deciding "this incf down there won't ever push us over the
fixnum limit, because we know we are starting with a number
below 10".

But the programmer sometimes has more knowledge and can prove
that things won't overflow, ever. Or that, should things overflow,
it won't matter anyway.

It's for those cases that this kind of optimization is made.

C, by the way, always runs in this mode. Unsigned integers will
silently wrap around, that's documented behaviour. Signed integers
will do whatever their thing is (technically, signed overflow is
"undefined behaviour").

Perhaps you wanted just to compute fib modulo some big power
of two? Then your program was correct, after all...
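
The "fib modulo some big power of two" reading can be sketched in Emacs
Lisp by masking every sum to a fixed width.  This only models unsigned
wraparound; it is not a claim about what SBCL emits at (safety 0),
where the consequences of a false declaration are undefined.  The names
`my-fib-exact' and `my-fib-wrapping' are hypothetical, and Emacs 27 or
later is assumed for bignum support.

```elisp
(defun my-fib-exact (num)
  "Return fib(NUM) exactly, promoting to bignums when needed."
  (let ((p1 1) (p2 1))
    (dotimes (_ (- num 2))
      (let ((z (+ p1 p2)))
        (setq p2 p1
              p1 z)))
    p1))

(defun my-fib-wrapping (num bits)
  "Return fib(NUM) computed with every sum truncated to BITS bits.
This imitates arithmetic in a BITS-wide unsigned register."
  (let ((mask (1- (ash 1 bits)))
        (p1 1) (p2 1))
    (dotimes (_ (- num 2))
      (let ((z (logand (+ p1 p2) mask)))   ; drop the overflowed bits
        (setq p2 p1
              p1 z)))
    p1))

;; The wrapped result is the exact one reduced modulo 2^BITS.
(= (my-fib-wrapping 100 64)
   (mod (my-fib-exact 100) (expt 2 64)))   ; => t
```

The two functions agree as long as the exact value fits in BITS bits
and diverge afterwards, which is why the wrapped version stays fast but
no longer answers the question the benchmark was meant to ask.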

Cheers
-- 
t

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Re: Bignum performance
  2023-08-16  3:17                     ` Po Lu
@ 2023-08-16  4:44                       ` tomas
  2023-08-16  5:18                       ` Gerd Möllmann
  1 sibling, 0 replies; 41+ messages in thread
From: tomas @ 2023-08-16  4:44 UTC (permalink / raw)
  To: Po Lu; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1277 bytes --]

On Wed, Aug 16, 2023 at 11:17:01AM +0800, Po Lu wrote:
> Emanuel Berg <incal@dataswamp.org> writes:
> 
> > Po Lu wrote:

[...]

> >> Lisp systems normally set aside several of the high or low
> >> bits of a register as a tag linking a type to the
> >> object represented.
> >
> > But here we are at the CPU architecture level (register
> > length), surely Lisp doesn't meddle with that?
> >
> > No, I sense that it is, actually. So please explain, then, how
> > it works. And in particular, how many bits do we (Elisp and
> > CL) actually have for our fixnums?
> 
> I don't know about SBCL, but as for Emacs, refer to the definition of
> VALBITS in lisp.h (maybe also the right files among m/*.h and s/*.h, but
> I have no idea where they've disappeared to.)

That is what I was hinting at with "tagged representation": Emacs
Lisp does it, we don't know about SBCL. Typically, a good implementation
has small stretches of code where the values are as-is because the
compiler can prove what their type is (fixnum, whatever). But that
means that fixnums are usually limited to less than the full machine
word's width (e.g. 60 bits if your tag is four bits wide), because
your lisp has to be able to stuff them back into such a place.
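
The arithmetic behind "60 bits if your tag is four bits wide" can be
sketched like this. `my-fixnum-bounds' is a hypothetical helper, and the
sketch assumes a signed two's-complement value in whatever bits the tag
leaves free.

```elisp
;; Hypothetical helper: fixnum range left over after reserving tag bits.
(defun my-fixnum-bounds (word-bits tag-bits)
  "Return (MIN . MAX) for a signed fixnum in a tagged machine word.
WORD-BITS is the register width and TAG-BITS the number of bits
reserved for the type tag."
  (let ((value-bits (- word-bits tag-bits)))
    (cons (- (ash 1 (1- value-bits)))       ; -2^(value-bits - 1)
          (1- (ash 1 (1- value-bits))))))   ;  2^(value-bits - 1) - 1

(my-fixnum-bounds 64 4) ; 60 value bits: (-2^59 . 2^59 - 1)
```

With 64 and 2 as arguments the result coincides with the
`most-negative-fixnum' and `most-positive-fixnum' values reported for a
64-bit Emacs elsewhere in this thread.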

Cheers
-- 
t

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Re: Bignum performance
  2023-08-16  3:17                     ` Po Lu
  2023-08-16  4:44                       ` tomas
@ 2023-08-16  5:18                       ` Gerd Möllmann
  2023-08-16  5:35                         ` Emanuel Berg
                                           ` (2 more replies)
  1 sibling, 3 replies; 41+ messages in thread
From: Gerd Möllmann @ 2023-08-16  5:18 UTC (permalink / raw)
  To: luangruo; +Cc: emacs-devel

> I don't know about SBCL, but as for Emacs, refer to the definition of
> VALBITS in lisp.h (maybe also the right files among m/*.h and s/*.h, but
> I have no idea where they've disappeared to.)

The SBCL I have here, a Homebrew installation, uses the scheme where

....0	-> fixnum
....1   -> other objects (discriminated by additional tag bits)

I had the same for Emacs in the branch gerd_int in the 2000s, if memory 
serves me.
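
For readers who have not met this scheme: a minimal model of it in Emacs
Lisp, with ordinary integers standing in for machine words. All three
function names are hypothetical, and the real implementations do this in
C or machine code, not in Lisp.

```elisp
;; Hypothetical model of the low-bit scheme: a word whose lowest bit
;; is 0 holds a fixnum, stored shifted left by one.
(defun my-tag-fixnum (n)
  "Return the tagged word representing integer N."
  (ash n 1))

(defun my-tagged-fixnum-p (word)
  "Return non-nil if WORD carries the fixnum tag (low bit 0)."
  (zerop (logand word 1)))

(defun my-untag-fixnum (word)
  "Return the integer stored in the tagged fixnum WORD."
  (ash word -1))                    ; arithmetic shift keeps the sign

;; Adding two tagged fixnums gives the tagged sum, no untagging needed.
(my-untag-fixnum (+ (my-tag-fixnum 20) (my-tag-fixnum 22))) ; 42
```

That last property is one reason a zero tag for fixnums is attractive:
addition and subtraction work directly on the tagged words.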



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
  2023-08-16  4:36                   ` tomas
@ 2023-08-16  5:23                     ` Emanuel Berg
  0 siblings, 0 replies; 41+ messages in thread
From: Emanuel Berg @ 2023-08-16  5:23 UTC (permalink / raw)
  To: emacs-devel

tomas wrote:

> Perhaps you wanted just to compute fib modulo some big power
> of two? Then your program was correct, after all...

So, it works for fixnums but not for bignums?

The code must always work for valid input data. If it
doesn't, even for a single such case, the optimization breaks
the algorithm and will be removed.

Maybe we can duplicate it and remove the declaration in one
of them. So we would have one Fibonacci to test the speed of
fixnums, and one for bignums. In the fixnums one, the
declaration would still be legal.

-- 
underground experts united
https://dataswamp.org/~incal

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Re: Bignum performance
  2023-08-16  5:18                       ` Gerd Möllmann
@ 2023-08-16  5:35                         ` Emanuel Berg
  2023-08-18  7:14                           ` Simon Leinen
  2023-08-16  5:41                         ` Gerd Möllmann
  2023-08-16  6:42                         ` Po Lu
  2 siblings, 1 reply; 41+ messages in thread
From: Emanuel Berg @ 2023-08-16  5:35 UTC (permalink / raw)
  To: emacs-devel

Gerd Möllmann wrote:

>> I don't know about SBCL, but as for Emacs, refer to the
>> definition of VALBITS in lisp.h (maybe also the right files
>> among m/*.h and s/*.h, but I have no idea where they've
>> disappeared to.)
>
> The SBCL I have here, a Homebrew installation, uses the
> scheme where
>
> ....0	-> fixnum
> ....1   -> other objects (discriminated by additional tag bits)
>
> I had the same for Emacs in the branch gerd_int in the
> 2000s, if memory serves me.

Ah, there is `fixnump' (and `bignump') to test for a specific
number,

  (fixnump (expt 2 60)) ; t
  (fixnump (expt 2 61)) ; nil

2^60 = 1152921504606846976, so that's pretty big.
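
The exact boundary can be probed without knowing the word size, using
the two limit variables. This assumes Emacs 27 or later, where
`fixnump' and `bignump' exist.

```elisp
;; The boundary between the two integer representations sits exactly
;; at `most-positive-fixnum' and `most-negative-fixnum'.
(fixnump most-positive-fixnum)        ; t
(bignump (1+ most-positive-fixnum))   ; t
(fixnump most-negative-fixnum)        ; t
(bignump (1- most-negative-fixnum))   ; t
```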

-- 
underground experts united
https://dataswamp.org/~incal




^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Re: Bignum performance
  2023-08-16  5:18                       ` Gerd Möllmann
  2023-08-16  5:35                         ` Emanuel Berg
@ 2023-08-16  5:41                         ` Gerd Möllmann
  2023-08-16  6:42                         ` Po Lu
  2 siblings, 0 replies; 41+ messages in thread
From: Gerd Möllmann @ 2023-08-16  5:41 UTC (permalink / raw)
  To: luangruo; +Cc: emacs-devel

On 16.08.23 07:18, Gerd Möllmann wrote:
>> I don't know about SBCL, but as for Emacs, refer to the definition of
>> VALBITS in lisp.h (maybe also the right files among m/*.h and s/*.h, but
>> I have no idea where they've disappeared to.)
> 
> The SBCL I have here, a Homebrew installation, uses the scheme where

Oops, I'm actually not running the SBCL from Homebrew, but my own...
Anyway, the tagging is like this:

@c 64-bit lowtag assignment (wider-fixnums)
@c xyz0 -- Fixnum (where z or yz may also be 0 depending on n-fixnum-tag-bits)
@c xx01 -- Other-immediate
@c xx11 -- Pointer
@c 0011 --   Instance-pointer
@c 0111 --   List-pointer
@c 1011 --   Function-pointer
@c 1111 --   Other-pointer

https://github.com/sbcl/sbcl/blob/master/doc/internals/objects-in-memory.texinfo
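
Read as a decision procedure, the table could look like the sketch
below. It is written in Emacs Lisp only for illustration, the name
`my-lowtag-kind' is hypothetical, and it assumes the variant where a
single low bit marks fixnums.

```elisp
;; Hypothetical classifier following the table above, for the case
;; where a single low bit marks fixnums.
(defun my-lowtag-kind (word)
  "Classify WORD by its four low bits, as in the lowtag table."
  (let ((low4 (logand word #b1111)))
    (cond ((zerop (logand low4 #b1)) 'fixnum)            ; xyz0
          ((= (logand low4 #b11) #b01) 'other-immediate) ; xx01
          ((= low4 #b0011) 'instance-pointer)
          ((= low4 #b0111) 'list-pointer)
          ((= low4 #b1011) 'function-pointer)
          (t 'other-pointer))))                          ; 1111

(my-lowtag-kind #b10100) ; fixnum
(my-lowtag-kind #b0111)  ; list-pointer
```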



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Re: Bignum performance
  2023-08-16  5:18                       ` Gerd Möllmann
  2023-08-16  5:35                         ` Emanuel Berg
  2023-08-16  5:41                         ` Gerd Möllmann
@ 2023-08-16  6:42                         ` Po Lu
  2023-08-16  8:05                           ` Gerd Möllmann
  2 siblings, 1 reply; 41+ messages in thread
From: Po Lu @ 2023-08-16  6:42 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: emacs-devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

>> I don't know about SBCL, but as for Emacs, refer to the definition of
>> VALBITS in lisp.h (maybe also the right files among m/*.h and s/*.h, but
>> I have no idea where they've disappeared to.)
>
> The SBCL I have here, a Homebrew installation, uses the scheme where
>
> ....0	-> fixnum
> ....1   -> other objects (discriminated by additional tag bits)
>
> I had the same for Emacs in the branch gerd_int in the 2000s, if
> memory serves me.

In today's Emacs, 0 is Lisp_Symbol; this facilitates representing Qnil
as C NULL.



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Re: Bignum performance
  2023-08-16  6:42                         ` Po Lu
@ 2023-08-16  8:05                           ` Gerd Möllmann
  0 siblings, 0 replies; 41+ messages in thread
From: Gerd Möllmann @ 2023-08-16  8:05 UTC (permalink / raw)
  To: Po Lu; +Cc: emacs-devel

On 16.08.23 08:42, Po Lu wrote:
> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
> 
>>> I don't know about SBCL, but as for Emacs, refer to the definition of
>>> VALBITS in lisp.h (maybe also the right files among m/*.h and s/*.h, but
>>> I have no idea where they've disappeared to.)
>>
>> The SBCL I have here, a Homebrew installation, uses the scheme where
>>
>> ....0	-> fixnum
>> ....1   -> other objects (discriminated by additional tag bits)
>>
>> I had the same for Emacs in the branch gerd_int in the 2000s, if
>> memory serves me.
> 
> In today's Emacs, 0 is Lisp_Symbol; this facilitates representing Qnil
> as C NULL.

Yes, and I don't see a need to change anything in this regard in Emacs. 
IMHO, the fixnum range is more than sufficient nowadays, in general.



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Re: Bignum performance
  2023-08-16  5:35                         ` Emanuel Berg
@ 2023-08-18  7:14                           ` Simon Leinen
  2023-08-19 13:10                             ` Emanuel Berg
  2023-09-04  4:13                             ` Emanuel Berg
  0 siblings, 2 replies; 41+ messages in thread
From: Simon Leinen @ 2023-08-18  7:14 UTC (permalink / raw)
  To: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1215 bytes --]

Emacs also has `most-negative-fixnum' and `most-positive-fixnum' (borrowed
from Common Lisp but now part of the core).

On my 64-bit system (GNU Emacs 30.0.50 on aarch64-apple-darwin22.5.0):

(= (- (expt 2 61)) most-negative-fixnum) ⇒ t
(= (1- (expt 2 61)) most-positive-fixnum) ⇒ t

(Same as on Emanuel's.)
-- 
Simon.

On Wed, Aug 16, 2023 at 1:12 PM Emanuel Berg <incal@dataswamp.org> wrote:

> Gerd Möllmann wrote:
>
> >> I don't know about SBCL, but as for Emacs, refer to the
> >> definition of VALBITS in lisp.h (maybe also the right files
> >> among m/*.h and s/*.h, but I have no idea where they've
> >> disappeared to.)
> >
> > The SBCL I have here, a Homebrew installation, uses the
> > scheme where
> >
> > ....0 -> fixnum
> > ....1   -> other objects (discriminated by additional tag bits)
> >
> > I had the same for Emacs in the branch gerd_int in the
> > 2000s, if memory serves me.
>
> Ah, there is `fixnump' (and `bignump') to test for a specific
> number,
>
>   (fixnump (expt 2 60)) ; t
>   (fixnump (expt 2 61)) ; nil
>
> 2^60 = 1152921504606846976, so that's pretty big.
>
> --
> underground experts united
> https://dataswamp.org/~incal
>
>
>

[-- Attachment #2: Type: text/html, Size: 1806 bytes --]

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Re: Bignum performance
  2023-08-18  7:14                           ` Simon Leinen
@ 2023-08-19 13:10                             ` Emanuel Berg
  2023-08-20  5:07                               ` Ihor Radchenko
  2023-09-04  4:13                             ` Emanuel Berg
  1 sibling, 1 reply; 41+ messages in thread
From: Emanuel Berg @ 2023-08-19 13:10 UTC (permalink / raw)
  To: emacs-devel

Simon Leinen wrote:

> Emacs also has `most-negative-fixnum' and
> `most-positive-fixnum' (borrowed from Common Lisp but now
> part of the core).
>
> On my 64-bit system (GNU Emacs 30.0.50 on
> aarch64-apple-darwin22.5.0):
>
> (= (- (expt 2 61)) most-negative-fixnum) → t
> (= (1- (expt 2 61)) most-positive-fixnum) → t
>
> (Same as on Emanuel's.)

At least that computation was correct then!

Now that Gerd has solved the little mystery why Fibonacci was
seemingly so much faster on Common Lisp and we also got that
performance gain patch from Ihor I don't know how much sense
it makes to continue translating the emacs-benchmarks to
Common Lisp, or rather if anyone is motivated enough to do it,
but this is how far I got:

  https://dataswamp.org/~incal/cl/bench/

-- 
underground experts united
https://dataswamp.org/~incal




^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Re: Bignum performance
  2023-08-19 13:10                             ` Emanuel Berg
@ 2023-08-20  5:07                               ` Ihor Radchenko
  2023-08-20  6:20                                 ` Emanuel Berg
  2023-08-28  5:32                                 ` Emanuel Berg
  0 siblings, 2 replies; 41+ messages in thread
From: Ihor Radchenko @ 2023-08-20  5:07 UTC (permalink / raw)
  To: Emanuel Berg; +Cc: emacs-devel

Emanuel Berg <incal@dataswamp.org> writes:

> Now that Gerd has solved the little mystery why Fibonacci was
> seemingly so much faster on Common Lisp and we also got that
> performance gain patch from Ihor I don't know how much sense
> it makes to continue translating the emacs-benchmarks to
> Common Lisp, or rather if anyone is motivated enough to do it,
> but this is how far I got:
>
>   https://dataswamp.org/~incal/cl/bench/

You can again compare Elisp with CL and let us know what is being
noticeably slower. It will be an indication that something might be
improved.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Re: Bignum performance
  2023-08-20  5:07                               ` Ihor Radchenko
@ 2023-08-20  6:20                                 ` Emanuel Berg
  2023-08-28  5:32                                 ` Emanuel Berg
  1 sibling, 0 replies; 41+ messages in thread
From: Emanuel Berg @ 2023-08-20  6:20 UTC (permalink / raw)
  To: emacs-devel

Ihor Radchenko wrote:

>> Now that Gerd has solved the little mystery why Fibonacci
>> was seemingly so much faster on Common Lisp and we also got
>> that performance gain patch from Ihor I don't know how much
>> sense it makes to continue translating the emacs-benchmarks
>> to Common Lisp, or rather if anyone is motivated enough to
>> do it, but this is how far I got:
>>
>>   https://dataswamp.org/~incal/cl/bench/
>
> You can again compare Elisp with CL and let us know what is
> being noticeably slower. It will be an indication that
> something might be improved.

Thanks, you are right, hopefully it will happen.

-- 
underground experts united
https://dataswamp.org/~incal




^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Re: Bignum performance
  2023-08-20  5:07                               ` Ihor Radchenko
  2023-08-20  6:20                                 ` Emanuel Berg
@ 2023-08-28  5:32                                 ` Emanuel Berg
  2023-09-03  0:48                                   ` Emanuel Berg
  2023-09-03  1:57                                   ` [PATCH] Re: Bignum performance Emanuel Berg
  1 sibling, 2 replies; 41+ messages in thread
From: Emanuel Berg @ 2023-08-28  5:32 UTC (permalink / raw)
  To: emacs-devel

Ihor Radchenko wrote:

> You can again compare Elisp with CL and let us know what is
> being noticeably slower. It will be an indication that
> something might be improved.

Here is a new file:

  https://dataswamp.org/~incal/cl/bench/flet.cl

The CL time is 0.84 vs Elisp time at 1.35.

So here, CL is 61% faster!

(format "%d%%" (round (* 100 (1- (/ 1.35 0.84)))))

flet
0.839996 s real time
0.839262 s run time

-- 
underground experts united
https://dataswamp.org/~incal




^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Re: Bignum performance
  2023-08-28  5:32                                 ` Emanuel Berg
@ 2023-09-03  0:48                                   ` Emanuel Berg
  2023-09-03  8:50                                     ` Ihor Radchenko
  2023-09-03  1:57                                   ` [PATCH] Re: Bignum performance Emanuel Berg
  1 sibling, 1 reply; 41+ messages in thread
From: Emanuel Berg @ 2023-09-03  0:48 UTC (permalink / raw)
  To: emacs-devel

Ihor Radchenko wrote:

> You can again compare Elisp with CL and let us know what is
> being noticeably slower. It will be an indication that
> something might be improved.

Here is yet another file,

  https://dataswamp.org/~incal/cl/bench/inclist.cl

With `native-comp-speed' at 3, the maximal optimization level and
the default for the benchmarks package, the Elisp time for this
benchmark is 5.86 sec, while the CL time is 1.635993 sec.

So here, CL is 258% faster.

I run the benchmarks from Emacs, and the CL also from Emacs,
with SLIME and SBCL. So that should be pretty fair, but maybe
when all benchmarks are translated one could do the Elisp with
batch and the SBCL not using Emacs at all.

-- 
underground experts united
https://dataswamp.org/~incal

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Re: Bignum performance
  2023-08-28  5:32                                 ` Emanuel Berg
  2023-09-03  0:48                                   ` Emanuel Berg
@ 2023-09-03  1:57                                   ` Emanuel Berg
  1 sibling, 0 replies; 41+ messages in thread
From: Emanuel Berg @ 2023-09-03  1:57 UTC (permalink / raw)
  To: emacs-devel

Yet another file,

  https://dataswamp.org/~incal/cl/bench/inclist-type-hints.cl

Here we have type hints, answering my own question from a while
back about how to do it!

It is done with `declare', as in this

  (declare (optimize (speed 3) (safety 0)))

for the function concerned; that sets the optimization policy, and
the type hints themselves are then supplied with `the'. ("Type
hints enabled"?)
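
On the Elisp side the corresponding hint is `cl-the' from cl-lib. The
following is only a sketch in the spirit of the benchmark, not the
benchmark file itself; `my-inc-list' is a hypothetical name, and whether
the native compiler gains anything from the hint is not something this
example establishes.

```elisp
(require 'cl-lib)

;; Hypothetical example, not the benchmark file itself: increment
;; every element of a list in place, declaring with `cl-the' that
;; each element is a fixnum.
(defun my-inc-list (list)
  "Destructively add 1 to each element of LIST and return LIST."
  (let ((tail list))
    (while tail
      (setcar tail (1+ (cl-the fixnum (car tail))))
      (setq tail (cdr tail))))
  list)

(my-inc-list (list 1 2 3)) ; (2 3 4)
```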

Elisp vs SBCL, in seconds:

  (faster 5.73 0.675997) ; CL is 748% faster

Note that the optimization worked a lot better for SBCL than
it did for Elisp, if we compare with the non-optimized (i.e.
no type hints) file I just posted [1] [it hasn't arrived
yet on this ML as I type this, but should arrive] - anyway

  (faster 5.86     5.73)     ;   2% - Elisp vs `cl-the' Elisp
  (faster 1.635993 0.675997) ; 142% - SBCL  vs `the' SBCL

SBCL became 142% faster with the optimization, and Elisp
just 2%.
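
The `faster' helper used in the forms above is not shown anywhere in
this thread. A definition consistent with the percentages quoted might
look as follows; `my-faster' is a hypothetical stand-in, modelled on the
`format' expression posted earlier for the flet benchmark.

```elisp
;; Hypothetical definition; the `faster' helper used above is not
;; shown anywhere in this thread.
(defun my-faster (old new)
  "Return, as a rounded percentage, how much faster NEW is than OLD.
OLD and NEW are run times in seconds; a negative result means NEW
is slower."
  (round (* 100 (1- (/ (float old) new)))))

(my-faster 5.86 5.73)         ; 2
(my-faster 1.635993 0.675997) ; 142
```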

Another interesting thing is that here, superficially or on
the interface level at least, we have the same type hint
possibilities as in SBCL.

[1] https://dataswamp.org/~incal/cl/bench/inclist.cl

-- 
underground experts united
https://dataswamp.org/~incal

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Re: Bignum performance
  2023-09-03  0:48                                   ` Emanuel Berg
@ 2023-09-03  8:50                                     ` Ihor Radchenko
  2023-09-03  9:05                                       ` Emanuel Berg
  0 siblings, 1 reply; 41+ messages in thread
From: Ihor Radchenko @ 2023-09-03  8:50 UTC (permalink / raw)
  To: Emanuel Berg; +Cc: emacs-devel

Emanuel Berg <incal@dataswamp.org> writes:

> Here is yet another file,
>
>   https://dataswamp.org/~incal/cl/bench/inclist.cl

Unfortunately, I cannot use this file because it is referring to
~/public_html/cl/bench/timing.cl, which I do not have.

> The Elisp time for this benchmark is, with `native-comp-speed'
> at the default - for the benchmarks package - maximal
> optimization level, namely 3 - at 5.86 sec, while the CL is at
> 1.635993 sec.

For me, Elisp runs in 1.2 sec.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Re: Bignum performance
  2023-09-03  8:50                                     ` Ihor Radchenko
@ 2023-09-03  9:05                                       ` Emanuel Berg
  2023-09-03 10:30                                         ` Elisp native-comp vs. SBCL for inclist-type-hints benchmark (was: [PATCH] Re: Bignum performance) Ihor Radchenko
  0 siblings, 1 reply; 41+ messages in thread
From: Emanuel Berg @ 2023-09-03  9:05 UTC (permalink / raw)
  To: emacs-devel

Ihor Radchenko wrote:

>> Here is yet another file,
>>
>>   https://dataswamp.org/~incal/cl/bench/inclist.cl
>
> Unfortunately, I cannot use this file because it is referring to
> ~/public_html/cl/bench/timing.cl, which I do not have.

It is in the same directory

  https://dataswamp.org/~incal/cl/bench/timing.cl

but I don't know how to load it in CL without using an absolute
path; maybe it cannot be done without ASDF or some other such
solution.

So you have to set that manually for now.

-- 
underground experts united
https://dataswamp.org/~incal




^ permalink raw reply	[flat|nested] 41+ messages in thread

* Elisp native-comp vs. SBCL for inclist-type-hints benchmark (was: [PATCH] Re: Bignum performance)
  2023-09-03  9:05                                       ` Emanuel Berg
@ 2023-09-03 10:30                                         ` Ihor Radchenko
  2023-09-04  1:03                                           ` Emanuel Berg
  0 siblings, 1 reply; 41+ messages in thread
From: Ihor Radchenko @ 2023-09-03 10:30 UTC (permalink / raw)
  To: Emanuel Berg, Andrea Corallo; +Cc: emacs-devel

Emanuel Berg <incal@dataswamp.org> writes:

>> Unfortunately, I cannot use this file because it is referring to
>> ~/public_html/cl/bench/timing.cl, which I do not have.
>
> It is in the same directory
>
>   https://dataswamp.org/~incal/cl/bench/timing.cl

It would be nice if you attached the files to email.
Otherwise, people examining this thread a few years in the future may not
be able to access the files.

> So you have to set that manually as for now.

I did, and the results are very different from yours:

$  SBCL_HOME=/usr/lib64/sbcl sbcl --load /tmp/inclist.cl

;; 1.096667 s real time
;; 1.096235 s run time

$ SBCL_HOME=/usr/lib64/sbcl sbcl --load /tmp/inclist-type-hints.cl 
;; 0.55 s real time
;; 0.549992 s run time

(emacs master)
$ perf record ./src/emacs -batch -l /home/yantar92/.emacs.d/straight/repos/elisp-benchmarks/elisp-benchmarks.el --eval '(setq elb-speed 2)' --eval '(elisp-benchmarks-run "inclist")'

* Results

  | test               | non-gc avg (s) | gc avg (s) | gcs avg | tot avg (s) | tot avg err (s) |
  |--------------------+----------------+------------+---------+-------------+-----------------|
  | inclist            |           1.20 |       0.00 |       0 |        1.20 |            0.02 |
  | inclist-type-hints |           1.05 |       0.00 |       0 |        1.05 |            0.02 |
  |--------------------+----------------+------------+---------+-------------+-----------------|
  | total              |           2.26 |       0.00 |       0 |        2.26 |            0.02 |

inclist: 1.1 sec (SBCL) vs. 1.2 sec (Elisp)
inclist-type-hints: 0.55 sec (SBCL) vs. 1.05 sec (Elisp)

With native-comp speed 3, the results are on par for inclist:

$ perf record ./src/emacs -batch -l /home/yantar92/.emacs.d/straight/repos/elisp-benchmarks/elisp-benchmarks.el --eval '(setq elb-speed 3)' --eval '(elisp-benchmarks-run "inclist")'

* Results

  | test               | non-gc avg (s) | gc avg (s) | gcs avg | tot avg (s) | tot avg err (s) |
  |--------------------+----------------+------------+---------+-------------+-----------------|
  | inclist            |           1.07 |       0.00 |       0 |        1.07 |            0.02 |
  | inclist-type-hints |           0.99 |       0.00 |       0 |        0.99 |            0.00 |
  |--------------------+----------------+------------+---------+-------------+-----------------|

inclist: 1.1 sec (SBCL) vs. 1.07 sec (Elisp)
inclist-type-hints: 0.55 sec (SBCL) vs. 0.99 sec (Elisp)

There is nothing obvious that is slower in Elisp - most of the time is
spent in the native compiled functions:

    47.83%  emacs     inclist-b6453dcf-34842bf7.eln                  [.] F656c622d696e636c697374_elb_inclist_0
    44.80%  emacs     inclist-type-hints-bb635d76-535ebfb0.eln       [.] F656c622d696e636c6973742d7468_elb_inclist_th_0
     1.45%  emacs     emacs                                          [.] process_mark_stack

So, might be some missing optimization in native-comp itself.
CCing Andrea, in case he has some insight about further analysis.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Elisp native-comp vs. SBCL for inclist-type-hints benchmark (was: [PATCH] Re: Bignum performance)
  2023-09-03 10:30                                         ` Elisp native-comp vs. SBCL for inclist-type-hints benchmark (was: [PATCH] Re: Bignum performance) Ihor Radchenko
@ 2023-09-04  1:03                                           ` Emanuel Berg
  0 siblings, 0 replies; 41+ messages in thread
From: Emanuel Berg @ 2023-09-04  1:03 UTC (permalink / raw)
  To: emacs-devel

Ihor Radchenko wrote:

> I did, and the results are very different from yours

It is probably because of the Emacs batch and/or the SBCL
non-SLIME style of execution. Using commands similar to
yours [see the end of this message] I get the following results.

Elisp vs SBCL

inclist:            (faster 1.04 1.59) ; CL 35% slower
inclist-type-hints: (faster 1.05 0.69) ; CL 52% faster

Elisp optimization: (faster 1.04 1.05) ; Elisp 1% slower from optimization
CL optimization:    (faster 1.59 0.69) ; CL 130% faster from optimization

We see that, surprisingly, CL is slower for plain inclist.

With type hints, though, CL benefits hugely and beats the
non-optimized Elisp time.

While Elisp doesn't seem to benefit from the optimization
at all.

#! /bin/zsh
#
# this file:
#   https://dataswamp.org/~incal/cl/bench/inc2-cl

sbcl --noinform --load inclist.cl --load inclist-type-hints.cl --quit

#! /bin/zsh
#
# this file:
#   https://dataswamp.org/~incal/cl/bench/inc2-el

emacs                                                            \
    -batch                                                       \
    -l ~/.emacs.d/elpa/elisp-benchmarks-1.14/elisp-benchmarks.el \
    --eval '(setq elb-speed 2)'                                  \
    --eval '(elisp-benchmarks-run "inclist")'

-- 
underground experts united
https://dataswamp.org/~incal

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Re: Bignum performance
  2023-08-18  7:14                           ` Simon Leinen
  2023-08-19 13:10                             ` Emanuel Berg
@ 2023-09-04  4:13                             ` Emanuel Berg
  1 sibling, 0 replies; 41+ messages in thread
From: Emanuel Berg @ 2023-09-04  4:13 UTC (permalink / raw)
  To: emacs-devel

Simon Leinen wrote:

> Emacs also has `most-negative-fixnum' and
> `most-positive-fixnum' (borrowed from Common Lisp but now
> part of the core).
>
> On my 64-bit system (GNU Emacs 30.0.50 on aarch64-apple-darwin22.5.0):
>
> (= (- (expt 2 61)) most-negative-fixnum) → t
> (= (1- (expt 2 61)) most-positive-fixnum) → t
>
> (Same as on Emanuel's.)

(let ((bits 62))
  (and (= (-  (expt 2 (1- bits))) most-negative-fixnum)
       (= (1- (expt 2 (1- bits))) most-positive-fixnum) )) ; t

Sweet B)
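
The 62 need not be hard-coded; it can be derived from
`most-positive-fixnum'. A small sketch, with `my-fixnum-bits' being a
hypothetical name:

```elisp
;; Hypothetical helper: derive the width of the fixnum range from
;; `most-positive-fixnum' instead of hard-coding 62.
(defun my-fixnum-bits ()
  "Return the number of bits, sign bit included, in the fixnum range."
  (let ((n most-positive-fixnum)
        (bits 1))                   ; start at 1 to count the sign bit
    (while (> n 0)
      (setq n (ash n -1)
            bits (1+ bits)))
    bits))

(let ((bits (my-fixnum-bits)))
  (and (= (-  (expt 2 (1- bits))) most-negative-fixnum)
       (= (1- (expt 2 (1- bits))) most-positive-fixnum))) ; t
```

On the 64-bit builds discussed here `my-fixnum-bits' should give 62;
other builds may differ, which is the point of computing it.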

-- 
underground experts united
https://dataswamp.org/~incal




^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2023-09-04  4:13 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
2023-08-14  6:28 [PATCH] Re: Bignum performance (was: Shrinking the C core) Gerd Möllmann
2023-08-14  6:56 ` Gerd Möllmann
2023-08-14  7:04   ` Ihor Radchenko
2023-08-14  7:35     ` Gerd Möllmann
2023-08-14  8:09       ` Ihor Radchenko
2023-08-14  9:28         ` Gerd Möllmann
2023-08-14  9:42           ` Ihor Radchenko
2023-08-15 14:03             ` Emanuel Berg
2023-08-15 15:01               ` Ihor Radchenko
2023-08-15 22:21                 ` Emanuel Berg
2023-08-15 22:33                 ` Emanuel Berg
2023-08-16  4:36                   ` tomas
2023-08-16  5:23                     ` Emanuel Berg
2023-08-14 16:51           ` Emanuel Berg
2023-08-15  4:58             ` Gerd Möllmann
2023-08-15 14:20               ` Emanuel Berg
2023-08-15  6:26             ` [PATCH] Re: Bignum performance Po Lu
2023-08-15 14:33               ` Emanuel Berg
2023-08-15 17:07                 ` tomas
2023-08-15 22:46                   ` Emanuel Berg
2023-08-16  1:31                 ` Po Lu
2023-08-16  1:37                   ` Emanuel Berg
2023-08-16  3:17                     ` Po Lu
2023-08-16  4:44                       ` tomas
2023-08-16  5:18                       ` Gerd Möllmann
2023-08-16  5:35                         ` Emanuel Berg
2023-08-18  7:14                           ` Simon Leinen
2023-08-19 13:10                             ` Emanuel Berg
2023-08-20  5:07                               ` Ihor Radchenko
2023-08-20  6:20                                 ` Emanuel Berg
2023-08-28  5:32                                 ` Emanuel Berg
2023-09-03  0:48                                   ` Emanuel Berg
2023-09-03  8:50                                     ` Ihor Radchenko
2023-09-03  9:05                                       ` Emanuel Berg
2023-09-03 10:30                                         ` Elisp native-comp vs. SBCL for inclist-type-hints benchmark (was: [PATCH] Re: Bignum performance) Ihor Radchenko
2023-09-04  1:03                                           ` Emanuel Berg
2023-09-03  1:57                                   ` [PATCH] Re: Bignum performance Emanuel Berg
2023-09-04  4:13                             ` Emanuel Berg
2023-08-16  5:41                         ` Gerd Möllmann
2023-08-16  6:42                         ` Po Lu
2023-08-16  8:05                           ` Gerd Möllmann

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git
