* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
@ 2023-08-14 6:28 Gerd Möllmann
2023-08-14 6:56 ` Gerd Möllmann
0 siblings, 1 reply; 47+ messages in thread
From: Gerd Möllmann @ 2023-08-14 6:28 UTC (permalink / raw)
To: incal; +Cc: emacs-devel
>> It is, but it is also tangent to comparison between Elisp
>> and CL. The main (AFAIU) difference between Elisp and CL is
>> in how the bignums are stored. Elisp uses its own internal
>> object type while CL uses GMP's native format.
Just for the record, SBCL/CMUCL don't use GMP.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
  2023-08-14  6:28 [PATCH] Re: Bignum performance (was: Shrinking the C core) Gerd Möllmann
@ 2023-08-14  6:56 ` Gerd Möllmann
  2023-08-14  7:04   ` Ihor Radchenko
  0 siblings, 1 reply; 47+ messages in thread
From: Gerd Möllmann @ 2023-08-14 6:56 UTC (permalink / raw)
To: incal; +Cc: emacs-devel

> Just for the record, SBCL/CMUCL don't use GMP.

Hm, thinking of this - did someone measure how much time is spent in
malloc/realloc/free in the benchmarks? That is what GMP uses, and SBCL
doesn't.

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
  2023-08-14  6:56 ` Gerd Möllmann
@ 2023-08-14  7:04   ` Ihor Radchenko
  2023-08-14  7:35     ` Gerd Möllmann
  0 siblings, 1 reply; 47+ messages in thread
From: Ihor Radchenko @ 2023-08-14 7:04 UTC (permalink / raw)
To: Gerd Möllmann; +Cc: incal, emacs-devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

>> Just for the record, SBCL/CMUCL don't use GMP.
>
> Hm, thinking of this - did someone measure how much time is spent in
> malloc/realloc/free in the benchmarks? That is what GMP uses, and SBCL
> doesn't.

https://yhetil.org/emacs-devel/87bkfdsmde.fsf@localhost/

We can further get rid of the GC by temporarily disabling it (just for
demonstration):

(let ((beg (float-time)))
  (setq gc-cons-threshold most-positive-fixnum)
  (fib 10000 1000)
  (message "%.3f s" (- (float-time) beg)))

perf record ~/Git/emacs/src/emacs -Q -batch -l /tmp/fib.eln
0.739 s

  17.11%  emacs  libgmp.so.10.5.0  [.] __gmpz_sizeinbase
   7.35%  emacs  libgmp.so.10.5.0  [.] __gmpz_add
   6.51%  emacs  emacs             [.] arith_driver
   6.03%  emacs  libc.so.6         [.] malloc
   5.57%  emacs  emacs             [.] allocate_vectorlike
   5.20%  emacs  [unknown]         [k] 0xffffffffaae01857
   4.16%  emacs  libgmp.so.10.5.0  [.] __gmpn_add_n_coreisbr
   3.72%  emacs  emacs             [.] check_number_coerce_marker
   3.35%  emacs  fib.eln           [.] F666962_fib_0
   3.29%  emacs  emacs             [.] allocate_pseudovector
   2.30%  emacs  emacs             [.] Flss

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
  2023-08-14  7:04   ` Ihor Radchenko
@ 2023-08-14  7:35     ` Gerd Möllmann
  2023-08-14  8:09       ` Ihor Radchenko
  0 siblings, 1 reply; 47+ messages in thread
From: Gerd Möllmann @ 2023-08-14 7:35 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: incal, emacs-devel

On 14.08.23 09:04, Ihor Radchenko wrote:
> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
>
>>> Just for the record, SBCL/CMUCL don't use GMP.
>>
>> Hm, thinking of this - did someone measure how much time is spent in
>> malloc/realloc/free in the benchmarks? That is what GMP uses, and SBCL
>> doesn't.
>
> https://yhetil.org/emacs-devel/87bkfdsmde.fsf@localhost/
>
> We can further get rid of the GC by temporarily disabling it (just for
> demonstration):
>
> (let ((beg (float-time)))
>   (setq gc-cons-threshold most-positive-fixnum)
>   (fib 10000 1000)
>   (message "%.3f s" (- (float-time) beg)))
>
> perf record ~/Git/emacs/src/emacs -Q -batch -l /tmp/fib.eln
> 0.739 s
>
>   17.11%  emacs  libgmp.so.10.5.0  [.] __gmpz_sizeinbase
>    7.35%  emacs  libgmp.so.10.5.0  [.] __gmpz_add
>    6.51%  emacs  emacs             [.] arith_driver
>    6.03%  emacs  libc.so.6         [.] malloc
>    5.57%  emacs  emacs             [.] allocate_vectorlike
>    5.20%  emacs  [unknown]         [k] 0xffffffffaae01857
>    4.16%  emacs  libgmp.so.10.5.0  [.] __gmpn_add_n_coreisbr
>    3.72%  emacs  emacs             [.] check_number_coerce_marker
>    3.35%  emacs  fib.eln           [.] F666962_fib_0
>    3.29%  emacs  emacs             [.] allocate_pseudovector
>    2.30%  emacs  emacs             [.] Flss

Hm, then maybe we can look at the disassembly of the benchmark in SBCL?
Not that in the end the compiler is so smart that it optimizes a lot of
stuff simply away because it can prove that the result of the
computations cannot possibly be observed?

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
  2023-08-14  7:35     ` Gerd Möllmann
@ 2023-08-14  8:09       ` Ihor Radchenko
  2023-08-14  9:28         ` Gerd Möllmann
  0 siblings, 1 reply; 47+ messages in thread
From: Ihor Radchenko @ 2023-08-14 8:09 UTC (permalink / raw)
To: Gerd Möllmann; +Cc: incal, emacs-devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Hm, then maybe we can look at the disassembly of the benchmark in SBCL?
> Not that in the end the compiler is so smart that it optimizes a lot
> of stuff simply away because it can prove that the result of the
> computations cannot possibly be observed?

Sorry, but I do not know how to do it. Not familiar with CL.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
  2023-08-14  8:09       ` Ihor Radchenko
@ 2023-08-14  9:28         ` Gerd Möllmann
  2023-08-14  9:42           ` Ihor Radchenko
  2023-08-14 16:51           ` Emanuel Berg
  0 siblings, 2 replies; 47+ messages in thread
From: Gerd Möllmann @ 2023-08-14 9:28 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: incal, emacs-devel

On 14.08.23 10:09, Ihor Radchenko wrote:
> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
>
>> Hm, then maybe we can look at the disassembly of the benchmark in SBCL?
>> Not that in the end the compiler is so smart that it optimizes a lot
>> of stuff simply away because it can prove that the result of the
>> computations cannot possibly be observed?
>
> Sorry, but I do not know how to do it. Not familiar with CL.

Ok. I used the code from https://dataswamp.org/~incal/cl/fib.cl/fib.cl.
And the first thing that stares at me in this function:

(defun fib (reps num)
  (declare (optimize speed (safety 0) (debug 0)))
  (let ((z 0))
    (declare (type (unsigned-byte 53) reps num z))
    (dotimes (r reps)
      (let* ((p1 1)
             (p2 1))
        (dotimes (i (- num 2))
          (setf z (+ p1 p2)
                p2 p1
                p1 z))))
    z))

is the declaration (unsigned-byte 53).

The declaration means we are lying to the compiler because Z gets bigger
than 53 bits eventually. And all bets are off because of the OPTIMIZE
declaration. The result is that everything is done in fixnums on 64-bit
machines.

; disassembly for FIB
; Size: 92 bytes. Origin: #x700530086C              ; FIB
; 6C:       030080D2   MOVZ NL3, #0
; 70:       040080D2   MOVZ NL4, #0
; 74:       0E000014   B L3
; 78: L0:   410080D2   MOVZ NL1, #2
; 7C:       E20301AA   MOV NL2, NL1
; 80:       EB030CAA   MOV R1, R2
; 84:       651100D1   SUB NL5, R1, #4
; 88:       000080D2   MOVZ NL0, #0
; 8C:       05000014   B L2
; 90: L1:   2300028B   ADD NL3, NL1, NL2
; 94:       E20301AA   MOV NL2, NL1
; 98:       E10303AA   MOV NL1, NL3
; 9C:       00080091   ADD NL0, NL0, #2
; A0: L2:   1F0005EB   CMP NL0, NL5
; A4:       6BFFFF54   BLT L1
; A8:       84080091   ADD NL4, NL4, #2
; AC: L3:   9F000AEB   CMP NL4, R0
; B0:       4BFEFF54   BLT L0
; B4:       EA0303AA   MOV R0, NL3
; B8:       FB031AAA   MOV CSP, CFP
; BC:       5A7B40A9   LDP CFP, LR, [CFP]
; C0:       BF0300F1   CMP NULL, #0
; C4:       C0035FD6   RET

Tada!

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
  2023-08-14  9:28         ` Gerd Möllmann
@ 2023-08-14  9:42           ` Ihor Radchenko
  2023-08-15 14:03             ` Emanuel Berg
  0 siblings, 1 reply; 47+ messages in thread
From: Ihor Radchenko @ 2023-08-14 9:42 UTC (permalink / raw)
To: Gerd Möllmann; +Cc: incal, emacs-devel

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Ok. I used the code from https://dataswamp.org/~incal/cl/fib.cl/fib.cl.
> ...
> The declaration means we are lying to the compiler because Z gets bigger
> than 53 bits eventually. And all bets are off because of the OPTIMIZE
> declaration. The result is that everything is done in fixnums on 64-bit
> machines.

That explains a lot :)

I now tried

(defun fib (reps num)
  (declare (optimize speed (safety 0) (debug 0)))
  (let ((z 0))
    ;; (declare (type (unsigned-byte 53) reps num z))
    (dotimes (r reps)
      (let* ((p1 1)
             (p2 1))
        (dotimes (i (- num 2))
          (setf z (+ p1 p2)
                p2 p1
                p1 z))))
    z))

and got

$ SBCL_HOME=/usr/lib64/sbcl perf record sbcl --load /tmp/fib.cl
;;; 0.263333 s real time
;;; 0.263641 s run time

$ ~/Git/emacs/src/emacs -Q -batch -l /tmp/fib.eln
0.739 s

Still ~3x faster compared to Elisp, but not orders of magnitude.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
  2023-08-14  9:42           ` Ihor Radchenko
@ 2023-08-15 14:03             ` Emanuel Berg
  2023-08-15 15:01               ` Ihor Radchenko
  0 siblings, 1 reply; 47+ messages in thread
From: Emanuel Berg @ 2023-08-15 14:03 UTC (permalink / raw)
To: emacs-devel

Ihor Radchenko wrote:

>> The declaration means we are lying to the compiler because
>> Z gets bigger than 53 bits eventually. And all bets are off
>> because of the OPTIMIZE declaration. The result is that
>> everything is done in fixnums on 64-bit machines.
>
> That explains a lot :)
>
> I now tried
>
> (defun fib (reps num)
>   (declare (optimize speed (safety 0) (debug 0)))
>   (let ((z 0))
>     ;; (declare (type (unsigned-byte 53) reps num z))
>     (dotimes (r reps)
>       (let* ((p1 1)
>              (p2 1))
>         (dotimes (i (- num 2))
>           (setf z (+ p1 p2)
>                 p2 p1
>                 p1 z))))
>     z))
>
> and got
>
> $ SBCL_HOME=/usr/lib64/sbcl perf record sbcl --load /tmp/fib.cl
>
> ;;; 0.263333 s real time
> ;;; 0.263641 s run time
>
> $ ~/Git/emacs/src/emacs -Q -batch -l /tmp/fib.eln
> 0.739 s
>
> Still ~3x faster compared to Elisp, but not orders
> of magnitude.

A pretty good optimization! :O

But what kind of optimization is it?

Also, what happens if you remove the OPTIMIZE declaration
as well?

Still, isn't the rule of the "beat the benchmark" game to beat
it as fast as possible?

-- 
underground experts united
https://dataswamp.org/~incal

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
  2023-08-15 14:03             ` Emanuel Berg
@ 2023-08-15 15:01               ` Ihor Radchenko
  2023-08-15 22:21                 ` Emanuel Berg
  2023-08-15 22:33                 ` Emanuel Berg
  0 siblings, 2 replies; 47+ messages in thread
From: Ihor Radchenko @ 2023-08-15 15:01 UTC (permalink / raw)
To: Emanuel Berg; +Cc: emacs-devel

Emanuel Berg <incal@dataswamp.org> writes:

>> ;; (declare (type (unsigned-byte 53) reps num z))
>> ...
>> Still ~3x faster compared to Elisp, but not orders
>> of magnitude.
>
> A pretty good optimization! :O
>
> But what kind of optimization is it?

The commented "optimization" is: "Hey, SBCL, do not use
bignums. If ints overflow, so be it".

> Also, what happens if you remove the OPTIMIZE declaration
> as well?

No difference.

> Still, isn't the rule of the "beat the benchmark" game to beat
> it as fast as possible?

Yes, but when SBCL is orders of magnitude faster, it indicates
something conceptually wrong in the algo. 3x is a matter of
variation in the internal details (like extra type checking in
Elisp that Po Lu outlined).

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
  2023-08-15 15:01               ` Ihor Radchenko
@ 2023-08-15 22:21                 ` Emanuel Berg
  0 siblings, 0 replies; 47+ messages in thread
From: Emanuel Berg @ 2023-08-15 22:21 UTC (permalink / raw)
To: emacs-devel

Ihor Radchenko wrote:

>> A pretty good optimization! :O
>>
>> But what kind of optimization is it?
>
> The commented "optimization" is: "Hey, SBCL, do not use
> bignums. If ints overflow, so be it".

? But then how can the algorithm execute correctly?

>> Still, isn't the rule of the "beat the benchmark" game to
>> beat it as fast as possible?
>
> Yes, but when SBCL is orders of magnitude faster, it
> indicates something conceptually wrong in the algo. 3x is
> a matter of variation in the internal details (like extra
> type checking in Elisp that Po Lu outlined).

If you are saying the algorithm doesn't output correct data
for the conventional conception of the Fibonacci algorithm,
then that optimization and whatever time it makes isn't valid;
I'll remove it this instant.

Hm, maybe we need something like unit testing to confirm that
the algorithms perform not just fast, but also as intended to
solve whatever problem they were designed to ...

-- 
underground experts united
https://dataswamp.org/~incal

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
  2023-08-15 15:01               ` Ihor Radchenko
  2023-08-15 22:21                 ` Emanuel Berg
@ 2023-08-15 22:33                 ` Emanuel Berg
  2023-08-16  4:36                   ` tomas
  1 sibling, 1 reply; 47+ messages in thread
From: Emanuel Berg @ 2023-08-15 22:33 UTC (permalink / raw)
To: emacs-devel

Ihor Radchenko wrote:

> Yes, but when SBCL is orders of magnitude faster, it
> indicates something conceptually wrong in the algo.

Indeed, I'll remove it, thanks.

But my CL skills aren't at that level, so someone else added
it. A strange optimization indeed, that breaks the code. Hm,
maybe not that unusual when I think about it. But that is for
normal code, not supposed benchmarks ...

So this is the explanation for the +78 875% speed disadvantage
for Elisp! As reported a long time ago when comparing Elisp
and CL. I.e., what is documented in this file

  https://dataswamp.org/~incal/emacs-init/fib.el

and discussed here (or be it gnu.emacs.help) several
months ago.

\o/ Fires up a cigar!

Always a pleasure when a mystery gets solved ... but TBH
I actually believed Elisp was that much slower. Turns out, the
CL implementation wasn't even correct. Bummer, but ultimately
good for us as it turned out.

> 3x is a matter of variation in the internal details (like
> extra type checking in Elisp that Po Lu outlined).

I'll remove the supposed optimization, and we'll take it from
there. We (you) have already improved the bignum object
allocation reuse and Mr. Möllmann solved this issue, so we
have a positive trajectory already. But 78 875% or 3x doesn't
matter in principle, we do it until we are done regardless ...

-- 
underground experts united
https://dataswamp.org/~incal

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
  2023-08-15 22:33                 ` Emanuel Berg
@ 2023-08-16  4:36                   ` tomas
  2023-08-16  5:23                     ` Emanuel Berg
  0 siblings, 1 reply; 47+ messages in thread
From: tomas @ 2023-08-16 4:36 UTC (permalink / raw)
To: emacs-devel

On Wed, Aug 16, 2023 at 12:33:33AM +0200, Emanuel Berg wrote:
> Ihor Radchenko wrote:
>
>> Yes, but when SBCL is orders of magnitude faster, it
>> indicates something conceptually wrong in the algo.
>
> Indeed, I'll remove it, thanks.
>
> But my CL skills aren't at that level so someone else added
> it. A strange optimization indeed, that breaks the code.

It only breaks the code if you "don't know what you are
doing".

See, without the optimization the code will have, at each and
every arithmetic operation, to check "Hmm... Is this thing
going to overflow? Hm. It might, so better use bignums. Phew,
it didn't, so back to fixnums".

Now we know that modern CPU architectures have a hard time
with conditional statements (pipeline stalls, branch
mispredictions, all that nasty stuff). So this "Hmm..." above
is costing real money. Even in cases you won't need it,
because things ain't gonna overflow.

The compiler tries to do a good job of looking into
calculations and deciding "this incf down there won't ever
push us over the fixnum limit, because we know we are starting
with a number below 10". But the programmer sometimes has more
knowledge and can prove that things won't overflow, ever. Or
that, should things overflow, it won't matter anyway. It's for
those cases that this kind of optimization is made.

C, by the way, always runs in this mode. Unsigned integers
will silently wrap around, that's documented behaviour. Signed
integers will do whatever their thing is (technically this is
called "undefined behaviour").

Perhaps you wanted just to compute fib modulo some big power
of two? Then your program was correct, after all...

Cheers
-- t

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
  2023-08-16  4:36                   ` tomas
@ 2023-08-16  5:23                     ` Emanuel Berg
  0 siblings, 0 replies; 47+ messages in thread
From: Emanuel Berg @ 2023-08-16 5:23 UTC (permalink / raw)
To: emacs-devel

tomas wrote:

> Perhaps you wanted just to compute fib modulo some big power
> of two? Then your program was correct, after all...

So, it works for fixnums but not for bignums?

The code must always work for valid indata; if it doesn't,
even for a single such case, the optimization breaks the
algorithm and will be removed.

Maybe we can duplicate it and remove the declaration in one of
them. So we would have one Fibonacci to test the speed of
fixnums, and one for bignums. In the fixnums one, the
declaration would still be legal.

-- 
underground experts united
https://dataswamp.org/~incal

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
  2023-08-14  9:28         ` Gerd Möllmann
  2023-08-14  9:42           ` Ihor Radchenko
@ 2023-08-14 16:51           ` Emanuel Berg
  2023-08-15  4:58             ` Gerd Möllmann
  2023-08-15  6:26             ` [PATCH] Re: Bignum performance Po Lu
  1 sibling, 2 replies; 47+ messages in thread
From: Emanuel Berg @ 2023-08-14 16:51 UTC (permalink / raw)
To: emacs-devel

Gerd Möllmann wrote:

>> Sorry, but I do not know how to do it. Not familiar
>> with CL.
>
> Ok. I used the code from
> https://dataswamp.org/~incal/cl/fib.cl/fib.cl

Yikes, how did that happen, some slip involving symbolic
links ... Here it is:

  https://dataswamp.org/~incal/cl/bench/fib.cl

And timing is done with this:

  https://dataswamp.org/~incal/cl/bench/timing.cl

Note the ugly absolute path in fib.cl BTW, otherwise you get
the path not of the file but of SBCL or Slime, maybe. I think
one is supposed to use ASDF, but surely there must be some
easy way to just load a file using a relative path to the
current file?

(load "~/public_html/cl/bench/timing.cl")

> is the declaration (unsigned-byte 53).
>
> The declaration means we are lying to the compiler because
> Z gets bigger than 53 bits eventually. And all bets are off
> because of the OPTIMIZE declaration. The result is that
> everything is done in fixnums on 64-bit machines.

A very impressive optimization indeed, and expressed in
a cryptic way.

> ; disassembly for FIB
> ; Size: 92 bytes. Origin: #x700530086C              ; FIB
> ; 6C:       030080D2   MOVZ NL3, #0
> ; 70:       040080D2   MOVZ NL4, #0
> ; 74:       0E000014   B L3
> ; 78: L0:   410080D2   MOVZ NL1, #2
> ; 7C:       E20301AA   MOV NL2, NL1
> ; 80:       EB030CAA   MOV R1, R2
> ; 84:       651100D1   SUB NL5, R1, #4
> ; 88:       000080D2   MOVZ NL0, #0
> ; 8C:       05000014   B L2
> ; 90: L1:   2300028B   ADD NL3, NL1, NL2
> ; 94:       E20301AA   MOV NL2, NL1
> ; 98:       E10303AA   MOV NL1, NL3
> ; 9C:       00080091   ADD NL0, NL0, #2
> ; A0: L2:   1F0005EB   CMP NL0, NL5
> ; A4:       6BFFFF54   BLT L1
> ; A8:       84080091   ADD NL4, NL4, #2
> ; AC: L3:   9F000AEB   CMP NL4, R0
> ; B0:       4BFEFF54   BLT L0
> ; B4:       EA0303AA   MOV R0, NL3
> ; B8:       FB031AAA   MOV CSP, CFP
> ; BC:       5A7B40A9   LDP CFP, LR, [CFP]
> ; C0:       BF0300F1   CMP NULL, #0
> ; C4:       C0035FD6   RET
>
> Tada!

How do you see only fixnums are used?

Are we talking 1 word = 2 bytes = 16 bits here, s2c?

If so, the range of fixnums is -32 768 to 32 767 inclusive,
so those are hardly huge numbers.

-- 
underground experts united
https://dataswamp.org/~incal

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
  2023-08-14 16:51           ` Emanuel Berg
@ 2023-08-15  4:58             ` Gerd Möllmann
  2023-08-15 14:20               ` Emanuel Berg
  0 siblings, 1 reply; 47+ messages in thread
From: Gerd Möllmann @ 2023-08-15 4:58 UTC (permalink / raw)
To: incal; +Cc: emacs-devel

> How do you see only fixnums are used?

By reading the assembly, and remembering a thing or two from
my times as a CMUCL contributor, e.g. its fixnum
representation. SBCL is a fork of CMUCL.

> Are we talking 1 word = 2 bytes = 16 bits here, s2c?
>
> If so, the range of fixnums are -32 768 to 32 767 inclusive,
> so those are hardly huge numbers.

It's a 64-bit machine, more specifically an M1, i.e. arm64.
How you get from there to these 16 bits escapes me.

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
  2023-08-15  4:58             ` Gerd Möllmann
@ 2023-08-15 14:20               ` Emanuel Berg
  0 siblings, 0 replies; 47+ messages in thread
From: Emanuel Berg @ 2023-08-15 14:20 UTC (permalink / raw)
To: emacs-devel

Gerd Möllmann wrote:

>> Are we talking 1 word = 2 bytes = 16 bits here, s2c? If so,
>> the range of fixnums are -32 768 to 32 767 inclusive, so
>> those are hardly huge numbers.
>
> It's a 64-bit machine, more specifically an M1, i.e. arm64.
> How you get from there to these 16 bits escapes me.

Here [1] it says a word for ARM is 32 bits. So to store zero
in a fixnum word it will look like this

  00000000 00000000 00000000 00000000

If one bit, the MSB, is used as the sign bit, the interval,
inclusive, is

  (list (* -1 (expt 2 (1- 32)))
        (1- (expt 2 (1- 32))) )
  ; (-2147483648 2147483647)

Okay, now we are talking some pretty big numbers, although
I have seen even bigger ...

[1] https://modexp.wordpress.com/2018/10/30/arm64-assembly/

-- 
underground experts united
https://dataswamp.org/~incal

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance
  2023-08-14 16:51           ` Emanuel Berg
  2023-08-15  4:58             ` Gerd Möllmann
@ 2023-08-15  6:26             ` Po Lu
  2023-08-15 14:33               ` Emanuel Berg
  1 sibling, 1 reply; 47+ messages in thread
From: Po Lu @ 2023-08-15 6:26 UTC (permalink / raw)
To: emacs-devel

Emanuel Berg <incal@dataswamp.org> writes:

> Are we talking 1 word = 2 bytes = 16 bits here, s2c?
>
> If so, the range of fixnums are -32 768 to 32 767 inclusive,
> so those are hardly huge numbers.

Under Arm64, general purpose integer registers are 64 bits
wide. That is also the word size of said machine.

I agree that vague terminology such as ``word'' is confusing,
because of its use by the Unix assembler.

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance
  2023-08-15  6:26             ` [PATCH] Re: Bignum performance Po Lu
@ 2023-08-15 14:33               ` Emanuel Berg
  2023-08-15 17:07                 ` tomas
  2023-08-16  1:31                 ` Po Lu
  0 siblings, 2 replies; 47+ messages in thread
From: Emanuel Berg @ 2023-08-15 14:33 UTC (permalink / raw)
To: emacs-devel

Po Lu wrote:

>> Are we talking 1 word = 2 bytes = 16 bits here, s2c?
>>
>> If so, the range of fixnums are -32 768 to 32 767
>> inclusive, so those are hardly huge numbers.
>
> Under Arm64, general purpose integer registers are 64 bits
> wide. That is also the word size of said machine.

If they are, the range for fixnums is

  (list (* -1 (expt 2 (1- 64)))
        (1- (expt 2 (1- 64))) )
  ; (-9223372036854775808 9223372036854775807)

Only after that it gets slower :P

-- 
underground experts united
https://dataswamp.org/~incal

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance
  2023-08-15 14:33               ` Emanuel Berg
@ 2023-08-15 17:07                 ` tomas
  2023-08-15 22:46                   ` Emanuel Berg
  0 siblings, 1 reply; 47+ messages in thread
From: tomas @ 2023-08-15 17:07 UTC (permalink / raw)
To: emacs-devel

On Tue, Aug 15, 2023 at 04:33:04PM +0200, Emanuel Berg wrote:
> Po Lu wrote:
>
>>> Are we talking 1 word = 2 bytes = 16 bits here, s2c?
>>>
>>> If so, the range of fixnums are -32 768 to 32 767
>>> inclusive, so those are hardly huge numbers.
>>
>> Under Arm64, general purpose integer registers are 64 bits
>> wide. That is also the word size of said machine.
>
> If they are, the range for fixnums is
>
>   (list (* -1 (expt 2 (1- 64)))
>         (1- (expt 2 (1- 64))) )
>   ; (-9223372036854775808 9223372036854775807)
>
> Only after that it gets slower :P

Unless SBCL uses tagged object representation. Unless the
compiler can prove that some "thing" is going to be an int.
Unless...

Cheers
-- t

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance
  2023-08-15 17:07                 ` tomas
@ 2023-08-15 22:46                   ` Emanuel Berg
  0 siblings, 0 replies; 47+ messages in thread
From: Emanuel Berg @ 2023-08-15 22:46 UTC (permalink / raw)
To: emacs-devel

tomas wrote:

>>>> Are we talking 1 word = 2 bytes = 16 bits here, s2c?
>>>>
>>>> If so, the range of fixnums are -32 768 to 32 767
>>>> inclusive, so those are hardly huge numbers.
>>>
>>> Under Arm64, general purpose integer registers are 64 bits
>>> wide. That is also the word size of said machine.
>>
>> If they are, the range for fixnums is
>>
>>   (list (* -1 (expt 2 (1- 64)))
>>         (1- (expt 2 (1- 64))) )
>>   ; (-9223372036854775808 9223372036854775807)
>>
>> Only after that it gets slower :P
>
> Unless SBCL uses tagged object representation. Unless the
> compiler can prove that some "thing" is going to be an int.
> Unless...

Actually the rules are quite simple. As long as the expected
execution and correct return value is observed and ultimately
achieved by the algorithm, all optimizations are fair.

-- 
underground experts united
https://dataswamp.org/~incal

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance
  2023-08-15 14:33               ` Emanuel Berg
  2023-08-15 17:07                 ` tomas
@ 2023-08-16  1:31                 ` Po Lu
  2023-08-16  1:37                   ` Emanuel Berg
  1 sibling, 1 reply; 47+ messages in thread
From: Po Lu @ 2023-08-16 1:31 UTC (permalink / raw)
To: emacs-devel

Emanuel Berg <incal@dataswamp.org> writes:

> Po Lu wrote:
>
>>> Are we talking 1 word = 2 bytes = 16 bits here, s2c?
>>>
>>> If so, the range of fixnums are -32 768 to 32 767
>>> inclusive, so those are hardly huge numbers.
>>
>> Under Arm64, general purpose integer registers are 64 bits
>> wide. That is also the word size of said machine.
>
> If they are, the range for fixnums is
>
>   (list (* -1 (expt 2 (1- 64)))
>         (1- (expt 2 (1- 64))) )
>   ; (-9223372036854775808 9223372036854775807)
>
> Only after that it gets slower :P

Lisp systems normally set aside several of the high or low
bits of a register as a tag linking a type to the object
represented.

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance
  2023-08-16  1:31                 ` Po Lu
@ 2023-08-16  1:37                   ` Emanuel Berg
  2023-08-16  3:17                     ` Po Lu
  0 siblings, 1 reply; 47+ messages in thread
From: Emanuel Berg @ 2023-08-16 1:37 UTC (permalink / raw)
To: emacs-devel

Po Lu wrote:

>>> Under Arm64, general purpose integer registers are 64 bits
>>> wide. That is also the word size of said machine.
>>
>> If they are, the range for fixnums is
>>
>>   (list (* -1 (expt 2 (1- 64)))
>>         (1- (expt 2 (1- 64))) )
>>   ; (-9223372036854775808 9223372036854775807)
>>
>> Only after that it gets slower :P
>
> Lisp systems normally set aside several of the high or low
> bits of a register as a tag linking a type to the
> object represented.

But here we are at the CPU architecture level (register
length); surely Lisp doesn't meddle with that?

No, I sense that it does, actually. So please explain, then,
how it works. And in particular, how many bits do we (Elisp
and CL) actually have for our fixnums? Or are we talking
bignums now?

-- 
underground experts united
https://dataswamp.org/~incal

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance
  2023-08-16  1:37                   ` Emanuel Berg
@ 2023-08-16  3:17                     ` Po Lu
  2023-08-16  4:44                       ` tomas
  2023-08-16  5:18                       ` Gerd Möllmann
  0 siblings, 2 replies; 47+ messages in thread
From: Po Lu @ 2023-08-16 3:17 UTC (permalink / raw)
To: emacs-devel

Emanuel Berg <incal@dataswamp.org> writes:

> Po Lu wrote:
>
>> Lisp systems normally set aside several of the high or low
>> bits of a register as a tag linking a type to the
>> object represented.
>
> But here we are at the CPU architecture level (register
> length), surely Lisp don't meddle with that?
>
> No, I sense that it is, actually. So please explain, then, how
> it works. And in particular, how many bits do we (Elisp and
> CL) actually have for our fixnums?

I don't know about SBCL, but as for Emacs, refer to the
definition of VALBITS in lisp.h (maybe also the right files
among m/*.h and s/*.h, but I have no idea where they've
disappeared to.)

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance
  2023-08-16  3:17                     ` Po Lu
@ 2023-08-16  4:44                       ` tomas
  1 sibling, 0 replies; 47+ messages in thread
From: tomas @ 2023-08-16 4:44 UTC (permalink / raw)
To: Po Lu; +Cc: emacs-devel

On Wed, Aug 16, 2023 at 11:17:01AM +0800, Po Lu wrote:
> Emanuel Berg <incal@dataswamp.org> writes:
>
>> Po Lu wrote:
[...]
>>> Lisp systems normally set aside several of the high or low
>>> bits of a register as a tag linking a type to the
>>> object represented.
>>
>> But here we are at the CPU architecture level (register
>> length), surely Lisp don't meddle with that?
>>
>> No, I sense that it is, actually. So please explain, then, how
>> it works. And in particular, how many bits do we (Elisp and
>> CL) actually have for our fixnums?
>
> I don't know about SBCL, but as for Emacs, refer to the
> definition of VALBITS in lisp.h (maybe also the right files
> among m/*.h and s/*.h, but I have no idea where they've
> disappeared to.)

That is what I was hinting at with "tagged representation":
Emacs Lisp does it, we don't know about SBCL.

Typically, a good implementation has small stretches of code
where the values are as-is because the compiler can prove what
their type is (fixnum, whatever). But that means that fixnums
are usually limited to less than the full machine word's width
(e.g. 60 bits if your tag is four bits wide), because your
Lisp has to be able to stuff them back into such a place.

Cheers
-- t

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance
  2023-08-16  3:17                     ` Po Lu
  2023-08-16  4:44                       ` tomas
@ 2023-08-16  5:18                       ` Gerd Möllmann
  2023-08-16  5:35                         ` Emanuel Berg
    ` (2 more replies)
  1 sibling, 3 replies; 47+ messages in thread
From: Gerd Möllmann @ 2023-08-16 5:18 UTC (permalink / raw)
To: luangruo; +Cc: emacs-devel

> I don't know about SBCL, but as for Emacs, refer to the
> definition of VALBITS in lisp.h (maybe also the right files
> among m/*.h and s/*.h, but I have no idea where they've
> disappeared to.)

The SBCL I have here, a Homebrew installation, uses the scheme
where

  ....0 -> fixnum
  ....1 -> other objects (discriminated by additional tag bits)

I had the same for Emacs in the branch gerd_int in the 2000s,
if memory serves me.

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance 2023-08-16 5:18 ` Gerd Möllmann @ 2023-08-16 5:35 ` Emanuel Berg 2023-08-18 7:14 ` Simon Leinen 2023-08-16 5:41 ` Gerd Möllmann 2023-08-16 6:42 ` Po Lu 2 siblings, 1 reply; 47+ messages in thread From: Emanuel Berg @ 2023-08-16 5:35 UTC (permalink / raw) To: emacs-devel Gerd Möllmann wrote: >> I don't know about SBCL, but as for Emacs, refer to the >> definition of VALBITS in lisp.h (maybe also the right files >> among m/*.h and s/*.h, but I have no idea where they've >> disappeared to.) > > The SBCL I have here, a Homebrew installation, uses the > scheme where > > ....0 -> fixnum > ....1 -> other objects (discriminated by additional tag bits) > > I had the same for Emacs in the branch gerd_int in the > 2000s, if memory serves me. Ah, there is `fixnump' (and `bignump') to test for a specific number, (fixnump (expt 2 60)) ; t (fixnump (expt 2 61)) ; nil 2^60 = 1152921504606846976, so that's pretty big. -- underground experts united https://dataswamp.org/~incal ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance 2023-08-16 5:35 ` Emanuel Berg @ 2023-08-18 7:14 ` Simon Leinen 2023-08-19 13:10 ` Emanuel Berg 2023-09-04 4:13 ` Emanuel Berg 0 siblings, 2 replies; 47+ messages in thread From: Simon Leinen @ 2023-08-18 7:14 UTC (permalink / raw) To: emacs-devel [-- Attachment #1: Type: text/plain, Size: 1215 bytes --] Emacs also has `most-negative-fixnum' and `most-positive-fixnum' (borrowed from Common Lisp but now part of the core). On my 64-bit system (GNU Emacs 30.0.50 on aarch64-apple-darwin22.5.0): (= (- (expt 2 61)) most-negative-fixnum) ⇒ t (= (1- (expt 2 61)) most-positive-fixnum) ⇒ t (Same as on Emanuel's.) -- Simon. On Wed, Aug 16, 2023 at 1:12 PM Emanuel Berg <incal@dataswamp.org> wrote: > Gerd Möllmann wrote: > > >> I don't know about SBCL, but as for Emacs, refer to the > >> definition of VALBITS in lisp.h (maybe also the right files > >> among m/*.h and s/*.h, but I have no idea where they've > >> disappeared to.) > > > > The SBCL I have here, a Homebrew installation, uses the > > scheme where > > > > ....0 -> fixnum > > ....1 -> other objects (discriminated by additional tag bits) > > > > I had the same for Emacs in the branch gerd_int in the > > 2000s, if memory serves me. > > Ah, there is `fixnump' (and `bignump') to test for a specific > number, > > (fixnump (expt 2 60)) ; t > (fixnump (expt 2 61)) ; nil > > 2^60 = 1152921504606846976, so that's pretty big. > > -- > underground experts united > https://dataswamp.org/~incal > > > [-- Attachment #2: Type: text/html, Size: 1806 bytes --] ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance 2023-08-18 7:14 ` Simon Leinen @ 2023-08-19 13:10 ` Emanuel Berg 2023-08-20 5:07 ` Ihor Radchenko 2023-09-04 4:13 ` Emanuel Berg 1 sibling, 1 reply; 47+ messages in thread From: Emanuel Berg @ 2023-08-19 13:10 UTC (permalink / raw) To: emacs-devel Simon Leinen wrote: > Emacs also has `most-negative-fixnum' and > `most-positive-fixnum' (borrowed from Common Lisp but now > part of the core). > > On my 64-bit system (GNU Emacs 30.0.50 on > aarch64-apple-darwin22.5.0): > > (= (- (expt 2 61)) most-negative-fixnum) → t > (= (1- (expt 2 61)) most-positive-fixnum) → t > > (Same as on Emanuel's.) At least that computation was correct then! Now that Gerd has solved the little mystery why Fibonacci was seemingly so much faster on Common Lisp and we also got that performance gain patch from Ihor I don't know how much sense it makes to continue translating the emacs-benchmarks to Common Lisp, or rather if anyone is motivated enough to do it, but this is how far I got: https://dataswamp.org/~incal/cl/bench/ -- underground experts united https://dataswamp.org/~incal ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance 2023-08-19 13:10 ` Emanuel Berg @ 2023-08-20 5:07 ` Ihor Radchenko 2023-08-20 6:20 ` Emanuel Berg 2023-08-28 5:32 ` Emanuel Berg 0 siblings, 2 replies; 47+ messages in thread From: Ihor Radchenko @ 2023-08-20 5:07 UTC (permalink / raw) To: Emanuel Berg; +Cc: emacs-devel Emanuel Berg <incal@dataswamp.org> writes: > Now that Gerd has solved the little mystery why Fibonacci was > seemingly so much faster on Common Lisp and we also got that > performance gain patch from Ihor I don't know how much sense > it makes to continue translating the emacs-benchmarks to > Common Lisp, or rather if anyone is motivated enough to do it, > but this is how far I got: > > https://dataswamp.org/~incal/cl/bench/ You can again compare Elisp with CL and let us know what is being noticeably slower. It will be an indication that something might be improved. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance 2023-08-20 5:07 ` Ihor Radchenko @ 2023-08-20 6:20 ` Emanuel Berg 2023-08-28 5:32 ` Emanuel Berg 1 sibling, 0 replies; 47+ messages in thread From: Emanuel Berg @ 2023-08-20 6:20 UTC (permalink / raw) To: emacs-devel Ihor Radchenko wrote: >> Now that Gerd has solved the little mystery why Fibonacci >> was seemingly so much faster on Common Lisp and we also got >> that performance gain patch from Ihor I don't know how much >> sense it makes to continue translating the emacs-benchmarks >> to Common Lisp, or rather if anyone is motivated enough to >> do it, but this is how far I got: >> >> https://dataswamp.org/~incal/cl/bench/ > > You can again compare Elisp with CL and let us know what is > being noticeably slower. It will be an indication that > something might be improved. Thanks, you are right, hopefully it will happen. -- underground experts united https://dataswamp.org/~incal ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance 2023-08-20 5:07 ` Ihor Radchenko 2023-08-20 6:20 ` Emanuel Berg @ 2023-08-28 5:32 ` Emanuel Berg 2023-09-03 0:48 ` Emanuel Berg 2023-09-03 1:57 ` [PATCH] Re: Bignum performance Emanuel Berg 1 sibling, 2 replies; 47+ messages in thread From: Emanuel Berg @ 2023-08-28 5:32 UTC (permalink / raw) To: emacs-devel Ihor Radchenko wrote: > You can again compare Elisp with CL and let us know what is > being noticeably slower. It will be an indication that > something might be improved. Here is a new file: https://dataswamp.org/~incal/cl/bench/flet.cl The CL time is 0.84 vs Elisp time at 1.35. So here, CL is 61% faster! (format "%d%%" (round (* 100 (1- (/ 1.35 0.84))))) flet 0.839996 s real time 0.839262 s run time -- underground experts united https://dataswamp.org/~incal ^ permalink raw reply [flat|nested] 47+ messages in thread
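The percentage above comes straight from the ratio of the run times; the Elisp `format' call in the message, sketched in Python:

```python
def percent_faster(slower, faster):
    """Percent by which the faster time beats the slower one,
    relative to the faster time: 100 * (slower / faster - 1)."""
    return round(100 * (slower / faster - 1))

# the flet benchmark above: Elisp 1.35 s vs. CL 0.84 s
print(f"{percent_faster(1.35, 0.84)}%")  # prints 61%
```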
* Re: [PATCH] Re: Bignum performance 2023-08-28 5:32 ` Emanuel Berg @ 2023-09-03 0:48 ` Emanuel Berg 2023-09-03 8:50 ` Ihor Radchenko 0 siblings, 1 reply; 47+ messages in thread From: Emanuel Berg @ 2023-09-03 0:48 UTC (permalink / raw) To: emacs-devel Ihor Radchenko wrote: > You can again compare Elisp with CL and let us know what is > being noticeably slower. It will be an indication that > something might be improved. Here is yet another file, https://dataswamp.org/~incal/cl/bench/inclist.cl The Elisp time for this benchmark is, with `native-comp-speed' at the default - for the benchmarks package - maximal optimization level, namely 3 - at 5.86 sec, while the CL is at 1.635993 sec. So here, CL is 258% faster. I run the benchmarks from Emacs, and the CL also from Emacs, with SLIME and SBCL. So that should be pretty fair, but maybe when all benchmarks are translated one could do the Elisp with batch and the SBCL not using Emacs at all. -- underground experts united https://dataswamp.org/~incal ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance 2023-09-03 0:48 ` Emanuel Berg @ 2023-09-03 8:50 ` Ihor Radchenko 2023-09-03 9:05 ` Emanuel Berg 0 siblings, 1 reply; 47+ messages in thread From: Ihor Radchenko @ 2023-09-03 8:50 UTC (permalink / raw) To: Emanuel Berg; +Cc: emacs-devel Emanuel Berg <incal@dataswamp.org> writes: > Her is yet another file, > > https://dataswamp.org/~incal/cl/bench/inclist.cl Unfortunately, I cannot use this file because it is referring to ~/public_html/cl/bench/timing.cl, which I do not have. > The Elisp time for this benchmark is, with `native-comp-speed' > at the default - for the benchmarks package - maximal > optimization level, namely 3 - at 5.86 sec, while the CL is at > 1.635993 sec. For me, Elisp runs in 1.2 sec. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance 2023-09-03 8:50 ` Ihor Radchenko @ 2023-09-03 9:05 ` Emanuel Berg 2023-09-03 10:30 ` Elisp native-comp vs. SBCL for inclist-type-hints benchmark (was: [PATCH] Re: Bignum performance) Ihor Radchenko 0 siblings, 1 reply; 47+ messages in thread From: Emanuel Berg @ 2023-09-03 9:05 UTC (permalink / raw) To: emacs-devel Ihor Radchenko wrote: >> Here is yet another file, >> >> https://dataswamp.org/~incal/cl/bench/inclist.cl > > Unfortunately, I cannot use this file because it is referring to > ~/public_html/cl/bench/timing.cl, which I do not have. It is in the same directory, https://dataswamp.org/~incal/cl/bench/timing.cl but I don't know how to avoid using an absolute path when loading it in CL; maybe it cannot be done without ASDF or some other such solution. So you have to set that manually for now. -- underground experts united https://dataswamp.org/~incal ^ permalink raw reply [flat|nested] 47+ messages in thread
* Elisp native-comp vs. SBCL for inclist-type-hints benchmark (was: [PATCH] Re: Bignum performance) 2023-09-03 9:05 ` Emanuel Berg @ 2023-09-03 10:30 ` Ihor Radchenko 2023-09-04 1:03 ` Emanuel Berg 0 siblings, 1 reply; 47+ messages in thread From: Ihor Radchenko @ 2023-09-03 10:30 UTC (permalink / raw) To: Emanuel Berg, Andrea Corallo; +Cc: emacs-devel Emanuel Berg <incal@dataswamp.org> writes: >> Unfortunately, I cannot use this file because it is referring to >> ~/public_html/cl/bench/timing.cl, which I do not have. > > It is in the same directory > > https://dataswamp.org/~incal/cl/bench/timing.cl It would be nice if you attached the files to the email. Otherwise, people examining this thread a few years in the future may not be able to access the files. > So you have to set that manually for now. I did, and the results are very different from yours: $ SBCL_HOME=/usr/lib64/sbcl sbcl --load /tmp/inclist.cl ;; 1.096667 s real time ;; 1.096235 s run time $ SBCL_HOME=/usr/lib64/sbcl sbcl --load /tmp/inclist-type-hints.cl ;; 0.55 s real time ;; 0.549992 s run time (emacs master) $ perf record ./src/emacs -batch -l /home/yantar92/.emacs.d/straight/repos/elisp-benchmarks/elisp-benchmarks.el --eval '(setq elb-speed 2)' --eval '(elisp-benchmarks-run "inclist")' * Results | test | non-gc avg (s) | gc avg (s) | gcs avg | tot avg (s) | tot avg err (s) | |--------------------+----------------+------------+---------+-------------+-----------------| | inclist | 1.20 | 0.00 | 0 | 1.20 | 0.02 | | inclist-type-hints | 1.05 | 0.00 | 0 | 1.05 | 0.02 | |--------------------+----------------+------------+---------+-------------+-----------------| | total | 2.26 | 0.00 | 0 | 2.26 | 0.02 | inclist: 1.1 sec vs. 1.2 sec inclist-type-hints: 0.55 sec vs.
1.05 sec With native-comp speed 3, the results are on par for inclist: $ perf record ./src/emacs -batch -l /home/yantar92/.emacs.d/straight/repos/elisp-benchmarks/elisp-benchmarks.el --eval '(setq elb-speed 3)' --eval '(elisp-benchmarks-run "inclist")' * Results | test | non-gc avg (s) | gc avg (s) | gcs avg | tot avg (s) | tot avg err (s) | |--------------------+----------------+------------+---------+-------------+-----------------| | inclist | 1.07 | 0.00 | 0 | 1.07 | 0.02 | | inclist-type-hints | 0.99 | 0.00 | 0 | 0.99 | 0.00 | |--------------------+----------------+------------+---------+-------------+-----------------| inclist: 1.1 sec vs. 1.07 sec inclist-type-hints: 0.55 sec vs. 0.99 sec There is nothing obvious that is slower in Elisp - most of the time is spent in the native-compiled functions: 47.83% emacs inclist-b6453dcf-34842bf7.eln [.] F656c622d696e636c697374_elb_inclist_0 44.80% emacs inclist-type-hints-bb635d76-535ebfb0.eln [.] F656c622d696e636c6973742d7468_elb_inclist_th_0 1.45% emacs emacs [.] process_mark_stack So, there might be some missing optimization in native-comp itself. CCing Andrea, in case he has some insight about further analysis. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Elisp native-comp vs. SBCL for inclist-type-hints benchmark (was: [PATCH] Re: Bignum performance) 2023-09-03 10:30 ` Elisp native-comp vs. SBCL for inclist-type-hints benchmark (was: [PATCH] Re: Bignum performance) Ihor Radchenko @ 2023-09-04 1:03 ` Emanuel Berg 0 siblings, 0 replies; 47+ messages in thread From: Emanuel Berg @ 2023-09-04 1:03 UTC (permalink / raw) To: emacs-devel Ihor Radchenko wrote: > I did, and the results are very different from yours It is probably because of the Emacs batch and/or the non-SLIME SBCL style of execution. Using commands similar to yours [last], I get the following results. Elisp vs SBCL inclist: (faster 1.04 1.59) ; CL 35% slower inclist-type-hints: (faster 1.05 0.69) ; CL 52% faster Elisp optimization: (faster 1.04 1.05) ; Elisp 1% slower from optimization CL optimization: (faster 1.59 0.69) ; CL 130% faster from optimization We see that, surprisingly, CL is slower for plain inclist. With type hints though, CL benefits hugely to beat the Elisp non-optimized record, while Elisp doesn't seem to benefit from the optimization at all. #! /bin/zsh # # this file: # https://dataswamp.org/~incal/cl/bench/inc2-cl sbcl --noinform --load inclist.cl --load inclist-type-hints.cl --quit #! /bin/zsh # # this file: # https://dataswamp.org/~incal/cl/bench/inc2-el emacs \ -batch \ -l ~/.emacs.d/elpa/elisp-benchmarks-1.14/elisp-benchmarks.el \ --eval '(setq elb-speed 2)' \ --eval '(elisp-benchmarks-run "inclist")' -- underground experts united https://dataswamp.org/~incal ^ permalink raw reply [flat|nested] 47+ messages in thread
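The `faster' helper used in these comparisons is not shown anywhere in the thread; the following reconstruction is an assumption on my part, but it reproduces every percentage quoted in the messages (positive means the second timing is faster, negative that it is slower):

```python
def faster(old, new):
    """Hypothetical reconstruction of the `faster' comparison helper
    (the real definition is not in the thread).  Returns the percent
    difference: positive if NEW is faster than OLD, negative if slower."""
    if new <= old:
        return round(100 * (old / new - 1))   # speedup, relative to new
    return -round(100 * (1 - old / new))      # slowdown, relative to new

# inclist above: (faster 1.04 1.59) -> -35, i.e. "CL 35% slower"
```

Note the asymmetry: speedups are expressed relative to the smaller time and slowdowns relative to the larger one, which is why 1.04 s vs. 1.59 s reads as "35% slower" rather than 53%.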
* Re: [PATCH] Re: Bignum performance 2023-08-28 5:32 ` Emanuel Berg 2023-09-03 0:48 ` Emanuel Berg @ 2023-09-03 1:57 ` Emanuel Berg 1 sibling, 0 replies; 47+ messages in thread From: Emanuel Berg @ 2023-09-03 1:57 UTC (permalink / raw) To: emacs-devel Yet another file, https://dataswamp.org/~incal/cl/bench/inclist-type-hints.cl Here we have type hints, answering my own question from a while back about how to do it! It is done with `declare', as in (declare (optimize (speed 3) (safety 0))) for the function concerned, then they are actually put to use with `the'. ("Type hints enabled"?) Elisp vs SBCL, in seconds: (faster 5.73 0.675997) ; CL is 748% faster Note that the optimization worked a lot better for SBCL than it did for Elisp, if we compare with the non-optimized (i.e. no type hints) file I just posted [1] [it hasn't arrived yet on this ML as I type this, but should arrive] - anyway (faster 5.86 5.73) ; 2% - Elisp vs `cl-the' Elisp (faster 1.635993 0.675997) ; 142% - SBCL vs `the' SBCL SBCL became 142% faster with the optimization, and Elisp just 2%. Another interesting thing is that here, superficially or on the interface level at least, we have the same type hint possibilities as in SBCL. [1] https://dataswamp.org/~incal/cl/bench/inclist.cl -- underground experts united https://dataswamp.org/~incal ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance 2023-08-18 7:14 ` Simon Leinen 2023-08-19 13:10 ` Emanuel Berg @ 2023-09-04 4:13 ` Emanuel Berg 1 sibling, 0 replies; 47+ messages in thread From: Emanuel Berg @ 2023-09-04 4:13 UTC (permalink / raw) To: emacs-devel Simon Leinen wrote: > Emacs also has `most-negative-fixnum' and > `most-positive-fixnum' (borrowed from Common Lisp but now > part of the core). > > On my 64-bit system (GNU Emacs 30.0.50 on aarch64-apple-darwin22.5.0): > > (= (- (expt 2 61)) most-negative-fixnum) → t > (= (1- (expt 2 61)) most-positive-fixnum) → t > > (Same as on Emanuel's.) (let ((bits 62)) (and (= (- (expt 2 (1- bits))) most-negative-fixnum) (= (1- (expt 2 (1- bits))) most-positive-fixnum) )) ; t Sweet B) -- underground experts united https://dataswamp.org/~incal ^ permalink raw reply [flat|nested] 47+ messages in thread
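The check above generalizes to any tag width; a small sketch of the two's-complement range for a given number of value bits (the messages above correspond to 62 value bits on a 64-bit build):

```python
def fixnum_range(value_bits):
    """Most negative and most positive integers representable in
    value_bits two's-complement bits."""
    return -(1 << (value_bits - 1)), (1 << (value_bits - 1)) - 1

lo, hi = fixnum_range(62)
print(lo, hi)  # prints -2305843009213693952 2305843009213693951
```

These two values match `most-negative-fixnum' and `most-positive-fixnum' on the 64-bit builds discussed in the thread.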
* Re: [PATCH] Re: Bignum performance 2023-08-16 5:18 ` Gerd Möllmann 2023-08-16 5:35 ` Emanuel Berg @ 2023-08-16 5:41 ` Gerd Möllmann 2023-08-16 6:42 ` Po Lu 2 siblings, 0 replies; 47+ messages in thread From: Gerd Möllmann @ 2023-08-16 5:41 UTC (permalink / raw) To: luangruo; +Cc: emacs-devel On 16.08.23 07:18, Gerd Möllmann wrote: >> I don't know about SBCL, but as for Emacs, refer to the definition of >> VALBITS in lisp.h (maybe also the right files among m/*.h and s/*.h, but >> I have no idea where they've disappeared to.) > > The SBCL I have here, a Homebrew installation, uses the scheme where Oops, I'm actually not running the SBCL from Homebrew, but my own... Anyway, the tagging is like this: @c 64-bit lowtag assignment (wider-fixnums) @c xyz0 -- Fixnum (where z or yz may also be 0 depending on n-fixnum-tag-bits) @c xx01 -- Other-immediate @c xx11 -- Pointer @c 0011 -- Instance-pointer @c 0111 -- List-pointer @c 1011 -- Function-pointer @c 1111 -- Other-pointer https://github.com/sbcl/sbcl/blob/master/doc/internals/objects-in-memory.texinfo ^ permalink raw reply [flat|nested] 47+ messages in thread
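The lowtag table Gerd quotes can be decoded mechanically; a sketch following the documented bit patterns (an illustration of the table, not SBCL's actual implementation):

```python
def lowtag_kind(word):
    """Classify a word by the SBCL-style lowtags quoted above."""
    if word & 0b1 == 0:
        return "fixnum"             # xyz0: low bit clear
    if word & 0b11 == 0b01:
        return "other-immediate"    # xx01
    # xx11: a pointer; the next two bits select the pointer class
    return {0b0011: "instance-pointer",
            0b0111: "list-pointer",
            0b1011: "function-pointer",
            0b1111: "other-pointer"}[word & 0b1111]
```

Giving list pointers their own lowtag is what makes a `consp`-style check a single mask-and-compare.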
* Re: [PATCH] Re: Bignum performance 2023-08-16 5:18 ` Gerd Möllmann 2023-08-16 5:35 ` Emanuel Berg 2023-08-16 5:41 ` Gerd Möllmann @ 2023-08-16 6:42 ` Po Lu 2023-08-16 8:05 ` Gerd Möllmann 2 siblings, 1 reply; 47+ messages in thread From: Po Lu @ 2023-08-16 6:42 UTC (permalink / raw) To: Gerd Möllmann; +Cc: emacs-devel Gerd Möllmann <gerd.moellmann@gmail.com> writes: >> I don't know about SBCL, but as for Emacs, refer to the definition of >> VALBITS in lisp.h (maybe also the right files among m/*.h and s/*.h, but >> I have no idea where they've disappeared to.) > > The SBCL I have here, a Homebrew installation, uses the scheme where > > ....0 -> fixnum > ....1 -> other objects (discriminated by additional tag bits) > > I had the same for Emacs in the branch gerd_int in the 2000s, if > memory serves me. In today's Emacs, 0 is Lisp_Symbol; this facilitates representing Qnil as C NULL. ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance 2023-08-16 6:42 ` Po Lu @ 2023-08-16 8:05 ` Gerd Möllmann 0 siblings, 0 replies; 47+ messages in thread From: Gerd Möllmann @ 2023-08-16 8:05 UTC (permalink / raw) To: Po Lu; +Cc: emacs-devel On 16.08.23 08:42, Po Lu wrote: > Gerd Möllmann <gerd.moellmann@gmail.com> writes: > >>> I don't know about SBCL, but as for Emacs, refer to the definition of >>> VALBITS in lisp.h (maybe also the right files among m/*.h and s/*.h, but >>> I have no idea where they've disappeared to.) >> >> The SBCL I have here, a Homebrew installation, uses the scheme where >> >> ....0 -> fixnum >> ....1 -> other objects (discriminated by additional tag bits) >> >> I had the same for Emacs in the branch gerd_int in the 2000s, if >> memory serves me. > > In today's Emacs, 0 is Lisp_Symbol; this facilitates representing Qnil > as C NULL. Yes, and I don't see a need to change anything in this regard in Emacs. IMHO, the fixnum range is more than sufficient nowadays, in general. ^ permalink raw reply [flat|nested] 47+ messages in thread
* Shrinking the C core @ 2023-08-09 9:46 Eric S. Raymond 2023-08-09 12:34 ` Po Lu 0 siblings, 1 reply; 47+ messages in thread From: Eric S. Raymond @ 2023-08-09 9:46 UTC (permalink / raw) To: emacs-devel Recently I have been refamiliarizing myself with the Emacs C core. Some days ago, as a test that I understand the C core API and the current build recipe, I made and pushed a small commit that moved the policy code in delete-file out to Lisp, basing it on a smaller and simpler new entry point named delete-file-internal (this is parallel to the way delete-directory is already partitioned). I've since been poking around the C core code and am now wondering why there is so much C-core code that seems like it could be pushed out to Lisp. For example, in src/fileio.c: DEFUN ("unhandled-file-name-directory", Funhandled_file_name_directory, Sunhandled_file_name_directory, 1, 1, 0, doc: /* Return a directly usable directory name somehow associated with FILENAME. A `directly usable' directory name is one that may be used without the intervention of any file name handler. If FILENAME is a directly usable file itself, return \(file-name-as-directory FILENAME). If FILENAME refers to a file which is not accessible from a local process, then this should return nil. The `call-process' and `start-process' functions use this function to get a current directory to run processes in. */) (Lisp_Object filename) { Lisp_Object handler; /* If the file name has special constructs in it, call the corresponding file name handler. */ handler = Ffind_file_name_handler (filename, Qunhandled_file_name_directory); if (!NILP (handler)) { Lisp_Object handled_name = call2 (handler, Qunhandled_file_name_directory, filename); return STRINGP (handled_name) ? handled_name : Qnil; } return Ffile_name_as_directory (filename); } Why is this in C? Is there any reason not to push it out to Lisp and reduce the core complexity? More generally, if I do this kind of refactor, will I be stepping on anyone's toes? 
-- >>esr>> ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Shrinking the C core 2023-08-09 9:46 Shrinking the C core Eric S. Raymond @ 2023-08-09 12:34 ` Po Lu 2023-08-09 15:51 ` Eric S. Raymond 0 siblings, 1 reply; 47+ messages in thread From: Po Lu @ 2023-08-09 12:34 UTC (permalink / raw) To: Eric S. Raymond; +Cc: emacs-devel "Eric S. Raymond" <esr@thyrsus.com> writes: > Recently I have been refamiliarizing myself with the Emacs C core. > > Some days ago, as a test that I understand the C core API and the > current build recipe, I made and pushed a small commit that moved > the policy code in delete-file out to Lisp, basing it on a smaller > and simpler new entry point named delete-file-internal (this is > parallel to the way delete-directory is already partitioned). > > I've since been poking around the C core code and am now wondering why > there is so much C-core code that seems like it could be pushed out to > Lisp. For example, in src/fileio.c: > > DEFUN ("unhandled-file-name-directory", Funhandled_file_name_directory, > Sunhandled_file_name_directory, 1, 1, 0, > doc: /* Return a directly usable directory name somehow associated with FILENAME. > A `directly usable' directory name is one that may be used without the > intervention of any file name handler. > If FILENAME is a directly usable file itself, return > \(file-name-as-directory FILENAME). > If FILENAME refers to a file which is not accessible from a local process, > then this should return nil. > The `call-process' and `start-process' functions use this function to > get a current directory to run processes in. */) > (Lisp_Object filename) > { > Lisp_Object handler; > > /* If the file name has special constructs in it, > call the corresponding file name handler. */ > handler = Ffind_file_name_handler (filename, Qunhandled_file_name_directory); > if (!NILP (handler)) > { > Lisp_Object handled_name = call2 (handler, Qunhandled_file_name_directory, > filename); > return STRINGP (handled_name) ? 
handled_name : Qnil; > } > > return Ffile_name_as_directory (filename); > } > > Why is this in C? Is there any reason not to push it out to Lisp and > reduce the core complexity? There is a plenitude of such reasons. Whenever some code is moved to Lisp, its structure and history are lost. Often, comments within the extracted C code remain, but the code itself is left ajar. Bootstrap problems are frequently introduced, as well as latent bugs. And Emacs becomes ever so much slower. These are not simply theoretical concerns, but problems I've encountered many times in practice. Compounding that, fileio.c is abundant with complex logic amended and iteratively refined over many years, which is dangerously prone to loss or mistranscription during a refactor or a rewrite. Finally, this specific case exists because we don't want to provide Lisp with an easy means to bypass file name handlers. All primitives operating on file names should thus consult file name handlers, enabling packages like TRAMP to continue operating correctly. > More generally, if I do this kind of refactor, will I be stepping on > anyone's toes? Probably. I think a better idea for a first project is this item in etc/TODO: ** A better display of the bar cursor Distribute a bar cursor of width > 1 evenly between the two glyphs on each side of the bar (what to do at the edges?). which has been on my plate for a while. I would be grateful to anyone who decides to preempt me. Thanks in advance! ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Shrinking the C core 2023-08-09 12:34 ` Po Lu @ 2023-08-09 15:51 ` Eric S. Raymond 2023-08-09 23:56 ` Po Lu 0 siblings, 1 reply; 47+ messages in thread From: Eric S. Raymond @ 2023-08-09 15:51 UTC (permalink / raw) To: Po Lu; +Cc: emacs-devel Po Lu <luangruo@yahoo.com>: > There is a plenitude of such reasons. Whenever some code is moved to > Lisp, its structure and history is lost. Often, comments within the > extracted C code remain, but the code itself is left ajar. Bootstrap > problems are frequently introduced, as well as latent bugs. And Emacs > becomes ever so much slower. When I first worked on Emacs code in the 1980s Lisp was already fast enough, and machine speeds have gone up by something like 10^3 since. I plain don't believe the "slower" part can be an issue on modern hardware, not even on tiny SBCs. > Finally, this specific case is because we don't want to provide Lisp > with an easy means to bypass file name handlers. All primitives > operating on file names should thus consult file name handlers, enabling > packages like TRAMP to continue operating correctly. If calling the file-name handlers through Lisp is a crash landing, you were already out of luck. Go have a look at delete-directory. > Probably. I think a better idea for a first project is this item in > etc/TODO: This would ... not be my first project. :-) -- <a href="http://www.catb.org/~esr/">Eric S. Raymond</a> ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Shrinking the C core 2023-08-09 15:51 ` Eric S. Raymond @ 2023-08-09 23:56 ` Po Lu 2023-08-10 1:19 ` Eric S. Raymond 0 siblings, 1 reply; 47+ messages in thread From: Po Lu @ 2023-08-09 23:56 UTC (permalink / raw) To: Eric S. Raymond; +Cc: emacs-devel "Eric S. Raymond" <esr@thyrsus.com> writes: > When I first worked on Emacs code in the 1980s Lisp was already fast > enough, and machine speeds have gone up by something like 10^3 since. > I plain don't believe the "slower" part can be an issue on modern > hardware, not even on tiny SBCs. Can you promise the same, if your changes are not restricted to one or two functions in fileio.c, but instead pervade throughout C source? Finally, you haven't addressed the remainder of the reasons I itemized. ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Shrinking the C core 2023-08-09 23:56 ` Po Lu @ 2023-08-10 1:19 ` Eric S. Raymond 2023-08-10 7:44 ` Eli Zaretskii 0 siblings, 1 reply; 47+ messages in thread From: Eric S. Raymond @ 2023-08-10 1:19 UTC (permalink / raw) To: Po Lu; +Cc: emacs-devel Po Lu <luangruo@yahoo.com>: > "Eric S. Raymond" <esr@thyrsus.com> writes: > > > When I first worked on Emacs code in the 1980s Lisp was already fast > > enough, and machine speeds have gone up by something like 10^3 since. > > I plain don't believe the "slower" part can be an issue on modern > > hardware, not even on tiny SBCs. > > Can you promise the same, if your changes are not restricted to one or > two functions in fileio.c, but instead pervade throughout C source? Yes, in fact, I can. Because if by some miracle we were able to instantly rewrite the entirety of Emacs in Python (which I'm not advocating, I chose it because it's the slowest of the major modern scripting languages) basic considerations of clocks per second would predict it to run a *dead minimum* of two orders of magnitude faster than the Emacs of, say, 1990. And 1990 Emacs was already way fast enough for the human eye and brain, which can't even register interface lag of less than 0.17 seconds (look up the story of Jef Raskin and how he exploited this psychophysical fact in the design of the Canon Cat sometime; it's very instructive). The human auditory system can perceive finer timeslices, down to about 0.02s in skilled musicians, but we're not using elisp for audio signal processing. If you take away nothing else from this conversation, at least get it through your head that "more Lisp might make Emacs too slow" is a deeply, *deeply* silly idea. It's 2023 and the only ways you can make a user-facing program slow enough for response lag to be noticeable are disk latency on spinning rust, network round-trips, or operations with a superlinear big-O in critical paths. Mere interpretive overhead won't do it. 
> Finally, you haven't addressed the remainder of the reasons I itemized. They were too obvious, describing problems that competent software engineers know how to prevent or hedge against, and you addressed me as though I were a n00b that just fell off a cabbage truck. My earliest contributions to Emacs were done so long ago that they predated the systematic Changelog convention; have you heard the expression "teaching your grandmother to suck eggs"? My patience for that sort of thing is limited. -- <a href="http://www.catb.org/~esr/">Eric S. Raymond</a> ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Shrinking the C core 2023-08-10 1:19 ` Eric S. Raymond @ 2023-08-10 7:44 ` Eli Zaretskii 2023-08-10 21:54 ` Emanuel Berg 0 siblings, 1 reply; 47+ messages in thread From: Eli Zaretskii @ 2023-08-10 7:44 UTC (permalink / raw) To: esr; +Cc: luangruo, emacs-devel > Date: Wed, 9 Aug 2023 21:19:11 -0400 > From: "Eric S. Raymond" <esr@thyrsus.com> > Cc: emacs-devel@gnu.org > > Po Lu <luangruo@yahoo.com>: > > "Eric S. Raymond" <esr@thyrsus.com> writes: > > > > > When I first worked on Emacs code in the 1980s Lisp was already fast > > > enough, and machine speeds have gone up by something like 10^3 since. > > > I plain don't believe the "slower" part can be an issue on modern > > > hardware, not even on tiny SBCs. > > > > Can you promise the same, if your changes are not restricted to one or > > two functions in fileio.c, but instead pervade throughout C source? > > Yes, in fact, I can. Because if by some miracle we were able to > instantly rewrite the entirety of Emacs in Python (which I'm not > advocating, I chose it because it's the slowest of the major modern > scripting languages) basic considerations of clocks per second would > predict it to run a *dead minimum* of two orders of magnitude faster > than the Emacs of, say, 1990. > > And 1990 Emacs was already way fast enough for the human eye and > brain, which can't even register interface lag of less than 0.17 > seconds (look up the story of Jef Raskin and how he exploited this > psychophysical fact in the design of the Canon Cat sometime; it's very > instructive). The human auditory system can perceive finer timeslices, > down to about 0.02s in skilled musicians, but we're not using elisp > for audio signal processing. This kind of argument is inherently flawed: it's true that today's machines are much faster than those in, say, 1990, but Emacs nowadays demands much more horsepower from the CPU than it did back then. 
What's more, Emacs is still a single-threaded Lisp machine, although in the last 10 years CPU power has developed more and more in the direction of multiple cores and execution units, with single execution units being basically as fast (or as slow) today as they were a decade ago. And if these theoretical arguments don't convince you, then there are facts: the Emacs display engine, for example, was completely rewritten since the 1990s, and is significantly more expensive than the old one (because it lifts several of the gravest limitations of the old redisplay). Similarly with some other core parts and internals. We are trying to make Lisp programs faster all the time, precisely because users do complain about annoying delays and slowness. Various optimizations in the byte-compiler and the whole native-compilation feature are parts of this effort, and are further evidence that the performance concerns in Emacs are not illusory. And we are still not there yet: people still do complain from time to time, and not always because someone selected a sub-optimal algorithm where better ones exist. The slowdown caused by moving one primitive to Lisp might be insignificant, but these slowdowns add up and eventually do show in user-experience reports. Rewriting code in Lisp also increases the GC pressure, and GC cycles are known to be one of the significant causes of slow performance in quite a few cases. We are currently tracking the GC performance (see the emacs-gc-stats@gnu.org mailing list) for that reason, in the hope that we can modify GC and/or its thresholds to improve performance. > If you take away nothing else from this conversation, at least get it > through your head that "more Lisp might make Emacs too slow" is a > deeply, *deeply* silly idea. It's 2023 and the only ways you can make > a user-facing program slow enough for response lag to be noticeable > are disk latency on spinning rust, network round-trips, or operations > with a superlinear big-O in critical paths. 
Mere interpretive overhead > won't do it. We found this conclusion to be false in practice, at least in Emacs practice. > > Finally, you haven't addressed the remainder of the reasons I itemized. > > They were too obvious, describing problems that competent software > engineers know how to prevent or hedge against, and you addressed me > as though I were a n00b that just fell off a cabbage truck. My > earliest contributions to Emacs were done so long ago that they > predated the systematic Changelog convention; have you heard the > expression "teaching your grandmother to suck eggs"? My patience for > that sort of thing is limited. Please be more patient, and please consider what others here say to be mostly in good-faith and based on non-trivial experience. If something in what others here say sounds like an offense to your intelligence, it is most probably a misunderstanding: for most people here English is not their first language, so don't expect them to always be able to find the best words to express what they want to say. ^ permalink raw reply [flat|nested] 47+ messages in thread
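The GC pressure described above can be observed directly from Lisp: `benchmark-run' reports total time, number of GCs, and time spent in GC for a form. A small illustrative snippet (figures vary by machine and GC settings):

```elisp
;; benchmark-run returns (TOTAL-TIME GC-COUNT GC-TIME);
;; an allocation-heavy form triggers several GC cycles.
(require 'benchmark)
(benchmark-run 10 (make-list 100000 0))
;; => e.g. (0.3 4 0.2) - exact numbers depend on the machine
```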
* Re: Shrinking the C core 2023-08-10 7:44 ` Eli Zaretskii @ 2023-08-10 21:54 ` Emanuel Berg 2023-08-11 10:27 ` Bignum performance (was: Shrinking the C core) Ihor Radchenko 0 siblings, 1 reply; 47+ messages in thread From: Emanuel Berg @ 2023-08-10 21:54 UTC (permalink / raw) To: emacs-devel Eli Zaretskii wrote: > We are trying to make Lisp programs faster all the time, > precisely because users do complain about annoying delays > and slowness. Various optimizations in the byte-compiler and > the whole native-compilation feature are parts of this > effort It's very fast with that; we should encourage more people to use native-compilation. >> If you take away nothing else from this conversation, at least get it >> through your head that "more Lisp might make Emacs too slow" is a >> deeply, *deeply* silly idea. It's 2023 and the only ways you can make >> a user-facing program slow enough for response lag to be noticeable >> are disk latency on spinning rust, network round-trips, or operations >> with a superlinear big-O in critical paths. Mere interpretive overhead >> won't do it. > > We found this conclusion to be false in practice, at least in Emacs > practice. In theory Lisp can be as fast as any other language, but in practice this is not the case, at least with Elisp and Emacs. Here is an experiment with stats on how Emacs/Elisp compares to SBCL/CL; for this particular one it shows that Elisp, even natively compiled, is still +78875% slower than Common Lisp. 
;;; -*- lexical-binding: t -*-
;;
;; this file:
;; https://dataswamp.org/~incal/emacs-init/fib.el
;;
;; the CL:
;; https://dataswamp.org/~incal/cl/fib.cl
;;
;; code from:
;; elisp-benchmarks-1.14
;;
;; commands: [results]
;; $ emacs -Q -batch -l fib.el  [8.660 s]
;; $ emacs -Q -batch -l fib.elc [3.386 s]
;; $ emacs -Q -batch -l fib-54a44480-bad305eb.eln [3.159 s]
;; $ sbcl -l fib.cl [0.004 s]
;;
;; (stats)
;; plain -> byte:   +156%
;; plain -> native: +174%
;; plain -> sbcl:   +216400%
;;
;; byte -> native:  +7%
;; byte -> sbcl:    +84550%
;;
;; native -> sbcl:  +78875%

(require 'cl-lib)

(defun compare-table (l)
  (cl-loop for (ni ti) in l
           with first = t
           do (setq first t)
              (cl-loop for (nj tj) in l
                       do (when first
                            (insert "\n")
                            (setq first nil))
                          (unless (string= ni nj)
                            (let ((imp (* (- (/ ti tj) 1.0) 100)))
                              (when (< 0 imp)
                                (insert (format ";; %s -> %s: %+.0f%%\n" ni nj imp))))))))

(defun stats ()
  (let ((p '("plain"  8.660))
        (b '("byte"   3.386))
        (n '("native" 3.159))
        (s '("sbcl"   0.004)))
    (compare-table (list p b n s))))

(defun fib (reps num)
  (let ((z 0))
    (dotimes (_ reps)
      (let ((p1 1)
            (p2 1))
        (dotimes (_ (- num 2))
          (setf z (+ p1 p2)
                p2 p1
                p1 z))))
    z))

(let ((beg (float-time)))
  (fib 10000 1000)
  (message "%.3f s" (- (float-time) beg)))

;; (shell-command "emacs -Q -batch -l \"~/.emacs.d/emacs-init/fib.el\"")
;; (shell-command "emacs -Q -batch -l \"~/.emacs.d/emacs-init/fib.elc\"")
;; (shell-command "emacs -Q -batch -l \"~/.emacs.d/eln-cache/30.0.50-3b889b4a/fib-54a44480-8bbda87b.eln\"")

(provide 'fib)

-- underground experts united https://dataswamp.org/~incal ^ permalink raw reply [flat|nested] 47+ messages in thread
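The fib.cl file is only linked above, not quoted in the thread. A minimal Common Lisp counterpart of the Elisp code, as a sketch (the actual file may differ), would be:

```lisp
;; Hypothetical transcription of the linked fib.cl; same algorithm as
;; the Elisp version above, timed with CL's internal real-time clock.
(defun fib (reps num)
  (let ((z 0))
    (dotimes (i reps)
      (let ((p1 1)
            (p2 1))
        (dotimes (j (- num 2))
          (setf z (+ p1 p2)
                p2 p1
                p1 z))))
    z))

(let ((beg (get-internal-real-time)))
  (fib 10000 1000)
  (format t "~,3f s~%"
          (/ (- (get-internal-real-time) beg)
             (float internal-time-units-per-second))))
```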
* Bignum performance (was: Shrinking the C core) 2023-08-10 21:54 ` Emanuel Berg @ 2023-08-11 10:27 ` Ihor Radchenko 2023-08-11 12:10 ` Emanuel Berg 0 siblings, 1 reply; 47+ messages in thread From: Ihor Radchenko @ 2023-08-11 10:27 UTC (permalink / raw) To: Emanuel Berg; +Cc: emacs-devel

Emanuel Berg <incal@dataswamp.org> writes:

> In theory Lisp can be as fast as any other language but in
> practice it is not the case with Elisp and Emacs at least.
>
> Here is an experiment with stats on how Emacs/Elisp compares
> to SBCL/CL; for this particular one it shows that Elisp, even
> natively compiled, is still +78875% slower than Common Lisp.
>
> ...
> (defun fib (reps num)
>   (let ((z 0))
>     (dotimes (_ reps)
>       (let ((p1 1)
>             (p2 1))
>         (dotimes (_ (- num 2))
>           (setf z (+ p1 p2)
>                 p2 p1
>                 p1 z))))
>     z))
>
> (let ((beg (float-time)))
>   (fib 10000 1000)
>   (message "%.3f s" (- (float-time) beg)))

Most of the time is spent in (1) GC; (2) creating bignums:

perf record emacs -Q -batch -l /tmp/fib.eln

perf report:

Creating bignums:
  40.95%  emacs  emacs             [.] allocate_vectorlike
GC:
  20.21%  emacs  emacs             [.] process_mark_stack
   3.41%  emacs  libgmp.so.10.5.0  [.] __gmpz_sizeinbase
GC:
   3.21%  emacs  emacs             [.] mark_char_table
   2.82%  emacs  emacs             [.] pdumper_marked_p_impl
   2.23%  emacs  libc.so.6         [.] 0x0000000000090076
   1.78%  emacs  libgmp.so.10.5.0  [.] __gmpz_add
   1.71%  emacs  emacs             [.] pdumper_set_marked_impl
   1.59%  emacs  emacs             [.] arith_driver
   1.31%  emacs  libc.so.6         [.] malloc
GC:
   1.15%  emacs  emacs             [.] sweep_vectors
   1.03%  emacs  libgmp.so.10.5.0  [.] __gmpn_add_n_coreisbr
   0.88%  emacs  libc.so.6         [.] cfree
   0.87%  emacs  fib.eln           [.] F666962_fib_0
   0.85%  emacs  emacs             [.] check_number_coerce_marker
   0.80%  emacs  libc.so.6         [.] 0x0000000000091043
   0.74%  emacs  emacs             [.] allocate_pseudovector
   0.65%  emacs  emacs             [.] Flss
   0.57%  emacs  libgmp.so.10.5.0  [.] __gmpz_realloc
   0.56%  emacs  emacs             [.] make_bignum_bits

My conclusion from this is that the bignum implementation is not optimal. Mostly because it does not reuse the existing bignum objects and always creates new ones - every single time we perform an arithmetic operation.

-- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 47+ messages in thread
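The bignum objects discussed above are a separate heap-allocated type; the boundary where Elisp switches from immediate fixnums to allocated bignums can be checked directly. A small illustrative snippet (assuming a 64-bit build):

```elisp
;; On a 64-bit build most-positive-fixnum is 2^61 - 1; any result past
;; it becomes a GMP-backed bignum, and each bignum result is a freshly
;; allocated Lisp object.
(fixnump most-positive-fixnum)       ; => t
(fixnump (1+ most-positive-fixnum))  ; => nil
(bignump (1+ most-positive-fixnum))  ; => t
```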
* Re: Bignum performance (was: Shrinking the C core) 2023-08-11 10:27 ` Bignum performance (was: Shrinking the C core) Ihor Radchenko @ 2023-08-11 12:10 ` Emanuel Berg 2023-08-11 12:32 ` Ihor Radchenko 0 siblings, 1 reply; 47+ messages in thread From: Emanuel Berg @ 2023-08-11 12:10 UTC (permalink / raw) To: emacs-devel Ihor Radchenko wrote: >> In theory Lisp can be as fast as any other language but in >> practice it is not the case with Elisp and Emacs at least. >> >> Here is a n experiment with stats how Emacs/Elisp compares >> to SBCL/CL, for this particular one it shows that Elisp, >> even natively compiled, is still +78875% slower than >> Common Lisp. >> >> ... >> (defun fib (reps num) >> (let ((z 0)) >> (dotimes (_ reps) >> (let ((p1 1) >> (p2 1)) >> (dotimes (_ (- num 2)) >> (setf z (+ p1 p2) >> p2 p1 >> p1 z)))) >> z)) >> >> (let ((beg (float-time))) >> (fib 10000 1000) >> (message "%.3f s" (- (float-time) beg)) ) > > Most of the time is spent in (1) GC; (2) Creating bigint: > > perf record emacs -Q -batch -l /tmp/fib.eln > > perf report: > > Creating bignums: > 40.95% emacs emacs [.] allocate_vectorlike > GC: > 20.21% emacs emacs [.] process_mark_stack > 3.41% emacs libgmp.so.10.5.0 [.] __gmpz_sizeinbase > GC: > 3.21% emacs emacs [.] mark_char_table > 2.82% emacs emacs [.] pdumper_marked_p_impl > 2.23% emacs libc.so.6 [.] 0x0000000000090076 > 1.78% emacs libgmp.so.10.5.0 [.] __gmpz_add > 1.71% emacs emacs [.] pdumper_set_marked_impl > 1.59% emacs emacs [.] arith_driver > 1.31% emacs libc.so.6 [.] malloc > GC: > 1.15% emacs emacs [.] sweep_vectors > 1.03% emacs libgmp.so.10.5.0 [.] __gmpn_add_n_coreisbr > 0.88% emacs libc.so.6 [.] cfree > 0.87% emacs fib.eln [.] F666962_fib_0 > 0.85% emacs emacs [.] check_number_coerce_marker > 0.80% emacs libc.so.6 [.] 0x0000000000091043 > 0.74% emacs emacs [.] allocate_pseudovector > 0.65% emacs emacs [.] Flss > 0.57% emacs libgmp.so.10.5.0 [.] __gmpz_realloc > 0.56% emacs emacs [.] 
make_bignum_bits > > My conclusion from this is that big number implementation is > not optimal. Mostly because it does not reuse the existing > bignum objects and always create new ones - every single > time we perform an arithmetic operation. Okay, interesting, how can you see that from the above data? So is this a problem with the compiler? Or some associated library? If so, I'll see if I can upgrade gcc to gcc 13 and see if that improves it, maybe they already fixed it ... -- underground experts united https://dataswamp.org/~incal ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Bignum performance (was: Shrinking the C core) 2023-08-11 12:10 ` Emanuel Berg @ 2023-08-11 12:32 ` Ihor Radchenko 2023-08-11 12:38 ` Emanuel Berg 0 siblings, 1 reply; 47+ messages in thread From: Ihor Radchenko @ 2023-08-11 12:32 UTC (permalink / raw) To: Emanuel Berg; +Cc: emacs-devel Emanuel Berg <incal@dataswamp.org> writes: >> perf record emacs -Q -batch -l /tmp/fib.eln >> >> perf report: >> >> Creating bignums: >> 40.95% emacs emacs [.] allocate_vectorlike >> GC: >> 20.21% emacs emacs [.] process_mark_stack >> ... >> My conclusion from this is that big number implementation is >> not optimal. Mostly because it does not reuse the existing >> bignum objects and always create new ones - every single >> time we perform an arithmetic operation. > > Okay, interesting, how can you see that from the above data? process_mark_stack is the GC routine. And I see no other reason to call allocate_vectorlike so much except allocating new bignum objects (which are vectorlike; see src/lisp.h:pvec_type and src/bignum.h:Lisp_Bignum). > So is this a problem with the compiler? Or some > associated library? GC is the well-known problem of garbage-collector being slow when we allocate a large number of objects. And the fact that we allocate many objects is related to immutability of bignums. Every time we do (setq bignum (* bignum fixint)), we abandon the old object holding BIGNUM value and allocate a new bignum object with a new value. Clearly, this allocation is not free and takes a lot of CPU time. While the computation itself is fast. Maybe we could somehow re-use the already allocated bignum objects, similar to what is done for cons cells (see src/alloc.c:Fcons). -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 47+ messages in thread
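The cons-cell reuse pointed to above can be sketched generically: freed objects are chained onto a free list and popped before new memory is requested. This is an illustrative sketch in the spirit of what src/alloc.c does for conses, not Emacs's actual code:

```c
#include <stdlib.h>

/* Simplified free-list allocator: node_free chains a node onto the
   free list, and node_alloc reuses a chained node before falling back
   to malloc.  Illustration only, not Emacs's implementation. */
struct node { struct node *next; long payload; };

static struct node *free_list = NULL;

struct node *node_alloc (void)
{
  if (free_list)
    {
      struct node *n = free_list;
      free_list = n->next;   /* pop a recycled node */
      return n;
    }
  return malloc (sizeof (struct node));
}

void node_free (struct node *n)
{
  n->next = free_list;       /* push onto the free list */
  free_list = n;
}
```

The point of the pattern is that a steady-state allocate/free cycle stops touching malloc at all.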
* Re: Bignum performance (was: Shrinking the C core) 2023-08-11 12:32 ` Ihor Radchenko @ 2023-08-11 12:38 ` Emanuel Berg 2023-08-11 14:07 ` [PATCH] " Ihor Radchenko 0 siblings, 1 reply; 47+ messages in thread From: Emanuel Berg @ 2023-08-11 12:38 UTC (permalink / raw) To: emacs-devel Ihor Radchenko wrote: > And the fact that we allocate many objects is related to > immutability of bignums. Every time we do (setq bignum (* > bignum fixint)), we abandon the old object holding BIGNUM > value and allocate a new bignum object with a new value. > Clearly, this allocation is not free and takes a lot of CPU > time. While the computation itself is fast. So this happens in Emacs C code, OK. > Maybe we could somehow re-use the already allocated bignum > objects, similar to what is done for cons cells (see > src/alloc.c:Fcons). Sounds reasonable :) -- underground experts united https://dataswamp.org/~incal ^ permalink raw reply [flat|nested] 47+ messages in thread
* [PATCH] Re: Bignum performance (was: Shrinking the C core) 2023-08-11 12:38 ` Emanuel Berg @ 2023-08-11 14:07 ` Ihor Radchenko 2023-08-11 18:06 ` Emanuel Berg 0 siblings, 1 reply; 47+ messages in thread From: Ihor Radchenko @ 2023-08-11 14:07 UTC (permalink / raw) To: Emanuel Berg; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 3633 bytes --]

Emanuel Berg <incal@dataswamp.org> writes:

>> Maybe we could somehow re-use the already allocated bignum
>> objects, similar to what is done for cons cells (see
>> src/alloc.c:Fcons).
>
> Sounds reasonable :)

And... it has already been done, actually. allocate_vectorlike calls allocate_vector_from_block, which re-uses pre-allocated objects.

And looking into the call graph, this exact branch calling allocate_vector_from_block is indeed called for the bignums:

    33.05%     0.00%  emacs  [unknown]  [.] 0000000000000000
            |
            ---0
               |
               |--28.04%--allocate_vectorlike
               |          |
               |           --27.78%--allocate_vector_from_block (inlined)
               |                      |
               |                      |--2.13%--next_vector (inlined)
               |                      |
               |                       --0.74%--setup_on_free_list (inlined)

If I manually cut off `allocate_vector_from_block', the benchmark time doubles. So, there is already some improvement coming from re-using allocated memory.

I looked deeper into the code and tried to cut down on unnecessary looping over the pre-allocated `vector_free_lists'. See the attached patch.

Without the patch:

perf record ~/Git/emacs/src/emacs -Q -batch -l /tmp/fib.eln
2.321 s

  28.60%  emacs  emacs             [.] allocate_vectorlike
  24.36%  emacs  emacs             [.] process_mark_stack
   3.76%  emacs  libgmp.so.10.5.0  [.] __gmpz_sizeinbase
   3.59%  emacs  emacs             [.] pdumper_marked_p_impl
   3.53%  emacs  emacs             [.] mark_char_table

With the patch:

perf record ~/Git/emacs/src/emacs -Q -batch -l /tmp/fib.eln
1.968 s

  33.17%  emacs  emacs             [.] process_mark_stack
   5.51%  emacs  libgmp.so.10.5.0  [.] __gmpz_sizeinbase
   5.05%  emacs  emacs             [.] mark_char_table
   4.88%  emacs  emacs             [.] pdumper_marked_p_impl
   3.30%  emacs  emacs             [.] pdumper_set_marked_impl
   ...
   2.52%  emacs  emacs             [.] allocate_vectorlike

allocate_vectorlike clearly takes a lot less time by not trying to loop over all the ~500 empty elements of vector_free_lists.

We can further get rid of the GC by temporarily disabling it (just for demonstration):

(let ((beg (float-time)))
  (setq gc-cons-threshold most-positive-fixnum)
  (fib 10000 1000)
  (message "%.3f s" (- (float-time) beg)))

perf record ~/Git/emacs/src/emacs -Q -batch -l /tmp/fib.eln
0.739 s

  17.11%  emacs  libgmp.so.10.5.0  [.] __gmpz_sizeinbase
   7.35%  emacs  libgmp.so.10.5.0  [.] __gmpz_add
   6.51%  emacs  emacs             [.] arith_driver
   6.03%  emacs  libc.so.6         [.] malloc
   5.57%  emacs  emacs             [.] allocate_vectorlike
   5.20%  emacs  [unknown]         [k] 0xffffffffaae01857
   4.16%  emacs  libgmp.so.10.5.0  [.] __gmpn_add_n_coreisbr
   3.72%  emacs  emacs             [.] check_number_coerce_marker
   3.35%  emacs  fib.eln           [.] F666962_fib_0
   3.29%  emacs  emacs             [.] allocate_pseudovector
   2.30%  emacs  emacs             [.] Flss

Now, the actual bignum arithmetic (libgmp) takes most of the CPU time.

I am not sure what differs between the Elisp gmp bindings and the analogous SBCL binding so that SBCL is so much faster.

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: allocate_vector_from_block.diff --]
[-- Type: text/x-patch, Size: 1712 bytes --]

diff --git a/src/alloc.c b/src/alloc.c
index 17ca5c725d0..62e96b4c9de 100644
--- a/src/alloc.c
+++ b/src/alloc.c
@@ -3140,6 +3140,7 @@ large_vector_vec (struct large_vector *p)
    vectors of the same NBYTES size, so NTH == VINDEX (NBYTES).  */
 
 static struct Lisp_Vector *vector_free_lists[VECTOR_MAX_FREE_LIST_INDEX];
+static int vector_free_lists_min_idx = VECTOR_MAX_FREE_LIST_INDEX;
 
 /* Singly-linked list of large vectors.  */
 
@@ -3176,6 +3177,8 @@ setup_on_free_list (struct Lisp_Vector *v, ptrdiff_t nbytes)
   set_next_vector (v, vector_free_lists[vindex]);
   ASAN_POISON_VECTOR_CONTENTS (v, nbytes - header_size);
   vector_free_lists[vindex] = v;
+  if ( vindex < vector_free_lists_min_idx )
+    vector_free_lists_min_idx = vindex;
 }
 
 /* Get a new vector block.  */
 
@@ -3230,8 +3233,8 @@ allocate_vector_from_block (ptrdiff_t nbytes)
   /* Next, check free lists containing larger vectors.  Since
      we will split the result, we should have remaining space
      large enough to use for one-slot vector at least.  */
-  for (index = VINDEX (nbytes + VBLOCK_BYTES_MIN);
-       index < VECTOR_MAX_FREE_LIST_INDEX; index++)
+  for (index = max ( VINDEX (nbytes + VBLOCK_BYTES_MIN), vector_free_lists_min_idx );
+       index < VECTOR_MAX_FREE_LIST_INDEX; index++, vector_free_lists_min_idx++)
     if (vector_free_lists[index])
       {
	/* This vector is larger than requested.  */
 
@@ -3413,6 +3416,7 @@ sweep_vectors (void)
   gcstat.total_vectors = 0;
   gcstat.total_vector_slots = gcstat.total_free_vector_slots = 0;
   memset (vector_free_lists, 0, sizeof (vector_free_lists));
+  vector_free_lists_min_idx = VECTOR_MAX_FREE_LIST_INDEX;
 
   /* Looking through vector blocks.  */

[-- Attachment #3: Type: text/plain, Size: 224 bytes --]

-- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply related [flat|nested] 47+ messages in thread
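A side note on the GC-disabling demonstration above: `setq' leaves `gc-cons-threshold' raised for the rest of the session. A `let'-binding restores it automatically after the measurement (illustrative; assumes `fib' from earlier in the thread is defined):

```elisp
;; Same measurement, but the raised threshold is scoped to the form,
;; so normal GC behavior resumes once it returns.
(let ((gc-cons-threshold most-positive-fixnum)
      (beg (float-time)))
  (fib 10000 1000)
  (message "%.3f s" (- (float-time) beg)))
```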
* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core) 2023-08-11 14:07 ` [PATCH] " Ihor Radchenko @ 2023-08-11 18:06 ` Emanuel Berg 2023-08-11 19:41 ` Ihor Radchenko 0 siblings, 1 reply; 47+ messages in thread From: Emanuel Berg @ 2023-08-11 18:06 UTC (permalink / raw) To: emacs-devel Ihor Radchenko wrote: >>> Maybe we could somehow re-use the already allocated bignum >>> objects, similar to what is done for cons cells (see >>> src/alloc.c:Fcons). >> >> Sounds reasonable :) > > And... is has been already done, actually. > allocate_vectorlike calls allocate_vector_from_block, which > re-uses pre-allocated objects. > > And looking into the call graph, this exact branch calling > allocate_vector_from_block is indeed called for the bignums [...] Are we talking a list of Emacs C functions executing with the corresponding times they have been in execution in a tree data structure? :O E.g. where do we find allocate_vectorlike ? > With the patch: > > perf record ~/Git/emacs/src/emacs -Q -batch -l /tmp/fib.eln > 1.968 s > > 33.17% emacs emacs [.] process_mark_stack > 5.51% emacs libgmp.so.10.5.0 [.] __gmpz_sizeinbase > 5.05% emacs emacs [.] mark_char_table > 4.88% emacs emacs [.] pdumper_marked_p_impl > 3.30% emacs emacs [.] pdumper_set_marked_impl > ... > 2.52% emacs emacs [.] allocate_vectorlike > > allocate_vectorlike clearly takes a lot less time by not trying to loop > over all the ~500 empty elements of vector_free_lists. > > We can further get rid of the GC by temporarily disabling it (just for > demonstration): > > (let ((beg (float-time))) > (setq gc-cons-threshold most-positive-fixnum) > (fib 10000 1000) > (message "%.3f s" (- (float-time) beg)) ) > > perf record ~/Git/emacs/src/emacs -Q -batch -l /tmp/fib.eln > 0.739 s > > 17.11% emacs libgmp.so.10.5.0 [.] __gmpz_sizeinbase > 7.35% emacs libgmp.so.10.5.0 [.] __gmpz_add > 6.51% emacs emacs [.] arith_driver > 6.03% emacs libc.so.6 [.] malloc > 5.57% emacs emacs [.] 
allocate_vectorlike > 5.20% emacs [unknown] [k] 0xffffffffaae01857 > 4.16% emacs libgmp.so.10.5.0 [.] __gmpn_add_n_coreisbr > 3.72% emacs emacs [.] check_number_coerce_marker > 3.35% emacs fib.eln [.] F666962_fib_0 > 3.29% emacs emacs [.] allocate_pseudovector > 2.30% emacs emacs [.] Flss > > Now, the actual bignum arithmetics (lisp/gmp.c) takes most of the CPU time. > > I am not sure what differs between Elisp gmp bindings and analogous SBCL > binding so that SBCL is so much faster. > > diff --git a/src/alloc.c b/src/alloc.c > index 17ca5c725d0..62e96b4c9de 100644 > --- a/src/alloc.c > +++ b/src/alloc.c > @@ -3140,6 +3140,7 @@ large_vector_vec (struct large_vector *p) > vectors of the same NBYTES size, so NTH == VINDEX (NBYTES). */ > > static struct Lisp_Vector *vector_free_lists[VECTOR_MAX_FREE_LIST_INDEX]; > +static int vector_free_lists_min_idx = VECTOR_MAX_FREE_LIST_INDEX; > > /* Singly-linked list of large vectors. */ > > @@ -3176,6 +3177,8 @@ setup_on_free_list (struct Lisp_Vector *v, ptrdiff_t > nbytes) > set_next_vector (v, vector_free_lists[vindex]); > ASAN_POISON_VECTOR_CONTENTS (v, nbytes - header_size); > vector_free_lists[vindex] = v; > + if ( vindex < vector_free_lists_min_idx ) > + vector_free_lists_min_idx = vindex; > } > > /* Get a new vector block. */ > @@ -3230,8 +3233,8 @@ allocate_vector_from_block (ptrdiff_t nbytes) > /* Next, check free lists containing larger vectors. Since > we will split the result, we should have remaining space > large enough to use for one-slot vector at least. */ > - for (index = VINDEX (nbytes + VBLOCK_BYTES_MIN); > - index < VECTOR_MAX_FREE_LIST_INDEX; index++) > + for (index = max ( VINDEX (nbytes + VBLOCK_BYTES_MIN), > vector_free_lists_min_idx ); > + index < VECTOR_MAX_FREE_LIST_INDEX; index++, > vector_free_lists_min_idx++) > if (vector_free_lists[index]) > { > /* This vector is larger than requested. 
*/ > @@ -3413,6 +3416,7 @@ sweep_vectors (void) > gcstat.total_vectors = 0; > gcstat.total_vector_slots = gcstat.total_free_vector_slots = 0; > memset (vector_free_lists, 0, sizeof (vector_free_lists)); > + vector_free_lists_min_idx = VECTOR_MAX_FREE_LIST_INDEX; > > /* Looking through vector blocks. */ Amazing! :O See if you can do my original test, which was 1-3 Elisp, byte-compiled Elisp, and natively compiled Elisp, and the Common Lisp execution (on your computer), if you'd like. Actually it is a bit of a bummer to the community since Emacs is like THE portal into Lisp. We should have the best Lisp in the business, and I don't see why not? Emacs + SBCL + CL + Elisp anyone? I.e. real CL not the cl- which is actually in Elisp. Not that there is anything wrong with that! On the contrary ;) -- underground experts united https://dataswamp.org/~incal ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core) 2023-08-11 18:06 ` Emanuel Berg @ 2023-08-11 19:41 ` Ihor Radchenko 2023-08-11 19:50 ` Emanuel Berg 0 siblings, 1 reply; 47+ messages in thread From: Ihor Radchenko @ 2023-08-11 19:41 UTC (permalink / raw) To: Emanuel Berg; +Cc: emacs-devel

Emanuel Berg <incal@dataswamp.org> writes:

>> And... it has already been done, actually.
>> allocate_vectorlike calls allocate_vector_from_block, which
>> re-uses pre-allocated objects.
>>
>> And looking into the call graph, this exact branch calling
>> allocate_vector_from_block is indeed called for the bignums [...]
>
> Are we talking a list of Emacs C functions executing with the
> corresponding times they have been in execution in a tree data
> structure? :O

That's what perf does - it is a sampling profiler on GNU/Linux. The Elisp equivalent is profiler.el, but it does not reveal the underlying C functions.

> E.g. where do we find allocate_vectorlike ?

I have listed the commands I used (from a terminal):

1. perf record ~/Git/emacs/src/emacs -Q -batch -l /tmp/fib.eln
   <records CPU stats while running emacs>
2. perf report
   <displays the stats>

You need Emacs compiled with debug symbols to get meaningful output. See more at https://www.brendangregg.com/perf.html

> See if you can do my original test, which was 1-3 Elisp,
> byte-compiled Elisp, and natively compiled Elisp, and the
> Common Lisp execution (on your computer), if you'd like.

As you wish:

$ ~/Git/emacs/src/emacs -Q -batch -l /tmp/fib.el  [5.783 s]
$ ~/Git/emacs/src/emacs -Q -batch -l /tmp/fib.elc [1.961 s]
$ ~/Git/emacs/src/emacs -Q -batch -l /tmp/fib.eln [1.901 s]
$ SBCL_HOME=/usr/lib64/sbcl sbcl --load /tmp/fib.cl [0.007 s]

without the patch (on my system)

$ ~/Git/emacs/src/emacs -Q -batch -l /tmp/fib.el  [6.546 s]
$ ~/Git/emacs/src/emacs -Q -batch -l /tmp/fib.elc [2.498 s]
$ ~/Git/emacs/src/emacs -Q -batch -l /tmp/fib.eln [2.518 s]

Also, the patch gives improvements for more than just bignums. I ran elisp-benchmarks (https://elpa.gnu.org/packages/elisp-benchmarks.html) and got (before the patch)

| test               | non-gc avg (s) | gc avg (s) | gcs avg | tot avg (s) | tot avg err (s) |
|--------------------+----------------+------------+---------+-------------+-----------------|
| bubble             |           0.70 |       0.06 |       1 |        0.76 |            0.07 |
| bubble-no-cons     |           1.17 |       0.00 |       0 |        1.17 |            0.02 |
| bytecomp           |           1.74 |       0.29 |      13 |        2.03 |            0.12 |
| dhrystone          |           2.30 |       0.00 |       0 |        2.30 |            0.07 |
| eieio              |           1.25 |       0.13 |       7 |        1.38 |            0.03 |
| fibn               |           0.00 |       0.00 |       0 |        0.00 |            0.00 |
| fibn-named-let     |           1.53 |       0.00 |       0 |        1.53 |            0.03 |
| fibn-rec           |           0.00 |       0.00 |       0 |        0.00 |            0.00 |
| fibn-tc            |           0.00 |       0.00 |       0 |        0.00 |            0.00 |
| flet               |           1.48 |       0.00 |       0 |        1.48 |            0.04 |
| inclist            |           1.07 |       0.00 |       0 |        1.07 |            0.02 |
| inclist-type-hints |           1.00 |       0.00 |       0 |        1.00 |            0.07 |
| listlen-tc         |           0.13 |       0.00 |       0 |        0.13 |            0.03 |
| map-closure        |           5.26 |       0.00 |       0 |        5.26 |            0.09 |
| nbody              |           1.61 |       0.17 |       1 |        1.78 |            0.06 |
| pack-unpack        |           0.31 |       0.02 |       1 |        0.33 |            0.00 |
| pack-unpack-old    |           0.50 |       0.05 |       3 |        0.55 |            0.02 |
| pcase              |           1.85 |       0.00 |       0 |        1.85 |            0.05 |
| pidigits           |           4.41 |       0.96 |      17 |        5.37 |            0.13 |
| scroll             |           0.64 |       0.00 |       0 |        0.64 |            0.01 |
| smie               |           1.59 |       0.04 |       2 |        1.63 |            0.03 |
|--------------------+----------------+------------+---------+-------------+-----------------|
| total              |          28.54 |       1.72 |      45 |       30.26 |            0.26 |

(after the patch)

| test               | non-gc avg (s) | gc avg (s) | gcs avg | tot avg (s) | tot avg err (s) |
|--------------------+----------------+------------+---------+-------------+-----------------|
| bubble             |           0.68 |       0.05 |       1 |        0.73 |            0.04 |
| bubble-no-cons     |           1.00 |       0.00 |       0 |        1.00 |            0.04 |
| bytecomp           |           1.60 |       0.23 |      13 |        1.82 |            0.16 |
| dhrystone          |           2.03 |       0.00 |       0 |        2.03 |            0.05 |
| eieio              |           1.08 |       0.12 |       7 |        1.20 |            0.07 |
| fibn               |           0.00 |       0.00 |       0 |        0.00 |            0.00 |
| fibn-named-let     |           1.44 |       0.00 |       0 |        1.44 |            0.12 |
| fibn-rec           |           0.00 |       0.00 |       0 |        0.00 |            0.00 |
| fibn-tc            |           0.00 |       0.00 |       0 |        0.00 |            0.00 |
| flet               |           1.36 |       0.00 |       0 |        1.36 |            0.09 |
| inclist            |           1.00 |       0.00 |       0 |        1.00 |            0.00 |
| inclist-type-hints |           1.00 |       0.00 |       0 |        1.00 |            0.07 |
| listlen-tc         |           0.11 |       0.00 |       0 |        0.11 |            0.02 |
| map-closure        |           4.91 |       0.00 |       0 |        4.91 |            0.12 |
| nbody              |           1.47 |       0.17 |       1 |        1.64 |            0.08 |
| pack-unpack        |           0.29 |       0.02 |       1 |        0.31 |            0.01 |
| pack-unpack-old    |           0.43 |       0.05 |       3 |        0.48 |            0.01 |
| pcase              |           1.84 |       0.00 |       0 |        1.84 |            0.07 |
| pidigits           |           3.16 |       0.94 |      17 |        4.11 |            0.10 |
| scroll             |           0.58 |       0.00 |       0 |        0.58 |            0.00 |
| smie               |           1.40 |       0.04 |       2 |        1.44 |            0.06 |
|--------------------+----------------+------------+---------+-------------+-----------------|
| total              |          25.38 |       1.62 |      45 |       27.00 |            0.32 |

About ~10% improvement, with each individual benchmark being faster.

Note how the fibn test takes 0.00 seconds. It is limited to the fixnum range.

> Actually it is a bit of a bummer to the community since Emacs
> is like THE portal into Lisp. We should have the best Lisp in
> the business, and I don't see why not? Emacs + SBCL + CL +
> Elisp anyone?

This is a balancing act. Elisp is tailored for Emacs as an editor. So, trade-offs are inevitable. I am skeptical about Elisp outperforming CL. But that does not mean we should not try to improve things.

-- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
2023-08-11 19:41 ` Ihor Radchenko
@ 2023-08-11 19:50 ` Emanuel Berg
2023-08-12 8:24 ` Ihor Radchenko
0 siblings, 1 reply; 47+ messages in thread
From: Emanuel Berg @ 2023-08-11 19:50 UTC (permalink / raw)
To: emacs-devel

Ihor Radchenko wrote:

>> See if you can do my original test, which was 1-3 Elisp,
>> byte-compiled Elisp, and natively compiled Elisp, and the
>> Common Lisp execution (on your computer), if you'd like.
>
> As you wish:
>
> $ ~/Git/emacs/src/emacs -Q -batch -l /tmp/fib.el  [5.783 s]
> $ ~/Git/emacs/src/emacs -Q -batch -l /tmp/fib.elc [1.961 s]
> $ ~/Git/emacs/src/emacs -Q -batch -l /tmp/fib.eln [1.901 s]
> $ SBCL_HOME=/usr/lib64/sbcl sbcl --load /tmp/fib.cl [0.007 s]
>
> without the patch (on my system)
>
> $ ~/Git/emacs/src/emacs -Q -batch -l /tmp/fib.el  [6.546 s]
> $ ~/Git/emacs/src/emacs -Q -batch -l /tmp/fib.elc [2.498 s]
> $ ~/Git/emacs/src/emacs -Q -batch -l /tmp/fib.eln [2.518 s]

The stats seem to speak one language ...

> Also, the patch gives improvements for more than just
> bignums
>
> | test               | non-gc avg (s) | gc avg (s) | gcs avg | tot avg (s) | tot avg err (s) |
> |--------------------+----------------+------------+---------+-------------+-----------------|
> | bubble             |           0.70 |       0.06 |       1 |        0.76 |            0.07 |
> | bubble-no-cons     |           1.17 |       0.00 |       0 |        1.17 |            0.02 |
> | bytecomp           |           1.74 |       0.29 |      13 |        2.03 |            0.12 |
> | dhrystone          |           2.30 |       0.00 |       0 |        2.30 |            0.07 |
> | eieio              |           1.25 |       0.13 |       7 |        1.38 |            0.03 |
> | fibn               |           0.00 |       0.00 |       0 |        0.00 |            0.00 |
> | fibn-named-let     |           1.53 |       0.00 |       0 |        1.53 |            0.03 |
> | fibn-rec           |           0.00 |       0.00 |       0 |        0.00 |            0.00 |
> | fibn-tc            |           0.00 |       0.00 |       0 |        0.00 |            0.00 |
> | flet               |           1.48 |       0.00 |       0 |        1.48 |            0.04 |
> | inclist            |           1.07 |       0.00 |       0 |        1.07 |            0.02 |
> | inclist-type-hints |           1.00 |       0.00 |       0 |        1.00 |            0.07 |
> | listlen-tc         |           0.13 |       0.00 |       0 |        0.13 |            0.03 |
> | map-closure        |           5.26 |       0.00 |       0 |        5.26 |            0.09 |
> | nbody              |           1.61 |       0.17 |       1 |        1.78 |            0.06 |
> | pack-unpack        |           0.31 |       0.02 |       1 |        0.33 |            0.00 |
> | pack-unpack-old    |           0.50 |       0.05 |       3 |        0.55 |            0.02 |
> | pcase              |           1.85 |       0.00 |       0 |        1.85 |            0.05 |
> | pidigits           |           4.41 |       0.96 |      17 |        5.37 |            0.13 |
> | scroll             |           0.64 |       0.00 |       0 |        0.64 |            0.01 |
> | smie               |           1.59 |       0.04 |       2 |        1.63 |            0.03 |
> |--------------------+----------------+------------+---------+-------------+-----------------|
> | total              |          28.54 |       1.72 |      45 |       30.26 |            0.26 |
>
> (after the patch)
>
> | test               | non-gc avg (s) | gc avg (s) | gcs avg | tot avg (s) | tot avg err (s) |
> |--------------------+----------------+------------+---------+-------------+-----------------|
> | bubble             |           0.68 |       0.05 |       1 |        0.73 |            0.04 |
> | bubble-no-cons     |           1.00 |       0.00 |       0 |        1.00 |            0.04 |
> | bytecomp           |           1.60 |       0.23 |      13 |        1.82 |            0.16 |
> | dhrystone          |           2.03 |       0.00 |       0 |        2.03 |            0.05 |
> | eieio              |           1.08 |       0.12 |       7 |        1.20 |            0.07 |
> | fibn               |           0.00 |       0.00 |       0 |        0.00 |            0.00 |
> | fibn-named-let     |           1.44 |       0.00 |       0 |        1.44 |            0.12 |
> | fibn-rec           |           0.00 |       0.00 |       0 |        0.00 |            0.00 |
> | fibn-tc            |           0.00 |       0.00 |       0 |        0.00 |            0.00 |
> | flet               |           1.36 |       0.00 |       0 |        1.36 |            0.09 |
> | inclist            |           1.00 |       0.00 |       0 |        1.00 |            0.00 |
> | inclist-type-hints |           1.00 |       0.00 |       0 |        1.00 |            0.07 |
> | listlen-tc         |           0.11 |       0.00 |       0 |        0.11 |            0.02 |
> | map-closure        |           4.91 |       0.00 |       0 |        4.91 |            0.12 |
> | nbody              |           1.47 |       0.17 |       1 |        1.64 |            0.08 |
> | pack-unpack        |           0.29 |       0.02 |       1 |        0.31 |            0.01 |
> | pack-unpack-old    |           0.43 |       0.05 |       3 |        0.48 |            0.01 |
> | pcase              |           1.84 |       0.00 |       0 |        1.84 |            0.07 |
> | pidigits           |           3.16 |       0.94 |      17 |        4.11 |            0.10 |
> | scroll             |           0.58 |       0.00 |       0 |        0.58 |            0.00 |
> | smie               |           1.40 |       0.04 |       2 |        1.44 |            0.06 |
> |--------------------+----------------+------------+---------+-------------+-----------------|
> | total              |          25.38 |       1.62 |      45 |       27.00 |            0.32 |
>
> About ~10% improvement, with each individual benchmark being
> faster.

Ten percent? We take it :)

> Note how fibn test takes 0.00 seconds. It is limited to
> fixnum range.

What does that say/mean?

--
underground experts united
https://dataswamp.org/~incal

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
2023-08-11 19:50 ` Emanuel Berg
@ 2023-08-12 8:24 ` Ihor Radchenko
2023-08-12 16:03 ` Emanuel Berg
0 siblings, 1 reply; 47+ messages in thread
From: Ihor Radchenko @ 2023-08-12 8:24 UTC (permalink / raw)
To: Emanuel Berg; +Cc: emacs-devel

Emanuel Berg <incal@dataswamp.org> writes:

>> Note how fibn test takes 0.00 seconds. It is limited to
>> fixnum range.
>
> What does that say/mean?

It tells that normal int operations are much, much faster
compared to bigint. So, your benchmark is rather esoteric if we
consider normal usage patterns.

--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
2023-08-12 8:24 ` Ihor Radchenko
@ 2023-08-12 16:03 ` Emanuel Berg
2023-08-13 9:09 ` Ihor Radchenko
0 siblings, 1 reply; 47+ messages in thread
From: Emanuel Berg @ 2023-08-12 16:03 UTC (permalink / raw)
To: emacs-devel

Ihor Radchenko wrote:

>>> Note how fibn test takes 0.00 seconds. It is limited to
>>> fixnum range.
>>
>> What does that say/mean?
>
> It tells that normal int operations are much, much faster
> compared to bigint. So, your benchmark is rather esoteric if
> we consider normal usage patterns.

Didn't you provide a whole set of benchmarks with an
approximated gain of 10%? Maybe some esoteric nature is
built-in in the benchmark concept ...

--
underground experts united
https://dataswamp.org/~incal

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
2023-08-12 16:03 ` Emanuel Berg
@ 2023-08-13 9:09 ` Ihor Radchenko
2023-08-13 9:49 ` Emanuel Berg
0 siblings, 1 reply; 47+ messages in thread
From: Ihor Radchenko @ 2023-08-13 9:09 UTC (permalink / raw)
To: Emanuel Berg; +Cc: emacs-devel

Emanuel Berg <incal@dataswamp.org> writes:

>> It tells that normal int operations are much, much faster
>> compared to bigint. So, your benchmark is rather esoteric if
>> we consider normal usage patterns.
>
> Didn't you provide a whole set of benchmarks with an
> approximated gain of 10%? Maybe some esoteric nature is
> built-in in the benchmark concept ...

The main problem your benchmark demonstrated is with bignum. By
accident, it also revealed slight inefficiency in vector
allocation, but this inefficiency is nowhere near SBCL 0.007 sec
vs. Elisp 2.5 sec. In practice, as more generic benchmarks
demonstrated, we only had 10% performance hit. Not something to
claim that Elisp is much slower compared to CL.

It would be more useful to compare CL with Elisp using less
specialized benchmarks that do not involve bignums. As Mattias
commented, we do not care much about bignum performance in
Elisp - it is a rarely used feature; we are content that it
simply works, even if not fast, and the core contributors (at
least, Mattias) are not seeing improving bignums as their
priority.

--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
2023-08-13 9:09 ` Ihor Radchenko
@ 2023-08-13 9:49 ` Emanuel Berg
2023-08-13 10:21 ` Ihor Radchenko
0 siblings, 1 reply; 47+ messages in thread
From: Emanuel Berg @ 2023-08-13 9:49 UTC (permalink / raw)
To: emacs-devel

Ihor Radchenko wrote:

> The main problem your benchmark demonstrated is with bignum.
> By accident, it also revealed slight inefficiency in vector
> allocation, but this inefficiency is nowhere near SBCL 0.007
> sec vs. Elisp 2.5 sec.

Yeah, we can't have that.

> In practice, as more generic benchmarks demonstrated, we
> only had 10% performance hit. Not something to claim that
> Elisp is much slower compared to CL.

What do you mean, generic +10% is a huge improvement.

> It would be more useful to compare CL with Elisp using less
> specialized benchmarks that do not involve bignums.
> As Mattias commented, we do not care much about bignum
> performance in Elisp - it is a rarely used feature; we are
> content that it simply works, even if not fast, and the core
> contributors (at least, Mattias) are not seeing improving
> bignums as their priority.

But didn't your patch do that already? It would indicate that
it is possible to do it all in/to Elisp, which would be the
best way to solve this problem _and_ not have any of the
integration, maybe portability issues described ...

So 1, the first explanation why CL is much faster is another
implementation of bignums handling which is faster in CL, if
that has already been solved here absolutely no reason not to
include it as 10% is a huge gain, even more so for a whole set
of benchmarks.

Instead of relying on a single benchmark, one should have a set
of benchmarks and every benchmark should have a purpose, this
doesn't have to be so involved tho, for example "bignums" could
be the purpose of my benchmark, so one would have several, say
a dozen, each with the purpose of slowing the computer down
with respect to some aspect or known situation that one would
try to provoke ... It can be well-known algorithms for that
matter.

One would then do the same thing in CL and see, where do CL
perform much better? The next question would be, why?

If it is just about piling up +10%, let's do it!

--
underground experts united
https://dataswamp.org/~incal

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
2023-08-13 9:49 ` Emanuel Berg
@ 2023-08-13 10:21 ` Ihor Radchenko
2023-08-14 2:20 ` Emanuel Berg
0 siblings, 1 reply; 47+ messages in thread
From: Ihor Radchenko @ 2023-08-13 10:21 UTC (permalink / raw)
To: Emanuel Berg; +Cc: emacs-devel

Emanuel Berg <incal@dataswamp.org> writes:

>> In practice, as more generic benchmarks demonstrated, we
>> only had 10% performance hit. Not something to claim that
>> Elisp is much slower compared to CL.
>
> What do you mean, generic +10% is a huge improvement.

It is, but it is also tangent to comparison between Elisp and
CL. The main (AFAIU) difference between Elisp and CL is in how
the bignums are stored. Elisp uses its own internal object type
while CL uses GMP's native format. And we have huge overheads
converting things back-and-forth between GMP and Elisp formats.
It is by choice. And my patch did not do anything about this
difference.

Also, +10% is just on my machine. We need someone else to test
things before jumping to far-reaching conclusions. I plan to
submit the patch in a less ad-hoc state later, as a separate
ticket.

>> It would be more useful to compare CL with Elisp using less
>> specialized benchmarks that do not involve bignums.
>> As Mattias commented, we do not care much about bignum
>> performance in Elisp - it is a rarely used feature; we are
>> content that it simply works, even if not fast, and the core
>> contributors (at least, Mattias) are not seeing improving
>> bignums as their priority.
>
> But didn't your patch do that already?

No. The benchmark only compared between Elisp before/after the
patch. Not with CL.

> Instead of relying on a single benchmark, one should have
> a set of benchmarks and every benchmark should have a purpose,
> this doesn't have to be so involved tho, for example "bignums"
> could be the purpose of my benchmark, so one would have
> several, say a dozen, each with the purpose of slowing the
> computer down with respect to some aspect or known
> situation that one would try to provoke ... It can be
> well-known algorithms for that matter.
>
> One would then do the same thing in CL and see, where do CL
> perform much better? The next question would be, why?

Sure. Feel free to share such benchmark for Elisp vs. CL.
I only know the benchmark library for Elisp. No equivalent
comparable benchmark for CL.

--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance (was: Shrinking the C core)
2023-08-13 10:21 ` Ihor Radchenko
@ 2023-08-14 2:20 ` Emanuel Berg
2023-08-14 2:42 ` [PATCH] Re: Bignum performance Po Lu
0 siblings, 1 reply; 47+ messages in thread
From: Emanuel Berg @ 2023-08-14 2:20 UTC (permalink / raw)
To: emacs-devel

Ihor Radchenko wrote:

>>> In practice, as more generic benchmarks demonstrated, we
>>> only had 10% performance hit. Not something to claim that
>>> Elisp is much slower compared to CL.
>>
>> What do you mean, generic +10% is a huge improvement.
>
> It is, but it is also tangent to comparison between Elisp
> and CL. The main (AFAIU) difference between Elisp and CL is
> in how the bignums are stored. Elisp uses its own internal
> object type while CL uses GMP's native format.

GMP = GNU Multiple Precision Arithmetic Library.

https://en.wikipedia.org/wiki/GNU_Multiple_Precision_Arithmetic_Library

> And we have huge overheads converting things back-and-forth
> between GMP and Elisp formats. It is by choice. And my patch
> did not do anything about this difference.

But that's all the better, your patch solved (very likely) the
problem and did so without causing havoc by trying to forcibly
merge opposing solutions.

And the method was: instead of reallocating new objects for
bignums, we are now reusing existing allocations for new data?

>>> It would be more useful to compare CL with Elisp using
>>> less specialized benchmarks that do not involve bignums.
>>> As Mattias commented, we do not care much about bignum
>>> performance in Elisp - it is a rarely used feature; we are
>>> content that it simply works, even if not fast, and the
>>> core contributors (at least, Mattias) are not seeing
>>> improving bignums as their priority.
>>
>> But didn't your patch do that already?
>
> No. The benchmark only compared between Elisp before/after
> the patch. Not with CL.

No, that much I understood. It was Elisp before and after the
patch, as you say.

Isn't before/after all the data you need? Nah, it can be useful
to have an external reference as well, and here we are also
hoping we can use the benchmarks to answer the question whether
CL is just so much faster in general, or if there are certain
areas where it excels - and if so - what those areas are and
what they contain to unlock all that speed.

>> Instead of relying on a single benchmark, one should have
>> a set of benchmarks and every benchmark should have
>> a purpose, this doesn't have to be so involved tho, for
>> example "bignums" could be the purpose of my benchmark, so
>> one would have several, say a dozen, each with the purpose
>> of slowing the computer down with respect to some aspect or
>> known situation that one would try to provoke ... It can be
>> well-known algorithms for that matter.
>>
>> One would then do the same thing in CL and see, where do CL
>> perform much better? The next question would be, why?
>
> Sure. Feel free to share such benchmark for Elisp vs. CL.
> I only know the benchmark library for Elisp. No equivalent
> comparable benchmark for CL.

I'm working on it! This will be very interesting, for sure.
The need for speed - but in a very methodical way ...

--
underground experts united
https://dataswamp.org/~incal

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance
2023-08-14 2:20 ` Emanuel Berg
@ 2023-08-14 2:42 ` Po Lu
2023-08-14 4:16 ` Emanuel Berg
2023-08-14 7:15 ` Ihor Radchenko
0 siblings, 2 replies; 47+ messages in thread
From: Po Lu @ 2023-08-14 2:42 UTC (permalink / raw)
To: emacs-devel

Emanuel Berg <incal@dataswamp.org> writes:

> Ihor Radchenko wrote:
>
>>>> In practice, as more generic benchmarks demonstrated, we
>>>> only had 10% performance hit. Not something to claim that
>>>> Elisp is much slower compared to CL.
>>>
>>> What do you mean, generic +10% is a huge improvement.
>>
>> It is, but it is also tangent to comparison between Elisp
>> and CL. The main (AFAIU) difference between Elisp and CL is
>> in how the bignums are stored. Elisp uses its own internal
>> object type while CL uses GMP's native format.
>
> GMP = GNU Multiple Precision Arithmetic Library.
>
> https://en.wikipedia.org/wiki/GNU_Multiple_Precision_Arithmetic_Library
>
>> And we have huge overheads converting things back-and-forth
>> between GMP and Elisp formats. It is by choice. And my patch
>> did not do anything about this difference.

AFAIU, no conversion takes place between ``Elisp formats'' and
GMP formats. Our bignums rely on GMP for all data storage and
memory allocation.

  struct Lisp_Bignum
  {
    union vectorlike_header header;
    mpz_t value;   /* <---- GMP type */
  } GCALIGNED_STRUCT;

and finally:

  INLINE mpz_t const *
  bignum_val (struct Lisp_Bignum const *i)
  {
    return &i->value;
  }

  INLINE mpz_t const *
  xbignum_val (Lisp_Object i)
  {
    return bignum_val (XBIGNUM (i));
  }

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance
2023-08-14 2:42 ` [PATCH] Re: Bignum performance Po Lu
@ 2023-08-14 4:16 ` Emanuel Berg
2023-08-14 7:15 ` Ihor Radchenko
1 sibling, 0 replies; 47+ messages in thread
From: Emanuel Berg @ 2023-08-14 4:16 UTC (permalink / raw)
To: emacs-devel

Po Lu wrote:

> AFAIU, no conversion takes place between ``Elisp formats''
> and GMP formats. Our bignums rely on GMP for all data
> storage and memory allocation.

There was a problem with that as likely indicated by the
Fibonacci benchmark, that situation has hopefully been patched
by now or will so be, so now it will be interesting to see if
we can identify other such areas, and if they can be solved as
effortlessly ...

--
underground experts united
https://dataswamp.org/~incal

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance
2023-08-14 2:42 ` [PATCH] Re: Bignum performance Po Lu
2023-08-14 4:16 ` Emanuel Berg
@ 2023-08-14 7:15 ` Ihor Radchenko
2023-08-14 7:50 ` Po Lu
2023-08-15 14:28 ` Emanuel Berg
1 sibling, 2 replies; 47+ messages in thread
From: Ihor Radchenko @ 2023-08-14 7:15 UTC (permalink / raw)
To: Po Lu; +Cc: emacs-devel

Po Lu <luangruo@yahoo.com> writes:

>>> And we have huge overheads converting things back-and-forth
>>> between GMP and Elisp formats. It is by choice. And my patch
>>> did not do anything about this difference.
>
> AFAIU, no conversion takes place between ``Elisp formats'' and GMP
> formats. Our bignums rely on GMP for all data storage and memory
> allocation.

Thanks for the clarification!
So, GMP is not as fast as SBCL's implementation after all.
SBCL uses https://github.com/sbcl/sbcl/blob/master/src/code/bignum.lisp
- a custom bignum implementation, which is clearly faster compared to
GMP (in the provided benchmark):

perf record ~/Git/emacs/src/emacs -Q -batch -l /tmp/fib.eln
0.739 s
(0.007 s for SBCL)

  17.11%  emacs  libgmp.so.10.5.0  [.] __gmpz_sizeinbase
   7.35%  emacs  libgmp.so.10.5.0  [.] __gmpz_add
           ^^ already >0.1 sec.
   6.51%  emacs  emacs             [.] arith_driver
   6.03%  emacs  libc.so.6         [.] malloc
   5.57%  emacs  emacs             [.] allocate_vectorlike
   5.20%  emacs  [unknown]         [k] 0xffffffffaae01857
   4.16%  emacs  libgmp.so.10.5.0  [.] __gmpn_add_n_coreisbr
   3.72%  emacs  emacs             [.] check_number_coerce_marker
   3.35%  emacs  fib.eln           [.] F666962_fib_0
   3.29%  emacs  emacs             [.] allocate_pseudovector
   2.30%  emacs  emacs             [.] Flss

--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance
2023-08-14 7:15 ` Ihor Radchenko
@ 2023-08-14 7:50 ` Po Lu
2023-08-14 9:28 ` Ihor Radchenko
0 siblings, 1 reply; 47+ messages in thread
From: Po Lu @ 2023-08-14 7:50 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: emacs-devel

Ihor Radchenko <yantar92@posteo.net> writes:

> Po Lu <luangruo@yahoo.com> writes:
>
>>>> And we have huge overheads converting things back-and-forth
>>>> between GMP and Elisp formats. It is by choice. And my patch
>>>> did not do anything about this difference.
>>
>> AFAIU, no conversion takes place between ``Elisp formats'' and GMP
>> formats. Our bignums rely on GMP for all data storage and memory
>> allocation.
>
> Thanks for the clarification!
> So, GMP is not as fast as SBCL's implementation after all.
> SBCL uses https://github.com/sbcl/sbcl/blob/master/src/code/bignum.lisp
> - a custom bignum implementation, which is clearly faster compared to
> GMP (in the provided benchmark):

GMP is significantly faster than all other known bignum
libraries. Bignums are not considered essential for Emacs's
performance, so the GMP library is utilized in an inefficient
fashion.

> perf record ~/Git/emacs/src/emacs -Q -batch -l /tmp/fib.eln
> 0.739 s
> (0.007 s for SBCL)
>
> 17.11%  emacs  libgmp.so.10.5.0  [.] __gmpz_sizeinbase
>  7.35%  emacs  libgmp.so.10.5.0  [.] __gmpz_add
>
> ^^ already >0.1 sec.

The subroutine actually performing arithmetic is in fact
mpn_add_n_coreisbr. mpz_add and mpz_sizeinbase are ``mpz''
functions that perform memory allocation, and our bignum
functions frequently utilize mpz_sizeinbase to ascertain whether
a result can be represented as a fixnum. As such, they don't
constitute a fair comparison between the speed of the GMP
library itself and SBCL.

GMP provides low-level functions that place responsibility for
memory management and input verification in the hands of the
programmer. These are usually implemented in CPU-specific
assembler, and are very fast.

That being said, they're not available within mini-gmp, and the
primary bottleneck is in fact mpz_sizeinbase.

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance
2023-08-14 7:50 ` Po Lu
@ 2023-08-14 9:28 ` Ihor Radchenko
0 siblings, 0 replies; 47+ messages in thread
From: Ihor Radchenko @ 2023-08-14 9:28 UTC (permalink / raw)
To: Po Lu; +Cc: emacs-devel

Po Lu <luangruo@yahoo.com> writes:

> GMP is significantly faster than all other known bignum libraries.
> Bignums are not considered essential for Emacs's performance, so the GMP
> library is utilized in an inefficient fashion.

I understand. It adds to the Mattias' opinion that we should
not care too much about bignum performance in practice.

--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>

^ permalink raw reply	[flat|nested] 47+ messages in thread
* Re: [PATCH] Re: Bignum performance
2023-08-14 7:15 ` Ihor Radchenko
2023-08-14 7:50 ` Po Lu
@ 2023-08-15 14:28 ` Emanuel Berg
1 sibling, 0 replies; 47+ messages in thread
From: Emanuel Berg @ 2023-08-15 14:28 UTC (permalink / raw)
To: emacs-devel

Ihor Radchenko wrote:

>> AFAIU, no conversion takes place between ``Elisp formats''
>> and GMP formats. Our bignums rely on GMP for all data
>> storage and memory allocation.
>
> Thanks for the clarification! So, GMP is not as fast as
> SBCL's implementation after all. SBCL uses
> https://github.com/sbcl/sbcl/blob/master/src/code/bignum.lisp
> - a custom bignum implementation, which is clearly faster
> compared to GMP (in the provided benchmark):
>
> perf record ~/Git/emacs/src/emacs -Q -batch -l /tmp/fib.eln
> 0.739 s
> (0.007 s for SBCL)
>
> 17.11%  emacs  libgmp.so.10.5.0  [.] __gmpz_sizeinbase
>  7.35%  emacs  libgmp.so.10.5.0  [.] __gmpz_add
>
> ^^ already >0.1 sec.

And we are not the only ones using GMP, right? So maybe this
issue in particular would be solved even more broadly by
a SBCL -> GMP transition ...

BTW, here are a bunch of new benchmarks from elisp-benchmarks
brought to CL. Ironically, as some of them were from CL in the
first place. But it is the way it goes.

https://dataswamp.org/~incal/cl/bench/

Several benchmarks are too Emacs-specific to make sense (work)
anywhere else, but there are still a few left to do. Note that
in fib.cl there are now, also brought from elisp-benchmarks,
a bunch of new implementations to try.

--
underground experts united
https://dataswamp.org/~incal

^ permalink raw reply	[flat|nested] 47+ messages in thread