unofficial mirror of guile-devel@gnu.org 
* Elisp performance
@ 2009-07-29 12:50 Daniel Kraft
  2009-07-30  3:23 ` Ken Raeburn
                   ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: Daniel Kraft @ 2009-07-29 12:50 UTC (permalink / raw)
  To: guile-devel; +Cc: Andy Wingo

[-- Attachment #1: Type: text/plain, Size: 3836 bytes --]

Hi all,

as promised, here are some performance results for the current elisp 
compiler.  BTW, it now supports disabling the checks for void values 
on variable access via compiler options, as well as the 
lexical-let/lexical-let* constructs that mimic the ones from emacs' 
cl package (but while they degrade performance in emacs, in Guile 
they improve it).

Lambda arguments are still always dynamically bound, which is quite a 
pity as it inhibits tail-call optimization; to counter this, I'll work 
on a compiler option to bind certain or all symbols lexically, 
including lambda arguments (and there's also an emacs branch doing 
this, so we're ultimately even trying to be compatible...).

For the timings, I used a simple prime sieve implemented imperatively 
with while and setq/set!, because a recursive one would put elisp 
without tail calls at a disadvantage (and because lexical binding does 
not yet work for lambda arguments); the code is attached.  The Scheme 
and Elisp versions are identical up to syntax (set! -> setq, define -> 
defun, begin -> progn).  Additionally, for the lexical-let timings all 
let's were replaced by lexical-let's in the elisp version.  Here are 
the results:

Iterative prime sieve, (length (find-primes-to 5000)):
   Scheme: 0.42s
   Elisp, no void checks, lexical let: 3.40s
   Elisp, no void checks, dynamic let: 4.43s
   Elisp, void checks, dynamic let: 5.12s
   Elisp, void checks, lexical let: 4.06s

So it is apparent that both disabling void checks and using lexical 
let's instead of the standard dynamic ones improve performance 
notably.  However, we're still quite far from the Scheme version.

My conjecture is that the remaining bottleneck is the primitive calls, 
as they are still resolved using the dynamic-binding and fluid 
mechanisms; so maybe it is really a good idea to change function 
bindings to just global ones without dynamic scoping, and implement 
flet/flet* with dynamic-wind.  In contrast to variables, access 
performance matters a lot more for functions than let performance, I 
guess, so this might be the way to go.  What do you think?
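To make the idea concrete, here is a minimal sketch of flet over a
plain global binding, using dynamic-wind to save and restore the
previous definition; the names `my-func' and `call-with-flet' are
hypothetical and just stand in for an Elisp function cell:

```scheme
;; Sketch only: assumes function cells become plain module-level
;; bindings rather than fluids.
(define (my-func x) (* 2 x))

(define (call-with-flet new-def thunk)
  (let ((old my-func))
    (dynamic-wind
      (lambda () (set! my-func new-def))  ; install temporary definition
      thunk                               ; run body under the new binding
      (lambda () (set! my-func old)))))   ; restore, even on non-local exit

(call-with-flet (lambda (x) (+ x 1))
                (lambda () (my-func 4)))  ; → 5
(my-func 4)                               ; → 8, old definition restored
```

Ordinary calls to my-func then cost a plain variable reference, and
only entering/leaving the flet extent pays for the dynamic-wind.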

As another test I implemented just a small loop, again both in Scheme 
and Elisp with identical code (also attached).  Here are the timing results:

Tight loop, (test 1000000):
   Scheme: 0.72s
   Elisp, no void checks, lexical let: 2.33s
   Elisp, no void checks, lexical let, guile-ref: 1.29s
   Elisp, no void checks, lexical let, guile-primitive: 0.92s

In the guile-ref and guile-primitive runs, I replaced both primitives (< 
and 1+) using the extension constructs (guile-ref (guile) </1+), which 
is like (@ (guile) </1+) in Scheme, or (guile-primitive </1+) which 
builds a (make-primitive-ref) in TreeIL.  The guile-ref timing is what 
we would also get if we changed function bindings as I suggested above.
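For reference, this is what the corresponding module reference looks
like from the Scheme side (the Elisp guile-ref construct compiles to
the same thing):

```scheme
;; (@ (guile) 1+) fetches the binding from the (guile) module
;; directly, bypassing any dynamic function-binding lookup:
((@ (guile) 1+) 41)   ; → 42
((@ (guile) <) 1 2)   ; → #t
```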

Obviously, it would help a lot to do so.  On the other hand, switching 
to primitive-refs would help even more, but I fear we cannot easily do 
so, because at compile time we cannot know whether a symbol still 
targets a primitive or has been rebound...  BTW, a quick test with 
Scheme:

scheme@(guile-user)> (set! + (lambda (a b) 42))
scheme@(guile-user)> +
#<program b700a790 at <unknown port>:0:8 (a b)>
scheme@(guile-user)> (+ 1 2)
3
scheme@(guile-user)> (let ((+ (lambda (a b) 42)))
... (+ 1 2))
42

So it seems that the Scheme compiler just ignores this possibility... 
Is doing (set! + ...) and expecting (+ 1 2) to change simply invalid, 
or should this strictly be regarded as a bug?

In any case, because of dynamic scoping and the expected ability of 
flet to possibly change even primitives during its extent, I think we 
can't do anything like that for Elisp (except providing 
guile-primitive for hand-optimizing such calls).

Yours,
Daniel

-- 
Done:  Arc-Bar-Cav-Ran-Rog-Sam-Tou-Val-Wiz
To go: Hea-Kni-Mon-Pri

[-- Attachment #2: iterate_sieve.el --]
[-- Type: text/plain, Size: 1098 bytes --]

; Imperative prime sieve in Elisp.

(defun sieve-out (pivot number-list)
  (lexical-let ((i number-list)
                (result '())
                (result-tail '()))
    (while (not (null i))
      (if (not (zerop (% (car i) pivot)))
        (if (consp result)
          (progn
            (setcdr result-tail (cons (car i) '()))
            (setq result-tail (cdr result-tail)))
          (progn
            (setq result (cons (car i) '()))
            (setq result-tail result))))
      (setq i (cdr i)))
    result))

(defun sieve (number-list)
  (lexical-let ((i number-list)
                (result '())
                (result-tail '()))
    (while (not (null i))
      (if (consp result)
        (progn
          (setcdr result-tail (cons (car i) '()))
          (setq result-tail (cdr result-tail)))
        (progn
          (setq result (list (car i)))
          (setq result-tail result)))
      (setq i (sieve-out (car i) i)))
    result))

(defun find-primes-to (to)
  (lexical-let ((numbers '())
                (i to))
    (while (> i 2)
      (setq numbers (cons i numbers))
      (setq i (1- i)))
    (sieve numbers)))

[-- Attachment #3: iterate_sieve.scm --]
[-- Type: text/plain, Size: 1089 bytes --]

; Imperative prime sieve in Scheme.

(define (sieve-out pivot number-list)
  (let ((i number-list)
        (result '())
        (result-tail '()))
    (while (not (null? i))
      (if (not (zero? (modulo (car i) pivot)))
        (if (pair? result)
          (begin
            (set-cdr! result-tail (cons (car i) '()))
            (set! result-tail (cdr result-tail)))
          (begin
            (set! result (list (car i)))
            (set! result-tail result))))
      (set! i (cdr i)))
    result))

(define (sieve number-list)
  (let ((i number-list)
        (result '())
        (result-tail '()))
    (while (not (null? i))
      (if (pair? result)
        (begin
          (set-cdr! result-tail (cons (car i) '()))
          (set! result-tail (cdr result-tail)))
        (begin
          (set! result (cons (car i) '()))
          (set! result-tail result)))
      (set! i (sieve-out (car i) i)))
    result))

(define (find-primes-to to)
  (let ((numbers '())
        (i to))
    (while (> i 2)
      (set! numbers (cons i numbers))
      (set! i (1- i)))
    (sieve numbers)))

[-- Attachment #4: tight_loop.el --]
[-- Type: text/plain, Size: 151 bytes --]

; Just a tight loop in Elisp

(defun test (to)
  (lexical-let ((i 0))
    (while ((guile-primitive <) i to)
      (setq i ((guile-primitive 1+) i)))))

[-- Attachment #5: tight_loop.scm --]
[-- Type: text/plain, Size: 110 bytes --]

; Just a tight loop in Scheme.

(define (test to)
  (let ((i 0))
    (while (< i to)
      (set! i (1+ i)))))

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Elisp performance
  2009-07-29 12:50 Elisp performance Daniel Kraft
@ 2009-07-30  3:23 ` Ken Raeburn
  2009-07-31  5:15   ` Daniel Kraft
  2009-08-04 15:51   ` Andy Wingo
  2009-07-30 20:18 ` Neil Jerram
  2009-08-04 15:47 ` Andy Wingo
  2 siblings, 2 replies; 23+ messages in thread
From: Ken Raeburn @ 2009-07-30  3:23 UTC (permalink / raw)
  To: Daniel Kraft; +Cc: guile-devel

On Jul 29, 2009, at 08:50, Daniel Kraft wrote:
> Iterative prime sieve, (length (find-primes-to 5000)):
>  Scheme: 0.42s
>  Elisp, no void checks, lexical let: 3.40s
>  Elisp, no void checks, dynamic let: 4.43s
>  Elisp, void checks, dynamic let: 5.12s
>  Elisp, void checks, lexical let: 4.06s

It doesn't mean much without testing on the same machine, of course,  
but I get ~1s times for running your lexical-let elisp version in  
emacs.  It'd be a bit of a symbolic win if you can match Emacs's own  
Lisp execution times. :-)  Though of course, Emacs runs nearly  
everything byte-compiled, so it would only be symbolic.  (In this  
case, the byte-compiled lexical-let version took about 0.2s on my  
machine, even better than the Scheme version on your machine.)

I should run some timing tests myself -- I'm particularly interested  
in the allocation/gc performance of Guile vs Emacs, for obvious reasons.

> My conjecture is that the remaining bottle-neck are the primitive  
> calls, as they are still resolved using the dynamic-binding and  
> fluid mechanisms; so maybe it is really a good idea to change  
> function bindings to just global ones without dynamic scoping, and  
> implement flet/flet* with dynamic-wind.  In contrast to variables,  
> for functions access performance is a lot more important than let  
> performance, I guess, so this might be the way to go.  What do you  
> think?

Sounds reasonable to me, but I'm not an expert in that area.

> Obviously, it would help a lot to do so.  On the other hand,  
> switching to primitive-ref's would help even more, but I fear we can  
> not easily do so, because we can not know if a symbol targets a  
> primitive or was rebound at compile time...  BTW, a quick test with  
> Scheme:
> [....]
> So it seems that the Scheme compiler just ignores this  
> possibility... Is (set! + ...) and expecting (+ 1 2) to change  
> invalid, or should this strictly be regarded as a bug?

In general I don't think you can assume it for all symbols, but if it  
helps, the Emacs byte-compiler also assumes that "+" and "cons" and  
certain other functions won't be redefined.  It's even got an "add1"  
opcode.

So if I understand right, if you make similar assumptions and change  
how function bindings are handled, your performance for this code  
drops to under a second?  It sounds like maybe you can get within  
shouting distance of Emacs's own performance already, at least for  
certain test cases.

Would this interfere with possibly blending Scheme GOOPS code with  
Elisp someday?  Or is the generic support there at a lower level than  
this?  (E.g., a "marker" object holds a buffer handle, possibly nil,  
and an offset that automatically gets adjusted if text is inserted  
before it.  You can use "(+ 1 marker)" and get back an integer one  
greater than the marker's current offset.  If markers were implemented  
using GOOPS, would this use of "+" work, given the changes you're  
suggesting?)

Ken





* Re: Elisp performance
  2009-07-29 12:50 Elisp performance Daniel Kraft
  2009-07-30  3:23 ` Ken Raeburn
@ 2009-07-30 20:18 ` Neil Jerram
  2009-07-30 23:54   ` Ken Raeburn
                     ` (2 more replies)
  2009-08-04 15:47 ` Andy Wingo
  2 siblings, 3 replies; 23+ messages in thread
From: Neil Jerram @ 2009-07-30 20:18 UTC (permalink / raw)
  To: Daniel Kraft; +Cc: Andy Wingo, guile-devel

Daniel Kraft <d@domob.eu> writes:

> Lambda arguments are still always dynamically bound, which is quite a
> pity as it inhibits tail-call optimization;

This prompted me to wonder if using fluids is the best way to
implement dynamic binding.

Perhaps I'm forgetting something basic, but surely it's using
`dynamic-wind' that gives us dynamic binding semantics, not fluids.

(Note that `with-fluids', which is probably the closest thing in Guile
Scheme to a dynamic let, is a combination of `dynamic-wind' and
`fluid-set!'.)
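(Indeed, a single-binding with-fluids can be written in just those
terms -- modulo the continuation re-entry subtleties that the real
with-fluids also handles; `with-fluid-1' is a made-up name:

```scheme
;; Sketch: (with-fluid-1 f v thunk) ≈ (with-fluids ((f v)) (thunk)),
;; expressed with dynamic-wind and fluid-set! only.
(define (with-fluid-1 fluid value thunk)
  (let ((outer #f))
    (dynamic-wind
      (lambda ()
        (set! outer (fluid-ref fluid))   ; save the outer value
        (fluid-set! fluid value))        ; install the new one
      thunk
      (lambda ()
        (fluid-set! fluid outer)))))     ; restore on exit

(define f (make-fluid))
(fluid-set! f 0)
(with-fluid-1 f 42 (lambda () (fluid-ref f)))  ; → 42
(fluid-ref f)                                  ; → 0
```

Nothing fluid-specific is doing the "dynamic" part here; the
dynamic-wind is.)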

The main thing I believe that makes a fluid different from a normal
variable is that a fluid can have a different value in each thread -
which is not relevant to Elisp.

So what's my point?  I'm not sure, just musing.  As far as performance
and tail-call optimization are concerned, I would guess that the main
thing that needs to be addressed in order to reinstate tail-call
optimization would be the dynamic wind - i.e. the compiler knowing
that it isn't necessary to set up another dynamic-wind frame, it can
just jump with the current variables.

> Iterative prime sieve, (length (find-primes-to 5000)):
>   Scheme: 0.42s
>   Elisp, no void checks, lexical let: 3.40s
>   Elisp, no void checks, dynamic let: 4.43s
>   Elisp, void checks, dynamic let: 5.12s
>   Elisp, void checks, lexical let: 4.06s

As Ken says, it would be good to see an Emacs timing too.

(Also, if I get time, I'll try this with the old translation code
too.  Just for fun, and hopefully for confirmation that the new
approach is much better!)

> My conjecture is that the remaining bottle-neck are the primitive
> calls, as they are still resolved using the dynamic-binding and fluid
> mechanisms; so maybe it is really a good idea to change function
> bindings to just global ones without dynamic scoping, and implement
> flet/flet* with dynamic-wind.  In contrast to variables, for functions
> access performance is a lot more important than let performance, I
> guess, so this might be the way to go.  What do you think?

See above.  I'm not sure why fluid access should be significantly
slower than non-fluid access, so I would guess that the performance
cost is coming from the dynamic-wind part.

Regards,
        Neil





* Re: Elisp performance
  2009-07-30 20:18 ` Neil Jerram
@ 2009-07-30 23:54   ` Ken Raeburn
  2009-07-31  6:09     ` Daniel Kraft
  2009-08-04 10:26     ` Andy Wingo
  2009-07-31  6:02   ` Daniel Kraft
  2009-08-04 10:17   ` Andy Wingo
  2 siblings, 2 replies; 23+ messages in thread
From: Ken Raeburn @ 2009-07-30 23:54 UTC (permalink / raw)
  To: Neil Jerram; +Cc: Andy Wingo, Daniel Kraft, guile-devel

On Jul 30, 2009, at 16:18, Neil Jerram wrote:
> The main thing I believe that makes a fluid different from a normal
> variable is that a fluid can have a different value in each thread -
> which is not relevant to Elisp.

Not yet, at least.

And maybe that's enough.  There's other stuff in Emacs besides 
variable bindings that would require dynamic-wind support, like flet, 
save-excursion (preserves current buffer and position), 
with-output-to-string and with-output-to-temp-buffer (preserve 
'standard-output'), as well as any number of explicit calls to 
unwind-protect from Elisp code to alter and restore other global state 
of the program (e.g., with set-default-file-modes or 
set-standard-case-table).  The current incarnation of these things 
assumes a single-threaded execution model.  So, since we can't make 
Emacs multithreaded by simply dropping Guile Elisp into it, if Emacs 
is all we're concerned about, maybe assuming single-threaded execution 
only is okay for now.

On the other hand, if we want users to be able to write code snippets  
in Elisp and have them callable from a concurrently-multithreaded  
Scheme program (no Emacs processes in sight), I think we'll want the  
multithreaded support for the basic Elisp language sooner rather than  
later, even if multithreaded Emacs needs much more work.

I also don't know how complicated a switch to stop using fluids would  
be.  If it's not too painful, the performance impact would be  
interesting to see -- does it get us closer to the performance of  
Emacs itself?

> See above.  I'm not sure why fluid access should be significantly
> slower than non-fluid access, so I would guess that the performance
> cost is coming from the dynamic-wind part.

For a program the size of Emacs, I'd be concerned about the numbers of  
fluids and dynamic states created, especially since the space cost  
(and run time cost for growing the vectors, every 20th new entry)  
grows with the product of the two.  The number of dynamic states may  
shrink, but the size of each dynamic state -- the number of allocated  
fluid slots -- doesn't shrink, it only grows.

While they don't really correlate to anything in Guile Elisp directly,  
the default max_specpdl_size (the specpdl stack is where things are  
tracked for unwinding -- previous bindings, unwind-protect calls,  
previous location data for save-excursion, etc) was raised to 1000 a  
few years ago, and the maximum lisp evaluation depth (limits calls to  
eval, funcall, apply) to 400 just a couple years ago.  I assume that's  
because the previous limits (650 and 300 respectively) were  
occasionally getting reached.  That suggests to me rather a lot of  
dynamic states, and probably a large number of fluids.

Ken





* Re: Elisp performance
  2009-07-30  3:23 ` Ken Raeburn
@ 2009-07-31  5:15   ` Daniel Kraft
  2009-08-04 15:51   ` Andy Wingo
  1 sibling, 0 replies; 23+ messages in thread
From: Daniel Kraft @ 2009-07-31  5:15 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: guile-devel

Ken Raeburn wrote:
>> Obviously, it would help a lot to do so.  On the other hand, switching 
>> to primitive-ref's would help even more, but I fear we can not easily 
>> do so, because we can not know if a symbol targets a primitive or was 
>> rebound at compile time...  BTW, a quick test with Scheme:
>> [....]
>> So it seems that the Scheme compiler just ignores this possibility... 
>> Is (set! + ...) and expecting (+ 1 2) to change invalid, or should 
>> this strictly be regarded as a bug?
> 
> In general I don't think you can assume it for all symbols, but if it 
> helps, the Emacs byte-compiler also assumes that "+" and "cons" and 
> certain other functions won't be redefined.  It's even got an "add1" 
> opcode.
> 
> So if I understand right, if you make similar assumptions and change how 
> function bindings are handled, your performance for this code drops to 
> under a second?  It sounds like maybe you can get within shouting 
> distance of Emacs's own performance already, at least for certain test 
> cases.

Well, that's partially true.  For those built-ins that map directly to 
Guile primitives, it would probably be an advantage to build 
make-primitive-ref's for the generated TreeIL code directly; I'm not 
sure if that's done at the moment, but in the future this will help for 
things like optimization and special op-codes (e.g. add1).

I think it would really be reasonable to assume certain symbols don't 
get rebound, if they don't have a different lexical binding at the 
moment; although I think that the concept of dynamic binding is actually 
also about the ability to rebind even built-ins to allow for changes... 
  For instance, what if I wanted to write a program that evaluates the 
"efficiency" of some numerical algorithm by overloading + and * to count 
the number of operations performed?  This seems like a valid need to me 
(in fact, I might be doing something similar for my Bachelor's thesis; 
though probably not in Scheme or elisp, so this does not directly matter 
here).

So my idea was to provide a compiler option to always use an ordinary 
function call for certain or all primitives as a compromise; that 
sounds like a good middle ground to me, catering for both needs.

However, as a side-note:  I don't think my code would drop below one 
second if this was implemented (hm, at least I'm not sure), because 
for instance all built-ins returning booleans (like < in the example) 
cannot map directly to Guile primitives, since I need to translate #f 
to %nil in between...  It's a pity because comparisons are probably 
quite common, especially in such loops, but if we don't want to get 
rid of translation and don't care about #f in elisp (see my other post 
in the %nil thread), I see no way around this.
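A sketch of why such a built-in needs a wrapper; the name
`elisp-less' is hypothetical, and #nil stands for whatever object
represents Elisp nil on the Scheme side:

```scheme
;; Hypothetical wrapper: the Elisp-visible "<" cannot be a bare
;; primitive-ref, because Scheme's #f result must be translated to
;; the Elisp nil object first.
(define (elisp-less a b)
  (if (< a b) #t #nil))
```

So even with global function bindings, the call goes through one
extra closure per comparison.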

> Would this interfere with possibly blending Scheme GOOPS code with Elisp 
> someday?  Or is the generic support there at a lower level than this?  
> (E.g., a "marker" object holds a buffer handle, possibly nil, and an 
> offset that automatically gets adjusted if text is inserted before it.  
> You can use "(+ 1 marker)" and get back an integer one greater than the 
> marker's current offset.  If markers were implemented using GOOPS, would 
> this use of "+" work, given the changes you're suggesting?)

To be honest, I've nearly no idea about GOOPS so far and thus can't 
comment here...

Yours,
Daniel

-- 
Done:  Arc-Bar-Cav-Ran-Rog-Sam-Tou-Val-Wiz
To go: Hea-Kni-Mon-Pri





* Re: Elisp performance
  2009-07-30 20:18 ` Neil Jerram
  2009-07-30 23:54   ` Ken Raeburn
@ 2009-07-31  6:02   ` Daniel Kraft
  2009-07-31  9:59     ` Ken Raeburn
  2009-08-04 11:00     ` Andy Wingo
  2009-08-04 10:17   ` Andy Wingo
  2 siblings, 2 replies; 23+ messages in thread
From: Daniel Kraft @ 2009-07-31  6:02 UTC (permalink / raw)
  To: Neil Jerram; +Cc: Andy Wingo, guile-devel

Hi Neil,

Neil Jerram wrote:
> Daniel Kraft <d@domob.eu> writes: 
>> Lambda arguments are still always dynamically bound, which is quite a
>> pity as it inhibits tail-call optimization;
> 
> This prompted me to wonder if using fluids is the best way to
> implement dynamic binding.
> 
> Perhaps I'm forgetting something basic, but surely it's using
> `dynamic-wind' that gives us dynamic binding semantics, not fluids.

I also thought about this yesterday; and I think you're right: just 
as I propose to do for function bindings, we could also do for values. 
Somehow I had the impression in the back of my head that fluids are 
meant to provide dynamic binding as we need it and that with-fluid* 
does some "magic" to do it most efficiently.

However, as I now see it, the dynamic state is meant to apply to 
threads and not between different with-fluid*'s, right?  So using 
just a dynamic-wind instead of with-fluid* will be faster (or at 
least on par)...?

If so, I think we should get rid of the fluids and switch to directly 
using dynamic-wind.  The point about future multi-threading might be 
interesting, though...  If we did switch elisp to multi-threading at 
some point, what would become of dynamic binding?  I guess we can't do 
this across threads, so each dynamically bound variable would also be 
thread-local?  I think this is the most reasonable concept; trying to 
make dynamic binding work across threads looks really messy to me. 
Then, the fluid concept will be just what we need again; but we 
probably also want to find a way for shared global variables -- which 
can wait until then, of course ;)

Another thing:  If we used dynamic-wind directly instead of the 
current fluid system, we could get rid of void as a special value and 
instead just unbind the symbol from our module in case of void; then, 
on access Guile would complain itself and we wouldn't need the special 
checks for void.  With dynamic-wind, we could ensure that variables 
are re-bound or unbound to their previous state when leaving the 
dynamic scope.  This rebinding would, however, be more expensive, as 
we would have to care for a lot more special cases.  With regards to 
void, I like the current implementation with the possibility to 
disable the access check if not needed and then get full speed.  So 
I'm not sure if I would change void handling at all, even if we 
switched away from fluids and it became possible.

> (Note that `with-fluids', which is probably the closest thing in Guile
> Scheme to a dynamic let, is a combination of `dynamic-wind' and
> `fluid-set!'.)

Yes, I agree; and with-fluids is quite nice to use, also.  But as the 
compiler handles dynamic binding in our case, I also don't care about 
explicitly setting/reverting the variables with dynamic-wind.  If 
with-fluids uses dynamic-wind internally, we can only win on performance 
and all we lose seems to be the thread-locality for the future.  But 
this may still be a point, I'm not sure...

> So what's my point?  I'm not sure, just musing.  As far as performance
> and tail-call optimization are concerned, I would guess that the main
> thing that needs to be addressed in order to reinstate tail-call
> optimization would be the dynamic wind - i.e. the compiler knowing
> that it isn't necessary to set up another dynamic-wind frame, it can
> just jump with the current variables.

Hm, I'm not sure if I got your point correctly, but I think that dynamic 
binding and tail-call optimization are difficult to combine in general, 
no matter if the dynamic binding is done with fluids or dynamic-wind.

As you said, the point is about the compiler or VM handling the 
"unwinding" (i.e. resetting changed dynamic bindings to their old 
values) and doing this in a way to allow tail-call optimization.  I 
don't think a tail-call is possible in general at all if the caller's 
body is executed inside a dynamic wind (well, at least for recursive 
calls, each level will have to create it's own unwinding-context and 
thus grow the stack despite tail-calls)...?  I'm not very much into 
theoretical considerations in this topic, so maybe you know more about 
if/how/when tail optimization is possible together with dynamic binding 
or dynamic-wind; but some ideas below.

For dynamic binding, however, I think one could do tail-call 
optimization, at least in some cases, like when the callee binds a 
superset of the symbols bound by the caller (which would suffice for 
optimization of recursive calls).  But then we would need to do 
dynamic binding in some other way than with dynamic-wind (i.e. with 
some stack-like data structure where we can push/pop values for the 
symbols) and also have to know about tail-calls in the elisp compiler 
directly; I don't think we should go for this...  When I implement the 
option to "always" bind certain symbols lexically, at least for 
lambda's with lexically bound arguments tail-call optimization will 
again be available for free.

But if we want to work on this, a not-so-bad idea could be thinking 
about dynamic-binding support directly by the VM (with something like 
the special data structure I mentioned; we could even look at how emacs 
does it exactly); then, the VM (or a lower-level compiler) which already 
knows about tail-calls could be taught to also handle the dynamic 
binding aspects.
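A rough sketch of such a stack-like structure (shallow binding; all
names here are hypothetical, and a real implementation would live in
the VM rather than in a hash table):

```scheme
;; Each symbol keeps a stack of dynamic values; a dynamic let pushes
;; on entry and pops on exit.  A compiler/VM that knows about these
;; push/pop pairs could merge them across tail calls.
(define dyn-table (make-hash-table))

(define (dyn-push! sym val)
  (hashq-set! dyn-table sym (cons val (hashq-ref dyn-table sym '()))))

(define (dyn-pop! sym)
  (hashq-set! dyn-table sym (cdr (hashq-ref dyn-table sym '()))))

(define (dyn-ref sym)
  (car (hashq-ref dyn-table sym '())))

(dyn-push! 'x 1)   ; outer dynamic let
(dyn-push! 'x 2)   ; inner dynamic let shadows it
(dyn-ref 'x)       ; → 2
(dyn-pop! 'x)      ; leave inner extent
(dyn-ref 'x)       ; → 1
```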

>> Iterative prime sieve, (length (find-primes-to 5000)):
>>   Scheme: 0.42s
>>   Elisp, no void checks, lexical let: 3.40s
>>   Elisp, no void checks, dynamic let: 4.43s
>>   Elisp, void checks, dynamic let: 5.12s
>>   Elisp, void checks, lexical let: 4.06s
> 
> As Ken says, it would be good to see an Emacs timing too.

I'd like to provide one, but have no idea how to do so...  I just 
managed to find out how to evaluate elisp code from a buffer, but it 
would be cool to get some elisp REPL if possible (maybe even without 
starting emacs itself, just from a shell?) -- and how do I time?  As 
well as byte-compile and then time?

> (Also, if I get time, I'll try this with the old translation code
> too.  Just for fun, and hopefully for confirmation that the new
> approach is much better!)

That would be cool!

>> My conjecture is that the remaining bottle-neck are the primitive
>> calls, as they are still resolved using the dynamic-binding and fluid
>> mechanisms; so maybe it is really a good idea to change function
>> bindings to just global ones without dynamic scoping, and implement
>> flet/flet* with dynamic-wind.  In contrast to variables, for functions
>> access performance is a lot more important than let performance, I
>> guess, so this might be the way to go.  What do you think?
> 
> See above.  I'm not sure why fluid access should be significantly
> slower than non-fluid access, so I would guess that the performance
> cost is coming from the dynamic-wind part.

I don't know about fluid access; dynamic-winding should not take place 
(at least not repeatedly) in my loop because there are no let's within 
it...  However, as I wrote in another message, I think now that the 
difference between the naive elisp version and the guile-ref one might 
also be the boolean translation #f -> %nil in elisp's < built-in vs. 
calling Guile's primitive directly and not caring about translation.

Some testing however:

(define (test1 to)
   (let ((i 0))
     (while (< i to)
       (set! i (1+ i)))))

(define (test2 to)
   (let ((i (make-fluid)))
     (fluid-set! i 0)
     (while (< (fluid-ref i) to)
       (fluid-set! i (1+ (fluid-ref i))))))

(define global-i)
(define (test3 to)
   (set! global-i 0)
   (while (< global-i to)
     (set! global-i (1+ global-i))))

And calling all three versions with 1000000 iterations gave about 0.71s 
for test1 and test3 (so lexical i and module i seem to be the same 
speed) but 2.5s for test2.  Thus I think that the fluids are indeed 
slower (I've no idea how they are implemented, but these results seem to 
indicate it).

Ok, I guess a quick summary is in order:  I think implementing dynamic 
binding directly using dynamic-wind instead of with fluids is a good 
idea, as long as we don't already want to keep the fluids for future 
multi-threading.  If this ever comes, we are probably again forced back 
to fluids.  Do you think we should switch or not?  Regarding tail calls, 
my opinion is that there's no "easy" way to make them work with 
dynamically bound arguments (as arguments are in elisp), if there is at 
all -- and that we should not bother.  I don't think emacs does general 
tail-call optimization (does it?), and there will be a way to switch 
lambda arguments back to lexical scoping, so that for those lambda's 
tail-calls will be optimized just fine with the current compiler/VM 
infrastructure.

Yours,
Daniel

-- 
Done:  Arc-Bar-Cav-Ran-Rog-Sam-Tou-Val-Wiz
To go: Hea-Kni-Mon-Pri





* Re: Elisp performance
  2009-07-30 23:54   ` Ken Raeburn
@ 2009-07-31  6:09     ` Daniel Kraft
  2009-08-04 10:26       ` Andy Wingo
  2009-08-04 10:26     ` Andy Wingo
  1 sibling, 1 reply; 23+ messages in thread
From: Daniel Kraft @ 2009-07-31  6:09 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: Andy Wingo, guile-devel, Neil Jerram

Ken Raeburn wrote:
> And maybe that's enough.  There's other stuff in Emacs besides variable
> bindings that would require dynamic-wind support, like flet, 
> save-excursion (preserves current buffer and position), 
> with-output-to-string and with-output-to-temp-buffer (preserve 
> 'standard-output'), as well as any number of explicit calls to 
> unwind-protect from Elisp code to alter and restore other global state 
> of the program (e.g., with set-default-file-modes or 
> set-standard-case-table).  The current incarnation of these things 
> assumes a single-threaded execution model.  So, since we can't make 
> Emacs multithreaded by simply dropping Guile Elisp into it, if Emacs is 
> all we're concerned about, maybe assuming single-threaded execution only 
> is okay for now.
>
> On the other hand, if we want users to be able to write code snippets in 
> Elisp and have them callable from a concurrently-multithreaded Scheme 
> program (no Emacs processes in sight), I think we'll want the 
> multithreaded support for the basic Elisp language sooner rather than 
> later, even if multithreaded Emacs needs much more work.

As already stated upthread, I'm not sure about this myself...  It would 
be cool to stay as flexible as possible with regards to future 
multi-threading, but on the other hand I also like the idea of getting 
rid of the fluids.  My timings there seem to suggest that fluids at 
least have "some" performance hit.

Of course, anyone interested in performance can also use lexical-let 
instead of let and get rid of all these performance problems as well 
;)  But the same may hold for threading, too: if you want to write 
elisp code that is called from multi-threaded Scheme, just avoid 
dynamic binding and you'll get no problems...

> I also don't know how complicated a switch to stop using fluids would 
> be.  If it's not too painful, the performance impact would be 
> interesting to see -- does it get us closer to the performance of Emacs 
> itself?

Well, it will not be trivial, but quite doable, I think.  Regarding 
the performance, I'm quite sure it will help, but less sure about the 
actual quantitative impact -- if we're lucky, it might be like the 
difference between dynamic and lexical let in my timings in the 
original post.  This seems quite plausible to me.

Daniel

-- 
Done:  Arc-Bar-Cav-Ran-Rog-Sam-Tou-Val-Wiz
To go: Hea-Kni-Mon-Pri




^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Elisp performance
  2009-07-31  6:02   ` Daniel Kraft
@ 2009-07-31  9:59     ` Ken Raeburn
  2009-07-31 15:14       ` Daniel Kraft
  2009-08-04 11:00     ` Andy Wingo
  1 sibling, 1 reply; 23+ messages in thread
From: Ken Raeburn @ 2009-07-31  9:59 UTC (permalink / raw)
  To: Daniel Kraft; +Cc: Andy Wingo, guile-devel, Neil Jerram

On Jul 31, 2009, at 02:02, Daniel Kraft wrote:
>>> Iterative prime sieve, (length (find-primes-to 5000)):
>>>  Scheme: 0.42s
>>>  Elisp, no void checks, lexical let: 3.40s
>>>  Elisp, no void checks, dynamic let: 4.43s
>>>  Elisp, void checks, dynamic let: 5.12s
>>>  Elisp, void checks, lexical let: 4.06s
>> As Ken says, it would be good to see an Emacs timing too.
>
> I'd like to provide one, but have no idea how to do so...  I just  
> managed to find out how to evaluate elisp code from a buffer, but it  
> would be cool to get some elisp REPL if possible (maybe even without  
> starting emacs itself and just from a shell?) -- and how do I time?   
> As well as byte-compile and time then?

Easy -- M-x ielm RET starts "interactive emacs lisp mode" in a buffer!

The function (current-time) provides the current timestamp with  
microsecond resolution (if supported by the OS) as a list of three  
numbers (working around Emacs integer representation limitations), so  
you can invoke it before and after and compute the difference.

So add something like this to your elisp code, and save it in a file:

(defun time-diff (t1 t2)
   (+ (* 65536.0 (- (nth 0 t1) (nth 0 t2)))
      (* 1.0 (- (nth 1 t1) (nth 1 t2)))
      (* 1.0e-6 (- (nth 2 t1) (nth 2 t2)))))

(defun time-test (exp)
   (let ((start (current-time)))
     (eval exp)
     (time-diff (current-time) start)))

Eval that code, then in ielm run:

  (time-test '(length (find-primes-to 5000)))

Be sure to quote the expression, or it'll be evaluated before the  
start time is computed... I made that mistake a couple of times. :-)   
And you can use

   (byte-compile 'sieve)

to compile an individual function, or M-x byte-compile-file followed  
by M-x load-file to load the compiled version of all the functions so  
you can run them.  After byte-compiling, try M-x disassemble RET sieve  
RET if you're curious about what Emacs does with it.

> Ok, I guess a quick summary is in order:  I think implementing  
> dynamic binding directly using dynamic-wind instead of with fluids  
> is a good idea, as long as we don't already want to keep the fluids  
> for future multi-threading.  If this ever comes, we are probably  
> again forced back to fluids.  Do you think we should switch or not?   
> Regarding tail calls, my opinion is that there's no "easy" way to  
> make them work with dynamically bound arguments (as arguments are in  
> elisp), if there is at all -- and that we should not bother.  I  
> don't think emacs does general tail-call optimization (does it?),  
> and there will be a way to switch lambda arguments back to lexical  
> scoping, so that for those lambda's tail-calls will be optimized  
> just fine with the current compiler/VM infrastructure.

If the new thread keeps running after the creating thread exits the  
scope that created a dynamic binding, while the new thread still wants  
to keep using or updating it... ugh.

I can imagine cases where you might want to carry bindings across into  
a new thread -- consider a function calling a multithreaded variant of  
mapcar on a long list of inputs, passing a function or lambda  
expression that wants to refer to arguments of the enclosing  
function.  (E.g., "grep" on a list of strings, the applied function  
taking an individual string as its argument, and finding the regexp  
via dynamic scope; with a worker thread per CPU or per core, a long  
list can be processed much faster.)  But, if the new thread is started  
with no dynamic bindings, I think you can fudge it with lexical  
bindings:

   ...
   (mt-mapcar (lexical-let ((regexp regexp))
                 (lambda (s) (if (string-match regexp s) s nil)))
              huge-list-o-strings)

Since there's a way to work around it for specific cases that need  
other behavior, I think it's probably reasonable for the new thread to  
start with global bindings only.

Ken





* Re: Elisp performance
  2009-07-31  9:59     ` Ken Raeburn
@ 2009-07-31 15:14       ` Daniel Kraft
  2009-08-04 11:14         ` Andy Wingo
  0 siblings, 1 reply; 23+ messages in thread
From: Daniel Kraft @ 2009-07-31 15:14 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: Andy Wingo, guile-devel, Neil Jerram

Hi Ken,

Ken Raeburn wrote:
> On Jul 31, 2009, at 02:02, Daniel Kraft wrote:
>>>> Iterative prime sieve, (length (find-primes-to 5000)):
>>>>  Scheme: 0.42s
>>>>  Elisp, no void checks, lexical let: 3.40s
>>>>  Elisp, no void checks, dynamic let: 4.43s
>>>>  Elisp, void checks, dynamic let: 5.12s
>>>>  Elisp, void checks, lexical let: 4.06s
>>> As Ken says, it would be good to see an Emacs timing too.
>>
>> I'd like to provide one, but have no idea how to do so...  I just 
>> managed to find out how to evaluate elisp code from a buffer, but it 
>> would be cool to get some elisp REPL if possible (maybe even without 
>> starting emacs itself and just from a shell?) -- and how do I time?  
>> As well as byte-compile and time then?
> 
[...]

Thanks for the emacs hints!  For the prime sieve, emacs itself takes 
about 1.89s (interpreted) and 0.32s (compiled).  So it's significantly 
faster than my elisp version at the moment; the reason might be the 
performance hit from the primitive calls already discussed (but I don't 
know for sure).  And the compiled version is even faster than Guile 
running the Scheme code.

The "tight loop" takes about 0.91s (interpreted) and 0.26s (compiled) 
in emacs when using lexical-let, and 0.58s / 0.16s when using dynamic 
binding.  So with the manual guile-primitive hack, my version actually 
reaches emacs here -- which is still a quite unfair comparison, of 
course, since my code is faster with lexical-let while emacs is slower 
with it...

>> Ok, I guess a quick summary is in order:  I think implementing dynamic 
>> binding directly using dynamic-wind instead of with fluids is a good 
>> idea, as long as we don't already want to keep the fluids for future 
>> multi-threading.  If this ever comes, we are probably again forced 
>> back to fluids.  Do you think we should switch or not?  Regarding tail 
>> calls, my opinion is that there's no "easy" way to make them work with 
>> dynamically bound arguments (as arguments are in elisp), if there is 
>> at all -- and that we should not bother.  I don't think emacs does 
>> general tail-call optimization (does it?), and there will be a way to 
>> switch lambda arguments back to lexical scoping, so that for those 
>> lambda's tail-calls will be optimized just fine with the current 
>> compiler/VM infrastructure.
> 
> If the new thread keeps running while the creating thread exits the 
> scope that created a dynamic binding, while the new thread wants to keep 
> using or updating it... ugh.

Yep, so I think we should either make dynamic bindings thread-local 
(keep the fluids) or make them global, accept the problems of concurrent 
access, and simply declare them unsafe to use with multiple threads -- 
users should then use lexical bindings in the relevant code.

I did not catch your opinion on which of these ways to go for dynamic 
bindings...  From my point of view, both have real advantages and 
disadvantages: fluids are cleaner and work in a multi-threaded context 
(should it become topical), while getting rid of them in favour of 
dynamic-wind promises better performance and simpler operation when 
single-threaded.  Maybe we could even provide both implementations for 
the user to choose from; I'll investigate how much effort that would be 
and what problems arise.
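For reference, the two candidate strategies might look roughly like 
this in Guile Scheme (hypothetical helper names, not actual compiler 
output):

  ;; Fluid-based: thread-local by construction.
  (define cell (make-fluid))
  (define (dynamic-let-fluid value thunk)
    (with-fluids ((cell value))
      (thunk)))

  ;; dynamic-wind-based: one global location, saved on entry and
  ;; restored on exit; cheaper single-threaded, but visible to (and
  ;; racy with) all other threads.
  (define cell* #f)
  (define (dynamic-let-wind value thunk)
    (let ((saved #f))
      (dynamic-wind
        (lambda () (set! saved cell*) (set! cell* value))
        thunk
        (lambda () (set! cell* saved)))))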

Yours,
Daniel

-- 
Done:  Arc-Bar-Cav-Ran-Rog-Sam-Tou-Val-Wiz
To go: Hea-Kni-Mon-Pri





* Re: Elisp performance
  2009-07-30 20:18 ` Neil Jerram
  2009-07-30 23:54   ` Ken Raeburn
  2009-07-31  6:02   ` Daniel Kraft
@ 2009-08-04 10:17   ` Andy Wingo
  2009-08-04 10:54     ` Daniel Kraft
  2009-08-04 15:58     ` Ken Raeburn
  2 siblings, 2 replies; 23+ messages in thread
From: Andy Wingo @ 2009-08-04 10:17 UTC (permalink / raw)
  To: Neil Jerram; +Cc: Daniel Kraft, guile-devel

Hello!

(Was away for the weekend, but back hacking all week now.)

On Thu 30 Jul 2009 22:18, Neil Jerram <neil@ossau.uklinux.net> writes:

> Daniel Kraft <d@domob.eu> writes:
>
>> Lambda arguments are still always dynamically bound, which is quite a
>> pity as it inhibits tail-call optimization;

Indeed, a pity. Though self-tail calls can be optimized... still,
irritating.

> This prompted me to wonder if using fluids is the best way to
> implement dynamic binding.
>
> Perhaps I'm forgetting something basic, but surely it's using
> `dynamic-wind' that gives us dynamic binding semantics, not fluids.

I haven't read the rest of this thread yet, but this does not seem to be
related to TCO. The middle thunk of a dynamic-wind is *not* called in a
tail context, as indeed it does have a continuation -- the exit thunk.

> The main thing I believe that makes a fluid different from a normal
> variable is that a fluid can have a different value in each thread -
> which is not relevant to Elisp.

True. Of course, it would be a nice thing to give to elisp -- I mean,
currently in Emacs variables have one value per thread as well, it's
just that there's only one thread. So if we give Emacs the capability to
run multiple threads (as we should IMO), we should provide thread-local
values.

>> My conjecture is that the remaining bottle-neck are the primitive
>> calls, as they are still resolved using the dynamic-binding and fluid
>> mechanisms; so maybe it is really a good idea to change function
>> bindings to just global ones without dynamic scoping, and implement
>> flet/flet* with dynamic-wind.  In contrast to variables, for functions
>> access performance is a lot more important than let performance, I
>> guess, so this might be the way to go.  What do you think?
>
> See above.  I'm not sure why fluid access should be significantly
> slower than non-fluid access, so I would guess that the performance
> cost is coming from the dynamic-wind part.

Yes, dynamic-wind does involve a cons. I wish this weren't the case, and
perhaps in the future we can fix this, but that's the way it is now.

But regarding speed: profiling is your friend.

http://article.gmane.org/gmane.lisp.guile.devel/6639, for example.

Cheers,

Andy
-- 
http://wingolog.org/





* Re: Elisp performance
  2009-07-30 23:54   ` Ken Raeburn
  2009-07-31  6:09     ` Daniel Kraft
@ 2009-08-04 10:26     ` Andy Wingo
  1 sibling, 0 replies; 23+ messages in thread
From: Andy Wingo @ 2009-08-04 10:26 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: Daniel Kraft, guile-devel, Neil Jerram

Hi,

On Fri 31 Jul 2009 01:54, Ken Raeburn <raeburn@raeburn.org> writes:

> On Jul 30, 2009, at 16:18, Neil Jerram wrote:
>> The main thing I believe that makes a fluid different from a normal
>> variable is that a fluid can have a different value in each thread -
>> which is not relevant to Elisp.
>
> Not yet, at least.
>
> And maybe that's enough.  There's other stuff in Emacs besides variable
> bindings that would require dynamic-wind support, like flet,
> save-excursion (preserves current buffer and position), with-output-to- 
> string and with-output-to-temp-buffer (preserve 'standard-output'), as
> well as any number of explicit calls to unwind-protect from Elisp code
> to alter and restore other global state of the program (e.g., with set- 
> default-file-modes or set-standard-case-table).  The current incarnation
> of these things assumes a single-threaded execution  model.  So, since
> we can't make Emacs multithreaded by simply dropping  Guile Elisp into
> it, if Emacs is all we're concerned about, maybe  assuming
> single-threaded execution only is okay for now.

I think this would be a shame.

Some of the procedures you mention don't necessarily need
dynamic-wind -- they would be sufficiently modelled by a non-reentrant
`with-throw-handler' or such. Downward-only continuations, that is.

> I also don't know how complicated a switch to stop using fluids would
> be.  If it's not too painful, the performance impact would be
> interesting to see -- does it get us closer to the performance of  Emacs
> itself?

I don't think fluids are the issue TBH. I'll see if I can do some
profiling later today.

>> See above.  I'm not sure why fluid access should be significantly
>> slower than non-fluid access, so I would guess that the performance
>> cost is coming from the dynamic-wind part.
>
> For a program the size of Emacs, I'd be concerned about the numbers of
> fluids and dynamic states created, especially since the space cost  (and
> run time cost for growing the vectors, every 20th new entry)  grows with
> the product of the two.  The number of dynamic states may  shrink, but
> the size of each dynamic state -- the number of allocated  fluid slots
> -- doesn't shrink, it only grows.

OTOH, this is the same problem as interning symbols, which Emacs must
surely share -- does Emacs GC its symbols? I wouldn't worry about this
either. Emacs can call some pre-allocation function to pre-grow the
vector at startup.

> While they don't really correlate to anything in Guile Elisp directly,
> the default max_specpdl_size (the specpdl stack is where things are
> tracked for unwinding -- previous bindings, unwind-protect calls,
> previous location data for save-excursion, etc) was raised to 1000 a
> few years ago, and the maximum lisp evaluation depth (limits calls to
> eval, funcall, apply) to 400 just a couple years ago.  I assume that's
> because the previous limits (650 and 300 respectively) were
> occasionally getting reached.  That suggests to me rather a lot of
> dynamic states, and probably a large number of fluids.

Interesting data, thanks.

Cheers,

Andy
-- 
http://wingolog.org/





* Re: Elisp performance
  2009-07-31  6:09     ` Daniel Kraft
@ 2009-08-04 10:26       ` Andy Wingo
  0 siblings, 0 replies; 23+ messages in thread
From: Andy Wingo @ 2009-08-04 10:26 UTC (permalink / raw)
  To: Daniel Kraft; +Cc: Ken Raeburn, guile-devel, Neil Jerram

Hi Daniel,

On Fri 31 Jul 2009 08:09, Daniel Kraft <d@domob.eu> writes:

> Of course, anyone interested in performance can also use lexical-let
> instead of let and also get rid of all this performance problems as well
> ;)  But the same argument may hold for the threading argument, too, so
> if you want to write elisp code that is called from multi-threaded
> Scheme, just avoid dynamic binding and you'll get no problems...

Makes me think, perhaps a lexical-lambda would also be interesting.

Cheers,

Andy
-- 
http://wingolog.org/





* Re: Elisp performance
  2009-08-04 10:17   ` Andy Wingo
@ 2009-08-04 10:54     ` Daniel Kraft
  2009-08-04 15:58     ` Ken Raeburn
  1 sibling, 0 replies; 23+ messages in thread
From: Daniel Kraft @ 2009-08-04 10:54 UTC (permalink / raw)
  To: Andy Wingo; +Cc: guile-devel, Neil Jerram

Andy Wingo wrote:
>> Daniel Kraft <d@domob.eu> writes:
>>
>>> Lambda arguments are still always dynamically bound, which is quite a
>>> pity as it inhibits tail-call optimization;
> 
> Indeed, a pity. Though self-tail calls can be optimized... still,
> irritating.

Yes, but I fear this would complicate the compiler a lot and not 
gain that much -- anyone who wants tail calls can get them with the 
always-lexical feature, and existing code should not depend on 
tail calls anyway, as Emacs does not seem to perform this optimization.

>> The main thing I believe that makes a fluid different from a normal
>> variable is that a fluid can have a different value in each thread -
>> which is not relevant to Elisp.
> 
> True. Of course, it would be a nice thing to give to elisp -- I mean,
> currently in Emacs variables have one value per thread as well, it's
> just that there's only one thread. So if we give Emacs the capability to
> run multiple threads (as we should IMO), we should provide thread-local
> values.

I like the idea, too.  In that case, we should stick with the fluids.  I 
have also thought about giving the user the choice of which "dynamic 
binding" facility to use, and it would not be that much of a mess, I 
think; maybe I'll give dynamic binding without fluids -- just 
dynamic-wind -- a try, to see how much performance it really buys.

Cheers,
Daniel

-- 
Done:  Arc-Bar-Cav-Ran-Rog-Sam-Tou-Val-Wiz
To go: Hea-Kni-Mon-Pri





* Re: Elisp performance
  2009-07-31  6:02   ` Daniel Kraft
  2009-07-31  9:59     ` Ken Raeburn
@ 2009-08-04 11:00     ` Andy Wingo
  2009-08-08 22:15       ` Ludovic Courtès
  1 sibling, 1 reply; 23+ messages in thread
From: Andy Wingo @ 2009-08-04 11:00 UTC (permalink / raw)
  To: Daniel Kraft; +Cc: guile-devel, Neil Jerram

Hello Daniel,

On Fri 31 Jul 2009 08:02, Daniel Kraft <d@domob.eu> writes:

> Hi Neil,
>
> Neil Jerram wrote:
>> Daniel Kraft <d@domob.eu> writes: 
>>> Lambda arguments are still always dynamically bound, which is quite a
>>> pity as it inhibits tail-call optimization;
>>
>> This prompted me to wonder if using fluids is the best way to
>> implement dynamic binding.
>>
>> Perhaps I'm forgetting something basic, but surely it's using
>> `dynamic-wind' that gives us dynamic binding semantics, not fluids.
>
> I also thought about this yesterday; and I think you're right, just as I
> propose to do for function bindings we could also do for values. Somehow
> I had the impression in the back of my head that fluids are meant to
> provide dynamic binding as we need it and with-fluid* does some "magic"
> to do it most efficiently.
>
> However, as I now see it, the dynamic state is meant to apply to threads
> and not between different with-fluid*'s, right?  So instead of
> with-fluid* using just a dynamic-wind will be faster (or at least on
> par)...?

It will be on par. Here's the deal.

In Guile, be it in Scheme or in other languages, variables that are
`set!' need to be allocated on the heap. This is so that different
closures can see the "same" variable. It is also so that reinstated
continuations can see the result of mutations of lexically captured
variables.

Now, this imposes two penalties. The one is that because the variable is
allocated on the heap -- actually in a "variable" object, like in
`make-variable' -- lexically binding a variable that is `set!' causes a
cons. Occasionally this will cause a GC, and Guile's GC is slow, so this
causes speed penalties.

That's a penalty at binding time -- e.g. when entering a `let'
expression. There is also a penalty at access time, because referencing
or setting the value has to go through an indirection, into the heap.
This matters, but it doesn't matter as much as the cons.
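A small example of why `set!' forces heap allocation: both closures 
below must observe the same mutable location, so `n' cannot live on the 
stack.

  (define (make-counter)
    (let ((n 0))                              ; n is set!, so it is boxed
      (cons (lambda () (set! n (+ n 1)) n)    ; both closures share
            (lambda () n))))                  ; the same heap cell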

Using fluids, you never actually `set!' a fluid -- you `fluid-set!' a
fluid. Heap-wise, fluids are a bit heavier than variables, as they are
double cells instead of single cells. I think this is rather silly, and
we should probably fix this -- the number can go into the smob flags,
and the next loc (if we still need it) into the data word. 64K fluids
might be enough... otherwise we could make fluids tc7 objects. Surely
16M fluids are enough.

So, let's assume we manage to make fluids single-cell objects at some
point. (Note that a cell is two words.) Then the heap "load" of fluids
is basically the same as variables. There is the fluid vector too, so
that's 2 words per fluid, plus 1 word per fluid per thread -- but the
vector is easier on the GC.

Indirection-wise, since we can cache actual fluid values in Elisp
functions instead of the variable objects that contain them -- because
the variables that contain the fluids are never set!, it's just the
fluids themselves that are fluid-set! -- *the access cost of fluids is
the same*. It might even be better, due to memory locality of the fluid
vector.

So fluids are not the problem, given sufficient support from the VM.

                              * * *

Dynamic-wind, on the other hand, does pose a problem. The problem is
that dynamic-wind is a primitive -- to call it, the VM saves its
registers, conses together its arguments, and passes them to the
evaluator. The evaluator destructures the consed list, eventually finds
the C function, and calls scm_dynamic_wind, which conses on the guards
to the current wind list, recursively calls the VM for the enter, body,
and exit thunks, and then resets the wind list.

All this entering and leaving the VM is expensive and unnecessary. It
would be better to add VM ops for entering and leaving dynamic contexts,
but you still have to cons the handlers onto the wind list -- at least
with Guile's current dynwind implementation.

But... I have been meaning for a long time to implement dynwind and
`catch' in the VM so we wouldn't need these recursive VM invocations.
(These days, dynwind is the largest reason for recursive VM invocation,
besides map and for-each which are still implemented as primitives.)

Does that give you a better insight to the performance implications?

> If so, I think we should get rid of the fluids and switch to directly
> using dynamic-wind.  The point about future multi-threading might be
> interesting, though...  If we did switch elisp to multi-threading at
> some point, what would become of dynamic binding?  I guess we can't do
> this across threads, so each dynamically bound variable would also be
> thread local?  I think this is the most reasonable concept, trying to
> make dynamic binding work across threads looks really messy to me. Then,
> the fluid concept will be just what we need again;  but we probably also
> want to find a way for shared global variables -- which has time until
> then, of course ;)

Simply disabling multithreading so you wouldn't have to use fluids would
not be sufficient -- as I showed above, the cost of fluids versus `set!'
values is the same. You could get a win via disallowing captured
continuations, which would allow you to use `with-throw-handler' instead
of `dynamic-wind', but the performance impact of the two is currently
the same, basically.

> Another thing:  If we use dynamic-wind directly instead of the current
> fluid system, we could get rid of void as special value and instead just
> unbind the symbol from our module in case of void; then, on access Guile
> would complain itself and we wouldn't need the special checks for void.
> With dynamic-wind, we could ensure that variables are re-bound or
> unbound to their previous state when leaving the dynamic scope.  This
> rebinding would, however, be more expensive as we would have to care for
> a lot more special cases.  With regards to void, I like the current
> implementation with the possibility to disable the access check if not
> needed and then get full speed.  So I'm not sure if I would change void
> handling at all even if we switched away from fluids and it became
> possible.

The VM handles unbound values in a few places. See the code for
local-ref, local-boxed-ref, and closure-boxed-ref, for example. It uses
SCM_UNDEFINED, which answers to SCM_UNBNDP. (An irritating lack of
symmetry, I know. There is also SCM_UNBOUND, I think... grrr....)

>> (Note that `with-fluids', which is probably the closest thing in Guile
>> Scheme to a dynamic let, is a combination of `dynamic-wind' and
>> `fluid-set!'.)
>
> Yes, I agree; and with-fluids is quite nice to use, also.  But as the
> compiler handles dynamic binding in our case, I also don't care about
> explicitly setting/reverting the variables with dynamic-wind.  If
> with-fluids uses dynamic-wind internally, we can only win on performance
> and all we lose seems to be the thread-locality for the future.  But
> this may still be a point, I'm not sure...

We should be using VM ops at some point, I would think... at least for
looking up and caching the fluid values, and for reffing and setting
dynamically bound vars.

>> So what's my point?  I'm not sure, just musing.  As far as performance
>> and tail-call optimization are concerned, I would guess that the main
>> thing that needs to be addressed in order to reinstate tail-call
>> optimization would be the dynamic wind - i.e. the compiler knowing
>> that it isn't necessary to set up another dynamic-wind frame, it can
>> just jump with the current variables.
>
> Hm, I'm not sure if I got your point correctly, but I think that dynamic
> binding and tail-call optimization are difficult to combine in general,
> no matter if the dynamic binding is done with fluids or dynamic-wind.

Yes, I agree. The unbinding is a continuation that is not from the
parent call, whereas a tail call from a procedure without dynamic
bindings has no such continuation.

Regards,

Andy
-- 
http://wingolog.org/





* Re: Elisp performance
  2009-07-31 15:14       ` Daniel Kraft
@ 2009-08-04 11:14         ` Andy Wingo
  0 siblings, 0 replies; 23+ messages in thread
From: Andy Wingo @ 2009-08-04 11:14 UTC (permalink / raw)
  To: Daniel Kraft; +Cc: Ken Raeburn, guile-devel, Neil Jerram

Hi Daniel,

On Fri 31 Jul 2009 17:14, Daniel Kraft <d@domob.eu> writes:

> Hi Ken,
>
> Ken Raeburn wrote:
>> On Jul 31, 2009, at 02:02, Daniel Kraft wrote:
>>>>> Iterative prime sieve, (length (find-primes-to 5000)):
>>>>>  Scheme: 0.42s
>>>>>  Elisp, no void checks, lexical let: 3.40s
>>>>>  Elisp, no void checks, dynamic let: 4.43s
>>>>>  Elisp, void checks, dynamic let: 5.12s
>>>>>  Elisp, void checks, lexical let: 4.06s
>>>> As Ken says, it would be good to see an Emacs timing too.
>>>
>>> I'd like to provide one, but have no idea how to do so...  I just
>>> managed to find out how to evaluate elisp code from a buffer, but it
>>> would be cool to get some elisp REPL if possible (maybe even without
>>> starting emacs itself and just from a shell?) -- and how do I time?
>>> As well as byte-compile and time then?
>>
> [...]
>
> Thanks for the emacs hints!  For the prime sieve, emacs itself takes
> about 1.89s (interpreted) and 0.32s (compiled).  So it's significantly
> faster than my elisp version at the moment, the reason might be the
> performance hit from the primitive calls already discussed (but I don't
> know).  And the compiled version is even faster than Guile runs the
> Scheme code.

Very interesting results! We'll work on the elisp, as noted in other
mails. The Scheme vs Emacs results are somewhat surprising, but that's
probably due to using set! instead of functional loops. Also, Guile's
compiler still needlessly allocates closures for loops. Perhaps I should
fix that this week.

Andy
-- 
http://wingolog.org/





* Re: Elisp performance
  2009-07-29 12:50 Elisp performance Daniel Kraft
  2009-07-30  3:23 ` Ken Raeburn
  2009-07-30 20:18 ` Neil Jerram
@ 2009-08-04 15:47 ` Andy Wingo
  2009-08-04 16:12   ` Ken Raeburn
  2009-08-04 16:17   ` Daniel Kraft
  2 siblings, 2 replies; 23+ messages in thread
From: Andy Wingo @ 2009-08-04 15:47 UTC (permalink / raw)
  To: Daniel Kraft; +Cc: guile-devel

Hi Daniel,

On Wed 29 Jul 2009 14:50, Daniel Kraft <d@domob.eu> writes:

> For the timings, I used a simple prime sieve implemented imperatively
> with while and set's, because the recursive one would put elisp without
> tail-calls at a disadvantage (and because lexical binding does not yet
> work for lambda arguments); the code is attached.  Both Scheme and Elisp
> version are identical with just syntax changed (set! -> setq, define -> 
> defun, begin -> progn).  Additionally, for lexical let all let's were
> replaced by lexical-let's in the elisp version.

After some thought on it, your approach sounds good. If you are
benchmarking "how fast does X implement the prime sieve", it would be
better to compare idiomatic code -- for Scheme, that would be a
functional solution to the problem. But if your goal is "how fast is
idiom X in language Y compared to language Z", your approach makes sense
to me.

> Iterative prime sieve, (length (find-primes-to 5000)):
>   Scheme: 0.42s
>   Elisp, no void checks, lexical let: 3.40s
>   Elisp, no void checks, dynamic let: 4.43s
>   Elisp, void checks, dynamic let: 5.12s
>   Elisp, void checks, lexical let: 4.06s
>
> So it is apparent that both disabling void checks and using lexical let's
> instead of the standard dynamic ones improve performance notably.
> However, we're quite out of reach of the Scheme version.
>
> My conjecture is that

Don't conjecture, profile ;-)

Also, disassembly is your friend here.

> Obviously, it would help a lot to do so.  On the other hand, switching
> to primitive-ref's would help even more, but I fear we can not easily do
> so, because we can not know if a symbol targets a primitive or was
> rebound at compile time...  BTW, a quick test with Scheme:
>
> scheme@(guile-user)> (set! + (lambda (a b) 42))
> scheme@(guile-user)> +
> #<program b700a790 at <unknown port>:0:8 (a b)>
> scheme@(guile-user)> (+ 1 2)
> 3
> scheme@(guile-user)> (let ((+ (lambda (a b) 42)))
> ... (+ 1 2))
> 42

This looks to me like a bug in the compiler. I guess we should open-code
primitives based on value and not variable identity.

> In any case, because of dynamic scoping and the expected behaviour of
> flet to change possibly primitives during its extent, I think we can't
> do anything like that for Elisp (except providing guile-primitive for
> hand-optimizing such calls).

Hmmmmmm. It seems that Emacs does inline, but it also reacts to flet.

ELISP> (defun add (x y) (+ x y))
add
ELISP> (compile-defun 'add)
nil
ELISP> (disassemble #'add)
byte code for add:
  args: (x y)
0       varref    x
1       varref    y
2       plus      
3       return    
ELISP> (require 'cl)
cl
ELISP> (flet ((+ (x y) (- x y))) (add 10 20))
-10

How does this work? Ken do you know?

Regards,

Andy
-- 
http://wingolog.org/





* Re: Elisp performance
  2009-07-30  3:23 ` Ken Raeburn
  2009-07-31  5:15   ` Daniel Kraft
@ 2009-08-04 15:51   ` Andy Wingo
  1 sibling, 0 replies; 23+ messages in thread
From: Andy Wingo @ 2009-08-04 15:51 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: Daniel Kraft, guile-devel

Hi Ken,

As a voice from the the sidelines, I just want to say thanks for all
your input!

On Thu 30 Jul 2009 05:23, Ken Raeburn <raeburn@raeburn.org> writes:

> Would [open-coding primitives] interfere with possibly blending Scheme
> GOOPS code with Elisp someday? Or is the generic support there at a
> lower level than this? (E.g., a "marker" object holds a buffer handle,
> possibly nil, and an offset that automatically gets adjusted if text
> is inserted before it. You can use "(+ 1 marker)" and get back an
> integer one greater than the marker's current offset. If markers were
> implemented using GOOPS, would this use of "+" work, given the changes
> you're suggesting?)

This works fine already. If the arguments to the "add" opcode are not
fixnums, it dispatches to scm_add(), which has a few more numeric cases;
then if none of those cases match, it dispatches to the generic function
attached to +. Initially there is no such function, but if there is one,
it can be handled there. In this way javascript can specialize on
strings, so (+ "hello " "world") => "hello world".
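A small Scheme sketch of the dispatch chain described above (the string
method is an illustration, not something Guile defines by default):

```scheme
;; When GOOPS is loaded, `+' has a generic function behind it: the VM's
;; "add" opcode handles fixnums, scm_add() handles the other numeric
;; cases, and anything else dispatches to methods like this one.
(use-modules (oop goops))

(define-method (+ (a <string>) (b <string>))
  (string-append a b))

(+ 1 2)                  ; fast path, never reaches GOOPS
(+ "hello " "world")     ; dispatches to the method above
```

A marker type, as in Ken's example, would just be another class with a
`+` method, so it slots into the same fallback chain.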

Regards,

Andy
-- 
http://wingolog.org/




^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Elisp performance
  2009-08-04 10:17   ` Andy Wingo
  2009-08-04 10:54     ` Daniel Kraft
@ 2009-08-04 15:58     ` Ken Raeburn
  1 sibling, 0 replies; 23+ messages in thread
From: Ken Raeburn @ 2009-08-04 15:58 UTC (permalink / raw)
  To: Andy Wingo; +Cc: Daniel Kraft, guile-devel, Neil Jerram

On Aug 4, 2009, at 06:17, Andy Wingo wrote:
> Hello!
>
> (Was away for the weekend, but back hacking all week now.)

Welcome back!

> On Thu 30 Jul 2009 22:18, Neil Jerram <neil@ossau.uklinux.net> writes:
> Daniel Kraft <d@domob.eu> writes:
>>
>>> Lambda arguments are still always dynamically bound, which is  
>>> quite a
>>> pity as it inhibits tail-call optimization;
> Indeed, a pity. Though self-tail calls can be optimized... still,
> irritating.

In the compiler, or in the byte-code engine?

The Emacs byte compiler does not do self-tail call optimizations.   
Since the function value of a symbol can be copied out and/or replaced  
(and Emacs ships with the "advice" package which does both), the  
symbol's function value may wind up not being the function body being  
compiled; turning a self-tail call into a jump back to the beginning  
would cause observable behavior changes from what happens now.
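The observable difference is easy to demonstrate, because every named
self-call goes through the symbol's function cell (function and argument
names here are made up for illustration):

```elisp
;; A "recursive" call is really a lookup of `countdown's function cell.
(defun countdown (n)
  (if (> n 0) (countdown (1- n)) 'done))

(countdown 3)                        ; => done

;; Replace the function cell, as `advice' effectively does:
(fset 'countdown (lambda (n) 'replaced))

(countdown 3)                        ; => replaced
;; Had the compiler turned the self-call into a jump to the start of
;; the old body, the old definition would keep calling itself and the
;; replacement would never be reached.
```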

On the other hand, I haven't looked at what elisp compiles to in  
Guile; maybe it's possible to have the byte-code engine detect when  
you're making a tail call to yourself (such as recursive calls when  
the function value slot hasn't been messed with) and optimize them on  
the fly?

Ken




^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Elisp performance
  2009-08-04 15:47 ` Andy Wingo
@ 2009-08-04 16:12   ` Ken Raeburn
  2009-08-04 19:28     ` Andy Wingo
  2009-08-04 16:17   ` Daniel Kraft
  1 sibling, 1 reply; 23+ messages in thread
From: Ken Raeburn @ 2009-08-04 16:12 UTC (permalink / raw)
  To: Andy Wingo; +Cc: Daniel Kraft, guile-devel

On Aug 4, 2009, at 11:47, Andy Wingo wrote:
>> In any case, because of dynamic scoping and the expected behaviour of
>> flet to possibly change primitives during its extent, I think we  
>> can't
>> do anything like that for Elisp (except providing guile-primitive for
>> hand-optimizing such calls).
>
> Hmmmmmm. It seems that Emacs does inline, but it also reacts to flet.

Which Emacs are you using here?  I'm using one of the pretest versions  
of GNU Emacs 23, and also tried 22.1, and both gave me different  
results from yours.

> ELISP> (defun add (x y) (+ x y))
> add
> ELISP> (compile-defun 'add)
> nil

Documentation says:
---
compile-defun is an interactive compiled Lisp function in
`bytecomp.el'.

(compile-defun &optional arg)

Compile and evaluate the current top-level form.
Print the result in the echo area.
With argument arg, insert value in current buffer after the form.
---

So, the argument isn't a symbol to byte-compile the function  
definition of; it's just a flag.
I used (byte-compile 'add).
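In other words, the session that actually exercises the byte code would
look like this (the disassembly shown is what the thread reports):

```elisp
(defun add (x y) (+ x y))

;; `compile-defun' compiles the form at point in a buffer; to compile a
;; named function's definition in place, use `byte-compile':
(byte-compile 'add)      ; replaces add's function cell with byte code
(disassemble 'add)       ; now disassembles the compiled function
```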

> ELISP> (disassemble #'add)
> byte code for add:
>  args: (x y)
> 0       varref    x
> 1       varref    y
> 2       plus
> 3       return

I got the same disassembly.

> ELISP> (require 'cl)
> cl
> ELISP> (flet ((+ (x y) (- x y))) (add 10 20))
> -10

It prints 30 for me.

> How does this work? Ken do you know?

My guess would be that the bytecode interpreter in your version is  
hardcoded to look up the function value of "+" and call it.  In Emacs  
23 (and apparently 22.1), it calls the C function Fplus (which is the  
default function binding for the symbol "+"); it doesn't check to see  
if "+" has been redefined.

Ken




^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Elisp performance
  2009-08-04 15:47 ` Andy Wingo
  2009-08-04 16:12   ` Ken Raeburn
@ 2009-08-04 16:17   ` Daniel Kraft
  2009-08-04 19:25     ` Andy Wingo
  1 sibling, 1 reply; 23+ messages in thread
From: Daniel Kraft @ 2009-08-04 16:17 UTC (permalink / raw)
  To: Andy Wingo; +Cc: guile-devel

Andy Wingo wrote:
> Don't conjecture, profile ;-)

I'd like to do so, but it seems that Guile does not include a profiler, 
does it?  A search turned up some patch/code, but I didn't get the 
impression it would work very well -- do you have some hints?  Getting 
some real profiling data would be very useful, indeed!

> Also, disassembly is your friend here.

Guess I'll have to get a little more proficient with the VM to make real 
use of it, though I already used it a little to check that certain elisp 
code compiles to the same assembly as the equivalent Scheme one (where I 
expected it to be so).

Yours,
Daniel

-- 
Done:  Arc-Bar-Cav-Ran-Rog-Sam-Tou-Val-Wiz
To go: Hea-Kni-Mon-Pri




^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Elisp performance
  2009-08-04 16:17   ` Daniel Kraft
@ 2009-08-04 19:25     ` Andy Wingo
  0 siblings, 0 replies; 23+ messages in thread
From: Andy Wingo @ 2009-08-04 19:25 UTC (permalink / raw)
  To: Daniel Kraft; +Cc: guile-devel

On Tue 04 Aug 2009 18:17, Daniel Kraft <d@domob.eu> writes:

> Andy Wingo wrote:
>> Don't conjecture, profile ;-)
>
> I'd like to do so, but it seems that Guile does not include a profiler,
> does it?  A search turned up some patch/code, but I didn't get the
> impression it would work very well -- do you have some hints, getting
> some real profiling would be very useful, indeed!

If you download guile-lib from git, (statprof) works well.
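A typical session with guile-lib's statprof looks roughly like this
(procedure names and the `statprof-reset' arguments follow guile-lib's
API of the time; they may differ in other versions, and
`find-primes-to' is the benchmark from earlier in this thread):

```scheme
(use-modules (statprof))

(statprof-reset 0 50000 #f)   ; sample-seconds, sample-usecs, count-calls?
(statprof-start)
(length (find-primes-to 5000))
(statprof-stop)
(statprof-display)            ; prints a flat profile to the current port
```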


>> Also, disassembly is your friend here.
>
> Guess I'll have to get a little more proficient with the VM to make real
> use of it, though I already used it a little to check that certain elisp
> code compiles to the same assembly as the equivalent Scheme one (where I
> expected it to be so).

It's not that bad :) But I'll be looking into your branch tomorrow, so
we can poke it together.

Peace,

Andy
-- 
http://wingolog.org/




^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Elisp performance
  2009-08-04 16:12   ` Ken Raeburn
@ 2009-08-04 19:28     ` Andy Wingo
  0 siblings, 0 replies; 23+ messages in thread
From: Andy Wingo @ 2009-08-04 19:28 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: Daniel Kraft, guile-devel

On Tue 04 Aug 2009 18:12, Ken Raeburn <raeburn@raeburn.org> writes:

> On Aug 4, 2009, at 11:47, Andy Wingo wrote:
>>> In any case, because of dynamic scoping and the expected behaviour of
>>> flet to possibly change primitives during its extent, I think we
>>> can't
>>> do anything like that for Elisp (except providing guile-primitive for
>>> hand-optimizing such calls).
>>
>> Hmmmmmm. It seems that Emacs does inline, but it also reacts to flet.
>
> Which Emacs are you using here?  I'm using one of the pretest versions
> of GNU Emacs 23, and also tried 22.1, and both gave me different
> results from yours.

I'm using GNU Emacs 23.0.92.1 (i686-pc-linux-gnu, GTK+ Version 2.16.0)
of 2009-04-04

Something from git anyway.

>> ELISP> (defun add (x y) (+ x y))
>> add
>> ELISP> (compile-defun 'add)
>> nil
>
> Documentation says:
> ---
> compile-defun is an interactive compiled Lisp function in
> `bytecomp.el'.
>
> (compile-defun &optional arg)
>
> Compile and evaluate the current top-level form.
> Print the result in the echo area.
> With argument arg, insert value in current buffer after the form.
> ---
>
> So, the argument isn't a symbol to byte-compile the function definition
> of; it's just a flag.
> I used (byte-compile 'add).

Ahhhhh. Indeed now it prints 30 for me.

>> ELISP> (disassemble #'add)
>> byte code for add:
>>  args: (x y)
>> 0       varref    x
>> 1       varref    y
>> 2       plus
>> 3       return
>
> I got the same disassembly.

It appears "disassemble" actually compiled the function first, without
altering the defun.

>> How does this work? Ken do you know?
>
> My guess would be that the bytecode interpreter in your version is
> hardcoded to look up the function value of "+" and call it.  In Emacs
> 23 (and apparently 22.1), it calls the C function Fplus (which is the
> default function binding for the symbol "+"); it doesn't check to see
> if "+" has been redefined.

Yes I saw this, but was led astray via compile-defun ;)

Thanks, this shows that we can open-code primitives without worrying
about flet.

Cheers,

Andy
-- 
http://wingolog.org/




^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Elisp performance
  2009-08-04 11:00     ` Andy Wingo
@ 2009-08-08 22:15       ` Ludovic Courtès
  0 siblings, 0 replies; 23+ messages in thread
From: Ludovic Courtès @ 2009-08-08 22:15 UTC (permalink / raw)
  To: guile-devel

Hi!

Andy Wingo <wingo@pobox.com> writes:

> Dynamic-wind, on the other hand, does pose a problem. The problem is
> that dynamic-wind is a primitive -- to call it, the VM saves its
> registers, conses together its arguments, and passes them to the
> evaluator. The evaluator destructures the consed list, eventually finds
> the C function
[...]

IMO a pressing issue here is to avoid consing when invoking "rest-less"
subrs (à la `scm_i_gsubr_apply ()' [0]).

(FWIW this has been on my to-do list for some time but feel free to give
it a try!)

Thanks,
Ludo'.

[0] http://thread.gmane.org/gmane.lisp.guile.devel/8244





^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2009-08-08 22:15 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-07-29 12:50 Elisp performance Daniel Kraft
2009-07-30  3:23 ` Ken Raeburn
2009-07-31  5:15   ` Daniel Kraft
2009-08-04 15:51   ` Andy Wingo
2009-07-30 20:18 ` Neil Jerram
2009-07-30 23:54   ` Ken Raeburn
2009-07-31  6:09     ` Daniel Kraft
2009-08-04 10:26       ` Andy Wingo
2009-08-04 10:26     ` Andy Wingo
2009-07-31  6:02   ` Daniel Kraft
2009-07-31  9:59     ` Ken Raeburn
2009-07-31 15:14       ` Daniel Kraft
2009-08-04 11:14         ` Andy Wingo
2009-08-04 11:00     ` Andy Wingo
2009-08-08 22:15       ` Ludovic Courtès
2009-08-04 10:17   ` Andy Wingo
2009-08-04 10:54     ` Daniel Kraft
2009-08-04 15:58     ` Ken Raeburn
2009-08-04 15:47 ` Andy Wingo
2009-08-04 16:12   ` Ken Raeburn
2009-08-04 19:28     ` Andy Wingo
2009-08-04 16:17   ` Daniel Kraft
2009-08-04 19:25     ` Andy Wingo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).