* Loop optimization
@ 2011-03-07 21:01 Michael Ellis
2011-03-07 21:36 ` Andy Wingo
0 siblings, 1 reply; 6+ messages in thread
From: Michael Ellis @ 2011-03-07 21:01 UTC (permalink / raw)
To: Guile bug
I got a curious result while doing some really brain-dead performance
comparisons between guile and python. All times are
post-compilation.
------ guile ----------------
(let loop ((i 0) (j 1e7))
(set! i (+ i 1))
(if (< j i)
(begin (display i) (newline))
(loop i j)))
real 0m1.182s
user 0m1.150s
sys 0m0.015s
------ python ------------
i=0
j=int(1e7)
while j > i:
i += 1
print i
real 0m1.943s
user 0m1.915s
sys 0m0.015s
-----------------------------
So, congrats! Guile 2.0 is noticeably faster here (I can make python
win by using "for i in xrange()" but that's not the point of this
post.) What surprised me was that using a constant in the "if" test
slows guile down by a factor 8.
(let loop ((i 0))
(set! i (+ i 1))
(if (< 1e7 i)
(begin (display i) (newline))
(loop i)))
real 0m8.976s
user 0m8.789s
sys 0m0.035s
Is this an expected outcome? I was naively supposing that compilation
would be more effective at optimizing when the limit was a constant
instead of a variable.
I'm running OS X 10.6.6 on a 2.4GHz Intel Core Duo MacBook.
Cheers,
Mike
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Loop optimization
2011-03-07 21:01 Loop optimization Michael Ellis
@ 2011-03-07 21:36 ` Andy Wingo
2011-03-07 23:10 ` Michael Ellis
0 siblings, 1 reply; 6+ messages in thread
From: Andy Wingo @ 2011-03-07 21:36 UTC (permalink / raw)
To: Michael Ellis; +Cc: Guile bug
On Mon 07 Mar 2011 22:01, Michael Ellis <michael.f.ellis@gmail.com> writes:
> (let loop ((i 0) (j 1e7))
> (set! i (+ i 1))
> (if (< j i)
> (begin (display i) (newline))
> (loop i j)))
Note that if you avoid a set!, this gets faster:
(let loop ((i 0) (j 1e7))
(if (<= j i)
(begin (display i) (newline))
(loop (1+ i) j)))
On my machine this goes down from 1.5 seconds to 0.91 seconds. See
"Variables and the VM" in the manual, for more.
You get better if you actually use an int as the boundary condition:
(let loop ((i 0) (j 10000000))
(if (<= j i)
(begin (display i) (newline))
(loop (1+ i) j)))
0.62 seconds here.
In fact that is the deal:
> (let loop ((i 0))
> (set! i (+ i 1))
> (if (< 1e7 i)
> (begin (display i) (newline))
> (loop i)))
Here we are loading up a float (which conses) at every iteration. Oddly
enough this would work better in a procedure, because procedures have
constant tables (again, "Variables and the VM").
So:
(define (proc)
(let loop ((i 0))
(set! i (+ i 1))
(if (< 1e7 i)
(begin (display i) (newline))
(loop i))))
(proc)
1.19 seconds here.
But the real thing is again to avoid the float<->int compare. By way of
comparison:
(let loop ((i 0))
(if (<= 10000000 i)
(begin (display i) (newline))
(loop (1+ i))))
0.56 seconds here.
Regards,
Andy
--
http://wingolog.org/
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Loop optimization
2011-03-07 21:36 ` Andy Wingo
@ 2011-03-07 23:10 ` Michael Ellis
2011-03-08 0:28 ` Mark H Weaver
2011-03-08 21:43 ` Andy Wingo
0 siblings, 2 replies; 6+ messages in thread
From: Michael Ellis @ 2011-03-07 23:10 UTC (permalink / raw)
To: Andy Wingo; +Cc: Guile bug
Thanks for the education, Andy, and congrats again on the guile
compiler implementation. Your version handily outperforms the fastest
python implementation I know.
for i in xrange (10000000):
pass
print i
real 0m0.641s
user 0m0.625s
sys 0m0.012s
(let loop ((i 0))
(if (<= 10000000 i)
(begin (display i) (newline))
(loop (1+ i))))
real 0m0.495s
user 0m0.465s
sys 0m0.010s
Cheers,
Mike
On Mon, Mar 7, 2011 at 4:36 PM, Andy Wingo <wingo@pobox.com> wrote:
> (let loop ((i 0))
> (if (<= 10000000 i)
> (begin (display i) (newline))
> (loop (1+ i))))
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Loop optimization
2011-03-07 23:10 ` Michael Ellis
@ 2011-03-08 0:28 ` Mark H Weaver
2011-03-08 0:30 ` Mark H Weaver
2011-03-08 21:43 ` Andy Wingo
1 sibling, 1 reply; 6+ messages in thread
From: Mark H Weaver @ 2011-03-08 0:28 UTC (permalink / raw)
To: Michael Ellis; +Cc: Guile bug
Michael Ellis <michael.f.ellis@gmail.com> writes:
> Thanks for the education, Andy, and congrats again on the guile
> compiler implementation. Your version handily outperforms the fastest
> python implementation I know.
That's especially impressive given that the two pieces of code below are
markedly different: The guile version formats 10 million integers and
outputs them, whereas the python version does nothing but count and
output the final value.
I'd be very curious to see the results of a _fair_ comparison.
Best,
Mark
>
> for i in xrange (10000000):
> pass
> print i
>
> real 0m0.641s
> user 0m0.625s
> sys 0m0.012s
>
>
> (let loop ((i 0))
> (if (<= 10000000 i)
> (begin (display i) (newline))
> (loop (1+ i))))
>
>
> real 0m0.495s
> user 0m0.465s
> sys 0m0.010s
>
> Cheers,
> Mike
>
>
>
> On Mon, Mar 7, 2011 at 4:36 PM, Andy Wingo <wingo@pobox.com> wrote:
>> (let loop ((i 0))
>> (if (<= 10000000 i)
>> (begin (display i) (newline))
>> (loop (1+ i))))
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Loop optimization
2011-03-08 0:28 ` Mark H Weaver
@ 2011-03-08 0:30 ` Mark H Weaver
0 siblings, 0 replies; 6+ messages in thread
From: Mark H Weaver @ 2011-03-08 0:30 UTC (permalink / raw)
To: Michael Ellis; +Cc: Guile bug
I wrote:
> That's especially impressive given that the two pieces of code below are
> markedly different: The guile version formats 10 million integers and
> outputs them, whereas the python version does nothing but count and
> output the final value.
Sorry, dumb mistake on my part. Apologies for the wasted bandwidth.
Mark
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Loop optimization
2011-03-07 23:10 ` Michael Ellis
2011-03-08 0:28 ` Mark H Weaver
@ 2011-03-08 21:43 ` Andy Wingo
1 sibling, 0 replies; 6+ messages in thread
From: Andy Wingo @ 2011-03-08 21:43 UTC (permalink / raw)
To: Michael Ellis; +Cc: Guile bug
On Tue 08 Mar 2011 00:10, Michael Ellis <michael.f.ellis@gmail.com> writes:
> Thanks for the education, Andy, and congrats again on the guile
> compiler implementation. Your version handily outperforms the fastest
> python implementation I know.
Thanks, but there is still a ways to go. If guile can count to 10
million in half a second, that means it's spending 100+ cycles per loop,
when it could be spending 5 or something. I figure we have a good
factor of 4 to squeeze out easily with native compilation, and more with
flow analysis and untagging and such.
Regards,
Andy
--
http://wingolog.org/
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2011-03-08 21:43 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-03-07 21:01 Loop optimization Michael Ellis
2011-03-07 21:36 ` Andy Wingo
2011-03-07 23:10 ` Michael Ellis
2011-03-08 0:28 ` Mark H Weaver
2011-03-08 0:30 ` Mark H Weaver
2011-03-08 21:43 ` Andy Wingo
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).