guile performance - Ackermann function: way slower than emacs, slower still if compiled

unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed

From: Ken Raeburn <raeburn@raeburn.org>
To: guile-devel <guile-devel@gnu.org>
Subject: guile performance - Ackermann function: way slower than emacs, slower still if compiled
Date: Tue, 4 Aug 2009 05:28:33 -0400	[thread overview]
Message-ID: <CC607382-6167-422E-AD08-7BC3E6E55114@raeburn.org> (raw)

[-- Attachment #1: Type: text/plain, Size: 5656 bytes --]

I implemented Ackermann's function A(m,n), a recursive function with  
scary performance characteristics, based on the definition given at  
wikipedia involving lots of +1 and -1 operations...  A(0,n) takes  
constant time to compute assuming bounded integer sizes, A(1,n) takes  
linear time, A(2,n) is quadratic, A(3,n) is exponential... A(4,n)  
takes a really long time if n=1, and for anything higher you can just  
give up now.  (A(4,2) is in the neighborhood of 2**65536, and A(4,3)  
is about 2 raised to that power.  And it gets there by adding or  
subtracting 1 in each invocation.)  Optimizations are possible, like  
A(1,n)=n+2, A(2,n)=2n+3, A(3,n)=2**(n+3)-3, but I didn't encode those  
optimizations.

I wrote both elisp and scheme versions; I changed the recursion to an  
explicitly-managed stack of values, to avoid some of the dynamic-vs- 
lexical argument binding issues, turn the native stack usage into  
large numbers of cell allocations and discards, and avoid exceeding  
the maximum Lisp evaluation depth Emacs permits.  Not surprisingly,  
it's not too fast, and Emacs spends a lot of time in garbage  
collection passes even with the byte-compiled version.  My time-test  
function here counts seconds of real time (on an 8-core Mac without a  
lot of other stuff going on, so CPU usage is probably very close):

ELISP> (time-test '(ackermann 3 1))
1.6e-05
ELISP> (time-test '(ackermann 3 6))
0.036018
ELISP> (time-test '(ackermann 3 7))
0.16574899999999998
ELISP> (time-test '(ackermann 3 8))
0.6457170000000001
ELISP> (time-test '(ackermann 3 9))
2.482042
ELISP> (time-test '(ackermann 4 0))
1.4e-05
ELISP> (time-test '(ackermann 4 1))
642.746514
ELISP>

With the installed guile-1.8.7 binary and the interpreted Scheme  
version of the code, performance was worse:

% time guile -l ack.scm -c '(ackermann 3 1)'
0.009u 0.003s 0:00.01 0.0%	0+0k 0+0io 0pf+0w
% time guile -l ack.scm -c '(ackermann 3 6)'
0.409u 0.078s 0:00.52 90.3%	0+0k 0+0io 0pf+0w
% time guile -l ack.scm -c '(ackermann 3 9)'
25.300u 4.540s 0:29.85 99.9%	0+0k 0+0io 0pf+0w
% time guile -l ack.scm -c '(ackermann 4 1)'
6016.150u 1122.109s 1:58:59.32 99.9%	0+0k 8+4io 149pf+0w
%

Yes, that's just shy of two hours for the last one, where Emacs took  
less than 11 minutes.

(Did I make some silly mistake in translating, that causes the Scheme  
version to be far less efficient?)

It's not a terribly fair comparison, for various reasons, but moving  
on...

I built and installed 1.9.1, so I could test with more recent code:

% time guile -l ack.scm -c '(ackermann 3 1)'
0.026u 0.006s 0:00.03 66.6%	0+0k 0+0io 0pf+0w
% time guile -l ack.scm -c '(ackermann 3 6)'
0.761u 0.141s 0:00.90 100.0%	0+0k 0+0io 0pf+0w
% time guile -l ack.scm -c '(ackermann 3 9)'
47.242u 8.644s 0:55.89 99.9%	0+0k 0+0io 0pf+0w
%

Since this behaved worse than 1.8.6, I skipped the 4,1 test.  I tested  
the compiled version:

% time guile -c '(begin (load-compiled "ack.go") (ackermann 3 1))'
0.012u 0.004s 0:00.01 100.0%	0+0k 0+0io 0pf+0w
% time guile -c '(begin (load-compiled "ack.go") (ackermann 3 6))'
0.772u 0.333s 0:01.10 100.0%	0+0k 0+0io 0pf+0w
% time guile -c '(begin (load-compiled "ack.go") (ackermann 3 9))'
48.728u 22.016s 1:10.75 99.9%	0+0k 0+0io 0pf+0w
%

Not much better than loading the .scm file, and only better for small  
values of n... in fact, with m=3 and n>=6 the performance seems worse  
in the precompiled case:

% time guile -l ack.scm -c '(ackermann 3 10)'
181.997u 34.928s 3:36.94 99.9%	0+0k 0+0io 0pf+0w
% time guile -c '(begin (load-compiled "ack.go") (ackermann 3 10))'
196.678u 85.618s 4:42.34 99.9%	0+0k 0+0io 0pf+0w
%

Poking at it a couple of times with Apple's process sampling tool, it  
does seem to be spending a bit of time in garbage collection -- but  
scm_leave_guile (called from scm_async_click via  
scm_pthread_mutex_lock) and getrusage (called from  
scm_c_get_internal_run_time via mytime) also seem to show up on the  
stack a significant fraction of the time, and each of those appears to  
be a comparable if not bigger time sink.

According to the execution timing with the compiled code, GC actually  
doesn't seem to be a huge fraction of the run time, maybe 8-9%:

scheme@(guile-user)> ,t (ackermann 3 8)
2045
clock utime stime cutime cstime gctime
18.43 12.96  5.47   0.00   0.00   1.66
scheme@(guile-user)> ,t (ackermann 3 9)
4093
clock utime stime cutime cstime gctime
73.47 51.61 21.85   0.00   0.00   5.95
scheme@(guile-user)>

The latter is the case that took under three seconds in Emacs, remember.

Garbage collection is much more of a factor in the non-compiled version:

scheme@(guile-user)> (load "ack.scm")
scheme@(guile-user)> ,t (ackermann 3 8)
2045
clock utime stime cutime cstime gctime
21.97 19.74  2.21   0.00   0.00   7.76
scheme@(guile-user)> ,t (ackermann 3 9)
4093
clock utime stime cutime cstime gctime
86.26 77.37  8.88   0.00   0.00  30.20
scheme@(guile-user)>

It also looks like execution time measurements may add more to the non- 
compiled execution time, as with the measurements the compiled code  
was still faster at 3,9.

In retrospect, this is much heavier on the interpretation or execution  
of some of the simple operations than I intended, and Emacs benefits a  
lot from opcodes like "add1", and I'm more interested specifically in  
exercising allocation and GC at the moment, so it's not that useful a  
test for me after all.  But I thought I'd send this along anyways in  
case someone was interested.  In particular the fact that the  
performance of the precompiled version gets worse as n grows might be  
something to investigate, unless the cause is already known....

Ken

[-- Attachment #2: ack.scm --]
[-- Type: application/octet-stream, Size: 889 bytes --]

(define (ackermann m n)
  (let ((stack (list m)))
    (while (not (null? stack))
      ;; Compute A(top-of-stack, n)
      (set! m (car stack))
      (set! stack (cdr stack))
      (while (not (= m 0))
	(cond ((and (= n 0)
		    (> m 0))
;;;	   (ackermann (- m 1) 1)
	       (set! m (- m 1))
	       (set! n 1))
	      (#t
;;;	   (ackermann (- m 1) (ackermann m (- n 1)))
	       ;; Push outer parameter onto the stack and
	       ;; compute the inner function.
	       (set! stack (cons (- m 1) stack))
	       (set! n (- n 1))
	       )))
      ;; Now m=0 so result is n+1.
      (set! n (+ n 1))
      ;; Done computing A() this round.
    ))
  n)

(define (time-diff t1 t2)
  (+ (* 1.0e-6 (- (cdr t1) (cdr t2)))
     (- (car t1) (car t2))))

(define (ack-time-test m n)
  (let ((start (gettimeofday)))
    (ackermann m n)
    (display (time-diff (gettimeofday) start))
    (newline)))

[-- Attachment #3: ack.el --]
[-- Type: application/octet-stream, Size: 867 bytes --]

(defun ackermann (m n)
  (let ((stack (list m)))
    (while stack
      ;; Compute A(top-of-stack, n)
      (setq m (car stack)
	    stack (cdr stack))
      (while (not (= m 0))
	(cond ((and (= n 0)
		    (> m 0))
;;;	   (ackermann (- m 1) 1)
	       (setq m (- m 1)
		     n 1))
	      (t
;;;	   (ackermann (- m 1) (ackermann m (- n 1)))
	       ;; Push outer parameter onto the stack and
	       ;; compute the inner function.
	       (setq stack (cons (- m 1) stack)
		     n (- n 1))
	       )))
      ;; Now m=0 so result is n+1.
      (setq n (+ n 1))
      ;; Done computing A() this round.
    ))
  n)

(defun time-diff (t1 t2)
 (+ (* 65536.0 (- (nth 0 t1) (nth 0 t2)))
    (* 1.0 (- (nth 1 t1) (nth 1 t2)))
    (* 1.0e-6 (- (nth 2 t1) (nth 2 t2)))))

(defun time-test (exp)
 (let ((start (current-time)))
   (eval exp)
   (time-diff (current-time) start)))

next             reply	other threads:[~2009-08-04  9:28 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-08-04  9:28 Ken Raeburn [this message]
2009-08-04 13:52 ` guile performance - Ackermann function: way slower than emacs, slower still if compiled Marijn Schouten (hkBst)
2009-08-04 15:56   ` Ken Raeburn
2009-08-05  9:06 ` Andy Wingo
2009-08-05 14:06   ` Ken Raeburn
2009-08-06  7:31     ` Ken Raeburn
2009-08-26 22:39       ` Neil Jerram
2009-08-06 16:30     ` entering and leaving guile mode, and GC stack protection (was Re: guile performance - Ackermann function: way slower than emacs, slower still if compiled) Ken Raeburn
2009-08-12 22:06       ` entering and leaving guile mode, and GC stack protection Andy Wingo
2009-08-14  7:58         ` Ludovic Courtès
2009-08-08 21:52   ` guile performance - Ackermann function: way slower than emacs, slower still if compiled Ludovic Courtès
2009-08-12 22:28     ` Andy Wingo
2009-08-05 10:42 ` Andy Wingo
2009-08-06 15:59   ` Andy Wingo
2009-08-07 17:27     ` Andy Wingo
2009-08-05 11:15 ` Andy Wingo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/guile/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CC607382-6167-422E-AD08-7BC3E6E55114@raeburn.org \
    --to=raeburn@raeburn.org \
    --cc=guile-devel@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).