From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Andy Wingo Newsgroups: gmane.lisp.guile.devel Subject: Re: guile performance - Ackermann function: way slower than emacs, slower still if compiled Date: Wed, 05 Aug 2009 11:06:01 +0200 Message-ID: References: NNTP-Posting-Host: lo.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Trace: ger.gmane.org 1249463580 18399 80.91.229.12 (5 Aug 2009 09:13:00 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Wed, 5 Aug 2009 09:13:00 +0000 (UTC) Cc: guile-devel To: Ken Raeburn Original-X-From: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Wed Aug 05 11:12:53 2009 Return-path: Envelope-to: guile-devel@m.gmane.org Original-Received: from lists.gnu.org ([199.232.76.165]) by lo.gmane.org with esmtp (Exim 4.50) id 1MYcYE-0006hQ-1N for guile-devel@m.gmane.org; Wed, 05 Aug 2009 11:12:50 +0200 Original-Received: from localhost ([127.0.0.1]:42218 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1MYcYC-00017l-W6 for guile-devel@m.gmane.org; Wed, 05 Aug 2009 05:12:49 -0400 Original-Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1MYcY8-000153-9E for guile-devel@gnu.org; Wed, 05 Aug 2009 05:12:44 -0400 Original-Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43) id 1MYcY2-0000yX-L9 for guile-devel@gnu.org; Wed, 05 Aug 2009 05:12:43 -0400 Original-Received: from [199.232.76.173] (port=33479 helo=monty-python.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1MYcY2-0000y4-Br for guile-devel@gnu.org; Wed, 05 Aug 2009 05:12:38 -0400 Original-Received: from a-sasl-quonix.sasl.smtp.pobox.com ([208.72.237.25]:46242 helo=sasl.smtp.pobox.com) by monty-python.gnu.org with esmtp (Exim 4.60) (envelope-from ) id 1MYcY1-0000tY-Va for guile-devel@gnu.org; Wed, 05 Aug 2009 05:12:38 -0400 Original-Received: from localhost.localdomain (unknown [127.0.0.1]) by a-sasl-quonix.sasl.smtp.pobox.com (Postfix) with ESMTP id A2EA2210A2; Wed, 5 Aug 2009 05:12:36 -0400 (EDT) Original-Received: from unquote (unknown [82.123.246.238]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by a-sasl-quonix.sasl.smtp.pobox.com (Postfix) with ESMTPSA id 16794210A1; Wed, 5 Aug 2009 05:12:33 -0400 (EDT) In-Reply-To: (Ken Raeburn's message of "Tue, 4 Aug 2009 05:28:33 -0400") User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.0.92 (gnu/linux) X-Pobox-Relay-ID: 1E0C42F0-81A0-11DE-A6F4-F699A5B33865-02397024!a-sasl-quonix.pobox.com X-detected-operating-system: by monty-python.gnu.org: Solaris 10 (beta) X-BeenThere: guile-devel@gnu.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Developers list for Guile, the GNU extensibility library" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Original-Sender: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Errors-To: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Xref: news.gmane.org gmane.lisp.guile.devel:9030 Archived-At: Hi Ken, On Tue 04 Aug 2009 11:28, Ken Raeburn writes: > I implemented Ackermann's function A(m,n), a recursive function with > scary performance characteristics, based on the definition given at > wikipedia involving lots of +1 and -1 operations... Guile's VM does not yet have opcodes for +1 and and -1. It probably should, though. Still, this is a fairly artificial test. > I wrote both elisp and scheme versions; I changed the recursion to an > explicitly-managed stack of values, to avoid some of the dynamic-vs- > lexical argument binding issues, turn the native stack usage into large > numbers of cell allocations and discards, and avoid exceeding the > maximum Lisp evaluation depth Emacs permits. OK now we know we are entering the realm of the artificial benchmark :-) > Not surprisingly, it's > not too fast, and Emacs spends a lot of time in garbage collection > passes even with the byte-compiled version. My time-test function here > counts seconds of real time (on an 8-core Mac without a lot of other > stuff going on, so CPU usage is probably very close): > > ELISP> (time-test '(ackermann 3 1)) > 1.6e-05 > ELISP> (time-test '(ackermann 3 6)) > 0.036018 > ELISP> (time-test '(ackermann 3 7)) > 0.16574899999999998 > ELISP> (time-test '(ackermann 3 8)) > 0.6457170000000001 > ELISP> (time-test '(ackermann 3 9)) > 2.482042 > ELISP> (time-test '(ackermann 4 0)) > 1.4e-05 > ELISP> (time-test '(ackermann 4 1)) > 642.746514 OK > With the installed guile-1.8.7 binary and the interpreted Scheme version > of the code, performance was worse: > > % time guile -l ack.scm -c '(ackermann 3 1)' > 0.009u 0.003s 0:00.01 0.0% 0+0k 0+0io 0pf+0w > % time guile -l ack.scm -c '(ackermann 3 6)' > 0.409u 0.078s 0:00.52 90.3% 0+0k 0+0io 0pf+0w > % time guile -l ack.scm -c '(ackermann 3 9)' > 25.300u 4.540s 0:29.85 99.9% 0+0k 0+0io 0pf+0w > % time guile -l ack.scm -c '(ackermann 4 1)' > 6016.150u 1122.109s 1:58:59.32 99.9% 0+0k 8+4io 149pf+0w So, you're also timing Guile startup here, which takes about 20 ms. > Yes, that's just shy of two hours for the last one, where Emacs took > less than 11 minutes. Of course, that's comparing compiled code to interpreted code, but that is a pretty bad result, yes. > I built and installed 1.9.1, so I could test with more recent code: > > % time guile -l ack.scm -c '(ackermann 3 1)' > 0.026u 0.006s 0:00.03 66.6% 0+0k 0+0io 0pf+0w > % time guile -l ack.scm -c '(ackermann 3 6)' > 0.761u 0.141s 0:00.90 100.0% 0+0k 0+0io 0pf+0w > % time guile -l ack.scm -c '(ackermann 3 9)' > 47.242u 8.644s 0:55.89 99.9% 0+0k 0+0io 0pf+0w > % > > Since this behaved worse than 1.8.6, I skipped the 4,1 test. Yes, Guile 1.9's interpreter will be slower than 1.8's. > I tested > the compiled version: > > % time guile -c '(begin (load-compiled "ack.go") (ackermann 3 1))' > 0.012u 0.004s 0:00.01 100.0% 0+0k 0+0io 0pf+0w > % time guile -c '(begin (load-compiled "ack.go") (ackermann 3 6))' > 0.772u 0.333s 0:01.10 100.0% 0+0k 0+0io 0pf+0w > % time guile -c '(begin (load-compiled "ack.go") (ackermann 3 9))' > 48.728u 22.016s 1:10.75 99.9% 0+0k 0+0io 0pf+0w > % > > Not much better than loading the .scm file, and only better for small > values of n... in fact, with m=3 and n>=6 the performance seems worse > in the precompiled case: > > % time guile -l ack.scm -c '(ackermann 3 10)' > 181.997u 34.928s 3:36.94 99.9% 0+0k 0+0io 0pf+0w > % time guile -c '(begin (load-compiled "ack.go") (ackermann 3 10))' > 196.678u 85.618s 4:42.34 99.9% 0+0k 0+0io 0pf+0w > % > > Poking at it a couple of times with Apple's process sampling tool, it > does seem to be spending a bit of time in garbage collection -- but > scm_leave_guile (called from scm_async_click via > scm_pthread_mutex_lock) and getrusage (called from > scm_c_get_internal_run_time via mytime) also seem to show up on the > stack a significant fraction of the time, and each of those appears to > be a comparable if not bigger time sink. This could be simply that Mac OS has different performance characteristics than Linux. If locking uncontended mutexen or getting the resource usage of a process are expensive, that doesn't sound very good... > According to the execution timing with the compiled code, GC actually > doesn't seem to be a huge fraction of the run time, maybe 8-9%: > > scheme@(guile-user)> ,t (ackermann 3 8) > 2045 > clock utime stime cutime cstime gctime > 18.43 12.96 5.47 0.00 0.00 1.66 > scheme@(guile-user)> ,t (ackermann 3 9) > 4093 > clock utime stime cutime cstime gctime > 73.47 51.61 21.85 0.00 0.00 5.95 > scheme@(guile-user)> > > The latter is the case that took under three seconds in Emacs, > remember. Perhaps the deal is that 1+ is not being inlined. It seems to be calling out to the 1+ procedure, which is a primitive, actually making it a bit slower. I'll see what I can do. > In retrospect, this is much heavier on the interpretation or execution > of some of the simple operations than I intended, and Emacs benefits a > lot from opcodes like "add1", and I'm more interested specifically in > exercising allocation and GC at the moment, so it's not that useful a > test for me after all. But I thought I'd send this along anyways in > case someone was interested. In particular the fact that the > performance of the precompiled version gets worse as n grows might be > something to investigate, unless the cause is already known.... Thank you for the report, I'll investigate. Andy -- http://wingolog.org/