unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed
* Take some lowhanging fruit to speed up R6RS fixnum operations
@ 2011-03-22 23:20 Andreas Rottmann
  2011-03-22 23:20 ` [PATCH] " Andreas Rottmann
  2011-03-29 11:05 ` Andy Wingo
  0 siblings, 2 replies; 28+ messages in thread
From: Andreas Rottmann @ 2011-03-22 23:20 UTC (permalink / raw)
  To: guile-devel

In porting dorodango[0], I have noticed that Guile's R6RS fixnum
operations are quite slow; here's a patch to remedy that a bit.  The
patch contains a benchmark for `fxxor', which experiences a performance
improvement of >2 with the patch.

[0] http://home.gna.org/dorodango/

I also used real code to validate the performance impact, namely Göran
Weinholt's ZIP implementation from his industria collection[1]. There
the patch makes for an ~1.7x improvement for uncompressing a 675KiB
(compressed) ZIP file containing text files that yield good compression.

[1] https://code.launchpad.net/~weinholt/scheme-libraries/industria

However, it must be noted that while this patch improves things,
uncompressing the aforementioned ZIP file takes still 42 seconds on my
machine (down from 74 before).  I have another patch in the queue to
improve this by making `fixnum?' a VM primitive, but even with that,
performance is not stellar.



^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH] Take some lowhanging fruit to speed up R6RS fixnum operations
  2011-03-22 23:20 Take some lowhanging fruit to speed up R6RS fixnum operations Andreas Rottmann
@ 2011-03-22 23:20 ` Andreas Rottmann
  2011-03-24 21:51   ` Ludovic Courtès
  2011-03-29 11:05 ` Andy Wingo
  1 sibling, 1 reply; 28+ messages in thread
From: Andreas Rottmann @ 2011-03-22 23:20 UTC (permalink / raw)
  To: guile-devel

* module/rnrs/arithmetic/fixnums.scm (assert-fixnum): Is now a
  macro.
  (assert-fixnums): New procedure checking a the elements of a list
  for fixnum-ness.  All callers applying `assert-fixnum' to a list
  now changed to use this procedure.

* module/rnrs/arithmetic/fixnums.scm (define-fxop*): New macro for
  defining n-ary procedures special-casing the binary case via
  case-lambda.  All applicable procedures redefined using this
  macro.

* benchmark-suite/benchmarks/r6rs-arithmetic.bm: New file containing
  some benchmarks for R6RS fixnum operations.
---
 benchmark-suite/benchmarks/r6rs-arithmetic.bm |   35 +++++++++++++
 module/rnrs/arithmetic/fixnums.scm            |   69 +++++++++++--------------
 2 files changed, 66 insertions(+), 38 deletions(-)
 create mode 100644 benchmark-suite/benchmarks/r6rs-arithmetic.bm

diff --git a/benchmark-suite/benchmarks/r6rs-arithmetic.bm b/benchmark-suite/benchmarks/r6rs-arithmetic.bm
new file mode 100644
index 0000000..4c9b8e6
--- /dev/null
+++ b/benchmark-suite/benchmarks/r6rs-arithmetic.bm
@@ -0,0 +1,35 @@
+;;; -*- mode: scheme; coding: utf-8; -*-
+;;; R6RS-specific arithmetic benchmarks
+;;;
+;;; Copyright (C) 2011 Free Software Foundation, Inc.
+;;;
+;;; This library is free software; you can redistribute it and/or
+;;; modify it under the terms of the GNU Lesser General Public
+;;; License as published by the Free Software Foundation; either
+;;; version 3 of the License, or (at your option) any later version.
+;;;
+;;; This library is distributed in the hope that it will be useful,
+;;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+;;; Lesser General Public License for more details.
+;;;
+;;; You should have received a copy of the GNU Lesser General Public
+;;; License along with this library. If not, see
+;;; <http://www.gnu.org/licenses/>.
+
+(define-module (benchmarks r6rs-arithmetic)
+  #:use-module (benchmark-suite lib)
+  #:use-module (rnrs arithmetic fixnums))
+
+\f
+(with-benchmark-prefix "fixnum"
+
+  (benchmark "fixnum? [yes]" 1e7
+    (fixnum? 10000))
+
+  (let ((n (+ most-positive-fixnum 100)))
+    (benchmark "fixnum? [no]" 1e7
+      (fixnum? n)))
+
+  (benchmark "fxxor [2]" 1e7
+    (fxxor 3 8)))
diff --git a/module/rnrs/arithmetic/fixnums.scm b/module/rnrs/arithmetic/fixnums.scm
index befbe9d..8c35dc6 100644
--- a/module/rnrs/arithmetic/fixnums.scm
+++ b/module/rnrs/arithmetic/fixnums.scm
@@ -87,6 +87,7 @@
 			most-negative-fixnum)
 	  (ice-9 optargs)
 	  (rnrs base (6))
+	  (rnrs control (6))
 	  (rnrs arithmetic bitwise (6))
 	  (rnrs conditions (6))
 	  (rnrs exceptions (6))
@@ -105,50 +106,42 @@
 	 (>= obj most-negative-fixnum) 
 	 (<= obj most-positive-fixnum)))
 
-  (define (assert-fixnum . args)
+  (define-syntax assert-fixnum
+    (syntax-rules ()
+      ((_ arg ...)
+       (or (and (fixnum? arg) ...)
+	   (raise (make-assertion-violation))))))
+
+  (define (assert-fixnums args)
     (or (for-all fixnum? args) (raise (make-assertion-violation))))
 
-  (define (fx=? fx1 fx2 . rst)
-    (let ((args (cons* fx1 fx2 rst)))
-      (apply assert-fixnum args) 
-      (apply = args)))
-
-  (define (fx>? fx1 fx2 . rst)
-    (let ((args (cons* fx1 fx2 rst))) 
-      (apply assert-fixnum args) 
-      (apply > args)))
-
-  (define (fx<? fx1 fx2 . rst)
-    (let ((args (cons* fx1 fx2 rst)))
-      (apply assert-fixnum rst)
-      (apply < args)))
-
-  (define (fx>=? fx1 fx2 . rst)
-    (let ((args (cons* fx1 fx2 rst)))
-      (apply assert-fixnum rst)
-      (apply >= args)))
-
-  (define (fx<=? fx1 fx2 . rst)
-    (let ((args (cons* fx1 fx2 rst)))
-      (apply assert-fixnum rst)
-      (apply <= args)))
-  
+  (define-syntax define-fxop*
+    (syntax-rules ()
+      ((_ name op)
+       (define name
+	 (case-lambda
+	   ((x y)
+	    (assert-fixnum x y)
+	    (op x y))
+	   (args
+	    (assert-fixnums args)
+	    (apply op args)))))))
+
+  (define-fxop* fx=? =)
+  (define-fxop* fx>? >)
+  (define-fxop* fx<? <)
+  (define-fxop* fx>=? >=)
+  (define-fxop* fx<=? <=)
+
   (define (fxzero? fx) (assert-fixnum fx) (zero? fx))
   (define (fxpositive? fx) (assert-fixnum fx) (positive? fx))
   (define (fxnegative? fx) (assert-fixnum fx) (negative? fx))
   (define (fxodd? fx) (assert-fixnum fx) (odd? fx))
   (define (fxeven? fx) (assert-fixnum fx) (even? fx))
 
-  (define (fxmax fx1 fx2 . rst)
-    (let ((args (cons* fx1 fx2 rst)))
-      (apply assert-fixnum args)
-      (apply max args)))
+  (define-fxop* fxmax max)
+  (define-fxop* fxmin min)
 
-  (define (fxmin fx1 fx2 . rst)
-    (let ((args (cons* fx1 fx2 rst)))
-      (apply assert-fixnum args)
-      (apply min args)))
- 
   (define (fx+ fx1 fx2)
     (assert-fixnum fx1 fx2) 
     (let ((r (+ fx1 fx2))) 
@@ -219,9 +212,9 @@
       (values s0 s1)))
 
   (define (fxnot fx) (assert-fixnum fx) (lognot fx))
-  (define (fxand . args) (apply assert-fixnum args) (apply logand args))
-  (define (fxior . args) (apply assert-fixnum args) (apply logior args))
-  (define (fxxor . args) (apply assert-fixnum args) (apply logxor args))
+  (define-fxop* fxand logand)
+  (define-fxop* fxior logior)
+  (define-fxop* fxxor logxor)
 
   (define (fxif fx1 fx2 fx3) 
     (assert-fixnum fx1 fx2 fx3) 
-- 
1.7.4.1




^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH] Take some lowhanging fruit to speed up R6RS fixnum operations
  2011-03-22 23:20 ` [PATCH] " Andreas Rottmann
@ 2011-03-24 21:51   ` Ludovic Courtès
  2011-03-24 23:42     ` Andreas Rottmann
  0 siblings, 1 reply; 28+ messages in thread
From: Ludovic Courtès @ 2011-03-24 21:51 UTC (permalink / raw)
  To: guile-devel

Hi Andreas,

I’m all for your suggestion.

Andreas Rottmann <a.rottmann@gmx.at> writes:

> +  (define-syntax define-fxop*
> +    (syntax-rules ()
> +      ((_ name op)
> +       (define name
> +	 (case-lambda
> +	   ((x y)
> +	    (assert-fixnum x y)
> +	    (op x y))
> +	   (args
> +	    (assert-fixnums args)
> +	    (apply op args)))))))
> +
> +  (define-fxop* fx=? =)

How about making something like this (untested):

  (define-syntax define-fxop*
    (syntax-rules ()
      ((_ name op)
       (define-syntax name
         (lambda (s)
           (syntax-case s ()
             ((_ x y)
              #'(begin
                  (assert-fixnum x y)
                  (op x y)))
             ((_ args ...)
              #'(apply op args))
             (_ #'op)))))))

This way, there’d be no procedure call involved and ‘=’, ‘+’, etc. would
use the corresponding instruction in the base case.

Thanks,
Ludo’.




^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] Take some lowhanging fruit to speed up R6RS fixnum operations
  2011-03-24 21:51   ` Ludovic Courtès
@ 2011-03-24 23:42     ` Andreas Rottmann
  2011-03-25 12:16       ` Andreas Rottmann
  0 siblings, 1 reply; 28+ messages in thread
From: Andreas Rottmann @ 2011-03-24 23:42 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

ludo@gnu.org (Ludovic Courtès) writes:

> Hi Andreas,
>
> I’m all for your suggestion.
>
> Andreas Rottmann <a.rottmann@gmx.at> writes:
>
>> +  (define-syntax define-fxop*
>> +    (syntax-rules ()
>> +      ((_ name op)
>> +       (define name
>> +	 (case-lambda
>> +	   ((x y)
>> +	    (assert-fixnum x y)
>> +	    (op x y))
>> +	   (args
>> +	    (assert-fixnums args)
>> +	    (apply op args)))))))
>> +
>> +  (define-fxop* fx=? =)
>
> How about making something like this (untested):
>
>   (define-syntax define-fxop*
>     (syntax-rules ()
>       ((_ name op)
>        (define-syntax name
>          (lambda (s)
>            (syntax-case s ()
>              ((_ x y)
>               #'(begin
>                   (assert-fixnum x y)
>                   (op x y)))
>              ((_ args ...)
>               #'(apply op args))
>              (_ #'op)))))))
>
> This way, there’d be no procedure call involved and ‘=’, ‘+’, etc. would
> use the corresponding instruction in the base case.
>
That's an excellent idea, and would probably boost performance quite a
bit.  However, there's still the issue that the non-two-args cases don't
throw the exceptions required by R6RS.  Upon trying to interpret the
R6RS in a way that would lift that requirement, I still see no way to do
so in the general case, due to the following quite explicit language:

  Fixnum operations perform integer arithmetic on their fixnum
  arguments, but raise an exception with condition type
  &implementation-restriction if the result is not a fixnum.

So even if checking the arguments seems to be not explicitly required
(but might be implictly intended -- "fixnum arguments"), the result must
definitly be a fixnum, or the mentioned exception be thrown.

So for the sake of efficiency, while heeding what is explictly required,
I propose the following:

- Alias all fixnum operations that yield non-fixnums (e.g. fx<?, fx>?,
  ...) with the corresponding non-fixnum procedures.  IMHO, that seems
  to be within bounds wrt. R6RS; Section 5.4 "Argument checking"
  says:

  The implementation must check that the restrictions in the
  specification are indeed met, to the extent that it is reasonable,
  possible, and necessary to allow the specified operation to complete
  successfully.  

  One might argue that it is not reasonable (for efficiency reasons
  related to the way Guile is currently implemented) and certainly not
  necessary to check the arguments for fixnum-ness in these procedures.

- Otherwise, do the macro trick as above, but fall back on a procedure
  like in SRFI-9's `define-inlinable' for the non-binary case.

Thoughts?

Regards, Rotty
-- 
Andreas Rottmann -- <http://rotty.yi.org/>



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] Take some lowhanging fruit to speed up R6RS fixnum operations
  2011-03-24 23:42     ` Andreas Rottmann
@ 2011-03-25 12:16       ` Andreas Rottmann
  2011-03-27 15:19         ` Ludovic Courtès
  0 siblings, 1 reply; 28+ messages in thread
From: Andreas Rottmann @ 2011-03-25 12:16 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

Andreas Rottmann <a.rottmann@gmx.at> writes:

> ludo@gnu.org (Ludovic Courtès) writes:
>
>> Hi Andreas,
>>
>> I’m all for your suggestion.
>>
>> Andreas Rottmann <a.rottmann@gmx.at> writes:
>>
>>> +  (define-syntax define-fxop*
>>> +    (syntax-rules ()
>>> +      ((_ name op)
>>> +       (define name
>>> +	 (case-lambda
>>> +	   ((x y)
>>> +	    (assert-fixnum x y)
>>> +	    (op x y))
>>> +	   (args
>>> +	    (assert-fixnums args)
>>> +	    (apply op args)))))))
>>> +
>>> +  (define-fxop* fx=? =)
>>
>> How about making something like this (untested):
>>
>>   (define-syntax define-fxop*
>>     (syntax-rules ()
>>       ((_ name op)
>>        (define-syntax name
>>          (lambda (s)
>>            (syntax-case s ()
>>              ((_ x y)
>>               #'(begin
>>                   (assert-fixnum x y)
>>                   (op x y)))
>>              ((_ args ...)
>>               #'(apply op args))
>>              (_ #'op)))))))
>>
>> This way, there’d be no procedure call involved and ‘=’, ‘+’, etc. would
>> use the corresponding instruction in the base case.
>>
> That's an excellent idea, and would probably boost performance quite a
> bit.
>
Sorry to reply to myself, but new evidence has been produced :-).

Using the following macro:

  (define-syntax define-fxop*
    (lambda (stx)
      (define prefix (string->symbol "% "))
      (define (make-procedure-name name)
	(datum->syntax name
		       (symbol-append prefix (syntax->datum name)
				      '-procedure)))
      (syntax-case stx ()
	((_ name op)
	 (with-syntax ((proc-name (make-procedure-name #'name)))
	   #'(begin
	       (define (proc-name . args)
		 (assert-fixnums args)
		 (apply op args))
	       (define-syntax name
		 (lambda (stx)
		   (syntax-case stx ()
		     ((_ x-expr y-expr)
		      #'(let ((x x-expr)
			      (y y-expr))
			  (assert-fixnum x y)
			  (op x y)))
		     ((_ . args)
		      #'(proc-name . args))
		     (_
		      (identifier? stx)
		      #'proc-name))))))))))

It turns out that this does not have a great effect on the efficiency;
here are the numbers:

    * stable-2.0 + rotty/wip-fixnum-speed (simple define-fxop*, clever fixnum?)

    #+begin_example
    % _build/meta/guile -e main -s benchmark-suite/guile-benchmark --benchmark-suite benchmarks r6rs-arithmetic.bm
    ;; running guile version 2.0.0.133-e47c9-dirty
    ;; calibrating the benchmarking framework...
    ;; framework time per iteration: 1.1444091796875e-7
    ("r6rs-arithmetic.bm: fixnum: fixnum? [yes]" 10000000 user 3.42 benchmark 2.2755908203125 bench/interp 2.2755908203125 gc 0.0)
    ("r6rs-arithmetic.bm: fixnum: fixnum? [no]" 10000000 user 3.42 benchmark 2.2755908203125 bench/interp 2.2755908203125 gc 0.0)
    ("r6rs-arithmetic.bm: fixnum: fxxor [2]" 10000000 user 6.96 benchmark 5.8155908203125 bench/interp 5.8155908203125 gc 0.0)

    scheme@(guile-user)> ,time (run-benchmark)
    clock utime stime cutime cstime gctime
    23.81 23.76  0.00   0.00   0.00   0.00
    #+end_example

    * stable-2.0 + rotty/wip-fixnum-speed (macro-based define-fxop*, clever fixnum?)

    #+begin_example
    % _build/meta/guile -e main -s benchmark-suite/guile-benchmark --benchmark-suite benchmarks r6rs-arithmetic.bm
    ;; running guile version 2.0.0.133-e47c9-dirty
    ;; calibrating the benchmarking framework...
    ;; framework time per iteration: 1.52587890625e-7
    ("r6rs-arithmetic.bm: fixnum: fixnum? [yes]" 10000000 user 3.4 benchmark 1.87412109375 bench/interp 1.87412109375 gc 0.0)
    ("r6rs-arithmetic.bm: fixnum: fixnum? [no]" 10000000 user 3.42 benchmark 1.89412109375 bench/interp 1.89412109375 gc 0.0)
    ("r6rs-arithmetic.bm: fixnum: fxxor [2]" 10000000 user 6.72 benchmark 5.19412109375 bench/interp 5.19412109375 gc 0.0)

    scheme@(guile-user)> ,time (run-benchmark)
    clock utime stime cutime cstime gctime
    22.76 22.73  0.00   0.00   0.00   0.00
    #+end_example

So, that's around 5% improvment (on the ZIP benchmark) for an IMHO
significantly more hackish implementation.  I'm not sure that's worth
it. WDYT?

Regards, Rotty
-- 
Andreas Rottmann -- <http://rotty.yi.org/>



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] Take some lowhanging fruit to speed up R6RS fixnum operations
  2011-03-25 12:16       ` Andreas Rottmann
@ 2011-03-27 15:19         ` Ludovic Courtès
  2011-03-27 22:20           ` Andreas Rottmann
  0 siblings, 1 reply; 28+ messages in thread
From: Ludovic Courtès @ 2011-03-27 15:19 UTC (permalink / raw)
  To: Andreas Rottmann; +Cc: guile-devel

Hello,

Andreas Rottmann <a.rottmann@gmx.at> writes:

> So, that's around 5% improvment (on the ZIP benchmark) for an IMHO
> significantly more hackish implementation.  I'm not sure that's worth
> it. WDYT?

Was it with ‘fixnum?’ inline, or with the ‘fixnum?’ instruction?

It’s ironic that while R6RS fixnums are a performance hack, they end up
being less efficient than unbounded integers in Guile.

Do you know how other implementations deal with that?

Thanks,
Ludo’.



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] Take some lowhanging fruit to speed up R6RS fixnum operations
  2011-03-27 15:19         ` Ludovic Courtès
@ 2011-03-27 22:20           ` Andreas Rottmann
  0 siblings, 0 replies; 28+ messages in thread
From: Andreas Rottmann @ 2011-03-27 22:20 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

ludo@gnu.org (Ludovic Courtès) writes:

> Hello,
>
> Andreas Rottmann <a.rottmann@gmx.at> writes:
>
>> So, that's around 5% improvment (on the ZIP benchmark) for an IMHO
>> significantly more hackish implementation.  I'm not sure that's worth
>> it. WDYT?
>
> Was it with ‘fixnum?’ inline, or with the ‘fixnum?’ instruction?
>
The numbers from my previous mail were with the "clever" `fixnum?'
(i.e. the one that uses object-address and bit-twiddling), defined with
`define-inline'.  I have yet to produce numbers with `fixnum?' as VM
primitive and the more complicated, partially-inlinable fixnum
operators.  I'll keep you posted.

> It’s ironic that while R6RS fixnums are a performance hack, they end up
> being less efficient than unbounded integers in Guile.
>
Indeed.

> Do you know how other implementations deal with that?
>
I will do some research on that topic, investigating Racket, Ikarus and
Ypsilon -- these are the R6RS implementations, which I'm most familiar
with (besides Guile, obviously).

Regards, Rotty
-- 
Andreas Rottmann -- <http://rotty.yi.org/>



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Take some lowhanging fruit to speed up R6RS fixnum operations
  2011-03-22 23:20 Take some lowhanging fruit to speed up R6RS fixnum operations Andreas Rottmann
  2011-03-22 23:20 ` [PATCH] " Andreas Rottmann
@ 2011-03-29 11:05 ` Andy Wingo
  2011-03-30  1:37   ` Andreas Rottmann
                     ` (2 more replies)
  1 sibling, 3 replies; 28+ messages in thread
From: Andy Wingo @ 2011-03-29 11:05 UTC (permalink / raw)
  To: Andreas Rottmann; +Cc: guile-devel

On Wed 23 Mar 2011 00:20, Andreas Rottmann <a.rottmann@gmx.at> writes:

> In porting dorodango[0], I have noticed that Guile's R6RS fixnum
> operations are quite slow; here's a patch to remedy that a bit.

What is the state of things here?  I'm a bit lost in the discussion :)

You said that decompressing the zip file takes 74 seconds with Guile as
it is now.  What if you change to use Guile's arithmetic instead of
fxops?  That would give is a good baseline so that we can know how much
overhead the fxops have.

Andy
-- 
http://wingolog.org/



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Take some lowhanging fruit to speed up R6RS fixnum operations
  2011-03-29 11:05 ` Andy Wingo
@ 2011-03-30  1:37   ` Andreas Rottmann
  2011-03-30 10:31   ` Andreas Rottmann
  2011-03-30 10:58   ` Andreas Rottmann
  2 siblings, 0 replies; 28+ messages in thread
From: Andreas Rottmann @ 2011-03-30  1:37 UTC (permalink / raw)
  To: Andy Wingo; +Cc: guile-devel

Andy Wingo <wingo@pobox.com> writes:

> On Wed 23 Mar 2011 00:20, Andreas Rottmann <a.rottmann@gmx.at> writes:
>
>> In porting dorodango[0], I have noticed that Guile's R6RS fixnum
>> operations are quite slow; here's a patch to remedy that a bit.
>
> What is the state of things here?  I'm a bit lost in the discussion :)
>
I'll post a summary mail regarding performace soonish; it's quite
tedious to set up branches for all the possibly interesting combinations
of optimizations, and then actually running the benchmarks.

> You said that decompressing the zip file takes 74 seconds with Guile as
> it is now.  What if you change to use Guile's arithmetic instead of
> fxops?  That would give is a good baseline so that we can know how much
> overhead the fxops have.
>
I'll put numbers for that into the benchmark; i.e. with a `(rnrs
arithmetic fixnum)' module just containing aliases.

Regards, Rotty
-- 
Andreas Rottmann -- <http://rotty.yi.org/>



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Take some lowhanging fruit to speed up R6RS fixnum operations
  2011-03-29 11:05 ` Andy Wingo
  2011-03-30  1:37   ` Andreas Rottmann
@ 2011-03-30 10:31   ` Andreas Rottmann
  2011-03-30 10:58   ` Andreas Rottmann
  2 siblings, 0 replies; 28+ messages in thread
From: Andreas Rottmann @ 2011-03-30 10:31 UTC (permalink / raw)
  To: Andy Wingo; +Cc: guile-devel

Andy Wingo <wingo@pobox.com> writes:

> On Wed 23 Mar 2011 00:20, Andreas Rottmann <a.rottmann@gmx.at> writes:
>
>> In porting dorodango[0], I have noticed that Guile's R6RS fixnum
>> operations are quite slow; here's a patch to remedy that a bit.
>
> What is the state of things here?  I'm a bit lost in the discussion :)
>
> You said that decompressing the zip file takes 74 seconds with Guile as
> it is now.  What if you change to use Guile's arithmetic instead of
> fxops?  That would give is a good baseline so that we can know how much
> overhead the fxops have.
>
OK, here are a bunch of numbers that will hopefully help to determine
the desired course of action:

| Codebase         | fixnum? | fxxor | unzip |
|------------------+---------+-------+-------|
| vanilla          |    5.66 | 24.69 |  65.9 |
| generic          |    5.62 |  1.92 |   7.7 |
| smart            |    3.42 | 21.29 |  56.3 |
| smart + opt        |    3.42 |  6.99 |  21.9 |
| smart + opt-macros |     3.4 |  5.56 |  21.0 |
| vmop             |    1.78 | 18.83 |  47.9 |
| vmop+opt         |    1.77 |  3.94 |  14.1 |
| vmop+opt-macros  |    1.78 |   3.7 |  13.3 |

As you can see, in the "vanilla" case (i.e. current stable-2.0), it's
65.9 seconds -- I do not know where I got the 74 second number from,
perhaps it was from a profiling run.

The "generic" case is the one where all fixnum checks have been
eliminated, thus allowing the compiler to emit VM primitives for most of
the operations.

Now, with these baselines established, I implemented the following
optimizations, and benchmarked combinations of them as can be seen from
the above table.

-- 
Andreas Rottmann -- <http://rotty.yi.org/>



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Take some lowhanging fruit to speed up R6RS fixnum operations
  2011-03-29 11:05 ` Andy Wingo
  2011-03-30  1:37   ` Andreas Rottmann
  2011-03-30 10:31   ` Andreas Rottmann
@ 2011-03-30 10:58   ` Andreas Rottmann
  2011-04-02 17:42     ` R6RS fixnum arithmetic optimizations Andreas Rottmann
  2011-04-04 21:28     ` Take some lowhanging fruit to speed up R6RS fixnum operations Andy Wingo
  2 siblings, 2 replies; 28+ messages in thread
From: Andreas Rottmann @ 2011-03-30 10:58 UTC (permalink / raw)
  To: Andy Wingo; +Cc: guile-devel


[ Sorry for the incomplete mail I just sent; hit the wrong key combo ]

Andy Wingo <wingo@pobox.com> writes:

> On Wed 23 Mar 2011 00:20, Andreas Rottmann <a.rottmann@gmx.at> writes:
>
>> In porting dorodango[0], I have noticed that Guile's R6RS fixnum
>> operations are quite slow; here's a patch to remedy that a bit.
>
> What is the state of things here?  I'm a bit lost in the discussion :)
>
> You said that decompressing the zip file takes 74 seconds with Guile as
> it is now.  What if you change to use Guile's arithmetic instead of
> fxops?  That would give is a good baseline so that we can know how much
> overhead the fxops have.
>
As promised, here are a bunch of numbers that will hopefully help to
determine the desired course of action:

| Codebase           | fixnum? | fxxor | unzip |
|--------------------+---------+-------+-------|
| vanilla            |    5.66 | 24.69 |  65.9 |
| generic            |    5.62 |  1.92 |   7.7 |
| smart              |    3.42 | 21.29 |  56.3 |
| smart + opt        |    3.42 |  6.99 |  21.9 |
| smart + opt-macros |     3.4 |  5.56 |  21.0 |
| vmop               |    1.78 | 18.83 |  47.9 |
| vmop + opt         |    1.77 |  3.94 |  14.1 |
| vmop + opt-macros  |    1.78 |   3.7 |  13.3 |

As you can see, in the "vanilla" case (i.e. current stable-2.0), it's
65.9 seconds -- I do not know where I got the 74 second number from,
perhaps it was from a profiling run.

The "generic" case is the one where all fixnum checks have been
eliminated (i.e. using the generic procedures, such as logxor), thus
allowing the compiler to emit VM primitives for most of the operations.

Now, with these baselines established, I implemented the following
optimizations, and benchmarked combinations of them as can be seen from
the above table:

- "smart" refers to having `fixnum?' implemented via `define-inline' and
  using `object-address'.

- "opt" refers to having many fixnum procedures (e.g. fx+, fxxor, ...)
  going through a case-lambda, thus optimizing the common, two-argument
  case not to cons.

- "opt-macros" refers to making the same procedures as in "opt" actually
  macros, expanding to a `(let ((x x-expr) (y y-expr)) (OP x y))' in the
  binary case.

- "vmop" means that the newly-introduced VM primitive `fixnum?' is used.

Regarding the columns, "fixnum?" is the time needed for 10 million
`fixnum?' checks, "fxxor" for 10 million binary `fxxor' operations, and
"unzip" times the unzipping of a not-so-large (675KiB) ZIP file using
Göran Weinholts ZIP implementation.

Now there are several questions:

- Can we put the `fixnum?' VM primitive into stable-2.0?  AFAIK, yes we
  can, but we will break things for users mixing 2.0.0 and 2.0.1 on the
  same machine (or sharing $HOME).  So do we want to?

- Should we complicate the implementation and take a potential code size
  hit by using the "opt-macros" hack, for a 5-6% gain in speed? Of
  course, that's just for the ZIP benchmark, but I think it gives a good
  estimate in what kind of ballpark we are in here.

  My preference would be to go with the simpler implementation ("opt"
  instead of "opt-macros").

- Should further improvements be considered?  I guess if we put the
  fixnum operations into the VM, we could get as fast as the "generic"
  case, and perhaps even a bit faster (i.e. about a 2x speedup on the
  ZIP benchmark compared to vmop + opt).  I'm not sure if that's worth
  it.

Regards, Rotty
-- 
Andreas Rottmann -- <http://rotty.yi.org/>



^ permalink raw reply	[flat|nested] 28+ messages in thread

* R6RS fixnum arithmetic optimizations
  2011-03-30 10:58   ` Andreas Rottmann
@ 2011-04-02 17:42     ` Andreas Rottmann
  2011-04-02 17:42       ` [PATCH 1/3] Add a few benchmarks for R6RS fixnum arithmetic Andreas Rottmann
                         ` (2 more replies)
  2011-04-04 21:28     ` Take some lowhanging fruit to speed up R6RS fixnum operations Andy Wingo
  1 sibling, 3 replies; 28+ messages in thread
From: Andreas Rottmann @ 2011-04-02 17:42 UTC (permalink / raw)
  To: guile-devel

After getting no feedback on my last mail so far, here is a series of
proposed patches for the stable-2.0 branch.  In total, they speed up the
unzip benchmark by a factor of about 4.7, but it's still about 1.8 times
slower than what we'd get for using generic (non-fixnum) operations.




^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/3] Add a few benchmarks for R6RS fixnum arithmetic
  2011-04-02 17:42     ` R6RS fixnum arithmetic optimizations Andreas Rottmann
@ 2011-04-02 17:42       ` Andreas Rottmann
  2011-04-02 17:42       ` [PATCH 2/3] Several optimizations " Andreas Rottmann
  2011-04-02 17:42       ` [PATCH 3/3] Add `fixnum?' VM primitive Andreas Rottmann
  2 siblings, 0 replies; 28+ messages in thread
From: Andreas Rottmann @ 2011-04-02 17:42 UTC (permalink / raw)
  To: guile-devel

* benchmark-suite/benchmarks/r6rs-arithmetic.bm: New file containing
  some benchmarks for R6RS fixnum operations.
* benchmark-suite/Makefile.am (SCM_BENCHMARKS): Add
  benchmarks/r6rs-arithmetic.
---
 benchmark-suite/Makefile.am                   |    1 +
 benchmark-suite/benchmarks/r6rs-arithmetic.bm |   35 +++++++++++++++++++++++++
 2 files changed, 36 insertions(+), 0 deletions(-)
 create mode 100644 benchmark-suite/benchmarks/r6rs-arithmetic.bm

diff --git a/benchmark-suite/Makefile.am b/benchmark-suite/Makefile.am
index bac1df3..f29743f 100644
--- a/benchmark-suite/Makefile.am
+++ b/benchmark-suite/Makefile.am
@@ -6,6 +6,7 @@ SCM_BENCHMARKS = benchmarks/0-reference.bm		\
                  benchmarks/if.bm			\
                  benchmarks/logand.bm			\
 		 benchmarks/ports.bm			\
+		 benchmarks/r6rs-arithmetic.bm		\
 		 benchmarks/read.bm			\
 		 benchmarks/srfi-1.bm			\
 		 benchmarks/srfi-13.bm			\
diff --git a/benchmark-suite/benchmarks/r6rs-arithmetic.bm b/benchmark-suite/benchmarks/r6rs-arithmetic.bm
new file mode 100644
index 0000000..4c9b8e6
--- /dev/null
+++ b/benchmark-suite/benchmarks/r6rs-arithmetic.bm
@@ -0,0 +1,35 @@
+;;; -*- mode: scheme; coding: utf-8; -*-
+;;; R6RS-specific arithmetic benchmarks
+;;;
+;;; Copyright (C) 2011 Free Software Foundation, Inc.
+;;;
+;;; This library is free software; you can redistribute it and/or
+;;; modify it under the terms of the GNU Lesser General Public
+;;; License as published by the Free Software Foundation; either
+;;; version 3 of the License, or (at your option) any later version.
+;;;
+;;; This library is distributed in the hope that it will be useful,
+;;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+;;; Lesser General Public License for more details.
+;;;
+;;; You should have received a copy of the GNU Lesser General Public
+;;; License along with this library. If not, see
+;;; <http://www.gnu.org/licenses/>.
+
+(define-module (benchmarks r6rs-arithmetic)
+  #:use-module (benchmark-suite lib)
+  #:use-module (rnrs arithmetic fixnums))
+
+\f
+(with-benchmark-prefix "fixnum"
+
+  (benchmark "fixnum? [yes]" 1e7
+    (fixnum? 10000))
+
+  (let ((n (+ most-positive-fixnum 100)))
+    (benchmark "fixnum? [no]" 1e7
+      (fixnum? n)))
+
+  (benchmark "fxxor [2]" 1e7
+    (fxxor 3 8)))
-- 
1.7.4.1




^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 2/3] Several optimizations for R6RS fixnum arithmetic
  2011-04-02 17:42     ` R6RS fixnum arithmetic optimizations Andreas Rottmann
  2011-04-02 17:42       ` [PATCH 1/3] Add a few benchmarks for R6RS fixnum arithmetic Andreas Rottmann
@ 2011-04-02 17:42       ` Andreas Rottmann
  2011-04-02 17:42       ` [PATCH 3/3] Add `fixnum?' VM primitive Andreas Rottmann
  2 siblings, 0 replies; 28+ messages in thread
From: Andreas Rottmann @ 2011-04-02 17:42 UTC (permalink / raw)
  To: guile-devel

* module/rnrs/arithmetic/fixnums.scm (assert-fixnum): Is now a
  macro.
  (assert-fixnums): New procedure checking a the elements of a list
  for fixnum-ness.  All callers applying `assert-fixnum' to a list
  now changed to use this procedure.

* module/rnrs/arithmetic/fixnums.scm (define-fxop*): New for defining
  n-ary inlinable special-casing the binary case using `case-lambda'.
  All applicable procedures redefined using this macro.

* module/rnrs/arithmetic/fixnums.scm: Alias all predicates to
  their non-fixnum counterparts.
---
 module/rnrs/arithmetic/fixnums.scm |   86 +++++++++++++++++-------------------
 1 files changed, 41 insertions(+), 45 deletions(-)

diff --git a/module/rnrs/arithmetic/fixnums.scm b/module/rnrs/arithmetic/fixnums.scm
index befbe9d..03511ed 100644
--- a/module/rnrs/arithmetic/fixnums.scm
+++ b/module/rnrs/arithmetic/fixnums.scm
@@ -87,6 +87,7 @@
 			most-negative-fixnum)
 	  (ice-9 optargs)
 	  (rnrs base (6))
+	  (rnrs control (6))
 	  (rnrs arithmetic bitwise (6))
 	  (rnrs conditions (6))
 	  (rnrs exceptions (6))
@@ -105,50 +106,45 @@
 	 (>= obj most-negative-fixnum) 
 	 (<= obj most-positive-fixnum)))
 
-  (define (assert-fixnum . args)
+  (define-syntax assert-fixnum
+    (syntax-rules ()
+      ((_ arg ...)
+       (or (and (fixnum? arg) ...)
+	   (raise (make-assertion-violation))))))
+
+  (define (assert-fixnums args)
     (or (for-all fixnum? args) (raise (make-assertion-violation))))
 
-  (define (fx=? fx1 fx2 . rst)
-    (let ((args (cons* fx1 fx2 rst)))
-      (apply assert-fixnum args) 
-      (apply = args)))
-
-  (define (fx>? fx1 fx2 . rst)
-    (let ((args (cons* fx1 fx2 rst))) 
-      (apply assert-fixnum args) 
-      (apply > args)))
-
-  (define (fx<? fx1 fx2 . rst)
-    (let ((args (cons* fx1 fx2 rst)))
-      (apply assert-fixnum rst)
-      (apply < args)))
-
-  (define (fx>=? fx1 fx2 . rst)
-    (let ((args (cons* fx1 fx2 rst)))
-      (apply assert-fixnum rst)
-      (apply >= args)))
-
-  (define (fx<=? fx1 fx2 . rst)
-    (let ((args (cons* fx1 fx2 rst)))
-      (apply assert-fixnum rst)
-      (apply <= args)))
-  
-  (define (fxzero? fx) (assert-fixnum fx) (zero? fx))
-  (define (fxpositive? fx) (assert-fixnum fx) (positive? fx))
-  (define (fxnegative? fx) (assert-fixnum fx) (negative? fx))
-  (define (fxodd? fx) (assert-fixnum fx) (odd? fx))
-  (define (fxeven? fx) (assert-fixnum fx) (even? fx))
-
-  (define (fxmax fx1 fx2 . rst)
-    (let ((args (cons* fx1 fx2 rst)))
-      (apply assert-fixnum args)
-      (apply max args)))
-
-  (define (fxmin fx1 fx2 . rst)
-    (let ((args (cons* fx1 fx2 rst)))
-      (apply assert-fixnum args)
-      (apply min args)))
- 
+  (define-syntax define-fxop*
+    (syntax-rules ()
+      ((_ name op)
+       (define name
+	(case-lambda
+	  ((x y)
+	   (assert-fixnum x y)
+	   (op x y))
+	  (args
+	   (assert-fixnums args)
+           (apply op args)))))))
+
+  ;; All these predicates don't check their arguments for fixnum-ness,
+  ;; as this doesn't seem to be strictly required by R6RS.
+
+  (define fx=? =)
+  (define fx>? >)
+  (define fx<? <)
+  (define fx>=? >=)
+  (define fx<=? <=)
+
+  (define fxzero? zero?)
+  (define fxpositive? positive?)
+  (define fxnegative? negative?)
+  (define fxodd? odd?)
+  (define fxeven? even?)
+
+  (define-fxop* fxmax max)
+  (define-fxop* fxmin min)
+
   (define (fx+ fx1 fx2)
     (assert-fixnum fx1 fx2) 
     (let ((r (+ fx1 fx2))) 
@@ -219,9 +215,9 @@
       (values s0 s1)))
 
   (define (fxnot fx) (assert-fixnum fx) (lognot fx))
-  (define (fxand . args) (apply assert-fixnum args) (apply logand args))
-  (define (fxior . args) (apply assert-fixnum args) (apply logior args))
-  (define (fxxor . args) (apply assert-fixnum args) (apply logxor args))
+  (define-fxop* fxand logand)
+  (define-fxop* fxior logior)
+  (define-fxop* fxxor logxor)
 
   (define (fxif fx1 fx2 fx3) 
     (assert-fixnum fx1 fx2 fx3) 
-- 
1.7.4.1




^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 3/3] Add `fixnum?' VM primitive
  2011-04-02 17:42     ` R6RS fixnum arithmetic optimizations Andreas Rottmann
  2011-04-02 17:42       ` [PATCH 1/3] Add a few benchmarks for R6RS fixnum arithmetic Andreas Rottmann
  2011-04-02 17:42       ` [PATCH 2/3] Several optimizations " Andreas Rottmann
@ 2011-04-02 17:42       ` Andreas Rottmann
  2011-04-04 21:53         ` Andy Wingo
  2 siblings, 1 reply; 28+ messages in thread
From: Andreas Rottmann @ 2011-04-02 17:42 UTC (permalink / raw)
  To: guile-devel

This primitive can be used to significantly speed up the operations in
`(rnrs arithmetic fixnums)'.

* libguile/r6rs-arithmetic.c: New file containing `fixnum?' procedure
  implementation as a new extension.
* libguile/r6rs-arithmetic.h: New file with prototypes for the above.
* libguile/Makefile.am: Add above files in relevant places.

* libguile/init.c (scm_i_init_guile): Register R6RS arithmetic
  extension.

* libguile/vm-i-scheme.c (fixnump): New VM primitive.
* module/language/tree-il/compile-glil.scm (*primcall-ops*): Add
  `fixnum?'.
* module/language/tree-il/primitives.scm
  (*interesting-primitive-names*, *effect-free-primitives*)
  (*effect+exception-free-primitives*): Add `fixnum?'.
---
 libguile/Makefile.am                     |    4 ++
 libguile/init.c                          |    2 +
 libguile/numbers.c                       |    1 -
 libguile/numbers.h                       |    1 +
 libguile/r6rs-arithmetic.c               |   48 ++++++++++++++++++++++++++++++
 libguile/r6rs-arithmetic.h               |   30 ++++++++++++++++++
 libguile/vm-i-scheme.c                   |    8 ++++-
 module/language/tree-il/compile-glil.scm |    1 +
 module/language/tree-il/primitives.scm   |    9 +++--
 module/rnrs/arithmetic/fixnums.scm       |   11 +++----
 10 files changed, 103 insertions(+), 12 deletions(-)
 create mode 100644 libguile/r6rs-arithmetic.c
 create mode 100644 libguile/r6rs-arithmetic.h

diff --git a/libguile/Makefile.am b/libguile/Makefile.am
index ac27eb8..01a384d 100644
--- a/libguile/Makefile.am
+++ b/libguile/Makefile.am
@@ -179,6 +179,7 @@ libguile_@GUILE_EFFECTIVE_VERSION@_la_SOURCES =				\
 	procs.c					\
 	programs.c				\
 	promises.c				\
+	r6rs-arithmetic.c			\
 	r6rs-ports.c				\
 	random.c				\
 	rdelim.c				\
@@ -275,6 +276,7 @@ DOT_X_FILES =					\
 	procprop.x				\
 	procs.x					\
 	promises.x				\
+	r6rs-arithmetic.x			\
 	r6rs-ports.x				\
 	random.x				\
 	rdelim.x				\
@@ -375,6 +377,7 @@ DOT_DOC_FILES = 				\
 	procprop.doc				\
 	procs.doc				\
 	promises.doc				\
+	r6rs-arithmetic.doc			\
 	r6rs-ports.doc				\
 	random.doc				\
 	rdelim.doc				\
@@ -569,6 +572,7 @@ modinclude_HEADERS =				\
 	programs.h				\
 	promises.h				\
 	pthread-threads.h			\
+	r6rs-arithmetic.h			\
 	r6rs-ports.h				\
 	random.h				\
 	rdelim.h				\
diff --git a/libguile/init.c b/libguile/init.c
index 8b3b8cd..2c23b1e 100644
--- a/libguile/init.c
+++ b/libguile/init.c
@@ -100,6 +100,7 @@
 #include "libguile/programs.h"
 #include "libguile/promises.h"
 #include "libguile/array-map.h"
+#include "libguile/r6rs-arithmetic.h"
 #include "libguile/random.h"
 #include "libguile/rdelim.h"
 #include "libguile/read.h"
@@ -403,6 +404,7 @@ scm_i_init_guile (void *base)
   scm_bootstrap_programs ();
   scm_bootstrap_vm ();
   scm_register_r6rs_ports ();
+  scm_register_r6rs_arithmetic ();
   scm_register_foreign ();
   scm_register_srfi_1 ();
   scm_register_srfi_60 ();
diff --git a/libguile/numbers.c b/libguile/numbers.c
index 427e772..0a10030 100644
--- a/libguile/numbers.c
+++ b/libguile/numbers.c
@@ -6122,7 +6122,6 @@ SCM_DEFINE (scm_integer_p, "integer?", 1, 0, 0,
 }
 #undef FUNC_NAME
 
-
 SCM scm_i_num_eq_p (SCM, SCM, SCM);
 SCM_PRIMITIVE_GENERIC (scm_i_num_eq_p, "=", 0, 2, 1,
                        (SCM x, SCM y, SCM rest),
diff --git a/libguile/numbers.h b/libguile/numbers.h
index ab96981..fb97785 100644
--- a/libguile/numbers.h
+++ b/libguile/numbers.h
@@ -240,6 +240,7 @@ SCM_API SCM scm_complex_p (SCM x);
 SCM_API SCM scm_real_p (SCM x);
 SCM_API SCM scm_rational_p (SCM z);
 SCM_API SCM scm_integer_p (SCM x);
+SCM_API SCM scm_fixnum_p (SCM x);
 SCM_API SCM scm_inexact_p (SCM x);
 SCM_API SCM scm_num_eq_p (SCM x, SCM y);
 SCM_API SCM scm_less_p (SCM x, SCM y);
diff --git a/libguile/r6rs-arithmetic.c b/libguile/r6rs-arithmetic.c
new file mode 100644
index 0000000..b00f1f4
--- /dev/null
+++ b/libguile/r6rs-arithmetic.c
@@ -0,0 +1,48 @@
+/* Copyright (C) 2011 Free Software Foundation, Inc.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public License
+ * as published by the Free Software Foundation; either version 3 of
+ * the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library. If not, see
+ * <http://www.gnu.org/licenses/>.
+ */
+
+#ifdef HAVE_CONFIG_H
+# include <config.h>
+#endif
+
+#include "libguile/_scm.h"
+#include "libguile/numbers.h"
+#include "libguile/r6rs-arithmetic.h"
+
+SCM_DEFINE (scm_fixnum_p, "fixnum?", 1, 0, 0,
+            (SCM x),
+            "Return @code{#t} if @var{x} is a fixnum, @code{#f} otherwise.")
+#define FUNC_NAME s_scm_fixnum_p
+{
+  return scm_from_bool (SCM_I_INUMP (x));
+}
+#undef FUNC_NAME
+
+void
+scm_register_r6rs_arithmetic (void)
+{
+  scm_c_register_extension ("libguile-" SCM_EFFECTIVE_VERSION,
+                            "scm_init_r6rs_arithmetic",
+			    (scm_t_extension_init_func) scm_init_r6rs_arithmetic,
+			    NULL);
+}
+
+void
+scm_init_r6rs_arithmetic (void)
+{
+#include "libguile/r6rs-arithmetic.x"
+}
diff --git a/libguile/r6rs-arithmetic.h b/libguile/r6rs-arithmetic.h
new file mode 100644
index 0000000..833426a
--- /dev/null
+++ b/libguile/r6rs-arithmetic.h
@@ -0,0 +1,30 @@
+#ifndef SCM_R6RS_ARITHMETIC_H
+#define SCM_R6RS_ARITHMETIC_H
+
+/* Copyright (C) 2011 Free Software Foundation, Inc.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public License
+ * as published by the Free Software Foundation; either version 3 of
+ * the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library. If not, see
+ * <http://www.gnu.org/licenses/>.
+ */
+
+\f
+
+#include "libguile/__scm.h"
+
+/* R6RS Fixnum Arithmetic  */
+
+SCM_API void scm_init_r6rs_arithmetic (void);
+SCM_INTERNAL void scm_register_r6rs_arithmetic (void);
+
+#endif /* SCM_R6RS_ARITHMETIC_H */
diff --git a/libguile/vm-i-scheme.c b/libguile/vm-i-scheme.c
index 9e249bc..21255c6 100644
--- a/libguile/vm-i-scheme.c
+++ b/libguile/vm-i-scheme.c
@@ -1,4 +1,4 @@
-/* Copyright (C) 2001, 2009, 2010 Free Software Foundation, Inc.
+/* Copyright (C) 2001, 2009, 2010, 2011 Free Software Foundation, Inc.
  * 
  * This library is free software; you can redistribute it and/or
  * modify it under the terms of the GNU Lesser General Public License
@@ -111,6 +111,12 @@ VM_DEFINE_FUNCTION (139, vectorp, "vector?", 1)
   RETURN (scm_from_bool (SCM_I_IS_VECTOR (x)));
 }
 
+VM_DEFINE_FUNCTION (210, fixnump, "fixnum?", 1)
+{
+  ARGS1 (x);
+  RETURN (scm_from_bool (SCM_I_INUMP (x)));
+}
+
 \f
 /*
  * Basic data
diff --git a/module/language/tree-il/compile-glil.scm b/module/language/tree-il/compile-glil.scm
index f193e9d..b4d860f 100644
--- a/module/language/tree-il/compile-glil.scm
+++ b/module/language/tree-il/compile-glil.scm
@@ -108,6 +108,7 @@
    ((list? . 1) . list?)
    ((symbol? . 1) . symbol?)
    ((vector? . 1) . vector?)
+   ((fixnum? . 1) . fixnum?)
    (list . list)
    (vector . vector)
    ((class-of . 1) . class-of)
diff --git a/module/language/tree-il/primitives.scm b/module/language/tree-il/primitives.scm
index 316a462..24e6021 100644
--- a/module/language/tree-il/primitives.scm
+++ b/module/language/tree-il/primitives.scm
@@ -1,6 +1,6 @@
 ;;; open-coding primitive procedures
 
-;; Copyright (C) 2009, 2010 Free Software Foundation, Inc.
+;; Copyright (C) 2009, 2010, 2011 Free Software Foundation, Inc.
 
 ;;;; This library is free software; you can redistribute it and/or
 ;;;; modify it under the terms of the GNU Lesser General Public
@@ -21,6 +21,7 @@
 (define-module (language tree-il primitives)
   #:use-module (system base pmatch)
   #:use-module (rnrs bytevectors)
+  #:use-module (rnrs arithmetic fixnums)
   #:use-module (system base syntax)
   #:use-module (language tree-il)
   #:use-module (srfi srfi-4)
@@ -43,7 +44,7 @@
     + * - / 1- 1+ quotient remainder modulo
     ash logand logior logxor
     not
-    pair? null? list? symbol? vector? acons cons cons*
+    fixnum? pair? null? list? symbol? vector? acons cons cons*
 
     list vector
 
@@ -112,7 +113,7 @@
     = < > <= >= zero?
     + * - / 1- 1+ quotient remainder modulo
     not
-    pair? null? list? symbol? vector? acons cons cons*
+    pair? null? list? symbol? vector? fixnum? acons cons cons*
     list vector
     car cdr
     caar cadr cdar cddr
@@ -137,7 +138,7 @@
   '(values
     eq? eqv? equal?
     not
-    pair? null? list? symbol? vector? acons cons cons*
+    pair? null? list? symbol? vector? fixnum? acons cons cons*
     list vector
     struct?))
 
diff --git a/module/rnrs/arithmetic/fixnums.scm b/module/rnrs/arithmetic/fixnums.scm
index 03511ed..b519920 100644
--- a/module/rnrs/arithmetic/fixnums.scm
+++ b/module/rnrs/arithmetic/fixnums.scm
@@ -76,7 +76,9 @@
 	  fxreverse-bit-field)
   (import (only (guile) ash
 		        cons*
+			effective-version
 			inexact->exact
+			load-extension
 			logand
 			logbit?
 			logcount
@@ -93,18 +95,15 @@
 	  (rnrs exceptions (6))
 	  (rnrs lists (6)))
 
+  (load-extension (string-append "libguile-" (effective-version))
+                  "scm_init_r6rs_arithmetic")
+
   (define fixnum-width 
     (let ((w (inexact->exact (round (/ (log (+ most-positive-fixnum 1)) (log 2))))))
       (lambda () w)))
 
   (define (greatest-fixnum) most-positive-fixnum)
   (define (least-fixnum) most-negative-fixnum)
-  
-  (define (fixnum? obj) 
-    (and (integer? obj) 
-	 (exact? obj) 
-	 (>= obj most-negative-fixnum) 
-	 (<= obj most-positive-fixnum)))
 
   (define-syntax assert-fixnum
     (syntax-rules ()
-- 
1.7.4.1




^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: Take some lowhanging fruit to speed up R6RS fixnum operations
  2011-03-30 10:58   ` Andreas Rottmann
  2011-04-02 17:42     ` R6RS fixnum arithmetic optimizations Andreas Rottmann
@ 2011-04-04 21:28     ` Andy Wingo
  2011-04-04 22:00       ` Andreas Rottmann
  1 sibling, 1 reply; 28+ messages in thread
From: Andy Wingo @ 2011-04-04 21:28 UTC (permalink / raw)
  To: Andreas Rottmann; +Cc: guile-devel

On Wed 30 Mar 2011 12:58, Andreas Rottmann <a.rottmann@gmx.at> writes:

> "unzip" times the unzipping of a not-so-large (675KiB) ZIP file using
> Göran Weinholts ZIP implementation.

How long does the standard unix utility take, for the record?

Andy
-- 
http://wingolog.org/



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/3] Add `fixnum?' VM primitive
  2011-04-02 17:42       ` [PATCH 3/3] Add `fixnum?' VM primitive Andreas Rottmann
@ 2011-04-04 21:53         ` Andy Wingo
  2011-04-05  0:14           ` Andreas Rottmann
  0 siblings, 1 reply; 28+ messages in thread
From: Andy Wingo @ 2011-04-04 21:53 UTC (permalink / raw)
  To: Andreas Rottmann; +Cc: guile-devel

Hi Andreas,

I applied the first two, thanks.

I am hesitant about this one for three reasons:

 1) I don't want the compiler to import (rnrs arithmetic fixnums).
    Rather, if we were to do this, I would have that module register its
    primitives, as GOOPS does.

 2) Something about this sounds wrong to me.  If fixnum? is important,
    why not have it in Guile's default environment directly?  Or some
    other non-R6RS namespace.  What about, for example,
    `exact-integer?'?  And why for fixnums and not flonums?

 3) Are there no alternatives, like exposing Guile's tags to Scheme
    directly?  (ice-9 tags) for example.  Then we could inline tc3?,
    tc8?, and tc16? ops.  Some other predicates would become
    unnecessary; char? for example.  Perhaps this is a bad idea though.

Apologies for being crabby here after you've done all this work :)  It
could well be that something close to this is the right thing.

Cheers,

Andy
-- 
http://wingolog.org/



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Take some lowhanging fruit to speed up R6RS fixnum operations
  2011-04-04 21:28     ` Take some lowhanging fruit to speed up R6RS fixnum operations Andy Wingo
@ 2011-04-04 22:00       ` Andreas Rottmann
  2011-04-04 22:12         ` Andy Wingo
  0 siblings, 1 reply; 28+ messages in thread
From: Andreas Rottmann @ 2011-04-04 22:00 UTC (permalink / raw)
  To: Andy Wingo; +Cc: guile-devel

Andy Wingo <wingo@pobox.com> writes:

> On Wed 30 Mar 2011 12:58, Andreas Rottmann <a.rottmann@gmx.at> writes:
>
>> "unzip" times the unzipping of a not-so-large (675KiB) ZIP file using
>> Göran Weinholts ZIP implementation.
>
> How long does the standard unix utility take, for the record?
>
On a tmpfs, with the same ZIP file:

unzip ~/public_html/tmp/bundles/xitomatl-20110103.zip  0.07s user 0.03s system 77% cpu 0.135 total

So there's quite some room for improvement.  Other data points: Ikarus
runs the benchmark in 0.55 seconds, Racket v5.1.0.2 takes 2.39 seconds.

For Dorodango, I intend to refactor ZIP support to allow for different
backends, selectable at runtime, and implement a libzip binding.  This
will get reasonable performance on implementations that don't have a
native-code compiler as fancy as Ikarus's.

Regards, Rotty
-- 
Andreas Rottmann -- <http://rotty.yi.org/>



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Take some lowhanging fruit to speed up R6RS fixnum operations
  2011-04-04 22:00       ` Andreas Rottmann
@ 2011-04-04 22:12         ` Andy Wingo
  0 siblings, 0 replies; 28+ messages in thread
From: Andy Wingo @ 2011-04-04 22:12 UTC (permalink / raw)
  To: Andreas Rottmann; +Cc: guile-devel

On Tue 05 Apr 2011 00:00, Andreas Rottmann <a.rottmann@gmx.at> writes:

> Andy Wingo <wingo@pobox.com> writes:
>
>> How long does the standard unix utility take, for the record?
>>
> On a tmpfs, with the same ZIP file:
>
> unzip ~/public_html/tmp/bundles/xitomatl-20110103.zip  0.07s user 0.03s system 77% cpu 0.135 total

OK, so Guile (in its standard, "generic" configuration) is about 100
times slower (7.7s) than well-optimized native code.  I guess the
positive way to spin that is that it's great that we have so much room
for improvement ;-)

Andy
-- 
http://wingolog.org/



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/3] Add `fixnum?' VM primitive
  2011-04-04 21:53         ` Andy Wingo
@ 2011-04-05  0:14           ` Andreas Rottmann
  2011-04-06 12:42             ` define-inlinable Ludovic Courtès
  2011-04-07 15:57             ` [PATCH 3/3] Add `fixnum?' VM primitive Ludovic Courtès
  0 siblings, 2 replies; 28+ messages in thread
From: Andreas Rottmann @ 2011-04-05  0:14 UTC (permalink / raw)
  To: Andy Wingo; +Cc: guile-devel

[-- Attachment #1: Type: text/plain, Size: 3128 bytes --]

Andy Wingo <wingo@pobox.com> writes:

> Hi Andreas,
>
> I applied the first two, thanks.
>
> I am hesitant about this one for three reasons:
>
>  1) I don't want the compiler to import (rnrs arithmetic fixnums).
>     Rather, if we were to do this, I would have that module register its
>     primitives, as GOOPS does.
>
>  2) Something about this sounds wrong to me.  If fixnum? is important,
>
I don't hold the opinion that `fixnum?' per se is important; it's just
that to achieve better performance for R6RS' fixnum operations (while
still remaining within R6RS spec territory), it needs to be fast, due to
the abundance of calls to it.  If, for example, we'd make fixnum
operations VM primitives, we wouldn't need to care about `fixnum?'s
speed so much.

>  why not have it in Guile's default environment directly?
>
Well, this was what my original patch did [0], but Ludovic objected [1].

[0] http://lists.gnu.org/archive/html/guile-devel/2011-03/msg00223.html
[1] http://lists.gnu.org/archive/html/guile-devel/2011-03/msg00234.html

If you guys can reach an agreement that it should/could live in Guile's
default namespace, either named `fixnum?', or by some other name, I'll
happily re-do the patch accordingly.  My personal feeling is that having
an extension for `fixnum?' alone is clumsy, and I did this just due to
Ludovic's response.

>   Or some other non-R6RS namespace.
>
Suggestions?

>  What about, for example, `exact-integer?'?  And why for fixnums and
>  not flonums?
>
Well, the intention behind naming the files r6rs-arithmetic.{c,h} is
that similiar performance hacks for flonums could go in there as well,
but ATM I don't really care about flonum operations, so I didn't include
that in the patch.

>  3) Are there no alternatives, like exposing Guile's tags to Scheme
>     directly?  (ice-9 tags) for example.  Then we could inline tc3?,
>     tc8?, and tc16? ops.  Some other predicates would become
>     unnecessary; char? for example.  Perhaps this is a bad idea though.
>
I don't like this idea.  It sounds like exposing what is a _really_
low-level implementation detail to Scheme for not a very good reason.

Granted, `fixnum?' exposes also an implementation detail, but one that
is probably common to all non-toy implementations of Scheme, and if
hardware architectures do not change fundamentally, will probably remain
to be so.  This is also my response to Ludovic's response that "this
fixnum thing is a breach in the numerical tower": Yes, it is, but one
that can potentially provide performance gains by having the user
explicitly stating (commonly found, in some areas) invariants about the
numerical range of certain expressions.  A compiler can then leverage
this additional information to give better performance on such code.
IMO, it's a performance hack, but a potentially useful one.

> Apologies for being crabby here after you've done all this work :)  It
> could well be that something close to this is the right thing.
>
Well, I've attached my Plan-B solution.  We can come back to the
`fixnum?' VM primitive when it's clear where its corresponding Scheme
binding should go.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: refactor-define-inlinable.diff --]
[-- Type: text/x-diff, Size: 6037 bytes --]

From: Andreas Rottmann <a.rottmann@gmx.at>
Subject: Move `define-inlinable' into the default namespace

* module/ice-9/boot-9.scm (define-inlineable): Moved here from SRFI-9.
* module/srfi/srfi-9 (define-inlinable): Removed here.

* doc/ref/api-procedures.texi (Inlinable Procedures): Add subsection
  about `define-inlinable'.

---
 doc/ref/api-procedures.texi |   27 ++++++++++++++++++++++++++-
 module/ice-9/boot-9.scm     |   36 ++++++++++++++++++++++++++++++++++++
 module/srfi/srfi-9.scm      |   32 --------------------------------
 3 files changed, 62 insertions(+), 33 deletions(-)

diff --git a/doc/ref/api-procedures.texi b/doc/ref/api-procedures.texi
index 02889c4..4b4870d 100644
--- a/doc/ref/api-procedures.texi
+++ b/doc/ref/api-procedures.texi
@@ -1,6 +1,6 @@
 @c -*-texinfo-*-
 @c This is part of the GNU Guile Reference Manual.
-@c Copyright (C)  1996, 1997, 2000, 2001, 2002, 2003, 2004, 2009, 2010
+@c Copyright (C)  1996, 1997, 2000, 2001, 2002, 2003, 2004, 2009, 2010, 2011
 @c   Free Software Foundation, Inc.
 @c See the file guile.texi for copying conditions.
 
@@ -16,6 +16,7 @@
 * Higher-Order Functions::      Function that take or return functions.
 * Procedure Properties::        Procedure properties and meta-information.
 * Procedures with Setters::     Procedures with setters.
+* Inlinable Procedures::        Procedures that can be inlined.
 @end menu
 
 
@@ -797,6 +798,30 @@ Return the setter of @var{proc}, which must be either a procedure with
 setter or an operator struct.
 @end deffn
 
+@node Inlinable Procedures
+@subsection Inlinable Procedures
+
+You can define an ``inlinable procedure'' by using
+@code{define-inlinable} instead of @code{define}. An inlinable procedure
+behaves the same as a regular procedure, but direct calls will result in
+the procedure body being inlined into the caller.
+
+Making a procedure inlinable eliminates the overhead of the call, but at
+the same time means that the caller will not transparently use the new
+definition if the inline procedure is redefined.  Inlinable procedures
+will also not deal nicely with debugging and tracing.  Therefore, you
+should not make a procedure inlinable unless it demonstrably improves
+performance in a crucial way.
+
+In general, only small procedures should be considered for inlining, as
+making large procedures inlinable will probably result in an increase in
+code size.  Additionally, the elimination of the call overhead rarely
+matters for for large procedures.
+
+@deffn {Scheme Syntax} define-inlinable (name parameter ...) . body
+Define @var{name} as a procedure with parameters @var{parameter}s and
+body @var{body}.
+@end deffn
 
 @c Local Variables:
 @c TeX-master: "guile.texi"
diff --git a/module/ice-9/boot-9.scm b/module/ice-9/boot-9.scm
index 33aa333..327e3fa 100644
--- a/module/ice-9/boot-9.scm
+++ b/module/ice-9/boot-9.scm
@@ -3497,6 +3497,42 @@ module '(ice-9 q) '(make-q q-length))}."
                          x)))))
 
 \f
+;;; Defining transparently inlinable procedures
+;;;
+
+(define-syntax define-inlinable
+  ;; Define a macro and a procedure such that direct calls are inlined, via
+  ;; the macro expansion, whereas references in non-call contexts refer to
+  ;; the procedure.  Inspired by the `define-integrable' macro by Dybvig et al.
+  (lambda (x)
+    ;; Use a space in the prefix to avoid potential -Wunused-toplevel
+    ;; warning
+    (define prefix (string->symbol "% "))
+    (define (make-procedure-name name)
+      (datum->syntax name
+                     (symbol-append prefix (syntax->datum name)
+                                    '-procedure)))
+
+    (syntax-case x ()
+      ((_ (name formals ...) body ...)
+       (identifier? #'name)
+       (with-syntax ((proc-name  (make-procedure-name #'name))
+                     ((args ...) (generate-temporaries #'(formals ...))))
+         #`(begin
+             (define (proc-name formals ...)
+               body ...)
+             (define-syntax name
+               (lambda (x)
+                 (syntax-case x ()
+                   ((_ args ...)
+                    #'((lambda (formals ...)
+                         body ...)
+                       args ...))
+                   (_
+                    (identifier? x)
+                    #'proc-name))))))))))
+
+\f
 
 (define using-readline?
   (let ((using-readline? (make-fluid)))
diff --git a/module/srfi/srfi-9.scm b/module/srfi/srfi-9.scm
index f9449a6..ad9e95d 100644
--- a/module/srfi/srfi-9.scm
+++ b/module/srfi/srfi-9.scm
@@ -64,38 +64,6 @@
 
 (cond-expand-provide (current-module) '(srfi-9))
 
-(define-syntax define-inlinable
-  ;; Define a macro and a procedure such that direct calls are inlined, via
-  ;; the macro expansion, whereas references in non-call contexts refer to
-  ;; the procedure.  Inspired by the `define-integrable' macro by Dybvig et al.
-  (lambda (x)
-    ;; Use a space in the prefix to avoid potential -Wunused-toplevel
-    ;; warning
-    (define prefix (string->symbol "% "))
-    (define (make-procedure-name name)
-      (datum->syntax name
-                     (symbol-append prefix (syntax->datum name)
-                                    '-procedure)))
-
-    (syntax-case x ()
-      ((_ (name formals ...) body ...)
-       (identifier? #'name)
-       (with-syntax ((proc-name  (make-procedure-name #'name))
-                     ((args ...) (generate-temporaries #'(formals ...))))
-         #`(begin
-             (define (proc-name formals ...)
-               body ...)
-             (define-syntax name
-               (lambda (x)
-                 (syntax-case x ()
-                   ((_ args ...)
-                    #'((lambda (formals ...)
-                         body ...)
-                       args ...))
-                   (_
-                    (identifier? x)
-                    #'proc-name))))))))))
-
 (define-syntax define-record-type
   (lambda (x)
     (define (field-identifiers field-specs)
-- 
tg: (78d1be4..) t/refactor-define-inlinable (depends on: stable-2.0)

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: smart-fixnum-p.diff --]
[-- Type: text/x-diff, Size: 1384 bytes --]

From: Andreas Rottmann <a.rottmann@gmx.at>
Subject: Implement R6RS' `fixnum?' in a smarter way

* module/rnrs/arithmetic/fixnums.scm (fixnum?): Implemented using
  bit-twiddling, and using `define-inlinable'.

---
 module/rnrs/arithmetic/fixnums.scm |   13 ++++++-------
 1 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/module/rnrs/arithmetic/fixnums.scm b/module/rnrs/arithmetic/fixnums.scm
index 03511ed..0ce2458 100644
--- a/module/rnrs/arithmetic/fixnums.scm
+++ b/module/rnrs/arithmetic/fixnums.scm
@@ -76,6 +76,7 @@
 	  fxreverse-bit-field)
   (import (only (guile) ash
 		        cons*
+			define-inlinable
 			inexact->exact
 			logand
 			logbit?
@@ -84,7 +85,8 @@
 			lognot
 			logxor
 			most-positive-fixnum 
-			most-negative-fixnum)
+			most-negative-fixnum
+			object-address)
 	  (ice-9 optargs)
 	  (rnrs base (6))
 	  (rnrs control (6))
@@ -99,12 +101,9 @@
 
   (define (greatest-fixnum) most-positive-fixnum)
   (define (least-fixnum) most-negative-fixnum)
-  
-  (define (fixnum? obj) 
-    (and (integer? obj) 
-	 (exact? obj) 
-	 (>= obj most-negative-fixnum) 
-	 (<= obj most-positive-fixnum)))
+
+  (define-inlinable (fixnum? obj)
+    (not (= 0 (logand 2 (object-address obj)))))
 
   (define-syntax assert-fixnum
     (syntax-rules ()
-- 
tg: (dfabfe2..) t/smart-fixnum-p (depends on: t/rnrs-arithmetic-benchmark t/refactor-define-inlinable)

[-- Attachment #4: Type: text/plain, Size: 64 bytes --]



Regards, Rotty
-- 
Andreas Rottmann -- <http://rotty.yi.org/>

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* define-inlinable
  2011-04-05  0:14           ` Andreas Rottmann
@ 2011-04-06 12:42             ` Ludovic Courtès
  2011-04-06 21:30               ` define-inlinable Andreas Rottmann
  2011-04-07 15:57             ` [PATCH 3/3] Add `fixnum?' VM primitive Ludovic Courtès
  1 sibling, 1 reply; 28+ messages in thread
From: Ludovic Courtès @ 2011-04-06 12:42 UTC (permalink / raw)
  To: guile-devel

Hello!

Andreas Rottmann <a.rottmann@gmx.at> writes:

> From: Andreas Rottmann <a.rottmann@gmx.at>
> Subject: Move `define-inlinable' into the default namespace
>
> * module/ice-9/boot-9.scm (define-inlineable): Moved here from SRFI-9.
> * module/srfi/srfi-9 (define-inlinable): Removed here.
>
> * doc/ref/api-procedures.texi (Inlinable Procedures): Add subsection
>   about `define-inlinable'.

Looks great to me.

Some nitpicking:

[...]

> +@node Inlinable Procedures
> +@subsection Inlinable Procedures
> +
> +You can define an ``inlinable procedure'' by using

Use @dfn{inlinable procedure} here.

> +@code{define-inlinable} instead of @code{define}. An inlinable procedure

I prefer two-spaces-after-period, but there’s no consensus.

> +behaves the same as a regular procedure, but direct calls will result in
> +the procedure body being inlined into the caller.
> +
> +Making a procedure inlinable eliminates the overhead of the call,

How about:

  Procedures defined with @code{define-inlinable} are @emph{always}
  inlined, at all call sites.  This eliminates function call overhead at
  the expense of an increase in code size.

> but at
> +the same time means that the caller will not transparently use the new
> +definition if the inline procedure is redefined.

  ... redefined using @code{set!}.

> Inlinable procedures
> +will also not deal nicely with debugging and tracing.

Instead of “not deal nicely”, what about something like:

  It is not possible to trace an inlined procedures or install a
  breakpoint in it (@pxref{Traps}).

> Therefore, you
> +should not make a procedure inlinable unless it demonstrably improves
> +performance in a crucial way.
> +
> +In general, only small procedures should be considered for inlining, as
> +making large procedures inlinable will probably result in an increase in
> +code size.  Additionally, the elimination of the call overhead rarely
> +matters for for large procedures.
> +
> +@deffn {Scheme Syntax} define-inlinable (name parameter ...) . body

I’d write “body ...” instead of “. body”.  Besides being aesthetically
nicer, the former matches a proper list whereas the latter matches a
pair.

Thanks,
Ludo’.




^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: define-inlinable
  2011-04-06 12:42             ` define-inlinable Ludovic Courtès
@ 2011-04-06 21:30               ` Andreas Rottmann
  2011-04-06 22:24                 ` define-inlinable Ludovic Courtès
  0 siblings, 1 reply; 28+ messages in thread
From: Andreas Rottmann @ 2011-04-06 21:30 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

[-- Attachment #1: Type: text/plain, Size: 2777 bytes --]

ludo@gnu.org (Ludovic Courtès) writes:

> Hello!
>
> Andreas Rottmann <a.rottmann@gmx.at> writes:
>
>> Subject: Move `define-inlinable' into the default namespace
>>
>> +@node Inlinable Procedures
>> +@subsection Inlinable Procedures
>> +
>> +You can define an ``inlinable procedure'' by using
>
> Use @dfn{inlinable procedure} here.
>
Done.

>> +@code{define-inlinable} instead of @code{define}. An inlinable procedure
>
> I prefer two-spaces-after-period, but there’s no consensus.
>
I try to use two spaces in English text as well, but being a German
native speaker (where this is against typographical rules, AFAIK), I
slip sometimes.  Fixed.

>> +behaves the same as a regular procedure, but direct calls will result in
>> +the procedure body being inlined into the caller.
>> +
>> +Making a procedure inlinable eliminates the overhead of the call,
>
> How about:
>
>   Procedures defined with @code{define-inlinable} are @emph{always}
>   inlined, at all call sites.  This eliminates function call overhead at
>   the expense of an increase in code size.
>
Folded in, with the addition of using ".., at _direct_ call sites.".
There's no inlining happening when you use `apply', or rebind the
procedure with `let'.  Should this be made more explicit?

>> but at
>> +the same time means that the caller will not transparently use the new
>> +definition if the inline procedure is redefined.
>
>   ... redefined using @code{set!}.
>
I don't agree with that one: there are multiple ways a procedure can get
"redefined", `set!' being just one of them.  I was actually thinking
more of re-evaluating the procedure definition or something like
`geiser-compile-file', hence I left the text like it was, being more
vague.

>> Inlinable procedures
>> +will also not deal nicely with debugging and tracing.
>
> Instead of “not deal nicely”, what about something like:
>
>   It is not possible to trace an inlined procedures or install a
>   breakpoint in it (@pxref{Traps}).
>
Done.

>> Therefore, you
>> +should not make a procedure inlinable unless it demonstrably improves
>> +performance in a crucial way.
>> +
>> +In general, only small procedures should be considered for inlining, as
>> +making large procedures inlinable will probably result in an increase in
>> +code size.  Additionally, the elimination of the call overhead rarely
>> +matters for for large procedures.
>> +
>> +@deffn {Scheme Syntax} define-inlinable (name parameter ...) . body
>
> I’d write “body ...” instead of “. body”.  Besides being aesthetically
> nicer, the former matches a proper list whereas the latter matches a
> pair.
>
Agreed, and done.

Updated patch attached, is it OK to push this way?


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: refactor-define-inlinable.diff --]
[-- Type: text/x-diff, Size: 6178 bytes --]

From: Andreas Rottmann <a.rottmann@gmx.at>
Subject: Move `define-inlinable' into the default namespace

* module/ice-9/boot-9.scm (define-inlineable): Moved here from SRFI-9.
* module/srfi/srfi-9 (define-inlinable): Removed here.

* doc/ref/api-procedures.texi (Inlinable Procedures): Add subsection
  about `define-inlinable'.

---
 doc/ref/api-procedures.texi |   29 ++++++++++++++++++++++++++++-
 module/ice-9/boot-9.scm     |   36 ++++++++++++++++++++++++++++++++++++
 module/srfi/srfi-9.scm      |   32 --------------------------------
 3 files changed, 64 insertions(+), 33 deletions(-)

diff --git a/doc/ref/api-procedures.texi b/doc/ref/api-procedures.texi
index 02889c4..5c6d380 100644
--- a/doc/ref/api-procedures.texi
+++ b/doc/ref/api-procedures.texi
@@ -1,6 +1,6 @@
 @c -*-texinfo-*-
 @c This is part of the GNU Guile Reference Manual.
-@c Copyright (C)  1996, 1997, 2000, 2001, 2002, 2003, 2004, 2009, 2010
+@c Copyright (C)  1996, 1997, 2000, 2001, 2002, 2003, 2004, 2009, 2010, 2011
 @c   Free Software Foundation, Inc.
 @c See the file guile.texi for copying conditions.
 
@@ -16,6 +16,7 @@
 * Higher-Order Functions::      Function that take or return functions.
 * Procedure Properties::        Procedure properties and meta-information.
 * Procedures with Setters::     Procedures with setters.
+* Inlinable Procedures::        Procedures that can be inlined.
 @end menu
 
 
@@ -797,6 +798,32 @@ Return the setter of @var{proc}, which must be either a procedure with
 setter or an operator struct.
 @end deffn
 
+@node Inlinable Procedures
+@subsection Inlinable Procedures
+
+You can define an @dfn{inlinable procedure} by using
+@code{define-inlinable} instead of @code{define}.  An inlinable
+procedure behaves the same as a regular procedure, but direct calls will
+result in the procedure body being inlined into the caller.
+
+Procedures defined with @code{define-inlinable} are @emph{always}
+inlined, at all direct call sites.  This eliminates function call
+overhead at the expense of an increase in code size.  Additionally, the
+caller will not transparently use the new definition if the inline
+procedure is redefined.  It is not possible to trace an inlined
+procedures or install a breakpoint in it (@pxref{Traps}).  For these
+reasons, you should not make a procedure inlinable unless it
+demonstrably improves performance in a crucial way.
+
+In general, only small procedures should be considered for inlining, as
+making large procedures inlinable will probably result in an increase in
+code size.  Additionally, the elimination of the call overhead rarely
+matters for for large procedures.
+
+@deffn {Scheme Syntax} define-inlinable (name parameter ...) body ...
+Define @var{name} as a procedure with parameters @var{parameter}s and
+body @var{body}.
+@end deffn
 
 @c Local Variables:
 @c TeX-master: "guile.texi"
diff --git a/module/ice-9/boot-9.scm b/module/ice-9/boot-9.scm
index 33aa333..327e3fa 100644
--- a/module/ice-9/boot-9.scm
+++ b/module/ice-9/boot-9.scm
@@ -3497,6 +3497,42 @@ module '(ice-9 q) '(make-q q-length))}."
                          x)))))
 
 \f
+;;; Defining transparently inlinable procedures
+;;;
+
+(define-syntax define-inlinable
+  ;; Define a macro and a procedure such that direct calls are inlined, via
+  ;; the macro expansion, whereas references in non-call contexts refer to
+  ;; the procedure.  Inspired by the `define-integrable' macro by Dybvig et al.
+  (lambda (x)
+    ;; Use a space in the prefix to avoid potential -Wunused-toplevel
+    ;; warning
+    (define prefix (string->symbol "% "))
+    (define (make-procedure-name name)
+      (datum->syntax name
+                     (symbol-append prefix (syntax->datum name)
+                                    '-procedure)))
+
+    (syntax-case x ()
+      ((_ (name formals ...) body ...)
+       (identifier? #'name)
+       (with-syntax ((proc-name  (make-procedure-name #'name))
+                     ((args ...) (generate-temporaries #'(formals ...))))
+         #`(begin
+             (define (proc-name formals ...)
+               body ...)
+             (define-syntax name
+               (lambda (x)
+                 (syntax-case x ()
+                   ((_ args ...)
+                    #'((lambda (formals ...)
+                         body ...)
+                       args ...))
+                   (_
+                    (identifier? x)
+                    #'proc-name))))))))))
+
+\f
 
 (define using-readline?
   (let ((using-readline? (make-fluid)))
diff --git a/module/srfi/srfi-9.scm b/module/srfi/srfi-9.scm
index f9449a6..ad9e95d 100644
--- a/module/srfi/srfi-9.scm
+++ b/module/srfi/srfi-9.scm
@@ -64,38 +64,6 @@
 
 (cond-expand-provide (current-module) '(srfi-9))
 
-(define-syntax define-inlinable
-  ;; Define a macro and a procedure such that direct calls are inlined, via
-  ;; the macro expansion, whereas references in non-call contexts refer to
-  ;; the procedure.  Inspired by the `define-integrable' macro by Dybvig et al.
-  (lambda (x)
-    ;; Use a space in the prefix to avoid potential -Wunused-toplevel
-    ;; warning
-    (define prefix (string->symbol "% "))
-    (define (make-procedure-name name)
-      (datum->syntax name
-                     (symbol-append prefix (syntax->datum name)
-                                    '-procedure)))
-
-    (syntax-case x ()
-      ((_ (name formals ...) body ...)
-       (identifier? #'name)
-       (with-syntax ((proc-name  (make-procedure-name #'name))
-                     ((args ...) (generate-temporaries #'(formals ...))))
-         #`(begin
-             (define (proc-name formals ...)
-               body ...)
-             (define-syntax name
-               (lambda (x)
-                 (syntax-case x ()
-                   ((_ args ...)
-                    #'((lambda (formals ...)
-                         body ...)
-                       args ...))
-                   (_
-                    (identifier? x)
-                    #'proc-name))))))))))
-
 (define-syntax define-record-type
   (lambda (x)
     (define (field-identifiers field-specs)
-- 
tg: (ce60660..) t/refactor-define-inlinable (depends on: stable-2.0)

[-- Attachment #3: Type: text/plain, Size: 77 bytes --]


Thanks for the review, Rotty
-- 
Andreas Rottmann -- <http://rotty.yi.org/>

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: define-inlinable
  2011-04-06 21:30               ` define-inlinable Andreas Rottmann
@ 2011-04-06 22:24                 ` Ludovic Courtès
  2011-04-11 16:56                   ` define-inlinable Andy Wingo
  0 siblings, 1 reply; 28+ messages in thread
From: Ludovic Courtès @ 2011-04-06 22:24 UTC (permalink / raw)
  To: Andreas Rottmann; +Cc: guile-devel

Hi!

Andreas Rottmann <a.rottmann@gmx.at> writes:

[...]

>>> +behaves the same as a regular procedure, but direct calls will result in
>>> +the procedure body being inlined into the caller.
>>> +
>>> +Making a procedure inlinable eliminates the overhead of the call,
>>
>> How about:
>>
>>   Procedures defined with @code{define-inlinable} are @emph{always}
>>   inlined, at all call sites.  This eliminates function call overhead at
>>   the expense of an increase in code size.
>>
> Folded in, with the addition of using ".., at _direct_ call sites.".
> There's no inlining happening when you use `apply', or rebind the
> procedure with `let'.  Should this be made more explicit?

That’s fine IMO.

>>> but at
>>> +the same time means that the caller will not transparently use the new
>>> +definition if the inline procedure is redefined.
>>
>>   ... redefined using @code{set!}.
>>
> I don't agree with that one: there are multiple ways a procedure can get
> "redefined", `set!' being just one of them.  I was actually thinking
> more of re-evaluating the procedure definition or something like
> `geiser-compile-file', hence I left the text like it was, being more
> vague.

Right, good point.

> Updated patch attached, is it OK to push this way?

Yes, please go ahead!

Thanks,
Ludo’.



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/3] Add `fixnum?' VM primitive
  2011-04-05  0:14           ` Andreas Rottmann
  2011-04-06 12:42             ` define-inlinable Ludovic Courtès
@ 2011-04-07 15:57             ` Ludovic Courtès
  1 sibling, 0 replies; 28+ messages in thread
From: Ludovic Courtès @ 2011-04-07 15:57 UTC (permalink / raw)
  To: guile-devel

Hi!

Andreas Rottmann <a.rottmann@gmx.at> writes:

> From: Andreas Rottmann <a.rottmann@gmx.at>
> Subject: Implement R6RS' `fixnum?' in a smarter way
>
> * module/rnrs/arithmetic/fixnums.scm (fixnum?): Implemented using
>   bit-twiddling, and using `define-inlinable'.

Fine with me, feel free to commit.

Ludo’.




^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: define-inlinable
  2011-04-06 22:24                 ` define-inlinable Ludovic Courtès
@ 2011-04-11 16:56                   ` Andy Wingo
  2011-04-11 20:01                     ` define-inlinable Ludovic Courtès
  0 siblings, 1 reply; 28+ messages in thread
From: Andy Wingo @ 2011-04-11 16:56 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

On Thu 07 Apr 2011 00:24, ludo@gnu.org (Ludovic Courtès) writes:

>> Updated patch attached, is it OK to push this way?
>
> Yes, please go ahead!

FWIW:

Running r6rs-arithmetic-flonums.test
ERROR: r6rs-arithmetic-flonums.test: fixnum->flonum: simple - arguments: ((wrong-type-arg #f "Wrong type to apply: ~S" (#<syntax-transformer fixnum?>) (#<syntax-transformer fixnum?>)))

In practice this means that anyone that compiled something that uses
fixnum? against Guile 2.0.0 now has broken code.  I don't know if this
is important, as R6RS users probably have lots of other carnage to deal
with, but it is strictly an ABI break.

Andy
-- 
http://wingolog.org/



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: define-inlinable
  2011-04-11 16:56                   ` define-inlinable Andy Wingo
@ 2011-04-11 20:01                     ` Ludovic Courtès
  2011-04-11 21:05                       ` define-inlinable Andy Wingo
  0 siblings, 1 reply; 28+ messages in thread
From: Ludovic Courtès @ 2011-04-11 20:01 UTC (permalink / raw)
  To: Andy Wingo; +Cc: guile-devel

Hello!

Andy Wingo <wingo@pobox.com> writes:

> Running r6rs-arithmetic-flonums.test
> ERROR: r6rs-arithmetic-flonums.test: fixnum->flonum: simple - arguments: ((wrong-type-arg #f "Wrong type to apply: ~S" (#<syntax-transformer fixnum?>) (#<syntax-transformer fixnum?>)))
>
> In practice this means that anyone that compiled something that uses
> fixnum? against Guile 2.0.0 now has broken code.

Ouch, right.

> I don't know if this is important, as R6RS users probably have lots of
> other carnage to deal with, but it is strictly an ABI break.

Well well, they’ll need to recompile.  My feeling is that it’s
acceptable, but I don’t have a strong opinion.

Thanks,
Ludo’.



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: define-inlinable
  2011-04-11 20:01                     ` define-inlinable Ludovic Courtès
@ 2011-04-11 21:05                       ` Andy Wingo
  2011-04-11 22:11                         ` define-inlinable Andreas Rottmann
  0 siblings, 1 reply; 28+ messages in thread
From: Andy Wingo @ 2011-04-11 21:05 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

Heya :)

On Mon 11 Apr 2011 22:01, ludo@gnu.org (Ludovic Courtès) writes:

>> I don't know if this is important, as R6RS users probably have lots of
>> other carnage to deal with, but it is strictly an ABI break.
>
> Well well, they’ll need to recompile.  My feeling is that it’s
> acceptable, but I don’t have a strong opinion.

In this case it probably is, but we don't really know what users there
are.  But let's please keep this general issue in mind in the future.

Cheers,

Andy
-- 
http://wingolog.org/



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: define-inlinable
  2011-04-11 21:05                       ` define-inlinable Andy Wingo
@ 2011-04-11 22:11                         ` Andreas Rottmann
  0 siblings, 0 replies; 28+ messages in thread
From: Andreas Rottmann @ 2011-04-11 22:11 UTC (permalink / raw)
  To: Andy Wingo; +Cc: Ludovic Courtès, guile-devel

Andy Wingo <wingo@pobox.com> writes:

> Heya :)
>
> On Mon 11 Apr 2011 22:01, ludo@gnu.org (Ludovic Courtès) writes:
>
>>> I don't know if this is important, as R6RS users probably have lots of
>>> other carnage to deal with, but it is strictly an ABI break.
>>
>> Well well, they’ll need to recompile.  My feeling is that it’s
>> acceptable, but I don’t have a strong opinion.
>
> In this case it probably is, but we don't really know what users there
> are.  But let's please keep this general issue in mind in the future.
>
Agreed.  I'm reluctant to using macros as a performance hack (e.g.,
define-inlinable) -- it just feels wrong, but sometimes performance
numbers are seductive...

Regards, Rotty
-- 
Andreas Rottmann -- <http://rotty.yi.org/>



^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2011-04-11 22:11 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-03-22 23:20 Take some lowhanging fruit to speed up R6RS fixnum operations Andreas Rottmann
2011-03-22 23:20 ` [PATCH] " Andreas Rottmann
2011-03-24 21:51   ` Ludovic Courtès
2011-03-24 23:42     ` Andreas Rottmann
2011-03-25 12:16       ` Andreas Rottmann
2011-03-27 15:19         ` Ludovic Courtès
2011-03-27 22:20           ` Andreas Rottmann
2011-03-29 11:05 ` Andy Wingo
2011-03-30  1:37   ` Andreas Rottmann
2011-03-30 10:31   ` Andreas Rottmann
2011-03-30 10:58   ` Andreas Rottmann
2011-04-02 17:42     ` R6RS fixnum arithmetic optimizations Andreas Rottmann
2011-04-02 17:42       ` [PATCH 1/3] Add a few benchmarks for R6RS fixnum arithmetic Andreas Rottmann
2011-04-02 17:42       ` [PATCH 2/3] Several optimizations " Andreas Rottmann
2011-04-02 17:42       ` [PATCH 3/3] Add `fixnum?' VM primitive Andreas Rottmann
2011-04-04 21:53         ` Andy Wingo
2011-04-05  0:14           ` Andreas Rottmann
2011-04-06 12:42             ` define-inlinable Ludovic Courtès
2011-04-06 21:30               ` define-inlinable Andreas Rottmann
2011-04-06 22:24                 ` define-inlinable Ludovic Courtès
2011-04-11 16:56                   ` define-inlinable Andy Wingo
2011-04-11 20:01                     ` define-inlinable Ludovic Courtès
2011-04-11 21:05                       ` define-inlinable Andy Wingo
2011-04-11 22:11                         ` define-inlinable Andreas Rottmann
2011-04-07 15:57             ` [PATCH 3/3] Add `fixnum?' VM primitive Ludovic Courtès
2011-04-04 21:28     ` Take some lowhanging fruit to speed up R6RS fixnum operations Andy Wingo
2011-04-04 22:00       ` Andreas Rottmann
2011-04-04 22:12         ` Andy Wingo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).