unofficial mirror of bug-guile@gnu.org 
 help / color / mirror / Atom feed
* High run time variance
@ 2010-03-29 16:09 Luca Saiu
  2010-03-30  9:16 ` Ludovic Courtès
  2010-08-04 19:51 ` Andy Wingo
  0 siblings, 2 replies; 10+ messages in thread
From: Luca Saiu @ 2010-03-29 16:09 UTC (permalink / raw)
  To: bug-guile

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello guilers.

I've come across a curious issue with Guile; I had already casually
spotted it before, but today I've run a somewhat more comprehensive
test, and I think the results are interesting. As usual, please forgive
me for the noise if you're already aware of what I say.

Let's say we have to perform a CPU-intensive computation several times;
in this distilled test case I will use the traditional recursive,
exponential-complexity definition of the Fibonacci sequence. Let's apply
the function to the same argument three time in a row in the same script
run, measuring the completion time of each computation.

Here comes the content of my script q.scm:
================================================================
(define-macro (benchmark . body)
  ;; Yes, I know that this is ugly and non-hygienic.  That's
  ;; not the problem.
  `(let* ((t0 (array-ref (times) 0))
          (result (begin ,@body))
          (t1 (array-ref (times) 0)))
     (format #t
             "~a returned ~a in ~a seconds.\n"
             '(begin ,@body)
             result
             (/ (- t1 t0) 100.0))
     result))

(define-macro (thrice . body)
  `(begin
     ,@body
     ,@body
     ,@body))

(define (fibo n)
  (if (< n 2)
      n
      (+ (fibo (- n 1))
         (fibo (- n 2)))))

(define n 35)
(thrice (benchmark (fibo n)))
================================================================

Here's what I get when running the script ten times on 1.9.9 with my
office machine (a fast x86_64 machine with eight 3GHz cores, very
lightly loaded):

[luca@optimum ~/tmp]$ rm -rf ~/.cache/guile/; for i in `seq 10`; do
guile --autocompile q.scm; echo; done
;;; note: autocompilation is enabled, set GUILE_AUTO_COMPILE=0
;;;       or pass the --no-autocompile argument to disable.
;;; compiling q.scm
;;; compiled
/home/luca/.cache/guile/ccache/2.0-0.P-LE-8/home/luca/tmp/q.scm.go
(begin (fibo n)) returned 9227465 in 2.75 seconds.
(begin (fibo n)) returned 9227465 in 2.74 seconds.
(begin (fibo n)) returned 9227465 in 2.75 seconds.

(begin (fibo n)) returned 9227465 in 2.63 seconds.
(begin (fibo n)) returned 9227465 in 2.62 seconds.
(begin (fibo n)) returned 9227465 in 2.63 seconds.

(begin (fibo n)) returned 9227465 in 3.07 seconds.
(begin (fibo n)) returned 9227465 in 3.07 seconds.
(begin (fibo n)) returned 9227465 in 3.08 seconds.

(begin (fibo n)) returned 9227465 in 2.87 seconds.
(begin (fibo n)) returned 9227465 in 2.87 seconds.
(begin (fibo n)) returned 9227465 in 2.87 seconds.

(begin (fibo n)) returned 9227465 in 2.93 seconds.
(begin (fibo n)) returned 9227465 in 2.93 seconds.
(begin (fibo n)) returned 9227465 in 2.93 seconds.

(begin (fibo n)) returned 9227465 in 2.6 seconds.
(begin (fibo n)) returned 9227465 in 2.6 seconds.
(begin (fibo n)) returned 9227465 in 2.59 seconds.

(begin (fibo n)) returned 9227465 in 2.88 seconds.
(begin (fibo n)) returned 9227465 in 2.88 seconds.
(begin (fibo n)) returned 9227465 in 2.88 seconds.

(begin (fibo n)) returned 9227465 in 2.67 seconds.
(begin (fibo n)) returned 9227465 in 2.67 seconds.
(begin (fibo n)) returned 9227465 in 2.67 seconds.

(begin (fibo n)) returned 9227465 in 2.76 seconds.
(begin (fibo n)) returned 9227465 in 2.75 seconds.
(begin (fibo n)) returned 9227465 in 2.75 seconds.

(begin (fibo n)) returned 9227465 in 3.23 seconds.
(begin (fibo n)) returned 9227465 in 3.23 seconds.
(begin (fibo n)) returned 9227465 in 3.23 seconds.

Notice the wide variance among runs (from 2.6 to 3.23 seconds); however
within each single run the three computations always take roughly the
same time -- This will remain consistent in all the following
experiments, and it is the "anomaly" I am concerned about in this
message. If I had to bet about the cause, I'd say it's an alignment
issue (determined at startup time); if the alignment were always optimal
we would get the best time in all runs.
  Notice that there is no consing (at least with the VM), so false
pointers are not an issue.

Is the effect also observable with the interpreter? Just to be sure:

[luca@optimum ~/tmp]$ rm -rf ~/.cache/guile/; for i in `seq 10`; do
guile --no-autocompile q.scm; echo; done
(begin (fibo n)) returned 9227465 in 169.99 seconds.
(begin (fibo n)) returned 9227465 in 170.1 seconds.
(begin (fibo n)) returned 9227465 in 169.09 seconds.

(begin (fibo n)) returned 9227465 in 168.65 seconds.
(begin (fibo n)) returned 9227465 in 168.53 seconds.
(begin (fibo n)) returned 9227465 in 168.78 seconds.

(begin (fibo n)) returned 9227465 in 172.49 seconds.
(begin (fibo n)) returned 9227465 in 172.63 seconds.
(begin (fibo n)) returned 9227465 in 172.71 seconds.

(begin (fibo n)) returned 9227465 in 179.91 seconds.
(begin (fibo n)) returned 9227465 in 179.89 seconds.
(begin (fibo n)) returned 9227465 in 181.18 seconds.

(begin (fibo n)) returned 9227465 in 166.11 seconds.
(begin (fibo n)) returned 9227465 in 166.06 seconds.
(begin (fibo n)) returned 9227465 in 166.16 seconds.

(begin (fibo n)) returned 9227465 in 166.09 seconds.
(begin (fibo n)) returned 9227465 in 166.12 seconds.
(begin (fibo n)) returned 9227465 in 166.27 seconds.

(begin (fibo n)) returned 9227465 in 165.64 seconds.
(begin (fibo n)) returned 9227465 in 165.48 seconds.
(begin (fibo n)) returned 9227465 in 166.27 seconds.

(begin (fibo n)) returned 9227465 in 172.14 seconds.
(begin (fibo n)) returned 9227465 in 172.94 seconds.
(begin (fibo n)) returned 9227465 in 173.33 seconds.

(begin (fibo n)) returned 9227465 in 164.26 seconds.
(begin (fibo n)) returned 9227465 in 164.28 seconds.
(begin (fibo n)) returned 9227465 in 162.94 seconds.

(begin (fibo n)) returned 9227465 in 175.57 seconds.
(begin (fibo n)) returned 9227465 in 175.45 seconds.
(begin (fibo n)) returned 9227465 in 175.22 seconds.

- From 164 to 180 seconds. Less pronounced than before, but the effect is
still visible.

Now let's define n as 42:

[luca@optimum ~/tmp]$ rm -rf ~/.cache/guile/; for i in `seq 10`; do
guile --autocompile q2.scm; echo; done
;;; note: autocompilation is enabled, set GUILE_AUTO_COMPILE=0
;;;       or pass the --no-autocompile argument to disable.
;;; compiling q2.scm
;;; compiled
/home/luca/.cache/guile/ccache/2.0-0.P-LE-8/home/luca/tmp/q2.scm.go
(begin (fibo n)) returned 267914296 in 84.44 seconds.
(begin (fibo n)) returned 267914296 in 84.45 seconds.
(begin (fibo n)) returned 267914296 in 84.44 seconds.

(begin (fibo n)) returned 267914296 in 78.83 seconds.
(begin (fibo n)) returned 267914296 in 78.71 seconds.
(begin (fibo n)) returned 267914296 in 78.73 seconds.

(begin (fibo n)) returned 267914296 in 93.8 seconds.
(begin (fibo n)) returned 267914296 in 93.81 seconds.
(begin (fibo n)) returned 267914296 in 93.8 seconds.

(begin (fibo n)) returned 267914296 in 79.27 seconds.
(begin (fibo n)) returned 267914296 in 79.27 seconds.
(begin (fibo n)) returned 267914296 in 79.27 seconds.

(begin (fibo n)) returned 267914296 in 91.31 seconds.
(begin (fibo n)) returned 267914296 in 91.31 seconds.
(begin (fibo n)) returned 267914296 in 91.3 seconds.

(begin (fibo n)) returned 267914296 in 87.23 seconds.
(begin (fibo n)) returned 267914296 in 87.13 seconds.
(begin (fibo n)) returned 267914296 in 87.12 seconds.

(begin (fibo n)) returned 267914296 in 79.6 seconds.
(begin (fibo n)) returned 267914296 in 79.6 seconds.
(begin (fibo n)) returned 267914296 in 79.59 seconds.

(begin (fibo n)) returned 267914296 in 84.9 seconds.
(begin (fibo n)) returned 267914296 in 84.88 seconds.
(begin (fibo n)) returned 267914296 in 84.88 seconds.

(begin (fibo n)) returned 267914296 in 98.91 seconds.
(begin (fibo n)) returned 267914296 in 98.9 seconds.
(begin (fibo n)) returned 267914296 in 98.9 seconds.

(begin (fibo n)) returned 267914296 in 73.83 seconds.
(begin (fibo n)) returned 267914296 in 73.82 seconds.
(begin (fibo n)) returned 267914296 in 73.83 seconds.

The variance is quite high: from 73.8 to 98.9.

Let's define n as 43:

[luca@optimum ~/tmp]$ rm -rf ~/.cache/guile/; for i in `seq 10`; do
guile --autocompile q3.scm; echo; done
;;; note: autocompilation is enabled, set GUILE_AUTO_COMPILE=0
;;;       or pass the --no-autocompile argument to disable.
;;; compiling q3.scm
;;; compiled
/home/luca/.cache/guile/ccache/2.0-0.P-LE-8/home/luca/tmp/q3.scm.go
(begin (fibo n)) returned 433494437 in 136.97 seconds.
(begin (fibo n)) returned 433494437 in 136.97 seconds.
(begin (fibo n)) returned 433494437 in 136.96 seconds.

(begin (fibo n)) returned 433494437 in 119.29 seconds.
(begin (fibo n)) returned 433494437 in 119.32 seconds.
(begin (fibo n)) returned 433494437 in 119.38 seconds.

(begin (fibo n)) returned 433494437 in 135.42 seconds.
(begin (fibo n)) returned 433494437 in 135.47 seconds.
(begin (fibo n)) returned 433494437 in 135.45 seconds.

(begin (fibo n)) returned 433494437 in 150.63 seconds.
(begin (fibo n)) returned 433494437 in 149.82 seconds.
(begin (fibo n)) returned 433494437 in 149.68 seconds.

(begin (fibo n)) returned 433494437 in 121.13 seconds.
(begin (fibo n)) returned 433494437 in 121.19 seconds.
(begin (fibo n)) returned 433494437 in 121.15 seconds.

(begin (fibo n)) returned 433494437 in 119.39 seconds.
(begin (fibo n)) returned 433494437 in 119.36 seconds.
(begin (fibo n)) returned 433494437 in 119.39 seconds.

(begin (fibo n)) returned 433494437 in 137.46 seconds.
(begin (fibo n)) returned 433494437 in 137.44 seconds.
(begin (fibo n)) returned 433494437 in 137.41 seconds.

(begin (fibo n)) returned 433494437 in 123.83 seconds.
(begin (fibo n)) returned 433494437 in 123.85 seconds.
(begin (fibo n)) returned 433494437 in 123.78 seconds.

(begin (fibo n)) returned 433494437 in 136.3 seconds.
(begin (fibo n)) returned 433494437 in 136.26 seconds.
(begin (fibo n)) returned 433494437 in 136.26 seconds.

(begin (fibo n)) returned 433494437 in 118.75 seconds.
(begin (fibo n)) returned 433494437 in 118.73 seconds.
(begin (fibo n)) returned 433494437 in 118.75 seconds.

- From 118.7 to 150 seconds.

Now, did this happen also in 1.8? I had 1.8.7 already installed on
another machine (it's nearly identical to mine, it just has 2GHz instead
of 3GHz cores; this should not be relevant), so I used that:

With n = 35:
[luca@factotum /tmp]$ for i in `seq 10`; do guile q.scm; echo; done
(begin (fibo n)) returned 9227465 in 9.69 seconds.
(begin (fibo n)) returned 9227465 in 9.69 seconds.
(begin (fibo n)) returned 9227465 in 9.7 seconds.

(begin (fibo n)) returned 9227465 in 9.81 seconds.
(begin (fibo n)) returned 9227465 in 9.8 seconds.
(begin (fibo n)) returned 9227465 in 9.88 seconds.

(begin (fibo n)) returned 9227465 in 9.56 seconds.
(begin (fibo n)) returned 9227465 in 9.55 seconds.
(begin (fibo n)) returned 9227465 in 9.55 seconds.

(begin (fibo n)) returned 9227465 in 9.61 seconds.
(begin (fibo n)) returned 9227465 in 9.59 seconds.
(begin (fibo n)) returned 9227465 in 9.58 seconds.

(begin (fibo n)) returned 9227465 in 9.69 seconds.
(begin (fibo n)) returned 9227465 in 9.76 seconds.
(begin (fibo n)) returned 9227465 in 9.69 seconds.

(begin (fibo n)) returned 9227465 in 9.78 seconds.
(begin (fibo n)) returned 9227465 in 9.77 seconds.
(begin (fibo n)) returned 9227465 in 9.82 seconds.

(begin (fibo n)) returned 9227465 in 9.7 seconds.
(begin (fibo n)) returned 9227465 in 9.67 seconds.
(begin (fibo n)) returned 9227465 in 9.68 seconds.

(begin (fibo n)) returned 9227465 in 9.69 seconds.
(begin (fibo n)) returned 9227465 in 9.69 seconds.
(begin (fibo n)) returned 9227465 in 9.71 seconds.

(begin (fibo n)) returned 9227465 in 9.69 seconds.
(begin (fibo n)) returned 9227465 in 9.68 seconds.
(begin (fibo n)) returned 9227465 in 9.66 seconds.

(begin (fibo n)) returned 9227465 in 9.76 seconds.
(begin (fibo n)) returned 9227465 in 9.74 seconds.
(begin (fibo n)) returned 9227465 in 9.71 seconds.

I'd say that the "anomaly" was there also in 1.8, even if it was much
less pronounced.

Let's also try with higher values of n:

n=38:
[luca@factotum /tmp]$ for i in `seq 10`; do guile q2.scm; echo; done
(begin (fibo n)) returned 39088169 in 40.76 seconds.
(begin (fibo n)) returned 39088169 in 40.85 seconds.
(begin (fibo n)) returned 39088169 in 40.76 seconds.

(begin (fibo n)) returned 39088169 in 40.78 seconds.
(begin (fibo n)) returned 39088169 in 40.91 seconds.
(begin (fibo n)) returned 39088169 in 40.89 seconds.

(begin (fibo n)) returned 39088169 in 41.08 seconds.
(begin (fibo n)) returned 39088169 in 41.49 seconds.
(begin (fibo n)) returned 39088169 in 41.33 seconds.

(begin (fibo n)) returned 39088169 in 41.52 seconds.
(begin (fibo n)) returned 39088169 in 41.59 seconds.
(begin (fibo n)) returned 39088169 in 41.41 seconds.

(begin (fibo n)) returned 39088169 in 41.17 seconds.
(begin (fibo n)) returned 39088169 in 41.3 seconds.
(begin (fibo n)) returned 39088169 in 41.33 seconds.

(begin (fibo n)) returned 39088169 in 40.98 seconds.
(begin (fibo n)) returned 39088169 in 41.06 seconds.
(begin (fibo n)) returned 39088169 in 41.09 seconds.

(begin (fibo n)) returned 39088169 in 41.2 seconds.
(begin (fibo n)) returned 39088169 in 41.22 seconds.
(begin (fibo n)) returned 39088169 in 41.32 seconds.

(begin (fibo n)) returned 39088169 in 40.88 seconds.
(begin (fibo n)) returned 39088169 in 41.04 seconds.
(begin (fibo n)) returned 39088169 in 41.02 seconds.

(begin (fibo n)) returned 39088169 in 40.7 seconds.
(begin (fibo n)) returned 39088169 in 40.83 seconds.
(begin (fibo n)) returned 39088169 in 40.8 seconds.

(begin (fibo n)) returned 39088169 in 41.07 seconds.
(begin (fibo n)) returned 39088169 in 41.15 seconds.
(begin (fibo n)) returned 39088169 in 40.86 seconds.

And finally n = 40:

[luca@factotum /tmp]$ for i in `seq 10`; do guile q3.scm; echo; done
(begin (fibo n)) returned 102334155 in 107.13 seconds.
(begin (fibo n)) returned 102334155 in 107.09 seconds.
(begin (fibo n)) returned 102334155 in 107.19 seconds.

(begin (fibo n)) returned 102334155 in 107.25 seconds.
(begin (fibo n)) returned 102334155 in 107.24 seconds.
(begin (fibo n)) returned 102334155 in 107.69 seconds.

(begin (fibo n)) returned 102334155 in 107.49 seconds.
(begin (fibo n)) returned 102334155 in 107.49 seconds.
(begin (fibo n)) returned 102334155 in 107.7 seconds.

(begin (fibo n)) returned 102334155 in 106.75 seconds.
(begin (fibo n)) returned 102334155 in 106.51 seconds.
(begin (fibo n)) returned 102334155 in 106.53 seconds.

(begin (fibo n)) returned 102334155 in 108.13 seconds.
(begin (fibo n)) returned 102334155 in 108.07 seconds.
(begin (fibo n)) returned 102334155 in 107.66 seconds.

(begin (fibo n)) returned 102334155 in 106.74 seconds.
(begin (fibo n)) returned 102334155 in 107.08 seconds.
(begin (fibo n)) returned 102334155 in 106.78 seconds.

(begin (fibo n)) returned 102334155 in 107.91 seconds.
(begin (fibo n)) returned 102334155 in 107.93 seconds.
(begin (fibo n)) returned 102334155 in 108.25 seconds.

(begin (fibo n)) returned 102334155 in 106.3 seconds.
(begin (fibo n)) returned 102334155 in 106.43 seconds.
(begin (fibo n)) returned 102334155 in 106.57 seconds.

(begin (fibo n)) returned 102334155 in 106.51 seconds.
(begin (fibo n)) returned 102334155 in 106.31 seconds.
(begin (fibo n)) returned 102334155 in 106.64 seconds.

(begin (fibo n)) returned 102334155 in 106.49 seconds.
(begin (fibo n)) returned 102334155 in 106.38 seconds.
(begin (fibo n)) returned 102334155 in 106.41 seconds.

To sum up, within each run the computation time of (fibo n) is the same,
but the time varies widely from one run to another. This anomaly seems
to have become much more accentuated in 1.9.

Is this really an alignment issue as I suspect? I don't know because I
have not tracked the problem any deeper than running these tests, but
maybe it's easy for you to see. If so, maybe it would be easy to get
some more performance by forcing the alignment to the optimum value.

Best regards,

- --
Luca Saiu
http://www-lipn.univ-paris13.fr/~saiu
GNU epsilon: http://www.gnu.org/software/epsilon
Marionnet:   http://www.marionnet.org
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkuw0KcACgkQvzOavibF0oaEWwCgksg6bNkI2BOxWnd9bEgjk99O
9EMAoJi+yadDJHEuGGs8woAmwzB6Sj9A
=018I
-----END PGP SIGNATURE-----




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: High run time variance
  2010-03-29 16:09 High run time variance Luca Saiu
@ 2010-03-30  9:16 ` Ludovic Courtès
  2010-03-30 12:35   ` Luca Saiu
  2010-08-04 19:51 ` Andy Wingo
  1 sibling, 1 reply; 10+ messages in thread
From: Ludovic Courtès @ 2010-03-30  9:16 UTC (permalink / raw)
  To: Luca Saiu; +Cc: bug-guile

Hi Luca,

Luca Saiu <positron@gnu.org> writes:

> (define (fibo n)
>   (if (< n 2)
>       n
>       (+ (fibo (- n 1))
>          (fibo (- n 2)))))

This function is not tail-recursive, so it consumes stack space, which
increases the amount of memory the GC has to scan.  My guess is that
this has to do with the time spent in GC.

Could you try with a tail-recursive version?

Thanks,
Ludo’.




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: High run time variance
  2010-03-30  9:16 ` Ludovic Courtès
@ 2010-03-30 12:35   ` Luca Saiu
  2010-03-30 13:01     ` Ludovic Courtès
  0 siblings, 1 reply; 10+ messages in thread
From: Luca Saiu @ 2010-03-30 12:35 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: bug-guile

[-- Attachment #1: Type: text/plain, Size: 9708 bytes --]

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello Ludovic.

Ludovic Courtès wrote:
> Hi Luca,
> 
> Luca Saiu <positron@gnu.org> writes:
> 
>> (define (fibo n)
>>   (if (< n 2)
>>       n
>>       (+ (fibo (- n 1))
>>          (fibo (- n 2)))))
> 
> This function is not tail-recursive, so it consumes stack space, which
> increases the amount of memory the GC has to scan. My guess is that
> this has to do with the time spent in GC.

Well, no.  That's easy to prove with Boehm's GC, by setting
GC_PRINT_STATS=1. The log is quite big, so I'm attaching it separately
as fibo-with-GC_PRINT_STATS.  Anyway the result is that there is no
collection while computing fibo (I'm speaking about 1.9 using the VM),
which is to be expected: Bohem's GC can only trigger a collection at
allocation time.
  q-fibo.scm is the file which I had called q.scm in my first report,
with n = 35.

We can also completely disable the GC by setting GC_DONT_GC, and the
effect is still there:
[luca@optimum ~/tmp]$ export GC_DONT_GC=1; unset GC_PRINT_STATS; rm -rf
~/.cache/guile/; for i in `seq 10`; do time guile --autocompile
q-fibo.scm; echo; done
;;; note: autocompilation is enabled, set GUILE_AUTO_COMPILE=0
;;;       or pass the --no-autocompile argument to disable.
;;; compiling q-fibo.scm
;;; compiled
/home/luca/.cache/guile/ccache/2.0-0.P-LE-8/home/luca/tmp/q-fibo.scm.go
(begin (fibo n)) returned 9227465 in 2.79 seconds.
(begin (fibo n)) returned 9227465 in 2.79 seconds.
(begin (fibo n)) returned 9227465 in 2.78 seconds.

real	0m8.429s
user	0m8.420s
sys	0m0.020s

(begin (fibo n)) returned 9227465 in 2.59 seconds.
(begin (fibo n)) returned 9227465 in 2.59 seconds.
(begin (fibo n)) returned 9227465 in 2.6 seconds.

real	0m7.792s
user	0m7.800s
sys	0m0.010s

(begin (fibo n)) returned 9227465 in 2.95 seconds.
(begin (fibo n)) returned 9227465 in 2.93 seconds.
(begin (fibo n)) returned 9227465 in 2.94 seconds.

real	0m8.829s
user	0m8.820s
sys	0m0.010s

(begin (fibo n)) returned 9227465 in 2.64 seconds.
(begin (fibo n)) returned 9227465 in 2.64 seconds.
(begin (fibo n)) returned 9227465 in 2.65 seconds.

real	0m7.949s
user	0m7.940s
sys	0m0.000s

(begin (fibo n)) returned 9227465 in 2.73 seconds.
(begin (fibo n)) returned 9227465 in 2.74 seconds.
(begin (fibo n)) returned 9227465 in 2.73 seconds.

real	0m8.216s
user	0m8.190s
sys	0m0.000s

(begin (fibo n)) returned 9227465 in 2.6 seconds.
(begin (fibo n)) returned 9227465 in 2.61 seconds.
(begin (fibo n)) returned 9227465 in 2.61 seconds.

real	0m7.835s
user	0m7.820s
sys	0m0.020s

(begin (fibo n)) returned 9227465 in 2.97 seconds.
(begin (fibo n)) returned 9227465 in 2.97 seconds.
(begin (fibo n)) returned 9227465 in 2.98 seconds.

real	0m8.933s
user	0m8.910s
sys	0m0.000s

(begin (fibo n)) returned 9227465 in 2.61 seconds.
(begin (fibo n)) returned 9227465 in 2.6 seconds.
(begin (fibo n)) returned 9227465 in 2.61 seconds.

real	0m7.826s
user	0m7.810s
sys	0m0.010s

(begin (fibo n)) returned 9227465 in 2.92 seconds.
(begin (fibo n)) returned 9227465 in 2.93 seconds.
(begin (fibo n)) returned 9227465 in 2.92 seconds.

real	0m8.789s
user	0m8.780s
sys	0m0.000s

(begin (fibo n)) returned 9227465 in 2.55 seconds.
(begin (fibo n)) returned 9227465 in 2.55 seconds.
(begin (fibo n)) returned 9227465 in 2.55 seconds.

real	0m7.671s
user	0m7.670s
sys	0m0.000s

> Could you try with a tail-recursive version?

Well, of course I've used a different function :-). Here's the source
q-f.scm:
==================================================================
(define-macro (benchmark . body)
  ;; Yes, I know that this is ugly and non-hygienic.  That's
  ;; not the problem.
  `(let* ((t0 (array-ref (times) 0))
          (result (begin ,@body))
          (t1 (array-ref (times) 0)))
     (format #t
             "~a returned ~a in ~a seconds.\n"
             '(begin ,@body)
             result
             (/ (- t1 t0) 100.0))
     result))

(define-macro (thrice . body)
  `(begin
     ,@body
     ,@body
     ,@body))

(define (f n)
  (if (zero? n)
      0
      (f (- n 1))))

(define n 100000000)
(thrice (benchmark (f n)))
==================================================================

So, the result:
[luca@optimum ~/tmp]$ unset GC_DONT_GC; unset GC_PRINT_STATS; rm -rf
~/.cache/guile/; for i in `seq 10`; do time guile --autocompile q-f.scm;
echo; done
;;; note: autocompilation is enabled, set GUILE_AUTO_COMPILE=0
;;;       or pass the --no-autocompile argument to disable.
;;; compiling q-f.scm
;;; compiled
/home/luca/.cache/guile/ccache/2.0-0.P-LE-8/home/luca/tmp/q-f.scm.go
(begin (f n)) returned 0 in 5.19 seconds.
(begin (f n)) returned 0 in 5.19 seconds.
(begin (f n)) returned 0 in 5.18 seconds.

real	0m15.641s
user	0m15.640s
sys	0m0.010s

(begin (f n)) returned 0 in 5.21 seconds.
(begin (f n)) returned 0 in 5.2 seconds.
(begin (f n)) returned 0 in 5.2 seconds.

real	0m15.631s
user	0m15.620s
sys	0m0.000s

(begin (f n)) returned 0 in 5.18 seconds.
(begin (f n)) returned 0 in 5.18 seconds.
(begin (f n)) returned 0 in 5.18 seconds.

real	0m15.554s
user	0m15.540s
sys	0m0.020s

(begin (f n)) returned 0 in 5.2 seconds.
(begin (f n)) returned 0 in 5.21 seconds.
(begin (f n)) returned 0 in 5.2 seconds.

real	0m15.621s
user	0m15.630s
sys	0m0.000s

(begin (f n)) returned 0 in 5.05 seconds.
(begin (f n)) returned 0 in 5.05 seconds.
(begin (f n)) returned 0 in 5.05 seconds.

real	0m15.173s
user	0m15.180s
sys	0m0.010s

(begin (f n)) returned 0 in 5.18 seconds.
(begin (f n)) returned 0 in 5.17 seconds.
(begin (f n)) returned 0 in 5.18 seconds.

real	0m15.550s
user	0m15.550s
sys	0m0.010s

(begin (f n)) returned 0 in 5.24 seconds.
(begin (f n)) returned 0 in 5.24 seconds.
(begin (f n)) returned 0 in 5.24 seconds.

real	0m15.744s
user	0m15.750s
sys	0m0.000s

(begin (f n)) returned 0 in 5.25 seconds.
(begin (f n)) returned 0 in 5.25 seconds.
(begin (f n)) returned 0 in 5.24 seconds.

real	0m15.773s
user	0m15.770s
sys	0m0.010s

(begin (f n)) returned 0 in 5.2 seconds.
(begin (f n)) returned 0 in 5.2 seconds.
(begin (f n)) returned 0 in 5.2 seconds.

real	0m15.626s
user	0m15.630s
sys	0m0.010s

(begin (f n)) returned 0 in 5.26 seconds.
(begin (f n)) returned 0 in 5.26 seconds.
(begin (f n)) returned 0 in 5.26 seconds.

real	0m15.793s
user	0m15.800s
sys	0m0.010s

And with a bigger n, n = 1000000000:

[luca@optimum ~/tmp]$ unset GC_DONT_GC; unset GC_PRINT_STATS; rm -rf
~/.cache/guile/; for i in `seq 10`; do time guile --autocompile
q-f-big-n.scm; echo; done
;;; note: autocompilation is enabled, set GUILE_AUTO_COMPILE=0
;;;       or pass the --no-autocompile argument to disable.
;;; compiling q-f-big-n.scm
;;; compiled
/home/luca/.cache/guile/ccache/2.0-0.P-LE-8/home/luca/tmp/q-f-big-n.scm.go
(begin (f n)) returned 0 in 51.79 seconds.
(begin (f n)) returned 0 in 51.77 seconds.
(begin (f n)) returned 0 in 51.77 seconds.

real	2m35.411s
user	2m35.380s
sys	0m0.050s

(begin (f n)) returned 0 in 52.34 seconds.
(begin (f n)) returned 0 in 52.34 seconds.
(begin (f n)) returned 0 in 52.35 seconds.

real	2m37.051s
user	2m37.020s
sys	0m0.030s

(begin (f n)) returned 0 in 51.97 seconds.
(begin (f n)) returned 0 in 51.97 seconds.
(begin (f n)) returned 0 in 51.98 seconds.

real	2m35.949s
user	2m35.930s
sys	0m0.010s

(begin (f n)) returned 0 in 52.55 seconds.
(begin (f n)) returned 0 in 52.54 seconds.
(begin (f n)) returned 0 in 52.54 seconds.

real	2m37.651s
user	2m37.610s
sys	0m0.040s

(begin (f n)) returned 0 in 52.02 seconds.
(begin (f n)) returned 0 in 52.03 seconds.
(begin (f n)) returned 0 in 52.01 seconds.

real	2m36.084s
user	2m36.060s
sys	0m0.040s

(begin (f n)) returned 0 in 51.78 seconds.
(begin (f n)) returned 0 in 51.79 seconds.
(begin (f n)) returned 0 in 51.79 seconds.

real	2m35.382s
user	2m35.310s
sys	0m0.060s

(begin (f n)) returned 0 in 51.39 seconds.
(begin (f n)) returned 0 in 51.39 seconds.
(begin (f n)) returned 0 in 51.39 seconds.

real	2m34.191s
user	2m34.150s
sys	0m0.040s

(begin (f n)) returned 0 in 52.25 seconds.
(begin (f n)) returned 0 in 52.25 seconds.
(begin (f n)) returned 0 in 52.25 seconds.

real	2m36.763s
user	2m36.730s
sys	0m0.040s

(begin (f n)) returned 0 in 52.0 seconds.
(begin (f n)) returned 0 in 51.99 seconds.
(begin (f n)) returned 0 in 52.01 seconds.

real	2m36.014s
user	2m35.960s
sys	0m0.050s

(begin (f n)) returned 0 in 49.9 seconds.
(begin (f n)) returned 0 in 49.9 seconds.
(begin (f n)) returned 0 in 49.89 seconds.

real	2m29.714s
user	2m29.660s
sys	0m0.030s

The effect is much weaker in either case, when using tail-recursive
functions. This is interesting.
  Can the problem be related to the alignment of the Guile stack? It's
not related to the collector, but it may well be about stack access.

Has Guile been tested recently on architectures such as Sparc, requiring
aligned access? If it works reliably on Sparc, could the issue still be
about alignment, but with respect to cache lines?
  Notice that this last program should be very cache-friendly.

> Thanks,

Thank you,

- --
Luca Saiu
http://www-lipn.univ-paris13.fr/~saiu
GNU epsilon: http://www.gnu.org/software/epsilon
Marionnet:   http://www.marionnet.org
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkux7/wACgkQvzOavibF0obEAgCePe61GGtoB24RNFkdM0bankXp
o+4AniFih5wvlnXSP5erZXX3csWCs5u/
=6CMC
-----END PGP SIGNATURE-----

[-- Attachment #2: fibo-with-GC_PRINT_STATS --]
[-- Type: text/plain, Size: 45706 bytes --]

[luca@optimum ~/tmp]$ unset GC_DONT_GC; export GC_PRINT_STATS=1; rm -rf ~/.cache/guile/; for i in `seq 10`; do time guile --autocompile q-fibo.scm; echo; done
Increasing heap size by 65536 after 0 allocated bytes
Number of processors = 8, number of marker threads = 8
Started 7 mark helper threads
Initiating full world-stop collection!

--> Marking for collection 1 after 0 allocated bytes
Collection 0 reclaimed 0 bytes ---> heapsize = 65536 bytes
World-stopped marking took 0 msecs (0 in average)
Heap contains 0 pointer-containing + 0 pointer-free reachable bytes
0 finalization table entries; 0 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Increasing heap size by 65536 after 0 allocated bytes
Adding block map for size of 32 granules (512 bytes)
Adding block map for size of 84 granules (1344 bytes)
Adding block map for size of 0 granules (0 bytes)
Adding block map for size of 3 granules (48 bytes)
Adding block map for size of 2 granules (32 bytes)
Adding block map for size of 1 granules (16 bytes)
Adding block map for size of 17 granules (272 bytes)
Adding block map for size of 5 granules (80 bytes)
Adding block map for size of 4 granules (64 bytes)
Grew fo table to 1 entries
Increasing heap size by 65536 after 99536 allocated bytes
Initiating full world-stop collection!

--> Marking for collection 2 after 103648 allocated bytes
Collection 1 reclaimed 0 bytes ---> heapsize = 196608 bytes
World-stopped marking took 0 msecs (0 in average)
Heap contains 84368 pointer-containing + 14320 pointer-free reachable bytes
1 finalization table entries; 0 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Grew dl table to 1 entries
Adding block map for size of 7 granules (112 bytes)
Adding block map for size of 13 granules (208 bytes)
Adding block map for size of 20 granules (320 bytes)
Grew dl table to 2 entries
Grew dl table to 4 entries
Grew dl table to 8 entries
Adding block map for size of 12 granules (192 bytes)
Adding block map for size of 8 granules (128 bytes)
Grew dl table to 16 entries
Increasing heap size by 69632 after 30592 allocated bytes
Adding block map for size of 42 granules (672 bytes)
Increasing heap size by 90112 after 101760 allocated bytes
Increasing heap size by 122880 after 183248 allocated bytes
Adding block map for size of 9 granules (144 bytes)
Increasing heap size by 684032 after 204432 allocated bytes
Adding block map for size of 15 granules (240 bytes)
Adding block map for size of 16 granules (256 bytes)
Grew dl table to 32 entries
Grew fo table to 2 entries
Grew fo table to 4 entries
Adding block map for size of 18 granules (288 bytes)
Adding block map for size of 6 granules (96 bytes)
Adding block map for size of 11 granules (176 bytes)
Adding block map for size of 10 granules (160 bytes)
Initiating full world-stop collection!

--> Marking for collection 3 after 968688 allocated bytes
Collection 2 reclaimed 4896 bytes ---> heapsize = 1163264 bytes
World-stopped marking took 10 msecs (3 in average)
Heap contains 893184 pointer-containing + 63328 pointer-free reachable bytes
4 finalization table entries; 25 disappearing links alive
0 objects are eligible for immediate finalization; 1 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 10 msecs
Increasing heap size by 389120 after 53168 allocated bytes
Increasing heap size by 520192 after 510544 allocated bytes
Initiating full world-stop collection!

--> Marking for collection 4 after 1033968 allocated bytes
Collection 3 reclaimed 77936 bytes ---> heapsize = 2072576 bytes
World-stopped marking took 0 msecs (2 in average)
Heap contains 1258864 pointer-containing + 152928 pointer-free reachable bytes
4 finalization table entries; 26 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Increasing heap size by 716800 after 119232 allocated bytes
Grew dl table to 64 entries
Adding block map for size of 128 granules (2048 bytes)
;;; note: autocompilation is enabled, set GUILE_AUTO_COMPILE=0
;;;       or pass the --no-autocompile argument to disable.
;;; compiling q-fibo.scm
Initiating full world-stop collection!

--> Marking for collection 5 after 1219392 allocated bytes
Collection 4 reclaimed 567168 bytes ---> heapsize = 2789376 bytes
World-stopped marking took 0 msecs (2 in average)
Heap contains 1862960 pointer-containing + 217904 pointer-free reachable bytes
5 finalization table entries; 55 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Increasing heap size by 933888 after 284352 allocated bytes
Initiating full world-stop collection!

--> Marking for collection 6 after 1490640 allocated bytes
Collection 5 reclaimed 560224 bytes ---> heapsize = 3723264 bytes
World-stopped marking took 0 msecs (1 in average)
Heap contains 2507120 pointer-containing + 319120 pointer-free reachable bytes
5 finalization table entries; 65 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Increasing heap size by 1249280 after 3824 allocated bytes
Grew dl table to 128 entries
Grew fo table to 8 entries
Adding block map for size of 23 granules (368 bytes)
Adding block map for size of 50 granules (800 bytes)
Initiating full world-stop collection!

--> Marking for collection 7 after 1963888 allocated bytes
Collection 6 reclaimed 773792 bytes ---> heapsize = 4972544 bytes
World-stopped marking took 10 msecs (2 in average)
Heap contains 3368416 pointer-containing + 412384 pointer-free reachable bytes
7 finalization table entries; 129 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 10 msecs
Adding block map for size of 19 granules (304 bytes)
Grew dl table to 256 entries
Increasing heap size by 1658880 after 372640 allocated bytes
Adding block map for size of 14 granules (224 bytes)
Initiating full world-stop collection!

--> Marking for collection 8 after 2613952 allocated bytes
Collection 7 reclaimed 997088 bytes ---> heapsize = 6631424 bytes
World-stopped marking took 10 msecs (3 in average)
Heap contains 4372400 pointer-containing + 560176 pointer-free reachable bytes
7 finalization table entries; 145 disappearing links alive
0 objects are eligible for immediate finalization; 43 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 10 msecs
Increasing heap size by 2211840 after 238064 allocated bytes
;;; compiled /home/luca/.cache/guile/ccache/2.0-0.P-LE-8/home/luca/tmp/q-fibo.scm.go
(begin (fibo n)) returned 9227465 in 2.56 seconds.
(begin (fibo n)) returned 9227465 in 2.56 seconds.
(begin (fibo n)) returned 9227465 in 2.56 seconds.

real	0m7.768s
user	0m7.760s
sys	0m0.000s

Increasing heap size by 65536 after 0 allocated bytes
Number of processors = 8, number of marker threads = 8
Started 7 mark helper threads
Initiating full world-stop collection!

--> Marking for collection 1 after 0 allocated bytes
Collection 0 reclaimed 0 bytes ---> heapsize = 65536 bytes
World-stopped marking took 0 msecs (0 in average)
Heap contains 0 pointer-containing + 0 pointer-free reachable bytes
0 finalization table entries; 0 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Increasing heap size by 65536 after 0 allocated bytes
Adding block map for size of 32 granules (512 bytes)
Adding block map for size of 84 granules (1344 bytes)
Adding block map for size of 0 granules (0 bytes)
Adding block map for size of 3 granules (48 bytes)
Adding block map for size of 2 granules (32 bytes)
Adding block map for size of 1 granules (16 bytes)
Adding block map for size of 17 granules (272 bytes)
Adding block map for size of 5 granules (80 bytes)
Adding block map for size of 4 granules (64 bytes)
Grew fo table to 1 entries
Increasing heap size by 65536 after 99536 allocated bytes
Initiating full world-stop collection!

--> Marking for collection 2 after 103648 allocated bytes
Collection 1 reclaimed 0 bytes ---> heapsize = 196608 bytes
World-stopped marking took 0 msecs (0 in average)
Heap contains 83664 pointer-containing + 14320 pointer-free reachable bytes
1 finalization table entries; 0 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Grew dl table to 1 entries
Adding block map for size of 7 granules (112 bytes)
Adding block map for size of 13 granules (208 bytes)
Adding block map for size of 20 granules (320 bytes)
Grew dl table to 2 entries
Grew dl table to 4 entries
Grew dl table to 8 entries
Adding block map for size of 12 granules (192 bytes)
Adding block map for size of 8 granules (128 bytes)
Grew dl table to 16 entries
Increasing heap size by 69632 after 30592 allocated bytes
Adding block map for size of 42 granules (672 bytes)
Increasing heap size by 90112 after 101760 allocated bytes
Increasing heap size by 122880 after 187344 allocated bytes
Adding block map for size of 9 granules (144 bytes)
Increasing heap size by 684032 after 204432 allocated bytes
Adding block map for size of 15 granules (240 bytes)
Adding block map for size of 16 granules (256 bytes)
Grew dl table to 32 entries
Grew fo table to 2 entries
Grew fo table to 4 entries
Adding block map for size of 18 granules (288 bytes)
Adding block map for size of 6 granules (96 bytes)
Adding block map for size of 11 granules (176 bytes)
Adding block map for size of 10 granules (160 bytes)
Initiating full world-stop collection!

--> Marking for collection 3 after 976880 allocated bytes
Collection 2 reclaimed 4896 bytes ---> heapsize = 1163264 bytes
World-stopped marking took 10 msecs (3 in average)
Heap contains 895712 pointer-containing + 60880 pointer-free reachable bytes
4 finalization table entries; 25 disappearing links alive
0 objects are eligible for immediate finalization; 1 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 10 msecs
Increasing heap size by 389120 after 67712 allocated bytes
Increasing heap size by 520192 after 515360 allocated bytes
Initiating full world-stop collection!

--> Marking for collection 4 after 1038672 allocated bytes
Collection 3 reclaimed 78560 bytes ---> heapsize = 2072576 bytes
World-stopped marking took 0 msecs (2 in average)
Heap contains 1259264 pointer-containing + 151552 pointer-free reachable bytes
4 finalization table entries; 26 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Increasing heap size by 716800 after 101792 allocated bytes
Grew dl table to 64 entries
Adding block map for size of 128 granules (2048 bytes)
(begin (fibo n)) returned 9227465 in 2.83 seconds.
(begin (fibo n)) returned 9227465 in 2.83 seconds.
(begin (fibo n)) returned 9227465 in 2.83 seconds.

real	0m8.515s
user	0m8.510s
sys	0m0.000s

Increasing heap size by 65536 after 0 allocated bytes
Number of processors = 8, number of marker threads = 8
Started 7 mark helper threads
Initiating full world-stop collection!

--> Marking for collection 1 after 0 allocated bytes
Collection 0 reclaimed 0 bytes ---> heapsize = 65536 bytes
World-stopped marking took 10 msecs (10 in average)
Heap contains 0 pointer-containing + 0 pointer-free reachable bytes
0 finalization table entries; 0 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 10 msecs
Increasing heap size by 65536 after 0 allocated bytes
Adding block map for size of 32 granules (512 bytes)
Adding block map for size of 84 granules (1344 bytes)
Adding block map for size of 0 granules (0 bytes)
Adding block map for size of 3 granules (48 bytes)
Adding block map for size of 2 granules (32 bytes)
Adding block map for size of 1 granules (16 bytes)
Adding block map for size of 17 granules (272 bytes)
Adding block map for size of 5 granules (80 bytes)
Adding block map for size of 4 granules (64 bytes)
Grew fo table to 1 entries
Increasing heap size by 65536 after 99536 allocated bytes
Initiating full world-stop collection!

--> Marking for collection 2 after 103648 allocated bytes
Collection 1 reclaimed 0 bytes ---> heapsize = 196608 bytes
World-stopped marking took 0 msecs (5 in average)
Heap contains 84352 pointer-containing + 14320 pointer-free reachable bytes
1 finalization table entries; 0 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Grew dl table to 1 entries
Adding block map for size of 7 granules (112 bytes)
Adding block map for size of 13 granules (208 bytes)
Adding block map for size of 20 granules (320 bytes)
Grew dl table to 2 entries
Grew dl table to 4 entries
Grew dl table to 8 entries
Adding block map for size of 12 granules (192 bytes)
Adding block map for size of 8 granules (128 bytes)
Grew dl table to 16 entries
Increasing heap size by 69632 after 30592 allocated bytes
Adding block map for size of 42 granules (672 bytes)
Increasing heap size by 90112 after 101760 allocated bytes
Increasing heap size by 122880 after 187344 allocated bytes
Adding block map for size of 9 granules (144 bytes)
Increasing heap size by 684032 after 204432 allocated bytes
Adding block map for size of 15 granules (240 bytes)
Adding block map for size of 16 granules (256 bytes)
Grew dl table to 32 entries
Grew fo table to 2 entries
Grew fo table to 4 entries
Adding block map for size of 18 granules (288 bytes)
Adding block map for size of 6 granules (96 bytes)
Adding block map for size of 11 granules (176 bytes)
Adding block map for size of 10 granules (160 bytes)
Initiating full world-stop collection!

--> Marking for collection 3 after 976880 allocated bytes
Collection 2 reclaimed 4896 bytes ---> heapsize = 1163264 bytes
World-stopped marking took 0 msecs (3 in average)
Heap contains 894272 pointer-containing + 60464 pointer-free reachable bytes
4 finalization table entries; 25 disappearing links alive
0 objects are eligible for immediate finalization; 1 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Increasing heap size by 389120 after 72240 allocated bytes
Increasing heap size by 520192 after 515792 allocated bytes
Initiating full world-stop collection!

--> Marking for collection 4 after 1039104 allocated bytes
Collection 3 reclaimed 78992 bytes ---> heapsize = 2072576 bytes
World-stopped marking took 0 msecs (2 in average)
Heap contains 1259184 pointer-containing + 150992 pointer-free reachable bytes
4 finalization table entries; 26 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Increasing heap size by 716800 after 104928 allocated bytes
Grew dl table to 64 entries
Adding block map for size of 128 granules (2048 bytes)
(begin (fibo n)) returned 9227465 in 2.93 seconds.
(begin (fibo n)) returned 9227465 in 2.94 seconds.
(begin (fibo n)) returned 9227465 in 2.93 seconds.

real	0m8.822s
user	0m8.820s
sys	0m0.030s

Increasing heap size by 65536 after 0 allocated bytes
Number of processors = 8, number of marker threads = 8
Started 7 mark helper threads
Initiating full world-stop collection!

--> Marking for collection 1 after 0 allocated bytes
Collection 0 reclaimed 0 bytes ---> heapsize = 65536 bytes
World-stopped marking took 0 msecs (0 in average)
Heap contains 0 pointer-containing + 0 pointer-free reachable bytes
0 finalization table entries; 0 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Increasing heap size by 65536 after 0 allocated bytes
Adding block map for size of 32 granules (512 bytes)
Adding block map for size of 84 granules (1344 bytes)
Adding block map for size of 0 granules (0 bytes)
Adding block map for size of 3 granules (48 bytes)
Adding block map for size of 2 granules (32 bytes)
Adding block map for size of 1 granules (16 bytes)
Adding block map for size of 17 granules (272 bytes)
Adding block map for size of 5 granules (80 bytes)
Adding block map for size of 4 granules (64 bytes)
Grew fo table to 1 entries
Increasing heap size by 65536 after 95440 allocated bytes
Initiating full world-stop collection!

--> Marking for collection 2 after 103648 allocated bytes
Collection 1 reclaimed 0 bytes ---> heapsize = 196608 bytes
World-stopped marking took 0 msecs (0 in average)
Heap contains 84352 pointer-containing + 18416 pointer-free reachable bytes
1 finalization table entries; 0 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Grew dl table to 1 entries
Adding block map for size of 7 granules (112 bytes)
Adding block map for size of 13 granules (208 bytes)
Adding block map for size of 20 granules (320 bytes)
Grew dl table to 2 entries
Grew dl table to 4 entries
Grew dl table to 8 entries
Adding block map for size of 12 granules (192 bytes)
Increasing heap size by 69632 after 30000 allocated bytes
Adding block map for size of 8 granules (128 bytes)
Grew dl table to 16 entries
Adding block map for size of 42 granules (672 bytes)
Increasing heap size by 90112 after 101760 allocated bytes
Increasing heap size by 122880 after 187344 allocated bytes
Adding block map for size of 9 granules (144 bytes)
Increasing heap size by 684032 after 204432 allocated bytes
Adding block map for size of 15 granules (240 bytes)
Adding block map for size of 16 granules (256 bytes)
Grew dl table to 32 entries
Grew fo table to 2 entries
Grew fo table to 4 entries
Adding block map for size of 18 granules (288 bytes)
Adding block map for size of 6 granules (96 bytes)
Adding block map for size of 11 granules (176 bytes)
Adding block map for size of 10 granules (160 bytes)
Initiating full world-stop collection!

--> Marking for collection 3 after 976880 allocated bytes
Collection 2 reclaimed 4896 bytes ---> heapsize = 1163264 bytes
World-stopped marking took 0 msecs (0 in average)
Heap contains 895840 pointer-containing + 65024 pointer-free reachable bytes
4 finalization table entries; 25 disappearing links alive
0 objects are eligible for immediate finalization; 1 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Increasing heap size by 389120 after 72240 allocated bytes
Increasing heap size by 520192 after 515792 allocated bytes
Initiating full world-stop collection!

--> Marking for collection 4 after 1039104 allocated bytes
Collection 3 reclaimed 74896 bytes ---> heapsize = 2072576 bytes
World-stopped marking took 10 msecs (2 in average)
Heap contains 1261952 pointer-containing + 155664 pointer-free reachable bytes
4 finalization table entries; 26 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 10 msecs
Increasing heap size by 716800 after 106656 allocated bytes
Grew dl table to 64 entries
Adding block map for size of 128 granules (2048 bytes)
(begin (fibo n)) returned 9227465 in 2.81 seconds.
(begin (fibo n)) returned 9227465 in 2.8 seconds.
(begin (fibo n)) returned 9227465 in 2.8 seconds.

real	0m8.426s
user	0m8.420s
sys	0m0.000s

Increasing heap size by 65536 after 0 allocated bytes
Number of processors = 8, number of marker threads = 8
Started 7 mark helper threads
Initiating full world-stop collection!

--> Marking for collection 1 after 0 allocated bytes
Collection 0 reclaimed 0 bytes ---> heapsize = 65536 bytes
World-stopped marking took 0 msecs (0 in average)
Heap contains 0 pointer-containing + 0 pointer-free reachable bytes
0 finalization table entries; 0 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Increasing heap size by 65536 after 0 allocated bytes
Adding block map for size of 32 granules (512 bytes)
Adding block map for size of 84 granules (1344 bytes)
Adding block map for size of 0 granules (0 bytes)
Adding block map for size of 3 granules (48 bytes)
Adding block map for size of 2 granules (32 bytes)
Adding block map for size of 1 granules (16 bytes)
Adding block map for size of 17 granules (272 bytes)
Adding block map for size of 5 granules (80 bytes)
Adding block map for size of 4 granules (64 bytes)
Grew fo table to 1 entries
Increasing heap size by 65536 after 99536 allocated bytes
Initiating full world-stop collection!

--> Marking for collection 2 after 103648 allocated bytes
Collection 1 reclaimed 0 bytes ---> heapsize = 196608 bytes
World-stopped marking took 0 msecs (0 in average)
Heap contains 84368 pointer-containing + 14320 pointer-free reachable bytes
1 finalization table entries; 0 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Grew dl table to 1 entries
Adding block map for size of 7 granules (112 bytes)
Adding block map for size of 13 granules (208 bytes)
Adding block map for size of 20 granules (320 bytes)
Grew dl table to 2 entries
Grew dl table to 4 entries
Grew dl table to 8 entries
Adding block map for size of 12 granules (192 bytes)
Adding block map for size of 8 granules (128 bytes)
Grew dl table to 16 entries
Increasing heap size by 69632 after 30528 allocated bytes
Adding block map for size of 42 granules (672 bytes)
Increasing heap size by 90112 after 105840 allocated bytes
Increasing heap size by 122880 after 191376 allocated bytes
Adding block map for size of 9 granules (144 bytes)
Increasing heap size by 684032 after 204368 allocated bytes
Adding block map for size of 15 granules (240 bytes)
Adding block map for size of 16 granules (256 bytes)
Grew dl table to 32 entries
Grew fo table to 2 entries
Grew fo table to 4 entries
Adding block map for size of 18 granules (288 bytes)
Adding block map for size of 6 granules (96 bytes)
Adding block map for size of 11 granules (176 bytes)
Adding block map for size of 10 granules (160 bytes)
Initiating full world-stop collection!

--> Marking for collection 3 after 980912 allocated bytes
Collection 2 reclaimed 4896 bytes ---> heapsize = 1163264 bytes
World-stopped marking took 0 msecs (0 in average)
Heap contains 893760 pointer-containing + 64640 pointer-free reachable bytes
4 finalization table entries; 25 disappearing links alive
0 objects are eligible for immediate finalization; 1 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Increasing heap size by 389120 after 64608 allocated bytes
Increasing heap size by 520192 after 511872 allocated bytes
Initiating full world-stop collection!

--> Marking for collection 4 after 1035184 allocated bytes
Collection 3 reclaimed 75008 bytes ---> heapsize = 2072576 bytes
World-stopped marking took 0 msecs (0 in average)
Heap contains 1259024 pointer-containing + 151056 pointer-free reachable bytes
4 finalization table entries; 26 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Increasing heap size by 716800 after 102496 allocated bytes
Grew dl table to 64 entries
Adding block map for size of 128 granules (2048 bytes)
(begin (fibo n)) returned 9227465 in 2.79 seconds.
(begin (fibo n)) returned 9227465 in 2.8 seconds.
(begin (fibo n)) returned 9227465 in 2.79 seconds.

real	0m8.400s
user	0m8.380s
sys	0m0.010s

Increasing heap size by 65536 after 0 allocated bytes
Number of processors = 8, number of marker threads = 8
Started 7 mark helper threads
Initiating full world-stop collection!

--> Marking for collection 1 after 0 allocated bytes
Collection 0 reclaimed 0 bytes ---> heapsize = 65536 bytes
World-stopped marking took 10 msecs (10 in average)
Heap contains 0 pointer-containing + 0 pointer-free reachable bytes
0 finalization table entries; 0 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 10 msecs
Increasing heap size by 65536 after 0 allocated bytes
Adding block map for size of 32 granules (512 bytes)
Adding block map for size of 84 granules (1344 bytes)
Adding block map for size of 0 granules (0 bytes)
Adding block map for size of 3 granules (48 bytes)
Adding block map for size of 2 granules (32 bytes)
Adding block map for size of 1 granules (16 bytes)
Adding block map for size of 17 granules (272 bytes)
Adding block map for size of 5 granules (80 bytes)
Adding block map for size of 4 granules (64 bytes)
Grew fo table to 1 entries
Increasing heap size by 65536 after 99536 allocated bytes
Initiating full world-stop collection!

--> Marking for collection 2 after 103648 allocated bytes
Collection 1 reclaimed 0 bytes ---> heapsize = 196608 bytes
World-stopped marking took 0 msecs (5 in average)
Heap contains 84368 pointer-containing + 14320 pointer-free reachable bytes
1 finalization table entries; 0 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Grew dl table to 1 entries
Adding block map for size of 7 granules (112 bytes)
Adding block map for size of 13 granules (208 bytes)
Adding block map for size of 20 granules (320 bytes)
Grew dl table to 2 entries
Grew dl table to 4 entries
Grew dl table to 8 entries
Adding block map for size of 12 granules (192 bytes)
Adding block map for size of 8 granules (128 bytes)
Grew dl table to 16 entries
Increasing heap size by 69632 after 30592 allocated bytes
Adding block map for size of 42 granules (672 bytes)
Increasing heap size by 90112 after 101760 allocated bytes
Increasing heap size by 122880 after 187344 allocated bytes
Adding block map for size of 9 granules (144 bytes)
Increasing heap size by 684032 after 204432 allocated bytes
Adding block map for size of 15 granules (240 bytes)
Adding block map for size of 16 granules (256 bytes)
Grew dl table to 32 entries
Grew fo table to 2 entries
Grew fo table to 4 entries
Adding block map for size of 18 granules (288 bytes)
Adding block map for size of 6 granules (96 bytes)
Adding block map for size of 11 granules (176 bytes)
Adding block map for size of 10 granules (160 bytes)
Initiating full world-stop collection!

--> Marking for collection 3 after 976880 allocated bytes
Collection 2 reclaimed 4896 bytes ---> heapsize = 1163264 bytes
World-stopped marking took 0 msecs (3 in average)
Heap contains 894208 pointer-containing + 60720 pointer-free reachable bytes
4 finalization table entries; 25 disappearing links alive
0 objects are eligible for immediate finalization; 1 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Increasing heap size by 389120 after 67712 allocated bytes
Increasing heap size by 520192 after 515360 allocated bytes
Initiating full world-stop collection!

--> Marking for collection 4 after 1034576 allocated bytes
Collection 3 reclaimed 78560 bytes ---> heapsize = 2072576 bytes
World-stopped marking took 10 msecs (5 in average)
Heap contains 1262528 pointer-containing + 149568 pointer-free reachable bytes
4 finalization table entries; 26 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 10 msecs
Increasing heap size by 716800 after 108688 allocated bytes
Grew dl table to 64 entries
Adding block map for size of 128 granules (2048 bytes)
(begin (fibo n)) returned 9227465 in 2.95 seconds.
(begin (fibo n)) returned 9227465 in 2.95 seconds.
(begin (fibo n)) returned 9227465 in 2.95 seconds.

real	0m8.872s
user	0m8.870s
sys	0m0.010s

Increasing heap size by 65536 after 0 allocated bytes
Number of processors = 8, number of marker threads = 8
Started 7 mark helper threads
Initiating full world-stop collection!

--> Marking for collection 1 after 0 allocated bytes
Collection 0 reclaimed 0 bytes ---> heapsize = 65536 bytes
World-stopped marking took 0 msecs (0 in average)
Heap contains 0 pointer-containing + 0 pointer-free reachable bytes
0 finalization table entries; 0 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Increasing heap size by 65536 after 0 allocated bytes
Adding block map for size of 32 granules (512 bytes)
Adding block map for size of 84 granules (1344 bytes)
Adding block map for size of 0 granules (0 bytes)
Adding block map for size of 3 granules (48 bytes)
Adding block map for size of 2 granules (32 bytes)
Adding block map for size of 1 granules (16 bytes)
Adding block map for size of 17 granules (272 bytes)
Adding block map for size of 5 granules (80 bytes)
Adding block map for size of 4 granules (64 bytes)
Grew fo table to 1 entries
Increasing heap size by 65536 after 99536 allocated bytes
Initiating full world-stop collection!

--> Marking for collection 2 after 103648 allocated bytes
Collection 1 reclaimed 0 bytes ---> heapsize = 196608 bytes
World-stopped marking took 0 msecs (0 in average)
Heap contains 84224 pointer-containing + 14320 pointer-free reachable bytes
1 finalization table entries; 0 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Grew dl table to 1 entries
Adding block map for size of 7 granules (112 bytes)
Adding block map for size of 13 granules (208 bytes)
Adding block map for size of 20 granules (320 bytes)
Grew dl table to 2 entries
Grew dl table to 4 entries
Grew dl table to 8 entries
Adding block map for size of 12 granules (192 bytes)
Adding block map for size of 8 granules (128 bytes)
Grew dl table to 16 entries
Increasing heap size by 69632 after 30592 allocated bytes
Adding block map for size of 42 granules (672 bytes)
Increasing heap size by 90112 after 97664 allocated bytes
Increasing heap size by 122880 after 183248 allocated bytes
Adding block map for size of 9 granules (144 bytes)
Increasing heap size by 684032 after 204432 allocated bytes
Adding block map for size of 15 granules (240 bytes)
Adding block map for size of 16 granules (256 bytes)
Grew dl table to 32 entries
Grew fo table to 2 entries
Grew fo table to 4 entries
Adding block map for size of 18 granules (288 bytes)
Adding block map for size of 6 granules (96 bytes)
Adding block map for size of 11 granules (176 bytes)
Adding block map for size of 10 granules (160 bytes)
Initiating full world-stop collection!

--> Marking for collection 3 after 972784 allocated bytes
Collection 2 reclaimed 4896 bytes ---> heapsize = 1163264 bytes
World-stopped marking took 0 msecs (0 in average)
Heap contains 891536 pointer-containing + 62512 pointer-free reachable bytes
4 finalization table entries; 25 disappearing links alive
0 objects are eligible for immediate finalization; 1 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Increasing heap size by 389120 after 72656 allocated bytes
Increasing heap size by 520192 after 511760 allocated bytes
Initiating full world-stop collection!

--> Marking for collection 4 after 1035184 allocated bytes
Collection 3 reclaimed 79152 bytes ---> heapsize = 2072576 bytes
World-stopped marking took 10 msecs (2 in average)
Heap contains 1256544 pointer-containing + 152320 pointer-free reachable bytes
4 finalization table entries; 26 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 10 msecs
Increasing heap size by 716800 after 108448 allocated bytes
Grew dl table to 64 entries
Adding block map for size of 128 granules (2048 bytes)
(begin (fibo n)) returned 9227465 in 3.27 seconds.
(begin (fibo n)) returned 9227465 in 3.26 seconds.
(begin (fibo n)) returned 9227465 in 3.27 seconds.

real	0m9.811s
user	0m9.800s
sys	0m0.010s

Increasing heap size by 65536 after 0 allocated bytes
Number of processors = 8, number of marker threads = 8
Started 7 mark helper threads
Initiating full world-stop collection!

--> Marking for collection 1 after 0 allocated bytes
Collection 0 reclaimed 0 bytes ---> heapsize = 65536 bytes
World-stopped marking took 0 msecs (0 in average)
Heap contains 0 pointer-containing + 0 pointer-free reachable bytes
0 finalization table entries; 0 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Increasing heap size by 65536 after 0 allocated bytes
Adding block map for size of 32 granules (512 bytes)
Adding block map for size of 84 granules (1344 bytes)
Adding block map for size of 0 granules (0 bytes)
Adding block map for size of 3 granules (48 bytes)
Adding block map for size of 2 granules (32 bytes)
Adding block map for size of 1 granules (16 bytes)
Adding block map for size of 17 granules (272 bytes)
Adding block map for size of 5 granules (80 bytes)
Adding block map for size of 4 granules (64 bytes)
Grew fo table to 1 entries
Increasing heap size by 65536 after 99536 allocated bytes
Initiating full world-stop collection!

--> Marking for collection 2 after 103648 allocated bytes
Collection 1 reclaimed 0 bytes ---> heapsize = 196608 bytes
World-stopped marking took 0 msecs (0 in average)
Heap contains 84368 pointer-containing + 14320 pointer-free reachable bytes
1 finalization table entries; 0 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Grew dl table to 1 entries
Adding block map for size of 7 granules (112 bytes)
Adding block map for size of 13 granules (208 bytes)
Adding block map for size of 20 granules (320 bytes)
Grew dl table to 2 entries
Grew dl table to 4 entries
Grew dl table to 8 entries
Adding block map for size of 12 granules (192 bytes)
Adding block map for size of 8 granules (128 bytes)
Grew dl table to 16 entries
Increasing heap size by 69632 after 30592 allocated bytes
Adding block map for size of 42 granules (672 bytes)
Increasing heap size by 90112 after 101760 allocated bytes
Increasing heap size by 122880 after 187344 allocated bytes
Adding block map for size of 9 granules (144 bytes)
Increasing heap size by 684032 after 204432 allocated bytes
Adding block map for size of 15 granules (240 bytes)
Adding block map for size of 16 granules (256 bytes)
Grew dl table to 32 entries
Grew fo table to 2 entries
Grew fo table to 4 entries
Adding block map for size of 18 granules (288 bytes)
Adding block map for size of 6 granules (96 bytes)
Adding block map for size of 11 granules (176 bytes)
Adding block map for size of 10 granules (160 bytes)
Initiating full world-stop collection!

--> Marking for collection 3 after 976880 allocated bytes
Collection 2 reclaimed 4896 bytes ---> heapsize = 1163264 bytes
World-stopped marking took 0 msecs (0 in average)
Heap contains 894448 pointer-containing + 60496 pointer-free reachable bytes
4 finalization table entries; 25 disappearing links alive
0 objects are eligible for immediate finalization; 1 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Increasing heap size by 389120 after 72240 allocated bytes
Increasing heap size by 520192 after 515792 allocated bytes
Initiating full world-stop collection!

--> Marking for collection 4 after 1039104 allocated bytes
Collection 3 reclaimed 78992 bytes ---> heapsize = 2072576 bytes
World-stopped marking took 10 msecs (2 in average)
Heap contains 1260272 pointer-containing + 151632 pointer-free reachable bytes
4 finalization table entries; 26 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 10 msecs
Increasing heap size by 716800 after 107392 allocated bytes
Grew dl table to 64 entries
Adding block map for size of 128 granules (2048 bytes)
(begin (fibo n)) returned 9227465 in 3.0 seconds.
(begin (fibo n)) returned 9227465 in 3.01 seconds.
(begin (fibo n)) returned 9227465 in 3.0 seconds.

real	0m9.026s
user	0m9.020s
sys	0m0.010s

Increasing heap size by 65536 after 0 allocated bytes
Number of processors = 8, number of marker threads = 8
Started 7 mark helper threads
Initiating full world-stop collection!

--> Marking for collection 1 after 0 allocated bytes
Collection 0 reclaimed 0 bytes ---> heapsize = 65536 bytes
World-stopped marking took 0 msecs (0 in average)
Heap contains 0 pointer-containing + 0 pointer-free reachable bytes
0 finalization table entries; 0 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Increasing heap size by 65536 after 0 allocated bytes
Adding block map for size of 32 granules (512 bytes)
Adding block map for size of 84 granules (1344 bytes)
Adding block map for size of 0 granules (0 bytes)
Adding block map for size of 3 granules (48 bytes)
Adding block map for size of 2 granules (32 bytes)
Adding block map for size of 1 granules (16 bytes)
Adding block map for size of 17 granules (272 bytes)
Adding block map for size of 5 granules (80 bytes)
Adding block map for size of 4 granules (64 bytes)
Grew fo table to 1 entries
Increasing heap size by 65536 after 99536 allocated bytes
Initiating full world-stop collection!

--> Marking for collection 2 after 103648 allocated bytes
Collection 1 reclaimed 0 bytes ---> heapsize = 196608 bytes
World-stopped marking took 0 msecs (0 in average)
Heap contains 84240 pointer-containing + 14320 pointer-free reachable bytes
1 finalization table entries; 0 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Grew dl table to 1 entries
Adding block map for size of 7 granules (112 bytes)
Adding block map for size of 13 granules (208 bytes)
Adding block map for size of 20 granules (320 bytes)
Grew dl table to 2 entries
Grew dl table to 4 entries
Grew dl table to 8 entries
Adding block map for size of 12 granules (192 bytes)
Adding block map for size of 8 granules (128 bytes)
Grew dl table to 16 entries
Increasing heap size by 69632 after 30592 allocated bytes
Adding block map for size of 42 granules (672 bytes)
Increasing heap size by 90112 after 101760 allocated bytes
Increasing heap size by 122880 after 187344 allocated bytes
Adding block map for size of 9 granules (144 bytes)
Increasing heap size by 684032 after 204432 allocated bytes
Adding block map for size of 15 granules (240 bytes)
Adding block map for size of 16 granules (256 bytes)
Grew dl table to 32 entries
Grew fo table to 2 entries
Grew fo table to 4 entries
Adding block map for size of 18 granules (288 bytes)
Adding block map for size of 6 granules (96 bytes)
Adding block map for size of 11 granules (176 bytes)
Adding block map for size of 10 granules (160 bytes)
Initiating full world-stop collection!

--> Marking for collection 3 after 976880 allocated bytes
Collection 2 reclaimed 4896 bytes ---> heapsize = 1163264 bytes
World-stopped marking took 0 msecs (0 in average)
Heap contains 892864 pointer-containing + 60400 pointer-free reachable bytes
4 finalization table entries; 25 disappearing links alive
0 objects are eligible for immediate finalization; 1 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Increasing heap size by 389120 after 72240 allocated bytes
Increasing heap size by 520192 after 511600 allocated bytes
Initiating full world-stop collection!

--> Marking for collection 4 after 1035008 allocated bytes
Collection 3 reclaimed 78992 bytes ---> heapsize = 2072576 bytes
World-stopped marking took 0 msecs (0 in average)
Heap contains 1262336 pointer-containing + 149040 pointer-free reachable bytes
4 finalization table entries; 26 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Increasing heap size by 716800 after 112480 allocated bytes
Grew dl table to 64 entries
Adding block map for size of 128 granules (2048 bytes)
(begin (fibo n)) returned 9227465 in 2.6 seconds.
(begin (fibo n)) returned 9227465 in 2.61 seconds.
(begin (fibo n)) returned 9227465 in 2.6 seconds.

real	0m7.829s
user	0m7.830s
sys	0m0.000s

Increasing heap size by 65536 after 0 allocated bytes
Number of processors = 8, number of marker threads = 8
Started 7 mark helper threads
Initiating full world-stop collection!

--> Marking for collection 1 after 0 allocated bytes
Collection 0 reclaimed 0 bytes ---> heapsize = 65536 bytes
World-stopped marking took 0 msecs (0 in average)
Heap contains 0 pointer-containing + 0 pointer-free reachable bytes
0 finalization table entries; 0 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Increasing heap size by 65536 after 0 allocated bytes
Adding block map for size of 32 granules (512 bytes)
Adding block map for size of 84 granules (1344 bytes)
Adding block map for size of 0 granules (0 bytes)
Adding block map for size of 3 granules (48 bytes)
Adding block map for size of 2 granules (32 bytes)
Adding block map for size of 1 granules (16 bytes)
Adding block map for size of 17 granules (272 bytes)
Adding block map for size of 5 granules (80 bytes)
Adding block map for size of 4 granules (64 bytes)
Grew fo table to 1 entries
Increasing heap size by 65536 after 99536 allocated bytes
Initiating full world-stop collection!

--> Marking for collection 2 after 103648 allocated bytes
Collection 1 reclaimed 0 bytes ---> heapsize = 196608 bytes
World-stopped marking took 0 msecs (0 in average)
Heap contains 84288 pointer-containing + 14320 pointer-free reachable bytes
1 finalization table entries; 0 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Grew dl table to 1 entries
Adding block map for size of 7 granules (112 bytes)
Adding block map for size of 13 granules (208 bytes)
Adding block map for size of 20 granules (320 bytes)
Grew dl table to 2 entries
Grew dl table to 4 entries
Grew dl table to 8 entries
Adding block map for size of 12 granules (192 bytes)
Adding block map for size of 8 granules (128 bytes)
Grew dl table to 16 entries
Increasing heap size by 69632 after 30592 allocated bytes
Adding block map for size of 42 granules (672 bytes)
Increasing heap size by 90112 after 101760 allocated bytes
Increasing heap size by 122880 after 187344 allocated bytes
Adding block map for size of 9 granules (144 bytes)
Increasing heap size by 684032 after 204432 allocated bytes
Adding block map for size of 15 granules (240 bytes)
Adding block map for size of 16 granules (256 bytes)
Grew dl table to 32 entries
Grew fo table to 2 entries
Grew fo table to 4 entries
Adding block map for size of 18 granules (288 bytes)
Adding block map for size of 6 granules (96 bytes)
Adding block map for size of 11 granules (176 bytes)
Adding block map for size of 10 granules (160 bytes)
Initiating full world-stop collection!

--> Marking for collection 3 after 976880 allocated bytes
Collection 2 reclaimed 4896 bytes ---> heapsize = 1163264 bytes
World-stopped marking took 0 msecs (0 in average)
Heap contains 896752 pointer-containing + 60816 pointer-free reachable bytes
4 finalization table entries; 25 disappearing links alive
0 objects are eligible for immediate finalization; 1 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Increasing heap size by 389120 after 71536 allocated bytes
Increasing heap size by 520192 after 515792 allocated bytes
Initiating full world-stop collection!

--> Marking for collection 4 after 1039104 allocated bytes
Collection 3 reclaimed 78992 bytes ---> heapsize = 2072576 bytes
World-stopped marking took 0 msecs (0 in average)
Heap contains 1259024 pointer-containing + 151616 pointer-free reachable bytes
4 finalization table entries; 26 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Increasing heap size by 716800 after 103440 allocated bytes
Grew dl table to 64 entries
Adding block map for size of 128 granules (2048 bytes)
(begin (fibo n)) returned 9227465 in 2.78 seconds.
(begin (fibo n)) returned 9227465 in 2.78 seconds.
(begin (fibo n)) returned 9227465 in 2.78 seconds.

real	0m9.213s
user	0m8.350s
sys	0m0.000s

[-- Attachment #3: fibo-with-GC_PRINT_STATS.sig --]
[-- Type: application/octet-stream, Size: 72 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: High run time variance
  2010-03-30 12:35   ` Luca Saiu
@ 2010-03-30 13:01     ` Ludovic Courtès
  2010-03-30 15:05       ` Luca Saiu
  0 siblings, 1 reply; 10+ messages in thread
From: Ludovic Courtès @ 2010-03-30 13:01 UTC (permalink / raw)
  To: Luca Saiu; +Cc: bug-guile

Hello,

Luca Saiu <positron@gnu.org> writes:

> The effect is much weaker in either case, when using tail-recursive
> functions. This is interesting.

Another guess: it could be that the non-tail-recursive version fills the
L2 cache (/sys/devices/system/cpu/cpu0/cache/index2/size says it’s 2 MiB
on my laptop), or something like that.

Thanks,
Ludo’.




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: High run time variance
  2010-03-30 13:01     ` Ludovic Courtès
@ 2010-03-30 15:05       ` Luca Saiu
  0 siblings, 0 replies; 10+ messages in thread
From: Luca Saiu @ 2010-03-30 15:05 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: bug-guile

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Ludovic.

Ludovic Courtès wrote:
> Hello,
> 
> Luca Saiu <positron@gnu.org> writes:
> 
>> The effect is much weaker in either case, when using tail-recursive
>> functions. This is interesting.
> 
> Another guess: it could be that the non-tail-recursive version fills the
> L2 cache

In the case of fibo I'd say no, as the maximum stack depth is O(n) and
Guile reuses the same stack space over and over again.
Anyway, even if L2 were really filled, why the timing anomaly? Some
event that we don't control takes place *at process start up time*, and
deterministically determines the speed of all the repeated evaluations
taking place later, until the process dies. This thing is intriguing.

> (/sys/devices/system/cpu/cpu0/cache/index2/size says it’s 2 MiB
> on my laptop)

On my machine L2s are 6Mb, 24-way set-associative, one per pair of cores
(there are four L2 caches for eight cores (four per package: I'm
honestly not sure about chips and dies)). Anyway during the experiments
essentially only one core of eight was active.

By the way, am I the only one seeing this effect?  I'd like to try using
OProfile to exploit some hardware event counter for cache faults and the
like. I'm not expert, but I'll look into it.

Thanks,

- --
Luca Saiu
http://www-lipn.univ-paris13.fr/~saiu
GNU epsilon: http://www.gnu.org/software/epsilon
Marionnet:   http://www.marionnet.org
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkuyEyUACgkQvzOavibF0oaUQgCcC4V1HQoLXYlqrzqeOuqTn2ra
DJEAnRCjcLDYjqlBIlBDpIOIcPKnEyCg
=sUjD
-----END PGP SIGNATURE-----




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: High run time variance
  2010-03-29 16:09 High run time variance Luca Saiu
  2010-03-30  9:16 ` Ludovic Courtès
@ 2010-08-04 19:51 ` Andy Wingo
  2010-08-05 13:42   ` Luca Saiu
  1 sibling, 1 reply; 10+ messages in thread
From: Andy Wingo @ 2010-08-04 19:51 UTC (permalink / raw)
  To: Luca Saiu; +Cc: bug-guile

On Mon 29 Mar 2010 18:09, Luca Saiu <positron@gnu.org> writes:

> To sum up, within each run the computation time of (fibo n) is the same,
> but the time varies widely from one run to another. This anomaly seems
> to have become much more accentuated in 1.9.

I suspect this is related to the GC issues brought up recently on
guile-user. Something is being misidentified as a Scheme object.

We'll poke this problem once we have a heap profiler.

Andy
-- 
http://wingolog.org/



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: High run time variance
  2010-08-04 19:51 ` Andy Wingo
@ 2010-08-05 13:42   ` Luca Saiu
  2010-08-05 14:45     ` Andy Wingo
  0 siblings, 1 reply; 10+ messages in thread
From: Luca Saiu @ 2010-08-05 13:42 UTC (permalink / raw)
  To: Andy Wingo; +Cc: bug-guile

[-- Attachment #1: Type: text/plain, Size: 1600 bytes --]

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello.

Andy Wingo wrote:
> On Mon 29 Mar 2010 18:09, Luca Saiu <positron@gnu.org> writes:
> 
>> To sum up, within each run the computation time of (fibo n) is the same,
>> but the time varies widely from one run to another. This anomaly seems
>> to have become much more accentuated in 1.9.
> 
> I suspect this is related to the GC issues brought up recently on
> guile-user. Something is being misidentified as a Scheme object.

Well, not in the fibo case unless I grossly misunderstood something.
Try running again the same test program q.scm, with GC_PRINT_STATS set
to 1: you will notice that the GC is not triggered at all when computing
fibo, which is correct and good.  I've just checked that this is still
true with 1.9.11; see the attached log.

The run time variance problem is still there in 1.9.11, but it seems
less serious.  I still suspect it's an alignment issue, unrelated to the GC.

By the way, I've been surprised in finding other trivial testcases which
*do* allocate and collect without any obvious reason; see a.scm.

> We'll poke this problem once we have a heap profiler.

That will be fun, indeed.

Thanks and best regards,

- --
Luca Saiu
http://www-lipn.univ-paris13.fr/~saiu
GNU epsilon: http://www.gnu.org/software/epsilon
Marionnet:   http://www.marionnet.org
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkxav6sACgkQvzOavibF0oYDcQCdGVWvzeKNFITXG6RFo70oqBDW
s6EAn2BOYi6QmfyQvK3ARzGo0sNyWC3X
=2RcI
-----END PGP SIGNATURE-----

[-- Attachment #2: log-with-1.9.11 --]
[-- Type: text/plain, Size: 4938 bytes --]

[luca@optimum ~/scratch]$ GC_PRINT_STATS=1 \time -p guile q.scm
Increasing heap size by 65536 after 0 allocated bytes
Number of processors = 8, number of marker threads = 8
Started 7 mark helper threads
Initiating full world-stop collection!

--> Marking for collection 1 after 0 allocated bytes
Collection 0 reclaimed 0 bytes ---> heapsize = 65536 bytes
World-stopped marking took 0 msecs (0 in average)
Heap contains 0 pointer-containing + 0 pointer-free reachable bytes
0 finalization table entries; 0 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Increasing heap size by 65536 after 0 allocated bytes
Adding block map for size of 32 granules (512 bytes)
Adding block map for size of 84 granules (1344 bytes)
Adding block map for size of 0 granules (0 bytes)
Adding block map for size of 3 granules (48 bytes)
Adding block map for size of 2 granules (32 bytes)
Adding block map for size of 1 granules (16 bytes)
Adding block map for size of 17 granules (272 bytes)
Adding block map for size of 5 granules (80 bytes)
Adding block map for size of 4 granules (64 bytes)
Grew fo table to 1 entries
Increasing heap size by 65536 after 103584 allocated bytes
Initiating full world-stop collection!

--> Marking for collection 2 after 103600 allocated bytes
Collection 1 reclaimed 0 bytes ---> heapsize = 196608 bytes
World-stopped marking took 0 msecs (0 in average)
Heap contains 84304 pointer-containing + 14272 pointer-free reachable bytes
1 finalization table entries; 0 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Grew dl table to 1 entries
Adding block map for size of 7 granules (112 bytes)
Grew dl table to 2 entries
Adding block map for size of 13 granules (208 bytes)
Adding block map for size of 20 granules (320 bytes)
Grew dl table to 4 entries
Grew dl table to 8 entries
Adding block map for size of 12 granules (192 bytes)
Adding block map for size of 8 granules (128 bytes)
Grew dl table to 16 entries
Increasing heap size by 69632 after 30656 allocated bytes
Adding block map for size of 42 granules (672 bytes)
Increasing heap size by 90112 after 100448 allocated bytes
Increasing heap size by 122880 after 185792 allocated bytes
Adding block map for size of 9 granules (144 bytes)
Increasing heap size by 684032 after 202832 allocated bytes
Adding block map for size of 15 granules (240 bytes)
Adding block map for size of 16 granules (256 bytes)
Grew dl table to 32 entries
Grew fo table to 2 entries
Grew fo table to 4 entries
Adding block map for size of 10 granules (160 bytes)
Adding block map for size of 14 granules (224 bytes)
Adding block map for size of 6 granules (96 bytes)
Adding block map for size of 11 granules (176 bytes)
Initiating full world-stop collection!

--> Marking for collection 3 after 973552 allocated bytes
Collection 2 reclaimed 4864 bytes ---> heapsize = 1163264 bytes
World-stopped marking took 10 msecs (3 in average)
Heap contains 896640 pointer-containing + 61136 pointer-free reachable bytes
4 finalization table entries; 26 disappearing links alive
0 objects are eligible for immediate finalization; 1 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 10 msecs
Adding block map for size of 18 granules (288 bytes)
Increasing heap size by 389120 after 91168 allocated bytes
Increasing heap size by 520192 after 505632 allocated bytes
Initiating full world-stop collection!

--> Marking for collection 4 after 1030672 allocated bytes
Collection 3 reclaimed 73152 bytes ---> heapsize = 2072576 bytes
World-stopped marking took 0 msecs (2 in average)
Heap contains 1279280 pointer-containing + 149120 pointer-free reachable bytes
4 finalization table entries; 26 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Increasing heap size by 716800 after 220144 allocated bytes
Grew dl table to 64 entries
Initiating full world-stop collection!

--> Marking for collection 5 after 1254672 allocated bytes
Collection 4 reclaimed 581984 bytes ---> heapsize = 2789376 bytes
World-stopped marking took 0 msecs (2 in average)
Heap contains 1837088 pointer-containing + 234592 pointer-free reachable bytes
5 finalization table entries; 41 disappearing links alive
0 objects are eligible for immediate finalization; 0 links cleared
Finalize + initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs
Adding block map for size of 128 granules (2048 bytes)
Increasing heap size by 937984 after 44352 allocated bytes
(begin (fibo n)) returned 9227465 in 2.37 seconds.
(begin (fibo n)) returned 9227465 in 2.37 seconds.
(begin (fibo n)) returned 9227465 in 2.37 seconds.
real 7.14
user 7.13
sys 0.01

[-- Attachment #3: a.scm --]
[-- Type: text/x-scheme, Size: 341 bytes --]

;;; Why does this GCs after creating b?  Try running this with GC_PRINT_STATS=1,
;;; or with GC_DONT_GC=1 if you want to see the memory use rise.

(define size 100000)
(define times 100)

(define b (make-bitvector size))

(do ((t 0 (1+ t)))
    ((= t times) 'done)
  (do ((i 0 (1+ i)))
      ((= i size) 'done)
    (bitvector-set! b i #t)))

[-- Attachment #4: log-with-1.9.11.sig --]
[-- Type: application/octet-stream, Size: 72 bytes --]

[-- Attachment #5: a.scm.sig --]
[-- Type: application/octet-stream, Size: 72 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: High run time variance
  2010-08-05 13:42   ` Luca Saiu
@ 2010-08-05 14:45     ` Andy Wingo
  2010-08-05 16:21       ` Luca Saiu
  0 siblings, 1 reply; 10+ messages in thread
From: Andy Wingo @ 2010-08-05 14:45 UTC (permalink / raw)
  To: Luca Saiu; +Cc: bug-guile

Hi,

On Thu 05 Aug 2010 15:42, Luca Saiu <positron@gnu.org> writes:

> ;;; Why does this GCs after creating b?  Try running this with GC_PRINT_STATS=1,
> ;;; or with GC_DONT_GC=1 if you want to see the memory use rise.
>
> (define size 100000)
> (define times 100)
>
> (define b (make-bitvector size))
>
> (do ((t 0 (1+ t)))
>     ((= t times) 'done)
>   (do ((i 0 (1+ i)))
>       ((= i size) 'done)
>     (bitvector-set! b i #t)))

The toplevel references to bitvector-set! don't get cached, so the
bitvector-set! gets looked up every time, and there is some unfortunate
allocation in scm_from_locale_symboln. If you put it in a function,
the variable gets cached, and there's no allocation.

Andy
-- 
http://wingolog.org/



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: High run time variance
  2010-08-05 14:45     ` Andy Wingo
@ 2010-08-05 16:21       ` Luca Saiu
  2010-08-05 17:43         ` Andy Wingo
  0 siblings, 1 reply; 10+ messages in thread
From: Luca Saiu @ 2010-08-05 16:21 UTC (permalink / raw)
  To: Andy Wingo; +Cc: bug-guile

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Andy Wingo wrote:
> The toplevel references to bitvector-set! don't get cached, so the
> bitvector-set! gets looked up every time, and there is some unfortunate
> allocation in scm_from_locale_symboln. If you put it in a function,
> the variable gets cached, and there's no allocation.

So that was a corner case.  Sorry for the noise.

- --
Luca Saiu
http://www-lipn.univ-paris13.fr/~saiu
GNU epsilon: http://www.gnu.org/software/epsilon
Marionnet:   http://www.marionnet.org
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkxa5Q0ACgkQvzOavibF0oYOyQCcCYvBoJXGeEZgXcYt9+jvBwnI
ay8An1lV/YnublXhNxruclRv9H4CvC8T
=R9Q3
-----END PGP SIGNATURE-----



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: High run time variance
  2010-08-05 16:21       ` Luca Saiu
@ 2010-08-05 17:43         ` Andy Wingo
  0 siblings, 0 replies; 10+ messages in thread
From: Andy Wingo @ 2010-08-05 17:43 UTC (permalink / raw)
  To: Luca Saiu; +Cc: bug-guile

On Thu 05 Aug 2010 18:21, Luca Saiu <positron@gnu.org> writes:

> Andy Wingo wrote:
>> The toplevel references to bitvector-set! don't get cached, so the
>> bitvector-set! gets looked up every time, and there is some unfortunate
>> allocation in scm_from_locale_symboln. If you put it in a function,
>> the variable gets cached, and there's no allocation.
>
> So that was a corner case.  Sorry for the noise.

Well, arguably it's still a bit stupid to be slow there, so it could be
our bug...

A
-- 
http://wingolog.org/



^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2010-08-05 17:43 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-03-29 16:09 High run time variance Luca Saiu
2010-03-30  9:16 ` Ludovic Courtès
2010-03-30 12:35   ` Luca Saiu
2010-03-30 13:01     ` Ludovic Courtès
2010-03-30 15:05       ` Luca Saiu
2010-08-04 19:51 ` Andy Wingo
2010-08-05 13:42   ` Luca Saiu
2010-08-05 14:45     ` Andy Wingo
2010-08-05 16:21       ` Luca Saiu
2010-08-05 17:43         ` Andy Wingo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).