-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hello Ludovic. Ludovic Courtès wrote: > Hi Luca, > > Luca Saiu writes: > >> (define (fibo n) >> (if (< n 2) >> n >> (+ (fibo (- n 1)) >> (fibo (- n 2))))) > > This function is not tail-recursive, so it consumes stack space, which > increases the amount of memory the GC has to scan. My guess is that > this has to do with the time spent in GC. Well, no. That's easy to prove with Boehm's GC, by setting GC_PRINT_STATS=1. The log is quite big, so I'm attaching it separately as fibo-with-GC_PRINT_STATS. Anyway the result is that there is no collection while computing fibo (I'm speaking about 1.9 using the VM), which is to be expected: Bohem's GC can only trigger a collection at allocation time. q-fibo.scm is the file which I had called q.scm in my first report, with n = 35. We can also completely disable the GC by setting GC_DONT_GC, and the effect is still there: [luca@optimum ~/tmp]$ export GC_DONT_GC=1; unset GC_PRINT_STATS; rm -rf ~/.cache/guile/; for i in `seq 10`; do time guile --autocompile q-fibo.scm; echo; done ;;; note: autocompilation is enabled, set GUILE_AUTO_COMPILE=0 ;;; or pass the --no-autocompile argument to disable. ;;; compiling q-fibo.scm ;;; compiled /home/luca/.cache/guile/ccache/2.0-0.P-LE-8/home/luca/tmp/q-fibo.scm.go (begin (fibo n)) returned 9227465 in 2.79 seconds. (begin (fibo n)) returned 9227465 in 2.79 seconds. (begin (fibo n)) returned 9227465 in 2.78 seconds. real 0m8.429s user 0m8.420s sys 0m0.020s (begin (fibo n)) returned 9227465 in 2.59 seconds. (begin (fibo n)) returned 9227465 in 2.59 seconds. (begin (fibo n)) returned 9227465 in 2.6 seconds. real 0m7.792s user 0m7.800s sys 0m0.010s (begin (fibo n)) returned 9227465 in 2.95 seconds. (begin (fibo n)) returned 9227465 in 2.93 seconds. (begin (fibo n)) returned 9227465 in 2.94 seconds. real 0m8.829s user 0m8.820s sys 0m0.010s (begin (fibo n)) returned 9227465 in 2.64 seconds. (begin (fibo n)) returned 9227465 in 2.64 seconds. (begin (fibo n)) returned 9227465 in 2.65 seconds. real 0m7.949s user 0m7.940s sys 0m0.000s (begin (fibo n)) returned 9227465 in 2.73 seconds. (begin (fibo n)) returned 9227465 in 2.74 seconds. (begin (fibo n)) returned 9227465 in 2.73 seconds. real 0m8.216s user 0m8.190s sys 0m0.000s (begin (fibo n)) returned 9227465 in 2.6 seconds. (begin (fibo n)) returned 9227465 in 2.61 seconds. (begin (fibo n)) returned 9227465 in 2.61 seconds. real 0m7.835s user 0m7.820s sys 0m0.020s (begin (fibo n)) returned 9227465 in 2.97 seconds. (begin (fibo n)) returned 9227465 in 2.97 seconds. (begin (fibo n)) returned 9227465 in 2.98 seconds. real 0m8.933s user 0m8.910s sys 0m0.000s (begin (fibo n)) returned 9227465 in 2.61 seconds. (begin (fibo n)) returned 9227465 in 2.6 seconds. (begin (fibo n)) returned 9227465 in 2.61 seconds. real 0m7.826s user 0m7.810s sys 0m0.010s (begin (fibo n)) returned 9227465 in 2.92 seconds. (begin (fibo n)) returned 9227465 in 2.93 seconds. (begin (fibo n)) returned 9227465 in 2.92 seconds. real 0m8.789s user 0m8.780s sys 0m0.000s (begin (fibo n)) returned 9227465 in 2.55 seconds. (begin (fibo n)) returned 9227465 in 2.55 seconds. (begin (fibo n)) returned 9227465 in 2.55 seconds. real 0m7.671s user 0m7.670s sys 0m0.000s > Could you try with a tail-recursive version? Well, of course I've used a different function :-). Here's the source q-f.scm: ================================================================== (define-macro (benchmark . body) ;; Yes, I know that this is ugly and non-hygienic. That's ;; not the problem. `(let* ((t0 (array-ref (times) 0)) (result (begin ,@body)) (t1 (array-ref (times) 0))) (format #t "~a returned ~a in ~a seconds.\n" '(begin ,@body) result (/ (- t1 t0) 100.0)) result)) (define-macro (thrice . body) `(begin ,@body ,@body ,@body)) (define (f n) (if (zero? n) 0 (f (- n 1)))) (define n 100000000) (thrice (benchmark (f n))) ================================================================== So, the result: [luca@optimum ~/tmp]$ unset GC_DONT_GC; unset GC_PRINT_STATS; rm -rf ~/.cache/guile/; for i in `seq 10`; do time guile --autocompile q-f.scm; echo; done ;;; note: autocompilation is enabled, set GUILE_AUTO_COMPILE=0 ;;; or pass the --no-autocompile argument to disable. ;;; compiling q-f.scm ;;; compiled /home/luca/.cache/guile/ccache/2.0-0.P-LE-8/home/luca/tmp/q-f.scm.go (begin (f n)) returned 0 in 5.19 seconds. (begin (f n)) returned 0 in 5.19 seconds. (begin (f n)) returned 0 in 5.18 seconds. real 0m15.641s user 0m15.640s sys 0m0.010s (begin (f n)) returned 0 in 5.21 seconds. (begin (f n)) returned 0 in 5.2 seconds. (begin (f n)) returned 0 in 5.2 seconds. real 0m15.631s user 0m15.620s sys 0m0.000s (begin (f n)) returned 0 in 5.18 seconds. (begin (f n)) returned 0 in 5.18 seconds. (begin (f n)) returned 0 in 5.18 seconds. real 0m15.554s user 0m15.540s sys 0m0.020s (begin (f n)) returned 0 in 5.2 seconds. (begin (f n)) returned 0 in 5.21 seconds. (begin (f n)) returned 0 in 5.2 seconds. real 0m15.621s user 0m15.630s sys 0m0.000s (begin (f n)) returned 0 in 5.05 seconds. (begin (f n)) returned 0 in 5.05 seconds. (begin (f n)) returned 0 in 5.05 seconds. real 0m15.173s user 0m15.180s sys 0m0.010s (begin (f n)) returned 0 in 5.18 seconds. (begin (f n)) returned 0 in 5.17 seconds. (begin (f n)) returned 0 in 5.18 seconds. real 0m15.550s user 0m15.550s sys 0m0.010s (begin (f n)) returned 0 in 5.24 seconds. (begin (f n)) returned 0 in 5.24 seconds. (begin (f n)) returned 0 in 5.24 seconds. real 0m15.744s user 0m15.750s sys 0m0.000s (begin (f n)) returned 0 in 5.25 seconds. (begin (f n)) returned 0 in 5.25 seconds. (begin (f n)) returned 0 in 5.24 seconds. real 0m15.773s user 0m15.770s sys 0m0.010s (begin (f n)) returned 0 in 5.2 seconds. (begin (f n)) returned 0 in 5.2 seconds. (begin (f n)) returned 0 in 5.2 seconds. real 0m15.626s user 0m15.630s sys 0m0.010s (begin (f n)) returned 0 in 5.26 seconds. (begin (f n)) returned 0 in 5.26 seconds. (begin (f n)) returned 0 in 5.26 seconds. real 0m15.793s user 0m15.800s sys 0m0.010s And with a bigger n, n = 1000000000: [luca@optimum ~/tmp]$ unset GC_DONT_GC; unset GC_PRINT_STATS; rm -rf ~/.cache/guile/; for i in `seq 10`; do time guile --autocompile q-f-big-n.scm; echo; done ;;; note: autocompilation is enabled, set GUILE_AUTO_COMPILE=0 ;;; or pass the --no-autocompile argument to disable. ;;; compiling q-f-big-n.scm ;;; compiled /home/luca/.cache/guile/ccache/2.0-0.P-LE-8/home/luca/tmp/q-f-big-n.scm.go (begin (f n)) returned 0 in 51.79 seconds. (begin (f n)) returned 0 in 51.77 seconds. (begin (f n)) returned 0 in 51.77 seconds. real 2m35.411s user 2m35.380s sys 0m0.050s (begin (f n)) returned 0 in 52.34 seconds. (begin (f n)) returned 0 in 52.34 seconds. (begin (f n)) returned 0 in 52.35 seconds. real 2m37.051s user 2m37.020s sys 0m0.030s (begin (f n)) returned 0 in 51.97 seconds. (begin (f n)) returned 0 in 51.97 seconds. (begin (f n)) returned 0 in 51.98 seconds. real 2m35.949s user 2m35.930s sys 0m0.010s (begin (f n)) returned 0 in 52.55 seconds. (begin (f n)) returned 0 in 52.54 seconds. (begin (f n)) returned 0 in 52.54 seconds. real 2m37.651s user 2m37.610s sys 0m0.040s (begin (f n)) returned 0 in 52.02 seconds. (begin (f n)) returned 0 in 52.03 seconds. (begin (f n)) returned 0 in 52.01 seconds. real 2m36.084s user 2m36.060s sys 0m0.040s (begin (f n)) returned 0 in 51.78 seconds. (begin (f n)) returned 0 in 51.79 seconds. (begin (f n)) returned 0 in 51.79 seconds. real 2m35.382s user 2m35.310s sys 0m0.060s (begin (f n)) returned 0 in 51.39 seconds. (begin (f n)) returned 0 in 51.39 seconds. (begin (f n)) returned 0 in 51.39 seconds. real 2m34.191s user 2m34.150s sys 0m0.040s (begin (f n)) returned 0 in 52.25 seconds. (begin (f n)) returned 0 in 52.25 seconds. (begin (f n)) returned 0 in 52.25 seconds. real 2m36.763s user 2m36.730s sys 0m0.040s (begin (f n)) returned 0 in 52.0 seconds. (begin (f n)) returned 0 in 51.99 seconds. (begin (f n)) returned 0 in 52.01 seconds. real 2m36.014s user 2m35.960s sys 0m0.050s (begin (f n)) returned 0 in 49.9 seconds. (begin (f n)) returned 0 in 49.9 seconds. (begin (f n)) returned 0 in 49.89 seconds. real 2m29.714s user 2m29.660s sys 0m0.030s The effect is much weaker in either case, when using tail-recursive functions. This is interesting. Can the problem be related to the alignment of the Guile stack? It's not related to the collector, but it may well be about stack access. Has Guile been tested recently on architectures such as Sparc, requiring aligned access? If it works reliably on Sparc, could the issue still be about alignment, but with respect to cache lines? Notice that this last program should be very cache-friendly. > Thanks, Thank you, - -- Luca Saiu http://www-lipn.univ-paris13.fr/~saiu GNU epsilon: http://www.gnu.org/software/epsilon Marionnet: http://www.marionnet.org -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iEYEARECAAYFAkux7/wACgkQvzOavibF0obEAgCePe61GGtoB24RNFkdM0bankXp o+4AniFih5wvlnXSP5erZXX3csWCs5u/ =6CMC -----END PGP SIGNATURE-----