unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed
From: "Ludovic Courtès" <ludo@gnu.org>
To: guile-devel@gnu.org
Cc: Andy Wingo <wingo@igalia.com>
Subject: CPU and GC cost of bignums
Date: Tue, 04 Feb 2020 17:56:51 +0100	[thread overview]
Message-ID: <87imkmwass.fsf@gnu.org> (raw)

[-- Attachment #1: Type: text/plain, Size: 6755 bytes --]

Hello!

(If you’re in a hurry, there are good news at the bottom.)

I noticed that 3.0 (and also 2.2 actually) takes a long time to compile
Guix’ gnu/services/mail.scm, which is macro-heavy, producing lots of
top-level defines.

At -O2 (the default), we have:

--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> ,pr (compile-file "gnu/services/mail.scm")
%     cumulative   self             
time   seconds     seconds  procedure
 13.79     19.16     14.46  language/cps/slot-allocation.scm:846:19
 11.05     11.58     11.58  language/cps/intmap.scm:396:0:intmap-ref
  7.56     12.63      7.92  anon #x10768e0
  6.61      7.70      6.92  ice-9/popen.scm:145:0:reap-pipes
  5.50    182.23      5.76  language/cps/intset.scm:470:5:visit-branch
  4.65      4.87      4.87  system/vm/linker.scm:179:0:string-table-intern!
  4.07      5.04      4.26  ice-9/vlist.scm:534:0:vhash-assoc
  3.54      3.93      3.71  language/cps/intmap.scm:184:0:intmap-add!
  3.28      6.65      3.43  language/cps/intset.scm:270:2:adjoin
  2.70      2.82      2.82  language/cps/intset.scm:349:0:intset-ref
  1.80     34.84      1.88  language/cps/intmap.scm:247:2:adjoin
  1.80      5.93      1.88  language/cps/intset.scm:269:0:intset-add
  1.74     18.22      1.83  language/cps/intmap.scm:246:0:intmap-add
  1.22      3.27      1.27  language/cps/intset.scm:382:2:visit-node
  1.16      2.94      1.22  language/cps/intset.scm:551:2:union
  1.11      1.38      1.16  language/cps/intset.scm:204:0:intset-add!
  0.74   1281.59      0.78  language/cps/intset.scm:472:5:visit-branch

[...]

Sample count: 1892
Total time: 104.795540582 seconds (85.091574653 seconds in GC)
--8<---------------cut here---------------end--------------->8---

At -O1:

--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> ,use(system base optimize)
scheme@(guile-user)> ,pr (compile-file "gnu/services/mail.scm" #:opts (optimizations-for-level 1))
%     cumulative   self             
time   seconds     seconds  procedure
 11.76    129.78      7.60  language/cps/intset.scm:470:5:visit-branch
 10.77      6.96      6.96  language/cps/intmap.scm:396:0:intmap-ref
 10.43     11.69      6.74  language/cps/slot-allocation.scm:846:19
  8.99      7.39      5.81  ice-9/vlist.scm:534:0:vhash-assoc
  7.55      4.88      4.88  system/vm/linker.scm:179:0:string-table-intern!
  6.44      4.16      4.16  ice-9/popen.scm:145:0:reap-pipes
  4.22      2.80      2.72  language/cps/intmap.scm:184:0:intmap-add!
  1.89      1.86      1.22  language/cps/slot-allocation.scm:681:17
  1.89      1.43      1.22  ice-9/vlist.scm:539:0:vhash-assq
  1.78      1.51      1.15  language/cps/slot-allocation.scm:505:17
  1.22      1.36      0.79  language/cps/slot-allocation.scm:846:19
  1.22      1.08      0.79  language/cps/slot-allocation.scm:505:17

[...]

Sample count: 901
Total time: 64.602907835 seconds (55.87541493 seconds in GC)
--8<---------------cut here---------------end--------------->8---

language/cps/slot-allocation.scm:846:19 corresponds to:

    (define (compute-live-slots* slots label live-vars)
      (intset-fold (lambda (var live)
                     (match (get-slot slots var)
                       (#f live)
                       (slot (add-live-slot slot live))))    ;L846
                   (intmap-ref live-vars label)
                   0))

The GC times remain extremely high though, and it’s also coming from
‘compute-live-slots*’:

--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> (gcprof (lambda () (compile-file "gnu/services/mail.scm" #:opts (optimizations-for-level 1))))
%     cumulative   self             
time   seconds     seconds  procedure
 58.14     34.56     34.56  language/cps/slot-allocation.scm:846:19
  8.01      4.76      4.76  language/cps/slot-allocation.scm:681:17
  8.01      4.76      4.76  language/cps/slot-allocation.scm:505:17
  6.98      4.15      4.15  language/cps/slot-allocation.scm:505:17
  6.46      3.84      3.84  language/cps/slot-allocation.scm:846:19
  1.29      0.77      0.77  anon #x23e88e0

[...]

Sample count: 387
Total time: 59.442422179 seconds (50.331193744 seconds in GC)
--8<---------------cut here---------------end--------------->8---

(I believe Guile commit 5675e46410c9a24b05ddf58cbe3b998a4c9cad7c and its
parent were made to optimize the -O1 case back in 2017¹.)

‘compute-live-slots*’ returns an integer and the allocation comes from
line 846, where we allocate a bignum, in this case a verybignum even.
And for each bignum, we register a finalizer, which itself takes space.

(Time passes…)

The patch below (also for 2.2) gives us better timing:

--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> (gcprof (lambda () (compile-file "gnu/services/mail.scm" #:opts (optimizations-for-level 1))))
%     cumulative   self             
time   seconds     seconds  procedure
 18.75      2.49      2.49  anon #x6f58e0

[...]

Sample count: 32
Total time: 13.290191232 seconds (4.584969888 seconds in GC)
--8<---------------cut here---------------end--------------->8---

… but has the disadvantage that it doesn’t work: ‘numbers.test’ fails
badly on bignums.

However, it turns out that removing the ‘mp_set_memory_functions’ call
works, and the result is:

--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> (gcprof (lambda () (compile-file "gnu/services/mail.scm" #:opts (optimizations-for-level 1))))
%     cumulative   self             
time   seconds     seconds  procedure
 20.00      2.60      2.60  anon #x12578e0
 10.00      3.47      1.30  language/cps/intset.scm:270:2:adjoin
  6.67      0.87      0.87  ice-9/boot-9.scm:2201:0:%load-announce
  6.67      0.87      0.87  anon #x1253160
  3.33    146.48      0.43  ice-9/threads.scm:388:4
  3.33      1.30      0.43  language/cps/intset.scm:759:8:lp
  3.33      0.87      0.43  system/vm/assembler.scm:2854:4:write-die
  3.33      0.43      0.43  language/cps/slot-allocation.scm:843:19
  3.33      0.43      0.43  language/cps/intmap.scm:167:0:persistent-intmap

[...]

Sample count: 30
Total time: 13.001181844 seconds (4.278418897 seconds in GC)
--8<---------------cut here---------------end--------------->8---

It’s 4.5 times faster than what we have now.

Andy, anything against removing that ‘mp_set_memory_functions’ call
altogether, or having ‘scm_install_gmp_memory_functions’ default to 0?

Thanks,
Ludo’.

¹ https://lists.gnu.org/archive/html/guile-devel/2017-10/msg00048.html


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Type: text/x-patch, Size: 1559 bytes --]

diff --git a/libguile/numbers.c b/libguile/numbers.c
index d1b463358..cf21a86ca 100644
--- a/libguile/numbers.c
+++ b/libguile/numbers.c
@@ -1,4 +1,4 @@
-/* Copyright 1995-2016,2018-2019
+/* Copyright 1995-2016,2018-2020
      Free Software Foundation, Inc.
 
    Portions Copyright 1990-1993 by AT&T Bell Laboratories and Bellcore.
@@ -218,16 +218,6 @@ static mpz_t z_negative_one;
 
 \f
 
-/* Clear the `mpz_t' embedded in bignum PTR.  */
-static void
-finalize_bignum (void *ptr, void *data)
-{
-  SCM bignum;
-
-  bignum = SCM_PACK_POINTER (ptr);
-  mpz_clear (SCM_I_BIG_MPZ (bignum));
-}
-
 /* The next three functions (custom_libgmp_*) are passed to
    mp_set_memory_functions (in GMP) so that memory used by the digits
    themselves is known to the garbage collector.  This is needed so
@@ -237,19 +227,20 @@ finalize_bignum (void *ptr, void *data)
 static void *
 custom_gmp_malloc (size_t alloc_size)
 {
-  return scm_malloc (alloc_size);
+  return scm_gc_malloc_pointerless (alloc_size, "GMP");
 }
 
 static void *
 custom_gmp_realloc (void *old_ptr, size_t old_size, size_t new_size)
 {
-  return scm_realloc (old_ptr, new_size);
+  return scm_gc_realloc (old_ptr, old_size, new_size, "GMP");
 }
 
 static void
 custom_gmp_free (void *ptr, size_t size)
 {
-  free (ptr);
+  /* Do nothing: all memory allocated by GMP is under GC control and
+     will be freed when needed.  */
 }
 
 
@@ -264,8 +255,6 @@ make_bignum (void)
 				 "bignum");
   p[0] = scm_tc16_big;
 
-  scm_i_set_finalizer (p, finalize_bignum, NULL);
-
   return SCM_PACK (p);
 }

             reply	other threads:[~2020-02-04 16:56 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-04 16:56 Ludovic Courtès [this message]
2020-02-05 16:29 ` CPU and GC cost of bignums Ludovic Courtès
2020-02-05 21:28   ` Hans Åberg
2020-02-06  9:37   ` Andy Wingo
2020-02-06 13:37     ` Ludovic Courtès
2020-02-08 14:05       ` Ludovic Courtès

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/guile/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87imkmwass.fsf@gnu.org \
    --to=ludo@gnu.org \
    --cc=guile-devel@gnu.org \
    --cc=wingo@igalia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).