From: Ludovic Courtès
Newsgroups: gmane.lisp.guile.devel
Subject: CPU and GC cost of bignums
Date: Tue, 04 Feb 2020 17:56:51 +0100
Message-ID: <87imkmwass.fsf@gnu.org>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
To: guile-devel@gnu.org
Cc: Andy Wingo

--=-=-=
Content-Type: text/plain; charset=utf-8

Hello!

(If you’re in a hurry, there is good news at the bottom.)

I noticed that Guile 3.0 (and also 2.2 actually) takes a long time to
compile Guix’s gnu/services/mail.scm, which is macro-heavy, producing
lots of top-level defines.  At -O2 (the default), we have:

--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> ,pr (compile-file "gnu/services/mail.scm")
%     cumulative   self
time   seconds     seconds  procedure
 13.79     19.16     14.46  language/cps/slot-allocation.scm:846:19
 11.05     11.58     11.58  language/cps/intmap.scm:396:0:intmap-ref
  7.56     12.63      7.92  anon #x10768e0
  6.61      7.70      6.92  ice-9/popen.scm:145:0:reap-pipes
  5.50    182.23      5.76  language/cps/intset.scm:470:5:visit-branch
  4.65      4.87      4.87  system/vm/linker.scm:179:0:string-table-intern!
  4.07      5.04      4.26  ice-9/vlist.scm:534:0:vhash-assoc
  3.54      3.93      3.71  language/cps/intmap.scm:184:0:intmap-add!
  3.28      6.65      3.43  language/cps/intset.scm:270:2:adjoin
  2.70      2.82      2.82  language/cps/intset.scm:349:0:intset-ref
  1.80     34.84      1.88  language/cps/intmap.scm:247:2:adjoin
  1.80      5.93      1.88  language/cps/intset.scm:269:0:intset-add
  1.74     18.22      1.83  language/cps/intmap.scm:246:0:intmap-add
  1.22      3.27      1.27  language/cps/intset.scm:382:2:visit-node
  1.16      2.94      1.22  language/cps/intset.scm:551:2:union
  1.11      1.38      1.16  language/cps/intset.scm:204:0:intset-add!
  0.74   1281.59      0.78  language/cps/intset.scm:472:5:visit-branch
[...]
Sample count: 1892
Total time: 104.795540582 seconds (85.091574653 seconds in GC)
--8<---------------cut here---------------end--------------->8---

At -O1:

--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> ,use(system base optimize)
scheme@(guile-user)> ,pr (compile-file "gnu/services/mail.scm" #:opts (optimizations-for-level 1))
%     cumulative   self
time   seconds     seconds  procedure
 11.76    129.78      7.60  language/cps/intset.scm:470:5:visit-branch
 10.77      6.96      6.96  language/cps/intmap.scm:396:0:intmap-ref
 10.43     11.69      6.74  language/cps/slot-allocation.scm:846:19
  8.99      7.39      5.81  ice-9/vlist.scm:534:0:vhash-assoc
  7.55      4.88      4.88  system/vm/linker.scm:179:0:string-table-intern!
  6.44      4.16      4.16  ice-9/popen.scm:145:0:reap-pipes
  4.22      2.80      2.72  language/cps/intmap.scm:184:0:intmap-add!
  1.89      1.86      1.22  language/cps/slot-allocation.scm:681:17
  1.89      1.43      1.22  ice-9/vlist.scm:539:0:vhash-assq
  1.78      1.51      1.15  language/cps/slot-allocation.scm:505:17
  1.22      1.36      0.79  language/cps/slot-allocation.scm:846:19
  1.22      1.08      0.79  language/cps/slot-allocation.scm:505:17
[...]
Sample count: 901
Total time: 64.602907835 seconds (55.87541493 seconds in GC)
--8<---------------cut here---------------end--------------->8---

language/cps/slot-allocation.scm:846:19 corresponds to:

  (define (compute-live-slots* slots label live-vars)
    (intset-fold (lambda (var live)
                   (match (get-slot slots var)
                     (#f live)
                     (slot (add-live-slot slot live))))   ;L846
                 (intmap-ref live-vars label)
                 0))

GC time remains extremely high though, and it too comes from
‘compute-live-slots*’:

--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> (gcprof (lambda () (compile-file "gnu/services/mail.scm" #:opts (optimizations-for-level 1))))
%     cumulative   self
time   seconds     seconds  procedure
 58.14     34.56     34.56  language/cps/slot-allocation.scm:846:19
  8.01      4.76      4.76  language/cps/slot-allocation.scm:681:17
  8.01      4.76      4.76  language/cps/slot-allocation.scm:505:17
  6.98      4.15      4.15  language/cps/slot-allocation.scm:505:17
  6.46      3.84      3.84  language/cps/slot-allocation.scm:846:19
  1.29      0.77      0.77  anon #x23e88e0
[...]
Sample count: 387
Total time: 59.442422179 seconds (50.331193744 seconds in GC)
--8<---------------cut here---------------end--------------->8---

(I believe Guile commit 5675e46410c9a24b05ddf58cbe3b998a4c9cad7c and its
parent were made to optimize the -O1 case back in 2017¹.)

‘compute-live-slots*’ returns an integer, and the allocation comes from
line 846, where we allocate a bignum, in this case a verybignum even.
And for each bignum, we register a finalizer, which itself takes space.

(Time passes…)

The patch below (also for 2.2) gives us better timing:

--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> (gcprof (lambda () (compile-file "gnu/services/mail.scm" #:opts (optimizations-for-level 1))))
%     cumulative   self
time   seconds     seconds  procedure
 18.75      2.49      2.49  anon #x6f58e0
[...]
Sample count: 32
Total time: 13.290191232 seconds (4.584969888 seconds in GC)
--8<---------------cut here---------------end--------------->8---

… but has the disadvantage that it doesn’t work: ‘numbers.test’ fails
badly on bignums.

However, it turns out that simply removing the
‘mp_set_memory_functions’ call works, and the result is:

--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> (gcprof (lambda () (compile-file "gnu/services/mail.scm" #:opts (optimizations-for-level 1))))
%     cumulative   self
time   seconds     seconds  procedure
 20.00      2.60      2.60  anon #x12578e0
 10.00      3.47      1.30  language/cps/intset.scm:270:2:adjoin
  6.67      0.87      0.87  ice-9/boot-9.scm:2201:0:%load-announce
  6.67      0.87      0.87  anon #x1253160
  3.33    146.48      0.43  ice-9/threads.scm:388:4
  3.33      1.30      0.43  language/cps/intset.scm:759:8:lp
  3.33      0.87      0.43  system/vm/assembler.scm:2854:4:write-die
  3.33      0.43      0.43  language/cps/slot-allocation.scm:843:19
  3.33      0.43      0.43  language/cps/intmap.scm:167:0:persistent-intmap
[...]
Sample count: 30
Total time: 13.001181844 seconds (4.278418897 seconds in GC)
--8<---------------cut here---------------end--------------->8---

That is about 4.5 times faster than what we have now (13 seconds
instead of 59 seconds for the same -O1 run).

Andy, anything against removing that ‘mp_set_memory_functions’ call
altogether, or having ‘scm_install_gmp_memory_functions’ default to 0?

Thanks,
Ludo’.

¹ https://lists.gnu.org/archive/html/guile-devel/2017-10/msg00048.html

--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline

diff --git a/libguile/numbers.c b/libguile/numbers.c
index d1b463358..cf21a86ca 100644
--- a/libguile/numbers.c
+++ b/libguile/numbers.c
@@ -1,4 +1,4 @@
-/* Copyright 1995-2016,2018-2019
+/* Copyright 1995-2016,2018-2020
      Free Software Foundation, Inc.

    Portions Copyright 1990-1993 by AT&T Bell Laboratories and Bellcore.
@@ -218,16 +218,6 @@ static mpz_t z_negative_one;



-/* Clear the `mpz_t' embedded in bignum PTR.  */
-static void
-finalize_bignum (void *ptr, void *data)
-{
-  SCM bignum;
-
-  bignum = SCM_PACK_POINTER (ptr);
-  mpz_clear (SCM_I_BIG_MPZ (bignum));
-}
-
 /* The next three functions (custom_libgmp_*) are passed to
    mp_set_memory_functions (in GMP) so that memory used by the digits
    themselves is known to the garbage collector.  This is needed so
@@ -237,19 +227,20 @@ finalize_bignum (void *ptr, void *data)
 static void *
 custom_gmp_malloc (size_t alloc_size)
 {
-  return scm_malloc (alloc_size);
+  return scm_gc_malloc_pointerless (alloc_size, "GMP");
 }

 static void *
 custom_gmp_realloc (void *old_ptr, size_t old_size, size_t new_size)
 {
-  return scm_realloc (old_ptr, new_size);
+  return scm_gc_realloc (old_ptr, old_size, new_size, "GMP");
 }

 static void
 custom_gmp_free (void *ptr, size_t size)
 {
-  free (ptr);
+  /* Do nothing: all memory allocated by GMP is under GC control and
+     will be freed when needed.  */
 }


@@ -264,8 +255,6 @@ make_bignum (void)
                                  "bignum");
   p[0] = scm_tc16_big;

-  scm_i_set_finalizer (p, finalize_bignum, NULL);
-
   return SCM_PACK (p);
 }

--=-=-=--
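
As a rough illustration of why the fold at line 846 allocates so much,
here is a small, self-contained C model.  It is only a sketch under
assumptions that the message does not confirm: that ‘add-live-slot’ sets
bit SLOT in an integer that is never mutated in place, so each step
builds a new bignum as wide as the highest live slot.  The names
‘bigmask’, ‘alloc_stats’, ‘bigmask_with_bit’ and ‘fold_live_slots’ are
hypothetical; the model uses plain calloc rather than GMP or the GC, and
it ignores Guile’s fixnum fast path for small values.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an immutable bignum used as a bit set:
   `words' 64-bit limbs, never modified after creation.  */
struct bigmask
{
  size_t words;
  uint64_t *limbs;
};

/* Allocation statistics gathered by the model.  */
struct alloc_stats
{
  size_t count;   /* number of bignum-like objects created */
  size_t bytes;   /* total limb bytes requested */
};

/* Return a fresh mask equal to OLD with bit SLOT set, the way a
   functional `add-live-slot' might behave (an assumption).  Every call
   allocates and copies the whole limb array.  */
static struct bigmask
bigmask_with_bit (const struct bigmask *old, unsigned slot,
                  struct alloc_stats *stats)
{
  struct bigmask result;
  size_t needed = (size_t) slot / 64 + 1;

  result.words = needed > old->words ? needed : old->words;
  result.limbs = calloc (result.words, sizeof (uint64_t));
  if (result.limbs == NULL)
    abort ();

  stats->count++;
  stats->bytes += result.words * sizeof (uint64_t);

  if (old->words > 0)
    memcpy (result.limbs, old->limbs, old->words * sizeof (uint64_t));
  result.limbs[slot / 64] |= (uint64_t) 1 << (slot % 64);
  return result;
}

/* Model one fold over slots 0..NSLOTS-1 and report how much was
   allocated along the way.  */
static struct alloc_stats
fold_live_slots (unsigned nslots)
{
  struct alloc_stats stats = { 0, 0 };
  struct bigmask live = { 0, NULL };

  for (unsigned slot = 0; slot < nslots; slot++)
    {
      struct bigmask next = bigmask_with_bit (&live, slot, &stats);
      free (live.limbs);   /* the previous value is garbage right away */
      live = next;
    }

  /* Sanity check: the final mask is just wide enough for the top slot.  */
  assert (nslots == 0 || live.words == (size_t) (nslots - 1) / 64 + 1);
  free (live.limbs);
  return stats;
}
```

Under these assumptions, fold_live_slots (1000) reports 1000
allocations totalling 66560 bytes in order to build a final 128-byte
mask.  In Guile before the patch, each of those short-lived bignums
would additionally have a finalizer registered for it.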