From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: ludovic.courtes@laas.fr (Ludovic =?iso-8859-1?Q?Court=E8s?=) Newsgroups: gmane.lisp.guile.devel Subject: Guile + Boehm GC: First Remarks Date: Wed, 31 May 2006 10:49:45 +0200 Organization: LAAS-CNRS Message-ID: <877j42r32u.fsf@laas.fr> NNTP-Posting-Host: main.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Trace: sea.gmane.org 1149065460 12996 80.91.229.2 (31 May 2006 08:51:00 GMT) X-Complaints-To: usenet@sea.gmane.org NNTP-Posting-Date: Wed, 31 May 2006 08:51:00 +0000 (UTC) Original-X-From: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Wed May 31 10:50:53 2006 Return-path: Envelope-to: guile-devel@m.gmane.org Original-Received: from lists.gnu.org ([199.232.76.165]) by ciao.gmane.org with esmtp (Exim 4.43) id 1FlMPa-0000Tb-1o for guile-devel@m.gmane.org; Wed, 31 May 2006 10:50:42 +0200 Original-Received: from localhost ([127.0.0.1] helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1FlMPZ-0005FL-Gs for guile-devel@m.gmane.org; Wed, 31 May 2006 04:50:41 -0400 Original-Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1FlMPS-0005EN-Dg for guile-devel@gnu.org; Wed, 31 May 2006 04:50:34 -0400 Original-Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43) id 1FlMPR-0005E2-JC for guile-devel@gnu.org; Wed, 31 May 2006 04:50:34 -0400 Original-Received: from [199.232.76.173] (helo=monty-python.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1FlMPR-0005Dz-Cb for guile-devel@gnu.org; Wed, 31 May 2006 04:50:33 -0400 Original-Received: from [140.93.0.15] (helo=laas.laas.fr) by monty-python.gnu.org with esmtps (TLS-1.0:DHE_RSA_AES_256_CBC_SHA:32) (Exim 4.52) id 1FlMVP-0006lz-U5 for guile-devel@gnu.org; Wed, 31 May 2006 04:56:44 -0400 Original-Received: by laas.laas.fr (8.13.6/8.13.4) with SMTP id k4V8ntON009582; Wed, 31 May 2006 10:49:58 +0200 (CEST) Original-To: Guile-Devel X-URL: http://www.laas.fr/~lcourtes/ X-Revolutionary-Date: 12 Prairial an 214 de la =?iso-8859-1?Q?R=E9volution?= X-PGP-Key-ID: 0xEB1F5364 X-PGP-Key: http://www.laas.fr/~lcourtes/ludovic.asc X-PGP-Fingerprint: 821D 815D 902A 7EAB 5CEE D120 7FBA 3D4F EB1F 5364 X-OS: powerpc-unknown-linux-gnu Mail-Followup-To: Guile-Devel User-Agent: Gnus/5.110004 (No Gnus v0.4) Emacs/21.4 (gnu/linux) X-Spam-Score: 0 () X-Scanned-By: MIMEDefang at CNRS-LAAS X-BeenThere: guile-devel@gnu.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Developers list for Guile, the GNU extensibility library" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Original-Sender: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Errors-To: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Xref: news.gmane.org gmane.lisp.guile.devel:5946 Archived-At: Hello, Some time ago, I started porting Guile to Boehm GC, mostly in order to see whether this was feasible without loss of functionality and while remaining as compatible as possible with the current Guile. The good news is that I got to a "working", almost as feature-full, and mostly-compatible, version of Guile that uses BGC. Below are details about the "almost" and "mostly" qualifiers, as well as pointers to the code. ;-) Feedback is welcome! * Memory Allocation In the current prototype, I changed the semantics of `scm_gc_malloc ()' so that it basically does a `GC_MALLOC ()'. This means that the data returned by `scm_gc_malloc ()' will: 1. be scanned for pointers (Scheme references); 2. automatically be garbage-collected. I added a variant, `scm_gc_malloc_pointerless ()' which corresponds to `GC_MALLOC_ATOMIC ()', i.e., the returned memory area will automatically be garbage-collected but it will not be scanned for pointers. This is useful, e.g., for stringbufs. This is different from the current semantics of `scm_gc_malloc ()' where it just "[i]nforms the GC that the memory [...] can potentially be freed during a GC". However, it is (almost) a backward-compatible change: code that currently allocates memory using `scm_gc_malloc ()' is expected to free it explicitly with `scm_gc_free ()' and this will just work fine. Unfortunately, the combination `scm_gc_unregister_collectable_memory' + `free', which is supposed to work now, will no longer work. Hopefully, this is a rare enough idiom and we can forget about it (?). These two `scm_gc_malloc' functions are very convenient since Guile-aware object implementations (e.g., SMOBs) can use them to allocate memory associated to the Scheme object and they no longer need to explicitly mark/free it (see below). * Marking For SMOBs, use of a custom mark procedure is still supported, but should be useless in most cases: if `scm_gc_malloc ()' is used to allocate the memory associated to a SMOB, then this memory will automatically get marked and will be freed when needed. * Finalization Guile basically offers three finalization mechanisms: guardians, the free procedure for SMOB types, and the free procedure for structs. All of them are implemented using libgc's finalizers, and they can coexist peacefully (e.g., a SMOB whose type defines a free procedure can also be guarded by one or more guardians). However, the implementation assumes that nobody besides guardians, SMOBs, and structs directly use libgc's finalizers (which is a reasonable assumption IMO). * Weak Stuff Guile has two "weak" data types: weak hash tables and weak vectors. (Actually, it also has a third one, weak alist vectors, which is very similar to weak hash tables and thus should vanish --- that's the topic of another discussion) Both of them are implemented using libgc's "disappearing links". More precisely, weak vectors are implemented as "regular vectors" with a specific memory allocation method. Weak hash tables are implemented as non-weak vectors of weak alists. A weak alist is defined as a list of weak pairs (weak-car pairs, weak-cdr pairs, or doubly-weak pairs). libgc's disappearing links work by setting weak pointers to NULL when the objects they point to becomes unreachable. Because of this, an extra level of indirection had to be added for weak vectors and weak pairs: we need to make sure that user code never sees `PTR2SCM (NULL)'. As a consequence, weak vectors can only be accessed via `scm_c_vector_ref ()' (I did not add the extra test to `SCM_SIMPLE_VECTOR_REF ()' so that regular vectors don't suffer from this), and weak hash tables can only be accessed via the authorized functions. Unfortunately, there are pieces of code in Guile that break this encapsulation. In particular, `lookup_interned_symbol ()' does this, as well as `environments.c' (but we don't care about this one since it's unused). I believe the best fix here would be to actually allow for read access to hash tables with no function-call overhead, e.g., by exporting part of the API as inlines, and also by exporting a lower-level API (allowing for access to the raw buckets and bucket vectors). Again, that's the topic for another discussion. ;-) * GC Hooks Only `before-gc-hook' and `after-gc-hook' (which are run before/after any `scm_i_gc ()' call) are honored. The others just cannot be honored because they are too low-level. * GC Stats I haven't thought much about this, but it looks is quite problematic. We probably can't provide the level of details of `gc-stats'. And we cannot either provide per-object-type information as currently provided by `gc-live-object-stats' (this would require registering a finalizer for each and every object in order to update the object count). OTOH, per-object-type information, when needed, can be obtained by writing code that makes use of one of the finalization mechanism. For instance, one can wrap a type constructor, or a class `initialize' method, so that every new instance is added to a guardian; then, said guardian can be periodically queried to update the instance counter. * Static Initialization Using BGC allows for static initialization of Scheme objects (which the current GC does not permit), provided we can use GCC's `alignment' attribute or some equivalent pragma. It's unclear whether statically-initializing a few strings and symbols here and there is likely to have a positive impact on startup time, but perhaps it's worth trying. Conclusion: there are a few areas of incompatibility or feature loss that can hardly be worked around (at first sight) and we'd have to question whether this is a big issue. I haven't done any serious performance measurement, but just to give an idea, the test suite roughly takes as long to run with the current Guile and with GBGC. However, GBGC seems to be slightly slower when running programs that allocate and discard lots of cells: $ time ../pre-inst-guile -c "(use-modules (srfi srfi-1)) (fold (lambda (x y) (list x)) '() (iota 300000)) (exit 0)" real 0m3.550s user 0m3.383s sys 0m0.162s $ time guile-1.8.0 -c "(use-modules (srfi srfi-1)) (fold (lambda (x y) (list x)) '() (iota 300000)) (exit 0)" real 0m2.738s user 0m2.650s sys 0m0.059s One reason for this might be that GBGC adds one function call on the allocation path while Guile's `scm_cell' currently rarely needs to call a function. This will need further benchmarking and investigation. Although "usable", the code still needs more work. I tested it only with libgc 6.6 (from Debian sid) on GNU/Linux (sparc64, powerpc, i386 -- on sparc64, use GCC 4.1 instead of 4.0 since the latter seems to generate invalid code at times). It passes the test suite except for `gc.test' which relies on `gc-live-object-stats' and `environments.test' which I just renamed to `environments.nottest'. You can get it from my Arch archive: $ tla register-archive http://www.laas.fr/~lcourtes/software/arch-2005 $ tla get -A lcourtes@laas.fr--2005-libre guile-core--boehm-gc--1.9 \ guile-core--bgc A tarball and a diff against current CVS HEAD are also available: http://www.laas.fr/~lcourtes/software/arch-2005/guile-core/guile-core--boehm-gc/guile-core--boehm-gc--1.9/patch-35/guile-core--boehm-gc--1.9--patch-35.tar.gz http://www.laas.fr/~lcourtes/software/guile/guile-core--bgc--patch-35.diff.gz SHA1 sums: 6b6867dc637a24d7f9f25f99fb7d81c8a54b9190 guile-core--bgc--patch-35.diff.gz 4a54c2489dbca7e5415be788c3d1fdc9884584f9 guile-core--boehm-gc--1.9--patch-35.tar.gz That's it. Sorry for the long mail and if you have any comment, don't hesitate! Thanks, Ludovic. _______________________________________________ Guile-devel mailing list Guile-devel@gnu.org http://lists.gnu.org/mailman/listinfo/guile-devel