From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: ludovic.courtes@laas.fr (Ludovic =?iso-8859-1?Q?Court=E8s?=) Newsgroups: gmane.lisp.guile.devel Subject: Re: [PATCH] Marking weak alist vectors, #2 Date: Wed, 23 Nov 2005 11:19:06 +0100 Organization: LAAS-CNRS Message-ID: <87sltny8dh.fsf_-_@laas.fr> References: <87y83z3vh5.fsf@laas.fr> <4371CF46.4010708@xs4all.nl> <87y83xkcq6.fsf@laas.fr> NNTP-Posting-Host: main.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable X-Trace: sea.gmane.org 1132748378 18736 80.91.229.2 (23 Nov 2005 12:19:38 GMT) X-Complaints-To: usenet@sea.gmane.org NNTP-Posting-Date: Wed, 23 Nov 2005 12:19:38 +0000 (UTC) Cc: guile-devel@gnu.org Original-X-From: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Wed Nov 23 13:19:29 2005 Return-path: Original-Received: from lists.gnu.org ([199.232.76.165]) by ciao.gmane.org with esmtp (Exim 4.43) id 1EetZm-0002rq-P1 for guile-devel@m.gmane.org; Wed, 23 Nov 2005 13:18:15 +0100 Original-Received: from localhost ([127.0.0.1] helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1EetZl-0004gL-GJ for guile-devel@m.gmane.org; Wed, 23 Nov 2005 07:18:13 -0500 Original-Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1EesKB-0008TE-HJ for guile-devel@gnu.org; Wed, 23 Nov 2005 05:58:04 -0500 Original-Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43) id 1Ees5V-0006lS-Bz for guile-devel@gnu.org; Wed, 23 Nov 2005 05:42:56 -0500 Original-Received: from [199.232.76.173] (helo=monty-python.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1Eerj1-0005mj-B8 for guile-devel@gnu.org; Wed, 23 Nov 2005 05:19:42 -0500 Original-Received: from [140.93.0.15] (helo=laas.laas.fr) by monty-python.gnu.org with esmtp (TLS-1.0:DHE_RSA_3DES_EDE_CBC_SHA:24) (Exim 4.34) id 1Eerj0-0000ir-Md for guile-devel@gnu.org; Wed, 23 Nov 2005 05:19:39 -0500 Original-Received: by laas.laas.fr (8.13.1/8.13.4) with SMTP id jANAJQFW007546; Wed, 23 Nov 2005 11:19:29 +0100 (CET) Original-To: Han-Wen Nienhuys X-URL: http://www.laas.fr/~lcourtes/ X-Revolutionary-Date: 3 Frimaire an 214 de la =?iso-8859-1?Q?R=E9volution?= X-PGP-Key-ID: 0xEB1F5364 X-PGP-Key: http://www.laas.fr/~lcourtes/ludovic.asc X-PGP-Fingerprint: 821D 815D 902A 7EAB 5CEE D120 7FBA 3D4F EB1F 5364 X-OS: powerpc-unknown-linux-gnu Mail-Followup-To: Han-Wen Nienhuys , guile-devel@gnu.org In-Reply-To: <87y83xkcq6.fsf@laas.fr> (Ludovic =?iso-8859-1?Q?Court=E8s's?= message of "Wed, 09 Nov 2005 17:28:01 +0100") User-Agent: Gnus/5.110004 (No Gnus v0.4) Emacs/21.4 (gnu/linux) X-Spam-Score: 0 () X-Scanned-By: MIMEDefang at CNRS-LAAS X-MIME-Autoconverted: from 8bit to quoted-printable by laas.laas.fr id jANAJQFW007546 X-BeenThere: guile-devel@gnu.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Developers list for Guile, the GNU extensibility library" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Original-Sender: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Errors-To: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Xref: news.gmane.org gmane.lisp.guile.devel:5418 Archived-At: Hi, Below is an improved version of the patch I originally sent. The goal of the original patch (and associated test case) was to ensure that an object associated to a weak key (resp. a weak value) is GC'd _after_ that key (resp. value). However, as Han-Wen pointed out earlier in this thread, with the original patch, cyclical structures within a weak-key (or weak-value) alist vectors would never become unmarked. By "cyclical structure", I mean something like: key A is associated to B key B is associated to C Here (assuming a weakly-keyed alist vector), object B is used both as a key and a value, resulting in a "cyclical structure". The patch below fixes this. It also fixes `weaks.test' and adds a test case for cyclical structures (called "cascading weak keys die"). Looking at `weaks.test' gives an idea of how hard it is to work around reference "leaks". Certainly, not all of them are actual leaks, but some of them may be so. The funniest example is the `(+ 1 2 3)' call before calls to `(gc)' to ensure that the C stack no longer holds references that were used by the *previous* call of `ceval ()'. Another neat trick is used in "cascading weak keys die": if K2 and K3 are initialized within the init forms of `let', then things won't work (i.e. references to K2 and K3 are held somewhere). I suspect this is a leak in `let' but I was unable to find out what happens exactly. So there is room for further debugging. ;-) I'd be glad to have feedback about all this. Thanks, Ludovic. libguile/ChangeLog 2005-11-23 Ludovic Court=E8s * gc-mark.c (scm_mark_all): Removed C++/C99-style comment. * properties.c (scm_init_properties): Make SCM_PROPERTIES_WHASH a permanent object. =20 * weaks.c (weak_vectors): Initialize it to `SCM_EOL'. (scm_i_mark_weak_vector_non_weaks): Cosmetic changes. (scm_i_remove_weaks): When an element of a pair is weak, add its non-weak element to a "mark queue". Before returning, mark all the elements in this queue so that they will only become unmarked during the next mark phase. test-suite/ChangeLog 2005-11-23 Ludovic Court=E8s * standalone/Makefile.am (TESTS): Added `test-weaks'. (check_PROGRAMS): Likewise. =20 * standalone/test-weaks.c: New file. * tests/weaks.test: Overhauled the test. Added the "cascading weak key dies"test case. =0C --- orig/libguile/gc-mark.c +++ mod/libguile/gc-mark.c @@ -74,7 +74,7 @@ =20 scm_i_init_weak_vectors_for_gc (); scm_i_init_guardians_for_gc (); - =20 + scm_i_clear_mark_space (); =20 /* Mark every thread's stack and registers */ @@ -138,12 +138,9 @@ break; } =20 - //fprintf (stderr, "%d loops\n", loops); - - /* Remove all unmarked entries from the weak vectors. - */ + /* Remove all unmarked entries from the weak vectors. */ scm_i_remove_weaks_from_weak_vectors (); - =20 + /* Bring hashtables upto date. */ scm_i_scan_weak_hashtables (); --- orig/libguile/weaks.c +++ mod/libguile/weaks.c @@ -205,7 +205,11 @@ =20 #define UNMARKED_CELL_P(x) (SCM_NIMP(x) && !SCM_GC_MARK_P (x)) =20 -static SCM weak_vectors; +/* A list of live weak vectors, updated each time a weak vector is marke= d (in + `scm_i_mark_weak_vector ()') and cleared at the beginning of each mar= k + phase (in `scm_mark_all ()' which calls + `scm_i_init_weak_vectors_for_gc ()'). */ +static SCM weak_vectors =3D SCM_EOL; =20 void scm_i_init_weak_vectors_for_gc () @@ -264,13 +268,12 @@ { SCM key =3D SCM_CAR (elt); SCM value =3D SCM_CDR (elt); - =20 + if (!((weak_keys && UNMARKED_CELL_P (key)) || (weak_values && UNMARKED_CELL_P (value)))) { - /* The item should be kept. We need to mark it - recursively. - */=20 + /* The whole pair should be kept, as well as its + CAR and CDR, recursively. */ scm_gc_mark (elt); again =3D 1; } @@ -323,6 +326,8 @@ { SCM *ptr =3D SCM_I_WVECT_GC_WVELTS (w); size_t n =3D SCM_I_WVECT_LENGTH (w); + int weak_keys =3D SCM_IS_WHVEC (w) || SCM_IS_WHVEC_B (w); + int weak_values =3D SCM_IS_WHVEC_V (w) || SCM_IS_WHVEC_B (w); size_t i; =20 if (!SCM_IS_WHVEC_ANY (w)) @@ -334,8 +339,15 @@ else { size_t delta =3D 0; + SCM *mark_queue; + size_t mark_queue_len =3D 0; =20 - for (i =3D 0; i < n; ++i) + if (weak_keys && weak_values) + mark_queue =3D alloca (2 * n * sizeof (*mark_queue)); + else + mark_queue =3D alloca (n * sizeof (*mark_queue)); + + for (i =3D 0; i < n; i++) { SCM alist, *fixup; =20 @@ -343,8 +355,20 @@ alist =3D *fixup; while (scm_is_pair (alist) && !SCM_GC_MARK_P (alist)) { - if (UNMARKED_CELL_P (SCM_CAR (alist))) + SCM elt =3D SCM_CAR (alist); + SCM key =3D SCM_CAR (elt), value =3D SCM_CDR (elt); + + if ((weak_keys && UNMARKED_CELL_P (key)) + || (weak_values && UNMARKED_CELL_P (value))) { + /* Remove this pair from ALIST. However, mark its elements + so that they will only become unreachable during the + next mark phase. */ + if (!weak_values) + mark_queue[mark_queue_len++] =3D value; + if (!weak_keys) + mark_queue[mark_queue_len++] =3D key; + *fixup =3D SCM_CDR (alist); delta++; } @@ -354,11 +378,21 @@ fixup =3D SCM_CDRLOC (alist); } alist =3D *fixup; + + elt =3D key =3D value =3D SCM_BOOL_F; } } + + for (i =3D 0; i < mark_queue_len; i++) + { + scm_gc_mark (mark_queue[i]); + mark_queue[i] =3D SCM_UNSPECIFIED; + } + #if 0 if (delta) - fprintf (stderr, "vector %p, delta %d\n", w, delta); + fprintf (stderr, "vector %p, delta %d, post-marked %d\n", + w, delta, mark_queue_len); #endif SCM_I_SET_WVECT_DELTA (w, delta); } --- orig/test-suite/tests/weaks.test +++ mod/test-suite/tests/weaks.test @@ -36,7 +36,8 @@ (use-modules (test-suite lib) (ice-9 weak-vector)) =20 -;;; Creation functions=20 + +;;; Creation functions =20 =20 (with-test-prefix @@ -105,7 +106,7 @@ ;;; Normal weak vectors (let ((x (make-weak-vector 10 #f)) (bar "bar")) - (with-test-prefix=20 + (with-test-prefix "weak-vector" (pass-if "lives" (begin @@ -123,10 +124,10 @@ (throw 'unresolved)))))) =20 (let ((x (make-weak-key-alist-vector 17)) - (y (make-weak-value-alist-vector 17)) - (z (make-doubly-weak-alist-vector 17)) - (test-key "foo") - (test-value "bar")) + (y (make-weak-value-alist-vector 17)) + (z (make-doubly-weak-alist-vector 17)) + (test-key "foo") + (test-value "bar")) (with-test-prefix "weak-hash" (pass-if "lives" @@ -140,50 +141,71 @@ (hashq-ref y test-key) (hashq-ref z test-key) #t))) + (pass-if "weak-key dies" (begin - (hashq-set! x "this" "is") - (hashq-set! x "a" "test") - (hashq-set! x "of" "the") - (hashq-set! x "emergency" "weak") - (hashq-set! x "key" "hash system") - (gc) - (and=20 - (or (not (hashq-ref x "this")) - (not (hashq-ref x "a")) - (not (hashq-ref x "of")) - (not (hashq-ref x "emergency")) - (not (hashq-ref x "key"))) - (hashq-ref x test-key) - #t))) + ;; We use `string-copy' because the `SCM_IM_BEGIN' structure + ;; holds a reference to the strings that were read so these + ;; strings would not be GC'd until the `begin' expression is. + (hashq-set! x (string-copy "this") "is") + (hashq-set! x (string-copy "a") "test") + (hashq-set! x (string-copy "of") "the") + (hashq-set! x (string-copy "emergency") "weak") + (hashq-set! x (string-copy "key") "hash system") + + ;; This has the effect of cleaning up the C stack because the + ;; stack space between `ceval ()' and `scm_gc ()' may contain + ;; references to the arguments of the last `hashq-set!' call + ;; above. + (+ 1 2 3) + (gc) (gc) (gc) + + ;; X must now contain nothing more than the + ;; `(test-key . test-value)' pair. + (equal? (hash-map->list cons x) + (list (cons test-key test-value))))) + + (pass-if "cascading weak keys die" + (let ((h (make-weak-key-alist-vector 17))) + (let ((k2 #f) + (k3 #f)) + ;; Here, K2 and K3 are used both as keys and values. Garbage + ;; collection of these two objects is expected to occur in a + ;; ``cascading'' fashion: K2 becomes unmarked, so K3 becomes + ;; unmarked. + (set! k2 (string-copy "a shared key/value")) + (set! k3 (string-copy "another shared key/value")) + (hashq-set! h (string-copy "a key") k2) + (hashq-set! h k2 k3) + (hashq-set! h k3 (string-copy "some value")) + (set! k2 #f) + (set! k3 #f) + (+ 1 2 3)) + + (gc) (gc) (gc) + (null? (hash-map->list cons h)))) =20 (pass-if "weak-value dies" (begin - (hashq-set! y "this" "is") - (hashq-set! y "a" "test") - (hashq-set! y "of" "the") - (hashq-set! y "emergency" "weak") - (hashq-set! y "value" "hash system") - (gc) - (and (or (not (hashq-ref y "this")) - (not (hashq-ref y "a")) - (not (hashq-ref y "of")) - (not (hashq-ref y "emergency")) - (not (hashq-ref y "value"))) - (hashq-ref y test-key) - #t))) + (hashq-set! y "this" (string-copy "is")) + (hashq-set! y "a" (string-copy "test")) + (hashq-set! y "of" (string-copy "the")) + (hashq-set! y "emergency" (string-copy "weak")) + (hashq-set! y "value" (string-copy "hash system")) + (+ 1 2 3) + (gc) (gc) (gc) + (equal? (hash-map->list cons y) + (list (cons test-key test-value))))) + (pass-if "doubly-weak dies" (begin - (hashq-set! z "this" "is") - (hashq-set! z "a" "test") - (hashq-set! z "of" "the") - (hashq-set! z "emergency" "weak") - (hashq-set! z "all" "hash system") - (gc) - (and (or (not (hashq-ref z "this")) - (not (hashq-ref z "a")) - (not (hashq-ref z "of")) - (not (hashq-ref z "emergency")) - (not (hashq-ref z "all"))) - (hashq-ref z test-key) - #t))))) + (hashq-set! z (string-copy "this") (string-copy "is")) + (hashq-set! z (string-copy "a") (string-copy "test")) + (hashq-set! z (string-copy "of") (string-copy "the")) + (hashq-set! z (string-copy "emergency") (string-copy "weak")) + (hashq-set! z (string-copy "all") (string-copy "hash system")) + (+ 1 2 3) + (gc) (gc) (gc) + (equal? (hash-map->list cons z) + (list (cons test-key test-value))))))) + --- orig/test-suite/standalone/Makefile.am +++ mod/test-suite/standalone/Makefile.am @@ -74,6 +74,13 @@ check_PROGRAMS +=3D test-conversion TESTS +=3D test-conversion =20 +# test-weaks +test_weaks_SOURCES =3D test-weaks.c +test_weaks_CFLAGS =3D ${test_cflags} +test_weaks_LDADD =3D ${top_builddir}/libguile/libguile.la +check_PROGRAMS +=3D test-weaks +TESTS +=3D test-weaks + all-local: cd ${srcdir} && chmod u+x ${check_SCRIPTS} =20 =0C New file `test-suite/standalone/test-weaks.c': /* Copyright (C) 2005 Free Software Foundation, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public * License as published by the Free Software Foundation; either * version 2.1 of the License, or (at your option) any later version. * * This library is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU * Lesser General Public License for more details. * * You should have received a copy of the GNU Lesser General Public * License along with this library; if not, write to the Free Software * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-13= 01 USA */ /* This test case targets garbage collection of weak hash tables. It doe= s so by using object properties (which are currently implemented using weak hash tables) and verifying that properties attached to an object are always GC'd _after_ the object itself has been freed. In order to do so, `test_weak_gc ()' creates a number of SMOBs. The C structure underlying those SMOBs explicitly maintains a reference coun= ter for each instance. This reference counter is: 1. incremented each time a SMOB is attached to another SMOB (by means = of object properties); 2. decremented each time a SMOB that was attached to another SMOB is freed. For instance if A is attached to B, when B is GC'd, A's reference counter is decremented. The invariant that we check is: any SMOB that is GC'd must have its reference count equal to zero. */ #include "libguile.h" #include #include /* Number of time a `my-object' SMOB was freed. */ static unsigned free_count =3D 0; static scm_t_bits my_object_type =3D 0; static SCM some_property =3D SCM_BOOL_F; static SCM my_object_new_smob_proc =3D SCM_BOOL_F, attach_object_proc =3D SCM_BOOL_F; typedef struct my_object { struct my_object *attached_to; unsigned ref_count; int freed; } my_object_t; static void my_object_init (my_object_t *obj) { obj->attached_to =3D NULL; obj->ref_count =3D 0; obj->freed =3D 0; } static size_t my_object_free (SCM obj) { my_object_t *my_object =3D (my_object_t *)SCM_SMOB_DATA (obj); if (my_object->attached_to) { /* Decrease the reference count of the object MY_OBJECT is attached to. */ assert (my_object->attached_to->ref_count > 0); my_object->attached_to->ref_count--; } /* MY_OBJECT must not have been already freed and there must be no pend= ing references to it. */ assert (!my_object->freed); assert (my_object->ref_count =3D=3D 0); my_object->freed =3D 1; free_count++; return 0; } =0C /* Return a new `my-object' SMOB. */ static SCM my_object_new_smob (void) { my_object_t *obj; obj =3D scm_malloc (sizeof (*obj)); my_object_init (obj); SCM_RETURN_NEWSMOB (my_object_type, obj); } /* Attach TO_BE_ATTACHED to OBJ. */ static SCM attach_object (SCM obj, SCM to_be_attached) { my_object_t *c_obj, *c_to_be_attached; assert (SCM_SMOB_PREDICATE (my_object_type, obj)); assert (SCM_SMOB_PREDICATE (my_object_type, to_be_attached)); /* TO_BE_ATTACHED is attached as a property of OBJ. As such, OBJ will = get GC'd _before_ TO_BE_ATTACHED. */ scm_primitive_property_set_x (some_property, obj, to_be_attached); c_obj =3D (my_object_t *)SCM_SMOB_DATA (obj); c_to_be_attached =3D (my_object_t *)SCM_SMOB_DATA (to_be_attached); /* Because TO_BE_ATTACHED is to be freed _after_ OBJ, we can increase i= ts reference count and it should be zero by the time it is freed. */ c_to_be_attached->ref_count++; c_obj->attached_to =3D c_to_be_attached; return obj; } =0C /* Instantiate a number of `my-object' SMOBs, attached some of them toget= her, invoke the GC, and wait until all of these SMOBs have been freed. */ static void test_weak_gc (void) { #define PAIRS 700 unsigned pair_count, total; for (pair_count =3D 0, total =3D 0; pair_count < PAIRS; pair_count++) { size_t noise, i; SCM obj, attached; obj =3D my_object_new_smob (); attached =3D my_object_new_smob (); #if 0 printf ("%p attached to %p\n", SCM_SMOB_DATA (obj), SCM_SMOB_DATA (attached)); #endif attach_object (obj, attached); total +=3D 2; obj =3D attached =3D SCM_BOOL_F; for (i =3D 0, noise =3D random () % 10; i < noise; i++) { my_object_new_smob (); total++; } } while (free_count < total) { unsigned i; scm_gc (); for (i =3D 0; i < 1000; i++) scm_cons (SCM_I_MAKINUM (0), SCM_I_MAKINUM (0)); } } =0C int main (int argc, char *argv[]) { scm_init_guile (); my_object_type =3D scm_make_smob_type ("test-object", 0); scm_set_smob_free (my_object_type, my_object_free); some_property =3D scm_primitive_make_property (SCM_BOOL_F); my_object_new_smob_proc =3D scm_c_make_gsubr ("make-my-object", 0, 0, 0, my_object_new_smob); attach_object_proc =3D scm_c_make_gsubr ("attach-object!", 0, 0, 0, attach_object); test_weak_gc (); return 0; } _______________________________________________ Guile-devel mailing list Guile-devel@gnu.org http://lists.gnu.org/mailman/listinfo/guile-devel