unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed
From: ludovic.courtes@laas.fr (Ludovic Courtès)
Cc: guile-devel@gnu.org
Subject: Re: [PATCH] Marking weak alist vectors, #2
Date: Wed, 23 Nov 2005 11:19:06 +0100	[thread overview]
Message-ID: <87sltny8dh.fsf_-_@laas.fr> (raw)
In-Reply-To: <87y83xkcq6.fsf@laas.fr> (Ludovic Courtès's message of "Wed, 09 Nov 2005 17:28:01 +0100")

Hi,

Below is an improved version of the patch I originally sent.  The goal
of the original patch (and associated test case) was to ensure that an
object associated to a weak key (resp. a weak value) is GC'd _after_
that key (resp. value).

However, as Han-Wen pointed out earlier in this thread, with the
original patch, cyclical structures within a weak-key (or weak-value)
alist vectors would never become unmarked.  By "cyclical structure", I
mean something like:

  key A is associated to B
  key B is associated to C

Here (assuming a weakly-keyed alist vector), object B is used both as a
key and a value, resulting in a "cyclical structure".

The patch below fixes this.  It also fixes `weaks.test' and adds a test
case for cyclical structures (called "cascading weak keys die").

Looking at `weaks.test' gives an idea of how hard it is to work around
reference "leaks".  Certainly, not all of them are actual leaks, but
some of them may be so.  The funniest example is the `(+ 1 2 3)' call
before calls to `(gc)' to ensure that the C stack no longer holds
references that were used by the *previous* call of `ceval ()'.

Another neat trick is used in "cascading weak keys die": if K2 and K3
are initialized within the init forms of `let', then things won't work
(i.e. references to K2 and K3 are held somewhere).  I suspect this is a
leak in `let' but I was unable to find out what happens exactly.  So
there is room for further debugging.  ;-)

I'd be glad to have feedback about all this.

Thanks,
Ludovic.


libguile/ChangeLog

2005-11-23  Ludovic Courtès  <ludovic.courtes@laas.fr>

	* gc-mark.c (scm_mark_all): Removed C++/C99-style comment.

	* properties.c (scm_init_properties): Make SCM_PROPERTIES_WHASH
	a permanent object.
    
	* weaks.c (weak_vectors): Initialize it to `SCM_EOL'.
	(scm_i_mark_weak_vector_non_weaks): Cosmetic changes.
	(scm_i_remove_weaks): When an element of a pair is weak, add its
	non-weak element to a "mark queue".  Before returning, mark all
	the elements in this queue so that they will only become
	unmarked during the next mark phase.


test-suite/ChangeLog

2005-11-23  Ludovic Courtès  <ludovic.courtes@laas.fr>

	* standalone/Makefile.am (TESTS): Added `test-weaks'.
	(check_PROGRAMS): Likewise.
    
	* standalone/test-weaks.c: New file.

	* tests/weaks.test: Overhauled the test.  Added the "cascading
	weak key dies"test case.


\f
--- orig/libguile/gc-mark.c
+++ mod/libguile/gc-mark.c
@@ -74,7 +74,7 @@
 
   scm_i_init_weak_vectors_for_gc ();
   scm_i_init_guardians_for_gc ();
-  
+
   scm_i_clear_mark_space ();
   
   /* Mark every thread's stack and registers */
@@ -138,12 +138,9 @@
       break;
     }
 
-  //fprintf (stderr, "%d loops\n", loops);
-
-  /* Remove all unmarked entries from the weak vectors.
-   */
+  /* Remove all unmarked entries from the weak vectors.  */
   scm_i_remove_weaks_from_weak_vectors ();
-  
+
   /* Bring hashtables upto date.
    */
   scm_i_scan_weak_hashtables ();


--- orig/libguile/weaks.c
+++ mod/libguile/weaks.c
@@ -205,7 +205,11 @@
 
 #define UNMARKED_CELL_P(x) (SCM_NIMP(x) && !SCM_GC_MARK_P (x))
 
-static SCM weak_vectors;
+/* A list of live weak vectors, updated each time a weak vector is marked (in
+   `scm_i_mark_weak_vector ()') and cleared at the beginning of each mark
+   phase (in `scm_mark_all ()' which calls
+   `scm_i_init_weak_vectors_for_gc ()').  */
+static SCM weak_vectors = SCM_EOL;
 
 void
 scm_i_init_weak_vectors_for_gc ()
@@ -264,13 +268,12 @@
 		    {
 		      SCM key = SCM_CAR (elt);
 		      SCM value = SCM_CDR (elt);
-		  
+
 		      if (!((weak_keys && UNMARKED_CELL_P (key))
 			    || (weak_values && UNMARKED_CELL_P (value))))
 			{
-			  /* The item should be kept.  We need to mark it
-			     recursively.
-			  */ 
+			  /* The whole pair should be kept, as well as its
+			     CAR and CDR, recursively.  */
 			  scm_gc_mark (elt);
 			  again = 1;
 			}
@@ -323,6 +326,8 @@
 {
   SCM *ptr = SCM_I_WVECT_GC_WVELTS (w);
   size_t n = SCM_I_WVECT_LENGTH (w);
+  int weak_keys = SCM_IS_WHVEC (w) || SCM_IS_WHVEC_B (w);
+  int weak_values = SCM_IS_WHVEC_V (w) || SCM_IS_WHVEC_B (w);
   size_t i;
 
   if (!SCM_IS_WHVEC_ANY (w))
@@ -334,8 +339,15 @@
   else
     {
       size_t delta = 0;
+      SCM *mark_queue;
+      size_t mark_queue_len = 0;
 
-      for (i = 0; i < n; ++i)
+      if (weak_keys && weak_values)
+	mark_queue = alloca (2 * n * sizeof (*mark_queue));
+      else
+	mark_queue = alloca (n * sizeof (*mark_queue));
+
+      for (i = 0; i < n; i++)
 	{
 	  SCM alist, *fixup;
 
@@ -343,8 +355,20 @@
 	  alist = *fixup;
 	  while (scm_is_pair (alist) && !SCM_GC_MARK_P (alist))
 	    {
-	      if (UNMARKED_CELL_P (SCM_CAR (alist)))
+	      SCM elt = SCM_CAR (alist);
+	      SCM key = SCM_CAR (elt), value = SCM_CDR (elt);
+
+	      if ((weak_keys && UNMARKED_CELL_P (key))
+		  || (weak_values && UNMARKED_CELL_P (value)))
 		{
+		  /* Remove this pair from ALIST.  However, mark its elements
+		     so that they will only become unreachable during the
+		     next mark phase.  */
+		  if (!weak_values)
+		    mark_queue[mark_queue_len++] = value;
+		  if (!weak_keys)
+		    mark_queue[mark_queue_len++] = key;
+
 		  *fixup = SCM_CDR (alist);
 		  delta++;
 		}
@@ -354,11 +378,21 @@
 		  fixup = SCM_CDRLOC (alist);
 		}
 	      alist = *fixup;
+
+	      elt = key = value = SCM_BOOL_F;
 	    }
 	}
+
+      for (i = 0; i < mark_queue_len; i++)
+	{
+	  scm_gc_mark (mark_queue[i]);
+	  mark_queue[i] = SCM_UNSPECIFIED;
+	}
+
 #if 0
       if (delta)
-	fprintf (stderr, "vector %p, delta %d\n", w, delta);
+	fprintf (stderr, "vector %p, delta %d, post-marked %d\n",
+		 w, delta, mark_queue_len);
 #endif
       SCM_I_SET_WVECT_DELTA (w, delta);
     }


--- orig/test-suite/tests/weaks.test
+++ mod/test-suite/tests/weaks.test
@@ -36,7 +36,8 @@
 (use-modules (test-suite lib)
 	     (ice-9 weak-vector))
 
-;;; Creation functions 
+
+;;; Creation functions
 
 
 (with-test-prefix
@@ -105,7 +106,7 @@
 ;;; Normal weak vectors
 (let ((x (make-weak-vector 10 #f))
       (bar "bar"))
-  (with-test-prefix 
+  (with-test-prefix
    "weak-vector"
    (pass-if "lives"
 	    (begin
@@ -123,10 +124,10 @@
 		  (throw 'unresolved))))))
 
  (let ((x (make-weak-key-alist-vector 17))
-      (y (make-weak-value-alist-vector 17))
-      (z (make-doubly-weak-alist-vector 17))
-      (test-key "foo")
-      (test-value "bar"))
+       (y (make-weak-value-alist-vector 17))
+       (z (make-doubly-weak-alist-vector 17))
+       (test-key "foo")
+       (test-value "bar"))
   (with-test-prefix
    "weak-hash"
    (pass-if "lives"
@@ -140,50 +141,71 @@
 		   (hashq-ref y test-key)
 		   (hashq-ref z test-key)
 		   #t)))
+
    (pass-if "weak-key dies"
 	    (begin
-	      (hashq-set! x "this" "is")
-	      (hashq-set! x "a" "test")
-	      (hashq-set! x "of" "the")
-	      (hashq-set! x "emergency" "weak")
-	      (hashq-set! x "key" "hash system")
-	      (gc)
-	      (and 
-	       (or (not (hashq-ref x "this"))
-		   (not (hashq-ref x "a"))
-		   (not (hashq-ref x "of"))
-		   (not (hashq-ref x "emergency"))
-		   (not (hashq-ref x "key")))
-	       (hashq-ref x test-key)
-	       #t)))
+	      ;; We use `string-copy' because the `SCM_IM_BEGIN' structure
+	      ;; holds a reference to the strings that were read so these
+	      ;; strings would not be GC'd until the `begin' expression is.
+	      (hashq-set! x (string-copy "this") "is")
+	      (hashq-set! x (string-copy "a") "test")
+	      (hashq-set! x (string-copy "of") "the")
+	      (hashq-set! x (string-copy "emergency") "weak")
+	      (hashq-set! x (string-copy "key") "hash system")
+
+	      ;; This has the effect of cleaning up the C stack because the
+	      ;; stack space between `ceval ()' and `scm_gc ()' may contain
+	      ;; references to the arguments of the last `hashq-set!' call
+	      ;; above.
+	      (+ 1 2 3)
+	      (gc) (gc) (gc)
+
+	      ;; X must now contain nothing more than the
+	      ;; `(test-key . test-value)' pair.
+	      (equal? (hash-map->list cons x)
+		      (list (cons test-key test-value)))))
+
+   (pass-if "cascading weak keys die"
+	    (let ((h (make-weak-key-alist-vector 17)))
+	      (let ((k2 #f)
+		    (k3 #f))
+		;; Here, K2 and K3 are used both as keys and values.  Garbage
+		;; collection of these two objects is expected to occur in a
+		;; ``cascading'' fashion: K2 becomes unmarked, so K3 becomes
+		;; unmarked.
+		(set! k2 (string-copy "a shared key/value"))
+		(set! k3 (string-copy "another shared key/value"))
+		(hashq-set! h (string-copy "a key") k2)
+		(hashq-set! h k2 k3)
+		(hashq-set! h k3 (string-copy "some value"))
+		(set! k2 #f)
+		(set! k3 #f)
+		(+ 1 2 3))
+
+	      (gc) (gc) (gc)
+	      (null? (hash-map->list cons h))))
 
    (pass-if "weak-value dies"
 	    (begin
-	      (hashq-set! y "this" "is")
-	      (hashq-set! y "a" "test")
-	      (hashq-set! y "of" "the")
-	      (hashq-set! y "emergency" "weak")
-	      (hashq-set! y "value" "hash system")
-	      (gc)
-	      (and (or (not (hashq-ref y "this"))
-		       (not (hashq-ref y "a"))
-		       (not (hashq-ref y "of"))
-		       (not (hashq-ref y "emergency"))
-		       (not (hashq-ref y "value")))
-		   (hashq-ref y test-key)
-		   #t)))
+	      (hashq-set! y "this" (string-copy "is"))
+	      (hashq-set! y "a" (string-copy "test"))
+	      (hashq-set! y "of" (string-copy "the"))
+	      (hashq-set! y "emergency" (string-copy "weak"))
+	      (hashq-set! y "value" (string-copy "hash system"))
+	      (+ 1 2 3)
+	      (gc) (gc) (gc)
+	      (equal? (hash-map->list cons y)
+			   (list (cons test-key test-value)))))
+
    (pass-if "doubly-weak dies"
 	    (begin
-	      (hashq-set! z "this" "is")
-	      (hashq-set! z "a" "test")
-	      (hashq-set! z "of" "the")
-	      (hashq-set! z "emergency" "weak")
-	      (hashq-set! z "all" "hash system")
-	      (gc)
-	      (and (or (not (hashq-ref z "this"))
-		       (not (hashq-ref z "a"))
-		       (not (hashq-ref z "of"))
-		       (not (hashq-ref z "emergency"))
-		       (not (hashq-ref z "all")))
-		   (hashq-ref z test-key)
-		   #t)))))
+	      (hashq-set! z (string-copy "this") (string-copy "is"))
+	      (hashq-set! z (string-copy "a") (string-copy "test"))
+	      (hashq-set! z (string-copy "of") (string-copy "the"))
+	      (hashq-set! z (string-copy "emergency") (string-copy "weak"))
+	      (hashq-set! z (string-copy "all") (string-copy "hash system"))
+	      (+ 1 2 3)
+	      (gc) (gc) (gc)
+	      (equal? (hash-map->list cons z)
+		      (list (cons test-key test-value)))))))
+


--- orig/test-suite/standalone/Makefile.am
+++ mod/test-suite/standalone/Makefile.am
@@ -74,6 +74,13 @@
 check_PROGRAMS += test-conversion
 TESTS += test-conversion
 
+# test-weaks
+test_weaks_SOURCES = test-weaks.c
+test_weaks_CFLAGS = ${test_cflags}
+test_weaks_LDADD = ${top_builddir}/libguile/libguile.la
+check_PROGRAMS += test-weaks
+TESTS += test-weaks
+
 all-local:
 	cd ${srcdir} && chmod u+x ${check_SCRIPTS}
 



\f
New file `test-suite/standalone/test-weaks.c':

/* Copyright (C) 2005 Free Software Foundation, Inc.
 *
 * This library is free software; you can redistribute it and/or
 * modify it under the terms of the GNU Lesser General Public
 * License as published by the Free Software Foundation; either
 * version 2.1 of the License, or (at your option) any later version.
 *
 * This library is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 * Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public
 * License along with this library; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 */


/* This test case targets garbage collection of weak hash tables.  It does so
   by using object properties (which are currently implemented using weak
   hash tables) and verifying that properties attached to an object are
   always GC'd _after_ the object itself has been freed.

   In order to do so, `test_weak_gc ()' creates a number of SMOBs.  The C
   structure underlying those SMOBs explicitly maintains a reference counter
   for each instance.  This reference counter is:

   1. incremented each time a SMOB is attached to another SMOB (by means of
      object properties);

   2. decremented each time a SMOB that was attached to another SMOB is
      freed.  For instance if A is attached to B, when B is GC'd, A's
      reference counter is decremented.

   The invariant that we check is: any SMOB that is GC'd must have its
   reference count equal to zero.  */


#include "libguile.h"

#include <stdio.h>
#include <assert.h>

/* Number of time a `my-object' SMOB was freed.  */
static unsigned free_count = 0;

static scm_t_bits my_object_type = 0;
static SCM        some_property = SCM_BOOL_F;
static SCM my_object_new_smob_proc = SCM_BOOL_F,
  attach_object_proc = SCM_BOOL_F;

typedef struct my_object
{
  struct my_object *attached_to;

  unsigned ref_count;
  int      freed;
} my_object_t;


static void
my_object_init (my_object_t *obj)
{
  obj->attached_to = NULL;
  obj->ref_count = 0;
  obj->freed = 0;
}

static size_t
my_object_free (SCM obj)
{
  my_object_t *my_object = (my_object_t *)SCM_SMOB_DATA (obj);

  if (my_object->attached_to)
    {
      /* Decrease the reference count of the object MY_OBJECT is attached
	 to.  */
      assert (my_object->attached_to->ref_count > 0);
      my_object->attached_to->ref_count--;
    }

  /* MY_OBJECT must not have been already freed and there must be no pending
     references to it.  */
  assert (!my_object->freed);
  assert (my_object->ref_count == 0);

  my_object->freed = 1;

  free_count++;

  return 0;
}

\f
/* Return a new `my-object' SMOB.  */
static SCM
my_object_new_smob (void)
{
  my_object_t *obj;

  obj = scm_malloc (sizeof (*obj));
  my_object_init (obj);

  SCM_RETURN_NEWSMOB (my_object_type, obj);
}

/* Attach TO_BE_ATTACHED to OBJ.  */
static SCM
attach_object (SCM obj, SCM to_be_attached)
{
  my_object_t *c_obj, *c_to_be_attached;

  assert (SCM_SMOB_PREDICATE (my_object_type, obj));
  assert (SCM_SMOB_PREDICATE (my_object_type, to_be_attached));

  /* TO_BE_ATTACHED is attached as a property of OBJ.  As such, OBJ will get
     GC'd _before_ TO_BE_ATTACHED.  */
  scm_primitive_property_set_x (some_property, obj, to_be_attached);
  c_obj = (my_object_t *)SCM_SMOB_DATA (obj);
  c_to_be_attached = (my_object_t *)SCM_SMOB_DATA (to_be_attached);

  /* Because TO_BE_ATTACHED is to be freed _after_ OBJ, we can increase its
     reference count and it should be zero by the time it is freed.  */
  c_to_be_attached->ref_count++;
  c_obj->attached_to = c_to_be_attached;

  return obj;
}

\f
/* Instantiate a number of `my-object' SMOBs, attached some of them together,
   invoke the GC, and wait until all of these SMOBs have been freed.  */
static void
test_weak_gc (void)
{
#define PAIRS 700
  unsigned pair_count, total;

  for (pair_count = 0, total = 0;
       pair_count < PAIRS;
       pair_count++)
    {
      size_t noise, i;
      SCM obj, attached;

      obj = my_object_new_smob ();
      attached = my_object_new_smob ();

#if 0
      printf ("%p attached to %p\n",
	      SCM_SMOB_DATA (obj), SCM_SMOB_DATA (attached));
#endif

      attach_object (obj, attached);
      total += 2;

      obj = attached = SCM_BOOL_F;

      for (i = 0, noise = random () % 10;
	   i < noise;
	   i++)
	{
	  my_object_new_smob ();
	  total++;
	}
    }

  while (free_count < total)
    {
      unsigned i;
      scm_gc ();

      for (i = 0; i < 1000; i++)
	scm_cons (SCM_I_MAKINUM (0), SCM_I_MAKINUM (0));
    }
}

\f
int
main (int argc, char *argv[])
{
  scm_init_guile ();

  my_object_type = scm_make_smob_type ("test-object", 0);
  scm_set_smob_free (my_object_type, my_object_free);

  some_property = scm_primitive_make_property (SCM_BOOL_F);

  my_object_new_smob_proc =
    scm_c_make_gsubr ("make-my-object", 0, 0, 0, my_object_new_smob);
  attach_object_proc =
    scm_c_make_gsubr ("attach-object!", 0, 0, 0, attach_object);

  test_weak_gc ();

  return 0;
}


_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/guile-devel


  parent reply	other threads:[~2005-11-23 10:19 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-11-08 17:22 [PATCH] Marking weak alist vectors Ludovic Courtès
2005-11-08 23:51 ` Marius Vollmer
2005-11-09  9:03   ` Ludovic Courtès
2005-12-06 23:55     ` Marius Vollmer
2005-12-07 10:33       ` Ludovic Courtès
2005-12-13 23:45         ` Marius Vollmer
2005-12-14  9:30           ` Ludovic Courtès
2005-11-09 10:28 ` Han-Wen Nienhuys
2005-11-09 16:28   ` Ludovic Courtès
2005-11-09 18:36     ` Han-Wen Nienhuys
2005-11-09 21:11     ` Kevin Ryde
2005-11-09 22:45       ` Marius Vollmer
2005-11-10 12:11         ` Han-Wen Nienhuys
2005-11-10  9:47       ` [PATCH] Reference leak in `iprin1 ()' Ludovic Courtès
2005-11-12  9:23         ` Neil Jerram
2005-11-14  9:58           ` Ludovic Courtès
2005-11-16 21:18             ` Neil Jerram
2005-11-17  9:54               ` Ludovic Courtès
2005-11-17 18:52                 ` Neil Jerram
2005-11-23 10:19     ` Ludovic Courtès [this message]
2005-11-24  0:59       ` [PATCH] Marking weak alist vectors, #2 Han-Wen Nienhuys
2005-11-24  9:01         ` Ludovic Courtès
2005-11-26  0:49       ` Kevin Ryde
2006-01-09 14:51       ` [PATCH] Marking weak alist vectors, epilogue Ludovic Courtès
2006-01-09 19:29         ` Neil Jerram
2006-01-10  8:21           ` Ludovic Courtès
2006-01-10  9:33             ` Neil Jerram
2006-01-10 15:43               ` Ludovic Courtès
2005-11-17 13:21 ` [PATCH] Fixing `gc-live-object-stats' Ludovic Courtès
2005-11-17 14:12   ` Han-Wen Nienhuys
2005-11-30  8:54   ` Ludovic Courtès
2005-11-30 23:45     ` Han-Wen Nienhuys
2005-12-03 19:31       ` Neil Jerram
2005-12-05  8:50         ` Ludovic Courtès
2005-12-06 19:14           ` Neil Jerram

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/guile/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87sltny8dh.fsf_-_@laas.fr \
    --to=ludovic.courtes@laas.fr \
    --cc=guile-devel@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).