From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!.POSTED!not-for-mail From: ludo@gnu.org (Ludovic =?UTF-8?Q?Court=C3=A8s?=) Newsgroups: gmane.lisp.guile.bugs Subject: bug#19180: Weak tables harmful to GC? Date: Thu, 26 Oct 2017 22:28:37 -0700 Message-ID: <87tvyl9n22.fsf__31254.576953552$1509082184$gmane$org@gnu.org> References: <87wp9gwz8m.fsf@gnu.org> <87fug4o8z2.fsf@pobox.com> <87wp54z3p5.fsf@gnu.org> <87zi9th1i6.fsf_-_@gnu.org> <87y3o454pr.fsf@gnu.org> <87r2tvncs5.fsf@dustycloud.org> <87wp3mwwef.fsf@gnu.org> <87mv4gd0ik.fsf@elephly.net> <87a80eie63.fsf@gnu.org> <87k1zimhmt.fsf@elephly.net> <87bmktn96e.fsf@elephly.net> NNTP-Posting-Host: blaine.gmane.org Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" X-Trace: blaine.gmane.org 1509082184 7951 195.159.176.226 (27 Oct 2017 05:29:44 GMT) X-Complaints-To: usenet@blaine.gmane.org NNTP-Posting-Date: Fri, 27 Oct 2017 05:29:44 +0000 (UTC) User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.3 (gnu/linux) Cc: Andy Wingo , 19180@debbugs.gnu.org, guile-devel@gnu.org To: Ricardo Wurmus Original-X-From: bug-guile-bounces+guile-bugs=m.gmane.org@gnu.org Fri Oct 27 07:29:36 2017 Return-path: Envelope-to: guile-bugs@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by blaine.gmane.org with esmtp (Exim 4.84_2) (envelope-from ) id 1e7xCu-0008PF-Cr for guile-bugs@m.gmane.org; Fri, 27 Oct 2017 07:29:25 +0200 Original-Received: from localhost ([::1]:55794 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1e7xD1-0002bI-Os for guile-bugs@m.gmane.org; Fri, 27 Oct 2017 01:29:31 -0400 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:41772) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1e7xCi-0002WG-0M for bug-guile@gnu.org; Fri, 27 Oct 2017 01:29:23 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1e7xCY-0006Qt-7l for bug-guile@gnu.org; Fri, 27 Oct 2017 01:29:12 -0400 Original-Received: from debbugs.gnu.org ([208.118.235.43]:55428) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1e7xCX-0006Qf-SO for bug-guile@gnu.org; Fri, 27 Oct 2017 01:29:02 -0400 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1e7xCX-0006RH-KU for bug-guile@gnu.org; Fri, 27 Oct 2017 01:29:01 -0400 X-Loop: help-debbugs@gnu.org Resent-From: ludo@gnu.org (Ludovic =?UTF-8?Q?Court=C3=A8s?=) Original-Sender: "Debbugs-submit" Resent-CC: bug-guile@gnu.org Resent-Date: Fri, 27 Oct 2017 05:29:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 19180 X-GNU-PR-Package: guile X-GNU-PR-Keywords: Original-Received: via spool by 19180-submit@debbugs.gnu.org id=B19180.150908213724741 (code B ref 19180); Fri, 27 Oct 2017 05:29:01 +0000 Original-Received: (at 19180) by debbugs.gnu.org; 27 Oct 2017 05:28:57 +0000 Original-Received: from localhost ([127.0.0.1]:35877 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1e7xCT-0006Qz-2B for submit@debbugs.gnu.org; Fri, 27 Oct 2017 01:28:57 -0400 Original-Received: from hera.aquilenet.fr ([141.255.128.1]:53952) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1e7xCO-0006Qo-WA for 19180@debbugs.gnu.org; Fri, 27 Oct 2017 01:28:53 -0400 Original-Received: from localhost (localhost [127.0.0.1]) by hera.aquilenet.fr (Postfix) with ESMTP id E3200DE6A; Fri, 27 Oct 2017 07:28:51 +0200 (CEST) X-Virus-Scanned: Debian amavisd-new at aquilenet.fr Original-Received: from hera.aquilenet.fr ([127.0.0.1]) by localhost (hera.aquilenet.fr [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id OAisKHwi1H6e; Fri, 27 Oct 2017 07:28:44 +0200 (CEST) Original-Received: from ribbon (node-1w7jr9qot7winqrne9k8b99d1.ipv6.telus.net [IPv6:2001:569:7803:2c00:728:f466:18ad:22a5]) by hera.aquilenet.fr (Postfix) with ESMTPSA id CAE10DE19; Fri, 27 Oct 2017 07:28:42 +0200 (CEST) X-URL: http://www.fdn.fr/~lcourtes/ X-Revolutionary-Date: 5 Brumaire an 226 de la =?UTF-8?Q?R=C3=A9volution?= X-PGP-Key-ID: 0x090B11993D9AEBB5 X-PGP-Key: http://www.fdn.fr/~lcourtes/ludovic.asc X-PGP-Fingerprint: 3CE4 6455 8A84 FDC6 9DB4 0CFB 090B 1199 3D9A EBB5 X-OS: x86_64-pc-linux-gnu In-Reply-To: <87bmktn96e.fsf@elephly.net> (Ricardo Wurmus's message of "Thu, 26 Oct 2017 18:52:25 +0200") X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 208.118.235.43 X-BeenThere: bug-guile@gnu.org List-Id: "Bug reports for GUILE, GNU's Ubiquitous Extension Language" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-guile-bounces+guile-bugs=m.gmane.org@gnu.org Original-Sender: "bug-guile" Xref: news.gmane.org gmane.lisp.guile.bugs:8866 Archived-At: --=-=-= Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Hi, Ricardo Wurmus skribis: > BOOTSTRAP GUILEC language/tree-il/primitives.go > /gnu/store/kpxi8h3669afr9r1bgvaf9ij3y4wdyyn-bash-minimal-4.4.12/bin/bash:= line 6: 30173 Killed GUILE_AUTO_COMPILE=3D0 ../meta/build= -env guild compile --target=3D"x86_64-unknown-linux-gnu" -O1 -L "/tmp/guix-= build-guile-awesome-2.2.2.drv-0/guile-2.2.2/module" -L "/tmp/guix-build-gui= le-awesome-2.2.2.drv-0/guile-2.2.2/guile-readline" -o "ice-9/vlist Funny: with the =E2=80=9Ccleanup=E2=80=9D that led to the patches you tried= , those weak hash tables were not weak at all, because SCM_WEAK_TABLE_KIND_KEY was now zero, and thus SCM_HASHTABLE_WEAK_P would always return false. The fix: --=-=-= Content-Type: text/x-patch Content-Disposition: inline diff --git a/libguile/hashtab.h b/libguile/hashtab.h index 8f422b0b5..1705cf744 100644 --- a/libguile/hashtab.h +++ b/libguile/hashtab.h @@ -33,6 +33,7 @@ /* Types of weak hash tables. */ typedef enum { + SCM_WEAK_TABLE_KIND_NONE = 0, SCM_WEAK_TABLE_KIND_KEY, SCM_WEAK_TABLE_KIND_VALUE, SCM_WEAK_TABLE_KIND_BOTH @@ -51,7 +52,9 @@ typedef enum { #define SCM_HASHTABLE_DOUBLY_WEAK_P(x) \ (SCM_HASHTABLE_FLAGS (x) == SCM_WEAK_TABLE_KIND_BOTH) -#define SCM_HASHTABLE_WEAK_P(x) SCM_HASHTABLE_FLAGS (x) +#define SCM_HASHTABLE_WEAK_P(x) \ + (SCM_HASHTABLE_FLAGS (x) != SCM_WEAK_TABLE_KIND_NONE) + #define SCM_HASHTABLE_N_ITEMS(x) (SCM_HASHTABLE (x)->n_items) #define SCM_SET_HASHTABLE_N_ITEMS(x, n) (SCM_HASHTABLE (x)->n_items = n) #define SCM_HASHTABLE_INCREMENT(x) (SCM_HASHTABLE_N_ITEMS(x)++) --=-=-= Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable (Updated patches below.) With your package definition, I see the first few Guile processes peak at ~100=C2=A0MiB resident (would be interesting to compare with stock 2.2.2= ). Let me know if it=E2=80=99s better this time! Thanks again for testing, Ludo=E2=80=99. --=-=-= Content-Type: text/x-patch Content-Disposition: attachment; filename=0001-Remove-weak-tables-and-revert-to-weak-hash-tables.patch >From eba61a14bd4d39fdfb84e70186b71004044583e3 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= Date: Sat, 21 Oct 2017 16:18:39 -0600 Subject: [PATCH 1/2] Remove weak tables and revert to weak hash tables. This removes weak-tables.[ch] and reintroduces weak hash tables as implemented in Guile 2.0 into hashtab.[ch]. This reduces wall-clock time by more than 15% on some GC-intensive benchmarks (compiling code) where big weak hash tables are in use, such as source properties. For more details on the rationale, see . * libguile/weak-table.c, libguile/weak-table.h: Remove. * libguile.h: Don't include "weak-table.h". * libguile/Makefile.am (libguile_@GUILE_EFFECTIVE_VERSION@_la_SOURCES) (DOT_X_FILES, DOT_DOC_FILES, modinclude_HEADERS): Remove weak-table.* files. * libguile/evalext.c (scm_self_evaluating_p): Remove reference to scm_tc7_weak_table. * libguile/hashtab.c (SCM_HASHTABLEF_WEAK_CAR) (SCM_HASHTABLEF_WEAK_CDR): New macros. * libguile/hashtab.c (scm_fixup_weak_alist, vacuum_weak_hash_table) (do_weak_bucket_fixup, weak_bucket_assoc) (weak_bucket_assoc_by_hash): New function. (make_hash_table, scm_make_hash_table): Add support for weak hash tables. (weak_gc_callback, weak_gc_hook, weak_gc_finalizer) (scm_c_register_weak_gc_callback, scm_make_weak_key_hash_table) (scm_make_weak_value_hash_table, scm_make_doubly_weak_hash_table): New functions. (SCM_WEAK_TABLE_P): Remove. (scm_weak_key_hash_table_p, scm_weak_value_hash_table_p) (scm_doubly_weak_hash_table_p, scm_hash_fn_get_handle_by_hash): New functions. (scm_hash_fn_create_handle_x): Add support for weak hash tables. (get_weak_cdr, weak_pair_cdr): New functions. (scm_hash_fn_set_x): Add support for weak hash tables. (scm_hash_fn_remove_x): Likewise. (scm_hashq_get_handle, scm_hashq_create_handle_x): Likewise. (scm_hashv_get_handle, scm_hashv_create_handle_x): Likewise. (scm_hashq_ref, scm_hashq_set_x, scm_hashq_remove_x): Remove special cases for 'SCM_WEAK_TABLE_P'. (scm_hashv_ref, scm_hashv_set_x, scm_hashv_remove_x): Likewise. (scm_hash_ref, scm_hash_set_x, scm_hash_remove_x): Likewise. (scm_hashx_ref, scm_hashx_set_x, scm_hashx_remove_x): Likewise. (assv_predicate, assoc_predicate, assx_predicate): Remove. (scm_hash_map_to_list, scm_internal_hash_fold): Likewise, and check for deleted entries. (scm_internal_hash_for_each_handle): Likewise. (scm_t_ihashx_closure): Remove 'key' field. (wcar_pair_descr, wcdr_pair_descr): New variables. (scm_weak_car_pair, scm_weak_cdr_pair, scm_doubly_weak_pair): New functions. (scm_weak_table_refq, scm_weak_table_putq_x, scm_c_make_weak_table) (scm_c_weak_table_fold): Rewrite in terms of the hash-table API. (scm_init_hashtab): Initialize 'wcar_pair_descr' and 'wcdr_pair_descr'. * libguile/hashtab.h (scm_t_weak_table_kind): New type. (SCM_HASHTABLE, SCM_HASHTABLE_FLAGS, SCM_HASHTABLE_WEAK_KEY_P) (SCM_HASHTABLE_WEAK_VALUE_P, SCM_HASHTABLE_DOUBLY_WEAK_P): New macros. (scm_t_hash_predicate_fn): New type. (scm_t_hashtable)[flags]: New field. (scm_make_weak_value_hash_table, scm_make_doubly_weak_hash_table) (scm_make_weak_key_hash_table, scm_c_make_weak_table) (scm_c_weak_table_fold, scm_weak_table_refq) (scm_weak_table_putq_x): New declarations. * libguile/init.c (scm_i_init_guile): Remove calls to 'scm_weak_table_prehistory' and 'scm_init_weak_table'. (iprin1): Remove reference to scm_tc7_weak_table. * libguile/procprop.c: Include "hashtab.h". * libguile/tags.h (scm_tc7_weak_table): Remove. * libguile/weak-list.h (scm_weak_car_pair, scm_weak_cdr_pair) (scm_doubly_weak_pair): New declarations. (SCM_WEAK_PAIR_DELETED_P, SCM_WEAK_PAIR_WORD_DELETED_P) (SCM_WEAK_PAIR_CAR_DELETED_P, SCM_WEAK_PAIR_CDR_DELETED_P) (SCM_WEAK_PAIR_WORD, SCM_WEAK_PAIR_CAR, SCM_WEAK_PAIR_CDR): New macros. * module/system/base/types.scm (%tc7-weak-table): Mark as obsolete. * test-suite/tests/types.test ("opaque objects"): Replace references to 'weak-table' with 'hash-table'. Add 'make-hash-table' test. --- libguile.h | 3 +- libguile/Makefile.am | 6 +- libguile/evalext.c | 3 +- libguile/hashtab.c | 880 +++++++++++++++++++++++++------ libguile/hashtab.h | 50 +- libguile/init.c | 4 +- libguile/print.c | 3 - libguile/procprop.c | 4 +- libguile/tags.h | 3 +- libguile/weak-list.h | 32 +- libguile/weak-table.c | 1180 ------------------------------------------ libguile/weak-table.h | 94 ---- module/system/base/types.scm | 2 +- test-suite/tests/types.test | 9 +- 14 files changed, 812 insertions(+), 1461 deletions(-) delete mode 100644 libguile/weak-table.c delete mode 100644 libguile/weak-table.h diff --git a/libguile.h b/libguile.h index 3f7f0b791..90326844b 100644 --- a/libguile.h +++ b/libguile.h @@ -1,7 +1,7 @@ #ifndef SCM_LIBGUILE_H #define SCM_LIBGUILE_H -/* Copyright (C) 1995,1996,1997,1998,2000,2001, 2002, 2003, 2004, 2006, 2008, 2009, 2010, 2011, 2012, 2013, 2014 Free Software Foundation, Inc. +/* Copyright (C) 1995,1996,1997,1998,2000,2001, 2002, 2003, 2004, 2006, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2017 Free Software Foundation, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public License @@ -117,7 +117,6 @@ extern "C" { #include "libguile/version.h" #include "libguile/vports.h" #include "libguile/weak-set.h" -#include "libguile/weak-table.h" #include "libguile/weak-vector.h" #include "libguile/backtrace.h" #include "libguile/debug.h" diff --git a/libguile/Makefile.am b/libguile/Makefile.am index 2214a4aa3..6420d0f48 100644 --- a/libguile/Makefile.am +++ b/libguile/Makefile.am @@ -1,7 +1,7 @@ ## Process this file with Automake to create Makefile.in ## ## Copyright (C) 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2006, 2007, -## 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2016 Free Software Foundation, Inc. +## 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2016, 2017 Free Software Foundation, Inc. ## ## This file is part of GUILE. ## @@ -225,7 +225,6 @@ libguile_@GUILE_EFFECTIVE_VERSION@_la_SOURCES = \ vm.c \ vports.c \ weak-set.c \ - weak-table.c \ weak-vector.c DOT_X_FILES = \ @@ -330,7 +329,6 @@ DOT_X_FILES = \ vm.x \ vports.x \ weak-set.x \ - weak-table.x \ weak-vector.x EXTRA_DOT_X_FILES = @EXTRA_DOT_X_FILES@ @@ -432,7 +430,6 @@ DOT_DOC_FILES = \ version.doc \ vports.doc \ weak-set.doc \ - weak-table.doc \ weak-vector.doc EXTRA_DOT_DOC_FILES = @EXTRA_DOT_DOC_FILES@ @@ -685,7 +682,6 @@ modinclude_HEADERS = \ vm.h \ vports.h \ weak-set.h \ - weak-table.h \ weak-vector.h nodist_modinclude_HEADERS = version.h scmconfig.h diff --git a/libguile/evalext.c b/libguile/evalext.c index 33205a2ca..e381daa65 100644 --- a/libguile/evalext.c +++ b/libguile/evalext.c @@ -1,4 +1,4 @@ -/* Copyright (C) 1998,1999,2000,2001,2002,2003, 2006, 2008, 2009, 2010, 2011, 2012, 2013, 2015 Free Software Foundation, Inc. +/* Copyright (C) 1998,1999,2000,2001,2002,2003, 2006, 2008, 2009, 2010, 2011, 2012, 2013, 2015, 2017 Free Software Foundation, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public License @@ -77,7 +77,6 @@ SCM_DEFINE (scm_self_evaluating_p, "self-evaluating?", 1, 0, 0, case scm_tc7_pointer: case scm_tc7_hashtable: case scm_tc7_weak_set: - case scm_tc7_weak_table: case scm_tc7_fluid: case scm_tc7_dynamic_state: case scm_tc7_frame: diff --git a/libguile/hashtab.c b/libguile/hashtab.c index 8920e08a6..0b3b0c0d4 100644 --- a/libguile/hashtab.c +++ b/libguile/hashtab.c @@ -1,5 +1,5 @@ /* Copyright (C) 1995, 1996, 1998, 1999, 2000, 2001, 2003, 2004, 2006, - * 2008, 2009, 2010, 2011, 2012, 2013 Free Software Foundation, Inc. + * 2008, 2009, 2010, 2011, 2012, 2017 Free Software Foundation, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public License @@ -38,9 +38,18 @@ #include "libguile/validate.h" #include "libguile/hashtab.h" +#include +#include + +/* Map the 2.0 names (on the left) to the new enum values. */ +#define SCM_HASHTABLEF_WEAK_CAR SCM_WEAK_TABLE_KIND_KEY +#define SCM_HASHTABLEF_WEAK_CDR SCM_WEAK_TABLE_KIND_VALUE + + + /* A hash table is a cell containing a vector of association lists. * * Growing or shrinking, with following rehashing, is triggered when @@ -53,6 +62,9 @@ * The implementation stores the upper and lower number of items which * trigger a resize in the hashtable object. * + * Weak hash tables use weak pairs in the bucket lists rather than + * normal pairs. + * * Possible hash table sizes (primes) are stored in the array * hashtable_size. */ @@ -72,8 +84,213 @@ static unsigned long hashtable_size[] = { static char *s_hashtable = "hashtable"; + + +/* Helper functions and macros to deal with weak pairs. + + Weak pairs need to be accessed very carefully since their components can + be nullified by the GC when the object they refer to becomes unreachable. + Hence the macros and functions below that detect such weak pairs within + buckets and remove them. */ + + +/* Remove nullified weak pairs from ALIST such that the result contains only + valid pairs. Set REMOVED_ITEMS to the number of pairs that have been + deleted. */ static SCM -make_hash_table (unsigned long k, const char *func_name) +scm_fixup_weak_alist (SCM alist, size_t *removed_items) +{ + SCM result; + SCM prev = SCM_EOL; + + *removed_items = 0; + for (result = alist; + scm_is_pair (alist); + alist = SCM_CDR (alist)) + { + SCM pair = SCM_CAR (alist); + + if (SCM_WEAK_PAIR_DELETED_P (pair)) + { + /* Remove from ALIST weak pair PAIR whose car/cdr has been + nullified by the GC. */ + if (scm_is_null (prev)) + result = SCM_CDR (alist); + else + SCM_SETCDR (prev, SCM_CDR (alist)); + + (*removed_items)++; + + /* Leave PREV unchanged. */ + } + else + prev = alist; + } + + return result; +} + +static void +vacuum_weak_hash_table (SCM table) +{ + SCM buckets = SCM_HASHTABLE_VECTOR (table); + unsigned long k = SCM_SIMPLE_VECTOR_LENGTH (buckets); + size_t len = SCM_HASHTABLE_N_ITEMS (table); + + while (k--) + { + size_t removed; + SCM alist = SCM_SIMPLE_VECTOR_REF (buckets, k); + alist = scm_fixup_weak_alist (alist, &removed); + if (removed <= len) + len -= removed; + else + { + /* The move to BDW-GC with Guile 2.0 introduced some bugs + related to weak hash tables, threads, memory usage, and the + alloc lock. We were unable to fix these issues + satisfactorily in 2.0 but have addressed them via a rewrite + in 2.2. If you see this message often, you probably want + to upgrade to 2.2. */ + fprintf (stderr, "guile: warning: weak hash table corruption " + "(https://bugs.gnu.org/19180)\n"); + len = 0; + } + SCM_SIMPLE_VECTOR_SET (buckets, k, alist); + } + + SCM_SET_HASHTABLE_N_ITEMS (table, len); +} + + +/* Packed arguments for `do_weak_bucket_fixup'. */ +struct t_fixup_args +{ + SCM bucket; + SCM *bucket_copy; + size_t removed_items; +}; + +static void * +do_weak_bucket_fixup (void *data) +{ + struct t_fixup_args *args; + SCM pair, *copy; + + args = (struct t_fixup_args *) data; + + args->bucket = scm_fixup_weak_alist (args->bucket, &args->removed_items); + + for (pair = args->bucket, copy = args->bucket_copy; + scm_is_pair (pair); + pair = SCM_CDR (pair), copy += 2) + { + /* At this point, all weak pairs have been removed. */ + assert (!SCM_WEAK_PAIR_DELETED_P (SCM_CAR (pair))); + + /* Copy the key and value. */ + copy[0] = SCM_CAAR (pair); + copy[1] = SCM_CDAR (pair); + } + + return args; +} + +/* Lookup OBJECT in weak hash table TABLE using ASSOC. OBJECT is searched + for in the alist that is the BUCKET_INDEXth element of BUCKETS. + Optionally update TABLE and rehash it. */ +static SCM +weak_bucket_assoc (SCM table, SCM buckets, size_t bucket_index, + scm_t_hash_fn hash_fn, + scm_t_assoc_fn assoc, SCM object, void *closure) +{ + SCM result; + SCM bucket, *strong_refs; + struct t_fixup_args args; + + bucket = SCM_SIMPLE_VECTOR_REF (buckets, bucket_index); + + /* Prepare STRONG_REFS as an array large enough to hold all the keys + and values in BUCKET. */ + strong_refs = alloca (scm_ilength (bucket) * 2 * sizeof (SCM)); + + args.bucket = bucket; + args.bucket_copy = strong_refs; + + /* Fixup BUCKET. Do that with the allocation lock held to avoid + seeing disappearing links pointing to objects that have already + been reclaimed (this happens when the disappearing links that point + to it haven't yet been cleared.) + + The `do_weak_bucket_fixup' call populates STRONG_REFS with a copy + of BUCKET's entries after it's been fixed up. Thus, all the + entries kept in BUCKET are still reachable when ASSOC sees + them. */ + GC_call_with_alloc_lock (do_weak_bucket_fixup, &args); + + bucket = args.bucket; + SCM_SIMPLE_VECTOR_SET (buckets, bucket_index, bucket); + + result = assoc (object, bucket, closure); + + /* If we got a result, it should not have NULL fields. */ + if (scm_is_pair (result) && SCM_WEAK_PAIR_DELETED_P (result)) + abort (); + + scm_remember_upto_here_1 (strong_refs); + + if (args.removed_items > 0) + { + /* Update TABLE's item count and optionally trigger a rehash. */ + size_t remaining; + + assert (SCM_HASHTABLE_N_ITEMS (table) >= args.removed_items); + + remaining = SCM_HASHTABLE_N_ITEMS (table) - args.removed_items; + SCM_SET_HASHTABLE_N_ITEMS (table, remaining); + + if (remaining < SCM_HASHTABLE_LOWER (table)) + scm_i_rehash (table, hash_fn, closure, "weak_bucket_assoc"); + } + + return result; +} + + +/* Packed arguments for `weak_bucket_assoc_by_hash'. */ +struct assoc_by_hash_data +{ + SCM alist; + SCM ret; + scm_t_hash_predicate_fn predicate; + void *closure; +}; + +/* See scm_hash_fn_get_handle_by_hash below. */ +static void* +weak_bucket_assoc_by_hash (void *args) +{ + struct assoc_by_hash_data *data = args; + SCM alist = data->alist; + + for (; scm_is_pair (alist); alist = SCM_CDR (alist)) + { + SCM pair = SCM_CAR (alist); + + if (!SCM_WEAK_PAIR_DELETED_P (pair) + && data->predicate (SCM_CAR (pair), data->closure)) + { + data->ret = pair; + break; + } + } + return args; +} + + + +static SCM +make_hash_table (int flags, unsigned long k, const char *func_name) { SCM vector; scm_t_hashtable *t; @@ -82,6 +299,9 @@ make_hash_table (unsigned long k, const char *func_name) ++i; n = hashtable_size[i]; + /* In both cases, i.e., regardless of whether we are creating a weak hash + table, we return a non-weak vector. This is because the vector itself + is not weak in the case of a weak hash table: the alist pairs are. */ vector = scm_c_make_vector (n, SCM_EOL); t = scm_gc_malloc_pointerless (sizeof (*t), s_hashtable); @@ -89,6 +309,8 @@ make_hash_table (unsigned long k, const char *func_name) t->n_items = 0; t->lower = 0; t->upper = 9 * n / 10; + t->flags = flags; + t->hash_fn = NULL; /* FIXME: we just need two words of storage, not three */ return scm_double_cell (scm_tc7_hashtable, SCM_UNPACK (vector), @@ -121,6 +343,13 @@ scm_i_rehash (SCM table, if (i >= HASHTABLE_SIZE_N) /* don't rehash */ return; + + /* Remember HASH_FN for rehash_after_gc, but only when CLOSURE + is not needed since CLOSURE can not be guaranteed to be valid + after this function returns. + */ + if (closure == NULL) + SCM_HASHTABLE (table)->hash_fn = hash_fn; } SCM_HASHTABLE (table)->size_index = i; @@ -134,6 +363,13 @@ scm_i_rehash (SCM table, new_buckets = scm_c_make_vector (new_size, SCM_EOL); + /* When this is a weak hashtable, running the GC might change it. + We need to cope with this while rehashing its elements. We do + this by first installing the new, empty bucket vector. Then we + remove the elements from the old bucket vector and insert them + into the new one. + */ + SCM_SET_HASHTABLE_VECTOR (table, new_buckets); SCM_SET_HASHTABLE_N_ITEMS (table, 0); @@ -153,6 +389,10 @@ scm_i_rehash (SCM table, handle = SCM_CAR (cell); ls = SCM_CDR (ls); + if (SCM_WEAK_PAIR_DELETED_P (handle)) + /* HANDLE is a nullified weak pair: skip it. */ + continue; + h = hash_fn (SCM_CAR (handle), new_size, closure); if (h >= new_size) scm_out_of_range (func_name, scm_from_ulong (h)); @@ -167,7 +407,14 @@ scm_i_rehash (SCM table, void scm_i_hashtable_print (SCM exp, SCM port, scm_print_state *pstate) { - scm_puts ("#= SCM_SIMPLE_VECTOR_LENGTH (buckets)) scm_out_of_range (FUNC_NAME, scm_from_ulong (k)); - h = assoc_fn (obj, SCM_SIMPLE_VECTOR_REF (buckets, k), closure); + if (SCM_HASHTABLE_WEAK_P (table)) + h = weak_bucket_assoc (table, buckets, k, hash_fn, + assoc_fn, obj, closure); + else + h = assoc_fn (obj, SCM_SIMPLE_VECTOR_REF (buckets, k), closure); + + return h; +} +#undef FUNC_NAME + + +/* This procedure implements three optimizations, with respect to the + raw get_handle(): + + 1. For weak tables, it's assumed that calling the predicate in the + allocation lock is safe. In practice this means that the predicate + cannot call arbitrary scheme functions. + + 2. We don't check for overflow / underflow and rehash. + + 3. We don't actually have to allocate a key -- instead we get the + hash value directly. This is useful for, for example, looking up + strings in the symbol table. + */ +SCM +scm_hash_fn_get_handle_by_hash (SCM table, unsigned long raw_hash, + scm_t_hash_predicate_fn predicate_fn, + void *closure) +#define FUNC_NAME "scm_hash_fn_ref_by_hash" +{ + unsigned long k; + SCM buckets, alist, h = SCM_BOOL_F; + + SCM_VALIDATE_HASHTABLE (SCM_ARG1, table); + buckets = SCM_HASHTABLE_VECTOR (table); + + if (SCM_SIMPLE_VECTOR_LENGTH (buckets) == 0) + return SCM_BOOL_F; + + k = raw_hash % SCM_SIMPLE_VECTOR_LENGTH (buckets); + alist = SCM_SIMPLE_VECTOR_REF (buckets, k); + + if (SCM_HASHTABLE_WEAK_P (table)) + { + struct assoc_by_hash_data args; + + args.alist = alist; + args.ret = SCM_BOOL_F; + args.predicate = predicate_fn; + args.closure = closure; + GC_call_with_alloc_lock (weak_bucket_assoc_by_hash, &args); + h = args.ret; + } + else + for (; scm_is_pair (alist); alist = SCM_CDR (alist)) + { + SCM pair = SCM_CAR (alist); + if (predicate_fn (SCM_CAR (pair), closure)) + { + h = pair; + break; + } + } return h; } @@ -252,7 +714,11 @@ scm_hash_fn_create_handle_x (SCM table, SCM obj, SCM init, if (k >= SCM_SIMPLE_VECTOR_LENGTH (buckets)) scm_out_of_range ("hash_fn_create_handle_x", scm_from_ulong (k)); - it = assoc_fn (obj, SCM_SIMPLE_VECTOR_REF (buckets, k), closure); + if (SCM_HASHTABLE_WEAK_P (table)) + it = weak_bucket_assoc (table, buckets, k, hash_fn, + assoc_fn, obj, closure); + else + it = assoc_fn (obj, SCM_SIMPLE_VECTOR_REF (buckets, k), closure); if (scm_is_pair (it)) return it; @@ -260,9 +726,29 @@ scm_hash_fn_create_handle_x (SCM table, SCM obj, SCM init, scm_wrong_type_arg_msg (NULL, 0, it, "a pair"); else { + /* When this is a weak hashtable, running the GC can change it. + Thus, we must allocate the new cells first and can only then + access BUCKETS. Also, we need to fetch the bucket vector + again since the hashtable might have been rehashed. This + necessitates a new hash value as well. + */ SCM handle, new_bucket; - handle = scm_cons (obj, init); + if (SCM_HASHTABLE_WEAK_P (table)) + { + /* FIXME: We don't support weak alist vectors. */ + /* Use a weak cell. */ + if (SCM_HASHTABLE_DOUBLY_WEAK_P (table)) + handle = scm_doubly_weak_pair (obj, init); + else if (SCM_HASHTABLE_WEAK_KEY_P (table)) + handle = scm_weak_car_pair (obj, init); + else + handle = scm_weak_cdr_pair (obj, init); + } + else + /* Use a regular, non-weak cell. */ + handle = scm_cons (obj, init); + new_bucket = scm_cons (handle, SCM_EOL); if (!scm_is_eq (SCM_HASHTABLE_VECTOR (table), buckets)) @@ -298,6 +784,36 @@ scm_hash_fn_ref (SCM table, SCM obj, SCM dflt, return dflt; } +struct weak_cdr_data +{ + SCM pair; + SCM cdr; +}; + +static void* +get_weak_cdr (void *data) +{ + struct weak_cdr_data *d = data; + + if (SCM_WEAK_PAIR_CDR_DELETED_P (d->pair)) + d->cdr = SCM_BOOL_F; + else + d->cdr = SCM_CDR (d->pair); + + return NULL; +} + +static SCM +weak_pair_cdr (SCM x) +{ + struct weak_cdr_data data; + + data.pair = x; + GC_call_with_alloc_lock (get_weak_cdr, &data); + + return data.cdr; +} + SCM scm_hash_fn_set_x (SCM table, SCM obj, SCM val, scm_t_hash_fn hash_fn, scm_t_assoc_fn assoc_fn, @@ -309,7 +825,24 @@ scm_hash_fn_set_x (SCM table, SCM obj, SCM val, hash_fn, assoc_fn, closure); if (!scm_is_eq (SCM_CDR (pair), val)) - SCM_SETCDR (pair, val); + { + if (SCM_UNLIKELY (SCM_HASHTABLE_WEAK_VALUE_P (table))) + { + /* If the former value was on the heap, we need to unregister + the weak link. */ + SCM prev = weak_pair_cdr (pair); + + SCM_SETCDR (pair, val); + + if (SCM_NIMP (prev) && !SCM_NIMP (val)) + GC_unregister_disappearing_link ((void **) SCM_CDRLOC (pair)); + else + SCM_I_REGISTER_DISAPPEARING_LINK ((void **) SCM_CDRLOC (pair), + SCM2PTR (val)); + } + else + SCM_SETCDR (pair, val); + } return val; } @@ -336,7 +869,11 @@ scm_hash_fn_remove_x (SCM table, SCM obj, if (k >= SCM_SIMPLE_VECTOR_LENGTH (buckets)) scm_out_of_range (FUNC_NAME, scm_from_ulong (k)); - h = assoc_fn (obj, SCM_SIMPLE_VECTOR_REF (buckets, k), closure); + if (SCM_HASHTABLE_WEAK_P (table)) + h = weak_bucket_assoc (table, buckets, k, hash_fn, + assoc_fn, obj, closure); + else + h = assoc_fn (obj, SCM_SIMPLE_VECTOR_REF (buckets, k), closure); if (scm_is_true (h)) { @@ -355,12 +892,6 @@ SCM_DEFINE (scm_hash_clear_x, "hash-clear!", 1, 0, 0, "Remove all items from @var{table} (without triggering a resize).") #define FUNC_NAME s_scm_hash_clear_x { - if (SCM_WEAK_TABLE_P (table)) - { - scm_weak_table_clear_x (table); - return SCM_UNSPECIFIED; - } - SCM_VALIDATE_HASHTABLE (SCM_ARG1, table); scm_vector_fill_x (SCM_HASHTABLE_VECTOR (table), SCM_EOL); @@ -380,6 +911,9 @@ SCM_DEFINE (scm_hashq_get_handle, "hashq-get-handle", 2, 0, 0, "Uses @code{eq?} for equality testing.") #define FUNC_NAME s_scm_hashq_get_handle { + if (SCM_UNLIKELY (SCM_HASHTABLE_P (table) && SCM_HASHTABLE_WEAK_P (table))) + SCM_MISC_ERROR ("Handle access not permitted on weak table", SCM_EOL); + return scm_hash_fn_get_handle (table, key, (scm_t_hash_fn) scm_ihashq, (scm_t_assoc_fn) scm_sloppy_assq, @@ -395,6 +929,9 @@ SCM_DEFINE (scm_hashq_create_handle_x, "hashq-create-handle!", 3, 0, 0, "associates @var{key} with @var{init}.") #define FUNC_NAME s_scm_hashq_create_handle_x { + if (SCM_UNLIKELY (SCM_HASHTABLE_P (table) && SCM_HASHTABLE_WEAK_P (table))) + SCM_MISC_ERROR ("Handle access not permitted on weak table", SCM_EOL); + return scm_hash_fn_create_handle_x (table, key, init, (scm_t_hash_fn) scm_ihashq, (scm_t_assoc_fn) scm_sloppy_assq, @@ -413,10 +950,6 @@ SCM_DEFINE (scm_hashq_ref, "hashq-ref", 2, 1, 0, { if (SCM_UNBNDP (dflt)) dflt = SCM_BOOL_F; - - if (SCM_WEAK_TABLE_P (table)) - return scm_weak_table_refq (table, key, dflt); - return scm_hash_fn_ref (table, key, dflt, (scm_t_hash_fn) scm_ihashq, (scm_t_assoc_fn) scm_sloppy_assq, @@ -432,12 +965,6 @@ SCM_DEFINE (scm_hashq_set_x, "hashq-set!", 3, 0, 0, "store @var{val} there. Uses @code{eq?} for equality testing.") #define FUNC_NAME s_scm_hashq_set_x { - if (SCM_WEAK_TABLE_P (table)) - { - scm_weak_table_putq_x (table, key, val); - return val; - } - return scm_hash_fn_set_x (table, key, val, (scm_t_hash_fn) scm_ihashq, (scm_t_assoc_fn) scm_sloppy_assq, @@ -453,16 +980,6 @@ SCM_DEFINE (scm_hashq_remove_x, "hashq-remove!", 2, 0, 0, "@var{table}. Uses @code{eq?} for equality tests.") #define FUNC_NAME s_scm_hashq_remove_x { - if (SCM_WEAK_TABLE_P (table)) - { - scm_weak_table_remq_x (table, key); - /* This return value is for historical compatibility with - hash-remove!, which returns either the "handle" corresponding - to the entry, or #f. Since weak tables don't have handles, we - have to return #f. */ - return SCM_BOOL_F; - } - return scm_hash_fn_remove_x (table, key, (scm_t_hash_fn) scm_ihashq, (scm_t_assoc_fn) scm_sloppy_assq, @@ -481,6 +998,9 @@ SCM_DEFINE (scm_hashv_get_handle, "hashv-get-handle", 2, 0, 0, "Uses @code{eqv?} for equality testing.") #define FUNC_NAME s_scm_hashv_get_handle { + if (SCM_UNLIKELY (SCM_HASHTABLE_P (table) && SCM_HASHTABLE_WEAK_P (table))) + SCM_MISC_ERROR ("Handle access not permitted on weak table", SCM_EOL); + return scm_hash_fn_get_handle (table, key, (scm_t_hash_fn) scm_ihashv, (scm_t_assoc_fn) scm_sloppy_assv, @@ -496,6 +1016,9 @@ SCM_DEFINE (scm_hashv_create_handle_x, "hashv-create-handle!", 3, 0, 0, "associates @var{key} with @var{init}.") #define FUNC_NAME s_scm_hashv_create_handle_x { + if (SCM_UNLIKELY (SCM_HASHTABLE_P (table) && SCM_HASHTABLE_WEAK_P (table))) + SCM_MISC_ERROR ("Handle access not permitted on weak table", SCM_EOL); + return scm_hash_fn_create_handle_x (table, key, init, (scm_t_hash_fn) scm_ihashv, (scm_t_assoc_fn) scm_sloppy_assv, @@ -504,12 +1027,6 @@ SCM_DEFINE (scm_hashv_create_handle_x, "hashv-create-handle!", 3, 0, 0, #undef FUNC_NAME -static int -assv_predicate (SCM k, SCM v, void *closure) -{ - return scm_is_true (scm_eqv_p (k, SCM_PACK_POINTER (closure))); -} - SCM_DEFINE (scm_hashv_ref, "hashv-ref", 2, 1, 0, (SCM table, SCM key, SCM dflt), "Look up @var{key} in the hash table @var{table}, and return the\n" @@ -520,12 +1037,6 @@ SCM_DEFINE (scm_hashv_ref, "hashv-ref", 2, 1, 0, { if (SCM_UNBNDP (dflt)) dflt = SCM_BOOL_F; - - if (SCM_WEAK_TABLE_P (table)) - return scm_c_weak_table_ref (table, scm_ihashv (key, -1), - assv_predicate, - (void *) SCM_UNPACK (key), dflt); - return scm_hash_fn_ref (table, key, dflt, (scm_t_hash_fn) scm_ihashv, (scm_t_assoc_fn) scm_sloppy_assv, @@ -541,14 +1052,6 @@ SCM_DEFINE (scm_hashv_set_x, "hashv-set!", 3, 0, 0, "store @var{value} there. Uses @code{eqv?} for equality testing.") #define FUNC_NAME s_scm_hashv_set_x { - if (SCM_WEAK_TABLE_P (table)) - { - scm_c_weak_table_put_x (table, scm_ihashv (key, -1), - assv_predicate, (void *) SCM_UNPACK (key), - key, val); - return val; - } - return scm_hash_fn_set_x (table, key, val, (scm_t_hash_fn) scm_ihashv, (scm_t_assoc_fn) scm_sloppy_assv, @@ -563,14 +1066,6 @@ SCM_DEFINE (scm_hashv_remove_x, "hashv-remove!", 2, 0, 0, "@var{table}. Uses @code{eqv?} for equality tests.") #define FUNC_NAME s_scm_hashv_remove_x { - if (SCM_WEAK_TABLE_P (table)) - { - scm_c_weak_table_remove_x (table, scm_ihashv (key, -1), - assv_predicate, (void *) SCM_UNPACK (key)); - /* See note in hashq-remove!. */ - return SCM_BOOL_F; - } - return scm_hash_fn_remove_x (table, key, (scm_t_hash_fn) scm_ihashv, (scm_t_assoc_fn) scm_sloppy_assv, @@ -588,6 +1083,9 @@ SCM_DEFINE (scm_hash_get_handle, "hash-get-handle", 2, 0, 0, "Uses @code{equal?} for equality testing.") #define FUNC_NAME s_scm_hash_get_handle { + if (SCM_UNLIKELY (SCM_HASHTABLE_P (table) && SCM_HASHTABLE_WEAK_P (table))) + SCM_MISC_ERROR ("Handle access not permitted on weak table", SCM_EOL); + return scm_hash_fn_get_handle (table, key, (scm_t_hash_fn) scm_ihash, (scm_t_assoc_fn) scm_sloppy_assoc, @@ -603,6 +1101,9 @@ SCM_DEFINE (scm_hash_create_handle_x, "hash-create-handle!", 3, 0, 0, "associates @var{key} with @var{init}.") #define FUNC_NAME s_scm_hash_create_handle_x { + if (SCM_UNLIKELY (SCM_HASHTABLE_P (table) && SCM_HASHTABLE_WEAK_P (table))) + SCM_MISC_ERROR ("Handle access not permitted on weak table", SCM_EOL); + return scm_hash_fn_create_handle_x (table, key, init, (scm_t_hash_fn) scm_ihash, (scm_t_assoc_fn) scm_sloppy_assoc, @@ -611,12 +1112,6 @@ SCM_DEFINE (scm_hash_create_handle_x, "hash-create-handle!", 3, 0, 0, #undef FUNC_NAME -static int -assoc_predicate (SCM k, SCM v, void *closure) -{ - return scm_is_true (scm_equal_p (k, SCM_PACK_POINTER (closure))); -} - SCM_DEFINE (scm_hash_ref, "hash-ref", 2, 1, 0, (SCM table, SCM key, SCM dflt), "Look up @var{key} in the hash table @var{table}, and return the\n" @@ -627,12 +1122,6 @@ SCM_DEFINE (scm_hash_ref, "hash-ref", 2, 1, 0, { if (SCM_UNBNDP (dflt)) dflt = SCM_BOOL_F; - - if (SCM_WEAK_TABLE_P (table)) - return scm_c_weak_table_ref (table, scm_ihash (key, -1), - assoc_predicate, - (void *) SCM_UNPACK (key), dflt); - return scm_hash_fn_ref (table, key, dflt, (scm_t_hash_fn) scm_ihash, (scm_t_assoc_fn) scm_sloppy_assoc, @@ -649,14 +1138,6 @@ SCM_DEFINE (scm_hash_set_x, "hash-set!", 3, 0, 0, "testing.") #define FUNC_NAME s_scm_hash_set_x { - if (SCM_WEAK_TABLE_P (table)) - { - scm_c_weak_table_put_x (table, scm_ihash (key, -1), - assoc_predicate, (void *) SCM_UNPACK (key), - key, val); - return val; - } - return scm_hash_fn_set_x (table, key, val, (scm_t_hash_fn) scm_ihash, (scm_t_assoc_fn) scm_sloppy_assoc, @@ -672,14 +1153,6 @@ SCM_DEFINE (scm_hash_remove_x, "hash-remove!", 2, 0, 0, "@var{table}. Uses @code{equal?} for equality tests.") #define FUNC_NAME s_scm_hash_remove_x { - if (SCM_WEAK_TABLE_P (table)) - { - scm_c_weak_table_remove_x (table, scm_ihash (key, -1), - assoc_predicate, (void *) SCM_UNPACK (key)); - /* See note in hashq-remove!. */ - return SCM_BOOL_F; - } - return scm_hash_fn_remove_x (table, key, (scm_t_hash_fn) scm_ihash, (scm_t_assoc_fn) scm_sloppy_assoc, @@ -694,9 +1167,10 @@ typedef struct scm_t_ihashx_closure { SCM hash; SCM assoc; - SCM key; } scm_t_ihashx_closure; + + static unsigned long scm_ihashx (SCM obj, unsigned long n, void *arg) { @@ -706,6 +1180,8 @@ scm_ihashx (SCM obj, unsigned long n, void *arg) return scm_to_ulong (answer); } + + static SCM scm_sloppy_assx (SCM obj, SCM alist, void *arg) { @@ -713,20 +1189,6 @@ scm_sloppy_assx (SCM obj, SCM alist, void *arg) return scm_call_2 (closure->assoc, obj, alist); } -static int -assx_predicate (SCM k, SCM v, void *closure) -{ - scm_t_ihashx_closure *c = (scm_t_ihashx_closure *) closure; - - /* FIXME: The hashx interface is crazy. Hash tables have nothing to - do with alists in principle. Instead of getting an assoc proc, - hashx functions should use an equality predicate. Perhaps we can - change this before 2.2, but until then, add a terrible, terrible - hack. */ - - return scm_is_true (scm_call_2 (c->assoc, c->key, scm_acons (k, v, SCM_EOL))); -} - SCM_DEFINE (scm_hashx_get_handle, "hashx-get-handle", 4, 0, 0, (SCM hash, SCM assoc, SCM table, SCM key), @@ -741,7 +1203,9 @@ SCM_DEFINE (scm_hashx_get_handle, "hashx-get-handle", 4, 0, 0, scm_t_ihashx_closure closure; closure.hash = hash; closure.assoc = assoc; - closure.key = key; + + if (SCM_UNLIKELY (SCM_HASHTABLE_P (table) && SCM_HASHTABLE_WEAK_P (table))) + SCM_MISC_ERROR ("Handle access not permitted on weak table", SCM_EOL); return scm_hash_fn_get_handle (table, key, scm_ihashx, scm_sloppy_assx, (void *) &closure); @@ -762,7 +1226,9 @@ SCM_DEFINE (scm_hashx_create_handle_x, "hashx-create-handle!", 5, 0, 0, scm_t_ihashx_closure closure; closure.hash = hash; closure.assoc = assoc; - closure.key = key; + + if (SCM_UNLIKELY (SCM_HASHTABLE_P (table) && SCM_HASHTABLE_WEAK_P (table))) + SCM_MISC_ERROR ("Handle access not permitted on weak table", SCM_EOL); return scm_hash_fn_create_handle_x (table, key, init, scm_ihashx, scm_sloppy_assx, (void *)&closure); @@ -789,15 +1255,6 @@ SCM_DEFINE (scm_hashx_ref, "hashx-ref", 4, 1, 0, dflt = SCM_BOOL_F; closure.hash = hash; closure.assoc = assoc; - closure.key = key; - - if (SCM_WEAK_TABLE_P (table)) - { - unsigned long h = scm_to_ulong (scm_call_2 (hash, key, - scm_from_ulong (-1))); - return scm_c_weak_table_ref (table, h, assx_predicate, &closure, dflt); - } - return scm_hash_fn_ref (table, key, dflt, scm_ihashx, scm_sloppy_assx, (void *)&closure); } @@ -822,16 +1279,6 @@ SCM_DEFINE (scm_hashx_set_x, "hashx-set!", 5, 0, 0, scm_t_ihashx_closure closure; closure.hash = hash; closure.assoc = assoc; - closure.key = key; - - if (SCM_WEAK_TABLE_P (table)) - { - unsigned long h = scm_to_ulong (scm_call_2 (hash, key, - scm_from_ulong (-1))); - scm_c_weak_table_put_x (table, h, assx_predicate, &closure, key, val); - return val; - } - return scm_hash_fn_set_x (table, key, val, scm_ihashx, scm_sloppy_assx, (void *)&closure); } @@ -853,17 +1300,6 @@ SCM_DEFINE (scm_hashx_remove_x, "hashx-remove!", 4, 0, 0, scm_t_ihashx_closure closure; closure.hash = hash; closure.assoc = assoc; - closure.key = obj; - - if (SCM_WEAK_TABLE_P (table)) - { - unsigned long h = scm_to_ulong (scm_call_2 (hash, obj, - scm_from_ulong (-1))); - scm_c_weak_table_remove_x (table, h, assx_predicate, &closure); - /* See note in hashq-remove!. */ - return SCM_BOOL_F; - } - return scm_hash_fn_remove_x (table, obj, scm_ihashx, scm_sloppy_assx, (void *) &closure); } @@ -884,10 +1320,6 @@ SCM_DEFINE (scm_hash_fold, "hash-fold", 3, 0, 0, #define FUNC_NAME s_scm_hash_fold { SCM_VALIDATE_PROC (1, proc); - - if (SCM_WEAK_TABLE_P (table)) - return scm_weak_table_fold (proc, init, table); - SCM_VALIDATE_HASHTABLE (3, table); return scm_internal_hash_fold ((scm_t_hash_fold_fn) scm_call_3, (void *) SCM_UNPACK (proc), init, table); @@ -909,13 +1341,6 @@ SCM_DEFINE (scm_hash_for_each, "hash-for-each", 2, 0, 0, #define FUNC_NAME s_scm_hash_for_each { SCM_VALIDATE_PROC (1, proc); - - if (SCM_WEAK_TABLE_P (table)) - { - scm_weak_table_for_each (proc, table); - return SCM_UNSPECIFIED; - } - SCM_VALIDATE_HASHTABLE (2, table); scm_internal_hash_for_each_handle (for_each_proc, @@ -934,6 +1359,9 @@ SCM_DEFINE (scm_hash_for_each_handle, "hash-for-each-handle", 2, 0, 0, SCM_ASSERT (scm_is_true (scm_procedure_p (proc)), proc, 1, FUNC_NAME); SCM_VALIDATE_HASHTABLE (2, table); + if (SCM_UNLIKELY (SCM_HASHTABLE_WEAK_P (table))) + SCM_MISC_ERROR ("Handle access not permitted on weak table", SCM_EOL); + scm_internal_hash_for_each_handle ((scm_t_hash_handle_fn) scm_call_1, (void *) SCM_UNPACK (proc), table); @@ -956,10 +1384,6 @@ SCM_DEFINE (scm_hash_map_to_list, "hash-map->list", 2, 0, 0, #define FUNC_NAME s_scm_hash_map_to_list { SCM_VALIDATE_PROC (1, proc); - - if (SCM_WEAK_TABLE_P (table)) - return scm_weak_table_map_to_list (proc, table); - SCM_VALIDATE_HASHTABLE (2, table); return scm_internal_hash_fold (map_proc, (void *) SCM_UNPACK (proc), @@ -1005,9 +1429,6 @@ scm_internal_hash_fold (scm_t_hash_fold_fn fn, void *closure, long i, n; SCM buckets, result = init; - if (SCM_WEAK_TABLE_P (table)) - return scm_c_weak_table_fold (fn, closure, init, table); - SCM_VALIDATE_HASHTABLE (0, table); buckets = SCM_HASHTABLE_VECTOR (table); @@ -1020,7 +1441,14 @@ scm_internal_hash_fold (scm_t_hash_fold_fn fn, void *closure, ls = SCM_CDR (ls)) { handle = SCM_CAR (ls); - result = fn (closure, SCM_CAR (handle), SCM_CDR (handle), result); + + if (SCM_HASHTABLE_WEAK_P (table) && SCM_WEAK_PAIR_DELETED_P (handle)) + /* Don't try to unlink this weak pair, as we're not within + the allocation lock. Instead rely on + vacuum_weak_hash_table to do its job. */ + continue; + else + result = fn (closure, SCM_CAR (handle), SCM_CDR (handle), result); } } @@ -1056,7 +1484,9 @@ scm_internal_hash_for_each_handle (scm_t_hash_handle_fn fn, void *closure, handle = SCM_CAR (ls); if (!scm_is_pair (handle)) SCM_WRONG_TYPE_ARG (SCM_ARG3, buckets); - fn (closure, handle); + if (!SCM_HASHTABLE_WEAK_P (table) + || !SCM_WEAK_PAIR_DELETED_P (handle)) + fn (closure, handle); ls = SCM_CDR (ls); } } @@ -1064,11 +1494,139 @@ scm_internal_hash_for_each_handle (scm_t_hash_handle_fn fn, void *closure, #undef FUNC_NAME +/* Weak pairs for use in weak alist vectors and weak hash tables. + + We have weal-car pairs, weak-cdr pairs, and doubly weak pairs. In weak + pairs, the weak component(s) are not scanned for pointers and are + registered as disapperaring links; therefore, the weak component may be + set to NULL by the garbage collector when no other reference to that word + exist. Thus, users should only access weak pairs via the + `SCM_WEAK_PAIR_C[AD]R ()' macros. See also `scm_fixup_weak_alist ()' in + `hashtab.c'. */ + +/* Type descriptors for weak-c[ad]r pairs. */ +static GC_descr wcar_pair_descr, wcdr_pair_descr; + + +SCM +scm_weak_car_pair (SCM car, SCM cdr) +{ + scm_t_cell *cell; + + cell = (scm_t_cell *) GC_malloc_explicitly_typed (sizeof (*cell), + wcar_pair_descr); + + cell->word_0 = car; + cell->word_1 = cdr; + + if (SCM_NIMP (car)) + /* Weak car cells make sense iff the car is non-immediate. */ + SCM_I_REGISTER_DISAPPEARING_LINK ((void **) &cell->word_0, SCM2PTR (car)); + + return SCM_PACK (cell); +} + +SCM +scm_weak_cdr_pair (SCM car, SCM cdr) +{ + scm_t_cell *cell; + + cell = (scm_t_cell *) GC_malloc_explicitly_typed (sizeof (*cell), + wcdr_pair_descr); + + cell->word_0 = car; + cell->word_1 = cdr; + + if (SCM_NIMP (cdr)) + /* Weak cdr cells make sense iff the cdr is non-immediate. */ + SCM_I_REGISTER_DISAPPEARING_LINK ((void **) &cell->word_1, SCM2PTR (cdr)); + + return SCM_PACK (cell); +} + +SCM +scm_doubly_weak_pair (SCM car, SCM cdr) +{ + /* Doubly weak cells shall not be scanned at all for pointers. */ + scm_t_cell *cell = (scm_t_cell *) scm_gc_malloc_pointerless (sizeof (*cell), + "weak cell"); + + cell->word_0 = car; + cell->word_1 = cdr; + + if (SCM_NIMP (car)) + SCM_I_REGISTER_DISAPPEARING_LINK ((void **) &cell->word_0, SCM2PTR (car)); + if (SCM_NIMP (cdr)) + SCM_I_REGISTER_DISAPPEARING_LINK ((void **) &cell->word_1, SCM2PTR (cdr)); + + return SCM_PACK (cell); +} + + +/* Backward-compatibility with the former internal weak-table API. */ + +SCM +scm_weak_table_refq (SCM table, SCM key, SCM dflt) +{ + return scm_hash_fn_ref (table, key, dflt, + (scm_t_hash_fn) scm_ihashq, + (scm_t_assoc_fn) scm_sloppy_assq, + 0); +} + +void +scm_weak_table_putq_x (SCM table, SCM key, SCM value) +{ + (void) scm_hashq_set_x (table, key, value); +} + +SCM +scm_c_make_weak_table (unsigned long size, scm_t_weak_table_kind kind) +{ + switch (kind) + { + case SCM_WEAK_TABLE_KIND_KEY: + return scm_make_weak_key_hash_table (scm_from_ulong (size)); + case SCM_WEAK_TABLE_KIND_VALUE: + return scm_make_weak_value_hash_table (scm_from_ulong (size)); + case SCM_WEAK_TABLE_KIND_BOTH: + return scm_make_doubly_weak_hash_table (scm_from_ulong (size)); + case SCM_WEAK_TABLE_KIND_NONE: + return scm_make_hash_table (scm_from_ulong (size)); + default: + abort (); + } +} + +SCM +scm_c_weak_table_fold (scm_t_hash_fold_fn fn, void *closure, + SCM init, SCM table) +{ + return scm_internal_hash_fold (fn, closure, init, table); +} + + void scm_init_hashtab () { + /* Initialize weak pairs. */ + GC_word wcar_pair_bitmap[GC_BITMAP_SIZE (scm_t_cell)] = { 0 }; + GC_word wcdr_pair_bitmap[GC_BITMAP_SIZE (scm_t_cell)] = { 0 }; + + /* In a weak-car pair, only the second word must be scanned for + pointers. */ + GC_set_bit (wcar_pair_bitmap, GC_WORD_OFFSET (scm_t_cell, word_1)); + wcar_pair_descr = GC_make_descriptor (wcar_pair_bitmap, + GC_WORD_LEN (scm_t_cell)); + + /* Conversely, in a weak-cdr pair, only the first word must be scanned for + pointers. */ + GC_set_bit (wcdr_pair_bitmap, GC_WORD_OFFSET (scm_t_cell, word_0)); + wcdr_pair_descr = GC_make_descriptor (wcdr_pair_bitmap, + GC_WORD_LEN (scm_t_cell)); + #include "libguile/hashtab.x" } diff --git a/libguile/hashtab.h b/libguile/hashtab.h index 82ed22e66..1705cf744 100644 --- a/libguile/hashtab.h +++ b/libguile/hashtab.h @@ -3,7 +3,7 @@ #ifndef SCM_HASHTAB_H #define SCM_HASHTAB_H -/* Copyright (C) 1995,1996,1999,2000,2001, 2003, 2004, 2006, 2008, 2009, 2011 Free Software Foundation, Inc. +/* Copyright (C) 1995,1996,1999,2000,2001, 2003, 2004, 2006, 2008, 2009, 2017 Free Software Foundation, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public License @@ -25,14 +25,36 @@ #include "libguile/__scm.h" +#include "libguile/weak-list.h" + #define SCM_HASHTABLE_P(x) (SCM_HAS_TYP7 (x, scm_tc7_hashtable)) + +/* Types of weak hash tables. */ +typedef enum { + SCM_WEAK_TABLE_KIND_NONE = 0, + SCM_WEAK_TABLE_KIND_KEY, + SCM_WEAK_TABLE_KIND_VALUE, + SCM_WEAK_TABLE_KIND_BOTH +} scm_t_weak_table_kind; + #define SCM_VALIDATE_HASHTABLE(pos, arg) \ SCM_MAKE_VALIDATE_MSG (pos, arg, HASHTABLE_P, "hash-table") #define SCM_HASHTABLE_VECTOR(h) SCM_CELL_OBJECT_1 (h) #define SCM_SET_HASHTABLE_VECTOR(x, v) SCM_SET_CELL_OBJECT_1 ((x), (v)) #define SCM_HASHTABLE(x) ((scm_t_hashtable *) SCM_CELL_WORD_2 (x)) +#define SCM_HASHTABLE_FLAGS(x) (SCM_HASHTABLE (x)->flags) +#define SCM_HASHTABLE_WEAK_KEY_P(x) \ + (SCM_HASHTABLE_FLAGS (x) == SCM_WEAK_TABLE_KIND_KEY) +#define SCM_HASHTABLE_WEAK_VALUE_P(x) \ + (SCM_HASHTABLE_FLAGS (x) == SCM_WEAK_TABLE_KIND_VALUE) +#define SCM_HASHTABLE_DOUBLY_WEAK_P(x) \ + (SCM_HASHTABLE_FLAGS (x) == SCM_WEAK_TABLE_KIND_BOTH) + +#define SCM_HASHTABLE_WEAK_P(x) \ + (SCM_HASHTABLE_FLAGS (x) != SCM_WEAK_TABLE_KIND_NONE) + #define SCM_HASHTABLE_N_ITEMS(x) (SCM_HASHTABLE (x)->n_items) #define SCM_SET_HASHTABLE_N_ITEMS(x, n) (SCM_HASHTABLE (x)->n_items = n) #define SCM_HASHTABLE_INCREMENT(x) (SCM_HASHTABLE_N_ITEMS(x)++) @@ -55,6 +77,10 @@ typedef unsigned long (*scm_t_hash_fn) (SCM obj, unsigned long max, some equality predicate. */ typedef SCM (*scm_t_assoc_fn) (SCM obj, SCM alist, void *closure); +/* Function that returns true if the given object is the one we are + looking for, for scm_hash_fn_ref_by_hash. */ +typedef int (*scm_t_hash_predicate_fn) (SCM obj, void *closure); + /* Function to fold over the entries of a hash table. */ typedef SCM (*scm_t_hash_fold_fn) (void *closure, SCM key, SCM value, SCM result); @@ -64,6 +90,7 @@ typedef SCM (*scm_t_hash_fold_fn) (void *closure, SCM key, SCM value, typedef SCM (*scm_t_hash_handle_fn) (void *closure, SCM handle); typedef struct scm_t_hashtable { + scm_t_weak_table_kind flags; /* properties of table */ unsigned long n_items; /* number of items in table */ unsigned long lower; /* when to shrink */ unsigned long upper; /* when to grow */ @@ -77,8 +104,14 @@ typedef struct scm_t_hashtable { SCM_API SCM scm_vector_to_hash_table (SCM vector); SCM_API SCM scm_c_make_hash_table (unsigned long k); SCM_API SCM scm_make_hash_table (SCM n); +SCM_API SCM scm_make_weak_key_hash_table (SCM k); +SCM_API SCM scm_make_weak_value_hash_table (SCM k); +SCM_API SCM scm_make_doubly_weak_hash_table (SCM k); SCM_API SCM scm_hash_table_p (SCM h); +SCM_API SCM scm_weak_key_hash_table_p (SCM h); +SCM_API SCM scm_weak_value_hash_table_p (SCM h); +SCM_API SCM scm_doubly_weak_hash_table_p (SCM h); SCM_INTERNAL void scm_i_rehash (SCM table, scm_t_hash_fn hash_fn, void *closure, const char *func_name); @@ -88,6 +121,10 @@ SCM_API SCM scm_hash_fn_get_handle (SCM table, SCM obj, scm_t_hash_fn hash_fn, scm_t_assoc_fn assoc_fn, void *closure); +SCM_INTERNAL +SCM scm_hash_fn_get_handle_by_hash (SCM table, unsigned long raw_hash, + scm_t_hash_predicate_fn predicate_fn, + void *closure); SCM_API SCM scm_hash_fn_create_handle_x (SCM table, SCM obj, SCM init, scm_t_hash_fn hash_fn, scm_t_assoc_fn assoc_fn, @@ -138,6 +175,17 @@ SCM_API SCM scm_hash_count (SCM hash, SCM pred); SCM_INTERNAL void scm_i_hashtable_print (SCM exp, SCM port, scm_print_state *pstate); SCM_INTERNAL void scm_init_hashtab (void); + +/* Guile 2.2.x (x <= 2) weak-table API. */ + +SCM_INTERNAL SCM scm_c_make_weak_table (unsigned long k, + scm_t_weak_table_kind kind); +SCM_INTERNAL SCM scm_c_weak_table_fold (scm_t_hash_fold_fn proc, void *closure, + SCM init, SCM table); +SCM_INTERNAL SCM scm_weak_table_refq (SCM table, SCM key, SCM dflt); +SCM_INTERNAL void scm_weak_table_putq_x (SCM table, SCM key, SCM value); + + #endif /* SCM_HASHTAB_H */ /* diff --git a/libguile/init.c b/libguile/init.c index b046685d4..64d3f8d63 100644 --- a/libguile/init.c +++ b/libguile/init.c @@ -1,4 +1,4 @@ -/* Copyright (C) 1995-2004, 2006, 2009-2014 Free Software Foundation, Inc. +/* Copyright (C) 1995-2004, 2006, 2009-2014, 2017 Free Software Foundation, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public License @@ -387,7 +387,6 @@ scm_i_init_guile (void *base) scm_storage_prehistory (); scm_threads_prehistory (base); /* requires storage_prehistory */ - scm_weak_table_prehistory (); /* requires storage_prehistory */ #ifdef GUILE_DEBUG_MALLOC scm_debug_malloc_prehistory (); #endif @@ -495,7 +494,6 @@ scm_i_init_guile (void *base) scm_init_trees (); scm_init_version (); scm_init_weak_set (); - scm_init_weak_table (); scm_init_weak_vectors (); scm_init_guardians (); /* requires smob_prehistory */ scm_init_vports (); diff --git a/libguile/print.c b/libguile/print.c index 24c532f29..75a44d99c 100644 --- a/libguile/print.c +++ b/libguile/print.c @@ -701,9 +701,6 @@ iprin1 (SCM exp, SCM port, scm_print_state *pstate) case scm_tc7_weak_set: scm_i_weak_set_print (exp, port, pstate); break; - case scm_tc7_weak_table: - scm_i_weak_table_print (exp, port, pstate); - break; case scm_tc7_fluid: scm_i_fluid_print (exp, port, pstate); break; diff --git a/libguile/procprop.c b/libguile/procprop.c index ad56bd5ba..c906c93f8 100644 --- a/libguile/procprop.c +++ b/libguile/procprop.c @@ -1,4 +1,4 @@ -/* Copyright (C) 1995,1996,1998,2000,2001,2003,2004, 2006, 2008, 2009, 2010, 2011, 2012, 2013 Free Software Foundation, Inc. +/* Copyright (C) 1995,1996,1998,2000,2001,2003,2004, 2006, 2008, 2009, 2010, 2011, 2012, 2013, 2017 Free Software Foundation, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public License @@ -30,7 +30,7 @@ #include "libguile/gsubr.h" #include "libguile/smob.h" #include "libguile/vectors.h" -#include "libguile/weak-table.h" +#include "libguile/hashtab.h" #include "libguile/programs.h" #include "libguile/vm-builtins.h" diff --git a/libguile/tags.h b/libguile/tags.h index 3a01a1587..9aa4d00d0 100644 --- a/libguile/tags.h +++ b/libguile/tags.h @@ -3,7 +3,7 @@ #ifndef SCM_TAGS_H #define SCM_TAGS_H -/* Copyright (C) 1995,1996,1997,1998,1999,2000,2001,2002,2003,2004,2008,2009,2010,2011,2012,2013,2014,2015 +/* Copyright (C) 1995,1996,1997,1998,1999,2000,2001,2002,2003,2004,2008,2009,2010,2011,2012,2013,2014,2015,2017 * Free Software Foundation, Inc. * * This library is free software; you can redistribute it and/or @@ -423,7 +423,6 @@ typedef union SCM { struct { scm_t_bits n; } n; } SCM; #define scm_tc7_bytevector 0x4d #define scm_tc7_unused_4f 0x4f #define scm_tc7_weak_set 0x55 -#define scm_tc7_weak_table 0x57 #define scm_tc7_array 0x5d #define scm_tc7_bitvector 0x5f #define scm_tc7_unused_65 0x65 diff --git a/libguile/weak-list.h b/libguile/weak-list.h index 989cb7f0a..e8e5a3555 100644 --- a/libguile/weak-list.h +++ b/libguile/weak-list.h @@ -3,7 +3,7 @@ #ifndef SCM_WEAK_LIST_H #define SCM_WEAK_LIST_H -/* Copyright (C) 2016 Free Software Foundation, Inc. +/* Copyright (C) 2016, 2017 Free Software Foundation, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public License @@ -24,6 +24,7 @@ #include "libguile/__scm.h" +#include "libguile/pairs.h" #include "libguile/weak-vector.h" @@ -64,6 +65,35 @@ scm_i_visit_weak_list (SCM *list_loc, void (*visit) (SCM)) } + +/* Weak pairs. */ + +SCM_INTERNAL SCM scm_weak_car_pair (SCM car, SCM cdr); +SCM_INTERNAL SCM scm_weak_cdr_pair (SCM car, SCM cdr); +SCM_INTERNAL SCM scm_doubly_weak_pair (SCM car, SCM cdr); + +/* Testing the weak component(s) of a cell for reachability. */ +#define SCM_WEAK_PAIR_WORD_DELETED_P(_cell, _word) \ + (SCM_UNPACK (SCM_CELL_OBJECT ((_cell), (_word))) == 0) +#define SCM_WEAK_PAIR_CAR_DELETED_P(_cell) \ + (SCM_WEAK_PAIR_WORD_DELETED_P ((_cell), 0)) +#define SCM_WEAK_PAIR_CDR_DELETED_P(_cell) \ + (SCM_WEAK_PAIR_WORD_DELETED_P ((_cell), 1)) + +#define SCM_WEAK_PAIR_DELETED_P(_cell) \ + ((SCM_WEAK_PAIR_CAR_DELETED_P (_cell)) \ + || (SCM_WEAK_PAIR_CDR_DELETED_P (_cell))) + +/* Accessing the components of a weak cell. These return `SCM_UNDEFINED' if + the car/cdr has been collected. */ +#define SCM_WEAK_PAIR_WORD(_cell, _word) \ + (SCM_WEAK_PAIR_WORD_DELETED_P ((_cell), (_word)) \ + ? SCM_UNDEFINED \ + : SCM_CELL_OBJECT ((_cell), (_word))) +#define SCM_WEAK_PAIR_CAR(_cell) (SCM_WEAK_PAIR_WORD ((_cell), 0)) +#define SCM_WEAK_PAIR_CDR(_cell) (SCM_WEAK_PAIR_WORD ((_cell), 1)) + + #endif /* SCM_WEAK_LIST_H */ /* diff --git a/libguile/weak-table.c b/libguile/weak-table.c deleted file mode 100644 index 599c4cf0e..000000000 --- a/libguile/weak-table.c +++ /dev/null @@ -1,1180 +0,0 @@ -/* Copyright (C) 2011, 2012, 2013, 2014 Free Software Foundation, Inc. - * - * This library is free software; you can redistribute it and/or - * modify it under the terms of the GNU Lesser General Public License - * as published by the Free Software Foundation; either version 3 of - * the License, or (at your option) any later version. - * - * This library is distributed in the hope that it will be useful, but - * WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - * Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public - * License along with this library; if not, write to the Free Software - * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA - * 02110-1301 USA - */ - - - -#ifdef HAVE_CONFIG_H -# include -#endif - -#include - -#include "libguile/bdw-gc.h" -#include - -#include "libguile/_scm.h" -#include "libguile/hash.h" -#include "libguile/eval.h" -#include "libguile/ports.h" - -#include "libguile/validate.h" -#include "libguile/weak-list.h" -#include "libguile/weak-table.h" - - -/* Weak Tables - - This file implements weak hash tables. Weak hash tables are - generally used when you want to augment some object with additional - data, but when you don't have space to store the data in the object. - For example, procedure properties are implemented with weak tables. - - Weak tables are implemented using an open-addressed hash table. - Basically this means that there is an array of entries, and the item - is expected to be found the slot corresponding to its hash code, - modulo the length of the array. - - Collisions are handled using linear probing with the Robin Hood - technique. See Pedro Celis' paper, "Robin Hood Hashing": - - http://www.cs.uwaterloo.ca/research/tr/1986/CS-86-14.pdf - - The vector of entries is allocated in such a way that the GC doesn't - trace the weak values. For doubly-weak tables, this means that the - entries are allocated as an "atomic" piece of memory. Key-weak and - value-weak tables use a special GC kind with a custom mark procedure. - When items are added weakly into table, a disappearing link is - registered to their locations. If the referent is collected, then - that link will be zeroed out. - - An entry in the table consists of the key and the value, together - with the hash code of the key. We munge hash codes so that they are - never 0. In this way we can detect removed entries (key of zero but - nonzero hash code), and can then reshuffle elements as needed to - maintain the robin hood ordering. - - Compared to buckets-and-chains hash tables, open addressing has the - advantage that it is very cache-friendly. It also uses less memory. - - Implementation-wise, there are two things to note. - - 1. We assume that hash codes are evenly distributed across the - range of unsigned longs. The actual hash code stored in the - entry is left-shifted by 1 bit (losing 1 bit of hash precision), - and then or'd with 1. In this way we ensure that the hash field - of an occupied entry is nonzero. To map to an index, we - right-shift the hash by one, divide by the size, and take the - remainder. - - 2. Since the weak references are stored in an atomic region with - disappearing links, they need to be accessed with the GC alloc - lock. `copy_weak_entry' will do that for you. The hash code - itself can be read outside the lock, though. - */ - - -typedef struct { - unsigned long hash; - scm_t_bits key; - scm_t_bits value; -} scm_t_weak_entry; - - -struct weak_entry_data { - scm_t_weak_entry *in; - scm_t_weak_entry *out; -}; - -static void* -do_copy_weak_entry (void *data) -{ - struct weak_entry_data *e = data; - - e->out->hash = e->in->hash; - e->out->key = e->in->key; - e->out->value = e->in->value; - - return NULL; -} - -static void -copy_weak_entry (scm_t_weak_entry *src, scm_t_weak_entry *dst) -{ - struct weak_entry_data data; - - data.in = src; - data.out = dst; - - GC_call_with_alloc_lock (do_copy_weak_entry, &data); -} - -static void -register_disappearing_links (scm_t_weak_entry *entry, - SCM k, SCM v, - scm_t_weak_table_kind kind) -{ - if (SCM_UNPACK (k) && SCM_HEAP_OBJECT_P (k) - && (kind == SCM_WEAK_TABLE_KIND_KEY - || kind == SCM_WEAK_TABLE_KIND_BOTH)) - SCM_I_REGISTER_DISAPPEARING_LINK ((void **) &entry->key, - SCM2PTR (k)); - - if (SCM_UNPACK (v) && SCM_HEAP_OBJECT_P (v) - && (kind == SCM_WEAK_TABLE_KIND_VALUE - || kind == SCM_WEAK_TABLE_KIND_BOTH)) - SCM_I_REGISTER_DISAPPEARING_LINK ((void **) &entry->value, - SCM2PTR (v)); -} - -static void -unregister_disappearing_links (scm_t_weak_entry *entry, - scm_t_weak_table_kind kind) -{ - if (kind == SCM_WEAK_TABLE_KIND_KEY || kind == SCM_WEAK_TABLE_KIND_BOTH) - GC_unregister_disappearing_link ((void **) &entry->key); - - if (kind == SCM_WEAK_TABLE_KIND_VALUE || kind == SCM_WEAK_TABLE_KIND_BOTH) - GC_unregister_disappearing_link ((void **) &entry->value); -} - -#ifndef HAVE_GC_MOVE_DISAPPEARING_LINK -static void -GC_move_disappearing_link (void **from, void **to) -{ - GC_unregister_disappearing_link (from); - SCM_I_REGISTER_DISAPPEARING_LINK (to, *to); -} -#endif - -static void -move_disappearing_links (scm_t_weak_entry *from, scm_t_weak_entry *to, - SCM key, SCM value, scm_t_weak_table_kind kind) -{ - if ((kind == SCM_WEAK_TABLE_KIND_KEY || kind == SCM_WEAK_TABLE_KIND_BOTH) - && SCM_HEAP_OBJECT_P (key)) - GC_move_disappearing_link ((void **) &from->key, (void **) &to->key); - - if ((kind == SCM_WEAK_TABLE_KIND_VALUE || kind == SCM_WEAK_TABLE_KIND_BOTH) - && SCM_HEAP_OBJECT_P (value)) - GC_move_disappearing_link ((void **) &from->value, (void **) &to->value); -} - -static void -move_weak_entry (scm_t_weak_entry *from, scm_t_weak_entry *to, - scm_t_weak_table_kind kind) -{ - if (from->hash) - { - scm_t_weak_entry copy; - - copy_weak_entry (from, ©); - to->hash = copy.hash; - to->key = copy.key; - to->value = copy.value; - - move_disappearing_links (from, to, - SCM_PACK (copy.key), SCM_PACK (copy.value), - kind); - } - else - { - to->hash = 0; - to->key = 0; - to->value = 0; - } -} - - -typedef struct { - scm_t_weak_entry *entries; /* the data */ - scm_i_pthread_mutex_t lock; /* the lock */ - scm_t_weak_table_kind kind; /* what kind of table it is */ - unsigned long size; /* total number of slots. */ - unsigned long n_items; /* number of items in table */ - unsigned long lower; /* when to shrink */ - unsigned long upper; /* when to grow */ - int size_index; /* index into hashtable_size */ - int min_size_index; /* minimum size_index */ -} scm_t_weak_table; - - -#define SCM_WEAK_TABLE_P(x) (SCM_HAS_TYP7 (x, scm_tc7_weak_table)) -#define SCM_VALIDATE_WEAK_TABLE(pos, arg) \ - SCM_MAKE_VALIDATE_MSG (pos, arg, WEAK_TABLE_P, "weak-table") -#define SCM_WEAK_TABLE(x) ((scm_t_weak_table *) SCM_CELL_WORD_1 (x)) - - -static unsigned long -hash_to_index (unsigned long hash, unsigned long size) -{ - return (hash >> 1) % size; -} - -static unsigned long -entry_distance (unsigned long hash, unsigned long k, unsigned long size) -{ - unsigned long origin = hash_to_index (hash, size); - - if (k >= origin) - return k - origin; - else - /* The other key was displaced and wrapped around. */ - return size - origin + k; -} - -static void -rob_from_rich (scm_t_weak_table *table, unsigned long k) -{ - unsigned long empty, size; - - size = table->size; - - /* If we are to free up slot K in the table, we need room to do so. */ - assert (table->n_items < size); - - empty = k; - do - empty = (empty + 1) % size; - while (table->entries[empty].hash); - - do - { - unsigned long last = empty ? (empty - 1) : (size - 1); - move_weak_entry (&table->entries[last], &table->entries[empty], - table->kind); - empty = last; - } - while (empty != k); - - table->entries[empty].hash = 0; - table->entries[empty].key = 0; - table->entries[empty].value = 0; -} - -static void -give_to_poor (scm_t_weak_table *table, unsigned long k) -{ - /* Slot K was just freed up; possibly shuffle others down. */ - unsigned long size = table->size; - - while (1) - { - unsigned long next = (k + 1) % size; - unsigned long hash; - scm_t_weak_entry copy; - - hash = table->entries[next].hash; - - if (!hash || hash_to_index (hash, size) == next) - break; - - copy_weak_entry (&table->entries[next], ©); - - if (!copy.key || !copy.value) - /* Lost weak reference. */ - { - give_to_poor (table, next); - table->n_items--; - continue; - } - - move_weak_entry (&table->entries[next], &table->entries[k], - table->kind); - - k = next; - } - - /* We have shuffled down any entries that should be shuffled down; now - free the end. */ - table->entries[k].hash = 0; - table->entries[k].key = 0; - table->entries[k].value = 0; -} - - - - -/* The GC "kinds" for singly-weak tables. */ -static int weak_key_gc_kind; -static int weak_value_gc_kind; - -static struct GC_ms_entry * -mark_weak_key_table (GC_word *addr, struct GC_ms_entry *mark_stack_ptr, - struct GC_ms_entry *mark_stack_limit, GC_word env) -{ - scm_t_weak_entry *entries = (scm_t_weak_entry*) addr; - unsigned long k, size = GC_size (addr) / sizeof (scm_t_weak_entry); - - for (k = 0; k < size; k++) - if (entries[k].hash && entries[k].key) - { - SCM value = SCM_PACK (entries[k].value); - mark_stack_ptr = GC_MARK_AND_PUSH ((GC_word*) SCM2PTR (value), - mark_stack_ptr, mark_stack_limit, - NULL); - } - - return mark_stack_ptr; -} - -static struct GC_ms_entry * -mark_weak_value_table (GC_word *addr, struct GC_ms_entry *mark_stack_ptr, - struct GC_ms_entry *mark_stack_limit, GC_word env) -{ - scm_t_weak_entry *entries = (scm_t_weak_entry*) addr; - unsigned long k, size = GC_size (addr) / sizeof (scm_t_weak_entry); - - for (k = 0; k < size; k++) - if (entries[k].hash && entries[k].value) - { - SCM key = SCM_PACK (entries[k].key); - mark_stack_ptr = GC_MARK_AND_PUSH ((GC_word*) SCM2PTR (key), - mark_stack_ptr, mark_stack_limit, - NULL); - } - - return mark_stack_ptr; -} - -static scm_t_weak_entry * -allocate_entries (unsigned long size, scm_t_weak_table_kind kind) -{ - scm_t_weak_entry *ret; - size_t bytes = size * sizeof (*ret); - - switch (kind) - { - case SCM_WEAK_TABLE_KIND_KEY: - ret = GC_generic_malloc (bytes, weak_key_gc_kind); - break; - case SCM_WEAK_TABLE_KIND_VALUE: - ret = GC_generic_malloc (bytes, weak_value_gc_kind); - break; - case SCM_WEAK_TABLE_KIND_BOTH: - ret = scm_gc_malloc_pointerless (bytes, "weak-table"); - break; - default: - abort (); - } - - memset (ret, 0, bytes); - - return ret; -} - - - -/* Growing or shrinking is triggered when the load factor - * - * L = N / S (N: number of items in table, S: bucket vector length) - * - * passes an upper limit of 0.9 or a lower limit of 0.2. - * - * The implementation stores the upper and lower number of items which - * trigger a resize in the hashtable object. - * - * Possible hash table sizes (primes) are stored in the array - * hashtable_size. - */ - -static unsigned long hashtable_size[] = { - 31, 61, 113, 223, 443, 883, 1759, 3517, 7027, 14051, 28099, 56197, 112363, - 224717, 449419, 898823, 1797641, 3595271, 7190537, 14381041, 28762081, - 57524111, 115048217, 230096423 -}; - -#define HASHTABLE_SIZE_N (sizeof(hashtable_size)/sizeof(unsigned long)) - -static int -compute_size_index (scm_t_weak_table *table) -{ - int i = table->size_index; - - if (table->n_items < table->lower) - { - /* rehashing is not triggered when i <= min_size */ - do - --i; - while (i > table->min_size_index - && table->n_items < hashtable_size[i] / 5); - } - else if (table->n_items > table->upper) - { - ++i; - if (i >= HASHTABLE_SIZE_N) - /* The biggest size currently is 230096423, which for a 32-bit - machine will occupy 2.3GB of memory at a load of 80%. There - is probably something better to do here, but if you have a - weak map of that size, you are hosed in any case. */ - abort (); - } - - return i; -} - -static int -is_acceptable_size_index (scm_t_weak_table *table, int size_index) -{ - int computed = compute_size_index (table); - - if (size_index == computed) - /* We were going to grow or shrink, and allocating the new vector - didn't change the target size. */ - return 1; - - if (size_index == computed + 1) - { - /* We were going to enlarge the table, but allocating the new - vector finalized some objects, making an enlargement - unnecessary. It might still be a good idea to use the larger - table, though. (This branch also gets hit if, while allocating - the vector, some other thread was actively removing items from - the table. That is less likely, though.) */ - unsigned long new_lower = hashtable_size[size_index] / 5; - - return table->size > new_lower; - } - - if (size_index == computed - 1) - { - /* We were going to shrink the table, but when we dropped the lock - to allocate the new vector, some other thread added elements to - the table. */ - return 0; - } - - /* The computed size differs from our newly allocated size by more - than one size index -- recalculate. */ - return 0; -} - -static void -resize_table (scm_t_weak_table *table) -{ - scm_t_weak_entry *old_entries, *new_entries; - int new_size_index; - unsigned long old_size, new_size, old_k; - - do - { - new_size_index = compute_size_index (table); - if (new_size_index == table->size_index) - return; - new_size = hashtable_size[new_size_index]; - new_entries = allocate_entries (new_size, table->kind); - } - while (!is_acceptable_size_index (table, new_size_index)); - - old_entries = table->entries; - old_size = table->size; - - table->size_index = new_size_index; - table->size = new_size; - if (new_size_index <= table->min_size_index) - table->lower = 0; - else - table->lower = new_size / 5; - table->upper = 9 * new_size / 10; - table->n_items = 0; - table->entries = new_entries; - - for (old_k = 0; old_k < old_size; old_k++) - { - scm_t_weak_entry copy; - unsigned long new_k, distance; - - if (!old_entries[old_k].hash) - continue; - - copy_weak_entry (&old_entries[old_k], ©); - - if (!copy.key || !copy.value) - continue; - - new_k = hash_to_index (copy.hash, new_size); - - for (distance = 0; ; distance++, new_k = (new_k + 1) % new_size) - { - unsigned long other_hash = new_entries[new_k].hash; - - if (!other_hash) - /* Found an empty entry. */ - break; - - /* Displace the entry if our distance is less, otherwise keep - looking. */ - if (entry_distance (other_hash, new_k, new_size) < distance) - { - rob_from_rich (table, new_k); - break; - } - } - - table->n_items++; - new_entries[new_k].hash = copy.hash; - new_entries[new_k].key = copy.key; - new_entries[new_k].value = copy.value; - - register_disappearing_links (&new_entries[new_k], - SCM_PACK (copy.key), SCM_PACK (copy.value), - table->kind); - } -} - -/* Run after GC via do_vacuum_weak_table, this function runs over the - whole table, removing lost weak references, reshuffling the table as it - goes. It might resize the table if it reaps enough entries. */ -static void -vacuum_weak_table (scm_t_weak_table *table) -{ - scm_t_weak_entry *entries = table->entries; - unsigned long size = table->size; - unsigned long k; - - for (k = 0; k < size; k++) - { - unsigned long hash = entries[k].hash; - - if (hash) - { - scm_t_weak_entry copy; - - copy_weak_entry (&entries[k], ©); - - if (!copy.key || !copy.value) - /* Lost weak reference; reshuffle. */ - { - give_to_poor (table, k); - table->n_items--; - } - } - } - - if (table->n_items < table->lower) - resize_table (table); -} - - - - -static SCM -weak_table_ref (scm_t_weak_table *table, unsigned long hash, - scm_t_table_predicate_fn pred, void *closure, - SCM dflt) -{ - unsigned long k, distance, size; - scm_t_weak_entry *entries; - - size = table->size; - entries = table->entries; - - hash = (hash << 1) | 0x1; - k = hash_to_index (hash, size); - - for (distance = 0; distance < size; distance++, k = (k + 1) % size) - { - unsigned long other_hash; - - retry: - other_hash = entries[k].hash; - - if (!other_hash) - /* Not found. */ - return dflt; - - if (hash == other_hash) - { - scm_t_weak_entry copy; - - copy_weak_entry (&entries[k], ©); - - if (!copy.key || !copy.value) - /* Lost weak reference; reshuffle. */ - { - give_to_poor (table, k); - table->n_items--; - goto retry; - } - - if (pred (SCM_PACK (copy.key), SCM_PACK (copy.value), closure)) - /* Found. */ - return SCM_PACK (copy.value); - } - - /* If the entry's distance is less, our key is not in the table. */ - if (entry_distance (other_hash, k, size) < distance) - return dflt; - } - - /* If we got here, then we were unfortunate enough to loop through the - whole table. Shouldn't happen, but hey. */ - return dflt; -} - - -static void -weak_table_put_x (scm_t_weak_table *table, unsigned long hash, - scm_t_table_predicate_fn pred, void *closure, - SCM key, SCM value) -{ - unsigned long k, distance, size; - scm_t_weak_entry *entries; - - size = table->size; - entries = table->entries; - - hash = (hash << 1) | 0x1; - k = hash_to_index (hash, size); - - for (distance = 0; ; distance++, k = (k + 1) % size) - { - unsigned long other_hash; - - retry: - other_hash = entries[k].hash; - - if (!other_hash) - /* Found an empty entry. */ - break; - - if (other_hash == hash) - { - scm_t_weak_entry copy; - - copy_weak_entry (&entries[k], ©); - - if (!copy.key || !copy.value) - /* Lost weak reference; reshuffle. */ - { - give_to_poor (table, k); - table->n_items--; - goto retry; - } - - if (pred (SCM_PACK (copy.key), SCM_PACK (copy.value), closure)) - /* Found an entry with this key. */ - break; - } - - if (table->n_items > table->upper) - /* Full table, time to resize. */ - { - resize_table (table); - return weak_table_put_x (table, hash >> 1, pred, closure, key, value); - } - - /* Displace the entry if our distance is less, otherwise keep - looking. */ - if (entry_distance (other_hash, k, size) < distance) - { - rob_from_rich (table, k); - break; - } - } - - /* Fast path for updated values for existing entries of weak-key - tables. */ - if (table->kind == SCM_WEAK_TABLE_KIND_KEY && - entries[k].hash == hash && - entries[k].key == SCM_UNPACK (key)) - { - entries[k].value = SCM_UNPACK (value); - return; - } - - if (entries[k].hash) - unregister_disappearing_links (&entries[k], table->kind); - else - table->n_items++; - - entries[k].hash = hash; - entries[k].key = SCM_UNPACK (key); - entries[k].value = SCM_UNPACK (value); - - register_disappearing_links (&entries[k], key, value, table->kind); -} - - -static void -weak_table_remove_x (scm_t_weak_table *table, unsigned long hash, - scm_t_table_predicate_fn pred, void *closure) -{ - unsigned long k, distance, size; - scm_t_weak_entry *entries; - - size = table->size; - entries = table->entries; - - hash = (hash << 1) | 0x1; - k = hash_to_index (hash, size); - - for (distance = 0; distance < size; distance++, k = (k + 1) % size) - { - unsigned long other_hash; - - retry: - other_hash = entries[k].hash; - - if (!other_hash) - /* Not found. */ - return; - - if (other_hash == hash) - { - scm_t_weak_entry copy; - - copy_weak_entry (&entries[k], ©); - - if (!copy.key || !copy.value) - /* Lost weak reference; reshuffle. */ - { - give_to_poor (table, k); - table->n_items--; - goto retry; - } - - if (pred (SCM_PACK (copy.key), SCM_PACK (copy.value), closure)) - /* Found an entry with this key. */ - { - entries[k].hash = 0; - entries[k].key = 0; - entries[k].value = 0; - - unregister_disappearing_links (&entries[k], table->kind); - - if (--table->n_items < table->lower) - resize_table (table); - else - give_to_poor (table, k); - - return; - } - } - - /* If the entry's distance is less, our key is not in the table. */ - if (entry_distance (other_hash, k, size) < distance) - return; - } -} - - - -static SCM -make_weak_table (unsigned long k, scm_t_weak_table_kind kind) -{ - scm_t_weak_table *table; - - int i = 0, n = k ? k : 31; - while (i + 1 < HASHTABLE_SIZE_N && n > hashtable_size[i]) - ++i; - n = hashtable_size[i]; - - table = scm_gc_malloc (sizeof (*table), "weak-table"); - table->entries = allocate_entries (n, kind); - table->kind = kind; - table->n_items = 0; - table->size = n; - table->lower = 0; - table->upper = 9 * n / 10; - table->size_index = i; - table->min_size_index = i; - scm_i_pthread_mutex_init (&table->lock, NULL); - - return scm_cell (scm_tc7_weak_table, (scm_t_bits)table); -} - -void -scm_i_weak_table_print (SCM exp, SCM port, scm_print_state *pstate) -{ - scm_puts ("#<", port); - scm_puts ("weak-table ", port); - scm_uintprint (SCM_WEAK_TABLE (exp)->n_items, 10, port); - scm_putc ('/', port); - scm_uintprint (SCM_WEAK_TABLE (exp)->size, 10, port); - scm_puts (">", port); -} - -static void -do_vacuum_weak_table (SCM table) -{ - scm_t_weak_table *t; - - t = SCM_WEAK_TABLE (table); - - /* Unlike weak sets, the weak table interface allows custom predicates - to call out to arbitrary Scheme. There are two ways that this code - can be re-entrant, then: calling weak hash procedures while in a - custom predicate, or via finalizers run explicitly by (gc) or in an - async (for non-threaded Guile). We add a restriction that - prohibits the first case, by convention. But since we can't - prohibit the second case, here we trylock instead of lock. Not so - nice. */ - if (scm_i_pthread_mutex_trylock (&t->lock) == 0) - { - vacuum_weak_table (t); - scm_i_pthread_mutex_unlock (&t->lock); - } - - return; -} - -static scm_i_pthread_mutex_t all_weak_tables_lock = SCM_I_PTHREAD_MUTEX_INITIALIZER; -static SCM all_weak_tables = SCM_EOL; - -static void -vacuum_all_weak_tables (void) -{ - scm_i_pthread_mutex_lock (&all_weak_tables_lock); - scm_i_visit_weak_list (&all_weak_tables, do_vacuum_weak_table); - scm_i_pthread_mutex_unlock (&all_weak_tables_lock); -} - -SCM -scm_c_make_weak_table (unsigned long k, scm_t_weak_table_kind kind) -{ - SCM ret; - - ret = make_weak_table (k, kind); - - scm_i_pthread_mutex_lock (&all_weak_tables_lock); - all_weak_tables = scm_i_weak_cons (ret, all_weak_tables); - scm_i_pthread_mutex_unlock (&all_weak_tables_lock); - - return ret; -} - -SCM -scm_weak_table_p (SCM obj) -{ - return scm_from_bool (SCM_WEAK_TABLE_P (obj)); -} - -SCM -scm_c_weak_table_ref (SCM table, unsigned long raw_hash, - scm_t_table_predicate_fn pred, - void *closure, SCM dflt) -#define FUNC_NAME "weak-table-ref" -{ - SCM ret; - scm_t_weak_table *t; - - SCM_VALIDATE_WEAK_TABLE (1, table); - - t = SCM_WEAK_TABLE (table); - - scm_i_pthread_mutex_lock (&t->lock); - - ret = weak_table_ref (t, raw_hash, pred, closure, dflt); - - scm_i_pthread_mutex_unlock (&t->lock); - - return ret; -} -#undef FUNC_NAME - -void -scm_c_weak_table_put_x (SCM table, unsigned long raw_hash, - scm_t_table_predicate_fn pred, - void *closure, SCM key, SCM value) -#define FUNC_NAME "weak-table-put!" -{ - scm_t_weak_table *t; - - SCM_VALIDATE_WEAK_TABLE (1, table); - - t = SCM_WEAK_TABLE (table); - - scm_i_pthread_mutex_lock (&t->lock); - - weak_table_put_x (t, raw_hash, pred, closure, key, value); - - scm_i_pthread_mutex_unlock (&t->lock); -} -#undef FUNC_NAME - -void -scm_c_weak_table_remove_x (SCM table, unsigned long raw_hash, - scm_t_table_predicate_fn pred, - void *closure) -#define FUNC_NAME "weak-table-remove!" -{ - scm_t_weak_table *t; - - SCM_VALIDATE_WEAK_TABLE (1, table); - - t = SCM_WEAK_TABLE (table); - - scm_i_pthread_mutex_lock (&t->lock); - - weak_table_remove_x (t, raw_hash, pred, closure); - - scm_i_pthread_mutex_unlock (&t->lock); -} -#undef FUNC_NAME - -static int -assq_predicate (SCM x, SCM y, void *closure) -{ - return scm_is_eq (x, SCM_PACK_POINTER (closure)); -} - -SCM -scm_weak_table_refq (SCM table, SCM key, SCM dflt) -{ - return scm_c_weak_table_ref (table, scm_ihashq (key, -1), - assq_predicate, SCM_UNPACK_POINTER (key), - dflt); -} - -void -scm_weak_table_putq_x (SCM table, SCM key, SCM value) -{ - scm_c_weak_table_put_x (table, scm_ihashq (key, -1), - assq_predicate, SCM_UNPACK_POINTER (key), - key, value); -} - -void -scm_weak_table_remq_x (SCM table, SCM key) -{ - scm_c_weak_table_remove_x (table, scm_ihashq (key, -1), - assq_predicate, SCM_UNPACK_POINTER (key)); -} - -void -scm_weak_table_clear_x (SCM table) -#define FUNC_NAME "weak-table-clear!" -{ - scm_t_weak_table *t; - - SCM_VALIDATE_WEAK_TABLE (1, table); - - t = SCM_WEAK_TABLE (table); - - scm_i_pthread_mutex_lock (&t->lock); - - memset (t->entries, 0, sizeof (scm_t_weak_entry) * t->size); - t->n_items = 0; - - scm_i_pthread_mutex_unlock (&t->lock); -} -#undef FUNC_NAME - -SCM -scm_c_weak_table_fold (scm_t_table_fold_fn proc, void *closure, - SCM init, SCM table) -{ - scm_t_weak_table *t; - scm_t_weak_entry *entries; - unsigned long k, size; - - t = SCM_WEAK_TABLE (table); - - scm_i_pthread_mutex_lock (&t->lock); - - size = t->size; - entries = t->entries; - - for (k = 0; k < size; k++) - { - if (entries[k].hash) - { - scm_t_weak_entry copy; - - copy_weak_entry (&entries[k], ©); - - if (copy.key && copy.value) - { - /* Release table lock while we call the function. */ - scm_i_pthread_mutex_unlock (&t->lock); - init = proc (closure, - SCM_PACK (copy.key), SCM_PACK (copy.value), - init); - scm_i_pthread_mutex_lock (&t->lock); - } - } - } - - scm_i_pthread_mutex_unlock (&t->lock); - - return init; -} - -static SCM -fold_trampoline (void *closure, SCM k, SCM v, SCM init) -{ - return scm_call_3 (SCM_PACK_POINTER (closure), k, v, init); -} - -SCM -scm_weak_table_fold (SCM proc, SCM init, SCM table) -#define FUNC_NAME "weak-table-fold" -{ - SCM_VALIDATE_WEAK_TABLE (3, table); - SCM_VALIDATE_PROC (1, proc); - - return scm_c_weak_table_fold (fold_trampoline, SCM_UNPACK_POINTER (proc), init, table); -} -#undef FUNC_NAME - -static SCM -for_each_trampoline (void *closure, SCM k, SCM v, SCM seed) -{ - scm_call_2 (SCM_PACK_POINTER (closure), k, v); - return seed; -} - -void -scm_weak_table_for_each (SCM proc, SCM table) -#define FUNC_NAME "weak-table-for-each" -{ - SCM_VALIDATE_WEAK_TABLE (2, table); - SCM_VALIDATE_PROC (1, proc); - - scm_c_weak_table_fold (for_each_trampoline, SCM_UNPACK_POINTER (proc), SCM_BOOL_F, table); -} -#undef FUNC_NAME - -static SCM -map_trampoline (void *closure, SCM k, SCM v, SCM seed) -{ - return scm_cons (scm_call_2 (SCM_PACK_POINTER (closure), k, v), seed); -} - -SCM -scm_weak_table_map_to_list (SCM proc, SCM table) -#define FUNC_NAME "weak-table-map->list" -{ - SCM_VALIDATE_WEAK_TABLE (2, table); - SCM_VALIDATE_PROC (1, proc); - - return scm_c_weak_table_fold (map_trampoline, SCM_UNPACK_POINTER (proc), SCM_EOL, table); -} -#undef FUNC_NAME - - - - -/* Legacy interface. */ - -SCM_DEFINE (scm_make_weak_key_hash_table, "make-weak-key-hash-table", 0, 1, 0, - (SCM n), - "@deffnx {Scheme Procedure} make-weak-value-hash-table size\n" - "@deffnx {Scheme Procedure} make-doubly-weak-hash-table size\n" - "Return a weak hash table with @var{size} buckets.\n" - "\n" - "You can modify weak hash tables in exactly the same way you\n" - "would modify regular hash tables. (@pxref{Hash Tables})") -#define FUNC_NAME s_scm_make_weak_key_hash_table -{ - return scm_c_make_weak_table (SCM_UNBNDP (n) ? 0 : scm_to_ulong (n), - SCM_WEAK_TABLE_KIND_KEY); -} -#undef FUNC_NAME - - -SCM_DEFINE (scm_make_weak_value_hash_table, "make-weak-value-hash-table", 0, 1, 0, - (SCM n), - "Return a hash table with weak values with @var{size} buckets.\n" - "(@pxref{Hash Tables})") -#define FUNC_NAME s_scm_make_weak_value_hash_table -{ - return scm_c_make_weak_table (SCM_UNBNDP (n) ? 0 : scm_to_ulong (n), - SCM_WEAK_TABLE_KIND_VALUE); -} -#undef FUNC_NAME - - -SCM_DEFINE (scm_make_doubly_weak_hash_table, "make-doubly-weak-hash-table", 0, 1, 0, - (SCM n), - "Return a hash table with weak keys and values with @var{size}\n" - "buckets. (@pxref{Hash Tables})") -#define FUNC_NAME s_scm_make_doubly_weak_hash_table -{ - return scm_c_make_weak_table (SCM_UNBNDP (n) ? 0 : scm_to_ulong (n), - SCM_WEAK_TABLE_KIND_BOTH); -} -#undef FUNC_NAME - - -SCM_DEFINE (scm_weak_key_hash_table_p, "weak-key-hash-table?", 1, 0, 0, - (SCM obj), - "@deffnx {Scheme Procedure} weak-value-hash-table? obj\n" - "@deffnx {Scheme Procedure} doubly-weak-hash-table? obj\n" - "Return @code{#t} if @var{obj} is the specified weak hash\n" - "table. Note that a doubly weak hash table is neither a weak key\n" - "nor a weak value hash table.") -#define FUNC_NAME s_scm_weak_key_hash_table_p -{ - return scm_from_bool (SCM_WEAK_TABLE_P (obj) && - SCM_WEAK_TABLE (obj)->kind == SCM_WEAK_TABLE_KIND_KEY); -} -#undef FUNC_NAME - - -SCM_DEFINE (scm_weak_value_hash_table_p, "weak-value-hash-table?", 1, 0, 0, - (SCM obj), - "Return @code{#t} if @var{obj} is a weak value hash table.") -#define FUNC_NAME s_scm_weak_value_hash_table_p -{ - return scm_from_bool (SCM_WEAK_TABLE_P (obj) && - SCM_WEAK_TABLE (obj)->kind == SCM_WEAK_TABLE_KIND_VALUE); -} -#undef FUNC_NAME - - -SCM_DEFINE (scm_doubly_weak_hash_table_p, "doubly-weak-hash-table?", 1, 0, 0, - (SCM obj), - "Return @code{#t} if @var{obj} is a doubly weak hash table.") -#define FUNC_NAME s_scm_doubly_weak_hash_table_p -{ - return scm_from_bool (SCM_WEAK_TABLE_P (obj) && - SCM_WEAK_TABLE (obj)->kind == SCM_WEAK_TABLE_KIND_BOTH); -} -#undef FUNC_NAME - - - - - -void -scm_weak_table_prehistory (void) -{ - weak_key_gc_kind = - GC_new_kind (GC_new_free_list (), - GC_MAKE_PROC (GC_new_proc (mark_weak_key_table), 0), - 0, 0); - weak_value_gc_kind = - GC_new_kind (GC_new_free_list (), - GC_MAKE_PROC (GC_new_proc (mark_weak_value_table), 0), - 0, 0); -} - -void -scm_init_weak_table () -{ -#include "libguile/weak-table.x" - - scm_i_register_async_gc_callback (vacuum_all_weak_tables); -} - -/* - Local Variables: - c-file-style: "gnu" - End: -*/ diff --git a/libguile/weak-table.h b/libguile/weak-table.h deleted file mode 100644 index f516c2601..000000000 --- a/libguile/weak-table.h +++ /dev/null @@ -1,94 +0,0 @@ -/* classes: h_files */ - -#ifndef SCM_WEAK_TABLE_H -#define SCM_WEAK_TABLE_H - -/* Copyright (C) 2011, 2012 Free Software Foundation, Inc. - * - * This library is free software; you can redistribute it and/or - * modify it under the terms of the GNU Lesser General Public License - * as published by the Free Software Foundation; either version 3 of - * the License, or (at your option) any later version. - * - * This library is distributed in the hope that it will be useful, but - * WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - * Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public - * License along with this library; if not, write to the Free Software - * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA - * 02110-1301 USA - */ - - - -#include "libguile/__scm.h" - - - -/* The weak table API is currently only used internally. We could make it - public later, after some API review. */ - -typedef enum { - SCM_WEAK_TABLE_KIND_KEY, - SCM_WEAK_TABLE_KIND_VALUE, - SCM_WEAK_TABLE_KIND_BOTH, -} scm_t_weak_table_kind; - -/* Function that returns nonzero if the given mapping is the one we are - looking for. */ -typedef int (*scm_t_table_predicate_fn) (SCM k, SCM v, void *closure); - -/* Function to fold over the elements of a set. */ -typedef SCM (*scm_t_table_fold_fn) (void *closure, SCM k, SCM v, SCM result); - -SCM_INTERNAL SCM scm_c_make_weak_table (unsigned long k, - scm_t_weak_table_kind kind); -SCM_INTERNAL SCM scm_weak_table_p (SCM h); - -SCM_INTERNAL SCM scm_c_weak_table_ref (SCM table, unsigned long raw_hash, - scm_t_table_predicate_fn pred, - void *closure, SCM dflt); -SCM_INTERNAL void scm_c_weak_table_put_x (SCM table, unsigned long raw_hash, - scm_t_table_predicate_fn pred, - void *closure, SCM key, SCM value); -SCM_INTERNAL void scm_c_weak_table_remove_x (SCM table, unsigned long raw_hash, - scm_t_table_predicate_fn pred, - void *closure); - -SCM_INTERNAL SCM scm_weak_table_refq (SCM table, SCM key, SCM dflt); -SCM_INTERNAL void scm_weak_table_putq_x (SCM table, SCM key, SCM value); -SCM_INTERNAL void scm_weak_table_remq_x (SCM table, SCM key); - -SCM_INTERNAL void scm_weak_table_clear_x (SCM table); - -SCM_INTERNAL SCM scm_c_weak_table_fold (scm_t_table_fold_fn proc, void *closure, - SCM init, SCM table); -SCM_INTERNAL SCM scm_weak_table_fold (SCM proc, SCM init, SCM table); -SCM_INTERNAL void scm_weak_table_for_each (SCM proc, SCM table); -SCM_INTERNAL SCM scm_weak_table_map_to_list (SCM proc, SCM table); - - - -/* Legacy interface. */ -SCM_API SCM scm_make_weak_key_hash_table (SCM k); -SCM_API SCM scm_make_weak_value_hash_table (SCM k); -SCM_API SCM scm_make_doubly_weak_hash_table (SCM k); -SCM_API SCM scm_weak_key_hash_table_p (SCM h); -SCM_API SCM scm_weak_value_hash_table_p (SCM h); -SCM_API SCM scm_doubly_weak_hash_table_p (SCM h); - - - -SCM_INTERNAL void scm_i_weak_table_print (SCM exp, SCM port, scm_print_state *pstate); -SCM_INTERNAL void scm_weak_table_prehistory (void); -SCM_INTERNAL void scm_init_weak_table (void); - -#endif /* SCM_WEAK_TABLE_H */ - -/* - Local Variables: - c-file-style: "gnu" - End: -*/ diff --git a/module/system/base/types.scm b/module/system/base/types.scm index 49aea27ba..14bf5a9b2 100644 --- a/module/system/base/types.scm +++ b/module/system/base/types.scm @@ -260,7 +260,7 @@ the matching bits, possibly with bitwise operations to extract it from BITS." (define %tc7-vm-continuation #x47) (define %tc7-bytevector #x4d) (define %tc7-weak-set #x55) -(define %tc7-weak-table #x57) +(define %tc7-weak-table #x57) ;no longer used (define %tc7-array #x5d) (define %tc7-bitvector #x5f) (define %tc7-port #x7d) diff --git a/test-suite/tests/types.test b/test-suite/tests/types.test index 446aff541..336350f9a 100644 --- a/test-suite/tests/types.test +++ b/test-suite/tests/types.test @@ -1,6 +1,6 @@ ;;;; types.test --- Type tag decoding. -*- mode: scheme; coding: utf-8; -*- ;;;; -;;;; Copyright (C) 2014, 2015 Free Software Foundation, Inc. +;;;; Copyright (C) 2014, 2015, 2017 Free Software Foundation, Inc. ;;;; ;;;; This file is part of GNU Guile. ;;;; @@ -103,9 +103,10 @@ ((lambda () #t) program _) ((make-variable 'foo) variable _) ((make-weak-vector 3 #t) weak-vector _) - ((make-weak-key-hash-table) weak-table _) - ((make-weak-value-hash-table) weak-table _) - ((make-doubly-weak-hash-table) weak-table _) + ((make-hash-table) hash-table _) + ((make-weak-key-hash-table) hash-table _) + ((make-weak-value-hash-table) hash-table _) + ((make-doubly-weak-hash-table) hash-table _) (#2((1 2 3) (4 5 6)) array _) (#*00000110 bitvector _) ((expt 2 70) bignum _) -- 2.14.2 --=-=-= Content-Type: text/x-patch Content-Disposition: attachment; filename=0002-Keep-weak-hash-table-item-count-consistent.patch >From 937c127403e1c3c6fe4221e94df29bb3cf29af26 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= Date: Sun, 22 Oct 2017 16:56:51 -0700 Subject: [PATCH 2/2] Keep weak hash table item count consistent. Fixes a TOCTTOU kind of bug whereby we'd first count the number of items deleted from the table, and later, *without* having the alloc lock, we'd update the table's item count. The problem is that the item count could have been updated in the meantime, hence the bug. Fixes . * libguile/hashtab.c (vacuum_weak_hash_table): Rename to... (do_vacuum_weak_hash_table): ... this. Unmarshall the void* argument. Replace 'fprintf' warning with an assertion. (vacuum_weak_hash_table): New function. Call the above with 'GC_call_with_alloc_lock'. (t_fixup_args): Add 'table' field; remove 'removed_items'. (do_weak_bucket_fixup): Update TABLE's 'n_items' field. (weak_bucket_assoc): Check 'SCM_HASHTABLE_N_ITEMS' instead of 'args.removed_items'. --- libguile/hashtab.c | 68 +++++++++++++++++++++++++++--------------------------- 1 file changed, 34 insertions(+), 34 deletions(-) diff --git a/libguile/hashtab.c b/libguile/hashtab.c index 0b3b0c0d4..b80d8124a 100644 --- a/libguile/hashtab.c +++ b/libguile/hashtab.c @@ -96,7 +96,7 @@ static char *s_hashtable = "hashtable"; /* Remove nullified weak pairs from ALIST such that the result contains only valid pairs. Set REMOVED_ITEMS to the number of pairs that have been - deleted. */ + deleted. Assumes the allocation lock is already taken. */ static SCM scm_fixup_weak_alist (SCM alist, size_t *removed_items) { @@ -130,9 +130,10 @@ scm_fixup_weak_alist (SCM alist, size_t *removed_items) return result; } -static void -vacuum_weak_hash_table (SCM table) +static void * +do_vacuum_weak_hash_table (void *arg) { + SCM table = SCM_PACK_POINTER (arg); SCM buckets = SCM_HASHTABLE_VECTOR (table); unsigned long k = SCM_SIMPLE_VECTOR_LENGTH (buckets); size_t len = SCM_HASHTABLE_N_ITEMS (table); @@ -142,44 +143,52 @@ vacuum_weak_hash_table (SCM table) size_t removed; SCM alist = SCM_SIMPLE_VECTOR_REF (buckets, k); alist = scm_fixup_weak_alist (alist, &removed); - if (removed <= len) - len -= removed; - else - { - /* The move to BDW-GC with Guile 2.0 introduced some bugs - related to weak hash tables, threads, memory usage, and the - alloc lock. We were unable to fix these issues - satisfactorily in 2.0 but have addressed them via a rewrite - in 2.2. If you see this message often, you probably want - to upgrade to 2.2. */ - fprintf (stderr, "guile: warning: weak hash table corruption " - "(https://bugs.gnu.org/19180)\n"); - len = 0; - } + + /* The alloc lock is taken, so we cannot get REMOVED > LEN. If we + do, that means we messed up while counting items. */ + assert (removed <= len); + SCM_SIMPLE_VECTOR_SET (buckets, k, alist); } SCM_SET_HASHTABLE_N_ITEMS (table, len); + + return table; +} + +/* Remove deleted weak pairs from the buckets of TABLE, and update + TABLE's item count accordingly. */ +static void +vacuum_weak_hash_table (SCM table) +{ + /* Take the alloc lock so we have a consistent view of the live + elements in TABLE. Failing to do that, we could be miscounting the + number of elements. */ + GC_call_with_alloc_lock (do_vacuum_weak_hash_table, + SCM_PACK (table)); } + /* Packed arguments for `do_weak_bucket_fixup'. */ struct t_fixup_args { + SCM table; SCM bucket; SCM *bucket_copy; - size_t removed_items; }; static void * do_weak_bucket_fixup (void *data) { - struct t_fixup_args *args; SCM pair, *copy; + size_t len, removed_items; + struct t_fixup_args *args = (struct t_fixup_args *) data; - args = (struct t_fixup_args *) data; + args->bucket = scm_fixup_weak_alist (args->bucket, &removed_items); - args->bucket = scm_fixup_weak_alist (args->bucket, &args->removed_items); + len = SCM_HASHTABLE_N_ITEMS (args->table); + SCM_SET_HASHTABLE_N_ITEMS (args->table, len - removed_items); for (pair = args->bucket, copy = args->bucket_copy; scm_is_pair (pair); @@ -214,6 +223,7 @@ weak_bucket_assoc (SCM table, SCM buckets, size_t bucket_index, and values in BUCKET. */ strong_refs = alloca (scm_ilength (bucket) * 2 * sizeof (SCM)); + args.table = table; args.bucket = bucket; args.bucket_copy = strong_refs; @@ -239,19 +249,9 @@ weak_bucket_assoc (SCM table, SCM buckets, size_t bucket_index, scm_remember_upto_here_1 (strong_refs); - if (args.removed_items > 0) - { - /* Update TABLE's item count and optionally trigger a rehash. */ - size_t remaining; - - assert (SCM_HASHTABLE_N_ITEMS (table) >= args.removed_items); - - remaining = SCM_HASHTABLE_N_ITEMS (table) - args.removed_items; - SCM_SET_HASHTABLE_N_ITEMS (table, remaining); - - if (remaining < SCM_HASHTABLE_LOWER (table)) - scm_i_rehash (table, hash_fn, closure, "weak_bucket_assoc"); - } + if (SCM_HASHTABLE_N_ITEMS (table) < SCM_HASHTABLE_LOWER (table)) + /* Trigger a rehash. */ + scm_i_rehash (table, hash_fn, closure, "weak_bucket_assoc"); return result; } -- 2.14.2 --=-=-=--