From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!.POSTED!not-for-mail From: Andy Wingo Newsgroups: gmane.lisp.guile.bugs,gmane.lisp.guile.devel Subject: bug#19180: Weak tables harmful to GC? Date: Mon, 30 Oct 2017 18:29:45 +0100 Message-ID: <87she0lf1y.fsf@igalia.com> References: <87wp9gwz8m.fsf@gnu.org> <87fug4o8z2.fsf@pobox.com> <87wp54z3p5.fsf@gnu.org> <87zi9th1i6.fsf_-_@gnu.org> <87y3o454pr.fsf@gnu.org> <87r2tvncs5.fsf@dustycloud.org> <87wp3mwwef.fsf@gnu.org> <87mv4gd0ik.fsf@elephly.net> <87a80eie63.fsf@gnu.org> <87k1zimhmt.fsf@elephly.net> <87bmktn96e.fsf@elephly.net> <87tvyl9n22.fsf@gnu.org> <87r2tnlhno.fsf@elephly.net> <87a8087qz7.fsf@gnu.org> NNTP-Posting-Host: blaine.gmane.org Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" X-Trace: blaine.gmane.org 1509384628 5468 195.159.176.226 (30 Oct 2017 17:30:28 GMT) X-Complaints-To: usenet@blaine.gmane.org NNTP-Posting-Date: Mon, 30 Oct 2017 17:30:28 +0000 (UTC) User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux) Cc: 19180@debbugs.gnu.org, Ricardo Wurmus , guile-devel@gnu.org To: ludo@gnu.org (Ludovic =?UTF-8?Q?Court=C3=A8s?=) Original-X-From: bug-guile-bounces+guile-bugs=m.gmane.org@gnu.org Mon Oct 30 18:30:14 2017 Return-path: Envelope-to: guile-bugs@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by blaine.gmane.org with esmtp (Exim 4.84_2) (envelope-from ) id 1e9Dt5-0008Uh-Fy for guile-bugs@m.gmane.org; Mon, 30 Oct 2017 18:30:12 +0100 Original-Received: from localhost ([::1]:41844 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1e9DtC-00066T-Pb for guile-bugs@m.gmane.org; Mon, 30 Oct 2017 13:30:18 -0400 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:58697) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1e9Dt1-000613-07 for bug-guile@gnu.org; Mon, 30 Oct 2017 13:30:10 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1e9Dsx-0005iY-A5 for bug-guile@gnu.org; Mon, 30 Oct 2017 13:30:06 -0400 Original-Received: from debbugs.gnu.org ([208.118.235.43]:33832) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1e9Dsx-0005iG-2m for bug-guile@gnu.org; Mon, 30 Oct 2017 13:30:03 -0400 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1e9Dsw-0000uP-Lk for bug-guile@gnu.org; Mon, 30 Oct 2017 13:30:02 -0400 X-Loop: help-debbugs@gnu.org Resent-From: Andy Wingo Original-Sender: "Debbugs-submit" Resent-CC: bug-guile@gnu.org Resent-Date: Mon, 30 Oct 2017 17:30:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 19180 X-GNU-PR-Package: guile X-GNU-PR-Keywords: Original-Received: via spool by 19180-submit@debbugs.gnu.org id=B19180.15093845993464 (code B ref 19180); Mon, 30 Oct 2017 17:30:02 +0000 Original-Received: (at 19180) by debbugs.gnu.org; 30 Oct 2017 17:29:59 +0000 Original-Received: from localhost ([127.0.0.1]:42513 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1e9Dss-0000tn-8s for submit@debbugs.gnu.org; Mon, 30 Oct 2017 13:29:59 -0400 Original-Received: from pb-sasl1.pobox.com ([64.147.108.66]:55773 helo=sasl.smtp.pobox.com) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1e9Dsp-0000te-Gc for 19180@debbugs.gnu.org; Mon, 30 Oct 2017 13:29:57 -0400 Original-Received: from sasl.smtp.pobox.com (unknown [127.0.0.1]) by pb-sasl1.pobox.com (Postfix) with ESMTP id C9AC39C9DA; Mon, 30 Oct 2017 13:29:53 -0400 (EDT) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=pobox.com; h=from:to:cc :subject:references:date:in-reply-to:message-id:mime-version :content-type; s=sasl; bh=wlfxUwjScFFvrObTMEvxB+55zZc=; b=nPPW1r 1+6qKVdq04BSgC2dlbpQ6jBHiqdh6819kP/EsFuJpQ+TuPCg8OSa/J3hZryqBPIW Td4t6C++utE+avFFGQBBc358mRtAF69OfWtzHlpbtpvTw3hzQ1TA8ivlZGEijzPE jnbx0s9ZdlkcYVTrBa65nzsloLTWyLssaZpdY= Original-Received: from pb-sasl1.nyi.icgroup.com (unknown [127.0.0.1]) by pb-sasl1.pobox.com (Postfix) with ESMTP id BF1469C9D9; Mon, 30 Oct 2017 13:29:53 -0400 (EDT) Original-Received: from sparrow (unknown [88.160.190.192]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by pb-sasl1.pobox.com (Postfix) with ESMTPSA id 61BAE9C9D8; Mon, 30 Oct 2017 13:29:52 -0400 (EDT) In-Reply-To: <87a8087qz7.fsf@gnu.org> ("Ludovic =?UTF-8?Q?Court=C3=A8s?="'s message of "Mon, 30 Oct 2017 13:35:56 +0100") X-Pobox-Relay-ID: F02D800A-BD97-11E7-80FE-ABEFD5707B88-02397024!pb-sasl1.pobox.com X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 208.118.235.43 X-BeenThere: bug-guile@gnu.org List-Id: "Bug reports for GUILE, GNU's Ubiquitous Extension Language" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-guile-bounces+guile-bugs=m.gmane.org@gnu.org Original-Sender: "bug-guile" Xref: news.gmane.org gmane.lisp.guile.bugs:8874 gmane.lisp.guile.devel:19368 Archived-At: --=-=-= Content-Type: text/plain Hi! As discussed on IRC, what do you think of this patch? It preserves the thread-safety properties of weak tables and just adapts them to be bucket-and-chain tables. Let me know how it works for you. If it works, we'll need to adapt weak sets as well. Andy --=-=-= Content-Type: text/x-patch Content-Disposition: inline; filename=0001-Weak-tables-are-now-bucket-and-chain-tables.patch >From 6ec4642516eaabf7a63644463a7836eb3efbcd60 Mon Sep 17 00:00:00 2001 From: Andy Wingo Date: Mon, 30 Oct 2017 18:19:37 +0100 Subject: [PATCH] Weak tables are now bucket-and-chain tables This change should make weak tables work better with libgc, as the weak components that need mark functions are smaller, so they don't overflow the mark queue. Also this prevents the need to move disappearing links. * libguile/weak-table.c (scm_t_weak_entry): Change to be a hash table chain entry. (struct weak_entry_data, do_read_weak_entry, read_weak_entry): Read out the key and value directly. (GC_move_disappearing_link, move_disappearing_links, move_weak_entry): Remove. (scm_t_weak_table): Rename "entries" member to "buckets", and "size" to "n_buckets". (hash_to_index, entry_distance, rob_from_rich, give_to_poor): Remove. (mark_weak_key_entry, mark_weak_value_entry): Mark a single link, and the next link. (mark_doubly_weak_entry): New kind. (allocate_entry): Allocate a single entry. (add_entry): New helper. (resize_table): Reimplement more like normal hash tables. (vacuum_weak_table): Adapt to new implementation. (weak_table_ref, weak_table_put_x, weak_table_remove_x): Adapt. (make_weak_table): Adapt. (scm_weak_table_clear_x): Actually unregister the links to prevent a memory leak. (scm_c_weak_table_fold): Collect items in an alist, then fold outside the lock. (scm_weak_table_prehistory): Initialize doubly_weak_gc_kind. --- libguile/weak-table.c | 723 +++++++++++++++----------------------------------- 1 file changed, 212 insertions(+), 511 deletions(-) diff --git a/libguile/weak-table.c b/libguile/weak-table.c index 599c4cf0e..ff8a01fb0 100644 --- a/libguile/weak-table.c +++ b/libguile/weak-table.c @@ -1,4 +1,4 @@ -/* Copyright (C) 2011, 2012, 2013, 2014 Free Software Foundation, Inc. +/* Copyright (C) 2011, 2012, 2013, 2014, 2017 Free Software Foundation, Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public License @@ -44,83 +44,62 @@ data, but when you don't have space to store the data in the object. For example, procedure properties are implemented with weak tables. - Weak tables are implemented using an open-addressed hash table. - Basically this means that there is an array of entries, and the item - is expected to be found the slot corresponding to its hash code, - modulo the length of the array. - - Collisions are handled using linear probing with the Robin Hood - technique. See Pedro Celis' paper, "Robin Hood Hashing": - - http://www.cs.uwaterloo.ca/research/tr/1986/CS-86-14.pdf - - The vector of entries is allocated in such a way that the GC doesn't - trace the weak values. For doubly-weak tables, this means that the - entries are allocated as an "atomic" piece of memory. Key-weak and - value-weak tables use a special GC kind with a custom mark procedure. - When items are added weakly into table, a disappearing link is - registered to their locations. If the referent is collected, then - that link will be zeroed out. + This is a normal bucket-and-chain hash table, except that the chain + entries are allocated in such a way that the GC doesn't trace the + weak values. For doubly-weak tables, this means that the entries are + allocated as an "atomic" piece of memory. Key-weak and value-weak + tables use a special GC kind with a custom mark procedure. When + items are added weakly into table, a disappearing link is registered + to their locations. If the referent is collected, then that link + will be zeroed out. An entry in the table consists of the key and the value, together - with the hash code of the key. We munge hash codes so that they are - never 0. In this way we can detect removed entries (key of zero but - nonzero hash code), and can then reshuffle elements as needed to - maintain the robin hood ordering. - - Compared to buckets-and-chains hash tables, open addressing has the - advantage that it is very cache-friendly. It also uses less memory. - - Implementation-wise, there are two things to note. - - 1. We assume that hash codes are evenly distributed across the - range of unsigned longs. The actual hash code stored in the - entry is left-shifted by 1 bit (losing 1 bit of hash precision), - and then or'd with 1. In this way we ensure that the hash field - of an occupied entry is nonzero. To map to an index, we - right-shift the hash by one, divide by the size, and take the - remainder. - - 2. Since the weak references are stored in an atomic region with - disappearing links, they need to be accessed with the GC alloc - lock. `copy_weak_entry' will do that for you. The hash code - itself can be read outside the lock, though. + with the hash code of the key. + + Note that since the weak references are stored in an atomic region + with disappearing links, they need to be accessed with the GC alloc + lock. `read_weak_entry' will do that for you. The hash code itself + can be read outside the lock, though. */ -typedef struct { +typedef struct scm_weak_entry scm_t_weak_entry; + +struct scm_weak_entry { unsigned long hash; + scm_t_weak_entry *next; scm_t_bits key; scm_t_bits value; -} scm_t_weak_entry; +}; struct weak_entry_data { - scm_t_weak_entry *in; - scm_t_weak_entry *out; + scm_t_weak_entry *entry; + scm_t_bits key; + scm_t_bits value; }; static void* -do_copy_weak_entry (void *data) +do_read_weak_entry (void *data) { struct weak_entry_data *e = data; - e->out->hash = e->in->hash; - e->out->key = e->in->key; - e->out->value = e->in->value; + e->key = e->entry->key; + e->value = e->entry->value; return NULL; } static void -copy_weak_entry (scm_t_weak_entry *src, scm_t_weak_entry *dst) +read_weak_entry (scm_t_weak_entry *entry, scm_t_bits *key, scm_t_bits *value) { struct weak_entry_data data; - data.in = src; - data.out = dst; - - GC_call_with_alloc_lock (do_copy_weak_entry, &data); + data.entry = entry; + GC_call_with_alloc_lock (do_read_weak_entry, &data); + + *key = data.key; + *value = data.value; } static void @@ -152,59 +131,11 @@ unregister_disappearing_links (scm_t_weak_entry *entry, GC_unregister_disappearing_link ((void **) &entry->value); } -#ifndef HAVE_GC_MOVE_DISAPPEARING_LINK -static void -GC_move_disappearing_link (void **from, void **to) -{ - GC_unregister_disappearing_link (from); - SCM_I_REGISTER_DISAPPEARING_LINK (to, *to); -} -#endif - -static void -move_disappearing_links (scm_t_weak_entry *from, scm_t_weak_entry *to, - SCM key, SCM value, scm_t_weak_table_kind kind) -{ - if ((kind == SCM_WEAK_TABLE_KIND_KEY || kind == SCM_WEAK_TABLE_KIND_BOTH) - && SCM_HEAP_OBJECT_P (key)) - GC_move_disappearing_link ((void **) &from->key, (void **) &to->key); - - if ((kind == SCM_WEAK_TABLE_KIND_VALUE || kind == SCM_WEAK_TABLE_KIND_BOTH) - && SCM_HEAP_OBJECT_P (value)) - GC_move_disappearing_link ((void **) &from->value, (void **) &to->value); -} - -static void -move_weak_entry (scm_t_weak_entry *from, scm_t_weak_entry *to, - scm_t_weak_table_kind kind) -{ - if (from->hash) - { - scm_t_weak_entry copy; - - copy_weak_entry (from, ©); - to->hash = copy.hash; - to->key = copy.key; - to->value = copy.value; - - move_disappearing_links (from, to, - SCM_PACK (copy.key), SCM_PACK (copy.value), - kind); - } - else - { - to->hash = 0; - to->key = 0; - to->value = 0; - } -} - - typedef struct { - scm_t_weak_entry *entries; /* the data */ + scm_t_weak_entry **buckets; /* the data */ scm_i_pthread_mutex_t lock; /* the lock */ scm_t_weak_table_kind kind; /* what kind of table it is */ - unsigned long size; /* total number of slots. */ + unsigned long n_buckets; /* total number of buckets. */ unsigned long n_items; /* number of items in table */ unsigned long lower; /* when to shrink */ unsigned long upper; /* when to grow */ @@ -219,171 +150,114 @@ typedef struct { #define SCM_WEAK_TABLE(x) ((scm_t_weak_table *) SCM_CELL_WORD_1 (x)) -static unsigned long -hash_to_index (unsigned long hash, unsigned long size) -{ - return (hash >> 1) % size; -} - -static unsigned long -entry_distance (unsigned long hash, unsigned long k, unsigned long size) -{ - unsigned long origin = hash_to_index (hash, size); - - if (k >= origin) - return k - origin; - else - /* The other key was displaced and wrapped around. */ - return size - origin + k; -} - -static void -rob_from_rich (scm_t_weak_table *table, unsigned long k) -{ - unsigned long empty, size; - - size = table->size; - - /* If we are to free up slot K in the table, we need room to do so. */ - assert (table->n_items < size); - - empty = k; - do - empty = (empty + 1) % size; - while (table->entries[empty].hash); - - do - { - unsigned long last = empty ? (empty - 1) : (size - 1); - move_weak_entry (&table->entries[last], &table->entries[empty], - table->kind); - empty = last; - } - while (empty != k); - - table->entries[empty].hash = 0; - table->entries[empty].key = 0; - table->entries[empty].value = 0; -} - -static void -give_to_poor (scm_t_weak_table *table, unsigned long k) -{ - /* Slot K was just freed up; possibly shuffle others down. */ - unsigned long size = table->size; - - while (1) - { - unsigned long next = (k + 1) % size; - unsigned long hash; - scm_t_weak_entry copy; - - hash = table->entries[next].hash; - - if (!hash || hash_to_index (hash, size) == next) - break; - - copy_weak_entry (&table->entries[next], ©); - - if (!copy.key || !copy.value) - /* Lost weak reference. */ - { - give_to_poor (table, next); - table->n_items--; - continue; - } - - move_weak_entry (&table->entries[next], &table->entries[k], - table->kind); - - k = next; - } - - /* We have shuffled down any entries that should be shuffled down; now - free the end. */ - table->entries[k].hash = 0; - table->entries[k].key = 0; - table->entries[k].value = 0; -} - - /* The GC "kinds" for singly-weak tables. */ static int weak_key_gc_kind; static int weak_value_gc_kind; +static int doubly_weak_gc_kind; static struct GC_ms_entry * -mark_weak_key_table (GC_word *addr, struct GC_ms_entry *mark_stack_ptr, +mark_weak_key_entry (GC_word *addr, struct GC_ms_entry *mark_stack_ptr, struct GC_ms_entry *mark_stack_limit, GC_word env) { - scm_t_weak_entry *entries = (scm_t_weak_entry*) addr; - unsigned long k, size = GC_size (addr) / sizeof (scm_t_weak_entry); + scm_t_weak_entry *entry = (scm_t_weak_entry*) addr; - for (k = 0; k < size; k++) - if (entries[k].hash && entries[k].key) - { - SCM value = SCM_PACK (entries[k].value); + if (entry->next) + mark_stack_ptr = GC_MARK_AND_PUSH ((GC_word*) entry->next, + mark_stack_ptr, mark_stack_limit, + NULL); + + if (entry->hash && entry->key) + { + SCM value = SCM_PACK (entry->value); + if (SCM_HEAP_OBJECT_P (value)) mark_stack_ptr = GC_MARK_AND_PUSH ((GC_word*) SCM2PTR (value), mark_stack_ptr, mark_stack_limit, NULL); - } + } return mark_stack_ptr; } static struct GC_ms_entry * -mark_weak_value_table (GC_word *addr, struct GC_ms_entry *mark_stack_ptr, +mark_weak_value_entry (GC_word *addr, struct GC_ms_entry *mark_stack_ptr, struct GC_ms_entry *mark_stack_limit, GC_word env) { - scm_t_weak_entry *entries = (scm_t_weak_entry*) addr; - unsigned long k, size = GC_size (addr) / sizeof (scm_t_weak_entry); + scm_t_weak_entry *entry = (scm_t_weak_entry*) addr; - for (k = 0; k < size; k++) - if (entries[k].hash && entries[k].value) - { - SCM key = SCM_PACK (entries[k].key); + if (entry->next) + mark_stack_ptr = GC_MARK_AND_PUSH ((GC_word*) entry->next, + mark_stack_ptr, mark_stack_limit, + NULL); + + if (entry->hash && entry->value) + { + SCM key = SCM_PACK (entry->key); + if (SCM_HEAP_OBJECT_P (key)) mark_stack_ptr = GC_MARK_AND_PUSH ((GC_word*) SCM2PTR (key), mark_stack_ptr, mark_stack_limit, NULL); - } + } + + return mark_stack_ptr; +} + +static struct GC_ms_entry * +mark_doubly_weak_entry (GC_word *addr, struct GC_ms_entry *mark_stack_ptr, + struct GC_ms_entry *mark_stack_limit, GC_word env) +{ + scm_t_weak_entry *entry = (scm_t_weak_entry*) addr; + + if (entry->next) + mark_stack_ptr = GC_MARK_AND_PUSH ((GC_word*) entry->next, + mark_stack_ptr, mark_stack_limit, + NULL); return mark_stack_ptr; } static scm_t_weak_entry * -allocate_entries (unsigned long size, scm_t_weak_table_kind kind) +allocate_entry (scm_t_weak_table_kind kind) { scm_t_weak_entry *ret; - size_t bytes = size * sizeof (*ret); switch (kind) { case SCM_WEAK_TABLE_KIND_KEY: - ret = GC_generic_malloc (bytes, weak_key_gc_kind); + ret = GC_generic_malloc (sizeof (*ret), weak_key_gc_kind); break; case SCM_WEAK_TABLE_KIND_VALUE: - ret = GC_generic_malloc (bytes, weak_value_gc_kind); + ret = GC_generic_malloc (sizeof (*ret), weak_value_gc_kind); break; case SCM_WEAK_TABLE_KIND_BOTH: - ret = scm_gc_malloc_pointerless (bytes, "weak-table"); + ret = GC_generic_malloc (sizeof (*ret), doubly_weak_gc_kind); break; default: abort (); } - memset (ret, 0, bytes); + memset (ret, 0, sizeof (*ret)); return ret; } +static void +add_entry (scm_t_weak_table *table, scm_t_weak_entry *entry) +{ + unsigned long bucket = entry->hash % table->n_buckets; + entry->next = table->buckets[bucket]; + table->buckets[bucket] = entry; + table->n_items++; +} + /* Growing or shrinking is triggered when the load factor * * L = N / S (N: number of items in table, S: bucket vector length) * - * passes an upper limit of 0.9 or a lower limit of 0.2. + * passes an upper limit of 0.9 or a lower limit of 0.25. * * The implementation stores the upper and lower number of items which * trigger a resize in the hashtable object. @@ -400,168 +274,91 @@ static unsigned long hashtable_size[] = { #define HASHTABLE_SIZE_N (sizeof(hashtable_size)/sizeof(unsigned long)) -static int -compute_size_index (scm_t_weak_table *table) +static void +resize_table (scm_t_weak_table *table) { - int i = table->size_index; + scm_t_weak_entry **old_buckets, **new_buckets; + int new_size_index; + unsigned long old_n_buckets, new_n_buckets, old_k; + new_size_index = table->size_index; if (table->n_items < table->lower) { - /* rehashing is not triggered when i <= min_size */ + /* Rehashing is not triggered when i <= min_size. */ do - --i; - while (i > table->min_size_index - && table->n_items < hashtable_size[i] / 5); + new_size_index -= 1; + while (new_size_index > table->min_size_index + && table->n_items < hashtable_size[new_size_index] / 4); } else if (table->n_items > table->upper) { - ++i; - if (i >= HASHTABLE_SIZE_N) - /* The biggest size currently is 230096423, which for a 32-bit - machine will occupy 2.3GB of memory at a load of 80%. There - is probably something better to do here, but if you have a - weak map of that size, you are hosed in any case. */ - abort (); - } - - return i; -} - -static int -is_acceptable_size_index (scm_t_weak_table *table, int size_index) -{ - int computed = compute_size_index (table); - - if (size_index == computed) - /* We were going to grow or shrink, and allocating the new vector - didn't change the target size. */ - return 1; - - if (size_index == computed + 1) - { - /* We were going to enlarge the table, but allocating the new - vector finalized some objects, making an enlargement - unnecessary. It might still be a good idea to use the larger - table, though. (This branch also gets hit if, while allocating - the vector, some other thread was actively removing items from - the table. That is less likely, though.) */ - unsigned long new_lower = hashtable_size[size_index] / 5; - - return table->size > new_lower; - } - - if (size_index == computed - 1) - { - /* We were going to shrink the table, but when we dropped the lock - to allocate the new vector, some other thread added elements to - the table. */ - return 0; - } - - /* The computed size differs from our newly allocated size by more - than one size index -- recalculate. */ - return 0; -} - -static void -resize_table (scm_t_weak_table *table) -{ - scm_t_weak_entry *old_entries, *new_entries; - int new_size_index; - unsigned long old_size, new_size, old_k; - - do - { - new_size_index = compute_size_index (table); - if (new_size_index == table->size_index) + new_size_index += 1; + if (new_size_index >= HASHTABLE_SIZE_N) + /* Limit max bucket count. */ return; - new_size = hashtable_size[new_size_index]; - new_entries = allocate_entries (new_size, table->kind); } - while (!is_acceptable_size_index (table, new_size_index)); + else + /* Nothing to do. */ + return; + + new_n_buckets = hashtable_size[new_size_index]; + new_buckets = scm_gc_malloc (sizeof (*new_buckets) * new_n_buckets, + "weak table buckets"); - old_entries = table->entries; - old_size = table->size; + old_buckets = table->buckets; + old_n_buckets = table->n_buckets; table->size_index = new_size_index; - table->size = new_size; + table->n_buckets = new_n_buckets; if (new_size_index <= table->min_size_index) table->lower = 0; else - table->lower = new_size / 5; - table->upper = 9 * new_size / 10; + table->lower = new_n_buckets / 4; + table->upper = 9 * new_n_buckets / 10; table->n_items = 0; - table->entries = new_entries; + table->buckets = new_buckets; - for (old_k = 0; old_k < old_size; old_k++) + for (old_k = 0; old_k < old_n_buckets; old_k++) { - scm_t_weak_entry copy; - unsigned long new_k, distance; - - if (!old_entries[old_k].hash) - continue; - - copy_weak_entry (&old_entries[old_k], ©); - - if (!copy.key || !copy.value) - continue; - - new_k = hash_to_index (copy.hash, new_size); - - for (distance = 0; ; distance++, new_k = (new_k + 1) % new_size) + scm_t_weak_entry *entry = old_buckets[old_k]; + while (entry) { - unsigned long other_hash = new_entries[new_k].hash; - - if (!other_hash) - /* Found an empty entry. */ - break; - - /* Displace the entry if our distance is less, otherwise keep - looking. */ - if (entry_distance (other_hash, new_k, new_size) < distance) - { - rob_from_rich (table, new_k); - break; - } + scm_t_weak_entry *next = entry->next; + entry->next = NULL; + add_entry (table, entry); + entry = next; } - - table->n_items++; - new_entries[new_k].hash = copy.hash; - new_entries[new_k].key = copy.key; - new_entries[new_k].value = copy.value; - - register_disappearing_links (&new_entries[new_k], - SCM_PACK (copy.key), SCM_PACK (copy.value), - table->kind); } } /* Run after GC via do_vacuum_weak_table, this function runs over the whole table, removing lost weak references, reshuffling the table as it - goes. It might resize the table if it reaps enough entries. */ + goes. It might resize the table if it reaps enough buckets. */ static void vacuum_weak_table (scm_t_weak_table *table) { - scm_t_weak_entry *entries = table->entries; - unsigned long size = table->size; unsigned long k; - for (k = 0; k < size; k++) + for (k = 0; k < table->n_buckets; k++) { - unsigned long hash = entries[k].hash; - - if (hash) - { - scm_t_weak_entry copy; + scm_t_weak_entry **loc = table->buckets + k; + scm_t_weak_entry *entry; - copy_weak_entry (&entries[k], ©); + for (entry = *loc; entry; entry = *loc) + { + scm_t_bits key, value; - if (!copy.key || !copy.value) - /* Lost weak reference; reshuffle. */ + read_weak_entry (entry, &key, &value); + if (!key || !value) + /* Lost weak reference; prune entry. */ { - give_to_poor (table, k); + *loc = entry->next; table->n_items--; + entry->next = NULL; + unregister_disappearing_links (entry, table->kind); } + else + loc = &entry->next; } } @@ -577,52 +374,22 @@ weak_table_ref (scm_t_weak_table *table, unsigned long hash, scm_t_table_predicate_fn pred, void *closure, SCM dflt) { - unsigned long k, distance, size; - scm_t_weak_entry *entries; - - size = table->size; - entries = table->entries; + unsigned long bucket = hash % table->n_buckets; + scm_t_weak_entry *entry; - hash = (hash << 1) | 0x1; - k = hash_to_index (hash, size); - - for (distance = 0; distance < size; distance++, k = (k + 1) % size) + for (entry = table->buckets[bucket]; entry; entry = entry->next) { - unsigned long other_hash; - - retry: - other_hash = entries[k].hash; - - if (!other_hash) - /* Not found. */ - return dflt; - - if (hash == other_hash) + if (entry->hash == hash) { - scm_t_weak_entry copy; - - copy_weak_entry (&entries[k], ©); + scm_t_bits key, value; - if (!copy.key || !copy.value) - /* Lost weak reference; reshuffle. */ - { - give_to_poor (table, k); - table->n_items--; - goto retry; - } - - if (pred (SCM_PACK (copy.key), SCM_PACK (copy.value), closure)) + read_weak_entry (entry, &key, &value); + if (key && value && pred (SCM_PACK (key), SCM_PACK (value), closure)) /* Found. */ - return SCM_PACK (copy.value); + return SCM_PACK (value); } - - /* If the entry's distance is less, our key is not in the table. */ - if (entry_distance (other_hash, k, size) < distance) - return dflt; } - /* If we got here, then we were unfortunate enough to loop through the - whole table. Shouldn't happen, but hey. */ return dflt; } @@ -632,81 +399,37 @@ weak_table_put_x (scm_t_weak_table *table, unsigned long hash, scm_t_table_predicate_fn pred, void *closure, SCM key, SCM value) { - unsigned long k, distance, size; - scm_t_weak_entry *entries; - - size = table->size; - entries = table->entries; - - hash = (hash << 1) | 0x1; - k = hash_to_index (hash, size); + unsigned long bucket = hash % table->n_buckets; + scm_t_weak_entry *entry; - for (distance = 0; ; distance++, k = (k + 1) % size) + for (entry = table->buckets[bucket]; entry; entry = entry->next) { - unsigned long other_hash; - - retry: - other_hash = entries[k].hash; - - if (!other_hash) - /* Found an empty entry. */ - break; - - if (other_hash == hash) + if (entry->hash == hash) { - scm_t_weak_entry copy; + scm_t_bits k, v; - copy_weak_entry (&entries[k], ©); - - if (!copy.key || !copy.value) - /* Lost weak reference; reshuffle. */ + read_weak_entry (entry, &k, &v); + if (k && v && pred (SCM_PACK (k), SCM_PACK (v), closure)) { - give_to_poor (table, k); - table->n_items--; - goto retry; + unregister_disappearing_links (entry, table->kind); + key = SCM_PACK (k); + entry->value = SCM_UNPACK (value); + register_disappearing_links (entry, key, value, table->kind); + return; } - - if (pred (SCM_PACK (copy.key), SCM_PACK (copy.value), closure)) - /* Found an entry with this key. */ - break; - } - - if (table->n_items > table->upper) - /* Full table, time to resize. */ - { - resize_table (table); - return weak_table_put_x (table, hash >> 1, pred, closure, key, value); } - - /* Displace the entry if our distance is less, otherwise keep - looking. */ - if (entry_distance (other_hash, k, size) < distance) - { - rob_from_rich (table, k); - break; - } - } - - /* Fast path for updated values for existing entries of weak-key - tables. */ - if (table->kind == SCM_WEAK_TABLE_KIND_KEY && - entries[k].hash == hash && - entries[k].key == SCM_UNPACK (key)) - { - entries[k].value = SCM_UNPACK (value); - return; } - if (entries[k].hash) - unregister_disappearing_links (&entries[k], table->kind); - else - table->n_items++; - - entries[k].hash = hash; - entries[k].key = SCM_UNPACK (key); - entries[k].value = SCM_UNPACK (value); + if (table->n_items > table->upper) + /* Full table, time to resize. */ + resize_table (table); - register_disappearing_links (&entries[k], key, value, table->kind); + entry = allocate_entry (table->kind); + entry->hash = hash; + entry->key = SCM_UNPACK (key); + entry->value = SCM_UNPACK (value); + register_disappearing_links (entry, key, value, table->kind); + add_entry (table, entry); } @@ -714,62 +437,34 @@ static void weak_table_remove_x (scm_t_weak_table *table, unsigned long hash, scm_t_table_predicate_fn pred, void *closure) { - unsigned long k, distance, size; - scm_t_weak_entry *entries; - - size = table->size; - entries = table->entries; - - hash = (hash << 1) | 0x1; - k = hash_to_index (hash, size); + unsigned long bucket = hash % table->n_buckets; + scm_t_weak_entry **loc = table->buckets + bucket; + scm_t_weak_entry *entry; - for (distance = 0; distance < size; distance++, k = (k + 1) % size) + for (entry = *loc; entry; entry = *loc) { - unsigned long other_hash; - - retry: - other_hash = entries[k].hash; - - if (!other_hash) - /* Not found. */ - return; - - if (other_hash == hash) + if (entry->hash == hash) { - scm_t_weak_entry copy; - - copy_weak_entry (&entries[k], ©); - - if (!copy.key || !copy.value) - /* Lost weak reference; reshuffle. */ - { - give_to_poor (table, k); - table->n_items--; - goto retry; - } + scm_t_bits k, v; - if (pred (SCM_PACK (copy.key), SCM_PACK (copy.value), closure)) - /* Found an entry with this key. */ + read_weak_entry (entry, &k, &v); + if (k && v && pred (SCM_PACK (k), SCM_PACK (v), closure)) { - entries[k].hash = 0; - entries[k].key = 0; - entries[k].value = 0; - - unregister_disappearing_links (&entries[k], table->kind); + *loc = entry->next; + table->n_items--; + entry->next = NULL; + unregister_disappearing_links (entry, table->kind); - if (--table->n_items < table->lower) + if (table->n_items < table->lower) resize_table (table); - else - give_to_poor (table, k); return; } } - - /* If the entry's distance is less, our key is not in the table. */ - if (entry_distance (other_hash, k, size) < distance) - return; + loc = &entry->next; } + + return; } @@ -785,10 +480,11 @@ make_weak_table (unsigned long k, scm_t_weak_table_kind kind) n = hashtable_size[i]; table = scm_gc_malloc (sizeof (*table), "weak-table"); - table->entries = allocate_entries (n, kind); + table->buckets = scm_gc_malloc (sizeof (*table->buckets) * n, + "weak table buckets"); table->kind = kind; table->n_items = 0; - table->size = n; + table->n_buckets = n; table->lower = 0; table->upper = 9 * n / 10; table->size_index = i; @@ -805,7 +501,7 @@ scm_i_weak_table_print (SCM exp, SCM port, scm_print_state *pstate) scm_puts ("weak-table ", port); scm_uintprint (SCM_WEAK_TABLE (exp)->n_items, 10, port); scm_putc ('/', port); - scm_uintprint (SCM_WEAK_TABLE (exp)->size, 10, port); + scm_uintprint (SCM_WEAK_TABLE (exp)->n_buckets, 10, port); scm_puts (">", port); } @@ -961,6 +657,8 @@ scm_weak_table_clear_x (SCM table) #define FUNC_NAME "weak-table-clear!" { scm_t_weak_table *t; + unsigned long k; + scm_t_weak_entry *entry; SCM_VALIDATE_WEAK_TABLE (1, table); @@ -968,7 +666,12 @@ scm_weak_table_clear_x (SCM table) scm_i_pthread_mutex_lock (&t->lock); - memset (t->entries, 0, sizeof (scm_t_weak_entry) * t->size); + for (k = 0; k < t->n_buckets; k++) + { + for (entry = t->buckets[k]; entry; entry = entry->next) + unregister_disappearing_links (entry, t->kind); + t->buckets[k] = NULL; + } t->n_items = 0; scm_i_pthread_mutex_unlock (&t->lock); @@ -980,38 +683,32 @@ scm_c_weak_table_fold (scm_t_table_fold_fn proc, void *closure, SCM init, SCM table) { scm_t_weak_table *t; - scm_t_weak_entry *entries; - unsigned long k, size; + unsigned long k; + SCM alist = SCM_EOL; t = SCM_WEAK_TABLE (table); scm_i_pthread_mutex_lock (&t->lock); - size = t->size; - entries = t->entries; - - for (k = 0; k < size; k++) + for (k = 0; k < t->n_buckets; k++) { - if (entries[k].hash) + scm_t_weak_entry *entry; + for (entry = t->buckets[k]; entry; entry = entry->next) { - scm_t_weak_entry copy; - - copy_weak_entry (&entries[k], ©); + scm_t_bits key, value; + read_weak_entry (entry, &key, &value); - if (copy.key && copy.value) - { - /* Release table lock while we call the function. */ - scm_i_pthread_mutex_unlock (&t->lock); - init = proc (closure, - SCM_PACK (copy.key), SCM_PACK (copy.value), - init); - scm_i_pthread_mutex_lock (&t->lock); - } + if (key && value) + alist = scm_acons (SCM_PACK (key), SCM_PACK (value), alist); } } scm_i_pthread_mutex_unlock (&t->lock); + /* Call the proc outside the lock. */ + for (; !scm_is_null (alist); alist = scm_cdr (alist)) + init = proc (closure, scm_caar (alist), scm_cdar (alist), init); + return init; } @@ -1157,11 +854,15 @@ scm_weak_table_prehistory (void) { weak_key_gc_kind = GC_new_kind (GC_new_free_list (), - GC_MAKE_PROC (GC_new_proc (mark_weak_key_table), 0), + GC_MAKE_PROC (GC_new_proc (mark_weak_key_entry), 0), 0, 0); weak_value_gc_kind = GC_new_kind (GC_new_free_list (), - GC_MAKE_PROC (GC_new_proc (mark_weak_value_table), 0), + GC_MAKE_PROC (GC_new_proc (mark_weak_value_entry), 0), + 0, 0); + doubly_weak_gc_kind = + GC_new_kind (GC_new_free_list (), + GC_MAKE_PROC (GC_new_proc (mark_doubly_weak_entry), 0), 0, 0); } -- 2.14.1 --=-=-=--