From: Noah Lavine
To: Andy Wingo
Cc: guile-devel@gnu.org
Newsgroups: gmane.lisp.guile.devel
Subject: Re: rfi: hash set
Date: Fri, 14 Jan 2011 20:42:16 -0500

Hello,

I started looking into implementing this, and I ran into something
strange that I'd like clarification on.

Am I correct in saying that currently, hash tables can only shrink by
one size index when they are rehashed?

I think this because of hashtab.c, line 293. This is a part of
scm_i_rehash which looks like this:

281   /* rehashing is not triggered when i <= min_size */
290   i = SCM_HASHTABLE (table)->size_index;
291   do
292     --i;
293   while (i > SCM_HASHTABLE (table)->min_size_index
294          && SCM_HASHTABLE_N_ITEMS (table) < hashtable_size[i] / 4);

The i variable is an int representing the size of the new hash table.
It is an index into hashtable_size, an array of allowed hashtable sizes.
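As a way to check this outside of Guile, here is a minimal standalone sketch of that loop. The sizes array and the shrink_index function are hypothetical stand-ins for hashtable_size and the SCM_HASHTABLE accessors; the array values are illustrative and are not copied from hashtab.c, and the precondition is my assumption about when this path is entered.

```c
#include <assert.h>

/* Hypothetical stand-in for Guile's hashtable_size[] array; these
   values are illustrative, not copied from hashtab.c. */
static const unsigned long sizes[] = { 31, 61, 113, 223, 443, 883 };
#define N_SIZES ((int) (sizeof sizes / sizeof sizes[0]))

/* Mirror of the quoted do/while loop, with the table's fields passed
   in as plain arguments.  Returns the index the loop stops at. */
static int
shrink_index (int size_index, int min_size_index, unsigned long n_items)
{
  int i;

  /* Assumed precondition: the shrink path is only entered when the
     table is currently above its minimum size index. */
  assert (min_size_index >= 0);
  assert (size_index > min_size_index && size_index < N_SIZES);

  i = size_index;
  do
    --i;
  while (i > min_size_index && n_items < sizes[i] / 4);

  return i;
}
```

Calling shrink_index with a size_index several steps above min_size_index and a small n_items is the case that matters for the question of how far a single rehash can shrink the table.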
So i will be decremented until it represents a reasonable size for the
table. However, i is also bounded by the table's min_size_index.

Here's the thing: based on grepping through this file, it seems that
min_size_index is set when a table is first made, to the initial size
index of the table, and never changes. Therefore, any i that represents
a shrink of the table will be <= min_size_index, so the while's
condition will always fail and the loop can only run once, no matter
how few items are in the hash table. That means i will always be the
old size_index - 1. (This code path is only run in case of a shrink,
not when a table needs to grow.)

Is that right?

Noah

On Wed, Jan 5, 2011 at 10:56 PM, Andy Wingo wrote:
> Hello,
>
> Currently the symbol table takes up twice as much memory as it needs to,
> because it is a hash table instead of a set. (The difference being that
> the buckets in a set don't need to be pairs.)
>
> We don't actually have a good set data type implementation, and I'm sure
> people have opinions about this, so if anyone has the time, an
> implementation would be appreciated. Name it hashset.[ch] and make sure
> it handles the weak reference case.
>
> Thanks! :) (Hey, it's worth a try :)
>
> Andy
> --
> http://wingolog.org/