From mboxrd@z Thu Jan 1 00:00:00 1970 Path: main.gmane.org!not-for-mail From: Joris van der Hoeven Newsgroups: gmane.lisp.guile.user Subject: Re: Efficiency and flexibility of hash-tables Date: Sat, 8 Feb 2003 15:14:39 +0100 (MET) Sender: guile-user-bounces+guile-user=m.gmane.org@gnu.org Message-ID: References: <1044712677.1033.143.camel@localhost> NNTP-Posting-Host: main.gmane.org Mime-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Trace: main.gmane.org 1044713603 9434 80.91.224.249 (8 Feb 2003 14:13:23 GMT) X-Complaints-To: usenet@main.gmane.org NNTP-Posting-Date: Sat, 8 Feb 2003 14:13:23 +0000 (UTC) Cc: Joris van der Hoeven Return-path: Original-Received: from monty-python.gnu.org ([199.232.76.173]) by main.gmane.org with esmtp (Exim 3.35 #1 (Debian)) id 18hVjN-0002Rn-00 for ; Sat, 08 Feb 2003 15:13:21 +0100 Original-Received: from localhost ([127.0.0.1] helo=monty-python.gnu.org) by monty-python.gnu.org with esmtp (Exim 4.10.13) id 18hVl5-0005aw-06 for guile-user@m.gmane.org; Sat, 08 Feb 2003 09:15:07 -0500 Original-Received: from list by monty-python.gnu.org with tmda-scanned (Exim 4.10.13) id 18hVkh-0005WC-00 for guile-user@gnu.org; Sat, 08 Feb 2003 09:14:43 -0500 Original-Received: from mail by monty-python.gnu.org with spam-scanned (Exim 4.10.13) id 18hVkf-0005SC-00 for guile-user@gnu.org; Sat, 08 Feb 2003 09:14:42 -0500 Original-Received: from matups.math.u-psud.fr ([129.175.50.4]) by monty-python.gnu.org with esmtp (Exim 4.10.13) id 18hVkf-0005Pp-00 for guile-user@gnu.org; Sat, 08 Feb 2003 09:14:41 -0500 Original-Received: from anh.math.u-psud.fr (anh.math.u-psud.fr [129.175.50.156]) h18EEeU23424 ; Sat, 8 Feb 2003 15:14:40 +0100 (MET) Original-Received: from anh (anh [129.175.50.156]) by anh.math.u-psud.fr (Postfix) with SMTP id 0697DB2C8; Sat, 8 Feb 2003 15:14:39 +0100 (MET) X-Sender: texmacs@anh Original-To: Roland Orre In-Reply-To: <1044712677.1033.143.camel@localhost> Original-cc: guile-user@gnu.org X-BeenThere: guile-user@gnu.org X-Mailman-Version: 2.1b5 Precedence: list List-Id: General Guile related discussions List-Help: List-Post: List-Subscribe: , List-Archive: List-Unsubscribe: , Errors-To: guile-user-bounces+guile-user=m.gmane.org@gnu.org Xref: main.gmane.org gmane.lisp.guile.user:1606 X-Report-Spam: http://spam.gmane.org/gmane.lisp.guile.user:1606 Hi, Thanks for your reply. Unfortunately, I think that you did not fully understand my question. > > When declaring a hash table using > > > > (define H (make-hash-table 100)) > > > > does this mean that the number of slots will *always* remain 100? > > No, the hash table is a vector of entries to lists where the actual > information is stored. A hash table in guile can therefore contain > any number of items. The number of entries is merely a choice of what > performance you need. If you declare too few entries you will get > a lot of linear search through the lists from each entry. > I myself use to estimate it so that the lists will rarely be deeper > than two or three to get a reasonable performance. That is why I distinguished the word 'slots' from the word 'entries'. The number of slots is the length of the vector you mention. So the ratio 'nr entries / nr slots' should be small in order to get a good performance. My question was: is the number of slots automatically adapted as a function of the number or entries, or is it not? If you cannot have a good estimate for the number of entries, then this auto-adaptation may be important. In fact, I think that a good low level implementation of general purpose hash tables should have this feature. > The performance is also reflected upon the hash function versus the > vector length. Usually it is advisable to use a prime number to avoid > systematic hashing to the same entries. Sometime it happened I missed > this and sloppily declared the hash table length to e.g. 1000000 if > needing about 3000000 items. The run took several hours instead of > the expected half an hour, which I got when changing the length to > 1000003. If you have access to some mathematical package like maple > there is often a function nextprime which can be helpful. > > Usually the built-in hash functions works fine but you may also > consider making a special hash functions for special needs if > the built-in function doesn't spread good enough. > > > I am frequently dealing with hash tables where I do not > > have a reasonable estimation of number of entires in advance. > > In TeXmacs, I therefore implemented a hash table type which > > doubles the number of slots each time that the number of entries > > becomes larger than a constant times the number of slots > > (and divides by two the number of slots when the number of > > entries becomes smaller than a constant times the number of slots). > > Has a similar system been implemented in (an extension of) guile? > > > > Thanks for your help, Joris > > > > > > ----------------------------------------------------------- > > Joris van der Hoeven > > http://www.texmacs.org: GNU TeXmacs scientific text editor > > http://www.math.u-psud.fr/~vdhoeven: personal homepage > > ----------------------------------------------------------- > > > > > > > > _______________________________________________ > > Guile-user mailing list > > Guile-user@gnu.org > > http://mail.gnu.org/mailman/listinfo/guile-user > -- > _______________________________________________ Guile-user mailing list Guile-user@gnu.org http://mail.gnu.org/mailman/listinfo/guile-user