From mboxrd@z Thu Jan 1 00:00:00 1970 Path: main.gmane.org!not-for-mail From: HJSTEIN@bloomberg.net (Harvey J. Stein) Newsgroups: gmane.lisp.guile.user,gmane.lisp.guile.devel Subject: Re: Resizing hash tables in Guile Date: 13 Feb 2003 13:30:58 -0500 Sender: guile-user-bounces+guile-user=m.gmane.org@gnu.org Message-ID: References: NNTP-Posting-Host: main.gmane.org X-Trace: main.gmane.org 1045161016 3302 80.91.224.249 (13 Feb 2003 18:30:16 GMT) X-Complaints-To: usenet@main.gmane.org NNTP-Posting-Date: Thu, 13 Feb 2003 18:30:16 +0000 (UTC) Cc: Greg Troxel Return-path: Original-Received: from monty-python.gnu.org ([199.232.76.173]) by main.gmane.org with esmtp (Exim 3.35 #1 (Debian)) id 18jO7Y-0000qE-00 for ; Thu, 13 Feb 2003 19:30:05 +0100 Original-Received: from localhost ([127.0.0.1] helo=monty-python.gnu.org) by monty-python.gnu.org with esmtp (Exim 4.10.13) id 18jO92-0008Oq-03 for guile-user@m.gmane.org; Thu, 13 Feb 2003 13:31:36 -0500 Original-Received: from list by monty-python.gnu.org with tmda-scanned (Exim 4.10.13) id 18jO8Y-0008BR-00 for guile-user@gnu.org; Thu, 13 Feb 2003 13:31:06 -0500 Original-Received: from mail by monty-python.gnu.org with spam-scanned (Exim 4.10.13) id 18jO8W-0008BA-00 for guile-user@gnu.org; Thu, 13 Feb 2003 13:31:05 -0500 Original-Received: from mh1dmz4.bloomberg.net ([205.216.112.158] helo=mh1dmz4.bloomberg.com) by monty-python.gnu.org with esmtp (Exim 4.10.13) id 18jO8U-00088S-00; Thu, 13 Feb 2003 13:31:02 -0500 Original-Received: from mh6ny.bloomberg.com by mh1dmz4.bloomberg.net with ESMTP; Thu, 13 Feb 2003 13:30:57 -0500 Original-Received: from blinky.bloomberg.com by mh6ny.bloomberg.com with ESMTP; Thu, 13 Feb 2003 13:30:54 -0500 Original-Received: (from hjstein@localhost) by blinky.bloomberg.com (8.11.6/8.11.6) id h1DIUwG05348; Thu, 13 Feb 2003 13:30:58 -0500 X-Authentication-Warning: blinky.bloomberg.com: hjstein set sender to hjstein@bloomberg.com using -f Original-To: Joris van der Hoeven In-Reply-To: Joris van der Hoeven's message of "Thu, 13 Feb 2003 15:24:32 +0100 (MET)" Original-Lines: 68 X-Mailer: Gnus v5.7/Emacs 20.7 Original-cc: djurfeldt@nada.kth.se Original-cc: guile-user@gnu.org Original-cc: Roland Orre Original-cc: guile-devel@gnu.org Original-cc: Marius Vollmer X-BeenThere: guile-user@gnu.org X-Mailman-Version: 2.1b5 Precedence: list List-Id: General Guile related discussions List-Help: List-Post: List-Subscribe: , List-Archive: List-Unsubscribe: , Errors-To: guile-user-bounces+guile-user=m.gmane.org@gnu.org Xref: main.gmane.org gmane.lisp.guile.user:1650 gmane.lisp.guile.devel:1931 X-Report-Spam: http://spam.gmane.org/gmane.lisp.guile.user:1650 Joris van der Hoeven writes: > > > average. This means that, on average, operations are O(1). > > > > Inserts are, but lookups aren't necessarily. > > Both aren't necessarily, because inserting requires looking up too. Yes, if you want to return an error for inserting 2 items with the same key. No, if you allow multiple items with the same key. > > And good hashing functions are still hard to write. > > I do not really agree. A good hash algorithm for lists (or strings), > which I use in TeXmacs, is ... I guess "hard" isn't quite the word for it. What I meant was that the obvious ones don't work well. You basically need to look up a good one to get good behavior. Tcl's fcn works pretty well too. I forget what it does. > > With this many items, log(N) is still just 32, so an O(log(N)) > > algorithm will still beat an O(1) algorithm if it's really > > log_2(N) vs 32. > > Yes, but the O(1) is really *table lookup* multiplied by a small > constant here, so this is *fast*. You may adjust the small constant > by choosing an appropriate threshold for "size/nr buckets". Not just table lookup. Also hash calculation from the key, which can be more time consuming than a key comparison, and key comparisons for everything in the bucket. > > Trees also sort the data for you, which hash tables don't give you. > > But you need a compairison operation for that, > which may be even less natural than a hash function. Yes. For hash tables you just need a key equality test. For trees you need to be able to order the keys. > > So, ideally, one would have a hash table object with & without > > resizing, and various sorts of tree (AVL, red/black, B*, etc) objects. > > insert and delete and map would be methods that work on all of the > > above, with map on trees returning the data in sorted order. For that > > matter, insert & delete might as well also work on lists... > > Agreed: ideally, we have everything :^) I think we're on the same page here... STk had a nice scheme interface for hash tables. The code uses the Tcl hash tables, they autoresize and if I recall correctly, many of the important parameters can be modified. You might want to model the guile stuff on this, or utilize this work. I advocated this when the discussion came up years ago on the guile mailing list. BTW, there was a long discussion of hash tables on the guile list some number of years ago. At the time I didn't like liked the guile (aka scm) hash tables. They were a very minimal implementation. The user had to do too much to use them. I don't know what it's like these days, but it'd certainly be nice to have a good implementation available which includes autosizing. -- Harvey Stein Bloomberg LP hjstein@bloomberg.com _______________________________________________ Guile-user mailing list Guile-user@gnu.org http://mail.gnu.org/mailman/listinfo/guile-user