From mboxrd@z Thu Jan 1 00:00:00 1970 Path: main.gmane.org!not-for-mail From: Ludovic =?iso-8859-1?Q?Court=E8s?= Newsgroups: gmane.lisp.guile.user Subject: Re: Does anyone have a better scm_string_hash ? Date: Mon, 17 Nov 2003 14:01:33 +0100 Organization: LAAS-CNRS Sender: guile-user-bounces+guile-user=m.gmane.org@gnu.org Message-ID: <20031117130131.GH493@powergnu.laas.fr> References: <8765hnf308.fsf@zagadka.ping.de> <1068823738.13123.54.camel@localhost> <20031114155148.GI16650@powergnu.laas.fr> <1069058032.1638.21.camel@localhost> NNTP-Posting-Host: deer.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Trace: sea.gmane.org 1069074289 19393 80.91.224.253 (17 Nov 2003 13:04:49 GMT) X-Complaints-To: usenet@sea.gmane.org NNTP-Posting-Date: Mon, 17 Nov 2003 13:04:49 +0000 (UTC) Cc: guile-user@gnu.org Original-X-From: guile-user-bounces+guile-user=m.gmane.org@gnu.org Mon Nov 17 14:04:45 2003 Return-path: Original-Received: from monty-python.gnu.org ([199.232.76.173]) by deer.gmane.org with esmtp (Exim 3.35 #1 (Debian)) id 1ALj3c-0005FM-00 for ; Mon, 17 Nov 2003 14:04:44 +0100 Original-Received: from localhost ([127.0.0.1] helo=monty-python.gnu.org) by monty-python.gnu.org with esmtp (Exim 4.24) id 1ALjzw-0000MB-Fj for guile-user@m.gmane.org; Mon, 17 Nov 2003 09:05:00 -0500 Original-Received: from list by monty-python.gnu.org with tmda-scanned (Exim 4.24) id 1ALjzL-0000Ke-08 for guile-user@gnu.org; Mon, 17 Nov 2003 09:04:23 -0500 Original-Received: from mail by monty-python.gnu.org with spam-scanned (Exim 4.24) id 1ALjyn-00005G-Uu for guile-user@gnu.org; Mon, 17 Nov 2003 09:04:21 -0500 Original-Received: from [140.93.0.15] (helo=laas.laas.fr) by monty-python.gnu.org with esmtp (TLSv1:DES-CBC3-SHA:168) (Exim 4.24) id 1ALjy1-0008Nl-Re for guile-user@gnu.org; Mon, 17 Nov 2003 09:03:02 -0500 Original-Received: by laas.laas.fr (8.12.10/8.12.10) with SMTP id hAHD1X4p009507; Mon, 17 Nov 2003 14:01:34 +0100 (CET) Original-To: Roland Orre Mail-Followup-To: Roland Orre , guile-user@gnu.org Content-Disposition: inline In-Reply-To: <1069058032.1638.21.camel@localhost> X-PGP-Fingerprint: 821D 815D 902A 7EAB 5CEE D120 7FBA 3D4F EB1F 5364 X-PGP-Key-ID: 0xEB1F5364 X-PGP-Key: http://ludo.humanoidz.org/ludovic.asc X-OS: GNU/Linux X-URL: http://ludo.humanoidz.org/ User-Agent: Mutt/1.5.4i [Guile enabled] X-Spam-Score: -5 () EMAIL_ATTRIBUTION, IN_REP_TO, QUOTED_EMAIL_TEXT, REFERENCES, REPLY_WITH_QUOTES, USER_AGENT_MUTT X-Scanned-By: MIMEDefang at CNRS-LAAS X-BeenThere: guile-user@gnu.org X-Mailman-Version: 2.1.2 Precedence: list List-Id: General Guile related discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guile-user-bounces+guile-user=m.gmane.org@gnu.org Xref: main.gmane.org gmane.lisp.guile.user:2394 X-Report-Spam: http://spam.gmane.org/gmane.lisp.guile.user:2394 Hi, Today, 3 hours, 58 minutes, 19 seconds ago, Roland Orre wrote: > The performance increase we obtained with this hash function were > tremendous. With the problematic symbol table the average lookup > time went down from 10 ms to 9 us on this specific machine (1 GHz PIII). > The max number of collisions went down from 13856 to 7 (on a total > of 14166 strings in a table of length 8209). The number of empty > entries went down from 8201 to 909. The run time for a data mining > scan on a database went down from 5 hours to 15 minutes. The reason > I added i in the hash function was to decrease the risk that two > strings with same but permuted contents would hash to the same value. > I have not done any thorough tests on this hash function yet. Looks cool. > /* replaced by Roland Orre orre@nada.kth.se */ > #define hash_mask 0x00FFFFFF > unsigned long > scm_string_hash (const unsigned char *str, size_t len) > { > /* from suggestion at: */ > /* http://srfi.schemers.org/srfi-13/mail-archive/msg00112.html */ > int i; > /* originally h=0 was suggested. orre@nada.kth.se */ > unsigned long h=177; > > for (i = len-1; i >= 0; i--) > { > /* h = (str[i] +i+ h*37) & hash_mask; */ > h = (str[i] + i + (h<<5) + (h<<2) + h) & hash_mask; > } > return h; > } /* scm_string_hash */ I don't think you need to compute a modulo (logical and with HASH_MASK) of the hash key, because scm_hasher () returns scm_string_hash () % n. Thanks, Ludovic. _______________________________________________ Guile-user mailing list Guile-user@gnu.org http://mail.gnu.org/mailman/listinfo/guile-user