From: "Ludovic Courtès" <ludovic.courtes@laas.fr>
Cc: guile-user@gnu.org
Subject: Re: Does anyone have a better scm_string_hash ?
Date: Fri, 14 Nov 2003 16:51:55 +0100
Message-ID: <20031114155148.GI16650@powergnu.laas.fr>
In-Reply-To: <1068823738.13123.54.camel@localhost>
You might want to look at
http://srfi.schemers.org/srfi-13/mail-archive/msg00112.html .
Basically, the idea there is that the hash key for a string cn..c0 (cn being
the first character and c0 the last) is:
h = (c0 + c1*37 + c2*37^2 + ...) % hash_size.
I used that and it *seems* to work quite well.
This can be written as follows:
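  /* Assumes: unsigned long k, hash_size; const char *string.  */
  k = 0;
  while (*string)
    {
      /* Horner's rule: scale the running key by 37, add the next character.
         The cast keeps characters above 127 from being added as negatives.  */
      k = (k * 37) + (unsigned char) *string;
      k %= hash_size;   /* reduce at every step so k stays small */
      string++;
    }
  return k;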
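
For anyone who wants to try this on their own keys, here is a minimal sketch that wraps the loop in a function and counts how many distinct buckets a set of keys lands in. The names string_hash_37 and count_distinct_buckets are made up for this example and are not part of Guile; the multiplier 37 and the per-step modulo follow the formula above.

```c
#include <stddef.h>

/* Multiplicative string hash: Horner's rule with multiplier 37,
   reduced modulo hash_size at every step.  Hypothetical helper,
   not Guile's scm_string_hash.  */
static unsigned long
string_hash_37 (const char *s, unsigned long hash_size)
{
  unsigned long k = 0;

  for (; *s != '\0'; s++)
    k = (k * 37 + (unsigned char) *s) % hash_size;
  return k;
}

/* Count how many distinct buckets the n strings in keys[] fall into.
   Quadratic in n, which is fine for a handful of test keys.  */
static size_t
count_distinct_buckets (const char *const keys[], size_t n,
                        unsigned long hash_size)
{
  size_t distinct = 0;

  for (size_t i = 0; i < n; i++)
    {
      unsigned long h = string_hash_37 (keys[i], hash_size);
      int seen = 0;

      /* Has an earlier key already produced this bucket?  */
      for (size_t j = 0; j < i && !seen; j++)
        seen = (string_hash_37 (keys[j], hash_size) == h);
      if (!seen)
        distinct++;
    }
  return distinct;
}
```

With a prime table size such as 8209, two keys that differ in a single digit cannot collide under this scheme, because the difference of their keys is a nonzero digit times a power of 37, which is never a multiple of the table size. Calling count_distinct_buckets on "01626801", "01626601" and "01626501" with size 8209 therefore gives 3 buckets. How it spreads the full data set is something one would have to measure.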
Cheers,
Ludovic.
Today, 15 minutes, 40 seconds ago, Roland Orre wrote:
> I noticed that our data mining software runs very slowly with a
> new database. I traced the problem to scm_string_hash.
>
> A hash table in this case was loaded with 14166 strings. I have
> a function which creates a reasonable sized hash table, in this
> case the hash table size was 8209.
>
> 13856 of these strings were hashed to the same index = 1067.
> 303 strings got index = 8061.
> 2 strings got index = 754.
> 8201 entries were empty.
>
> We are running guile 1.6 but I checked the scm_string_hash from a recent
> 1.7 CVS also and the function in hash.c there is identical.
>
> I added a few of the symbols hashing to 1067 below. One can of course
> argue that the symbols in this case should be hashed as numbers.
> Anyway, does anyone have any hint or have a better string hash function?
>
> Best regards
> Roland Orre
>
>
> A few strings hashing to entry 1067 for hash table length 8209:
> "01632001" "01627301" "01626801" "01626601" "01626501" "01626401"
> "01626301" "01626101" "01626001" "01625901" "01625801" "01625701"
> "01625401" "01625301" "01625101" "01625001" "01624801" "01624601"
> "01624501" "01624401" "01624301" "01624101" "01624001" "01623901"
> "01623801" "01623701" "01623601" "01623501" "01623401" "01623201"
> "01623101" "01622901" "01622801" "01622701" "01622601" "01622401"
>
>
>
>
> _______________________________________________
> Guile-user mailing list
> Guile-user@gnu.org
> http://mail.gnu.org/mailman/listinfo/guile-user