* Re: van Emde Boas hash.
@ 2009-11-27 18:01 A. Soare
0 siblings, 0 replies; 15+ messages in thread
From: A. Soare @ 2009-11-27 18:01 UTC (permalink / raw)
To: Stefan Monnier; +Cc: Emacs Dev [emacs-devel]
> > I did look at the code, and I cannot understand the algorithm from
> > `make-hash-table'.
>
> It's one of the standard hashing schemes, where the hash-table is
> resized as it grows, which should hopefully keep the access time
> more-or-less constant.
>
>
Seems like red black trees.
This is not better than van Emde B.
Alin
____________________________________________________
Derniers jours pour remporter le séjour au Maroc sur http://www.lesrevoila.fr/
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: van Emde Boas hash.
@ 2009-11-23 14:04 A. Soare
2009-11-23 16:43 ` Stefan Monnier
0 siblings, 1 reply; 15+ messages in thread
From: A. Soare @ 2009-11-23 14:04 UTC (permalink / raw)
To: Stefan Monnier; +Cc: Emacs Dev [emacs-devel]
> > What algorithm uses emacs for hashing the obarrays of symbols ?
>
> Read the source: C-h f intern RET then click on top right to get to the
> source code, ...
>
> It's a hash table with simple haqshing into buckets that contain linked
> lists of symbols.
As far as I understand, the algorthm is so:
oblookup ( obarray, sym )
;; hash is an integer, the index of a bucket that may contain sym
hash = hash_string (sym)
;; bucket is a vector. obarray is a vector of many buckets
bucket = obarray [hash]
;; search sym in its bucket using a naive search
FOR every symbol S in bucket, check whether S has the same name
as sym. If so, return S. If no symbol matches, returns the hash.
Even if it is very unlikely that we can find many symbols into a bucket, the algorithm does not require constant time for every search.
Hashing using van Emde Boas requires in the worst case a time of log_2(N), where N is the length of the largest symbol name in obarray.
Alin.
____________________________________________________
Derniers jours pour remporter le séjour au Maroc sur http://www.lesrevoila.fr/
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: van Emde Boas hash.
2009-11-23 14:04 A. Soare
@ 2009-11-23 16:43 ` Stefan Monnier
2009-11-23 16:57 ` David Kastrup
2009-11-27 9:04 ` A Soare
0 siblings, 2 replies; 15+ messages in thread
From: Stefan Monnier @ 2009-11-23 16:43 UTC (permalink / raw)
To: alinsoar; +Cc: Emacs Dev [emacs-devel]
> As far as I understand, the algorthm is so:
> oblookup ( obarray, sym )
> ;; hash is an integer, the index of a bucket that may contain sym
> hash = hash_string (sym)
> ;; bucket is a vector. obarray is a vector of many buckets
> bucket = obarray [hash]
> ;; search sym in its bucket using a naive search
> FOR every symbol S in bucket, check whether S has the same name
> as sym. If so, return S. If no symbol matches, returns the hash.
> Even if it is very unlikely that we can find many symbols into
> a bucket, the algorithm does not require constant time for
> every search.
Yes, it's a very simple hashing algorithm. I'm sure we can come up with
something more efficient. Then again, I haven't seen any indication
that this is a performance problem.
Stefan
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: van Emde Boas hash.
2009-11-23 16:43 ` Stefan Monnier
@ 2009-11-23 16:57 ` David Kastrup
2009-11-27 9:04 ` A Soare
1 sibling, 0 replies; 15+ messages in thread
From: David Kastrup @ 2009-11-23 16:57 UTC (permalink / raw)
To: emacs-devel
Stefan Monnier <monnier@iro.umontreal.ca> writes:
>> As far as I understand, the algorthm is so:
>
>> oblookup ( obarray, sym )
>
>> ;; hash is an integer, the index of a bucket that may contain sym
>> hash = hash_string (sym)
>> ;; bucket is a vector. obarray is a vector of many buckets
>> bucket = obarray [hash]
>> ;; search sym in its bucket using a naive search
>> FOR every symbol S in bucket, check whether S has the same name
>> as sym. If so, return S. If no symbol matches, returns the hash.
>
>
>> Even if it is very unlikely that we can find many symbols into
>> a bucket, the algorithm does not require constant time for
>> every search.
>
> Yes, it's a very simple hashing algorithm. I'm sure we can come up
> with something more efficient. Then again, I haven't seen any
> indication that this is a performance problem.
I should think that it makes up a sizeable portion of byte-recompiling
and autoloading, both of which are not often benchmarked. But it isn't
something which you commonly do on data sets in a loop. Scanning files
is done just once usually.
--
David Kastrup
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: van Emde Boas hash.
2009-11-23 16:43 ` Stefan Monnier
2009-11-23 16:57 ` David Kastrup
@ 2009-11-27 9:04 ` A Soare
1 sibling, 0 replies; 15+ messages in thread
From: A Soare @ 2009-11-27 9:04 UTC (permalink / raw)
To: emacs-devel
> Yes, it's a very simple hashing algorithm. I'm sure we can come up with
> something more efficient. Then again, I haven't seen any indication
> that this is a performance problem.
I think that the lists can be hashed using van Emde Boas trees, and this would
give a really great improvement to emacs.
Alin
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: van Emde Boas hash.
@ 2009-11-22 12:35 A. Soare
2009-11-23 2:25 ` Stefan Monnier
0 siblings, 1 reply; 15+ messages in thread
From: A. Soare @ 2009-11-22 12:35 UTC (permalink / raw)
To: Stefan Monnier; +Cc: Emacs Dev [emacs-devel]
What algorithm uses emacs for hashing the obarrays of symbols ?
____________________________________________________
Gagnez un séjour au Maroc en découvrant les sketchs désopilants des ReVoila sur http://www.lesrevoila.fr/
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: van Emde Boas hash.
@ 2009-11-20 6:41 A. Soare
2009-11-23 4:06 ` tomas
0 siblings, 1 reply; 15+ messages in thread
From: A. Soare @ 2009-11-20 6:41 UTC (permalink / raw)
To: Deniz Dogan; +Cc: Stefan Monnier, Emacs Dev [emacs-devel]
Van Emde Boas arrays are hashed arrays (of numbers for example), which have the property that all operations are realized into a constant number of steps.
For example, for an array of length 2^32, the complexity is O(0), and the number of steps is less than log_2 (32) = "maximum 5 steps" for every operation.
The operations are:
* insert a new number
* delete a number
* seek for a number
* find the successor of a number
* find the predecessor of a number
____________________________________________________
Gagnez un séjour au Maroc en découvrant les sketchs désopilants des ReVoila sur http://www.lesrevoila.fr/
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: van Emde Boas hash.
2009-11-20 6:41 A. Soare
@ 2009-11-23 4:06 ` tomas
2009-11-23 14:39 ` A Soare
0 siblings, 1 reply; 15+ messages in thread
From: tomas @ 2009-11-23 4:06 UTC (permalink / raw)
To: emacs-devel
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On Fri, Nov 20, 2009 at 07:41:15AM +0100, A. Soare wrote:
>
> Van Emde Boas arrays are hashed arrays [...]
>
> For example, for an array of length 2^32, the complexity is O(0), and
> the number of steps is less than log_2 (32) = "maximum 5 steps" for every operation.
>
> The operations are:
>
> * insert a new number
> * delete a number
> * seek for a number
> * find the successor of a number
> * find the predecessor of a number
If I understood the Wikipedia page correctly (thnks, Deniz), you
can find "next" and "previous" to a number not in the hash? Then
they look like a good candidate to represent markers and boundaries of
overlays, no?
Regards
- -- tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
iD8DBQFLCgpZBcgs9XrR2kYRAvh6AJ9TxH4v/4jFSN90gK/X/QHgJQuf/wCfUYYU
mZjOWuTH3spYQJWvZ3HtWzg=
=7Ccl
-----END PGP SIGNATURE-----
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: van Emde Boas hash.
2009-11-23 4:06 ` tomas
@ 2009-11-23 14:39 ` A Soare
0 siblings, 0 replies; 15+ messages in thread
From: A Soare @ 2009-11-23 14:39 UTC (permalink / raw)
To: emacs-devel
> If I understood the Wikipedia page correctly (thnks, Deniz), you
> can find "next" and "previous" to a number not in the hash?
When looking for the next of a number N, it will return the next/prev number
from the array greater than N. Does not matter whether N belongs or not to the
array.
> Then
> they look like a good candidate to represent markers and boundaries of
> overlays, no?
I do not understand. How are the markers and boundaries represented ?
Alin.
PS:
I post again this message, because I discovered http://post.gmane.org/, which is
very nice for reading the news groups.
Emacs can be used convenient to read/post to emacs devel? Can somebody help me
to confugire gnus or something else for?
^ permalink raw reply [flat|nested] 15+ messages in thread
* van Emde Boas hash.
@ 2009-11-19 12:04 A. Soare
2009-11-19 15:23 ` Stefan Monnier
0 siblings, 1 reply; 15+ messages in thread
From: A. Soare @ 2009-11-19 12:04 UTC (permalink / raw)
To: Emacs Dev [emacs-devel]
I am going to implement van Emde Boas. Do you consider that this hash method could find some useful applications in emacs?
If so, I will try to implement it with Emacs Lisp data structures...
____________________________________________________
Gagnez un séjour au Maroc en découvrant les sketchs désopilants des ReVoila sur http://www.lesrevoila.fr/
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2009-11-27 18:01 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-11-27 18:01 van Emde Boas hash A. Soare
-- strict thread matches above, loose matches on Subject: below --
2009-11-23 14:04 A. Soare
2009-11-23 16:43 ` Stefan Monnier
2009-11-23 16:57 ` David Kastrup
2009-11-27 9:04 ` A Soare
2009-11-22 12:35 A. Soare
2009-11-23 2:25 ` Stefan Monnier
2009-11-20 6:41 A. Soare
2009-11-23 4:06 ` tomas
2009-11-23 14:39 ` A Soare
2009-11-19 12:04 A. Soare
2009-11-19 15:23 ` Stefan Monnier
2009-11-19 16:44 ` Deniz Dogan
2009-11-27 9:14 ` A Soare
2009-11-27 16:56 ` Stefan Monnier
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).