From: <tomas@tuxteam.de>
To: help-gnu-emacs@gnu.org
Subject: Re: Most used words in current buffer
Date: Mon, 23 Jul 2018 09:34:28 +0200 [thread overview]
Message-ID: <20180723073428.GA14641@tuxteam.de> (raw)
In-Reply-To: <20180723000338486239475@bob.proulx.com>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On Mon, Jul 23, 2018 at 12:09:15AM -0600, Bob Proulx wrote:
[...]
> Again the hash table is good until you need to grow the table. Then
> there is a pause while memory is shuffled around.
Precisely. To get this "amortized", you usually grow by some constant
factor, and there's where the log n comes in (which a tree has naturally
built in).
For really big hash tables (think all *dbm variations which are not
some cousins of B-trees), you even have several "levels", at which
point the hash tables start resembling B-trees somewhat (see below).
> But that never happens with a balanced tree because that cost is
> always amortized along as you go.
But once you get big, you start "feeling" the structure of your
storage (i.e. that the "storage as a big constant-time access
array" is just a lie). Having two pointers per node is horribly
wasteful, and if those pointers point to different continents
in your RAM, your CPU ends up waiting on cache fills all the
time.
> Also if you need to extract the entries in sorted order then the
> balanced tree is already sorted. Whereas with the hash table an extra
> sort step is needed.
That's right: once you need sorted walks or even lookup by sorted
index (i.e. "find the first entry whose value is greater or equal
to X" -- or "find all entries whose values are between X and Y"),
ordered trees are (probably) the way to go. But I'd guess, once
the tree becomes significant in size (yeah, handwaving here :)
some kind of B-tree, where you manage several entries in one node,
should be at an advantage.
> One of looking at this is that it is global optimization versus local
> optimization.
And the ratio of lookups per insert. And whether you want a sorted
index.
> In some ways it is swings and roundabouts.
Nicely put :-)
> But in other ways it depends upon what you need.
Absolutely
Cheers
- -- t
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iEYEARECAAYFAltVhQQACgkQBcgs9XrR2kYaIACfdi/XvTyfdbJ9yk4my+eJziMt
yyYAnjw7NadASHDsgDrCJpWdzvDpmxDm
=5dmS
-----END PGP SIGNATURE-----
next prev parent reply other threads:[~2018-07-23 7:34 UTC|newest]
Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-07-17 9:28 Most used words in current buffer Udyant Wig
2018-07-17 18:41 ` Emanuel Berg
2018-07-18 9:36 ` Udyant Wig
2018-07-18 11:48 ` Emanuel Berg
2018-07-18 14:50 ` Udyant Wig
2018-07-18 16:32 ` Emanuel Berg
2018-07-18 22:39 ` Ben Bacarisse
2018-07-19 0:45 ` Bob Proulx
[not found] ` <mailman.3785.1531961144.1292.help-gnu-emacs@gnu.org>
2018-07-19 5:33 ` Udyant Wig
2018-07-19 7:04 ` Bob Proulx
2018-07-19 7:25 ` tomas
2018-07-19 17:19 ` Nick Dokos
2018-07-19 17:30 ` Eli Zaretskii
2018-07-19 20:08 ` Bob Proulx
2018-07-20 16:39 ` Nick Dokos
[not found] ` <mailman.3909.1532104802.1292.help-gnu-emacs@gnu.org>
2018-07-20 18:13 ` Udyant Wig
2018-07-20 22:24 ` Bob Newell
2018-07-21 0:00 ` Nick Dokos
2018-07-21 0:18 ` Nick Dokos
[not found] ` <mailman.3843.1532030947.1292.help-gnu-emacs@gnu.org>
2018-07-20 6:19 ` Udyant Wig
2018-07-20 23:25 ` Bob Proulx
2018-07-21 0:26 ` Nick Dokos
2018-07-21 4:03 ` Bob Proulx
[not found] ` <mailman.3934.1532129163.1292.help-gnu-emacs@gnu.org>
2018-07-21 13:39 ` Udyant Wig
[not found] ` <mailman.3826.1532020800.1292.help-gnu-emacs@gnu.org>
2018-07-20 5:52 ` Udyant Wig
[not found] ` <mailman.3796.1531983885.1292.help-gnu-emacs@gnu.org>
2018-07-19 13:26 ` Udyant Wig
2018-07-19 20:42 ` Bob Proulx
2018-07-20 3:08 ` Bob Newell
[not found] ` <mailman.3861.1532056120.1292.help-gnu-emacs@gnu.org>
2018-07-21 12:51 ` Udyant Wig
2018-07-21 16:15 ` Eric Abrahamsen
[not found] ` <mailman.3982.1532189751.1292.help-gnu-emacs@gnu.org>
2018-07-21 19:46 ` Udyant Wig
2018-07-22 3:57 ` Eric Abrahamsen
2018-07-22 4:00 ` Eric Abrahamsen
2018-07-22 4:05 ` Eric Abrahamsen
[not found] ` <mailman.4008.1532232144.1292.help-gnu-emacs@gnu.org>
2018-07-22 18:28 ` Udyant Wig
2018-07-22 20:05 ` Eric Abrahamsen
[not found] ` <mailman.4007.1532231884.1292.help-gnu-emacs@gnu.org>
2018-07-22 18:19 ` Udyant Wig
[not found] ` <mailman.3845.1532032966.1292.help-gnu-emacs@gnu.org>
2018-07-20 13:18 ` Udyant Wig
2018-07-21 18:22 ` Stefan Monnier
2018-07-22 9:02 ` tomas
2018-07-23 6:09 ` Bob Proulx
2018-07-23 7:34 ` tomas [this message]
[not found] ` <mailman.4074.1532326162.1292.help-gnu-emacs@gnu.org>
2018-07-23 7:26 ` Udyant Wig
[not found] ` <mailman.4013.1532250176.1292.help-gnu-emacs@gnu.org>
2018-07-22 18:58 ` Udyant Wig
[not found] ` <mailman.3991.1532197378.1292.help-gnu-emacs@gnu.org>
2018-07-21 19:39 ` Udyant Wig
2018-07-21 20:54 ` Stefan Monnier
[not found] ` <mailman.3995.1532206511.1292.help-gnu-emacs@gnu.org>
2018-07-22 18:43 ` Udyant Wig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://www.gnu.org/software/emacs/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180723073428.GA14641@tuxteam.de \
--to=tomas@tuxteam.de \
--cc=help-gnu-emacs@gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).