From: Jambunathan K <kjambunathan@gmail.com>
To: emacs-devel@gnu.org
Subject: intervals.c: About interval tree
Date: Mon, 15 Apr 2013 10:08:05 +0530 [thread overview]
Message-ID: <87k3o4cssy.fsf@gmail.com> (raw)
I have been closely looking at intervals.c and I have a few questions.
Type of tree (in standard literature)?
=====================================
It seems the tree that is used is "mostly balanced tree". I am
wondering what sort of "balancedness" we are talking about i.e.. what
invariant does the tree satisfy?
A cursory reading of `balance_an_interval' suggests that, at a given
node, choose one of
1. No rotation
2. Left rotate
3. Right rotate
which minmizes the diff between weight of left and right subtrees i.e.,
(weight => total length of the interval)
Is that balancing rule a heuristic or does it give nice algorithmic
properties?
----------------------------------------------------------------
Similar to (but not same as) this?
=================================
The closest (in form) to what current `balance_an_interval' does is the
one that I see from
http://www8.cs.umu.se/kurser/TDBAfl/VT06/algorithms/WEBSITE/IMPLEMEN/HANDBOOK/DISTRIB/HANDBOOK/ALGS/32/3415_R_C.HTM
But the code snippet there does *some more jugglery* beyond merely
comparing the left and right weights.
----------------------------------------------------------------
Is history relevant - Apropos threshold for weight balancing?
=============================================================
The weight balanced trees that I know of define a threshold that goes
something like what is seen here in the below pages.
http://xlinux.nist.gov/dads//HTML/bbalphatree.html
http://xlinux.nist.gov/dads//HTML/balance.html
(The first link points to code examples for checking balancedness.)
The very initial versions of intervals.c, has this
,----
| /* Factor for weight-balancing interval trees. */
| Lisp_Object interval_balance_threshold;
`----
and balancing was indeed taking this form
,----
| register int total_children_size = (LEFT_TOTAL_LENGTH (i)
| + RIGHT_TOTAL_LENGTH (i));
| register int threshold = (XFASTINT (interval_balance_threshold)
| * (total_children_size / 100));
|
| if (LEFT_TOTAL_LENGTH (i) > RIGHT_TOTAL_LENGTH (i)
| && (LEFT_TOTAL_LENGTH (i) - RIGHT_TOTAL_LENGTH (i)) > threshold)
| return rotate_right (i);
|
| if (LEFT_TOTAL_LENGTH (i) > RIGHT_TOTAL_LENGTH (i)
| && (LEFT_TOTAL_LENGTH (i) - RIGHT_TOTAL_LENGTH (i)) > threshold)
| return rotate_right (i);
`----
but the threshold removed as part of this commit
,----
| revno: 5416
| committer: Richard M. Stallman <rms@gnu.org>
| timestamp: Sun 1994-01-02 19:01:15 +0000
`----
I think the above re-working is part of the story that jwz relates in
,---- RMS words referred to in http://www.jwz.org/doc/lemacs.html
| But mainly we didn't even get to this stage. Problems became
| visible at the general design level.
|
| During that conversation I found out about the problem with
| slowness of interval processing. I called Arceneaux and looked at
| his code, and found a localized bug in the balancing of binary
| trees. If you created all the intervals from front to back, it
| would do no balancing, leaving the tree maximally unbalanced. The
| effect was a rather roundabout linear search. As luck would have
| it, the Lucid application usually created intervals from front to
| back, so they always saw the worst possible behavior. Anyway, the
| bug was simple once found. This all took a few days.
`----
next reply other threads:[~2013-04-15 4:38 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-04-15 4:38 Jambunathan K [this message]
2013-04-15 14:02 ` intervals.c: About interval tree Stefan Monnier
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://www.gnu.org/software/emacs/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87k3o4cssy.fsf@gmail.com \
--to=kjambunathan@gmail.com \
--cc=emacs-devel@gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).