unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
* intervals.c: About interval tree
@ 2013-04-15  4:38 Jambunathan K
  2013-04-15 14:02 ` Stefan Monnier
  0 siblings, 1 reply; 2+ messages in thread
From: Jambunathan K @ 2013-04-15  4:38 UTC (permalink / raw)
  To: emacs-devel


I have been closely looking at intervals.c and I have a few questions.

Type of tree (in standard literature)?
=====================================

It seems the tree that is used is "mostly balanced tree".  I am
wondering what sort of "balancedness" we are talking about i.e.. what
invariant does the tree satisfy?

A cursory reading of `balance_an_interval' suggests that, at a given
node, choose one of 

        1. No rotation
        2. Left rotate
        3. Right rotate

which minmizes the diff between weight of left and right subtrees i.e.,
(weight => total length of the interval)

Is that balancing rule a heuristic or does it give nice algorithmic
properties?

----------------------------------------------------------------

Similar to (but not same as) this?
=================================

The closest (in form) to what current `balance_an_interval' does is the
one that I see from

        http://www8.cs.umu.se/kurser/TDBAfl/VT06/algorithms/WEBSITE/IMPLEMEN/HANDBOOK/DISTRIB/HANDBOOK/ALGS/32/3415_R_C.HTM

But the code snippet there does *some more jugglery* beyond merely
comparing the left and right weights.

----------------------------------------------------------------

Is history relevant - Apropos threshold for weight balancing?
=============================================================

The weight balanced trees that I know of define a threshold that goes
something like what is seen here in the below pages.

        http://xlinux.nist.gov/dads//HTML/bbalphatree.html
        http://xlinux.nist.gov/dads//HTML/balance.html

(The first link points to code examples for checking balancedness.)

The very initial versions of intervals.c, has this

    ,----
    | /* Factor for weight-balancing interval trees. */
    | Lisp_Object interval_balance_threshold;
    `----

and balancing was indeed taking this form

    ,----
    |   register int total_children_size = (LEFT_TOTAL_LENGTH (i)
    | 				      + RIGHT_TOTAL_LENGTH (i));
    |   register int threshold = (XFASTINT (interval_balance_threshold)
    | 			    * (total_children_size / 100));
    | 
    |   if (LEFT_TOTAL_LENGTH (i) > RIGHT_TOTAL_LENGTH (i)
    |       && (LEFT_TOTAL_LENGTH (i) - RIGHT_TOTAL_LENGTH (i)) > threshold)
    |     return rotate_right (i);
    | 
    |   if (LEFT_TOTAL_LENGTH (i) > RIGHT_TOTAL_LENGTH (i)
    |       && (LEFT_TOTAL_LENGTH (i) - RIGHT_TOTAL_LENGTH (i)) > threshold)
    |     return rotate_right (i);
    `----
        
but the threshold removed as part of this commit

    ,----
    | revno: 5416
    | committer: Richard M. Stallman <rms@gnu.org>
    | timestamp: Sun 1994-01-02 19:01:15 +0000
    `----

I think the above re-working is part of the story that jwz relates in
        
    ,----  RMS words referred to in http://www.jwz.org/doc/lemacs.html
    |  But mainly we didn't even get to this stage. Problems became
    |  visible at the general design level.
    | 
    | During that conversation I found out about the problem with
    | slowness of interval processing. I called Arceneaux and looked at
    | his code, and found a localized bug in the balancing of binary
    | trees. If you created all the intervals from front to back, it
    | would do no balancing, leaving the tree maximally unbalanced. The
    | effect was a rather roundabout linear search. As luck would have
    | it, the Lucid application usually created intervals from front to
    | back, so they always saw the worst possible behavior. Anyway, the
    | bug was simple once found. This all took a few days.
    `----



^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: intervals.c: About interval tree
  2013-04-15  4:38 intervals.c: About interval tree Jambunathan K
@ 2013-04-15 14:02 ` Stefan Monnier
  0 siblings, 0 replies; 2+ messages in thread
From: Stefan Monnier @ 2013-04-15 14:02 UTC (permalink / raw)
  To: Jambunathan K; +Cc: emacs-devel

> It seems the tree that is used is "mostly balanced tree".  I am
> wondering what sort of "balancedness" we are talking about i.e.. what
> invariant does the tree satisfy?

If you look at balance_an_interval, I think it's just "balanced" in the
sense that the left and right subtrees cover (as much as possible) the
same text-size.

So it's different from the usual literature, where "balanced" refers to
the depth of the tree.


        Stefan "who replaced it with a splay-tree at some point"



^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2013-04-15 14:02 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-04-15  4:38 intervals.c: About interval tree Jambunathan K
2013-04-15 14:02 ` Stefan Monnier

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).