unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
From: Jambunathan K <kjambunathan@gmail.com>
To: emacs-devel@gnu.org
Subject: intervals.c: About interval tree
Date: Mon, 15 Apr 2013 10:08:05 +0530	[thread overview]
Message-ID: <87k3o4cssy.fsf@gmail.com> (raw)


I have been closely looking at intervals.c and I have a few questions.

Type of tree (in standard literature)?
=====================================

It seems the tree that is used is "mostly balanced tree".  I am
wondering what sort of "balancedness" we are talking about i.e.. what
invariant does the tree satisfy?

A cursory reading of `balance_an_interval' suggests that, at a given
node, choose one of 

        1. No rotation
        2. Left rotate
        3. Right rotate

which minmizes the diff between weight of left and right subtrees i.e.,
(weight => total length of the interval)

Is that balancing rule a heuristic or does it give nice algorithmic
properties?

----------------------------------------------------------------

Similar to (but not same as) this?
=================================

The closest (in form) to what current `balance_an_interval' does is the
one that I see from

        http://www8.cs.umu.se/kurser/TDBAfl/VT06/algorithms/WEBSITE/IMPLEMEN/HANDBOOK/DISTRIB/HANDBOOK/ALGS/32/3415_R_C.HTM

But the code snippet there does *some more jugglery* beyond merely
comparing the left and right weights.

----------------------------------------------------------------

Is history relevant - Apropos threshold for weight balancing?
=============================================================

The weight balanced trees that I know of define a threshold that goes
something like what is seen here in the below pages.

        http://xlinux.nist.gov/dads//HTML/bbalphatree.html
        http://xlinux.nist.gov/dads//HTML/balance.html

(The first link points to code examples for checking balancedness.)

The very initial versions of intervals.c, has this

    ,----
    | /* Factor for weight-balancing interval trees. */
    | Lisp_Object interval_balance_threshold;
    `----

and balancing was indeed taking this form

    ,----
    |   register int total_children_size = (LEFT_TOTAL_LENGTH (i)
    | 				      + RIGHT_TOTAL_LENGTH (i));
    |   register int threshold = (XFASTINT (interval_balance_threshold)
    | 			    * (total_children_size / 100));
    | 
    |   if (LEFT_TOTAL_LENGTH (i) > RIGHT_TOTAL_LENGTH (i)
    |       && (LEFT_TOTAL_LENGTH (i) - RIGHT_TOTAL_LENGTH (i)) > threshold)
    |     return rotate_right (i);
    | 
    |   if (LEFT_TOTAL_LENGTH (i) > RIGHT_TOTAL_LENGTH (i)
    |       && (LEFT_TOTAL_LENGTH (i) - RIGHT_TOTAL_LENGTH (i)) > threshold)
    |     return rotate_right (i);
    `----
        
but the threshold removed as part of this commit

    ,----
    | revno: 5416
    | committer: Richard M. Stallman <rms@gnu.org>
    | timestamp: Sun 1994-01-02 19:01:15 +0000
    `----

I think the above re-working is part of the story that jwz relates in
        
    ,----  RMS words referred to in http://www.jwz.org/doc/lemacs.html
    |  But mainly we didn't even get to this stage. Problems became
    |  visible at the general design level.
    | 
    | During that conversation I found out about the problem with
    | slowness of interval processing. I called Arceneaux and looked at
    | his code, and found a localized bug in the balancing of binary
    | trees. If you created all the intervals from front to back, it
    | would do no balancing, leaving the tree maximally unbalanced. The
    | effect was a rather roundabout linear search. As luck would have
    | it, the Lucid application usually created intervals from front to
    | back, so they always saw the worst possible behavior. Anyway, the
    | bug was simple once found. This all took a few days.
    `----



             reply	other threads:[~2013-04-15  4:38 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-04-15  4:38 Jambunathan K [this message]
2013-04-15 14:02 ` intervals.c: About interval tree Stefan Monnier

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87k3o4cssy.fsf@gmail.com \
    --to=kjambunathan@gmail.com \
    --cc=emacs-devel@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).