From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Jambunathan K Newsgroups: gmane.emacs.devel Subject: intervals.c: About interval tree Date: Mon, 15 Apr 2013 10:08:05 +0530 Message-ID: <87k3o4cssy.fsf@gmail.com> NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: text/plain X-Trace: ger.gmane.org 1366000703 28078 80.91.229.3 (15 Apr 2013 04:38:23 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Mon, 15 Apr 2013 04:38:23 +0000 (UTC) To: emacs-devel@gnu.org Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Mon Apr 15 06:38:27 2013 Return-path: Envelope-to: ged-emacs-devel@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1URbBS-0003b3-Oq for ged-emacs-devel@m.gmane.org; Mon, 15 Apr 2013 06:38:26 +0200 Original-Received: from localhost ([::1]:44505 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1URbBS-00032W-Dl for ged-emacs-devel@m.gmane.org; Mon, 15 Apr 2013 00:38:26 -0400 Original-Received: from eggs.gnu.org ([208.118.235.92]:42368) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1URbBO-00032P-Q6 for emacs-devel@gnu.org; Mon, 15 Apr 2013 00:38:24 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1URbBM-0004Y5-6h for emacs-devel@gnu.org; Mon, 15 Apr 2013 00:38:22 -0400 Original-Received: from mail-pd0-f174.google.com ([209.85.192.174]:65008) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1URbBM-0004Xh-13 for emacs-devel@gnu.org; Mon, 15 Apr 2013 00:38:20 -0400 Original-Received: by mail-pd0-f174.google.com with SMTP id p12so2286787pdj.33 for ; Sun, 14 Apr 2013 21:38:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:from:to:subject:date:message-id:user-agent:mime-version :content-type; bh=LXtm+G7fIPhwu2on2vXgj0huFtkWynKkwOE6TO6HytI=; b=HHgguZKCUv8uIsw/IqIv1tXQm3niygNcO47+1w7Rh7CtWhqegE6P9MqTCnzA/Q9MSk AFCOuZMpUgYTIKhSVQtaLEEC51NhUJ5vzXC0W4Pol3uoQYB5wFPyETE/Ce/DqXysudza sOm2fon0yGOBz0EcycPA3/QPU/WkLw8Tqwy4xngTpcBivE6AbGJ0ImcJixecv4gdRTD8 g2tiFQzXFeCWrSmeEnfiP51Lxf5bsrxz4PA4gfuA/Lq0WIh4VOHxQWdXuJxQNAAeGFEH 1Fkb3TDN/DJQrRk2ueRrz3tdKAtz7rAlNFzAnJXg5ffgTZv81W6CTBJqWc93qXCzMGMK rHJw== X-Received: by 10.66.121.104 with SMTP id lj8mr28167295pab.51.1366000699100; Sun, 14 Apr 2013 21:38:19 -0700 (PDT) Original-Received: from debian-6.05 ([115.241.127.190]) by mx.google.com with ESMTPS id gi2sm18769073pbb.2.2013.04.14.21.38.16 (version=TLSv1.1 cipher=RC4-SHA bits=128/128); Sun, 14 Apr 2013 21:38:18 -0700 (PDT) User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (gnu/linux) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x [fuzzy] X-Received-From: 209.85.192.174 X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Xref: news.gmane.org gmane.emacs.devel:158912 Archived-At: I have been closely looking at intervals.c and I have a few questions. Type of tree (in standard literature)? ===================================== It seems the tree that is used is "mostly balanced tree". I am wondering what sort of "balancedness" we are talking about i.e.. what invariant does the tree satisfy? A cursory reading of `balance_an_interval' suggests that, at a given node, choose one of 1. No rotation 2. Left rotate 3. Right rotate which minmizes the diff between weight of left and right subtrees i.e., (weight => total length of the interval) Is that balancing rule a heuristic or does it give nice algorithmic properties? ---------------------------------------------------------------- Similar to (but not same as) this? ================================= The closest (in form) to what current `balance_an_interval' does is the one that I see from http://www8.cs.umu.se/kurser/TDBAfl/VT06/algorithms/WEBSITE/IMPLEMEN/HANDBOOK/DISTRIB/HANDBOOK/ALGS/32/3415_R_C.HTM But the code snippet there does *some more jugglery* beyond merely comparing the left and right weights. ---------------------------------------------------------------- Is history relevant - Apropos threshold for weight balancing? ============================================================= The weight balanced trees that I know of define a threshold that goes something like what is seen here in the below pages. http://xlinux.nist.gov/dads//HTML/bbalphatree.html http://xlinux.nist.gov/dads//HTML/balance.html (The first link points to code examples for checking balancedness.) The very initial versions of intervals.c, has this ,---- | /* Factor for weight-balancing interval trees. */ | Lisp_Object interval_balance_threshold; `---- and balancing was indeed taking this form ,---- | register int total_children_size = (LEFT_TOTAL_LENGTH (i) | + RIGHT_TOTAL_LENGTH (i)); | register int threshold = (XFASTINT (interval_balance_threshold) | * (total_children_size / 100)); | | if (LEFT_TOTAL_LENGTH (i) > RIGHT_TOTAL_LENGTH (i) | && (LEFT_TOTAL_LENGTH (i) - RIGHT_TOTAL_LENGTH (i)) > threshold) | return rotate_right (i); | | if (LEFT_TOTAL_LENGTH (i) > RIGHT_TOTAL_LENGTH (i) | && (LEFT_TOTAL_LENGTH (i) - RIGHT_TOTAL_LENGTH (i)) > threshold) | return rotate_right (i); `---- but the threshold removed as part of this commit ,---- | revno: 5416 | committer: Richard M. Stallman | timestamp: Sun 1994-01-02 19:01:15 +0000 `---- I think the above re-working is part of the story that jwz relates in ,---- RMS words referred to in http://www.jwz.org/doc/lemacs.html | But mainly we didn't even get to this stage. Problems became | visible at the general design level. | | During that conversation I found out about the problem with | slowness of interval processing. I called Arceneaux and looked at | his code, and found a localized bug in the balancing of binary | trees. If you created all the intervals from front to back, it | would do no balancing, leaving the tree maximally unbalanced. The | effect was a rather roundabout linear search. As luck would have | it, the Lucid application usually created intervals from front to | back, so they always saw the worst possible behavior. Anyway, the | bug was simple once found. This all took a few days. `----