From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Stefan Monnier Newsgroups: gmane.emacs.devel Subject: Re: Markers/intervals/overlays + trees Date: Thu, 09 Aug 2012 13:16:19 -0400 Message-ID: References: <5022934F.3050702@yandex.ru> <83wr19dzbj.fsf@gnu.org> <83vcgtdsg9.fsf@gnu.org> <5023344C.6020603@yandex.ru> NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: text/plain X-Trace: dough.gmane.org 1344532600 5603 80.91.229.3 (9 Aug 2012 17:16:40 GMT) X-Complaints-To: usenet@dough.gmane.org NNTP-Posting-Date: Thu, 9 Aug 2012 17:16:40 +0000 (UTC) Cc: Eli Zaretskii , emacs-devel@gnu.org To: Dmitry Antipov Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Thu Aug 09 19:16:39 2012 Return-path: Envelope-to: ged-emacs-devel@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1SzWLe-000675-5d for ged-emacs-devel@m.gmane.org; Thu, 09 Aug 2012 19:16:38 +0200 Original-Received: from localhost ([::1]:47291 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SzWLd-0004DC-8T for ged-emacs-devel@m.gmane.org; Thu, 09 Aug 2012 13:16:37 -0400 Original-Received: from eggs.gnu.org ([208.118.235.92]:52302) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SzWLQ-0003ex-Un for emacs-devel@gnu.org; Thu, 09 Aug 2012 13:16:26 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SzWLP-0001ia-EH for emacs-devel@gnu.org; Thu, 09 Aug 2012 13:16:24 -0400 Original-Received: from ironport2-out.teksavvy.com ([206.248.154.182]:37322) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SzWLN-0001fj-I4; Thu, 09 Aug 2012 13:16:21 -0400 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Av0EAG6Zu09Ld/LR/2dsb2JhbABEtBGBCIIVAQEEAVYPFAULCw4mEhQYDSSIHAULuX6QRAOjM4FYgwU X-IronPort-AV: E=Sophos;i="4.75,637,1330923600"; d="scan'208";a="195368563" Original-Received: from 75-119-242-209.dsl.teksavvy.com (HELO pastel.home) ([75.119.242.209]) by ironport2-out.teksavvy.com with ESMTP/TLS/ADH-AES256-SHA; 09 Aug 2012 13:16:20 -0400 Original-Received: by pastel.home (Postfix, from userid 20848) id B36F159169; Thu, 9 Aug 2012 13:16:19 -0400 (EDT) In-Reply-To: <5023344C.6020603@yandex.ru> (Dmitry Antipov's message of "Thu, 09 Aug 2012 07:53:48 +0400") User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.1.50 (gnu/linux) X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 206.248.154.182 X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Xref: news.gmane.org gmane.emacs.devel:152370 Archived-At: > The more I do different things around buffers and > markers/intervals/overlays, the more I think that the last three ones > represents nearly the same thing, i.e. "buffer range with properties" > (marker is the range of length 0 (or 1, that's may be the question), Definitely 0. > and without properties). Is it reasonable/ possible/feasible to > generalize them into the only type and use it everywhere? As mentioned elsewhere, we could/should move overlays to the tree currently used for text-properties (using the "augmented tree" approach described in http://en.wikipedia.org/wiki/Interval_tree, and using the "interval tree" of text-properties as the simple underlying balanced binary tree). This wouldn't quite merge text-properties and overlays (the two concepts are subtly different), but it would bring the implementation of the two closer and should make overlays a lot more scalable/efficient. I suspect that once this is done, markers wouldn't matter much any more (because overlays wouldn't use them internally), so we could keep the current implementation or turn them into "degenerate overlays". > If not, shouldn't markers and overlays be chained into double-linked > lists instead of single-linked, for the sake of fast unchain/re-chain > and in-place sort? Not sure it's worth the trouble: - doubly-linked lists wouldn't speed up chaining. - while unchaining would be faster, this is normally not a common operation (hopefully most markers should be unchained lazily in batch by the GC, which can do it efficiently). Of course, this is only true of real markers, not of overlay's markers since these aren't implicitly reclaimed by the GC. - sorting would slow down insertion and would only speed up other operations by a factor 2 (you only need to traverse half the list on average). When the list is long enough to be a problem, a factor 2 is not sufficient, we need algorithmic improvement. > Finally, what about an idea to generalize red-black tree from alloc.c > and use it everywhere when O(log(n)) data structure is needed? > I suppose that we can even avoid our own tree implementation while > compiling for GNU/Linux because glibc trees (tsearch/tfind/etc.) are > balanced and good enough in general. If we have to provide our own implementation in some cases, then we may as well use it everywhere. OTOH we could maybe use Glib's trees. It would probably be fairly easy to do for alloc.c's redblack trees, but changing text-properties's intervals trees to some "standard" balanced tree library would probably take more work. Also it would be less efficient since every node would be split into the "tree node" and the "value node". Stefan