From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!.POSTED!not-for-mail From: Joakim Jalap Newsgroups: gmane.emacs.devel Subject: Re: Overlays as an AA-tree Date: Fri, 03 Feb 2017 12:33:27 +0100 Message-ID: <87fujvpkzc.fsf@fastmail.com> References: <87d1jylv43.fsf@fastmail.com> <87fujv64mn.fsf@hochschule-trier.de> NNTP-Posting-Host: blaine.gmane.org Mime-Version: 1.0 Content-Type: text/plain X-Trace: blaine.gmane.org 1486121687 13993 195.159.176.226 (3 Feb 2017 11:34:47 GMT) X-Complaints-To: usenet@blaine.gmane.org NNTP-Posting-Date: Fri, 3 Feb 2017 11:34:47 +0000 (UTC) User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.0.50 (berkeley-unix) Cc: emacs-devel@gnu.org To: Andreas Politz Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Fri Feb 03 12:34:43 2017 Return-path: Envelope-to: ged-emacs-devel@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by blaine.gmane.org with esmtp (Exim 4.84_2) (envelope-from ) id 1cZc8V-0003ON-MC for ged-emacs-devel@m.gmane.org; Fri, 03 Feb 2017 12:34:40 +0100 Original-Received: from localhost ([::1]:33563 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cZc8a-0002m8-RD for ged-emacs-devel@m.gmane.org; Fri, 03 Feb 2017 06:34:44 -0500 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:40344) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cZc7X-0002jJ-9N for emacs-devel@gnu.org; Fri, 03 Feb 2017 06:33:40 -0500 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cZc7R-00023O-Ff for emacs-devel@gnu.org; Fri, 03 Feb 2017 06:33:37 -0500 Original-Received: from out1-smtp.messagingengine.com ([66.111.4.25]:44253) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1cZc7R-0001ze-4u for emacs-devel@gnu.org; Fri, 03 Feb 2017 06:33:33 -0500 Original-Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.nyi.internal (Postfix) with ESMTP id 7AC942084B; Fri, 3 Feb 2017 06:33:29 -0500 (EST) Original-Received: from frontend1 ([10.202.2.160]) by compute4.internal (MEProxy); Fri, 03 Feb 2017 06:33:29 -0500 DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=fastmail.com; h=cc :content-type:date:from:in-reply-to:message-id:mime-version :references:subject:to:x-me-sender:x-me-sender:x-sasl-enc :x-sasl-enc; s=mesmtp; bh=h/lwkPCiUbK3HwIa+vEH8kMP2cY=; b=i5N2e8 N5uHOh78GbFaohmP9QUY/CWQK6fz5aHw0oOrIkOX4VVqUgyjlcz6JZZfH3lbSkoq kI46e6eQk41SV1rp6Akab4kVI0p+S4gZaWo2TyztPyssSdUwK2qggUrBOmFp3GQg wNPlYK33mqId5U/ZJCcdhyk68mPJK2ggAAKI8= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-type:date:from:in-reply-to :message-id:mime-version:references:subject:to:x-me-sender :x-me-sender:x-sasl-enc:x-sasl-enc; s=smtpout; bh=h/lwkPCiUbK3Hw Ia+vEH8kMP2cY=; b=EeJjOqBTXF9SrEzHUVnqytXGRq6dNQ13zwupQtaav0ZRBu lCSKcPVbhYjbVpp1Rg6e/HuOYj9QrMA5tPhCDLqkQe+K7hPR65SrVR9odyXo0C2C XjSWGDru0LSWMXy2du64l+pvf2axeVuEGwe9lrr2Gq2XjfsSgfEIRvxwWwNxo= X-ME-Sender: X-Sasl-enc: SCTxBoijG2kBGBS0PMzcu8MqFL8bg2PfW4QQ+wXHu0Ir 1486121609 Original-Received: from genserv (unknown [5.150.202.248]) by mail.messagingengine.com (Postfix) with ESMTPA id CE6A47E270; Fri, 3 Feb 2017 06:33:28 -0500 (EST) In-Reply-To: <87fujv64mn.fsf@hochschule-trier.de> (Andreas Politz's message of "Fri, 03 Feb 2017 09:49:20 +0100") X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.111.4.25 X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.org gmane.emacs.devel:211925 Archived-At: Andreas Politz writes: > Hi, I'm interested in this problem as well. Great! I thought I might chime in since I've been bashing my head againt this for a while now :) > I want to clarify something: > > Overlays have begin B and end E positions each with a flag (front-advance > resp. rear-advance). These positions may change in one of 3 ways: > > 1. move-overlay is called > 2. Text of length L is inserted at position P. > Here the flags decide whether begin B resp. end E will move or not > in the literal border-cases P = B resp. P = E . > 3. Text of length L is deleted beginning at position P. There is also the case of a replace, conceptually I guess it's similar to a delete or an insert, but in the code it is handled separately. Interestingly, in the case of a replace, we don't look at the insertion type. > Is this the correct model ? Someone mentioned in a different thread, > that occasionally we may have B > E. How does this happen ? I think it happens in case 2 above, when B = E = P, and we have front-advance = t and read-advance = nil. Then B will move but E will not, so we end up with a negative size overlay. See also the comments at adjust_markers_for_insert in insdel.c and fix_start_end_in_overlays in buffer.c. > --- > > I think if a tree is sorted by begin only and the front-advance of > every node is either true or false, the tree can't possibly become > disordered by insertion/deletion operations. Further the disorder > exists at most for nodes having the same begin but different > front-advances after an insert at begin was performed. But if the tree is sorted by begin only we can have multiple overlays in the same position, how does that work? A linked list of all overlays starting at that position? I toyed with the idea, but I haven't actually tried it. For reference, the last case I got stuck on was byte compiling lisp/semantic/cpp.el, where there are a bunch overlays (26 I believe) at different positions, but then almost all the buffer text is deleted so that every overlay ends up being from 6 to 6! > This suggests using two trees or at least limits the amount of nodes > having to be reinserted after a change. Hm I don't understand this. How would we use two trees? > > Anyway, my attempt is to use the RB-tree from alloc.c and add some > node-attributes: > > 1. max_end :: Holds the maximum E value of this node's sub-tree and > bounds the complexity of queries in terms of their result, i.e. number > of intervals. (The so called ,,Augmented Tree'' approach to Interval > Trees (which is actually just an example for how to augment a tree in > the book).) > > 2. offset :: The amount of shift of this node's sub-tree relative to > its parent and applying to all position values (begin,end,max_end). > This should limit the time it takes to modify the tree after an > insertion/deletion in terms of intersecting intervals. I was going to do this too (but left that for later as I have bigger problems right now :)). The problem here is that you have to take the insertion_type into account, so an overlay may or may not change as the result of an insertion. Actually maybe that isn't a problem since it only affects those overlays with an endpoint exactly at the point of insertion... This whole thing makes me dizzy... > 3. modified_tick :: Essentially just a flag, indicating that all > parent offsets have been applied to this node. This would allow for > constant time access on B/E for repeated accesses. > > I'm essentially building an overlay model and if it works, will try to > plug it into the Editor. Sounds like a good idea. Let's hope you have better luck (or skill) than me :) > -ap -- Joakim