From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!.POSTED!not-for-mail From: Joakim Jalap Newsgroups: gmane.emacs.devel Subject: Re: Overlays as an AA-tree Date: Thu, 22 Sep 2016 12:35:29 +0200 Message-ID: <87a8f0p69w.fsf@fastmail.com> References: <87d1jylv43.fsf@fastmail.com> NNTP-Posting-Host: blaine.gmane.org Mime-Version: 1.0 Content-Type: text/plain X-Trace: blaine.gmane.org 1474541236 7890 195.159.176.226 (22 Sep 2016 10:47:16 GMT) X-Complaints-To: usenet@blaine.gmane.org NNTP-Posting-Date: Thu, 22 Sep 2016 10:47:16 +0000 (UTC) User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1.50 (gnu/linux) Cc: emacs-devel@gnu.org To: Stefan Monnier Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Thu Sep 22 12:47:11 2016 Return-path: Envelope-to: ged-emacs-devel@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by blaine.gmane.org with esmtp (Exim 4.84_2) (envelope-from ) id 1bn1Wp-00082h-Bw for ged-emacs-devel@m.gmane.org; Thu, 22 Sep 2016 12:46:55 +0200 Original-Received: from localhost ([::1]:37574 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bn1Wq-0004Sg-Ma for ged-emacs-devel@m.gmane.org; Thu, 22 Sep 2016 06:46:56 -0400 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:57500) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bn1QN-0007Si-SZ for emacs-devel@gnu.org; Thu, 22 Sep 2016 06:40:17 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1bn1QI-0006vf-2w for emacs-devel@gnu.org; Thu, 22 Sep 2016 06:40:14 -0400 Original-Received: from out2-smtp.messagingengine.com ([66.111.4.26]:56095) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bn1QF-0006oy-Ig for emacs-devel@gnu.org; Thu, 22 Sep 2016 06:40:10 -0400 Original-Received: from compute6.internal (compute6.nyi.internal [10.202.2.46]) by mailout.nyi.internal (Postfix) with ESMTP id 21C6B20524; Thu, 22 Sep 2016 06:39:57 -0400 (EDT) Original-Received: from frontend1 ([10.202.2.160]) by compute6.internal (MEProxy); Thu, 22 Sep 2016 06:39:57 -0400 DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=fastmail.com; h=cc :content-type:date:from:message-id:mime-version:references :subject:to:x-sasl-enc:x-sasl-enc; s=mesmtp; bh=jXidSRvz1NUmvVUS DPbEFv61TXs=; b=PHR6XViEtjInytkGSlepFV3Jd8TX9YPmyGoW6CsyUhdHNM+e QR54Ir5FU7h6N0bVPKzcKQodi9MymPy6pWROUHptbHcb7XKSBS1t8QgXHIwukcbM a+0A/JWPZKvJ5WMizaly/CJmUiX3IN6loUifeUDm1VKK3s99ssIkeoza+MI= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-type:date:from:message-id :mime-version:references:subject:to:x-sasl-enc:x-sasl-enc; s= smtpout; bh=jXidSRvz1NUmvVUSDPbEFv61TXs=; b=ZqGUp2UHiGbH9nNPyYfh dFRA0il3ArHX5f42RNtiAH4XYIWRHxOZwKNrrpc2xIUdURqQ6Q6n618/7RFuLtl7 RclBXTk+/Hz+ot+xHYeh22QBX6e82ncu6Su+kESPVIHezcmQjJFx95fDTrgnMAfB sNzMbfpXGRApb8tkUpVVYvI= X-Sasl-enc: Ab3wrqn0KMJFgqSW85RHgvWtOyI842B1yM3ueVk8DxX7 1474540796 Original-Received: from thinkpad (79.138.129.132.mobile.tre.se [79.138.129.132]) by mail.messagingengine.com (Postfix) with ESMTPA id C5116F29CD; Thu, 22 Sep 2016 06:39:55 -0400 (EDT) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 66.111.4.26 X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.org gmane.emacs.devel:207685 Archived-At: Stefan Monnier writes: Thanks for looking at this! It is, after all, your idea :) > Could you describe how you use this tree? An AA-tree is indexed by > a single "number", whereas overlays have two independent "numbers". > So how to use this AA-tree to implement an *interval* tree? See below. > This should be documented in the code. E.g. the "struct Lisp_Overlay" > should define very clearly the meaning of char_start, char_end, left, > and right (i.e. which overlays go to the left, and which go to the > right). > > While looking for an answer in the text I found: > > /* This function determines where overlays are inserted in the tree. > FIXME: I didn't think too hard about this... > */ > > which makes me suspect your design might not be quite right. Well, I didn't think *too* hard about it. I didn't really think about what to do when the intervals of two overlays are the same. Does it even matter? I have a feeling that if I want to insert an overlay which starts at the same place as another, it could just be inserted to the left of the latter. Is that correct? > Have you read https://en.wikipedia.org/wiki/Interval_tree ? Yes. Well, I tried to anyway. > [ BTW, our convention is to put the "*/" at the end of the last line > rather than alone on the next line. ] Got it. I'll fix that. > This said, reading overlay_tree_insert_internal, I have the impression > that you're implementing what wikipedia describes as an "Augmented tree" > where the basic tree is your AA-tree, indexed by the left position of > the overlays, with the `max` field being the "augmented" data, which > sounds just fine. > Is that right? That is exactly right. > The only places you absolutely need to replace are all the places that > need to modify the tree. There shouldn't be that many of them > (basically, make-overlay, move-overlay, delete-overlay, plus text > insertion/deletion). > Then the rest can be modified little by little. I think I ran into some other subtle things. For example an overlay can be in "no buffer" if both it's markers were detached from their buffer, I think. So in the case of the overlay tree, I need to detect this and remove the overlay from the buffers tree. > Some tricky parts: > - because of the insertion_type, two overlays may start with end1 and finish with end1>end2 without any call to move-overlay. Hm, ok. I will have to look into this further. > - the overlay tree needs to be "weak" (i.e. we'll need to add special > code in the GC). I don't really understand what this means. Could you please elaborate? I was hoping it would work with the current marking and sweeping. > I wouldn't worry about merging (better yet: merge from master right away > and then keep doing that on a regular basis, which should be easy since > I don't think we've touched (nor will touch) this code very much). I tried to do that yesterday. There were some conflicts in insdel.c (which I think I fixed) but now I can't commit, because 'git commit' complains about trailing whitespace in some regex test files. How do I get around this? > AFAIK, the byte-position of markers is used, but the byte-position of > overlays isn't, so you should be able to get rid of them. That's great! Thanks. -- Joakim