From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Matt Armstrong Newsgroups: gmane.emacs.devel Subject: Re: noverlay branch Date: Thu, 06 Oct 2022 13:41:08 -0700 Message-ID: <87mta8d6sb.fsf@rfc20.org> References: Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="23489"; mail-complaints-to="usenet@ciao.gmane.io" To: Stefan Monnier , emacs-devel@gnu.org Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Thu Oct 06 22:42:45 2022 Return-path: Envelope-to: ged-emacs-devel@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1ogXhb-0005rW-Gx for ged-emacs-devel@m.gmane-mx.org; Thu, 06 Oct 2022 22:42:44 +0200 Original-Received: from localhost ([::1]:58240 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ogXha-0002YW-EE for ged-emacs-devel@m.gmane-mx.org; Thu, 06 Oct 2022 16:42:42 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:39402) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ogXgH-00016M-Bn for emacs-devel@gnu.org; Thu, 06 Oct 2022 16:41:21 -0400 Original-Received: from relay10.mail.gandi.net ([2001:4b98:dc4:8::230]:58087) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ogXgD-0006xW-GL for emacs-devel@gnu.org; Thu, 06 Oct 2022 16:41:21 -0400 Original-Received: (Authenticated sender: matt@rfc20.org) by mail.gandi.net (Postfix) with ESMTPSA id A368B240006; Thu, 6 Oct 2022 20:41:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rfc20.org; s=gm1; t=1665088872; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=PCtCLXki8w3oqB2w2RrwHTMHnm9v+QLH5SAoizE/k6w=; b=TD7MapESlg4fgfMZ5pHb49LqwpY2pA0fY6uavmgawhypFwvBx0H2p5kKJGNrVCoOltkYua GTUgTEeRWKp0Wv7VL05pl9JF81yTOj/foM5AYzJkTpcc0iIIdHyyMB2BPmFojxXEdZRJap 1LA6c/iQtjdUek+pS2LIEzK2XTTMAc3zrYj08eEM/0Awp16PInd7gX8XmMBNj0rJHhoy0P QnzoReRqPf3b/WTu3iVZ2QXT0tyiFBEZXWP2rTP7njBux4d2v59VNYj/8FGOR2CiIklSB5 zo7nKR/kPlD8QoMMu+VHMs60vd0iyBju8+/3MR5+ldL0ceOevdDWuC7SUt2rzw== Original-Received: from matt by naz with local (Exim 4.96) (envelope-from ) id 1ogXg4-003vyH-1I; Thu, 06 Oct 2022 13:41:08 -0700 In-Reply-To: Received-SPF: pass client-ip=2001:4b98:dc4:8::230; envelope-from=matt@rfc20.org; helo=relay10.mail.gandi.net X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.io gmane.emacs.devel:297124 Archived-At: --=-=-= Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Stefan Monnier writes: > I just updated the noverlay branch to the code in `master` (most of it > was mechanical except for the fact that since the last commit on that > branch we have gotten rid of Lisp_Misc and we moved to pdumper). > > I'm getting ready to install this in `master`, so I'd encourage people > to try this code as much as possible to try and weed out the most > glaring problems before it hits master. The code generally looks good, > but it touches some quite "core" code in the redisplay (with lots of > off-by-one opportunities) and in the memory management (with > opportunities for crashes and memory leaks). > > For those who're not familiar with this branch, it changes the way > overlays are stored in a buffer: instead of keeping them in na=C3=AFve > singly-linked lists, it keeps them in balanced binary trees, so as to > replace an O(N) complexity with O(log N). This way Emacs should not get > sluggish even with millions of overlays (of course, if a given buffer > position is covered by a million overlays, it'll still be sluggish when > operating around that position). > > > Stefan Stefan, I don't have Emacs commit so I've attached comment improvement patches here. Lemme know if you'd prefer this occur on a bug. These commits are also on https://git.sr.ht/~matta/emacs/log/feature/noverlay --=-=-= Content-Type: text/x-diff Content-Disposition: inline; filename=0001-src-itree.h-include-lisp.h-for-Lisp_Object.patch >From 6dff825a9943434cfccd64916c506ab10977acf8 Mon Sep 17 00:00:00 2001 From: Matt Armstrong Date: Thu, 6 Oct 2022 09:36:24 -0700 Subject: [PATCH 1/4] ; * src/itree.h: include "lisp.h" for Lisp_Object --- src/itree.c | 2 +- src/itree.h | 2 ++ 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/src/itree.c b/src/itree.c index ed31ef1156..a782410860 100644 --- a/src/itree.c +++ b/src/itree.c @@ -19,7 +19,7 @@ Copyright (C) 2017-2022 Free Software Foundation, Inc. #include #include -#include "lisp.h" + #include "itree.h" /* diff --git a/src/itree.h b/src/itree.h index 8f6bb667d6..9b79551f77 100644 --- a/src/itree.h +++ b/src/itree.h @@ -23,6 +23,8 @@ #define ITREE_H #include #include +#include "lisp.h" + /* The tree and node structs are mainly here, so they can be allocated. NOTE: The only time where it is safe to modify node.begin and -- 2.35.1 --=-=-= Content-Type: text/x-diff Content-Disposition: inline; filename=0002-src-itree.h-struct-interval_node-document-field-inva.patch >From 4048988f24c60104e6658b164a34df752f7b6167 Mon Sep 17 00:00:00 2001 From: Matt Armstrong Date: Thu, 6 Oct 2022 13:05:19 -0700 Subject: [PATCH 2/4] ; * src/itree.h (struct interval_node): document field invariants. --- src/itree.h | 58 ++++++++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 49 insertions(+), 9 deletions(-) diff --git a/src/itree.h b/src/itree.h index 9b79551f77..52219a8eef 100644 --- a/src/itree.h +++ b/src/itree.h @@ -25,22 +25,62 @@ #define ITREE_H #include "lisp.h" -/* The tree and node structs are mainly here, so they can be allocated. - - NOTE: The only time where it is safe to modify node.begin and - node.end directly, is while the node is not part of any tree. - - NOTE: It is safe to read node.begin and node.end directly, if the - node came from a generator, because it validates the nodes it - returns as a side-effect. -*/ +/* NOTE: the fields of the node and tree structs should for the most + * part be treated as opaque to the rest of Emacs. Exceptions are + * noted in comments. They are in the header file so they can be + * allocated. + */ struct interval_node; struct interval_node { + /* The normal parent, left and right links found in binary trees. + See also `red`, below, which completes the Red-Black tree + representation. */ struct interval_node *parent; struct interval_node *left; struct interval_node *right; + + /* The following five fields comprise the interval abstraction. + + BEGIN, END are buffer positions. BEGIN and END are the beginning + and end of this interval. These form an inclusive, exlusive (or + closed, open) range of buffer positions [BEGIN..END). When a + node is in a tree these fields are read only, written only by + itree functions. + + The LIMIT, OFFSET and OTICK fields should be considered internal + to itree.c and used only by itree functions. + + LIMIT is a buffer position, the maximum of END of this node and + its children. See itree.c for its use. + + OFFSET is in buffer position units, and will be non-zero only + when the node is dirty. + + OTICK determines whether BEGIN, END, LIMIT and OFFSET are + considered dirty. A node is clean when its OTICK is equal to the + OTICK of its tree (see struct interval_tree). Otherwise, it is + dirty. + + In a clean node, BEGIN, END and LIMIT are correct buffer + positions, and OFFSET is zero. The parent of a clean node is + clean also clean, recursively. + + In a dirty node, the node's OTICK won't equal its tree's OTICK, + and its OFFSET may be non-zero. At all times the descendents of + a dirty node are also dirty. BEGIN, END and LIMIT require + adjustment before use as buffer positions. + + NOTE: BEGIN and END must not be modified while the node is part + of a tree. Use interval_tree_insert_gap and + interval_tree_delete_gap instead. + + NOTE: The interval generators ensure nodes are clean before + yielding them, so BEGIN and END may be safely used as buffer + positions then. + */ + ptrdiff_t begin; /* The beginning of this interval. */ ptrdiff_t end; /* The end of the interval. */ ptrdiff_t limit; /* The maximum end in this subtree. */ -- 2.35.1 --=-=-= Content-Type: text/x-diff Content-Disposition: inline; filename=0003-src-itree.c-change-comments-for-clarity.patch >From 52470aa06f5f037358d957271101b8dbaa3f44e7 Mon Sep 17 00:00:00 2001 From: Matt Armstrong Date: Thu, 6 Oct 2022 13:12:54 -0700 Subject: [PATCH 3/4] ; * src/itree.c: change comments for clarity. --- src/itree.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/src/itree.c b/src/itree.c index a782410860..3098fe1cf4 100644 --- a/src/itree.c +++ b/src/itree.c @@ -24,7 +24,7 @@ Copyright (C) 2017-2022 Free Software Foundation, Inc. /* Intervals of the form [BEGIN, END), are stored as nodes inside a RB - tree, sorted by BEGIN . The core operation of this tree (besides + tree, ordered by BEGIN. The core operation of this tree (besides insert, remove, etc.) is finding all intervals intersecting with some given interval. In order to perform this operation efficiently, every node stores a third value called LIMIT. (See @@ -65,10 +65,10 @@ Copyright (C) 2017-2022 Free Software Foundation, Inc. ==== Adjusting intervals ==== Since this data-structure will be used for overlays in an Emacs - buffer, a second core operation implements the ability to insert or - delete gaps in resp. from the tree. This models the insertion - resp. deletion of text in a buffer and the effects it may have on - the positions of overlays. + buffer, a second core operation is the ability to insert and delete + gaps in the tree. This models the insertion and deletion of text + in a buffer and the effects it may have on the positions of + overlays. Consider this: Something gets inserted at position P into a buffer and assume that all overlays occur strictly after P. Ordinarily, @@ -79,10 +79,10 @@ Copyright (C) 2017-2022 Free Software Foundation, Inc. The OFFSET of some some subtree, represented by its root, is the amount of shift that needs to be applied to its BEGIN, END and - LIMIT values, in order to get to the real values. Coming back to - the example, all we would need to do in this case, is to increment - the OFFSET of the tree's root, without any traversal of the tree - itself. + LIMIT values, in order to get to the actual buffer positions. + Coming back to the example, all we would need to do in this case, + is to increment the OFFSET of the tree's root, without any + traversal of the tree itself. As a consequence, the real values of BEGIN, END and LIMIT of some NODE need to be computed by incrementing them by the sum of NODE's -- 2.35.1 --=-=-= Content-Type: text/x-diff Content-Disposition: inline; filename=0004-Use-a-bool-instead-of-a-bitfield.patch >From 53238b6c16f4dc3dec8d60205e88f5f79200d187 Mon Sep 17 00:00:00 2001 From: Matt Armstrong Date: Thu, 6 Oct 2022 13:18:46 -0700 Subject: [PATCH 4/4] Use a bool instead of a bitfield * src/itree.c (struct interval_generator): use a bool instead of a bitfield, since space is not an issue. --- src/itree.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/itree.c b/src/itree.c index 3098fe1cf4..79e39d6e2a 100644 --- a/src/itree.c +++ b/src/itree.c @@ -145,7 +145,7 @@ null_is_sane (void) ptrdiff_t end; uintmax_t otick; /* A copy of the tree's `otick`. */ enum interval_tree_order order; - bool_bf running : 1; + bool running; const char* file; int line; }; -- 2.35.1 --=-=-=--