unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
From: Matt Armstrong <matt@rfc20.org>
To: Stefan Monnier <monnier@iro.umontreal.ca>, emacs-devel@gnu.org
Subject: Re: noverlay branch
Date: Mon, 26 Sep 2022 22:12:44 -0700	[thread overview]
Message-ID: <87bkr1e6yb.fsf@rfc20.org> (raw)
In-Reply-To: <jwvleq7gk99.fsf-monnier+emacs@gnu.org>

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> I just updated the noverlay branch to the code in `master` (most of it
> was mechanical except for the fact that since the last commit on that
> branch we have gotten rid of Lisp_Misc and we moved to pdumper).

This is pretty exciting!

I've been looking at the noverlay branch as well in some detail, but
only as a background task, over the past year.

I haven't considered the branch ready for merge because I saw some areas
where I think the implementation should improve.  But, working code is a
compelling thing.  If noverlay is ready to merge I'd suggest doing it.
Code can improve later, as and if needed.

I was planning to takle these problems before proposing a merge:

 1) Improve the worst case run time of `previous-overlay-change' from
    O(n) to O(log N).  The noverlay branch uses an O(N) algorithm,
    though it is difficult to spot.  Since the point of using a tree is
    O(log N) algorithms, and O(n) algorithms can easily become
    exponential algorithms when composed in higher level loops (the
    problem overlays sees today), this strikes me as important.

 2) Look at reducing the number of malloc'd overlay blocks in half by
    expressing the tree intrusively (the same way the overlay list is
    today).  I don't see a lot of value in itree.h/c abstracting away
    the interval logic from the overlay object itself.

 3) Improve quality of comments in the new code.  Personally, I find the
    algorithms quite subtle and quite a bit more complex than what you
    find on, say, https://en.wikipedia.org/wiki/Interval_tree or the
    Cormen et al. Introduction to Algorithms Book.  I think I pieced
    most of it together but it took a lot of effort.  At top of mind is
    looking at the interval_node.visited flag and figure out how that
    flag is used, then describe the algorithm in detail.  It isn't clear
    to me how that flag gets set/cleared.  Best case: doing so proves me
    wrong on point (1).

And lower priority:

 4) The overlay `front-advance' and `rear-advance' booleans are
    conceptually part of the overlay's BEG and END positions, except
    that this is ignored everhwhere except insertion.  Upon insertion at
    any given POS the overlays are according to *both* their BEG or END
    positions and the *-advance booleans.  Yet, this is not used when
    ordering overlays in the tree.  Doing that may bring an opportunity
    to simplify code or make it more efficient.  (Side note: there *may*
    also be a way to encode the *-advance flags implicitly in the
    beg/end fields positions if a way can be found to "steal" the free
    bit in the currently signed ptrdiff_t fields, effectively causing
    the *-advance flags to count as this extra "half" position for the
    purpose of insertion).

 5) Look at using an augmented B-tree of overlays instead of a binary
    tree.  B-trees are quite often faster than binary trees (at least on
    hardware made over the past few decades), so there is that enticing
    proposition.  They're also typically not harder to implement,
    either, requiring no "rotations" nor convluted deletion logic.  An
    *augmented* B-tree may also allow for simplification of the relative
    vs. absolute offsets in the tree.  The noverlay branch currently
    handles this by treating the absolute offsets as cached values,
    "dirtying" them whenever any mutation occurs in the tree, and
    recomputing absolute offsets on demand.  If augmented in the right
    way a B-tree might be shallow enough in practice that recomputation
    on demand is always fast enough, so the the invalidation/caching
    approach (as well as the memory used to track it) can go away.
    (wild idea)



  parent reply	other threads:[~2022-09-27  5:12 UTC|newest]

Thread overview: 71+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-25 22:38 noverlay branch Stefan Monnier
2022-09-25 22:50 ` Lars Ingebrigtsen
2022-09-25 22:56   ` Stefan Monnier
2022-09-26  2:52 ` Ihor Radchenko
2022-09-26  3:17   ` Stefan Monnier
2022-09-26  6:11   ` Eli Zaretskii
2022-09-26 13:08     ` Ihor Radchenko
2022-09-26 15:59       ` Eli Zaretskii
     [not found]         ` <87v8ovdosz.fsf@localhost>
2022-10-08  6:57           ` Eli Zaretskii
2022-10-09  3:25             ` Matt Armstrong
2022-10-09  4:30               ` Eli Zaretskii
2022-10-09  3:23           ` Matt Armstrong
2022-10-09  3:47           ` Stefan Monnier
2022-10-13 12:09             ` Ihor Radchenko
2022-09-29 18:12       ` Stefan Monnier
2022-09-27  5:12 ` Matt Armstrong [this message]
2022-09-27  6:53   ` Eli Zaretskii
2022-09-27 17:31     ` Matt Armstrong
2022-09-27 18:45       ` Stefan Monnier
2022-09-28 23:09   ` Stefan Monnier
2022-09-29 14:54     ` Gerd Möllmann
2022-09-29 21:36       ` Stefan Monnier
2022-09-30  5:20         ` Gerd Möllmann
2022-10-06  4:47         ` Matt Armstrong
2022-10-06  5:43           ` Gerd Möllmann
2022-10-07  4:11             ` Matt Armstrong
2022-10-07  4:34               ` Gerd Möllmann
2022-10-07 13:33                 ` Stefan Monnier
2022-10-07 14:29                   ` Gerd Möllmann
2022-10-07 14:51                     ` Stefan Monnier
2022-10-07 15:12                       ` Gerd Möllmann
2022-10-07 17:12                         ` Stefan Monnier
2022-10-07 14:56                   ` Stefan Monnier
2022-10-07 15:59                   ` Matt Armstrong
2022-10-07 15:34                 ` Matt Armstrong
2022-10-06 12:08           ` Stefan Monnier
2022-09-27  8:39 ` Gerd Möllmann
2022-09-27  9:38   ` Eli Zaretskii
2022-10-06 20:41 ` Matt Armstrong
2022-10-07 16:51 ` Matt Armstrong
2022-10-07 18:28   ` Stefan Monnier
2022-10-08 23:33     ` Matt Armstrong
2022-10-09  3:44       ` Matt Armstrong
2022-10-09  4:12       ` Stefan Monnier
2022-10-09 15:34         ` Stefan Monnier
2022-10-10  2:57           ` Matt Armstrong
2022-10-10  6:24             ` Eli Zaretskii
2022-10-10 16:26               ` Matt Armstrong
2022-10-10 14:44             ` Stefan Monnier
2022-10-11  3:46               ` Matt Armstrong
2022-10-11  4:09                 ` Stefan Monnier
2022-10-11 18:02                   ` Matt Armstrong
2022-10-11 18:59                     ` Stefan Monnier
2022-10-12  5:18                       ` Matt Armstrong
2022-10-12 18:02                         ` Stefan Monnier
2022-10-12 22:26                           ` Matt Armstrong
2022-10-13  4:03                             ` Stefan Monnier
2022-10-09 23:47       ` Stefan Monnier
2022-10-10  0:05         ` Emanuel Berg
2022-10-10 16:27           ` Matt Armstrong
2022-10-10 16:33         ` Matt Armstrong
2022-10-10 18:32           ` Matt Armstrong
2022-10-11 16:06             ` Stefan Monnier
2022-10-12 17:33               ` Matt Armstrong
2022-10-13  3:59                 ` Stefan Monnier
2022-10-16 21:53                   ` Matt Armstrong
2022-10-23  4:49 ` Matt Armstrong
2022-10-24  9:14   ` Stefan Kangas
2022-10-24 16:21     ` Matt Armstrong
2022-10-24 12:51   ` Eli Zaretskii
2022-10-25 20:57     ` Dmitry Gutov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87bkr1e6yb.fsf@rfc20.org \
    --to=matt@rfc20.org \
    --cc=emacs-devel@gnu.org \
    --cc=monnier@iro.umontreal.ca \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).