From 6ca364a6ad45b1cfdcf080c492fde536975f6e09 Mon Sep 17 00:00:00 2001 From: Matt Armstrong Date: Thu, 6 Oct 2022 15:47:20 -0700 Subject: [PATCH 5/5] ; * src/itree.c: Add comment describing when noverlay is O(N) --- src/itree.c | 36 ++++++++++++++++++++++++++++++++++++ 1 file changed, 36 insertions(+) diff --git a/src/itree.c b/src/itree.c index 79e39d6e2a..6de84ae0b8 100644 --- a/src/itree.c +++ b/src/itree.c @@ -62,6 +62,42 @@ Copyright (C) 2017-2022 Free Software Foundation, Inc. complexity of O(K*log(N)) for this operation, where K is the size of the result set and N the size of the tree. + ==== + FIXME: this data structure is O(N) for important operations -matt + ==== + + The amortized costs of Emacs' previous-overlay-change and + next-overlay-change functions are O(N) with this data structure. + The root problem is that we only have an order for the BEG field, + but not the END. The previous/next overlay change operations need + to find the nearest point where there is *either* an interval BEG + or END point, but there is no efficient way to narrow the search + space over END postions. + + Consider the case where next-overlay-change is called at POS, all + interval BEG positions are less than pos POS and all interval END + posistions are after. These END positions have no order, and so + *every* interval must be examined. This is at least O(N). The + previous-overlay-change case is similar. The root issue is that + the iterative "narrowing" approach is not guaranteed to reduce the + search space in logarithmic time, since END is not ordered in the + tree. + + One might argue that the LIMIT value will do this narrowing, but + this narrowing is O(K*log(N)) where K is the size of the result + set. If we are interested in finding the node in a range with the + smallest END, we might have to examine all K nodes in that range. + In the case of the *-overlay-channge functions, K may well be equal + to N. + + Ideally, a tree based data structure for overlays would have + O(log(N)) performance for previous-overlay-change and + next-overlay-change, as these are called in performance sensitive + situations such as redisplay. The only way I can think of + achieving this is by keeping one ordering by BEG and a separate + ordering by END, and then performing logic quite similar to the + current Emacs overlays-before and overlays-after lists. + ==== Adjusting intervals ==== Since this data-structure will be used for overlays in an Emacs -- 2.35.1