bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful

all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
@ 2022-09-29  5:29 Gerd Möllmann
  2022-09-29  6:28 ` Eli Zaretskii
                   ` (2 more replies)
  0 siblings, 3 replies; 68+ messages in thread
From: Gerd Möllmann @ 2022-09-29  5:29 UTC (permalink / raw)
  To: 58158

In its current form, interval tree iteration works like this:

1. Call begin_iteration(T) to iterate over tree T
2. do stuff
3. Call end_iteration(T)

with the following rules:

- Begin_iteration and end_iteration must be paired.

- There can be only one iteration per tree at a time.  Nested iteration
  over the same tree is not supported (abort).

- No GC may happen in step 2.  This is because mark_buffer iterates over
  buffer overlays.

I think this is an exceedingly dangerous design.





^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-09-29  5:29 bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful Gerd Möllmann
@ 2022-09-29  6:28 ` Eli Zaretskii
  2022-09-29  7:03   ` Gerd Möllmann
  2022-10-06 22:26 ` Matt Armstrong
  2023-10-06 13:14 ` Gerd Möllmann
  2 siblings, 1 reply; 68+ messages in thread
From: Eli Zaretskii @ 2022-09-29  6:28 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: 58158

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Date: Thu, 29 Sep 2022 07:29:25 +0200
> 
> In its current form, interval tree iteration works like this:
> 
> 1. Call begin_iteration(T) to iterate over tree T
> 2. do stuff
> 3. Call end_iteration(T)
> 
> with the following rules:
> 
> - Begin_iteration and end_iteration must be paired.
> 
> - There can be only one iteration per tree at a time.  Nested iteration
>   over the same tree is not supported (abort).
> 
> - No GC may happen in step 2.  This is because mark_buffer iterates over
>   buffer overlays.
> 
> I think this is an exceedingly dangerous design.

Why, because of "no GC" requirement?  We could ensure that by calling
inhibit_garbage_collection (if the code doesn't do that already).
That is not very elegant, and might even cause memory pressure in some
(hopefully rare) situations, but we do have some users of this in the
sources.

What higher-level operations require "interval tree iteration" that
you describe?  Which primitives end up doing such iterations?





^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-09-29  6:28 ` Eli Zaretskii
@ 2022-09-29  7:03   ` Gerd Möllmann
  2022-09-29  8:08     ` Eli Zaretskii
  0 siblings, 1 reply; 68+ messages in thread
From: Gerd Möllmann @ 2022-09-29  7:03 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 58158

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Gerd Möllmann <gerd.moellmann@gmail.com>
>> Date: Thu, 29 Sep 2022 07:29:25 +0200
>> 
>> In its current form, interval tree iteration works like this:
>> 
>> 1. Call begin_iteration(T) to iterate over tree T
>> 2. do stuff
>> 3. Call end_iteration(T)
>> 
>> with the following rules:
>> 
>> - Begin_iteration and end_iteration must be paired.
>> 
>> - There can be only one iteration per tree at a time.  Nested iteration
>>   over the same tree is not supported (abort).
>> 
>> - No GC may happen in step 2.  This is because mark_buffer iterates over
>>   buffer overlays.
>> 
>> I think this is an exceedingly dangerous design.
>
> Why, because of "no GC" requirement?  We could ensure that by calling
> inhibit_garbage_collection (if the code doesn't do that already).

It doesn't.

BTW, if anything signals in step 2, so that end_iteration isn't called,
we're also hosed.

> What higher-level operations require "interval tree iteration" that
> you describe?  Which primitives end up doing such iterations?

What has to do with overlays.  To name a few: overlay-at, overlays-in,
next-overlay-change, previous-overlay-change, overlay-lists, ...

I personally think this is a no-go.





^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-09-29  7:03   ` Gerd Möllmann
@ 2022-09-29  8:08     ` Eli Zaretskii
  2022-09-29  9:09       ` Gerd Möllmann
  0 siblings, 1 reply; 68+ messages in thread
From: Eli Zaretskii @ 2022-09-29  8:08 UTC (permalink / raw)
  To: Gerd Möllmann, Stefan Monnier; +Cc: 58158

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: 58158@debbugs.gnu.org
> Date: Thu, 29 Sep 2022 09:03:17 +0200
> 
> >> - No GC may happen in step 2.  This is because mark_buffer iterates over
> >>   buffer overlays.
> >> 
> >> I think this is an exceedingly dangerous design.
> >
> > Why, because of "no GC" requirement?  We could ensure that by calling
> > inhibit_garbage_collection (if the code doesn't do that already).
> 
> It doesn't.

Should be easy to fix, no?

> BTW, if anything signals in step 2, so that end_iteration isn't called,
> we're also hosed.

record_unwind_protect should fix that, right?
(inhibit_garbage_collection already employs this mechanism).

> > What higher-level operations require "interval tree iteration" that
> > you describe?  Which primitives end up doing such iterations?
> 
> What has to do with overlays.  To name a few: overlay-at, overlays-in,
> next-overlay-change, previous-overlay-change, overlay-lists, ...
> 
> I personally think this is a no-go.

Really?  Even if we take all the measures mentioned above?  Why so?





^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-09-29  8:08     ` Eli Zaretskii
@ 2022-09-29  9:09       ` Gerd Möllmann
  2022-09-29  9:37         ` Eli Zaretskii
  2022-10-01  1:57         ` Richard Stallman
  0 siblings, 2 replies; 68+ messages in thread
From: Gerd Möllmann @ 2022-09-29  9:09 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 58158, Stefan Monnier

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Gerd Möllmann <gerd.moellmann@gmail.com>
>> I personally think this is a no-go.
>
> Really?  Even if we take all the measures mentioned above?

Yes.

> Why so?

I think it simply can't be that what is basically walking a binary tree
requires such restrictions.  And if it does because of some quirk of the
interval tree in itree.c, something is seriously wrong with the design.

That's a bit harsh, but I mean it :-).

^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-09-29  9:09       ` Gerd Möllmann
@ 2022-09-29  9:37         ` Eli Zaretskii
  2022-09-29 10:05           ` Gerd Möllmann
  2022-10-01  1:57         ` Richard Stallman
  1 sibling, 1 reply; 68+ messages in thread
From: Eli Zaretskii @ 2022-09-29  9:37 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: 58158, monnier

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>,  58158@debbugs.gnu.org
> Date: Thu, 29 Sep 2022 11:09:17 +0200
> 
> I think it simply can't be that what is basically walking a binary tree
> requires such restrictions.  And if it does because of some quirk of the
> interval tree in itree.c, something is seriously wrong with the design.
> 
> That's a bit harsh, but I mean it :-).

Fair enough.

Can you propose a fix?

I guess the begin_iteration/end_iteration dance is because the tree
could be in inconsistent state in-between?  If so, what would it take
to avoid the inconsistency?





^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-09-29  9:37         ` Eli Zaretskii
@ 2022-09-29 10:05           ` Gerd Möllmann
  2022-09-29 10:43             ` Eli Zaretskii
  0 siblings, 1 reply; 68+ messages in thread
From: Gerd Möllmann @ 2022-09-29 10:05 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 58158, monnier

Eli Zaretskii <eliz@gnu.org> writes:

> Can you propose a fix?

Other than completely rewrite at least this part, no.

> I guess the begin_iteration/end_iteration dance is because the tree
> could be in inconsistent state in-between?  If so, what would it take
> to avoid the inconsistency?

I find it very hard to tell why this is done the way it is.  If someone
knowing the code can think of a reason, please speak up.





^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-09-29 10:05           ` Gerd Möllmann
@ 2022-09-29 10:43             ` Eli Zaretskii
  2022-09-29 11:33               ` Gerd Möllmann
  2022-09-29 13:10               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 2 replies; 68+ messages in thread
From: Eli Zaretskii @ 2022-09-29 10:43 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: 58158, monnier

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: monnier@iro.umontreal.ca,  58158@debbugs.gnu.org
> Date: Thu, 29 Sep 2022 12:05:50 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > Can you propose a fix?
> 
> Other than completely rewrite at least this part, no.
> 
> > I guess the begin_iteration/end_iteration dance is because the tree
> > could be in inconsistent state in-between?  If so, what would it take
> > to avoid the inconsistency?
> 
> I find it very hard to tell why this is done the way it is.  If someone
> knowing the code can think of a reason, please speak up.

I may be missing something, but it looks like the sole purpose of the
iter_start/iter_finish dance is to ensure only one iteration per tree
is running at any given time, and that's because the iteration uses
some state variable(s) of which there's only one instance per tree.

Stefan, am I missing something?





^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-09-29 10:43             ` Eli Zaretskii
@ 2022-09-29 11:33               ` Gerd Möllmann
  2022-09-29 13:10               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 0 replies; 68+ messages in thread
From: Gerd Möllmann @ 2022-09-29 11:33 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 58158, monnier

Eli Zaretskii <eliz@gnu.org> writes:

>> I find it very hard to tell why this is done the way it is.  If someone
>> knowing the code can think of a reason, please speak up.
>
> I may be missing something, but it looks like the sole purpose of the
> iter_start/iter_finish dance is to ensure only one iteration per tree
> is running at any given time, and that's because the iteration uses
> some state variable(s) of which there's only one instance per tree.

BTW, It currently uses a "visited" flag in tree nodes, which is kind of
such a state, but something like that is normally not needed when
walking an rb-tree.






^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-09-29 10:43             ` Eli Zaretskii
  2022-09-29 11:33               ` Gerd Möllmann
@ 2022-09-29 13:10               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-09-29 13:23                 ` Eli Zaretskii
                                   ` (2 more replies)
  1 sibling, 3 replies; 68+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-09-29 13:10 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Gerd Möllmann, 58158

> I may be missing something, but it looks like the sole purpose of the
> iter_start/iter_finish dance is to ensure only one iteration per tree
> is running at any given time, and that's because the iteration uses
> some state variable(s) of which there's only one instance per tree.
>
> Stefan, am I missing something?

One reason is that traversing a binary tree usually requires something
like recursion, but that wouldn't fit very conveniently with the current
code (nor with C in general since you can't make a local recursive
closure which accesses local variables from the surrounding function).

Another is the need to update the begin/end fields (these need updating
because of insertions/deletions but they're updated lazily while
traversing the tree to avoid an O(N) complexity during the
insertions/deletions).  Hiding that behind 'some kind of "next node"
function keeps the code more readable.

But yes, the current restriction to have a single iteration at a time is
a bit of a problem, especially because it's very "global".  I added
a comment yesterday describing how we could make it non-global (hence
getting rid of the `visited` flag in the nodes).

For now, I pushed a simple fix to traverse the tree "by hand" in the GC
rather than via the iterator.

        Stefan

^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-09-29 13:10               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-09-29 13:23                 ` Eli Zaretskii
  2022-09-29 16:48                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-09-29 13:40                 ` Eli Zaretskii
  2022-09-29 14:15                 ` Gerd Möllmann
  2 siblings, 1 reply; 68+ messages in thread
From: Eli Zaretskii @ 2022-09-29 13:23 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: gerd.moellmann, 58158

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Gerd Möllmann <gerd.moellmann@gmail.com>,
>   58158@debbugs.gnu.org
> Date: Thu, 29 Sep 2022 09:10:19 -0400
> 
> > I may be missing something, but it looks like the sole purpose of the
> > iter_start/iter_finish dance is to ensure only one iteration per tree
> > is running at any given time, and that's because the iteration uses
> > some state variable(s) of which there's only one instance per tree.
> >
> > Stefan, am I missing something?
> 
> One reason is that traversing a binary tree usually requires something
> like recursion, but that wouldn't fit very conveniently with the current
> code (nor with C in general since you can't make a local recursive
> closure which accesses local variables from the surrounding function).

I'm not sure I understand how recursion is related.  Are you saying
that recursion is replaced with iteration?  But then, if _start and
_finish are called by the same caller, we don't really need the
protection, since no one can start another iteration while the first
one runs.  Right?

> Another is the need to update the begin/end fields (these need updating
> because of insertions/deletions but they're updated lazily while
> traversing the tree to avoid an O(N) complexity during the
> insertions/deletions).  Hiding that behind 'some kind of "next node"
> function keeps the code more readable.

Where in the code do you see iteration that adds or deletes nodes?

> For now, I pushed a simple fix to traverse the tree "by hand" in the GC
> rather than via the iterator.

So this removes the restriction of not having GC during iteration?

Also, I take it that you don't consider the current code is as
"harmful" as Gerd thinks it is?  IOW, you don't share his opinion that
this implementation is a "no-go"?





^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-09-29 13:10               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-09-29 13:23                 ` Eli Zaretskii
@ 2022-09-29 13:40                 ` Eli Zaretskii
  2022-09-29 14:15                 ` Gerd Möllmann
  2 siblings, 0 replies; 68+ messages in thread
From: Eli Zaretskii @ 2022-09-29 13:40 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: gerd.moellmann, 58158

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Gerd Möllmann <gerd.moellmann@gmail.com>,
>   58158@debbugs.gnu.org
> Date: Thu, 29 Sep 2022 09:10:19 -0400
> 
> Another is the need to update the begin/end fields (these need updating
> because of insertions/deletions but they're updated lazily while
> traversing the tree to avoid an O(N) complexity during the
> insertions/deletions).  Hiding that behind 'some kind of "next node"
> function keeps the code more readable.

Btw, couldn't we handle this by having a flag in the node that tells
us whether the begin/end fields can be trusted?  Then the first caller
who need them would run the update and reset the flags, and we still
have lazy update, albeit somewhat less lazy, but without the need for
guarding the iteration with start/finish calls.  Would that work?





^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-09-29 13:10               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-09-29 13:23                 ` Eli Zaretskii
  2022-09-29 13:40                 ` Eli Zaretskii
@ 2022-09-29 14:15                 ` Gerd Möllmann
  2022-09-29 14:37                   ` Gerd Möllmann
  2022-09-29 16:40                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2 siblings, 2 replies; 68+ messages in thread
From: Gerd Möllmann @ 2022-09-29 14:15 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, 58158

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> One reason is that traversing a binary tree usually requires something
> like recursion, but that wouldn't fit very conveniently with the current
> code (nor with C in general since you can't make a local recursive
> closure which accesses local variables from the surrounding function).

Ok, usually, but not necessarily.  The alternative is to implement an
iterator that starts with a node N, and an implementation of a successor
function, which return the successor of N in a given order.  This
requires a parent pointer in nodes, but that we have.

(Something like this is used for ordered containers like "map" and "set"
in C++ STL, for instance, which are based on rb-trees in GCC's
libstdc++.)

> Another is the need to update the begin/end fields (these need updating
> because of insertions/deletions but they're updated lazily while
> traversing the tree to avoid an O(N) complexity during the
> insertions/deletions).  Hiding that behind 'some kind of "next node"
> function keeps the code more readable.

Is this for buffer text changes, something akin to a delayed update of
marker positions?  I already wondered where that was.

> But yes, the current restriction to have a single iteration at a time is
> a bit of a problem, especially because it's very "global".  I added
> a comment yesterday describing how we could make it non-global (hence
> getting rid of the `visited` flag in the nodes).

BTW, and related to iteration directly, did you notice this in
interval_tree_insert?

      /* This suggests that nodes in the right subtree are strictly
         greater.  But this is not true due to later rotations. */
      child = node->begin <= child->begin ? child->left : child->right;





^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-09-29 14:15                 ` Gerd Möllmann
@ 2022-09-29 14:37                   ` Gerd Möllmann
  2022-09-29 22:09                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-09-29 16:40                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 1 reply; 68+ messages in thread
From: Gerd Möllmann @ 2022-09-29 14:37 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, 58158

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Stefan Monnier <monnier@iro.umontreal.ca> writes:
>
>> One reason is that traversing a binary tree usually requires something
>> like recursion, but that wouldn't fit very conveniently with the current
>> code (nor with C in general since you can't make a local recursive
>> closure which accesses local variables from the surrounding function).
>
> Ok, usually, but not necessarily.  The alternative is to implement an
> iterator that starts with a node N, and an implementation of a successor
> function, which return the successor of N in a given order.  This
> requires a parent pointer in nodes, but that we have.
>
> (Something like this is used for ordered containers like "map" and "set"
> in C++ STL, for instance, which are based on rb-trees in GCC's
> libstdc++.)

The code from libstdc++ is this (from libstdc++-v3/src/c++98/tree.cc):

  static _Rb_tree_node_base*
  local_Rb_tree_increment(_Rb_tree_node_base* __x) throw ()
  {
    if (__x->_M_right != 0)
      {
        __x = __x->_M_right;
        while (__x->_M_left != 0)
          __x = __x->_M_left;
      }
    else
      {
        _Rb_tree_node_base* __y = __x->_M_parent;
        while (__x == __y->_M_right)
          {
            __x = __y;
            __y = __y->_M_parent;
          }
        if (__x->_M_right != __y)
          __x = __y;
      }
    return __x;
  }

I hope one can read that.

The idea is to find the root of the smallest subtree containing the
current node, and proceed from there.  That's why the parent pointer is
needed.

Symmmetrical for max->min ordering.  And finding min/max is trivial.





^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-09-29 14:15                 ` Gerd Möllmann
  2022-09-29 14:37                   ` Gerd Möllmann
@ 2022-09-29 16:40                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 0 replies; 68+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-09-29 16:40 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, 58158

Gerd Möllmann [2022-09-29 16:15:09] wrote:
> Stefan Monnier <monnier@iro.umontreal.ca> writes:
>> One reason is that traversing a binary tree usually requires something
>> like recursion, but that wouldn't fit very conveniently with the current
>> code (nor with C in general since you can't make a local recursive
>> closure which accesses local variables from the surrounding function).
> Ok, usually, but not necessarily.  The alternative is to implement an
> iterator that starts with a node N, and an implementation of a successor
> function, which return the successor of N in a given order.

The approach currently used is somewhat similar to that.  Some of the
difference is that we need an actual "iterator/generator" object to
remember the parameter of the filtering we want to apply to the set of
objects.  And the problem is that this "object" is currently implemented
not only as a global value (thus restricting us to one-iteration at
a time) but also with some parts of the data stored in the tree.
I think this is the part that really needs to be changed.

> This requires a parent pointer in nodes, but that we have.
>
> (Something like this is used for ordered containers like "map" and "set"
> in C++ STL, for instance, which are based on rb-trees in GCC's
> libstdc++.)

Another difference is that itree.c's iterator uses a local "work stack"
instead of traversing the tree exclusively via left/right/parent like in
the code you show.  I don't know if that difference is important, tho.

>> Another is the need to update the begin/end fields (these need updating
>> because of insertions/deletions but they're updated lazily while
>> traversing the tree to avoid an O(N) complexity during the
>> insertions/deletions).  Hiding that behind 'some kind of "next node"
>> function keeps the code more readable.
>
> Is this for buffer text changes, something akin to a delayed update of
> marker positions?

Yes, exactly.

>> But yes, the current restriction to have a single iteration at a time is
>> a bit of a problem, especially because it's very "global".  I added
>> a comment yesterday describing how we could make it non-global (hence
>> getting rid of the `visited` flag in the nodes).
>
> BTW, and related to iteration directly, did you notice this in
> interval_tree_insert?
>
>       /* This suggests that nodes in the right subtree are strictly
>          greater.  But this is not true due to later rotations. */
>       child = node->begin <= child->begin ? child->left : child->right;

No, I had not noticed that yet, and I don't understand this comment.


        Stefan






^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-09-29 13:23                 ` Eli Zaretskii
@ 2022-09-29 16:48                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 68+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-09-29 16:48 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 58158

>> > I may be missing something, but it looks like the sole purpose of the
>> > iter_start/iter_finish dance is to ensure only one iteration per tree
>> > is running at any given time, and that's because the iteration uses
>> > some state variable(s) of which there's only one instance per tree.
>> >
>> > Stefan, am I missing something?
>> 
>> One reason is that traversing a binary tree usually requires something
>> like recursion, but that wouldn't fit very conveniently with the current
>> code (nor with C in general since you can't make a local recursive
>> closure which accesses local variables from the surrounding function).
>
> I'm not sure I understand how recursion is related.  Are you saying
> that recursion is replaced with iteration?  But then, if _start and
> _finish are called by the same caller, we don't really need the
> protection, since no one can start another iteration while the first
> one runs.  Right?

Typically, current code will look something like:

    int x;
    Lisp_Object y;

    buffer_overlay_iter_start (current_buffer, prev, pos, ITREE_DESCENDING);
    while ((node = buffer_overlay_iter_next (current_buffer)))
      {
        ... do something that updates x and y ...
      }
    buffer_overlay_iter_finish (current_buffer);

If we were to use recursion, then we'd need to define a new (recursive)
function which does what's currently done in the `while` loop, but this
function can't access `x` and `y`, so it would need to take them as
argument, or a reference to them...

The use of an iterator is definitely convenient (and is a standard
approach in many languages).

>> Another is the need to update the begin/end fields (these need updating
>> because of insertions/deletions but they're updated lazily while
>> traversing the tree to avoid an O(N) complexity during the
>> insertions/deletions).  Hiding that behind 'some kind of "next node"
>> function keeps the code more readable.
>
> Where in the code do you see iteration that adds or deletes nodes?

No, I meant insertion/deletion of text in the buffer, thus requiring
updates to `begin/end` fields.

> Btw, couldn't we handle this by having a flag in the node that tells
> us whether the begin/end fields can be trusted?  Then the first caller
> who need them would run the update and reset the flags, and we still
> have lazy update, albeit somewhat less lazy, but without the need for
> guarding the iteration with start/finish calls.  Would that work?

Yes, it would, but it's still O(N).
The current approach is inspired by the approach used in `intervals.c`
which also updates those fields lazily/ondemand so as to avoid the O(N)
performance impact.

>> For now, I pushed a simple fix to traverse the tree "by hand" in the GC
>> rather than via the iterator.
> So this removes the restriction of not having GC during iteration?

Yes.

> Also, I take it that you don't consider the current code is as
> "harmful" as Gerd thinks it is?

I don't like the global state it uses, but I think we can fix this
aspect without too much trouble.

> IOW, you don't share his opinion that this implementation is
> a "no-go"?

No, indeed, I don't.


        Stefan






^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-09-29 14:37                   ` Gerd Möllmann
@ 2022-09-29 22:09                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-09-30  5:28                       ` Gerd Möllmann
  0 siblings, 1 reply; 68+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-09-29 22:09 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, 58158

>   static _Rb_tree_node_base*
>   local_Rb_tree_increment(_Rb_tree_node_base* __x) throw ()
>   {
>     if (__x->_M_right != 0)
>       {
>         __x = __x->_M_right;
>         while (__x->_M_left != 0)
>           __x = __x->_M_left;
>       }
>     else
>       {
>         _Rb_tree_node_base* __y = __x->_M_parent;
>         while (__x == __y->_M_right)
>           {
>             __x = __y;
>             __y = __y->_M_parent;
>           }
>         if (__x->_M_right != __y)
>           __x = __y;
>       }
>     return __x;
>   }
>
> I hope one can read that.

I changed the code to store the `visited` bit in the work stack, but if
you could rewrite the `interval_generator_next` along the lines of the
code above that would be great.
[ An alternative would be to try and get rid of the `parent` field :-)  ]


        Stefan






^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-09-29 22:09                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-09-30  5:28                       ` Gerd Möllmann
  2022-09-30  6:11                         ` Eli Zaretskii
  2022-09-30 13:25                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 2 replies; 68+ messages in thread
From: Gerd Möllmann @ 2022-09-30  5:28 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, 58158

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> I changed the code to store the `visited` bit in the work stack, but if
> you could rewrite the `interval_generator_next` along the lines of the
> code above that would be great.

Ok, I'll rewrite that :-).  When I understand what that "narrowing" is
and how and for what it is used.

BTW, what do you think of changing function names to something a bit
shorter?  I find myself constantly getting confused when reading the
code.  I think an "itree_" prefix would suffice for non-static
functions, and static ones without prefix.  Renaming that is hopefully
not that much work with LSP.

> [ An alternative would be to try and get rid of the `parent` field :-)
> ]

:-)

^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-09-30  5:28                       ` Gerd Möllmann
@ 2022-09-30  6:11                         ` Eli Zaretskii
  2022-09-30 11:31                           ` Gerd Möllmann
  2022-10-06 22:36                           ` Dmitry Gutov
  2022-09-30 13:25                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 2 replies; 68+ messages in thread
From: Eli Zaretskii @ 2022-09-30  6:11 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: 58158, monnier

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: Eli Zaretskii <eliz@gnu.org>,  58158@debbugs.gnu.org
> Date: Fri, 30 Sep 2022 07:28:26 +0200
> 
> Renaming that is hopefully not that much work with LSP.

You should be able to do this with M-? followed by
"M-x xref-query-replace-in-results RET" (the latter
should be invoked in the "*xref*" buffer produced by M-?).





^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-09-30  6:11                         ` Eli Zaretskii
@ 2022-09-30 11:31                           ` Gerd Möllmann
  2022-09-30 18:29                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-10-06 22:36                           ` Dmitry Gutov
  1 sibling, 1 reply; 68+ messages in thread
From: Gerd Möllmann @ 2022-09-30 11:31 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 58158, monnier

On 22-09-30 8:11 , Eli Zaretskii wrote:
>> From: Gerd Möllmann <gerd.moellmann@gmail.com>
>> Cc: Eli Zaretskii <eliz@gnu.org>,  58158@debbugs.gnu.org
>> Date: Fri, 30 Sep 2022 07:28:26 +0200
>>
>> Renaming that is hopefully not that much work with LSP.
> 
> You should be able to do this with M-? followed by
> "M-x xref-query-replace-in-results RET" (the latter
> should be invoked in the "*xref*" buffer produced by M-?).

Ok, thanks for the tip.

Info for Stefan: I've now removed the per-tree null node, and make check 
shows no unexpected results.





^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-09-30  5:28                       ` Gerd Möllmann
  2022-09-30  6:11                         ` Eli Zaretskii
@ 2022-09-30 13:25                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-09-30 14:08                           ` Gerd Möllmann
  1 sibling, 1 reply; 68+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-09-30 13:25 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, 58158

Gerd Möllmann [2022-09-30 07:28:26] wrote:
> Stefan Monnier <monnier@iro.umontreal.ca> writes:
>> I changed the code to store the `visited` bit in the work stack, but if
>> you could rewrite the `interval_generator_next` along the lines of the
>> code above that would be great.
> Ok, I'll rewrite that :-).  When I understand what that "narrowing" is
> and how and for what it is used.

The traversals are always bound by begin..end boundaries.  Usually we
know those bounds upfront (e.g. for `overlays-in` or `overlay-at`), but
when doing things like `next/previous-overlay-change` one of the bounds
is not know at first, so in order to try and avoid the O(N) complexity
the code refines that other bound on the fly (e.g. when searching
forward, after seeing an overlay that ends at POS we know that there's no
point looking further than POS so we can move the end boundary to POS).

> BTW, what do you think of changing function names to something a bit
> shorter?  I find myself constantly getting confused when reading the
> code.  I think an "itree_" prefix would suffice for non-static
> functions, and static ones without prefix.

Fine by me, yes.


        Stefan






^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-09-30 13:25                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-09-30 14:08                           ` Gerd Möllmann
  2022-09-30 15:25                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 68+ messages in thread
From: Gerd Möllmann @ 2022-09-30 14:08 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, 58158

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> The traversals are always bound by begin..end boundaries.  Usually we
> know those bounds upfront (e.g. for `overlays-in` or `overlay-at`), but
> when doing things like `next/previous-overlay-change` one of the bounds
> is not know at first, so in order to try and avoid the O(N) complexity
> the code refines that other bound on the fly (e.g. when searching
> forward, after seeing an overlay that ends at POS we know that there's no
> point looking further than POS so we can move the end boundary to
> POS).

Thanks, that helps.

Maybe you could also help me with the questions below?

I'm assuming, from a comment somewhere, that an interval tree is an
rb-tree with keys being interval start positions, and allowing
duplicates.

That is, if N is a node, all nodes in the subtree N->left are strictly <
N, and nodes in N->right are >=.  Or in a picture, if [start, end] is an
interval, and we insert some intervals with the same start, we could
arrive at

                 [5, a]
                /      \
            [4, b]     [6, c]
              /\         /  \
             0  0       0   [6, d]
                              /\
                              ...

where 0 means null, and 6 is a duplicate start.

1. Is that correct?

2. Does the tree order duplicates in a particular way?  Maybe by overlay
priority, or by interval end?  And, perhaps, if it doesn't order
duplicates, would it be an idea to order them, maybe at some later
point?  I'm asking this because a successor(N) function would return
nodes in the order in the tree, always, and I don't know what the order
is now.

3. If my mental picture is right, we could of course end up with a tree
that has degenerated to a list.  Or a subtree, maybe.  Do you know if
that can happen?

^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-09-30 14:08                           ` Gerd Möllmann
@ 2022-09-30 15:25                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-09-30 16:04                               ` Eli Zaretskii
                                                 ` (2 more replies)
  0 siblings, 3 replies; 68+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-09-30 15:25 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, Andreas Politz, 58158

> Maybe you could also help me with the questions below?

I'll try (BTW, the original author is Andreas Politz who we can still
reach at <mail@andreas-politz.de>.  He doesn't have much time to devote
to it, tho, so best not to Cc him through all the conversations).

> I'm assuming, from a comment somewhere, that an interval tree is an
> rb-tree with keys being interval start positions, and allowing
> duplicates.

Yup.

> That is, if N is a node, all nodes in the subtree N->left are strictly <
> N, and nodes in N->right are >=.

The following code in `interval_tree_insert`:

  while (child != ITREE_NULL)
    {
      parent = child;
      offset += child->offset;
      child->limit = max (child->limit, node->end - offset);
      /* This suggests that nodes in the right subtree are strictly
         greater.  But this is not true due to later rotations. */
      child = node->begin <= child->begin ? child->left : child->right;
    }

suggests that N->left are <= and N->right are > but my reading of the
comment is that the only thing we can rely on is that N-<left is <= and
N->right is >=

[ I do understand this comment now :-)  ]

> 2. Does the tree order duplicates in a particular way?
> Maybe by overlay priority, or by interval end?

AFAICT, no it does not.

> And, perhaps, if it doesn't order duplicates, would it be an idea to
> order them, maybe at some later point?  I'm asking this because
> a successor(N) function would return nodes in the order in the tree,
> always, and I don't know what the order is now.

Ordering based on interval end could arguably make sense.  I'm not
completely sure how/where it would benefit us, tho.  Most times we look
at the itree, we just look at *all* the nodes within a specific
interval, so the order in which they're returned doesn't matter very
much (the ordering within the tree does matter in terms of how we manage
to prune the tree, but that has more to do with which elements are near
the root, which is a different kind of "ordering" than the "binary tree
ordering" itself).  Maybe the `next/previous-overlay-change` code
benefit from sub-ordering based on `end`, but I suspect the effect would
be lost in the noise: if we want to speed up that part of the code,
I expect that a better avenue is to rewrite the
`next/previous-single-overlay-change` so as not to work by (while ..
(next/previous-overlay-change ..) .. (get-char-property ..) ..) where
both functions do the O(log N) itree-iteration dance, but instead doing
the itree iteration itself.

> 3. If my mental picture is right, we could of course end up with a tree
> that has degenerated to a list.  Or a subtree, maybe.  Do you know if
> that can happen?

In terms of tree depth, no: the code preserves the RB invariants, which
ensure balance regardless of ordering (or lack thereof).

        Stefan

^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-09-30 15:25                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-09-30 16:04                               ` Eli Zaretskii
  2022-09-30 17:11                                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-10-01  5:06                               ` Gerd Möllmann
  2022-10-01  7:25                               ` Gerd Möllmann
  2 siblings, 1 reply; 68+ messages in thread
From: Eli Zaretskii @ 2022-09-30 16:04 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: gerd.moellmann, mail, 58158

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Eli Zaretskii <eliz@gnu.org>,  58158@debbugs.gnu.org, Andreas Politz
>  <mail@andreas-politz.de>
> Date: Fri, 30 Sep 2022 11:25:30 -0400
> 
> > And, perhaps, if it doesn't order duplicates, would it be an idea to
> > order them, maybe at some later point?  I'm asking this because
> > a successor(N) function would return nodes in the order in the tree,
> > always, and I don't know what the order is now.
> 
> Ordering based on interval end could arguably make sense.  I'm not
> completely sure how/where it would benefit us, tho.  Most times we look
> at the itree, we just look at *all* the nodes within a specific
> interval, so the order in which they're returned doesn't matter very
> much (the ordering within the tree does matter in terms of how we manage
> to prune the tree, but that has more to do with which elements are near
> the root, which is a different kind of "ordering" than the "binary tree
> ordering" itself).  Maybe the `next/previous-overlay-change` code
> benefit from sub-ordering based on `end`, but I suspect the effect would
> be lost in the noise: if we want to speed up that part of the code,
> I expect that a better avenue is to rewrite the
> `next/previous-single-overlay-change` so as not to work by (while ..
> (next/previous-overlay-change ..) .. (get-char-property ..) ..) where
> both functions do the O(log N) itree-iteration dance, but instead doing
> the itree iteration itself.

The display engine sorts the overlays, so order could be important.





^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-09-30 16:04                               ` Eli Zaretskii
@ 2022-09-30 17:11                                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 68+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-09-30 17:11 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, mail, 58158

>> Ordering based on interval end could arguably make sense.  I'm not
>> completely sure how/where it would benefit us, tho.  Most times we look
>> at the itree, we just look at *all* the nodes within a specific
>> interval, so the order in which they're returned doesn't matter very
>> much (the ordering within the tree does matter in terms of how we manage
>> to prune the tree, but that has more to do with which elements are near
>> the root, which is a different kind of "ordering" than the "binary tree
>> ordering" itself).  Maybe the `next/previous-overlay-change` code
>> benefit from sub-ordering based on `end`, but I suspect the effect would
>> be lost in the noise: if we want to speed up that part of the code,
>> I expect that a better avenue is to rewrite the
>> `next/previous-single-overlay-change` so as not to work by (while ..
>> (next/previous-overlay-change ..) .. (get-char-property ..) ..) where
>> both functions do the O(log N) itree-iteration dance, but instead doing
>> the itree iteration itself.
>
> The display engine sorts the overlays, so order could be important.

Indeed, for things like `overlays_at` it would be handy if the iterator
could return the right overlays already in the right order, so we don't
have to `sort_overlays`.

But in `sort_overlays` the `priority` property takes precedence, so if
we order the RB tree based on that ordering, we will lose the main
purpose of the change, i.e. finding the overlays at POS in O(log N)
time, because the `compare_overlays` ordering does not let us know that
all of the left or right subtree is outside of a given interval.
IOW the itree has to use a different ordering than that of
`compare_overlays` anyway, so we're still forced to (re-)sort afterwards
with `sort_overlays` :-(


	Stefan






^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-09-30 11:31                           ` Gerd Möllmann
@ 2022-09-30 18:29                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-10-02  8:06                               ` Gerd Möllmann
  0 siblings, 1 reply; 68+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-09-30 18:29 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, 58158

> Info for Stefan: I've now removed the per-tree null node, and make check
> shows no unexpected results.

Thanks.  It's an improvement, but I still don't understand how it
works :-(


        Stefan






^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-09-29  9:09       ` Gerd Möllmann
  2022-09-29  9:37         ` Eli Zaretskii
@ 2022-10-01  1:57         ` Richard Stallman
  2022-10-01  7:00           ` Eli Zaretskii
  1 sibling, 1 reply; 68+ messages in thread
From: Richard Stallman @ 2022-10-01  1:57 UTC (permalink / raw)
  To: Gerd MÃ¶llmann; +Cc: eliz, 58158, monnier

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > I think it simply can't be that what is basically walking a binary tree
  > requires such restrictions.

If we allow running complicated code during the tree walk, that would
be asking for trouble.  But if the operations done during the tree
walk are simple and written in C, it should be clear that there is no
possibility of this kind of a bug.  Isn't that the case?

  > > What has to do with overlays.  To name a few: overlay-at, overlays-in,
  > > next-overlay-change, previous-overlay-change, overlay-lists, ...

Those operations can be done in C with no risk of signaling an error.

That ought to be sufficient, as long as we don't need the code to be
reentrant.  If the code needs to be reentrant then we would have
to eliminate the "visited" flag.  But if we don't run Lisp code
during the tree-walk, we should not need it to be reentrant.

-- 
Dr Richard Stallman (https://stallman.org)
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)

^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-09-30 15:25                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-09-30 16:04                               ` Eli Zaretskii
@ 2022-10-01  5:06                               ` Gerd Möllmann
  2022-10-01 13:54                                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-10-01  7:25                               ` Gerd Möllmann
  2 siblings, 1 reply; 68+ messages in thread
From: Gerd Möllmann @ 2022-10-01  5:06 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, Andreas Politz, 58158

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> That is, if N is a node, all nodes in the subtree N->left are strictly <
>> N, and nodes in N->right are >=.
>
> The following code in `interval_tree_insert`:
>
>   while (child != ITREE_NULL)
>     {
>       parent = child;
>       offset += child->offset;
>       child->limit = max (child->limit, node->end - offset);
>       /* This suggests that nodes in the right subtree are strictly
>          greater.  But this is not true due to later rotations. */
>       child = node->begin <= child->begin ? child->left : child->right;
>     }
>
> suggests that N->left are <= and N->right are > but my reading of the
> comment is that the only thing we can rely on is that N-<left is <= and
> N->right is >=

Phew.  I'm not sure but I get the feeling that makes implementing a
successor function, let's say, challenging.

I wonder how std::multimap deals with that.  Hm.

Anyways, with your removal of the visited flag (N thumbs up, for large
values of N :-)), most of the problems I preceived are gone anyway.
So, I guess we can live with the iteration stack, which seems to work
fine, and the alternative is only nice to have, now.

(And impacts are getting closer in the real world here, so I'm
a bit distracted anyway.)

>
> [ I do understand this comment now :-)  ]

:-)

>
>> 2. Does the tree order duplicates in a particular way?
>> Maybe by overlay priority, or by interval end?
>
> AFAICT, no it does not.

Ok.

>> And, perhaps, if it doesn't order duplicates, would it be an idea to
>> order them, maybe at some later point?  I'm asking this because
>> a successor(N) function would return nodes in the order in the tree,
>> always, and I don't know what the order is now.
>
> Ordering based on interval end could arguably make sense.  I'm not
> completely sure how/where it would benefit us, tho.  Most times we look
> at the itree, we just look at *all* the nodes within a specific
> interval, so the order in which they're returned doesn't matter very
> much (the ordering within the tree does matter in terms of how we manage
> to prune the tree, but that has more to do with which elements are near
> the root, which is a different kind of "ordering" than the "binary tree
> ordering" itself).  Maybe the `next/previous-overlay-change` code
> benefit from sub-ordering based on `end`, but I suspect the effect would
> be lost in the noise: if we want to speed up that part of the code,
> I expect that a better avenue is to rewrite the
> `next/previous-single-overlay-change` so as not to work by (while ..
> (next/previous-overlay-change ..) .. (get-char-property ..) ..) where
> both functions do the O(log N) itree-iteration dance, but instead doing
> the itree iteration itself.

Thanks.

>
>> 3. If my mental picture is right, we could of course end up with a tree
>> that has degenerated to a list.  Or a subtree, maybe.  Do you know if
>> that can happen?
>
> In terms of tree depth, no: the code preserves the RB invariants, which
> ensure balance regardless of ordering (or lack thereof).

Ok, thanks for your insights!





^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-01  1:57         ` Richard Stallman
@ 2022-10-01  7:00           ` Eli Zaretskii
  0 siblings, 0 replies; 68+ messages in thread
From: Eli Zaretskii @ 2022-10-01  7:00 UTC (permalink / raw)
  To: rms; +Cc: gerd.moellmann, 58158, monnier

> From: Richard Stallman <rms@gnu.org>
> Cc: eliz@gnu.org, 58158@debbugs.gnu.org, monnier@iro.umontreal.ca
> Date: Fri, 30 Sep 2022 21:57:36 -0400
> 
>   > > What has to do with overlays.  To name a few: overlay-at, overlays-in,
>   > > next-overlay-change, previous-overlay-change, overlay-lists, ...
> 
> Those operations can be done in C with no risk of signaling an error.

I think such assumptions were proven dangerous at best in the long
run.  The way Emacs develops, we constantly add more and more hooks to
C code, and more and more direct calls into Lisp from C.  These
invalidate any assumptions about "no risk of signaling an error", even
if they were originally true when the code was first written.

Moreover, we test for QUIT in operations that can be prolonged ones
(and for a good reason), so any long loop in C could potentially throw
to top level if the user pressed C-g.

All in all, experience shows that making such assumptions in Emacs is
unsafe.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-09-30 15:25                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-09-30 16:04                               ` Eli Zaretskii
  2022-10-01  5:06                               ` Gerd Möllmann
@ 2022-10-01  7:25                               ` Gerd Möllmann
  2022-10-01 10:55                                 ` Gerd Möllmann
  2022-10-01 14:01                                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2 siblings, 2 replies; 68+ messages in thread
From: Gerd Möllmann @ 2022-10-01  7:25 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, Andreas Politz, 58158

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> Maybe you could also help me with the questions below?
>
> I'll try (BTW, the original author is Andreas Politz who we can still
> reach at <mail@andreas-politz.de>.  He doesn't have much time to devote
> to it, tho, so best not to Cc him through all the conversations).
>
>> I'm assuming, from a comment somewhere, that an interval tree is an
>> rb-tree with keys being interval start positions, and allowing
>> duplicates.
>
> Yup.
>
>> That is, if N is a node, all nodes in the subtree N->left are strictly <
>> N, and nodes in N->right are >=.
>
> The following code in `interval_tree_insert`:
>
>   while (child != ITREE_NULL)
>     {
>       parent = child;
>       offset += child->offset;
>       child->limit = max (child->limit, node->end - offset);
>       /* This suggests that nodes in the right subtree are strictly
>          greater.  But this is not true due to later rotations. */
>       child = node->begin <= child->begin ? child->left : child->right;
>     }
>
> suggests that N->left are <= and N->right are > but my reading of the
> comment is that the only thing we can rely on is that N-<left is <= and
> N->right is >=
>

Yup.  I've used overlay-tree a bit (compile with ITREE_DEBUG defined
after pulling), and used this:

(defun make-ivs ()
  (with-current-buffer (get-buffer-create "*iv*")
    (delete-all-overlays)
    (erase-buffer)
    (insert (make-string 50 ?x))
    (let ((o1 (make-overlay 1 1))
	  (o2 (make-overlay 10 11))
	  (o3 (make-overlay 10 12))
	  (o4 (make-overlay 1 1))
	  )
      (move-overlay o4 10 13)
      (overlay-tree))))

(pp (make-ivs))
((:begin 10 :end 12 :limit 13 :offset 0 :rear-advance nil :front-advance nil)
 ((:begin 1 :end 1 :limit 13 :offset 0 :rear-advance nil :front-advance nil)
  nil
  ((:begin 10 :end 13 :limit 13 :offset 0 :rear-advance nil :front-advance nil)
   nil nil))
 ((:begin 10 :end 11 :limit 11 :offset 0 :rear-advance nil :front-advance nil)
  nil nil))

That's 

                    [10, 12]
                   /        \
                 [1, 1]      [10, 11]
                 /    \         /\
                    [10, 13]
                     /    \
                          
The 10 is found "all over the place".

I surmise no reasonable successor function can be written for such a
tree.

I have to look at std::multimap, they manage to do this somehow.





^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-01  7:25                               ` Gerd Möllmann
@ 2022-10-01 10:55                                 ` Gerd Möllmann
  2022-10-01 14:01                                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 0 replies; 68+ messages in thread
From: Gerd Möllmann @ 2022-10-01 10:55 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, Andreas Politz, 58158

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> I have to look at std::multimap, they manage to do this somehow.

Well, I should have thought of that, because it's obvious from the fact
they are able to use successor/predecessor functions :-/.

The equivalent in our code would go like below, which is just written
down in a hurry, so please bear with me.  It's just about the idea.  So:

Insert a new interval_node only if overlay start is not in the tree
already.  If the contains a node for the start, make the node's value a
list of overlays, and add the new one to it in an order that's
convenient (The STL uses insertion order).

As a consequence, keys is the rb-tree are unique, and it is always
strictly ordered, rotations don't change that.  The min node is the
left-most node in a tree, and so on.

So far so good.

An iterator in the order min->max would have to record the interval_node
in the tree plus information where in the list of overlays it is, if it
is in any.  Finding the next node is implemented with a successor
function like the one I've shown from libstdc++.

To find all overlays containing a given position POS, find the node
whose start is <= POS.  Start at the root of the tree, and walk the left
link until we find the ndoes's start is <= POS.  Then iterate forward
until start > POS, returning only overlapping overlays.

Finding overlays intersecting an interval [BEG, END] is not as nice, but
we can exclude intervals starting after END.  So we have to iterate over
all overlays from the minimum of the tree, and iterate forward until we
reach the first excluded one.

That's so "easy" that even I can understand how it works :-).

What am I overlooking?

^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-01  5:06                               ` Gerd Möllmann
@ 2022-10-01 13:54                                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-10-02  8:22                                   ` Gerd Möllmann
  0 siblings, 1 reply; 68+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-10-01 13:54 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, Andreas Politz, 58158

>> The following code in `interval_tree_insert`:
[...]
>> suggests that N->left are <= and N->right are > but my reading of the
>> comment is that the only thing we can rely on is that N-<left is <= and
>> N->right is >=
>
> Phew.  I'm not sure but I get the feeling that makes implementing a
> successor function, let's say, challenging.

I don't think it makes any difference in that respect, no.  The notion of
"successor" is just "the successor in the in-order traversal of the
tree" and while rotations change the initial property that "N->right are >",
they don't change the in-order traversal output (they don't re-order
nodes w.r.t in-order traversal (or in reverse order for that matter)).

As alluded to at the end of my previous email, while RB trees are
typically used for your usual binary tree which comes with some notion
of ordering that makes lookup O(log N), the RB trees ensure balance
without relying on any notion of ordering since the rotations just
blindly preserve the order as a side effect but they don't themselves
need to call the ordering function to decide how to do their job.

        Stefan

^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-01  7:25                               ` Gerd Möllmann
  2022-10-01 10:55                                 ` Gerd Möllmann
@ 2022-10-01 14:01                                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 0 replies; 68+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-10-01 14:01 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, Andreas Politz, 58158

> That's 
>
>                     [10, 12]
>                    /        \
>                  [1, 1]      [10, 11]
>                  /    \         /\
>                     [10, 13]
>                      /    \
>                           
> The 10 is found "all over the place".

"all over the place" if you consider the pre-order or post-order
traversal, indeed.  But if you look at the in-order traversal (e.g., the
one you'd get from C++ `local_Rb_tree_increment` you showed), you get
the expected result:

   [1, 1]
   [10, 13]
   [10, 12]
   [10, 11]


-- Stefan






^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-09-30 18:29                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-10-02  8:06                               ` Gerd Möllmann
  0 siblings, 0 replies; 68+ messages in thread
From: Gerd Möllmann @ 2022-10-02  8:06 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, 58158

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> Info for Stefan: I've now removed the per-tree null node, and make check
>> shows no unexpected results.
>
> Thanks.  It's an improvement, but I still don't understand how it
> works :-(

Is it still null->parent?

If so, maybe it helps to picture that null is a child of many parents
:-).  Every node having null on the left or right is a parent in that
sense.  Therefore, if some code uses null->parent to do something it can
not be right.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-01 13:54                                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-10-02  8:22                                   ` Gerd Möllmann
  2022-10-02 16:32                                     ` Andreas Politz
  0 siblings, 1 reply; 68+ messages in thread
From: Gerd Möllmann @ 2022-10-02  8:22 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, Andreas Politz, 58158

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> Phew.  I'm not sure but I get the feeling that makes implementing a
>> successor function, let's say, challenging.
>
> I don't think it makes any difference in that respect, no.

The reason I find it challenging, and I'm sure now, is that I've written
the following code, and failed :-).

/* FIXME: This assumption is wrong.  Nodes on the left of P are <=,
   and nodes on the right are >=.  */

/* Value is the successor of interval node X in ascending order.  It
   is assumed that the tree is organized so that nodes < X are in
   X->left and nodes in X->right are >= X.  */

static struct interval_node *
in_order_successor (struct interval_node *x)
{
  if (x->right != ITREE_NULL)
    {
      /* X has a right child, which means X's right subtree has
	 elements >= X.  Proceed to the left-most child in the right
	 subtree, which is the smallest in that subtree.  */
      x = x->right;
      while (x->left != 0)
	x = x->left;
    }
  else
    {
      /* X's left subtree is uninteresting, because everything there
	 is < X.  Therefore follow the parent chain.  If X is the
	 parent's right child, this means the parent is < X,  We are
	 looking for a parent that is >=.  */
      struct interval_node *y = x->parent;
      while (x == y->right)
	{
	  x = y;
	  y = y->parent;
	}

      /* If we found a parent that's >=, the parent is what we sought.
	 Otherwise, X has arrived at the null node, whose right child
	 is the sentinel node itself.  */
      if (x->right != y)
	x = y;
    }

  return x;
}

I tried to change the comments and/or modify the code for a tree like we
have (left subtree <=, right >=), and couldn't explain why it works,
but I also couldn't produce a counter-example that the existing tree
code actually can produce.  IOW, I couldn't prove anything.

P.S.

With the "all over the place" I indented to hint at the fact that the
tree is not a "normal" BST.  I should probably have said that directly,
sorry.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-02  8:22                                   ` Gerd Möllmann
@ 2022-10-02 16:32                                     ` Andreas Politz
  2022-10-03  4:35                                       ` Gerd Möllmann
  0 siblings, 1 reply; 68+ messages in thread
From: Andreas Politz @ 2022-10-02 16:32 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, 58158, Stefan Monnier

[-- Attachment #1: Type: text/plain, Size: 450 bytes --]

I've managed to construct a prototype of this "stateless" iterator.

I've only implemented the in-order case and only applied it in the overlays_in function.

The only real trouble I had was with pushing the offset to the children
during the tree navigation in some kind of sensible way.  In the end I
gave up and simply pasted calls to the corresponding function "all over
the place".  It seems to work, at least buffer-tests are passing.

Andreas


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: noverlay-stateless-iterator.diff --]
[-- Type: text/x-patch, Size: 6871 bytes --]

diff --git a/src/buffer.c b/src/buffer.c
index f59fddcbde..6a53b49aad 100644
--- a/src/buffer.c
+++ b/src/buffer.c
@@ -2948,15 +2948,16 @@ overlays_in (ptrdiff_t beg, ptrdiff_t end, bool extend,
   ptrdiff_t next = ZV;
   Lisp_Object *vec = *vec_ptr;
   struct interval_node *node;
+  struct interval_tree_iterator iterator;
 
-  ITREE_FOREACH (node, current_buffer->overlays, beg,
-                 /* Find empty OV at Z ? */
-                 (end >= ZV && empty) ? ZV + 1 : ZV, ASCENDING)
+  interval_tree_iterator_init(&iterator, current_buffer->overlays, beg,
+			      (end >= ZV && empty) ? ZV + 1 : ZV, ITREE_ASCENDING);
+
+  while ((node = interval_tree_iterator_next(&iterator)))
     {
       if (node->begin > end)
         {
           next = min (next, node->begin);
-          ITREE_FOREACH_ABORT ();
           break;
         }
       else if (node->begin == end)
@@ -2964,7 +2965,6 @@ overlays_in (ptrdiff_t beg, ptrdiff_t end, bool extend,
           next = node->begin;
           if ((! empty || end < ZV) && beg < end)
             {
-              ITREE_FOREACH_ABORT ();
               break;
             }
         }
diff --git a/src/itree.c b/src/itree.c
index eeecaf1839..21549fe8a7 100644
--- a/src/itree.c
+++ b/src/itree.c
@@ -1190,3 +1190,132 @@ interval_tree_subtree_min (const struct interval_tree *tree, struct interval_nod
  * +===================================================================================+ */
 
 /* See Foverlay_tree in buffer.c */
+
+\f
+/* +===================================================================================+
+ * | Stateless iterator
+ * +===================================================================================+ */
+
+static bool
+interval_tree_iter_traverse_p(struct interval_tree_iterator *iterator,
+                              struct interval_node *node);
+static
+struct interval_node *
+interval_tree_iterator_in_order_successor (struct interval_tree_iterator* iterator);
+
+void
+interval_tree_iterator_init(struct interval_tree_iterator *iterator,
+                            struct interval_tree *tree,
+                            ptrdiff_t begin,
+                            ptrdiff_t end,
+                            enum interval_tree_order order) {
+  iterator->tree = tree;
+  iterator->node = tree && begin <= end ? ITREE_NULL : NULL;
+  iterator->begin = begin;
+  iterator->end = end;
+  iterator->order = order;
+}
+
+struct interval_node *
+interval_tree_iterator_next(struct interval_tree_iterator *iterator) {
+  if (iterator->node) {
+    do {
+      switch (iterator->order) {
+      case ITREE_ASCENDING:
+        iterator->node = interval_tree_iterator_in_order_successor (iterator);
+        break;
+      default:
+        fprintf (stderr, "interval_tree_order != ITREE_ASCENDING not implemented");
+        emacs_abort ();
+      }
+    } while (iterator->node &&
+             ! interval_node_intersects (iterator->node, iterator->begin, iterator->end));
+  }
+
+  if (iterator->node == ITREE_NULL) {
+    fprintf (stderr, "Next node: ITREE_NULL\n");
+  } else if (! iterator->node) {
+    fprintf (stderr, "Next node: NULL\n");
+  } else {
+    fprintf (stderr, "Next node: begin = %ld, end = %ld (iterator: begin = %ld, end = %ld)\n",
+             iterator->node->begin, iterator->node->end, iterator->begin, iterator->end);
+  }
+  return iterator->node;
+}
+
+static bool
+interval_tree_iter_traverse_p(struct interval_tree_iterator *iterator,
+                              struct interval_node *node) {
+  eassert (node);
+
+  if (node == ITREE_NULL) {
+    return false;
+  } else {
+    eassert (node->parent != ITREE_NULL);
+    if (node->parent->left == node) {
+      return iterator->begin <= node->limit + node->offset;
+    } else {
+      return node->parent->begin <= iterator->end;
+    }
+  }
+}
+
+static
+struct interval_node *
+interval_tree_iterator_in_order_successor (struct interval_tree_iterator* iterator)
+{
+  struct interval_node *node = iterator->node;
+
+  if (node != ITREE_NULL) {
+    interval_tree_inherit_offset (iterator->tree, node);
+  }
+
+  if (node == ITREE_NULL) {
+    node = iterator->tree->root;
+    if (node != ITREE_NULL) {
+      interval_tree_inherit_offset (iterator->tree, node);
+    }
+    while (interval_tree_iter_traverse_p(iterator, node->left)) {
+      node = node->left;
+      if (node != ITREE_NULL) {
+        interval_tree_inherit_offset (iterator->tree, node);
+      }
+    }
+  } else if (interval_tree_iter_traverse_p(iterator, node->right))
+    {
+      node = node->right;
+      if (node != ITREE_NULL) {
+        interval_tree_inherit_offset (iterator->tree, node);
+      }
+      while (interval_tree_iter_traverse_p(iterator, node->left)) {
+        node = node->left;
+        if (node != ITREE_NULL) {
+          interval_tree_inherit_offset (iterator->tree, node);
+        }
+      }
+    }
+  else
+    {
+      struct interval_node *parent = node->parent;
+      while (node == parent->right)
+        {
+          node = parent;
+          parent = parent->parent;
+        }
+      if (node != ITREE_NULL)
+        node = parent;
+    }
+
+  return node == ITREE_NULL ? NULL : node;
+}
+
+void
+interval_tree_iterator_narrow (struct interval_tree_iterator *iterator,
+                               ptrdiff_t begin,
+                               ptrdiff_t end)
+{
+  eassert (begin >= iterator->begin);
+  eassert (end <= iterator->end);
+  iterator->begin =  max (begin, iterator->begin);
+  iterator->end =  min (end, iterator->end);
+}
diff --git a/src/itree.h b/src/itree.h
index 1f019a2607..53f03cca35 100644
--- a/src/itree.h
+++ b/src/itree.h
@@ -72,6 +72,15 @@ #define ITREE_NULL (&itree_null)
   ITREE_PRE_ORDER,
 };
 
+struct interval_tree_iterator
+{
+  struct interval_tree *tree;
+  struct interval_node *node;
+  ptrdiff_t begin;
+  ptrdiff_t end;
+  enum interval_tree_order order;
+};
+
 void interval_node_init (struct interval_node *, ptrdiff_t, ptrdiff_t, bool, bool, Lisp_Object);
 ptrdiff_t interval_node_begin (struct interval_tree *, struct interval_node *);
 ptrdiff_t interval_node_end (struct interval_tree *, struct interval_node *);
@@ -135,4 +144,15 @@ #define ITREE_FOREACH_ABORT() \
 #define ITREE_FOREACH_NARROW(beg, end) \
   interval_generator_narrow (itree_iter_, beg, end)
 
+void interval_tree_nodes (struct interval_tree *tree, struct interval_node **nodes, enum interval_tree_order order);
+
+void interval_tree_iterator_init(struct interval_tree_iterator *iterator,
+				 struct interval_tree *tree,
+				 ptrdiff_t begin,
+				 ptrdiff_t end,
+				 enum interval_tree_order order);
+struct interval_node *interval_tree_iterator_next(struct interval_tree_iterator *iterator);
+void interval_tree_iterator_narrow (struct interval_tree_iterator *iterator,
+			       ptrdiff_t begin,
+			       ptrdiff_t end);
 #endif

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-02 16:32                                     ` Andreas Politz
@ 2022-10-03  4:35                                       ` Gerd Möllmann
  2022-10-04 10:50                                         ` Andreas Politz
  0 siblings, 1 reply; 68+ messages in thread
From: Gerd Möllmann @ 2022-10-03  4:35 UTC (permalink / raw)
  To: Andreas Politz; +Cc: Eli Zaretskii, 58158, Stefan Monnier

Andreas Politz <mail@andreas-politz.de> writes:

> It seems to work, at least buffer-tests are passing.

"seems to work" is a bit weak.  Can we prove it?

I don't mean mathematically, but by reasoning like I tried in the
comments in successor function I posted,





^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-03  4:35                                       ` Gerd Möllmann
@ 2022-10-04 10:50                                         ` Andreas Politz
  0 siblings, 0 replies; 68+ messages in thread
From: Andreas Politz @ 2022-10-04 10:50 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: Eli Zaretskii, 58158, Stefan Monnier

My implementation seems to work. The correctness of the algorithm would follow from 

1. The Rb tree algorithm produces a tree with left <= root <= right,
2. the algorithm you’ve posted, and I’ve  adapted, produces an in–order of the tree and
3. the conditions guiding the traversal are correct, i.e. they don’t exclude matching intervals.

Since I believe these statements are true, I believe the algorithm is correct.

Andreas

> Am 03.10.2022 um 06:35 schrieb Gerd Möllmann <gerd.moellmann@gmail.com>:
> 
> Andreas Politz <mail@andreas-politz.de> writes:
> 
>> It seems to work, at least buffer-tests are passing.
> 
> "seems to work" is a bit weak.  Can we prove it?
> 
> I don't mean mathematically, but by reasoning like I tried in the
> comments in successor function I posted,

^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-09-29  5:29 bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful Gerd Möllmann
  2022-09-29  6:28 ` Eli Zaretskii
@ 2022-10-06 22:26 ` Matt Armstrong
  2023-10-06 13:14 ` Gerd Möllmann
  2 siblings, 0 replies; 68+ messages in thread
From: Matt Armstrong @ 2022-10-06 22:26 UTC (permalink / raw)
  To: Andreas Politz, Gerd Möllmann; +Cc: Eli Zaretskii, 58158, Stefan Monnier

Andreas Politz <mail@andreas-politz.de> writes:

> I've managed to construct a prototype of this "stateless" iterator.
>
> I've only implemented the in-order case and only applied it in the overlays_in function.
>
> The only real trouble I had was with pushing the offset to the children
> during the tree navigation in some kind of sensible way.  In the end I
> gave up and simply pasted calls to the corresponding function "all over
> the place".  It seems to work, at least buffer-tests are passing.

Hey Andreas, this looks pretty good to me but I have some questions on
it.


> +static
> +struct interval_node *
> +interval_tree_iterator_in_order_successor (struct interval_tree_iterator* iterator)
> +{
> +  struct interval_node *node = iterator->node;
> +
> +  if (node != ITREE_NULL) {
> +    interval_tree_inherit_offset (iterator->tree, node);
> +  }
> +  if (node == ITREE_NULL) {
> +    node = iterator->tree->root;

I don't understand this branch.  If the node is NULL upon entry to the
successor call, why start at the root?  Why is the "next in order node"
of NULL the root?  The root is just an node whose BEG is roughly in the
middle of the total inorder traversal, right?

The "stateless" inorder traversal algorithm I am used to starts with the
minimum node (walk only left links down from the root until at a leaf
and that is the minimum).  Then do inorder traversal from there.


> +    if (node != ITREE_NULL) {
> +      interval_tree_inherit_offset (iterator->tree, node);
> +    }
> +    while (interval_tree_iter_traverse_p(iterator, node->left)) {
> +      node = node->left;
> +      interval_tree_inherit_offset (iterator->tree, node);
> +    }
> +  } else if (interval_tree_iter_traverse_p(iterator, node->right))
> +    {
> +      node = node->right;
> +      interval_tree_inherit_offset (iterator->tree, node);
> +      while (interval_tree_iter_traverse_p(iterator, node->left)) {
> +        node = node->left;
> +        interval_tree_inherit_offset (iterator->tree, node);
> +      }
> +    }
> +  else
> +    {
> +      struct interval_node *parent = node->parent;
> +      while (node == parent->right)
> +        {
> +          node = parent;
> +          parent = parent->parent;
> +        }
> +      if (node != ITREE_NULL)
> +        node = parent;
> +    }
> +
> +  return node == ITREE_NULL ? NULL : node;
> +}

I don't understand how the last "else" case works correctly in the face
of a call to `interval_generator_narrow` during iteration.  Narrowing
could render a node's parent out of the "interesting" range, and so the
iterator should not return it but instead keep trying to find a
successor in range.  Right?





^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-09-30  6:11                         ` Eli Zaretskii
  2022-09-30 11:31                           ` Gerd Möllmann
@ 2022-10-06 22:36                           ` Dmitry Gutov
  2022-10-07 19:47                             ` Eli Zaretskii
  1 sibling, 1 reply; 68+ messages in thread
From: Dmitry Gutov @ 2022-10-06 22:36 UTC (permalink / raw)
  To: Eli Zaretskii, Gerd Möllmann; +Cc: 58158, monnier

On 30.09.2022 09:11, Eli Zaretskii wrote:
> "M-x xref-query-replace-in-results RET"

JFYI this command has the convenient binding 'r' in Xref output buffers.





^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-06 22:36                           ` Dmitry Gutov
@ 2022-10-07 19:47                             ` Eli Zaretskii
  2022-10-08 18:50                               ` Dmitry Gutov
  0 siblings, 1 reply; 68+ messages in thread
From: Eli Zaretskii @ 2022-10-07 19:47 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: gerd.moellmann, 58158, monnier

> Date: Fri, 7 Oct 2022 01:36:17 +0300
> Cc: 58158@debbugs.gnu.org, monnier@iro.umontreal.ca
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> On 30.09.2022 09:11, Eli Zaretskii wrote:
> > "M-x xref-query-replace-in-results RET"
> 
> JFYI this command has the convenient binding 'r' in Xref output buffers.

It also works only when invoked from the Xref buffer, right?

Btw, what am I doing wrong below?

 emacs -Q
 C-x C-f src/character.h RET
 M-x visit-tags-table RET RET
 M-. char_string RET
 r whatever RET

Unexpected result: "No suitable matches here".  Huh? what did I miss?





^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-07 19:47                             ` Eli Zaretskii
@ 2022-10-08 18:50                               ` Dmitry Gutov
  2022-10-10  8:10                                 ` Eli Zaretskii
  0 siblings, 1 reply; 68+ messages in thread
From: Dmitry Gutov @ 2022-10-08 18:50 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 58158, monnier

On 07.10.2022 22:47, Eli Zaretskii wrote:
>> Date: Fri, 7 Oct 2022 01:36:17 +0300
>> Cc: 58158@debbugs.gnu.org, monnier@iro.umontreal.ca
>> From: Dmitry Gutov <dgutov@yandex.ru>
>>
>> On 30.09.2022 09:11, Eli Zaretskii wrote:
>>> "M-x xref-query-replace-in-results RET"
>>
>> JFYI this command has the convenient binding 'r' in Xref output buffers.
> 
> It also works only when invoked from the Xref buffer, right?

Yep.

But we also have commands like xref-find-references-and-replace and 
dired-do-find-regexp-and-replace.

> Btw, what am I doing wrong below?
> 
>   emacs -Q
>   C-x C-f src/character.h RET
>   M-x visit-tags-table RET RET
>   M-. char_string RET
>   r whatever RET
> 
> Unexpected result: "No suitable matches here".  Huh? what did I miss?

We can't replace over "find definition" matches: they are more abstract 
and don't contain the necessary information to perform the replacement 
(such as the length of a match, for instance).

And such xrefs might navigate you to the beginning of the line, for 
example, rather than to the beginning of the name.

But that makes sense, doesn't it? If replacing over "find definitions" 
results worked fine, in the end you would get a codebase where all 
declarations of a method 'foo' got renamed, but all callsites of it 
remain unchanged. That couldn't have been your intention, could it?

The error message could use some improvement, I suppose, but I'm not 
sure how to make it better.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-08 18:50                               ` Dmitry Gutov
@ 2022-10-10  8:10                                 ` Eli Zaretskii
  2022-10-11  2:12                                   ` Dmitry Gutov
  0 siblings, 1 reply; 68+ messages in thread
From: Eli Zaretskii @ 2022-10-10  8:10 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: gerd.moellmann, 58158, monnier

> Date: Sat, 8 Oct 2022 21:50:39 +0300
> Cc: gerd.moellmann@gmail.com, 58158@debbugs.gnu.org, monnier@iro.umontreal.ca
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> > Btw, what am I doing wrong below?
> > 
> >   emacs -Q
> >   C-x C-f src/character.h RET
> >   M-x visit-tags-table RET RET
> >   M-. char_string RET
> >   r whatever RET
> > 
> > Unexpected result: "No suitable matches here".  Huh? what did I miss?
> 
> We can't replace over "find definition" matches: they are more abstract 
> and don't contain the necessary information to perform the replacement 
> (such as the length of a match, for instance).
> 
> And such xrefs might navigate you to the beginning of the line, for 
> example, rather than to the beginning of the name.
> 
> But that makes sense, doesn't it? If replacing over "find definitions" 
> results worked fine, in the end you would get a codebase where all 
> declarations of a method 'foo' got renamed, but all callsites of it 
> remain unchanged. That couldn't have been your intention, could it?
> 
> The error message could use some improvement, I suppose, but I'm not 
> sure how to make it better.

I tried to improve the situation, both in the error message, the doc
string, and the manual.





^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-10  8:10                                 ` Eli Zaretskii
@ 2022-10-11  2:12                                   ` Dmitry Gutov
  2022-10-11  6:37                                     ` Eli Zaretskii
  0 siblings, 1 reply; 68+ messages in thread
From: Dmitry Gutov @ 2022-10-11  2:12 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 58158, monnier

On 10.10.2022 11:10, Eli Zaretskii wrote:
>> Date: Sat, 8 Oct 2022 21:50:39 +0300
>> Cc:gerd.moellmann@gmail.com,58158@debbugs.gnu.org,monnier@iro.umontreal.ca
>> From: Dmitry Gutov<dgutov@yandex.ru>
>>
>>> Btw, what am I doing wrong below?
>>>
>>>    emacs -Q
>>>    C-x C-f src/character.h RET
>>>    M-x visit-tags-table RET RET
>>>    M-. char_string RET
>>>    r whatever RET
>>>
>>> Unexpected result: "No suitable matches here".  Huh? what did I miss?
>> We can't replace over "find definition" matches: they are more abstract
>> and don't contain the necessary information to perform the replacement
>> (such as the length of a match, for instance).
>>
>> And such xrefs might navigate you to the beginning of the line, for
>> example, rather than to the beginning of the name.
>>
>> But that makes sense, doesn't it? If replacing over "find definitions"
>> results worked fine, in the end you would get a codebase where all
>> declarations of a method 'foo' got renamed, but all callsites of it
>> remain unchanged. That couldn't have been your intention, could it?
>>
>> The error message could use some improvement, I suppose, but I'm not
>> sure how to make it better.
> I tried to improve the situation, both in the error message, the doc
> string, and the manual.

It's probably better now, but still likely confusing.

What is a "subset of matches"? The user makes a search, gets a bunch of 
matches. Is it fair to call them "only a subset" is the user only 
searched for definitions?

Or to take another example, the user can search for some regexp inside a 
particular directly, with dired-do-find-regexp. Imagine that that 
directory is not the whole project (perhaps just a minor subdirectory). 
Would it be fair to call the results "only a subset" then? But 
xref-query-replace-in-results will work in that case, because the search 
returns values which have enough info to perform the replacements.

Perhaps we should make the error very specific, like "you can't replace 
inside xref-find-definitions results". Since that is going to be the 
exception in like 99.9% of the cases.

It's possible for other backend commands (such as find-references) to 
return unsuitable xref values in third-party backends, or for other code 
to use xref-show-xrefs with such, but those will be really very rare, if 
happen at all.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-11  2:12                                   ` Dmitry Gutov
@ 2022-10-11  6:37                                     ` Eli Zaretskii
  2022-10-11 11:36                                       ` xref-query-replace-in-results error message after xref-find-definitions, was: " Dmitry Gutov
  0 siblings, 1 reply; 68+ messages in thread
From: Eli Zaretskii @ 2022-10-11  6:37 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: gerd.moellmann, 58158, monnier

> Date: Tue, 11 Oct 2022 05:12:11 +0300
> Cc: gerd.moellmann@gmail.com, 58158@debbugs.gnu.org, monnier@iro.umontreal.ca
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> What is a "subset of matches"?

Feel free to suggest a less vague description.  The idea is that the
list in Xref buffer doesn't show all the references to the identifier,
making renaming infeasible.

> Perhaps we should make the error very specific, like "you can't replace 
> inside xref-find-definitions results". Since that is going to be the 
> exception in like 99.9% of the cases.

That'd be my preference, but what are those 0.1% of cases where the
Xref buffer produced by other commands could fit?

More generally, what exactly does xref.el test to produce the error
message, and how to describe that in user-level terms?

> It's possible for other backend commands (such as find-references) to 
> return unsuitable xref values in third-party backends, or for other code 
> to use xref-show-xrefs with such, but those will be really very rare, if 
> happen at all.

You are saying that 'r' is only useful after M-?, is that right?  The
manual says so, but the manual doesn't have to say "the whole truth".
The doc string should.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* xref-query-replace-in-results error message after xref-find-definitions, was: Re: bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-11  6:37                                     ` Eli Zaretskii
@ 2022-10-11 11:36                                       ` Dmitry Gutov
  2022-10-11 11:51                                         ` Eli Zaretskii
  0 siblings, 1 reply; 68+ messages in thread
From: Dmitry Gutov @ 2022-10-11 11:36 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

Moving from Debbugs here.

On 11.10.2022 09:37, Eli Zaretskii wrote:
>> Date: Tue, 11 Oct 2022 05:12:11 +0300
>> Cc: gerd.moellmann@gmail.com, 58158@debbugs.gnu.org, monnier@iro.umontreal.ca
>> From: Dmitry Gutov <dgutov@yandex.ru>
>>
>> What is a "subset of matches"?
> 
> Feel free to suggest a less vague description.  The idea is that the
> list in Xref buffer doesn't show all the references to the identifier,
> making renaming infeasible.

How about:

   Cannot perform replacements in this search's results

>> Perhaps we should make the error very specific, like "you can't replace
>> inside xref-find-definitions results". Since that is going to be the
>> exception in like 99.9% of the cases.
> 
> That'd be my preference, but what are those 0.1% of cases where the
> Xref buffer produced by other commands could fit?

I suppose they will be covered by the above message, while in the manual 
you can explain it in more simple terms, since you don't need to worry 
about third-party code as much.

> More generally, what exactly does xref.el test to produce the error
> message, and how to describe that in user-level terms?

It tests whether the method xref-match-length returns non-nil for any 
search results. When they do, it would identify them as "match xrefs" 
(mentioned in the Commentary).

But I suppose that clashes with the terminology you prefer to use.

>> It's possible for other backend commands (such as find-references) to
>> return unsuitable xref values in third-party backends, or for other code
>> to use xref-show-xrefs with such, but those will be really very rare, if
>> happen at all.
> 
> You are saying that 'r' is only useful after M-?, is that right?  The
> manual says so, but the manual doesn't have to say "the whole truth".
> The doc string should.

It works after dired-do-find-regexp and project-find-regexp as well.



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: xref-query-replace-in-results error message after xref-find-definitions, was: Re: bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-11 11:36                                       ` xref-query-replace-in-results error message after xref-find-definitions, was: " Dmitry Gutov
@ 2022-10-11 11:51                                         ` Eli Zaretskii
  2022-10-11 12:10                                           ` Dmitry Gutov
  2022-10-11 14:04                                           ` Stefan Monnier
  0 siblings, 2 replies; 68+ messages in thread
From: Eli Zaretskii @ 2022-10-11 11:51 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

> Date: Tue, 11 Oct 2022 14:36:11 +0300
> Cc: emacs-devel <emacs-devel@gnu.org>
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> Moving from Debbugs here.
> 
> On 11.10.2022 09:37, Eli Zaretskii wrote:
> >> Date: Tue, 11 Oct 2022 05:12:11 +0300
> >> Cc: gerd.moellmann@gmail.com, 58158@debbugs.gnu.org, monnier@iro.umontreal.ca
> >> From: Dmitry Gutov <dgutov@yandex.ru>
> >>
> >> What is a "subset of matches"?
> > 
> > Feel free to suggest a less vague description.  The idea is that the
> > list in Xref buffer doesn't show all the references to the identifier,
> > making renaming infeasible.
> 
> How about:
> 
>    Cannot perform replacements in this search's results

This is similar to the original message.  Its problem is that it
states the fact, but doesn't attempt to explain it, and thus doesn't
give a clue what the user did wrong and how to fix that.

> > More generally, what exactly does xref.el test to produce the error
> > message, and how to describe that in user-level terms?
> 
> It tests whether the method xref-match-length returns non-nil for any 
> search results. When they do, it would identify them as "match xrefs" 
> (mentioned in the Commentary).
> 
> But I suppose that clashes with the terminology you prefer to use.

If it's possible to come up with the semantics of xref-match-length or
of "match xrefs", maybe that could be useful.  The commentary just
says the "correspond to search result", which is not very useful for
this purpose.

> > You are saying that 'r' is only useful after M-?, is that right?  The
> > manual says so, but the manual doesn't have to say "the whole truth".
> > The doc string should.
> 
> It works after dired-do-find-regexp and project-find-regexp as well.

So wed cannot say something like "This can only be used after M-?",
sigh...

I still have no idea how to improve the error message.



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: xref-query-replace-in-results error message after xref-find-definitions, was: Re: bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-11 11:51                                         ` Eli Zaretskii
@ 2022-10-11 12:10                                           ` Dmitry Gutov
  2022-10-11 12:17                                             ` Eli Zaretskii
  2022-10-11 14:04                                           ` Stefan Monnier
  1 sibling, 1 reply; 68+ messages in thread
From: Dmitry Gutov @ 2022-10-11 12:10 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

On 11.10.2022 14:51, Eli Zaretskii wrote:
>> Date: Tue, 11 Oct 2022 14:36:11 +0300
>> Cc: emacs-devel <emacs-devel@gnu.org>
>> From: Dmitry Gutov <dgutov@yandex.ru>
>>
>> Moving from Debbugs here.
>>
>> On 11.10.2022 09:37, Eli Zaretskii wrote:
>>>> Date: Tue, 11 Oct 2022 05:12:11 +0300
>>>> Cc: gerd.moellmann@gmail.com, 58158@debbugs.gnu.org, monnier@iro.umontreal.ca
>>>> From: Dmitry Gutov <dgutov@yandex.ru>
>>>>
>>>> What is a "subset of matches"?
>>>
>>> Feel free to suggest a less vague description.  The idea is that the
>>> list in Xref buffer doesn't show all the references to the identifier,
>>> making renaming infeasible.
>>
>> How about:
>>
>>     Cannot perform replacements in this search's results
> 
> This is similar to the original message.

It is, because it tries to be accurate foremost, covering all potential 
situations.

But, like I explained, your new message is not much better: it still 
tries to be "high-level", rather than stating particulars, and while 
doing that, contradicts some objective facts.

It might be worth it if it were very clear to the user and true in 99% 
of the situations, but that doesn't looks like the case.

> Its problem is that it
> states the fact, but doesn't attempt to explain it, and thus doesn't
> give a clue what the user did wrong and how to fix that.

How does one explain that we cannot replace in xref-find-definitions 
results? E.g. because the abstract we use for this operations (to 
support different backends) doesn't give us enough information to 
perform that replacement.

And also because replacing in xref-find-definitions results doesn't make 
sense to begin with.

>>> More generally, what exactly does xref.el test to produce the error
>>> message, and how to describe that in user-level terms?
>>
>> It tests whether the method xref-match-length returns non-nil for any
>> search results. When they do, it would identify them as "match xrefs"
>> (mentioned in the Commentary).
>>
>> But I suppose that clashes with the terminology you prefer to use.
> 
> If it's possible to come up with the semantics of xref-match-length or
> of "match xrefs", maybe that could be useful.

Those are basically generalized versions of xref file matches (also 
almost same info as what M-x Grep provides), which contain the line 
number and column, and length of the match. We obtain the first two 
pieces of info lazily, but we need the last one as well.

> The commentary just
> says the "correspond to search result", which is not very useful for
> this purpose.

Search result as in result of a "full scan" of the directory. As opposed 
to looking in some small index which contains the definitions. At least 
what's what I was thinking of, probably.

But the find-references search could also be sped up with an index, so 
this is probably not worth differentiating in these terms.

>>> You are saying that 'r' is only useful after M-?, is that right?  The
>>> manual says so, but the manual doesn't have to say "the whole truth".
>>> The doc string should.
>>
>> It works after dired-do-find-regexp and project-find-regexp as well.
> 
> So wed cannot say something like "This can only be used after M-?",
> sigh...
> 
> I still have no idea how to improve the error message.

Perhaps I should remind that xref-find-definitions is still the main 
exception -- where this command doesn't work.

We also had some talks previously where it's been suggested that we 
should try to show different UIs by default, for xref-find-definitions 
results, and for other xref searches. IIRC you disagreed.

I tried to make a poll for that idea, but there were no conclusive 
choice on what alternative xref-show-definitions-function to use.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: xref-query-replace-in-results error message after xref-find-definitions, was: Re: bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-11 12:10                                           ` Dmitry Gutov
@ 2022-10-11 12:17                                             ` Eli Zaretskii
  2022-10-11 12:44                                               ` Dmitry Gutov
  0 siblings, 1 reply; 68+ messages in thread
From: Eli Zaretskii @ 2022-10-11 12:17 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

> Date: Tue, 11 Oct 2022 15:10:47 +0300
> Cc: emacs-devel@gnu.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> >> How about:
> >>
> >>     Cannot perform replacements in this search's results
> > 
> > This is similar to the original message.
> 
> It is, because it tries to be accurate foremost, covering all potential 
> situations.
> 
> But, like I explained, your new message is not much better: it still 
> tries to be "high-level", rather than stating particulars, and while 
> doing that, contradicts some objective facts.
> 
> It might be worth it if it were very clear to the user and true in 99% 
> of the situations, but that doesn't looks like the case.
> 
> > Its problem is that it
> > states the fact, but doesn't attempt to explain it, and thus doesn't
> > give a clue what the user did wrong and how to fix that.
> 
> How does one explain that we cannot replace in xref-find-definitions 
> results?

That's what the "subset of matches of identifier" part attempts to do.

> And also because replacing in xref-find-definitions results doesn't make 
> sense to begin with.

I agree that it makes no sense.  The problem is how to say that in a
general enough way.

> > If it's possible to come up with the semantics of xref-match-length or
> > of "match xrefs", maybe that could be useful.
> 
> Those are basically generalized versions of xref file matches (also 
> almost same info as what M-x Grep provides), which contain the line 
> number and column, and length of the match. We obtain the first two 
> pieces of info lazily, but we need the last one as well.

And why do the results of xref-find-definitions lack that?

> > I still have no idea how to improve the error message.
> 
> Perhaps I should remind that xref-find-definitions is still the main 
> exception -- where this command doesn't work.

But not the only one?

> We also had some talks previously where it's been suggested that we 
> should try to show different UIs by default, for xref-find-definitions 
> results, and for other xref searches. IIRC you disagreed.

'r' is just one command which is sensitive to the differences.  AFAIR,
most other commands aren't.  So it makes sense to use the same UI.



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: xref-query-replace-in-results error message after xref-find-definitions, was: Re: bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-11 12:17                                             ` Eli Zaretskii
@ 2022-10-11 12:44                                               ` Dmitry Gutov
  2022-10-11 12:55                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 68+ messages in thread
From: Dmitry Gutov @ 2022-10-11 12:44 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

On 11.10.2022 15:17, Eli Zaretskii wrote:

>>> Its problem is that it
>>> states the fact, but doesn't attempt to explain it, and thus doesn't
>>> give a clue what the user did wrong and how to fix that.
>>
>> How does one explain that we cannot replace in xref-find-definitions
>> results?
> 
> That's what the "subset of matches of identifier" part attempts to do.

A high-level and not very accurate description, because it's only 
relevant for the difference between xref-find-definitions vs 
xref-find-references, but not when the *-find-regexp commands come into 
play.

>> And also because replacing in xref-find-definitions results doesn't make
>> sense to begin with.
> 
> I agree that it makes no sense.  The problem is how to say that in a
> general enough way.

"Not supported" is not too terrible an error message if we are sure the 
user is trying to do something they shouldn't even attempt to.

But we can try to be helpful by offering an alternative:

   Cannot replace in this search; to rename a symbol, invoke 
\\[xref-find-references] first

>>> If it's possible to come up with the semantics of xref-match-length or
>>> of "match xrefs", maybe that could be useful.
>>
>> Those are basically generalized versions of xref file matches (also
>> almost same info as what M-x Grep provides), which contain the line
>> number and column, and length of the match. We obtain the first two
>> pieces of info lazily, but we need the last one as well.
> 
> And why do the results of xref-find-definitions lack that?

So that the backend isn't forced to provide info that's harder to get, 
and that we don't use anyway. E.g. M-x find-function just brings you to 
BOL rather than to the beginning of the symbol.

>>> I still have no idea how to improve the error message.
>>
>> Perhaps I should remind that xref-find-definitions is still the main
>> exception -- where this command doesn't work.
> 
> But not the only one?

The only known one, so far. Although it might depend on the backend as 
well. The built-in backends are going to fail, but it seems like 
lsp-mode at least returns "match xrefs" for all searches. Maybe Eglot 
too, I haven't checked.

That would mean that one 'r' can work in lsp-mode's 
xref-find-definitions results (they define a bunch of custom commands 
like lsp-find-definition and lsp-find-declaration, but that probably 
doesn't matter). Not sure if we should do something about that.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: xref-query-replace-in-results error message after xref-find-definitions, was: Re: bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-11 12:44                                               ` Dmitry Gutov
@ 2022-10-11 12:55                                                 ` Eli Zaretskii
  2022-10-11 14:55                                                   ` Dmitry Gutov
  0 siblings, 1 reply; 68+ messages in thread
From: Eli Zaretskii @ 2022-10-11 12:55 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

> Date: Tue, 11 Oct 2022 15:44:18 +0300
> Cc: emacs-devel@gnu.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> On 11.10.2022 15:17, Eli Zaretskii wrote:
> 
> >>> Its problem is that it
> >>> states the fact, but doesn't attempt to explain it, and thus doesn't
> >>> give a clue what the user did wrong and how to fix that.
> >>
> >> How does one explain that we cannot replace in xref-find-definitions
> >> results?
> > 
> > That's what the "subset of matches of identifier" part attempts to do.
> 
> A high-level and not very accurate description, because it's only 
> relevant for the difference between xref-find-definitions vs 
> xref-find-references, but not when the *-find-regexp commands come into 
> play.

I know.  As I said, suggestions for better wording will be moist
welcome.

> But we can try to be helpful by offering an alternative:
> 
>    Cannot replace in this search; to rename a symbol, invoke 
> \\[xref-find-references] first

But then we'd need to name the other 2 commands as well, to be
accurate, yes?

> >> Perhaps I should remind that xref-find-definitions is still the main
> >> exception -- where this command doesn't work.
> > 
> > But not the only one?
> 
> The only known one, so far.

So maybe just saying

  Cannot do global replacement using results of \\[xref-find-definitions]

should be okay?

> That would mean that one 'r' can work in lsp-mode's 
> xref-find-definitions results (they define a bunch of custom commands 
> like lsp-find-definition and lsp-find-declaration, but that probably 
> doesn't matter). Not sure if we should do something about that.

If 'r' happens to work in that case, we don't have to worry about the
error message, right?



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: xref-query-replace-in-results error message after xref-find-definitions, was: Re: bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-11 11:51                                         ` Eli Zaretskii
  2022-10-11 12:10                                           ` Dmitry Gutov
@ 2022-10-11 14:04                                           ` Stefan Monnier
  2022-10-11 14:07                                             ` Stefan Monnier
  2022-10-11 15:37                                             ` Eli Zaretskii
  1 sibling, 2 replies; 68+ messages in thread
From: Stefan Monnier @ 2022-10-11 14:04 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Dmitry Gutov, emacs-devel

>> >> What is a "subset of matches"?
[...]
>>    Cannot perform replacements in this search's results

Not sure if it's been proposed already, but how 'bout something along
the lines of "There may be more matches <elsewhere>".


        Stefan




^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: xref-query-replace-in-results error message after xref-find-definitions, was: Re: bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-11 14:04                                           ` Stefan Monnier
@ 2022-10-11 14:07                                             ` Stefan Monnier
  2022-10-11 15:07                                               ` Dmitry Gutov
  2022-10-11 15:37                                             ` Eli Zaretskii
  1 sibling, 1 reply; 68+ messages in thread
From: Stefan Monnier @ 2022-10-11 14:07 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Dmitry Gutov, emacs-devel

Stefan Monnier [2022-10-11 10:04:28] wrote:
>>> >> What is a "subset of matches"?
> [...]
>>>    Cannot perform replacements in this search's results
>
> Not sure if it's been proposed already, but how 'bout something along
> the lines of "There may be more matches <elsewhere>".

Or "This list is probably incomplete"?


        Stefan




^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: xref-query-replace-in-results error message after xref-find-definitions, was: Re: bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-11 12:55                                                 ` Eli Zaretskii
@ 2022-10-11 14:55                                                   ` Dmitry Gutov
  2022-10-11 16:01                                                     ` Eli Zaretskii
  0 siblings, 1 reply; 68+ messages in thread
From: Dmitry Gutov @ 2022-10-11 14:55 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

On 11.10.2022 15:55, Eli Zaretskii wrote:

>> But we can try to be helpful by offering an alternative:
>>
>>     Cannot replace in this search; to rename a symbol, invoke
>> \\[xref-find-references] first
> 
> But then we'd need to name the other 2 commands as well, to be
> accurate, yes?

If our conclusion is that the error is due to user trying to rename a 
symbol (and failing because xref-find-definitions's results don't allow 
them to do so), then xref-find-references is exactly the right suggestion.

Users who try to replace in regexp search matches don't see the error.

>>>> Perhaps I should remind that xref-find-definitions is still the main
>>>> exception -- where this command doesn't work.
>>>
>>> But not the only one?
>>
>> The only known one, so far.
> 
> So maybe just saying
> 
>    Cannot do global replacement using results of \\[xref-find-definitions]
> 
> should be okay?

Isn't it almost the same as I suggested upthread? Except I suggested 
"this search" instead of naming the specific command.

Do you think naming it will be helpful enough to sacrifice the 10% 
accuracy of the message? I suppose someone might have indeed forgotten 
that they did the search using xref-find-definitions.

>> That would mean that one 'r' can work in lsp-mode's
>> xref-find-definitions results (they define a bunch of custom commands
>> like lsp-find-definition and lsp-find-declaration, but that probably
>> doesn't matter). Not sure if we should do something about that.
> 
> If 'r' happens to work in that case, we don't have to worry about the
> error message, right?

That's correct, but having the command succeed might be a problem by 
itself, couldn't it? It will rename the definitions (and/or 
declarations), but not other occurrences.

If we go in from this direction, we can have 
xref-show-definitions-buffer (the default 
xref-show-definitions-function) ensure that the binding for 'r' is set 
to some command that always reports an error (like 'cannot replace in 
definitions'), or is unbound.

This would do nothing for custom values of 
xref-show-definitions-function, but should remove most of the confusion 
with default configuration. And some non-default ones as well (lsp-mode 
doesn't change the value of xref-show-definitions-function).

If the docstring of xref-show-definitions-function looks okay to you, we 
can use its vocabulary.

   Cannot replace in definition search results

should cover xref-find-definitions, lsp-find-definition and 
lsp-find-declaration. Wouldn't help with lsp-find-implementation, though 
(its results are also questionable WRT renaming because they don't 
include all references either), but it won't make it worse.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: xref-query-replace-in-results error message after xref-find-definitions, was: Re: bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-11 14:07                                             ` Stefan Monnier
@ 2022-10-11 15:07                                               ` Dmitry Gutov
  0 siblings, 0 replies; 68+ messages in thread
From: Dmitry Gutov @ 2022-10-11 15:07 UTC (permalink / raw)
  To: Stefan Monnier, Eli Zaretskii; +Cc: emacs-devel

On 11.10.2022 17:07, Stefan Monnier wrote:
> Stefan Monnier [2022-10-11 10:04:28] wrote:
>>>>>> What is a "subset of matches"?
>> [...]
>>>>     Cannot perform replacements in this search's results
>> Not sure if it's been proposed already, but how 'bout something along
>> the lines of "There may be more matches <elsewhere>".
> Or "This list is probably incomplete"?

That's the logical reason, but not really the technical one.

Although we might make a stronger effort to align these two (see my 
previous message in this thread).



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: xref-query-replace-in-results error message after xref-find-definitions, was: Re: bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-11 14:04                                           ` Stefan Monnier
  2022-10-11 14:07                                             ` Stefan Monnier
@ 2022-10-11 15:37                                             ` Eli Zaretskii
  1 sibling, 0 replies; 68+ messages in thread
From: Eli Zaretskii @ 2022-10-11 15:37 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: dgutov, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Dmitry Gutov <dgutov@yandex.ru>,  emacs-devel@gnu.org
> Date: Tue, 11 Oct 2022 10:04:28 -0400
> 
> >> >> What is a "subset of matches"?
> [...]
> >>    Cannot perform replacements in this search's results
> 
> Not sure if it's been proposed already, but how 'bout something along
> the lines of "There may be more matches <elsewhere>".

Thanks.  But that doesn't sound like an error message to me.  That
there are more matches doesn't make it clear why the ones that are
present cannot be used.  It requires a lengthy explanation or
inference.



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: xref-query-replace-in-results error message after xref-find-definitions, was: Re: bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-11 14:55                                                   ` Dmitry Gutov
@ 2022-10-11 16:01                                                     ` Eli Zaretskii
  2022-10-11 16:41                                                       ` Dmitry Gutov
  0 siblings, 1 reply; 68+ messages in thread
From: Eli Zaretskii @ 2022-10-11 16:01 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

> Date: Tue, 11 Oct 2022 17:55:02 +0300
> Cc: emacs-devel@gnu.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> On 11.10.2022 15:55, Eli Zaretskii wrote:
> 
> >> But we can try to be helpful by offering an alternative:
> >>
> >>     Cannot replace in this search; to rename a symbol, invoke
> >> \\[xref-find-references] first
> > 
> > But then we'd need to name the other 2 commands as well, to be
> > accurate, yes?
> 
> If our conclusion is that the error is due to user trying to rename a 
> symbol (and failing because xref-find-definitions's results don't allow 
> them to do so), then xref-find-references is exactly the right suggestion.

That's true.

> > So maybe just saying
> > 
> >    Cannot do global replacement using results of \\[xref-find-definitions]
> > 
> > should be okay?
> 
> Isn't it almost the same as I suggested upthread? Except I suggested 
> "this search" instead of naming the specific command.

I'm not trying to argue, I'm trying to find the best message text.  It
doesn't matter who suggested it first, what matters whether it is
accurate and self-explanatory.

> Do you think naming it will be helpful enough to sacrifice the 10% 
> accuracy of the message? I suppose someone might have indeed forgotten 
> that they did the search using xref-find-definitions.

What are the 10% of inaccuracy?  If the message will only ever be
shown when 'r' is invoked after M-., then what is inaccurate in it?

> >> That would mean that one 'r' can work in lsp-mode's
> >> xref-find-definitions results (they define a bunch of custom commands
> >> like lsp-find-definition and lsp-find-declaration, but that probably
> >> doesn't matter). Not sure if we should do something about that.
> > 
> > If 'r' happens to work in that case, we don't have to worry about the
> > error message, right?
> 
> That's correct, but having the command succeed might be a problem by 
> itself, couldn't it? It will rename the definitions (and/or 
> declarations), but not other occurrences.

No error message could possibly prevent that from happening, could it?

> If we go in from this direction, we can have 
> xref-show-definitions-buffer (the default 
> xref-show-definitions-function) ensure that the binding for 'r' is set 
> to some command that always reports an error (like 'cannot replace in 
> definitions'), or is unbound.

But users can always invoke the command by name as well.

> This would do nothing for custom values of 
> xref-show-definitions-function, but should remove most of the confusion 
> with default configuration. And some non-default ones as well (lsp-mode 
> doesn't change the value of xref-show-definitions-function).
> 
> If the docstring of xref-show-definitions-function looks okay to you, we 
> can use its vocabulary.
> 
>    Cannot replace in definition search results
> 
> should cover xref-find-definitions, lsp-find-definition and 
> lsp-find-declaration. Wouldn't help with lsp-find-implementation, though 
> (its results are also questionable WRT renaming because they don't 
> include all references either), but it won't make it worse.

Hmm... maybe

  Cannot perform global replacement in find-definition results

Another idea would be for the error message to be constructed
dynamically, and include the precise command that produced the Xref
buffer, if xref.el could record that in some buffer-local variable.



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: xref-query-replace-in-results error message after xref-find-definitions, was: Re: bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-11 16:01                                                     ` Eli Zaretskii
@ 2022-10-11 16:41                                                       ` Dmitry Gutov
  2022-10-11 16:50                                                         ` Eli Zaretskii
  0 siblings, 1 reply; 68+ messages in thread
From: Dmitry Gutov @ 2022-10-11 16:41 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

On 11.10.2022 19:01, Eli Zaretskii wrote:

>>> So maybe just saying
>>>
>>>     Cannot do global replacement using results of \\[xref-find-definitions]
>>>
>>> should be okay?
>>
>> Isn't it almost the same as I suggested upthread? Except I suggested
>> "this search" instead of naming the specific command.
> 
> I'm not trying to argue, I'm trying to find the best message text.  It
> doesn't matter who suggested it first, what matters whether it is
> accurate and self-explanatory.

I wasn't playing "who said it first" either, but pointing at the 
previous suggestion as a little more inherently-accurate.

>> Do you think naming it will be helpful enough to sacrifice the 10%
>> accuracy of the message? I suppose someone might have indeed forgotten
>> that they did the search using xref-find-definitions.
> 
> What are the 10% of inaccuracy?

Sorry, losing 100% accuracy, to get a slightly smaller percentage.

> If the message will only ever be
> shown when 'r' is invoked after M-., then what is inaccurate in it?

Almost only ever. With potential for exceptions, unless we try to 
enforce their absence.

>>>> That would mean that one 'r' can work in lsp-mode's
>>>> xref-find-definitions results (they define a bunch of custom commands
>>>> like lsp-find-definition and lsp-find-declaration, but that probably
>>>> doesn't matter). Not sure if we should do something about that.
>>>
>>> If 'r' happens to work in that case, we don't have to worry about the
>>> error message, right?
>>
>> That's correct, but having the command succeed might be a problem by
>> itself, couldn't it? It will rename the definitions (and/or
>> declarations), but not other occurrences.
> 
> No error message could possibly prevent that from happening, could it?

Only if it happens somewhere else, or if the check before the errors is 
changed significantly.

>> If we go in from this direction, we can have
>> xref-show-definitions-buffer (the default
>> xref-show-definitions-function) ensure that the binding for 'r' is set
>> to some command that always reports an error (like 'cannot replace in
>> definitions'), or is unbound.
> 
> But users can always invoke the command by name as well.

That's, uh, true. But having commands intended for one major mode 
silently failing (or doing that in a weird fashion) when invoked with 
'M-x' in another major mode has not usually been a concern for us.

>> This would do nothing for custom values of
>> xref-show-definitions-function, but should remove most of the confusion
>> with default configuration. And some non-default ones as well (lsp-mode
>> doesn't change the value of xref-show-definitions-function).
>>
>> If the docstring of xref-show-definitions-function looks okay to you, we
>> can use its vocabulary.
>>
>>     Cannot replace in definition search results
>>
>> should cover xref-find-definitions, lsp-find-definition and
>> lsp-find-declaration. Wouldn't help with lsp-find-implementation, though
>> (its results are also questionable WRT renaming because they don't
>> include all references either), but it won't make it worse.
> 
> Hmm... maybe
> 
>    Cannot perform global replacement in find-definition results
> 
> Another idea would be for the error message to be constructed
> dynamically, and include the precise command that produced the Xref
> buffer, if xref.el could record that in some buffer-local variable.

I suppose it could. But then some backend (such as lsp or possibly 
eglot) might return 'definitions' results in format suitable for 
replacements, and that will mean that one can replace in 
xref-find-definitions's results, just with some backends but not others.

And with that, the error message

   Cannot do global replacement using results of \\[xref-find-definitions]

shown to the same user at a different time (perhaps when they're editing 
Elisp) will become a lie.

And then, okay, we could try to add the name of the backend symbol to 
the error message as well, but it's much harder to capture that one, 
especially given that not every command using Xref output will go 
through xref-backend-functions (project-find-regexp is a counter-example).



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: xref-query-replace-in-results error message after xref-find-definitions, was: Re: bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-11 16:41                                                       ` Dmitry Gutov
@ 2022-10-11 16:50                                                         ` Eli Zaretskii
  2022-10-11 20:31                                                           ` Dmitry Gutov
  0 siblings, 1 reply; 68+ messages in thread
From: Eli Zaretskii @ 2022-10-11 16:50 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

> Date: Tue, 11 Oct 2022 19:41:35 +0300
> Cc: emacs-devel@gnu.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> >> This would do nothing for custom values of
> >> xref-show-definitions-function, but should remove most of the confusion
> >> with default configuration. And some non-default ones as well (lsp-mode
> >> doesn't change the value of xref-show-definitions-function).
> >>
> >> If the docstring of xref-show-definitions-function looks okay to you, we
> >> can use its vocabulary.
> >>
> >>     Cannot replace in definition search results
> >>
> >> should cover xref-find-definitions, lsp-find-definition and
> >> lsp-find-declaration. Wouldn't help with lsp-find-implementation, though
> >> (its results are also questionable WRT renaming because they don't
> >> include all references either), but it won't make it worse.
> > 
> > Hmm... maybe
> > 
> >    Cannot perform global replacement in find-definition results
> > 
> > Another idea would be for the error message to be constructed
> > dynamically, and include the precise command that produced the Xref
> > buffer, if xref.el could record that in some buffer-local variable.
> 
> I suppose it could. But then some backend (such as lsp or possibly 
> eglot) might return 'definitions' results in format suitable for 
> replacements, and that will mean that one can replace in 
> xref-find-definitions's results, just with some backends but not others.
> 
> And with that, the error message
> 
>    Cannot do global replacement using results of \\[xref-find-definitions]
> 
> shown to the same user at a different time (perhaps when they're editing 
> Elisp) will become a lie.
> 
> And then, okay, we could try to add the name of the backend symbol to 
> the error message as well, but it's much harder to capture that one, 
> especially given that not every command using Xref output will go 
> through xref-backend-functions (project-find-regexp is a counter-example).

So is this:

  Cannot perform global replacement in find-definition results

the best that can be done?  It sounds like no alternative is
significantly better.




^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: xref-query-replace-in-results error message after xref-find-definitions, was: Re: bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-11 16:50                                                         ` Eli Zaretskii
@ 2022-10-11 20:31                                                           ` Dmitry Gutov
  2022-10-12  5:17                                                             ` Eli Zaretskii
  0 siblings, 1 reply; 68+ messages in thread
From: Dmitry Gutov @ 2022-10-11 20:31 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

On 11.10.2022 19:50, Eli Zaretskii wrote:

> So is this:
> 
>    Cannot perform global replacement in find-definition results
> 
> the best that can be done?  It sounds like no alternative is
> significantly better.

Perhaps.

Let's drop the "global" adjective, though: whether the command works 
"globally", "locally", or etc, depends on how the user made the search.

Then we can make this the error message and wait and see until the next 
time anybody complains about the new phrasing. Probably after Emacs 29.



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: xref-query-replace-in-results error message after xref-find-definitions, was: Re: bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-11 20:31                                                           ` Dmitry Gutov
@ 2022-10-12  5:17                                                             ` Eli Zaretskii
  2022-10-12 10:06                                                               ` John Yates
  2022-10-12 13:47                                                               ` Dmitry Gutov
  0 siblings, 2 replies; 68+ messages in thread
From: Eli Zaretskii @ 2022-10-12  5:17 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

> Date: Tue, 11 Oct 2022 23:31:01 +0300
> Cc: emacs-devel@gnu.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> On 11.10.2022 19:50, Eli Zaretskii wrote:
> 
> > So is this:
> > 
> >    Cannot perform global replacement in find-definition results
> > 
> > the best that can be done?  It sounds like no alternative is
> > significantly better.
> 
> Perhaps.
> 
> Let's drop the "global" adjective, though: whether the command works 
> "globally", "locally", or etc, depends on how the user made the search.

The "global" part is important, though: it's supposed to hint on why
find-definition results cannot be used.  The opposite of "global" here
is "partial", not "local".  If you can suggest a better word for
"global" here, please do.

> Then we can make this the error message and wait and see until the next 
> time anybody complains about the new phrasing. Probably after Emacs 29.

Yep.



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: xref-query-replace-in-results error message after xref-find-definitions, was: Re: bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-12  5:17                                                             ` Eli Zaretskii
@ 2022-10-12 10:06                                                               ` John Yates
  2022-10-12 10:17                                                                 ` Eli Zaretskii
  2022-10-12 13:47                                                               ` Dmitry Gutov
  1 sibling, 1 reply; 68+ messages in thread
From: John Yates @ 2022-10-12 10:06 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Dmitry Gutov, emacs-devel

On Wed, Oct 12, 2022 at 1:18 AM Eli Zaretskii <eliz@gnu.org> wrote:
>
> The "global" part is important, though: it's supposed to hint on why
> find-definition results cannot be used.  The opposite of "global" here
> is "partial", not "local".  If you can suggest a better word for
> "global" here, please do.

Clearly the 'global' / 'local' dichotomy is too deeply ingrained in emacs
concepts to allow easy reuse of those terms with even slightly different
meanings.  Perhaps 'full', 'complete', 'total', 'universal', or some such.



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: xref-query-replace-in-results error message after xref-find-definitions, was: Re: bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-12 10:06                                                               ` John Yates
@ 2022-10-12 10:17                                                                 ` Eli Zaretskii
       [not found]                                                                   ` <CAJnXXogKsM=gMTFi2NivDMHW4A3EBtBtsNDBV3o5vcu2xXfuvw@mail.gmail.com>
  0 siblings, 1 reply; 68+ messages in thread
From: Eli Zaretskii @ 2022-10-12 10:17 UTC (permalink / raw)
  To: emacs-devel, John Yates; +Cc: Dmitry Gutov

On October 12, 2022 1:06:22 PM GMT+03:00, John Yates <john@yates-sheets.org> wrote:
> On Wed, Oct 12, 2022 at 1:18 AM Eli Zaretskii <eliz@gnu.org> wrote:
> >
> > The "global" part is important, though: it's supposed to hint on why
> > find-definition results cannot be used.  The opposite of "global" here
> > is "partial", not "local".  If you can suggest a better word for
> > "global" here, please do.
> 
> Clearly the 'global' / 'local' dichotomy is too deeply ingrained in emacs
> concepts to allow easy reuse of those terms with even slightly different
> meanings.  Perhaps 'full', 'complete', 'total', 'universal', or some such.


We are talking about "global replacement", yes?



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: xref-query-replace-in-results error message after xref-find-definitions, was: Re: bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
       [not found]                                                                   ` <CAJnXXogKsM=gMTFi2NivDMHW4A3EBtBtsNDBV3o5vcu2xXfuvw@mail.gmail.com>
@ 2022-10-12 13:21                                                                     ` Eli Zaretskii
  2022-10-12 16:12                                                                       ` John Yates
  0 siblings, 1 reply; 68+ messages in thread
From: Eli Zaretskii @ 2022-10-12 13:21 UTC (permalink / raw)
  To: John Yates; +Cc: emacs-devel

> From: John Yates <john@yates-sheets.org>
> Date: Wed, 12 Oct 2022 07:49:06 -0400
> 
> On Wed, Oct 12, 2022 at 6:17 AM Eli Zaretskii <eliz@gnu.org> wrote:
> >
> > On October 12, 2022 1:06:22 PM GMT+03:00, John Yates <john@yates-sheets.org> wrote:
> > > . . .  Perhaps 'full', 'complete', 'total', 'universal', or some such.
> >
> > We are talking about "global replacement", yes?
> 
> Yes we are.  And, IIUC, your point is that the concept of
> "global replacement" is comparably embedded in software
> development.  Hmm...

Not only that, the replacements (pun intended) you proposed don't seem
to fit this context.  What synonyms of "global replacement" exist that
don't use the word "global"?

P.S. I restored the CC to emacs-devel on the assumption that you
omitted it by accident; apologies if that was intentional.



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: xref-query-replace-in-results error message after xref-find-definitions, was: Re: bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-12  5:17                                                             ` Eli Zaretskii
  2022-10-12 10:06                                                               ` John Yates
@ 2022-10-12 13:47                                                               ` Dmitry Gutov
  2022-10-12 14:05                                                                 ` Eli Zaretskii
  1 sibling, 1 reply; 68+ messages in thread
From: Dmitry Gutov @ 2022-10-12 13:47 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

On 12.10.2022 08:17, Eli Zaretskii wrote:
>> Let's drop the "global" adjective, though: whether the command works
>> "globally", "locally", or etc, depends on how the user made the search.
> The "global" part is important, though: it's supposed to hint on why
> find-definition results cannot be used.  The opposite of "global" here
> is "partial", not "local".  If you can suggest a better word for
> "global" here, please do.

I think you're trying hard to make it clearer, but it still won't have 
100% the intended effect.

And it makes the error more likely to confuse the user and make them 
misunderstand how things work:

1) Saying "Cannot perform global replacement in find-definition 
results", with the explicit qualifier "global", potentially implies that 
a "local" replacement in find-definition results can be done. But it 
cannot. We can't do either currently for technical reasons.

We don't want to do "local" replacements in find-definition results also 
for logical reasons, but that's just the reason why we're not in a hurry 
to remove the technical limitation.

2) The message implies 'r' is limited to "global" replacement. But it 
can easily do "partial" replacement, as long as the command that 
produces the list returns "match xrefs". dired-do-find-regexp, for 
example. Or project-find-regexp, when called with C-u, and the user 
specifies a specific directory. They will return a "partial" list of 
matches, compared to the full project search.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: xref-query-replace-in-results error message after xref-find-definitions, was: Re: bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-12 13:47                                                               ` Dmitry Gutov
@ 2022-10-12 14:05                                                                 ` Eli Zaretskii
  0 siblings, 0 replies; 68+ messages in thread
From: Eli Zaretskii @ 2022-10-12 14:05 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

> Date: Wed, 12 Oct 2022 16:47:15 +0300
> Cc: emacs-devel@gnu.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> On 12.10.2022 08:17, Eli Zaretskii wrote:
> >> Let's drop the "global" adjective, though: whether the command works
> >> "globally", "locally", or etc, depends on how the user made the search.
> > The "global" part is important, though: it's supposed to hint on why
> > find-definition results cannot be used.  The opposite of "global" here
> > is "partial", not "local".  If you can suggest a better word for
> > "global" here, please do.
> 
> I think you're trying hard to make it clearer, but it still won't have 
> 100% the intended effect.
> 
> And it makes the error more likely to confuse the user and make them 
> misunderstand how things work:
> 
> 1) Saying "Cannot perform global replacement in find-definition 
> results", with the explicit qualifier "global", potentially implies that 
> a "local" replacement in find-definition results can be done. But it 
> cannot. We can't do either currently for technical reasons.
> 
> We don't want to do "local" replacements in find-definition results also 
> for logical reasons, but that's just the reason why we're not in a hurry 
> to remove the technical limitation.
> 
> 2) The message implies 'r' is limited to "global" replacement. But it 
> can easily do "partial" replacement, as long as the command that 
> produces the list returns "match xrefs". dired-do-find-regexp, for 
> example. Or project-find-regexp, when called with C-u, and the user 
> specifies a specific directory. They will return a "partial" list of 
> matches, compared to the full project search.

Sorry, I'm unconvinced.  AFAICT, you are reiterating issues and
arguments we already have been through.



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: xref-query-replace-in-results error message after xref-find-definitions, was: Re: bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-10-12 13:21                                                                     ` Eli Zaretskii
@ 2022-10-12 16:12                                                                       ` John Yates
  0 siblings, 0 replies; 68+ messages in thread
From: John Yates @ 2022-10-12 16:12 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

On Wed, Oct 12, 2022 at 9:20 AM Eli Zaretskii <eliz@gnu.org> wrote:
>
> Not only that, the replacements (pun intended) you proposed don't seem
> to fit this context.

I agree.  Chalk it up to a "senior moment" :-)

> What synonyms of "global replacement" exist that
> don't use the word "global"?

A synonym search suggests "universal" or "galactic" :-)

> P.S. I restored the CC to emacs-devel on the assumption that you
> omitted it by accident; apologies if that was intentional.

I do not always remember to use reply-all.  Thank you.



^ permalink raw reply	[flat|nested] 68+ messages in thread

* bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful
  2022-09-29  5:29 bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful Gerd Möllmann
  2022-09-29  6:28 ` Eli Zaretskii
  2022-10-06 22:26 ` Matt Armstrong
@ 2023-10-06 13:14 ` Gerd Möllmann
  2 siblings, 0 replies; 68+ messages in thread
From: Gerd Möllmann @ 2023-10-06 13:14 UTC (permalink / raw)
  To: 58158

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> In its current form, interval tree iteration works like this:
>
> 1. Call begin_iteration(T) to iterate over tree T
> 2. do stuff
> 3. Call end_iteration(T)
>
> with the following rules:
>
> - Begin_iteration and end_iteration must be paired.
>
> - There can be only one iteration per tree at a time.  Nested iteration
>   over the same tree is not supported (abort).
>
> - No GC may happen in step 2.  This is because mark_buffer iterates over
>   buffer overlays.
>
> I think this is an exceedingly dangerous design.

Time has passed, and I think this is no longer relevant.
I'm therefore closing this bug.





^ permalink raw reply	[flat|nested] 68+ messages in thread

end of thread, other threads:[~2023-10-06 13:14 UTC | newest]

Thread overview: 68+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-29  5:29 bug#58158: 29.0.50; [overlay] Interval tree iteration considered harmful Gerd Möllmann
2022-09-29  6:28 ` Eli Zaretskii
2022-09-29  7:03   ` Gerd Möllmann
2022-09-29  8:08     ` Eli Zaretskii
2022-09-29  9:09       ` Gerd Möllmann
2022-09-29  9:37         ` Eli Zaretskii
2022-09-29 10:05           ` Gerd Möllmann
2022-09-29 10:43             ` Eli Zaretskii
2022-09-29 11:33               ` Gerd Möllmann
2022-09-29 13:10               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-09-29 13:23                 ` Eli Zaretskii
2022-09-29 16:48                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-09-29 13:40                 ` Eli Zaretskii
2022-09-29 14:15                 ` Gerd Möllmann
2022-09-29 14:37                   ` Gerd Möllmann
2022-09-29 22:09                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-09-30  5:28                       ` Gerd Möllmann
2022-09-30  6:11                         ` Eli Zaretskii
2022-09-30 11:31                           ` Gerd Möllmann
2022-09-30 18:29                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-10-02  8:06                               ` Gerd Möllmann
2022-10-06 22:36                           ` Dmitry Gutov
2022-10-07 19:47                             ` Eli Zaretskii
2022-10-08 18:50                               ` Dmitry Gutov
2022-10-10  8:10                                 ` Eli Zaretskii
2022-10-11  2:12                                   ` Dmitry Gutov
2022-10-11  6:37                                     ` Eli Zaretskii
2022-10-11 11:36                                       ` xref-query-replace-in-results error message after xref-find-definitions, was: " Dmitry Gutov
2022-10-11 11:51                                         ` Eli Zaretskii
2022-10-11 12:10                                           ` Dmitry Gutov
2022-10-11 12:17                                             ` Eli Zaretskii
2022-10-11 12:44                                               ` Dmitry Gutov
2022-10-11 12:55                                                 ` Eli Zaretskii
2022-10-11 14:55                                                   ` Dmitry Gutov
2022-10-11 16:01                                                     ` Eli Zaretskii
2022-10-11 16:41                                                       ` Dmitry Gutov
2022-10-11 16:50                                                         ` Eli Zaretskii
2022-10-11 20:31                                                           ` Dmitry Gutov
2022-10-12  5:17                                                             ` Eli Zaretskii
2022-10-12 10:06                                                               ` John Yates
2022-10-12 10:17                                                                 ` Eli Zaretskii
     [not found]                                                                   ` <CAJnXXogKsM=gMTFi2NivDMHW4A3EBtBtsNDBV3o5vcu2xXfuvw@mail.gmail.com>
2022-10-12 13:21                                                                     ` Eli Zaretskii
2022-10-12 16:12                                                                       ` John Yates
2022-10-12 13:47                                                               ` Dmitry Gutov
2022-10-12 14:05                                                                 ` Eli Zaretskii
2022-10-11 14:04                                           ` Stefan Monnier
2022-10-11 14:07                                             ` Stefan Monnier
2022-10-11 15:07                                               ` Dmitry Gutov
2022-10-11 15:37                                             ` Eli Zaretskii
2022-09-30 13:25                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-09-30 14:08                           ` Gerd Möllmann
2022-09-30 15:25                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-09-30 16:04                               ` Eli Zaretskii
2022-09-30 17:11                                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-10-01  5:06                               ` Gerd Möllmann
2022-10-01 13:54                                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-10-02  8:22                                   ` Gerd Möllmann
2022-10-02 16:32                                     ` Andreas Politz
2022-10-03  4:35                                       ` Gerd Möllmann
2022-10-04 10:50                                         ` Andreas Politz
2022-10-01  7:25                               ` Gerd Möllmann
2022-10-01 10:55                                 ` Gerd Möllmann
2022-10-01 14:01                                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-09-29 16:40                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-10-01  1:57         ` Richard Stallman
2022-10-01  7:00           ` Eli Zaretskii
2022-10-06 22:26 ` Matt Armstrong
2023-10-06 13:14 ` Gerd Möllmann

Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.