all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
* bug#60054: 29.0.60; Infinite loop when there are cyclic path in the parse tree
@ 2022-12-14  0:11 Yuan Fu
  2022-12-14 12:08 ` Eli Zaretskii
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Yuan Fu @ 2022-12-14  0:11 UTC (permalink / raw)
  To: 60054


This is not really an Emacs bug, but either tree-siter-c or
tree-sitter’s. I’m putting it out here so that if I’m hit by a bus
tomorrow, and treesit-search-forward-goto and friends hang, 
we (eh, you) know what’s going on.

I’ve submitted an issue here:
https://github.com/tree-sitter/tree-sitter-c/issues/119

So far, I’ve only observed this in that specific edge case.

Yuan




^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#60054: 29.0.60; Infinite loop when there are cyclic path in the parse tree
  2022-12-14  0:11 bug#60054: 29.0.60; Infinite loop when there are cyclic path in the parse tree Yuan Fu
@ 2022-12-14 12:08 ` Eli Zaretskii
  2022-12-14 20:27   ` Yuan Fu
  2022-12-16  1:14 ` Yuan Fu
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 8+ messages in thread
From: Eli Zaretskii @ 2022-12-14 12:08 UTC (permalink / raw)
  To: Yuan Fu; +Cc: 60054

> From: Yuan Fu <casouri@gmail.com>
> Date: Tue, 13 Dec 2022 16:11:01 -0800
> 
> 
> This is not really an Emacs bug, but either tree-siter-c or
> tree-sitter’s. I’m putting it out here so that if I’m hit by a bus
> tomorrow, and treesit-search-forward-goto and friends hang, 
> we (eh, you) know what’s going on.
> 
> I’ve submitted an issue here:
> https://github.com/tree-sitter/tree-sitter-c/issues/119
> 
> So far, I’ve only observed this in that specific edge case.

We should have protection against that, which should be easy, right?





^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#60054: 29.0.60; Infinite loop when there are cyclic path in the parse tree
  2022-12-14 12:08 ` Eli Zaretskii
@ 2022-12-14 20:27   ` Yuan Fu
  2022-12-15  6:16     ` Eli Zaretskii
  0 siblings, 1 reply; 8+ messages in thread
From: Yuan Fu @ 2022-12-14 20:27 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 60054



> On Dec 14, 2022, at 4:08 AM, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Yuan Fu <casouri@gmail.com>
>> Date: Tue, 13 Dec 2022 16:11:01 -0800
>> 
>> 
>> This is not really an Emacs bug, but either tree-siter-c or
>> tree-sitter’s. I’m putting it out here so that if I’m hit by a bus
>> tomorrow, and treesit-search-forward-goto and friends hang, 
>> we (eh, you) know what’s going on.
>> 
>> I’ve submitted an issue here:
>> https://github.com/tree-sitter/tree-sitter-c/issues/119
>> 
>> So far, I’ve only observed this in that specific edge case.
> 
> We should have protection against that, which should be easy, right?

Just to make sure, we want to use something like slow-fast pointers, where we have two pointers, and one goes twice as fast, right? That’s the one I was taught in school :-)

The author advices to use cursors for traversing the tree, as cursors doesn’t have this bug. He also advised against using ts_node_parent and said that they could be removed in the future. I didn’t use cursors in the first place because they can’t traverse the tree backwards, ie, no equivalent of ts_node_prev_sibling, and the performance difference is not significant in Emacs settings. But I just went to look at the source, and it seems ts_node_prev_siblings(node) is implemented by just iterating each children from first to last, until it finds the child just before NODE, LOL[1]. I can do similar things in treesit.c with cursors. By doing that we can fix this problem and be future-proof.

In summary, I’m proposing: 
1. I add the slow-fast pointer checks in treesit.c and treesit.el
2. I replace ts_node_parent/sibling/child with using cursors in tree-traversal functions in treesit.c.

Yuan

[1]

static inline TSNode ts_node__prev_sibling(TSNode self, bool include_anonymous) {
  Subtree self_subtree = ts_node__subtree(self);
  bool self_is_empty = ts_subtree_total_bytes(self_subtree) == 0;
  uint32_t target_end_byte = ts_node_end_byte(self);

  TSNode node = ts_node_parent(self);
  TSNode earlier_node = ts_node__null();
  bool earlier_node_is_relevant = false;

  while (!ts_node_is_null(node)) {
    TSNode earlier_child = ts_node__null();
    bool earlier_child_is_relevant = false;
    bool found_child_containing_target = false;

    TSNode child;
    NodeChildIterator iterator = ts_node_iterate_children(&node);
    while (ts_node_child_iterator_next(&iterator, &child)) {
      if (child.id == self.id) break;
      if (iterator.position.bytes > target_end_byte) {
        found_child_containing_target = true;
        break;
      }

      if (iterator.position.bytes == target_end_byte &&
          (!self_is_empty ||
           ts_subtree_has_trailing_empty_descendant(ts_node__subtree(child), self_subtree))) {
        found_child_containing_target = true;
        break;
      }

      if (ts_node__is_relevant(child, include_anonymous)) {
        earlier_child = child;
        earlier_child_is_relevant = true;
      } else if (ts_node__relevant_child_count(child, include_anonymous) > 0) {
        earlier_child = child;
        earlier_child_is_relevant = false;
      }
    }

    if (found_child_containing_target) {
      if (!ts_node_is_null(earlier_child)) {
        earlier_node = earlier_child;
        earlier_node_is_relevant = earlier_child_is_relevant;
      }
      node = child;
    } else if (earlier_child_is_relevant) {
      return earlier_child;
    } else if (!ts_node_is_null(earlier_child)) {
      node = earlier_child;
    } else if (earlier_node_is_relevant) {
      return earlier_node;
    } else {
      node = earlier_node;
    }
  }

  return ts_node__null();
}






^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#60054: 29.0.60; Infinite loop when there are cyclic path in the parse tree
  2022-12-14 20:27   ` Yuan Fu
@ 2022-12-15  6:16     ` Eli Zaretskii
  0 siblings, 0 replies; 8+ messages in thread
From: Eli Zaretskii @ 2022-12-15  6:16 UTC (permalink / raw)
  To: Yuan Fu; +Cc: 60054

> From: Yuan Fu <casouri@gmail.com>
> Date: Wed, 14 Dec 2022 12:27:58 -0800
> Cc: 60054@debbugs.gnu.org
> 
> >> https://github.com/tree-sitter/tree-sitter-c/issues/119
> >> 
> >> So far, I’ve only observed this in that specific edge case.
> > 
> > We should have protection against that, which should be easy, right?
> 
> Just to make sure, we want to use something like slow-fast pointers, where we have two pointers, and one goes twice as fast, right? That’s the one I was taught in school :-)

No, I mean protect us from inflooping by checking that the parent of a
node is not the node itself.

> In summary, I’m proposing: 
> 1. I add the slow-fast pointer checks in treesit.c and treesit.el
> 2. I replace ts_node_parent/sibling/child with using cursors in tree-traversal functions in treesit.c.

SGTM, thanks.





^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#60054: 29.0.60; Infinite loop when there are cyclic path in  the parse tree
  2022-12-14  0:11 bug#60054: 29.0.60; Infinite loop when there are cyclic path in the parse tree Yuan Fu
  2022-12-14 12:08 ` Eli Zaretskii
@ 2022-12-16  1:14 ` Yuan Fu
  2022-12-17 23:28 ` Yuan Fu
  2022-12-18  8:10 ` Yuan Fu
  3 siblings, 0 replies; 8+ messages in thread
From: Yuan Fu @ 2022-12-16  1:14 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 60054


Eli Zaretskii <eliz@gnu.org> writes:

>> From: Yuan Fu <casouri@gmail.com>
>> Date: Wed, 14 Dec 2022 12:27:58 -0800
>> Cc: 60054@debbugs.gnu.org
>> 
>> >> https://github.com/tree-sitter/tree-sitter-c/issues/119
>> >> 
>> >> So far, I’ve only observed this in that specific edge case.
>> > 
>> > We should have protection against that, which should be easy, right?
>> 
>> Just to make sure, we want to use something like slow-fast pointers,
>> where we have two pointers, and one goes twice as fast, right?
>> That’s the one I was taught in school :-)
>
> No, I mean protect us from inflooping by checking that the parent of a
> node is not the node itself.

In this particular case, it is the siblings’ parent that equals to the
node. Ie, node->sibling->parent = node.  If your intention is to protect
us from this particular case, switching to use cursors will avoid this
bug.

Yuan





^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#60054: 29.0.60; Infinite loop when there are cyclic path in  the parse tree
  2022-12-14  0:11 bug#60054: 29.0.60; Infinite loop when there are cyclic path in the parse tree Yuan Fu
  2022-12-14 12:08 ` Eli Zaretskii
  2022-12-16  1:14 ` Yuan Fu
@ 2022-12-17 23:28 ` Yuan Fu
  2022-12-18  6:00   ` Eli Zaretskii
  2022-12-18  8:10 ` Yuan Fu
  3 siblings, 1 reply; 8+ messages in thread
From: Yuan Fu @ 2022-12-17 23:28 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 60054


Yuan Fu <casouri@gmail.com> writes:

> Eli Zaretskii <eliz@gnu.org> writes:
>
>>> From: Yuan Fu <casouri@gmail.com>
>>> Date: Wed, 14 Dec 2022 12:27:58 -0800
>>> Cc: 60054@debbugs.gnu.org
>>> 
>>> >> https://github.com/tree-sitter/tree-sitter-c/issues/119
>>> >> 
>>> >> So far, I’ve only observed this in that specific edge case.
>>> > 
>>> > We should have protection against that, which should be easy, right?
>>> 
>>> Just to make sure, we want to use something like slow-fast pointers,
>>> where we have two pointers, and one goes twice as fast, right?
>>> That’s the one I was taught in school :-)
>>
>> No, I mean protect us from inflooping by checking that the parent of a
>> node is not the node itself.
>
> In this particular case, it is the siblings’ parent that equals to the
> node. Ie, node->sibling->parent = node.  If your intention is to protect
> us from this particular case, switching to use cursors will avoid this
> bug.

Ok, I made the change to use cursor API with tests. Hopefully this is
the last time we need to change treesit.c before release. The
node->sibling->parent = node cyclic path should be fixed by this change,
do you still want checks for it?

Yuan





^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#60054: 29.0.60; Infinite loop when there are cyclic path in  the parse tree
  2022-12-17 23:28 ` Yuan Fu
@ 2022-12-18  6:00   ` Eli Zaretskii
  0 siblings, 0 replies; 8+ messages in thread
From: Eli Zaretskii @ 2022-12-18  6:00 UTC (permalink / raw)
  To: Yuan Fu; +Cc: 60054

> From: Yuan Fu <casouri@gmail.com>
> Date: Sat, 17 Dec 2022 15:28:01 -0800
> Cc: 60054@debbugs.gnu.org
> 
> > In this particular case, it is the siblings’ parent that equals to the
> > node. Ie, node->sibling->parent = node.  If your intention is to protect
> > us from this particular case, switching to use cursors will avoid this
> > bug.
> 
> Ok, I made the change to use cursor API with tests. Hopefully this is
> the last time we need to change treesit.c before release.

This broke the Windows build (I fixed it).  You cannot start using new
tree-sitter functions without adding the boilerplate code for loading
them dynamically from the shared library at run time.

> The node->sibling->parent = node cyclic path should be fixed by this
> change, do you still want checks for it?

If that problem can never happen, there's no need for the checks.

Thanks.





^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#60054: 29.0.60; Infinite loop when there are cyclic path in  the parse tree
  2022-12-14  0:11 bug#60054: 29.0.60; Infinite loop when there are cyclic path in the parse tree Yuan Fu
                   ` (2 preceding siblings ...)
  2022-12-17 23:28 ` Yuan Fu
@ 2022-12-18  8:10 ` Yuan Fu
  3 siblings, 0 replies; 8+ messages in thread
From: Yuan Fu @ 2022-12-18  8:10 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 60054-done


Eli Zaretskii <eliz@gnu.org> writes:

>> From: Yuan Fu <casouri@gmail.com>
>> Date: Sat, 17 Dec 2022 15:28:01 -0800
>> Cc: 60054@debbugs.gnu.org
>> 
>> > In this particular case, it is the siblings’ parent that equals to the
>> > node. Ie, node->sibling->parent = node.  If your intention is to protect
>> > us from this particular case, switching to use cursors will avoid this
>> > bug.
>> 
>> Ok, I made the change to use cursor API with tests. Hopefully this is
>> the last time we need to change treesit.c before release.
>
> This broke the Windows build (I fixed it).  You cannot start using new
> tree-sitter functions without adding the boilerplate code for loading
> them dynamically from the shared library at run time.

Ah right, it evaded my mind, sorry about that.

>> The node->sibling->parent = node cyclic path should be fixed by this
>> change, do you still want checks for it?
>
> If that problem can never happen, there's no need for the checks.

Cool. I’m closing this.

Yuan





^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-12-18  8:10 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-14  0:11 bug#60054: 29.0.60; Infinite loop when there are cyclic path in the parse tree Yuan Fu
2022-12-14 12:08 ` Eli Zaretskii
2022-12-14 20:27   ` Yuan Fu
2022-12-15  6:16     ` Eli Zaretskii
2022-12-16  1:14 ` Yuan Fu
2022-12-17 23:28 ` Yuan Fu
2022-12-18  6:00   ` Eli Zaretskii
2022-12-18  8:10 ` Yuan Fu

Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.