unofficial mirror of bug-gnu-emacs@gnu.org 
From: Yuan Fu <casouri@gmail.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: 60054@debbugs.gnu.org
Subject: bug#60054: 29.0.60; Infinite loop when there are cyclic path in the parse tree
Date: Wed, 14 Dec 2022 12:27:58 -0800
Message-ID: <DFA8CA29-4CF8-4D92-B07D-D7CFED884684@gmail.com>
In-Reply-To: <83cz8mnq6g.fsf@gnu.org>



> On Dec 14, 2022, at 4:08 AM, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Yuan Fu <casouri@gmail.com>
>> Date: Tue, 13 Dec 2022 16:11:01 -0800
>> 
>> 
>> This is not really an Emacs bug, but either tree-sitter-c’s or
>> tree-sitter’s. I’m putting it out here so that if I’m hit by a bus
>> tomorrow, and treesit-search-forward-goto and friends hang, 
>> we (eh, you) know what’s going on.
>> 
>> I’ve submitted an issue here:
>> https://github.com/tree-sitter/tree-sitter-c/issues/119
>> 
>> So far, I’ve only observed this in that specific edge case.
> 
> We should have protection against that, which should be easy, right?

Just to make sure, we want to use something like slow-fast pointers, where we have two pointers, and one goes twice as fast, right? That’s the one I was taught in school :-)
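For reference, the slow-fast pointer idea can be sketched as below. This is only a minimal illustration on a made-up node type: struct toy_node and parent_chain_has_cycle are hypothetical names, not anything from treesit.c or tree-sitter, and the real check would have to compare tree-sitter nodes rather than raw pointers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a parse-tree node; only the parent link
   matters for this sketch.  */
struct toy_node
{
  struct toy_node *parent;
};

/* Return true if following parent links from START ever revisits a
   node.  SLOW advances one step per iteration and FAST advances two;
   they can only meet again if the chain loops back on itself.  */
bool
parent_chain_has_cycle (struct toy_node *start)
{
  struct toy_node *slow = start;
  struct toy_node *fast = start;

  while (fast != NULL && fast->parent != NULL)
    {
      slow = slow->parent;
      fast = fast->parent->parent;
      if (slow == fast)
        return true;
    }
  /* FAST ran off the top of the tree, so the chain is finite.  */
  return false;
}
```

A traversal function could run a check of this kind while it walks and bail out when the two pointers meet, instead of looping forever.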

The author advises using cursors to traverse the tree, as cursors don’t have this bug. He also advised against using ts_node_parent and said that it could be removed in the future. I didn’t use cursors in the first place because they can’t traverse the tree backwards, i.e., there is no equivalent of ts_node_prev_sibling, and the performance difference is not significant in Emacs settings. But I just went to look at the source, and it seems ts_node_prev_sibling(node) is implemented by just iterating over each child from first to last, until it finds the child just before NODE, LOL[1]. I can do similar things in treesit.c with cursors. By doing that we can fix this problem and be future-proof.

In summary, I’m proposing:
1. I add the slow-fast pointer checks in treesit.c and treesit.el
2. I replace ts_node_parent/sibling/child with cursors in the tree-traversal functions in treesit.c.
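To make item 2 concrete, here is a rough sketch of how a "previous sibling" move could be emulated with a cursor that only goes to the parent, the first child, and the next sibling. Everything in it is a toy: struct toy_node, struct toy_cursor and the cursor_* functions are hypothetical stand-ins written for this example, not the tree-sitter cursor API and not the code that would go into treesit.c.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy tree standing in for a real parse tree.  */
struct toy_node
{
  struct toy_node *parent, *first_child, *next_sibling;
};

/* Hypothetical forward-only cursor: it can move to the parent, the
   first child and the next sibling, but not backwards.  */
struct toy_cursor
{
  struct toy_node *node;
};

bool
cursor_goto_parent (struct toy_cursor *c)
{
  if (c->node->parent == NULL)
    return false;
  c->node = c->node->parent;
  return true;
}

bool
cursor_goto_first_child (struct toy_cursor *c)
{
  if (c->node->first_child == NULL)
    return false;
  c->node = c->node->first_child;
  return true;
}

bool
cursor_goto_next_sibling (struct toy_cursor *c)
{
  if (c->node->next_sibling == NULL)
    return false;
  c->node = c->node->next_sibling;
  return true;
}

/* Move C to the previous sibling of its current node.  Return false
   and leave C where it was if there is no previous sibling.  */
bool
cursor_goto_prev_sibling (struct toy_cursor *c)
{
  struct toy_node *target = c->node;
  size_t index = 0;

  if (!cursor_goto_parent (c))
    return false;               /* The root has no siblings.  */

  /* First pass: count how far TARGET is from the first child.  If
     the tree is inconsistent and TARGET is never reached, give up
     and restore the cursor.  */
  if (!cursor_goto_first_child (c))
    {
      c->node = target;
      return false;
    }
  while (c->node != target)
    {
      if (!cursor_goto_next_sibling (c))
        {
          c->node = target;
          return false;
        }
      index++;
    }
  if (index == 0)
    return false;               /* TARGET is the first child.  */

  /* Second pass: stop one sibling short of TARGET.  */
  cursor_goto_parent (c);
  cursor_goto_first_child (c);
  for (size_t i = 0; i + 1 < index; i++)
    cursor_goto_next_sibling (c);
  return true;
}
```

The cost is linear in the number of siblings, which is the same order as the iteration in [1].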

Yuan

[1]

static inline TSNode ts_node__prev_sibling(TSNode self, bool include_anonymous) {
  Subtree self_subtree = ts_node__subtree(self);
  bool self_is_empty = ts_subtree_total_bytes(self_subtree) == 0;
  uint32_t target_end_byte = ts_node_end_byte(self);

  TSNode node = ts_node_parent(self);
  TSNode earlier_node = ts_node__null();
  bool earlier_node_is_relevant = false;

  while (!ts_node_is_null(node)) {
    TSNode earlier_child = ts_node__null();
    bool earlier_child_is_relevant = false;
    bool found_child_containing_target = false;

    TSNode child;
    NodeChildIterator iterator = ts_node_iterate_children(&node);
    while (ts_node_child_iterator_next(&iterator, &child)) {
      if (child.id == self.id) break;
      if (iterator.position.bytes > target_end_byte) {
        found_child_containing_target = true;
        break;
      }

      if (iterator.position.bytes == target_end_byte &&
          (!self_is_empty ||
           ts_subtree_has_trailing_empty_descendant(ts_node__subtree(child), self_subtree))) {
        found_child_containing_target = true;
        break;
      }

      if (ts_node__is_relevant(child, include_anonymous)) {
        earlier_child = child;
        earlier_child_is_relevant = true;
      } else if (ts_node__relevant_child_count(child, include_anonymous) > 0) {
        earlier_child = child;
        earlier_child_is_relevant = false;
      }
    }

    if (found_child_containing_target) {
      if (!ts_node_is_null(earlier_child)) {
        earlier_node = earlier_child;
        earlier_node_is_relevant = earlier_child_is_relevant;
      }
      node = child;
    } else if (earlier_child_is_relevant) {
      return earlier_child;
    } else if (!ts_node_is_null(earlier_child)) {
      node = earlier_child;
    } else if (earlier_node_is_relevant) {
      return earlier_node;
    } else {
      node = earlier_node;
    }
  }

  return ts_node__null();
}





