all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: Yuan Fu <casouri@gmail.com>
To: Dmitry Gutov <dmitry@gutov.dev>
Cc: JD Smith <jdtsmith@gmail.com>, emacs-devel@gnu.org
Subject: Re: Tree-sitter navigation time grows as sqrt(line-number)
Date: Sat, 19 Aug 2023 19:01:25 -0700	[thread overview]
Message-ID: <BC6822FB-E9A4-42E6-8306-4B4F841834CE@gmail.com> (raw)
In-Reply-To: <f7da2839-3ffb-9aaa-3f1b-85ffbe1f1c4d@gutov.dev>

[-- Attachment #1: Type: text/plain, Size: 1704 bytes --]



> On Aug 19, 2023, at 5:39 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
> 
> On 20/08/2023 03:18, JD Smith wrote:
>> Great, thanks.  I tried this patch out, and there is indeed about 10x of improvement.  Check the bottom of the gist.  That said, node_parent remains 10x faster yet (at worst, in a long file), so maybe there’s room for further improvement?
> 
> Similarly, I also see an improvement from Yuan's patch in my testing (about 2x), while the patch with ts_node_parent remains the fastest anyway. Where my test looks like this:
> 
>  (benchmark 1000 '(treesit-node-parent n))
> 
> I looked around for the reasons for the difference. Built the latest tree-sitter (didn't help) and found these two threads on GH:
> 
> https://github.com/tree-sitter/tree-sitter/issues/567#issuecomment-595564171 - Max Brunsfield says "There is some caching done in that method to make sure it performs well in the common case of walking repeatedly up the tree", but I haven't found where said caching resides so far.
> 
> https://github.com/tree-sitter/tree-sitter/discussions/878 - mentions that mixing cursor and direct node apis leads to suboptimal results, and just using the former gives an improvement. No "good" code example in there.
> 
> >  May be worth looking at how others are doing it, e.g. the python API.
> 
> Apparently they have both a wrapper for a cursor API, and node_get_parent which is implemented using ts_node_parent: https://github.com/tree-sitter/py-tree-sitter/issues/34
> 
> Leaving it to the callers to choose which one to use.

Ok, I fiddled around a bit more, and this patch (applies to master) should make the speed comparable to ts_node_parent.

Yuan


[-- Attachment #2: node-parent.patch --]
[-- Type: application/octet-stream, Size: 2787 bytes --]

From 21d3e612d1d6819278621b956629f6c28a324145 Mon Sep 17 00:00:00 2001
From: Yuan Fu <casouri@gmail.com>
Date: Sat, 19 Aug 2023 15:04:20 -0700
Subject: [PATCH] Improve performance of treesit_cursor_helper_1

* src/treesit.c: (treesit_cursor_helper_1): Use
ts_tree_cursor_goto_first_child_for_byte to speed up traversing among
siblings.  The "while (ts_node_end_byte (cursor_node) < end_pos)" can
be removed with the check added in the loop below.
---
 src/treesit.c | 22 +++++++++-------------
 1 file changed, 9 insertions(+), 13 deletions(-)

diff --git a/src/treesit.c b/src/treesit.c
index 1f694e47201..1017c64f899 100644
--- a/src/treesit.c
+++ b/src/treesit.c
@@ -3023,7 +3023,8 @@ treesit_assume_true (bool val)
    limit.  */
 static bool
 treesit_cursor_helper_1 (TSTreeCursor *cursor, TSNode *target,
-			 uint32_t end_pos, ptrdiff_t limit)
+			 uint32_t start_pos, uint32_t end_pos,
+			 ptrdiff_t limit)
 {
   if (limit <= 0)
     return false;
@@ -3032,23 +3033,17 @@ treesit_cursor_helper_1 (TSTreeCursor *cursor, TSNode *target,
   if (ts_node_eq (cursor_node, *target))
     return true;
 
-  if (!ts_tree_cursor_goto_first_child (cursor))
+  if (ts_tree_cursor_goto_first_child_for_byte (cursor, start_pos) == -1)
     return false;
 
-  /* Skip nodes that definitely don't contain TARGET.  */
-  while (ts_node_end_byte (cursor_node) < end_pos)
-    {
-      if (!ts_tree_cursor_goto_next_sibling (cursor))
-	break;
-      cursor_node = ts_tree_cursor_current_node (cursor);
-    }
-
   /* Go through each sibling that could contain TARGET.  Because of
      missing nodes (their width is 0), there could be multiple
      siblings that could contain TARGET.  */
   while (ts_node_start_byte (cursor_node) <= end_pos)
     {
-      if (treesit_cursor_helper_1 (cursor, target, end_pos, limit - 1))
+      if (ts_node_end_byte (cursor_node) >= end_pos
+	  && treesit_cursor_helper_1 (cursor, target, start_pos, end_pos,
+				      limit - 1))
 	return true;
 
       if (!ts_tree_cursor_goto_next_sibling (cursor))
@@ -3080,11 +3075,12 @@ treesit_cursor_helper_1 (TSTreeCursor *cursor, TSNode *target,
 static bool
 treesit_cursor_helper (TSTreeCursor *cursor, TSNode node, Lisp_Object parser)
 {
+  uint32_t start_pos = ts_node_start_byte (node);
   uint32_t end_pos = ts_node_end_byte (node);
   TSNode root = ts_tree_root_node (XTS_PARSER (parser)->tree);
   *cursor = ts_tree_cursor_new (root);
-  bool success = treesit_cursor_helper_1 (cursor, &node, end_pos,
-					  TREESIT_RECURSION_LIMIT);
+  bool success = treesit_cursor_helper_1 (cursor, &node, start_pos,
+					  end_pos, TREESIT_RECURSION_LIMIT);
   if (!success)
     ts_tree_cursor_delete (cursor);
   return success;
-- 
2.41.0


[-- Attachment #3: Type: text/plain, Size: 2 bytes --]




  reply	other threads:[~2023-08-20  2:01 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-17  4:01 Tree-sitter navigation time grows as sqrt(line-number) JD Smith
2023-08-17 10:56 ` Dmitry Gutov
2023-08-17 11:41   ` Eli Zaretskii
2023-08-17 11:51     ` tomas
2023-08-17 12:21   ` JD Smith
2023-08-17 12:34     ` Dmitry Gutov
2023-08-17 13:19       ` Dmitry Gutov
2023-08-19 14:24         ` JD Smith
2023-08-19 22:16           ` Yuan Fu
2023-08-20  0:18             ` JD Smith
2023-08-20  0:39               ` Dmitry Gutov
2023-08-20  2:01                 ` Yuan Fu [this message]
2023-08-20 12:40                   ` JD Smith
2023-08-20 20:26                     ` Dmitry Gutov
2023-08-22  1:41                       ` Yuan Fu
2023-08-22 21:07                         ` JD Smith
2023-08-31  4:26                           ` Yuan Fu
2023-08-31  6:03                             ` Eli Zaretskii
2023-08-31 11:04                               ` Dmitry Gutov
2023-08-31 11:42                                 ` Po Lu
2023-08-31 17:32                                   ` Dmitry Gutov
2023-08-31 12:51                                 ` Eli Zaretskii
2023-08-31 13:58                                   ` JD Smith
2023-08-31 17:49                                   ` Dmitry Gutov
2023-08-31 19:03                               ` Yuan Fu
2023-08-31 19:06                                 ` Eli Zaretskii
2023-08-31 20:24                                   ` Stefan Kangas
2023-09-01  5:33                                     ` Eli Zaretskii
2023-08-20  6:18             ` Eli Zaretskii
2023-08-18  3:00 ` Yuan Fu
2023-08-18  4:19   ` JD Smith
2023-08-18  5:20     ` Yuan Fu
2023-08-18 13:21       ` Dmitry Gutov
2023-08-18 13:39       ` JD Smith
     [not found] <87v8ddsqwe.fsf@web.de>
2023-08-17 14:25 ` Dmitry Gutov
2023-08-17 14:36   ` Dmitry Gutov
2023-08-17 15:01     ` Dmitry Gutov
2023-08-17 15:15       ` Felix
2023-08-18  2:49         ` Yuan Fu
2023-08-18 13:29           ` Dmitry Gutov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=BC6822FB-E9A4-42E6-8306-4B4F841834CE@gmail.com \
    --to=casouri@gmail.com \
    --cc=dmitry@gutov.dev \
    --cc=emacs-devel@gnu.org \
    --cc=jdtsmith@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.