From: Yuan Fu <casouri@gmail.com>
To: Dmitry Gutov <dmitry@gutov.dev>
Cc: JD Smith <jdtsmith@gmail.com>, emacs-devel@gnu.org
Subject: Re: Tree-sitter navigation time grows as sqrt(line-number)
Date: Sat, 19 Aug 2023 19:01:25 -0700
Message-ID: <BC6822FB-E9A4-42E6-8306-4B4F841834CE@gmail.com>
In-Reply-To: <f7da2839-3ffb-9aaa-3f1b-85ffbe1f1c4d@gutov.dev>
> On Aug 19, 2023, at 5:39 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
>
> On 20/08/2023 03:18, JD Smith wrote:
>> Great, thanks. I tried this patch out, and there is indeed about a 10x improvement. Check the bottom of the gist. That said, node_parent remains 10x faster yet (at worst, in a long file), so maybe there’s room for further improvement?
>
> Similarly, I also see an improvement from Yuan's patch in my testing (about 2x), while the patch with ts_node_parent remains the fastest anyway. My test looks like this:
>
> (benchmark 1000 '(treesit-node-parent n))
>
> I looked around for the reasons for the difference. Built the latest tree-sitter (didn't help) and found these two threads on GH:
>
> https://github.com/tree-sitter/tree-sitter/issues/567#issuecomment-595564171 - Max Brunsfeld says "There is some caching done in that method to make sure it performs well in the common case of walking repeatedly up the tree", but I haven't found where said caching resides so far.
>
> https://github.com/tree-sitter/tree-sitter/discussions/878 - mentions that mixing the cursor and direct node APIs leads to suboptimal results, and that using only the former gives an improvement. There is no "good" code example in there.
>
> > May be worth looking at how others are doing it, e.g. the python API.
>
> Apparently they have both a wrapper for a cursor API, and node_get_parent which is implemented using ts_node_parent: https://github.com/tree-sitter/py-tree-sitter/issues/34
>
> Leaving it to the callers to choose which one to use.
Ok, I fiddled around a bit more, and the attached patch (it applies to master) should make the speed comparable to that of the ts_node_parent version.
Yuan
[-- Attachment #2: node-parent.patch --]
[-- Type: application/octet-stream, Size: 2787 bytes --]
From 21d3e612d1d6819278621b956629f6c28a324145 Mon Sep 17 00:00:00 2001
From: Yuan Fu <casouri@gmail.com>
Date: Sat, 19 Aug 2023 15:04:20 -0700
Subject: [PATCH] Improve performance of treesit_cursor_helper_1
* src/treesit.c: (treesit_cursor_helper_1): Use
ts_tree_cursor_goto_first_child_for_byte to speed up traversing among
siblings. The "while (ts_node_end_byte (cursor_node) < end_pos)" can
be removed with the check added in the loop below.
---
src/treesit.c | 22 +++++++++-------------
1 file changed, 9 insertions(+), 13 deletions(-)
diff --git a/src/treesit.c b/src/treesit.c
index 1f694e47201..1017c64f899 100644
--- a/src/treesit.c
+++ b/src/treesit.c
@@ -3023,7 +3023,8 @@ treesit_assume_true (bool val)
limit. */
static bool
treesit_cursor_helper_1 (TSTreeCursor *cursor, TSNode *target,
- uint32_t end_pos, ptrdiff_t limit)
+ uint32_t start_pos, uint32_t end_pos,
+ ptrdiff_t limit)
{
if (limit <= 0)
return false;
@@ -3032,23 +3033,17 @@ treesit_cursor_helper_1 (TSTreeCursor *cursor, TSNode *target,
if (ts_node_eq (cursor_node, *target))
return true;
- if (!ts_tree_cursor_goto_first_child (cursor))
+ if (ts_tree_cursor_goto_first_child_for_byte (cursor, start_pos) == -1)
return false;
- /* Skip nodes that definitely don't contain TARGET. */
- while (ts_node_end_byte (cursor_node) < end_pos)
- {
- if (!ts_tree_cursor_goto_next_sibling (cursor))
- break;
- cursor_node = ts_tree_cursor_current_node (cursor);
- }
-
/* Go through each sibling that could contain TARGET. Because of
missing nodes (their width is 0), there could be multiple
siblings that could contain TARGET. */
while (ts_node_start_byte (cursor_node) <= end_pos)
{
- if (treesit_cursor_helper_1 (cursor, target, end_pos, limit - 1))
+ if (ts_node_end_byte (cursor_node) >= end_pos
+ && treesit_cursor_helper_1 (cursor, target, start_pos, end_pos,
+ limit - 1))
return true;
if (!ts_tree_cursor_goto_next_sibling (cursor))
@@ -3080,11 +3075,12 @@ treesit_cursor_helper_1 (TSTreeCursor *cursor, TSNode *target,
static bool
treesit_cursor_helper (TSTreeCursor *cursor, TSNode node, Lisp_Object parser)
{
+ uint32_t start_pos = ts_node_start_byte (node);
uint32_t end_pos = ts_node_end_byte (node);
TSNode root = ts_tree_root_node (XTS_PARSER (parser)->tree);
*cursor = ts_tree_cursor_new (root);
- bool success = treesit_cursor_helper_1 (cursor, &node, end_pos,
- TREESIT_RECURSION_LIMIT);
+ bool success = treesit_cursor_helper_1 (cursor, &node, start_pos,
+ end_pos, TREESIT_RECURSION_LIMIT);
if (!success)
ts_tree_cursor_delete (cursor);
return success;
--
2.41.0
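As a reading aid for the patch above, here is a minimal, self-contained C sketch of the same search strategy on a toy tree: jump straight to the first child that can reach the target's start position, then walk the siblings and descend only into those whose range covers the target's end. Everything named toy_* is hypothetical and exists only for this illustration; it is not tree-sitter's or Emacs's code. In particular, the ">= byte" comparison in the toy helper is an assumption made so that zero-width nodes are not skipped, not a statement about how ts_tree_cursor_goto_first_child_for_byte behaves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a syntax node: a byte range plus ordered children.
   Hypothetical type, not part of tree-sitter or Emacs.  */
struct toy_node
{
  uint32_t start, end;
  const struct toy_node *children;
  size_t n_children;
};

/* Return the index of the first child of NODE whose range ends at or
   after BYTE, or -1 if there is none.  Assumed semantics, chosen for
   this sketch only.  */
static ptrdiff_t
toy_first_child_for_byte (const struct toy_node *node, uint32_t byte)
{
  for (size_t i = 0; i < node->n_children; i++)
    if (node->children[i].end >= byte)
      return (ptrdiff_t) i;
  return -1;
}

/* Search the subtree rooted at NODE for TARGET (compared by address).
   *STEPS counts the nodes actually visited; LIMIT bounds the depth.  */
static bool
toy_find (const struct toy_node *node, const struct toy_node *target,
          size_t *steps, int limit)
{
  assert (node != NULL && target != NULL && steps != NULL);
  if (limit <= 0)
    return false;
  ++*steps;
  if (node == target)
    return true;

  /* Skip every child that ends before TARGET starts.  */
  ptrdiff_t i = toy_first_child_for_byte (node, target->start);
  if (i < 0)
    return false;

  /* Several siblings may qualify because zero-width nodes share a
     position with their neighbors, so keep going until a sibling
     starts after TARGET ends.  */
  for (; (size_t) i < node->n_children
         && node->children[i].start <= target->end; i++)
    {
      const struct toy_node *child = &node->children[i];
      if (child->end >= target->end
          && toy_find (child, target, steps, limit - 1))
        return true;
    }
  return false;
}

/* Sample tree: a root spanning bytes 0..30 with three children; the
   last child holds a zero-width node between two leaves.  */
static const struct toy_node toy_c_kids[] = {
  { 20, 25, NULL, 0 },
  { 25, 25, NULL, 0 },   /* zero-width, like a missing node */
  { 25, 30, NULL, 0 },
};
static const struct toy_node toy_top[] = {
  { 0, 10, NULL, 0 },
  { 10, 20, NULL, 0 },
  { 20, 30, toy_c_kids, 3 },
};
static const struct toy_node toy_root = { 0, 30, toy_top, 3 };
```

With steps initialized to 0, calling toy_find (&toy_root, &toy_c_kids[2], &steps, 100) finds the last leaf after visiting three nodes (the root, its third child, and the leaf); the first two children of the root are never entered.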