* Tree-sitter navigation time grows as sqrt(line-number) @ 2023-08-17 4:01 JD Smith 2023-08-17 10:56 ` Dmitry Gutov 2023-08-18 3:00 ` Yuan Fu 0 siblings, 2 replies; 40+ messages in thread From: JD Smith @ 2023-08-17 4:01 UTC (permalink / raw) To: emacs-devel [-- Attachment #1: Type: text/plain, Size: 1240 bytes --] I recently posted about the high variability of Emacs 29’s tree-sitter navigation performance within a file. I decided to conduct a simple test on a large python file of about 8400 lines to see if I could learn more. The test is as follows: at the start of each line, locate the current syntax node, and starting from it, navigate up to the root node via `treesit-node-parent’. I was surprised to find that the time this takes grows as sqrt(N), for line number N. This leads to performance variability of >100x for code that needs to walk the local syntax tree in large files. Such variability can make performance projections and optimizations for latency-sensitive uses of tree-sitter (e.g. via font-lock) tricky. I’m unclear whether this is fundamental to the tree-sitter parse/tree algorithm, or if the scaling comes from Emacs’ TS implementation. It does vaguely remind me of similar scaling with an old line-numbering algorithm, where lines were always being counted from the beginning of the buffer, so very fast at the front, and very slow near the end. Code and details here: https://gist.github.com/jdtsmith/7fa6263a13559d587abb51827e6ae472 tree-sitter navigation speed test gist.github.com [-- Attachment #2.1: Type: text/html, Size: 4250 bytes --] [-- Attachment #2.2: gist-og-image.png --] [-- Type: image/png, Size: 36122 bytes --] ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-17 4:01 Tree-sitter navigation time grows as sqrt(line-number) JD Smith @ 2023-08-17 10:56 ` Dmitry Gutov 2023-08-17 11:41 ` Eli Zaretskii 2023-08-17 12:21 ` JD Smith 2023-08-18 3:00 ` Yuan Fu 1 sibling, 2 replies; 40+ messages in thread From: Dmitry Gutov @ 2023-08-17 10:56 UTC (permalink / raw) To: JD Smith, emacs-devel On 17/08/2023 07:01, JD Smith wrote: > I’m unclear whether this is fundamental to the tree-sitter parse/tree > algorithm, or if the scaling comes from Emacs’ TS implementation. It > does vaguely remind me of similar scaling with an old line-numbering > algorithm, where lines were always being counted from the beginning of > the buffer, so very fast at the front, and very slow near the end. Have you tried my patch yet? ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-17 10:56 ` Dmitry Gutov @ 2023-08-17 11:41 ` Eli Zaretskii 2023-08-17 11:51 ` tomas 2023-08-17 12:21 ` JD Smith 1 sibling, 1 reply; 40+ messages in thread From: Eli Zaretskii @ 2023-08-17 11:41 UTC (permalink / raw) To: Dmitry Gutov; +Cc: jdtsmith, emacs-devel > Date: Thu, 17 Aug 2023 13:56:46 +0300 > From: Dmitry Gutov <dmitry@gutov.dev> > > On 17/08/2023 07:01, JD Smith wrote: > > It > > does vaguely remind me of similar scaling with an old line-numbering > > algorithm, where lines were always being counted from the beginning of > > the buffer, so very fast at the front, and very slow near the end. Why on earth would someone need to count lines far from the beginning of the buffer? Or have a computer with more than 640KB of memory? ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-17 11:41 ` Eli Zaretskii @ 2023-08-17 11:51 ` tomas 0 siblings, 0 replies; 40+ messages in thread From: tomas @ 2023-08-17 11:51 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Dmitry Gutov, jdtsmith, emacs-devel [-- Attachment #1: Type: text/plain, Size: 792 bytes --] On Thu, Aug 17, 2023 at 02:41:24PM +0300, Eli Zaretskii wrote: > > Date: Thu, 17 Aug 2023 13:56:46 +0300 > > From: Dmitry Gutov <dmitry@gutov.dev> > > > > On 17/08/2023 07:01, JD Smith wrote: > > > It > > > does vaguely remind me of similar scaling with an old line-numbering > > > algorithm, where lines were always being counted from the beginning of > > > the buffer, so very fast at the front, and very slow near the end. > > Why on earth would someone need to count lines far from the beginning > of the buffer? Or have a computer with more than 640KB of memory? Oh, oh, fond memories of Windows around 3.1 where the available editor (notepad) wasn't capable of loading the only C header (<windows.h>) because that one had more than 64K lines... Cheers -- t [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 195 bytes --] ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-17 10:56 ` Dmitry Gutov 2023-08-17 11:41 ` Eli Zaretskii @ 2023-08-17 12:21 ` JD Smith 2023-08-17 12:34 ` Dmitry Gutov 1 sibling, 1 reply; 40+ messages in thread From: JD Smith @ 2023-08-17 12:21 UTC (permalink / raw) To: Dmitry Gutov; +Cc: emacs-devel Thanks for that, but I’m afraid I haven’t yet had time. I provided the test code and target file in the hopes that others could confirm the scaling behavior and then experiment with algorithm tweaks, if anything obvious presented itself. > On Aug 17, 2023, at 6:56 AM, Dmitry Gutov <dmitry@gutov.dev> wrote: > > On 17/08/2023 07:01, JD Smith wrote: >> I’m unclear whether this is fundamental to the tree-sitter parse/tree algorithm, or if the scaling comes from Emacs’ TS implementation. It does vaguely remind me of similar scaling with an old line-numbering algorithm, where lines were always being counted from the beginning of the buffer, so very fast at the front, and very slow near the end. > > Have you tried my patch yet? ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-17 12:21 ` JD Smith @ 2023-08-17 12:34 ` Dmitry Gutov 2023-08-17 13:19 ` Dmitry Gutov 0 siblings, 1 reply; 40+ messages in thread From: Dmitry Gutov @ 2023-08-17 12:34 UTC (permalink / raw) To: JD Smith; +Cc: emacs-devel On 17/08/2023 15:21, JD Smith wrote: > I provided the test code and target file in the hopes that others could confirm the scaling behavior and then experiment with algorithm tweaks, if anything obvious presented itself. I experimented a little bit with benchmarking (treesit-node-parent) calls, and the patch came from that. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-17 12:34 ` Dmitry Gutov @ 2023-08-17 13:19 ` Dmitry Gutov 2023-08-19 14:24 ` JD Smith 0 siblings, 1 reply; 40+ messages in thread From: Dmitry Gutov @ 2023-08-17 13:19 UTC (permalink / raw) To: JD Smith; +Cc: emacs-devel On 17/08/2023 15:34, Dmitry Gutov wrote: > On 17/08/2023 15:21, JD Smith wrote: >> I provided the test code and target file in the hopes that others >> could confirm the scaling behavior and then experiment with algorithm >> tweaks, if anything obvious presented itself. > > I experimented a little bit with benchmarking (treesit-node-parent) > calls, and the patch came from that. In case somebody else here wants to try it: diff --git a/src/treesit.c b/src/treesit.c index 1f694e47201..4b35e5ee2e5 100644 --- a/src/treesit.c +++ b/src/treesit.c @@ -52,6 +52,7 @@ Copyright (C) 2021-2023 Free Software Foundation, Inc. #undef ts_node_named_descendant_for_byte_range #undef ts_node_next_named_sibling #undef ts_node_next_sibling +#undef ts_node_parent #undef ts_node_prev_named_sibling #undef ts_node_prev_sibling #undef ts_node_start_byte @@ -1899,16 +1900,27 @@ DEFUN ("treesit-node-parent", TSNode treesit_node = XTS_NODE (node)->node; Lisp_Object parser = XTS_NODE (node)->parser; - TSTreeCursor cursor; - if (!treesit_cursor_helper (&cursor, treesit_node, parser)) - return return_value; - if (ts_tree_cursor_goto_parent (&cursor)) - { - TSNode parent = ts_tree_cursor_current_node (&cursor); - return_value = make_treesit_node (parser, parent); - } - ts_tree_cursor_delete (&cursor); + if (treesit_node_uptodate_p(node)) + { + TSNode parent = ts_node_parent (treesit_node); + return_value = make_treesit_node (parser, parent); + } + else + { + Lisp_Object parser = XTS_NODE (node)->parser; + TSTreeCursor cursor; + if (!treesit_cursor_helper (&cursor, treesit_node, parser)) + return return_value; + + if (ts_tree_cursor_goto_parent (&cursor)) + { + TSNode parent = ts_tree_cursor_current_node (&cursor); + return_value = make_treesit_node (parser, parent); + } + ts_tree_cursor_delete (&cursor); + } + return return_value; } ^ permalink raw reply related [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-17 13:19 ` Dmitry Gutov @ 2023-08-19 14:24 ` JD Smith 2023-08-19 22:16 ` Yuan Fu 0 siblings, 1 reply; 40+ messages in thread From: JD Smith @ 2023-08-19 14:24 UTC (permalink / raw) To: Dmitry Gutov, emacs-devel Thanks for your patch, Dmitry. I had a chance to test it this morning (the new, non-crashing version). I made a new NS build, with and without the patch. The results are really striking (scroll to bottom): https://gist.github.com/jdtsmith/7fa6263a13559d587abb51827e6ae472 Summary: - Applying the same test above on _axes.py reproduces the earlier emacs-mac/29 results: the time to navigate from the node at line beginning to root starts at under 10µs, but rises as sqrt(N) by ~100x, reaching over 3000µs. - With Dimitry’s patch, it performs much, much better, starting off with similar timing at early positions in the file, but rising no higher than 50µs, scaling much shallower than sqrt(N). I should emphasize this is a new fast machine; I fully expect my old laptop would be much slower (10x ?) than 3ms in files this large, which makes using parent navigation for things like font-lock problematic. The patched version results also make a lot more sense in terms of their similar logarithmic growth as node-at-point, since the method of search for a node at point and for its parent is, as Yuan points at, quite similar. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-19 14:24 ` JD Smith @ 2023-08-19 22:16 ` Yuan Fu 2023-08-20 0:18 ` JD Smith 2023-08-20 6:18 ` Eli Zaretskii 0 siblings, 2 replies; 40+ messages in thread From: Yuan Fu @ 2023-08-19 22:16 UTC (permalink / raw) To: JD Smith; +Cc: Dmitry Gutov, emacs-devel [-- Attachment #1: Type: text/plain, Size: 1580 bytes --] > On Aug 19, 2023, at 7:24 AM, JD Smith <jdtsmith@gmail.com> wrote: > > Thanks for your patch, Dmitry. I had a chance to test it this morning (the new, non-crashing version). I made a new NS build, with and without the patch. The results are really striking (scroll to bottom): > > https://gist.github.com/jdtsmith/7fa6263a13559d587abb51827e6ae472 > > Summary: > > - Applying the same test above on _axes.py reproduces the earlier emacs-mac/29 results: the time to navigate from the node at line beginning to root starts at under 10µs, but rises as sqrt(N) by ~100x, reaching over 3000µs. > > - With Dimitry’s patch, it performs much, much better, starting off with similar timing at early positions in the file, but rising no higher than 50µs, scaling much shallower than sqrt(N). > > I should emphasize this is a new fast machine; I fully expect my old laptop would be much slower (10x ?) than 3ms in files this large, which makes using parent navigation for things like font-lock problematic. > > The patched version results also make a lot more sense in terms of their similar logarithmic growth as node-at-point, since the method of search for a node at point and for its parent is, as Yuan points at, quite similar. I inspected the descending algorithm, and there’s indeed an oversight made by me. Here’s a patch that should fix it. I tested it briefly and it does speeds things up greatly. Thanks for investigating this, JD! I think the patch is relatively safe, so maybe we can push it to emacs-29 instead of master. Yuan [-- Attachment #2: node-parent.patch --] [-- Type: application/octet-stream, Size: 1088 bytes --] From dd20c4449493765c22dd2067ae410490e9f1d1dc Mon Sep 17 00:00:00 2001 From: Yuan Fu <casouri@gmail.com> Date: Sat, 19 Aug 2023 15:04:20 -0700 Subject: [PATCH] Fix treesit_cursor_helper_1 * src/treesit.c (treesit_cursor_helper_1): Skip child nodes that can't contain TARGET when traversing the tree: only traverse down the child node if that node's end is grater or equal to TARGET's end. --- src/treesit.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/treesit.c b/src/treesit.c index 1f694e47201..f9e98244a4f 100644 --- a/src/treesit.c +++ b/src/treesit.c @@ -3048,7 +3048,8 @@ treesit_cursor_helper_1 (TSTreeCursor *cursor, TSNode *target, siblings that could contain TARGET. */ while (ts_node_start_byte (cursor_node) <= end_pos) { - if (treesit_cursor_helper_1 (cursor, target, end_pos, limit - 1)) + if (ts_node_end_byte (cursor_node) >= end_pos + && treesit_cursor_helper_1 (cursor, target, end_pos, limit - 1)) return true; if (!ts_tree_cursor_goto_next_sibling (cursor)) -- 2.41.0 ^ permalink raw reply related [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-19 22:16 ` Yuan Fu @ 2023-08-20 0:18 ` JD Smith 2023-08-20 0:39 ` Dmitry Gutov 2023-08-20 6:18 ` Eli Zaretskii 1 sibling, 1 reply; 40+ messages in thread From: JD Smith @ 2023-08-20 0:18 UTC (permalink / raw) To: Yuan Fu; +Cc: Dmitry Gutov, emacs-devel Great, thanks. I tried this patch out, and there is indeed about 10x of improvement. Check the bottom of the gist. That said, node_parent remains 10x faster yet (at worst, in a long file), so maybe there’s room for further improvement? May be worth looking at how others are doing it, e.g. the python API. Alternatively, have we ruled the seemingly simplest node_parent out prematurely? If the issue is a node being its own parent in some odd trees, wouldn’t a simple check suffice to guard against this rare possibility? > On Aug 19, 2023, at 6:16 PM, Yuan Fu <casouri@gmail.com> wrote: > > > >> On Aug 19, 2023, at 7:24 AM, JD Smith <jdtsmith@gmail.com> wrote: >> >> Thanks for your patch, Dmitry. I had a chance to test it this morning (the new, non-crashing version). I made a new NS build, with and without the patch. The results are really striking (scroll to bottom): >> >> https://gist.github.com/jdtsmith/7fa6263a13559d587abb51827e6ae472 >> >> Summary: >> >> - Applying the same test above on _axes.py reproduces the earlier emacs-mac/29 results: the time to navigate from the node at line beginning to root starts at under 10µs, but rises as sqrt(N) by ~100x, reaching over 3000µs. >> >> - With Dimitry’s patch, it performs much, much better, starting off with similar timing at early positions in the file, but rising no higher than 50µs, scaling much shallower than sqrt(N). >> >> I should emphasize this is a new fast machine; I fully expect my old laptop would be much slower (10x ?) than 3ms in files this large, which makes using parent navigation for things like font-lock problematic. >> >> The patched version results also make a lot more sense in terms of their similar logarithmic growth as node-at-point, since the method of search for a node at point and for its parent is, as Yuan points at, quite similar. > > I inspected the descending algorithm, and there’s indeed an oversight made by me. Here’s a patch that should fix it. I tested it briefly and it does speeds things up greatly. Thanks for investigating this, JD! > > I think the patch is relatively safe, so maybe we can push it to emacs-29 instead of master. > > Yuan > > <node-parent.patch> ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-20 0:18 ` JD Smith @ 2023-08-20 0:39 ` Dmitry Gutov 2023-08-20 2:01 ` Yuan Fu 0 siblings, 1 reply; 40+ messages in thread From: Dmitry Gutov @ 2023-08-20 0:39 UTC (permalink / raw) To: JD Smith, Yuan Fu; +Cc: emacs-devel On 20/08/2023 03:18, JD Smith wrote: > Great, thanks. I tried this patch out, and there is indeed about 10x of improvement. Check the bottom of the gist. That said, node_parent remains 10x faster yet (at worst, in a long file), so maybe there’s room for further improvement? Similarly, I also see an improvement from Yuan's patch in my testing (about 2x), while the patch with ts_node_parent remains the fastest anyway. Where my test looks like this: (benchmark 1000 '(treesit-node-parent n)) I looked around for the reasons for the difference. Built the latest tree-sitter (didn't help) and found these two threads on GH: https://github.com/tree-sitter/tree-sitter/issues/567#issuecomment-595564171 - Max Brunsfield says "There is some caching done in that method to make sure it performs well in the common case of walking repeatedly up the tree", but I haven't found where said caching resides so far. https://github.com/tree-sitter/tree-sitter/discussions/878 - mentions that mixing cursor and direct node apis leads to suboptimal results, and just using the former gives an improvement. No "good" code example in there. > May be worth looking at how others are doing it, e.g. the python API. Apparently they have both a wrapper for a cursor API, and node_get_parent which is implemented using ts_node_parent: https://github.com/tree-sitter/py-tree-sitter/issues/34 Leaving it to the callers to choose which one to use. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-20 0:39 ` Dmitry Gutov @ 2023-08-20 2:01 ` Yuan Fu 2023-08-20 12:40 ` JD Smith 0 siblings, 1 reply; 40+ messages in thread From: Yuan Fu @ 2023-08-20 2:01 UTC (permalink / raw) To: Dmitry Gutov; +Cc: JD Smith, emacs-devel [-- Attachment #1: Type: text/plain, Size: 1704 bytes --] > On Aug 19, 2023, at 5:39 PM, Dmitry Gutov <dmitry@gutov.dev> wrote: > > On 20/08/2023 03:18, JD Smith wrote: >> Great, thanks. I tried this patch out, and there is indeed about 10x of improvement. Check the bottom of the gist. That said, node_parent remains 10x faster yet (at worst, in a long file), so maybe there’s room for further improvement? > > Similarly, I also see an improvement from Yuan's patch in my testing (about 2x), while the patch with ts_node_parent remains the fastest anyway. Where my test looks like this: > > (benchmark 1000 '(treesit-node-parent n)) > > I looked around for the reasons for the difference. Built the latest tree-sitter (didn't help) and found these two threads on GH: > > https://github.com/tree-sitter/tree-sitter/issues/567#issuecomment-595564171 - Max Brunsfield says "There is some caching done in that method to make sure it performs well in the common case of walking repeatedly up the tree", but I haven't found where said caching resides so far. > > https://github.com/tree-sitter/tree-sitter/discussions/878 - mentions that mixing cursor and direct node apis leads to suboptimal results, and just using the former gives an improvement. No "good" code example in there. > > > May be worth looking at how others are doing it, e.g. the python API. > > Apparently they have both a wrapper for a cursor API, and node_get_parent which is implemented using ts_node_parent: https://github.com/tree-sitter/py-tree-sitter/issues/34 > > Leaving it to the callers to choose which one to use. Ok, I fiddled around a bit more, and this patch (applies to master) should make the speed comparable to ts_node_parent. Yuan [-- Attachment #2: node-parent.patch --] [-- Type: application/octet-stream, Size: 2787 bytes --] From 21d3e612d1d6819278621b956629f6c28a324145 Mon Sep 17 00:00:00 2001 From: Yuan Fu <casouri@gmail.com> Date: Sat, 19 Aug 2023 15:04:20 -0700 Subject: [PATCH] Improve performance of treesit_cursor_helper_1 * src/treesit.c: (treesit_cursor_helper_1): Use ts_tree_cursor_goto_first_child_for_byte to speed up traversing among siblings. The "while (ts_node_end_byte (cursor_node) < end_pos)" can be removed with the check added in the loop below. --- src/treesit.c | 22 +++++++++------------- 1 file changed, 9 insertions(+), 13 deletions(-) diff --git a/src/treesit.c b/src/treesit.c index 1f694e47201..1017c64f899 100644 --- a/src/treesit.c +++ b/src/treesit.c @@ -3023,7 +3023,8 @@ treesit_assume_true (bool val) limit. */ static bool treesit_cursor_helper_1 (TSTreeCursor *cursor, TSNode *target, - uint32_t end_pos, ptrdiff_t limit) + uint32_t start_pos, uint32_t end_pos, + ptrdiff_t limit) { if (limit <= 0) return false; @@ -3032,23 +3033,17 @@ treesit_cursor_helper_1 (TSTreeCursor *cursor, TSNode *target, if (ts_node_eq (cursor_node, *target)) return true; - if (!ts_tree_cursor_goto_first_child (cursor)) + if (ts_tree_cursor_goto_first_child_for_byte (cursor, start_pos) == -1) return false; - /* Skip nodes that definitely don't contain TARGET. */ - while (ts_node_end_byte (cursor_node) < end_pos) - { - if (!ts_tree_cursor_goto_next_sibling (cursor)) - break; - cursor_node = ts_tree_cursor_current_node (cursor); - } - /* Go through each sibling that could contain TARGET. Because of missing nodes (their width is 0), there could be multiple siblings that could contain TARGET. */ while (ts_node_start_byte (cursor_node) <= end_pos) { - if (treesit_cursor_helper_1 (cursor, target, end_pos, limit - 1)) + if (ts_node_end_byte (cursor_node) >= end_pos + && treesit_cursor_helper_1 (cursor, target, start_pos, end_pos, + limit - 1)) return true; if (!ts_tree_cursor_goto_next_sibling (cursor)) @@ -3080,11 +3075,12 @@ treesit_cursor_helper_1 (TSTreeCursor *cursor, TSNode *target, static bool treesit_cursor_helper (TSTreeCursor *cursor, TSNode node, Lisp_Object parser) { + uint32_t start_pos = ts_node_start_byte (node); uint32_t end_pos = ts_node_end_byte (node); TSNode root = ts_tree_root_node (XTS_PARSER (parser)->tree); *cursor = ts_tree_cursor_new (root); - bool success = treesit_cursor_helper_1 (cursor, &node, end_pos, - TREESIT_RECURSION_LIMIT); + bool success = treesit_cursor_helper_1 (cursor, &node, start_pos, + end_pos, TREESIT_RECURSION_LIMIT); if (!success) ts_tree_cursor_delete (cursor); return success; -- 2.41.0 [-- Attachment #3: Type: text/plain, Size: 2 bytes --] ^ permalink raw reply related [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-20 2:01 ` Yuan Fu @ 2023-08-20 12:40 ` JD Smith 2023-08-20 20:26 ` Dmitry Gutov 0 siblings, 1 reply; 40+ messages in thread From: JD Smith @ 2023-08-20 12:40 UTC (permalink / raw) To: Yuan Fu; +Cc: Dmitry Gutov, emacs-devel [-- Attachment #1: Type: text/plain, Size: 537 bytes --] Looks like a winner (see below, or the gist)! Thanks all. I do think we should consider a treesit-node-ancestors function that collects all the parent (of parent) nodes in one go into an (emacs) list, since you basically have to descend the whole tree from root to find the 1st parent anyway. Then people who want to know, e.g., “am I in an if block?” can just test node type down the full ancestor list. Of course, also, node-parent-until/while could be re-written to use node-ancestors, for some additional efficiency. [-- Attachment #2: PastedGraphic-1.png --] [-- Type: image/png, Size: 107148 bytes --] [-- Attachment #3: Type: text/plain, Size: 1840 bytes --] > On Aug 19, 2023, at 10:01 PM, Yuan Fu <casouri@gmail.com> wrote: > > > >> On Aug 19, 2023, at 5:39 PM, Dmitry Gutov <dmitry@gutov.dev> wrote: >> >> On 20/08/2023 03:18, JD Smith wrote: >>> Great, thanks. I tried this patch out, and there is indeed about 10x of improvement. Check the bottom of the gist. That said, node_parent remains 10x faster yet (at worst, in a long file), so maybe there’s room for further improvement? >> >> Similarly, I also see an improvement from Yuan's patch in my testing (about 2x), while the patch with ts_node_parent remains the fastest anyway. Where my test looks like this: >> >> (benchmark 1000 '(treesit-node-parent n)) >> >> I looked around for the reasons for the difference. Built the latest tree-sitter (didn't help) and found these two threads on GH: >> >> https://github.com/tree-sitter/tree-sitter/issues/567#issuecomment-595564171 - Max Brunsfield says "There is some caching done in that method to make sure it performs well in the common case of walking repeatedly up the tree", but I haven't found where said caching resides so far. >> >> https://github.com/tree-sitter/tree-sitter/discussions/878 - mentions that mixing cursor and direct node apis leads to suboptimal results, and just using the former gives an improvement. No "good" code example in there. >> >>> May be worth looking at how others are doing it, e.g. the python API. >> >> Apparently they have both a wrapper for a cursor API, and node_get_parent which is implemented using ts_node_parent: https://github.com/tree-sitter/py-tree-sitter/issues/34 >> >> Leaving it to the callers to choose which one to use. > > Ok, I fiddled around a bit more, and this patch (applies to master) should make the speed comparable to ts_node_parent. > > Yuan > > <node-parent.patch> > ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-20 12:40 ` JD Smith @ 2023-08-20 20:26 ` Dmitry Gutov 2023-08-22 1:41 ` Yuan Fu 0 siblings, 1 reply; 40+ messages in thread From: Dmitry Gutov @ 2023-08-20 20:26 UTC (permalink / raw) To: JD Smith, Yuan Fu; +Cc: emacs-devel On 20/08/2023 15:40, JD Smith wrote: > Looks like a winner (see below, or the gist)! Thanks all. Same here, thanks all indeed. > I do think we should consider a treesit-node-ancestors function that collects all the parent (of parent) nodes in one go into an (emacs) list, since you basically have to descend the whole tree from root to find the 1st parent anyway. Then people who want to know, e.g., “am I in an if block?” can just test node type down the full ancestor list. Of course, also, node-parent-until/while could be re-written to use node-ancestors, for some additional efficiency. Should be useful, but the speedup from traversing only once might be negated by the work of allocating the full list of Lisp objects. So it might improve only certain applications. OTOH, node-parent-until/while could be rewritten in a way that only allocates Lisp on-demand. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-20 20:26 ` Dmitry Gutov @ 2023-08-22 1:41 ` Yuan Fu 2023-08-22 21:07 ` JD Smith 0 siblings, 1 reply; 40+ messages in thread From: Yuan Fu @ 2023-08-22 1:41 UTC (permalink / raw) To: Dmitry Gutov; +Cc: JD Smith, emacs-devel > On Aug 20, 2023, at 1:26 PM, Dmitry Gutov <dmitry@gutov.dev> wrote: > > On 20/08/2023 15:40, JD Smith wrote: >> Looks like a winner (see below, or the gist)! Thanks all. > > Same here, thanks all indeed. Let’s run with this patch for sometime. If all goes well, I’ll push to emacs-29. > >> I do think we should consider a treesit-node-ancestors function that collects all the parent (of parent) nodes in one go into an (emacs) list, since you basically have to descend the whole tree from root to find the 1st parent anyway. Then people who want to know, e.g., “am I in an if block?” can just test node type down the full ancestor list. Of course, also, node-parent-until/while could be re-written to use node-ancestors, for some additional efficiency. > > Should be useful, but the speedup from traversing only once might be negated by the work of allocating the full list of Lisp objects. So it might improve only certain applications. > > OTOH, node-parent-until/while could be rewritten in a way that only allocates Lisp on-demand. Yeah, node-parent is already fast, and a tree’s height mostly doesn’t grow higher than, say, 30 levels, so I won’t worry about it until some real scenario pops up. Yuan ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-22 1:41 ` Yuan Fu @ 2023-08-22 21:07 ` JD Smith 2023-08-31 4:26 ` Yuan Fu 0 siblings, 1 reply; 40+ messages in thread From: JD Smith @ 2023-08-22 21:07 UTC (permalink / raw) To: Yuan Fu; +Cc: Dmitry Gutov, emacs-devel > On Aug 21, 2023, at 9:41 PM, Yuan Fu <casouri@gmail.com> wrote: > > > >> On Aug 20, 2023, at 1:26 PM, Dmitry Gutov <dmitry@gutov.dev> wrote: >> >> On 20/08/2023 15:40, JD Smith wrote: >>> Looks like a winner (see below, or the gist)! Thanks all. >> >> Same here, thanks all indeed. > > Let’s run with this patch for sometime. If all goes well, I’ll push to emacs-29. > >> >>> I do think we should consider a treesit-node-ancestors function that collects all the parent (of parent) nodes in one go into an (emacs) list, since you basically have to descend the whole tree from root to find the 1st parent anyway. Then people who want to know, e.g., “am I in an if block?” can just test node type down the full ancestor list. Of course, also, node-parent-until/while could be re-written to use node-ancestors, for some additional efficiency. >> >> Should be useful, but the speedup from traversing only once might be negated by the work of allocating the full list of Lisp objects. So it might improve only certain applications. >> >> OTOH, node-parent-until/while could be rewritten in a way that only allocates Lisp on-demand. > > Yeah, node-parent is already fast, and a tree’s height mostly doesn’t grow higher than, say, 30 levels, so I won’t worry about it until some real scenario pops up. > Sounds good. In the meantime I have discovered treesit-query-capture, which is already very fast and can generate a list of all parents (of a certain type etc.): (let* ((qry (treesit-query-compile 'python '([(argument_list) (parameters) (list) (dictionary) (parenthesized_expression) (subscript)] @ctx))) (n (treesit-node-at (point))) (bg (treesit-node-start n)) (en (treesit-node-end n))) (treesit-query-capture 'python qry bg en t)) It is key to compile the query in advance. It’s actually pretty similar in speed to the parent navigation. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-22 21:07 ` JD Smith @ 2023-08-31 4:26 ` Yuan Fu 2023-08-31 6:03 ` Eli Zaretskii 0 siblings, 1 reply; 40+ messages in thread From: Yuan Fu @ 2023-08-31 4:26 UTC (permalink / raw) To: JD Smith; +Cc: Dmitry Gutov, emacs-devel > On Aug 22, 2023, at 2:07 PM, JD Smith <jdtsmith@gmail.com> wrote: > > > >> On Aug 21, 2023, at 9:41 PM, Yuan Fu <casouri@gmail.com> wrote: >> >> >> >>> On Aug 20, 2023, at 1:26 PM, Dmitry Gutov <dmitry@gutov.dev> wrote: >>> >>> On 20/08/2023 15:40, JD Smith wrote: >>>> Looks like a winner (see below, or the gist)! Thanks all. >>> >>> Same here, thanks all indeed. >> >> Let’s run with this patch for sometime. If all goes well, I’ll push to emacs-29. >> I’ve pushed the patch to emacs-29. Yuan ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-31 4:26 ` Yuan Fu @ 2023-08-31 6:03 ` Eli Zaretskii 2023-08-31 11:04 ` Dmitry Gutov 2023-08-31 19:03 ` Yuan Fu 0 siblings, 2 replies; 40+ messages in thread From: Eli Zaretskii @ 2023-08-31 6:03 UTC (permalink / raw) To: Yuan Fu; +Cc: jdtsmith, dmitry, emacs-devel > From: Yuan Fu <casouri@gmail.com> > Date: Wed, 30 Aug 2023 21:26:58 -0700 > Cc: Dmitry Gutov <dmitry@gutov.dev>, > emacs-devel@gnu.org > > >>> On 20/08/2023 15:40, JD Smith wrote: > >>>> Looks like a winner (see below, or the gist)! Thanks all. > >>> > >>> Same here, thanks all indeed. > >> > >> Let’s run with this patch for sometime. If all goes well, I’ll push to emacs-29. > >> > > I’ve pushed the patch to emacs-29. Thanks, but why emacs-29? Is this a bugfix? ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-31 6:03 ` Eli Zaretskii @ 2023-08-31 11:04 ` Dmitry Gutov 2023-08-31 11:42 ` Po Lu 2023-08-31 12:51 ` Eli Zaretskii 2023-08-31 19:03 ` Yuan Fu 1 sibling, 2 replies; 40+ messages in thread From: Dmitry Gutov @ 2023-08-31 11:04 UTC (permalink / raw) To: Eli Zaretskii, Yuan Fu; +Cc: jdtsmith, emacs-devel On 31/08/2023 09:03, Eli Zaretskii wrote: > Thanks, but why emacs-29? Is this a bugfix? Depending on the POV, O(N^2) performance for certain buffer interactions can be considered a bug. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-31 11:04 ` Dmitry Gutov @ 2023-08-31 11:42 ` Po Lu 2023-08-31 17:32 ` Dmitry Gutov 2023-08-31 12:51 ` Eli Zaretskii 1 sibling, 1 reply; 40+ messages in thread From: Po Lu @ 2023-08-31 11:42 UTC (permalink / raw) To: Dmitry Gutov; +Cc: Eli Zaretskii, Yuan Fu, jdtsmith, emacs-devel Dmitry Gutov <dmitry@gutov.dev> writes: > Depending on the POV, O(N^2) performance for certain buffer > interactions can be considered a bug. But is fixing it is worth the risk? Is the change innocuous enough for the release branch? ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-31 11:42 ` Po Lu @ 2023-08-31 17:32 ` Dmitry Gutov 0 siblings, 0 replies; 40+ messages in thread From: Dmitry Gutov @ 2023-08-31 17:32 UTC (permalink / raw) To: Po Lu; +Cc: Eli Zaretskii, Yuan Fu, jdtsmith, emacs-devel On 31/08/2023 14:42, Po Lu wrote: > Dmitry Gutov<dmitry@gutov.dev> writes: > >> Depending on the POV, O(N^2) performance for certain buffer >> interactions can be considered a bug. > But is fixing it is worth the risk? Is the change innocuous enough for > the release branch? It seems so, yes. The change doesn't increase code complexity, and rather fixes an omission where many nodes weren't outright filtered out during the search. Also, as JD said, packages using tree-sitter are going to be constrained by the lower common denominator of the primitives' performance characteristics. At least some of those will be able to recommend 29.2 (or put it as a dependency). ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-31 11:04 ` Dmitry Gutov 2023-08-31 11:42 ` Po Lu @ 2023-08-31 12:51 ` Eli Zaretskii 2023-08-31 13:58 ` JD Smith 2023-08-31 17:49 ` Dmitry Gutov 1 sibling, 2 replies; 40+ messages in thread From: Eli Zaretskii @ 2023-08-31 12:51 UTC (permalink / raw) To: Dmitry Gutov, Stefan Kangas; +Cc: casouri, jdtsmith, emacs-devel > Date: Thu, 31 Aug 2023 14:04:39 +0300 > Cc: jdtsmith@gmail.com, emacs-devel@gnu.org > From: Dmitry Gutov <dmitry@gutov.dev> > > On 31/08/2023 09:03, Eli Zaretskii wrote: > > Thanks, but why emacs-29? Is this a bugfix? > > Depending on the POV, O(N^2) performance for certain buffer interactions > can be considered a bug. Performance improvement is an enhancement. Unless the performance is very bad, I don't think its place is on the release branch, especially after the major release. Stefan, WDYT? ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-31 12:51 ` Eli Zaretskii @ 2023-08-31 13:58 ` JD Smith 2023-08-31 17:49 ` Dmitry Gutov 1 sibling, 0 replies; 40+ messages in thread From: JD Smith @ 2023-08-31 13:58 UTC (permalink / raw) To: Eli Zaretskii, Stefan Kangas; +Cc: Dmitry Gutov, Yuan Fu, emacs-devel, mickey [-- Attachment #1: Type: text/plain, Size: 1741 bytes --] The scale of the improvement is up to 100x. For me it has been on the edge between “acceptable” and “very bad”. I have the impression that most modes are using treesit-query-capture with compiled queries for performance-sensitive stuff (font lock), which is fast, and hasn’t changed. So those shouldn’t be affected. But any tree-sitter modes developed over the next few years to respect more “containing context” (structural editing, navigation, smarter context highlighting and folding, etc.) will I bet first reach for treesit-node-parent and its friends treesit-parent-while/until, as I did. For example, I believe combobulate[1] is using these frequently (Mickey copied here). These functions are up to 100x slower than necessary on long but still reasonable (10k line) files. Is there a test harness somewhere that exercises treesit commands in large buffers of various different languages? Perhaps a test could be added to “navigate up several generations” from random locations in the buffer and confirm the same nodes are reached. [1] https://github.com/mickeynp/combobulate > On Aug 31, 2023, at 8:51 AM, Eli Zaretskii <eliz@gnu.org> wrote: > >> Date: Thu, 31 Aug 2023 14:04:39 +0300 >> Cc: jdtsmith@gmail.com, emacs-devel@gnu.org >> From: Dmitry Gutov <dmitry@gutov.dev> >> >> On 31/08/2023 09:03, Eli Zaretskii wrote: >>> Thanks, but why emacs-29? Is this a bugfix? >> >> Depending on the POV, O(N^2) performance for certain buffer interactions >> can be considered a bug. > > Performance improvement is an enhancement. Unless the performance is > very bad, I don't think its place is on the release branch, especially > after the major release. > > Stefan, WDYT? [-- Attachment #2: Type: text/html, Size: 2496 bytes --] ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-31 12:51 ` Eli Zaretskii 2023-08-31 13:58 ` JD Smith @ 2023-08-31 17:49 ` Dmitry Gutov 1 sibling, 0 replies; 40+ messages in thread From: Dmitry Gutov @ 2023-08-31 17:49 UTC (permalink / raw) To: Eli Zaretskii, Stefan Kangas; +Cc: casouri, jdtsmith, emacs-devel On 31/08/2023 15:51, Eli Zaretskii wrote: > Unless the performance is > very bad, I don't think its place is on the release branch, especially > after the major release. There was a complaint on Reddit about rust-ts-mode's syntax-highlighting performance is certain large (but not huge) files. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-31 6:03 ` Eli Zaretskii 2023-08-31 11:04 ` Dmitry Gutov @ 2023-08-31 19:03 ` Yuan Fu 2023-08-31 19:06 ` Eli Zaretskii 1 sibling, 1 reply; 40+ messages in thread From: Yuan Fu @ 2023-08-31 19:03 UTC (permalink / raw) To: Eli Zaretskii; +Cc: JD Smith, Dmitry Gutov, emacs-devel > On Aug 30, 2023, at 11:03 PM, Eli Zaretskii <eliz@gnu.org> wrote: > >> From: Yuan Fu <casouri@gmail.com> >> Date: Wed, 30 Aug 2023 21:26:58 -0700 >> Cc: Dmitry Gutov <dmitry@gutov.dev>, >> emacs-devel@gnu.org >> >>>>> On 20/08/2023 15:40, JD Smith wrote: >>>>>> Looks like a winner (see below, or the gist)! Thanks all. >>>>> >>>>> Same here, thanks all indeed. >>>> >>>> Let’s run with this patch for sometime. If all goes well, I’ll push to emacs-29. >>>> >> >> I’ve pushed the patch to emacs-29. > > Thanks, but why emacs-29? Is this a bugfix? The line is a bit blurry for this one, as others have discussed. I pushed to emacs-29 because a while ago you said yes to pushing to emacs-29. Admittedly that was a slightly different patch but the result is the same. Personally I don’t have strong feelings either way. Yuan ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-31 19:03 ` Yuan Fu @ 2023-08-31 19:06 ` Eli Zaretskii 2023-08-31 20:24 ` Stefan Kangas 0 siblings, 1 reply; 40+ messages in thread From: Eli Zaretskii @ 2023-08-31 19:06 UTC (permalink / raw) To: Yuan Fu, Stefan Kangas; +Cc: jdtsmith, dmitry, emacs-devel > From: Yuan Fu <casouri@gmail.com> > Date: Thu, 31 Aug 2023 12:03:49 -0700 > Cc: JD Smith <jdtsmith@gmail.com>, > Dmitry Gutov <dmitry@gutov.dev>, > emacs-devel@gnu.org > > >> I’ve pushed the patch to emacs-29. > > > > Thanks, but why emacs-29? Is this a bugfix? > > The line is a bit blurry for this one, as others have discussed. I pushed to emacs-29 because a while ago you said yes to pushing to emacs-29. Admittedly that was a slightly different patch but the result is the same. Personally I don’t have strong feelings either way. Stefan, WDYT about this? I admit I'm a bit weary, but everyone else seems to think this is a bugfix. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-31 19:06 ` Eli Zaretskii @ 2023-08-31 20:24 ` Stefan Kangas 2023-09-01 5:33 ` Eli Zaretskii 0 siblings, 1 reply; 40+ messages in thread From: Stefan Kangas @ 2023-08-31 20:24 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Yuan Fu, jdtsmith, dmitry, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: > Stefan, WDYT about this? I admit I'm a bit weary, but everyone else > seems to think this is a bugfix. I'm not sure I can be of much help here; I really haven't been following treesitter development very closely. Having looked at the patch, the fix is also slightly less trivial than I had hoped, in the sense that I don't understand it. :-) That said, I do think that stability expectations might be a bit different for a completely new and (arguably) semi-optional feature. Yuan Fu's stated opinion up-thread was that the patch is "relatively safe". If it was up to me, I'd probably keep it on emacs-29, with a readiness to revert if any issues were to crop up. But I won't object either way. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-31 20:24 ` Stefan Kangas @ 2023-09-01 5:33 ` Eli Zaretskii 0 siblings, 0 replies; 40+ messages in thread From: Eli Zaretskii @ 2023-09-01 5:33 UTC (permalink / raw) To: Stefan Kangas; +Cc: casouri, jdtsmith, dmitry, emacs-devel > From: Stefan Kangas <stefankangas@gmail.com> > Date: Thu, 31 Aug 2023 22:24:03 +0200 > Cc: Yuan Fu <casouri@gmail.com>, jdtsmith@gmail.com, dmitry@gutov.dev, > emacs-devel@gnu.org > > Eli Zaretskii <eliz@gnu.org> writes: > > > Stefan, WDYT about this? I admit I'm a bit weary, but everyone else > > seems to think this is a bugfix. > > I'm not sure I can be of much help here; I really haven't been > following treesitter development very closely. Having looked at the > patch, the fix is also slightly less trivial than I had hoped, in the > sense that I don't understand it. :-) > > That said, I do think that stability expectations might be a bit > different for a completely new and (arguably) semi-optional feature. > Yuan Fu's stated opinion up-thread was that the patch is "relatively > safe". > > If it was up to me, I'd probably keep it on emacs-29, with a readiness > to revert if any issues were to crop up. But I won't object either > way. OK, so let's keep it on emacs-29 and watch for any possible problems it could cause. Thanks. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-19 22:16 ` Yuan Fu 2023-08-20 0:18 ` JD Smith @ 2023-08-20 6:18 ` Eli Zaretskii 1 sibling, 0 replies; 40+ messages in thread From: Eli Zaretskii @ 2023-08-20 6:18 UTC (permalink / raw) To: Yuan Fu; +Cc: jdtsmith, dmitry, emacs-devel > From: Yuan Fu <casouri@gmail.com> > Date: Sat, 19 Aug 2023 15:16:12 -0700 > Cc: Dmitry Gutov <dmitry@gutov.dev>, > emacs-devel@gnu.org > > I inspected the descending algorithm, and there’s indeed an oversight made by me. Here’s a patch that should fix it. I tested it briefly and it does speeds things up greatly. Thanks for investigating this, JD! > > I think the patch is relatively safe, so maybe we can push it to emacs-29 instead of master. Yes, please install on emacs-29 (assuming no one comes up with some regression due to it). Thanks. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-17 4:01 Tree-sitter navigation time grows as sqrt(line-number) JD Smith 2023-08-17 10:56 ` Dmitry Gutov @ 2023-08-18 3:00 ` Yuan Fu 2023-08-18 4:19 ` JD Smith 1 sibling, 1 reply; 40+ messages in thread From: Yuan Fu @ 2023-08-18 3:00 UTC (permalink / raw) To: JD Smith; +Cc: emacs-devel > On Aug 16, 2023, at 9:01 PM, JD Smith <jdtsmith@gmail.com> wrote: > > I recently posted about the high variability of Emacs 29’s tree-sitter navigation performance within a file. I decided to conduct a simple test on a large python file of about 8400 lines to see if I could learn more. The test is as follows: at the start of each line, locate the current syntax node, and starting from it, navigate up to the root node via `treesit-node-parent’. > > I was surprised to find that the time this takes grows as sqrt(N), for line number N. This leads to performance variability of >100x for code that needs to walk the local syntax tree in large files. Such variability can make performance projections and optimizations for latency-sensitive uses of tree-sitter (e.g. via font-lock) tricky. > > I’m unclear whether this is fundamental to the tree-sitter parse/tree algorithm, or if the scaling comes from Emacs’ TS implementation. It does vaguely remind me of similar scaling with an old line-numbering algorithm, where lines were always being counted from the beginning of the buffer, so very fast at the front, and very slow near the end. > > Code and details here: > > > https://gist.github.com/jdtsmith/7fa6263a13559d587abb51827e6ae472 I’m not entirely surprised. In the parse tree that tree-sitter generates is a DAG, where the parent node has pointers to the child nodes, but not the other way around. That means to go to the parent node from a child node, what tree-sitter actually does is go down from the root node until it hits the parent node. This process is linear to the height of the tree. Also, getting the node at point isn’t free either. To get the node at point, we actually iterates from the first child node of the root node until reaching one that contains the point, then iterate from the first child node of that node until reaching one that contains the point, etc, until we reach a leaf node. So log(N) time complexity is expected. Theses are fundamental limits of tree-sitter, unless it changes its data structure. I’m not too worried tho, because IIRC the absolute time is very short. The 100x variability doesn’t mean much if the 100x is still very fast. Yuan ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-18 3:00 ` Yuan Fu @ 2023-08-18 4:19 ` JD Smith 2023-08-18 5:20 ` Yuan Fu 0 siblings, 1 reply; 40+ messages in thread From: JD Smith @ 2023-08-18 4:19 UTC (permalink / raw) To: Yuan Fu; +Cc: emacs-devel [-- Attachment #1: Type: text/plain, Size: 3247 bytes --] > On Aug 17, 2023, at 11:00 PM, Yuan Fu <casouri@gmail.com> wrote: >> >> Code and details here: >> >> https://gist.github.com/jdtsmith/7fa6263a13559d587abb51827e6ae472 > > I’m not entirely surprised. In the parse tree that tree-sitter generates is a DAG, where the parent node has pointers to the child nodes, but not the other way around. That means to go to the parent node from a child node, what tree-sitter actually does is go down from the root node until it hits the parent node. This process is linear to the height of the tree. Thanks for this info, very interesting. > to go to the parent node from a child node, what tree-sitter actually does is go down from the root node until it hits the parent node. This process is linear to the height of the tree. Do you mean linear in the width of the tree? I’ve color-coded the plot by tree depth (i.e. how many ancestors a given node has back to root). Or maybe you are thinking of the tree lying on its side (like vundo ;)? > Also, getting the node at point isn’t free either. To get the node at point, we actually iterates from the first child node of the root node until reaching one that contains the point, then iterate from the first child node of that node until reaching one that contains the point, etc, until we reach a leaf node. So log(N) time complexity is expected. I tested node-at-point on the same file, and it is quite fast, with worst case performance only 30µs (vs. 3ms for the full parent navigation), and growing very slowly, between sqrt(log(N)) and log(N). Check the gist for a new figure. Unless I am misunderstanding, for the common case of finding parent of the node at point, it seems the algorithm you describe could be tweaked to work well. I.e. instead of "stop when reaching a leaf node containing point", just "stop when you reach a node containing point who has the original node as a child”? This should give (hypothetical) “parent-of-node-at-point” the same great speed as node-at-point. Then “parent-of-node-at-point-until” could do something quite clever: accumulate parent nodes all the way from root to the child’s direct parent into a list (same low cost, modulo the node storage). Then run the predicate on that list (in reverse), returning the first match. Could of course return that full list too (“ancestors-of-node-at-point”), for other uses. These should all be quite fast compared to a full breadth and depth searching of every nook and cranny in the syntax tree that node-parent seems to do (having no positional information to wield). > I’m not too worried tho, because IIRC the absolute time is very short. The 100x variability doesn’t mean much if the 100x is still very fast. This is on a brand new fast machine, and 3ms is pretty slow for things that need to run dozens of times per keystroke (e.g. font-lock). > These are fundamental limits of tree-sitter, unless it changes its data structure. That is an interesting limitation for sure. It seems that perhaps each node's start..end information can be really helpful here for winnowing the tree, when you are mostly concerned about nodes relevant to and covering particular positions in the buffer. [-- Attachment #2: Type: text/html, Size: 5415 bytes --] ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-18 4:19 ` JD Smith @ 2023-08-18 5:20 ` Yuan Fu 2023-08-18 13:21 ` Dmitry Gutov 2023-08-18 13:39 ` JD Smith 0 siblings, 2 replies; 40+ messages in thread From: Yuan Fu @ 2023-08-18 5:20 UTC (permalink / raw) To: JD Smith; +Cc: emacs-devel >> to go to the parent node from a child node, what tree-sitter actually does is go down from the root node until it hits the parent node. This process is linear to the height of the tree. > > Do you mean linear in the width of the tree? I’ve color-coded the plot by tree depth (i.e. how many ancestors a given node has back to root). Or maybe you are thinking of the tree lying on its side (like vundo ;)? Going down from the root node is proportion to the height of the tree, no? > >> Also, getting the node at point isn’t free either. To get the node at point, we actually iterates from the first child node of the root node until reaching one that contains the point, then iterate from the first child node of that node until reaching one that contains the point, etc, until we reach a leaf node. So log(N) time complexity is expected. > > I tested node-at-point on the same file, and it is quite fast, with worst case performance only 30µs (vs. 3ms for the full parent navigation), and growing very slowly, between sqrt(log(N)) and log(N). Check the gist for a new figure. > > Unless I am misunderstanding, for the common case of finding parent of the node at point, it seems the algorithm you describe could be tweaked to work well. I.e. instead of "stop when reaching a leaf node containing point", just "stop when you reach a node containing point who has the original node as a child”? This should give (hypothetical) “parent-of-node-at-point” the same great speed as node-at-point. I should’ve been more clear. For finding the parent of a node x, we stop at the parent of x. We only go to the leaf node when finding the node at point, which always returns a leaf node. > > Then “parent-of-node-at-point-until” could do something quite clever: accumulate parent nodes all the way from root to the child’s direct parent into a list (same low cost, modulo the node storage). Then run the predicate on that list (in reverse), returning the first match. Could of course return that full list too (“ancestors-of-node-at-point”), for other uses. These should all be quite fast compared to a full breadth and depth searching of every nook and cranny in the syntax tree that node-parent seems to do (having no positional information to wield). Sure, there hasn’t been a clever version because so far no one have complained they can’t find the parent of a node fast enough. Also I doubt the amount of gain we can get with a more clever algorithm. You can try implement one and benchmark it. I’m curious to know the result :-) > >> I’m not too worried tho, because IIRC the absolute time is very short. The 100x variability doesn’t mean much if the 100x is still very fast. > > > This is on a brand new fast machine, and 3ms is pretty slow for things that need to run dozens of times per keystroke (e.g. font-lock). Font-lock doesn’t need to find the parent of a node. So that hasn’t been a problem. In use-cases where finding the parent of a node is useful, 3ms hasn’t been a problem AFAIK. (Well, I guess indentation could benefit from a faster parent-finding function.) > >> These are fundamental limits of tree-sitter, unless it changes its data structure. > > That is an interesting limitation for sure. It seems that perhaps each node's start..end information can be really helpful here for winnowing the tree, when you are mostly concerned about nodes relevant to and covering particular positions in the buffer. > BTW, https://tree-sitter.github.io/tree-sitter/using-parsers says: > • > A TSNode represents a single node in the syntax tree. It tracks its start and end positions in the source code, as well as its relation to other nodes like its parent, siblings and children. I’m not entirely sure what exactly do you have in mind. A node’s start and end position doesn’t really help us finding it. Suppose you have an array of numbers, and you know one of the numbers is 3. How do you find the index of the number 3 in the array? While the documentation say it tracks its relation to other nodes, I think it’s more pedagogical than factual. Behind the scenes, tree-sitter’s own node API does the same thing as I described: it goes from the root node and traverses down. Yuan ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-18 5:20 ` Yuan Fu @ 2023-08-18 13:21 ` Dmitry Gutov 2023-08-18 13:39 ` JD Smith 1 sibling, 0 replies; 40+ messages in thread From: Dmitry Gutov @ 2023-08-18 13:21 UTC (permalink / raw) To: Yuan Fu, JD Smith; +Cc: emacs-devel On 18/08/2023 08:20, Yuan Fu wrote: >>> I’m not too worried tho, because IIRC the absolute time is very short. The 100x variability doesn’t mean much if the 100x is still very fast. >> >> >> This is on a brand new fast machine, and 3ms is pretty slow for things that need to run dozens of times per keystroke (e.g. font-lock). > > Font-lock doesn’t need to find the parent of a node. So that hasn’t been a problem. In use-cases where finding the parent of a node is useful, 3ms hasn’t been a problem AFAIK. (Well, I guess indentation could benefit from a faster parent-finding function.) It could be unrelated, but someone on Reddit complained about the speed of rust-ts-mode's font-lock in a large enough file, and this mode's fontification rules contain treesit-node-parent calls. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-18 5:20 ` Yuan Fu 2023-08-18 13:21 ` Dmitry Gutov @ 2023-08-18 13:39 ` JD Smith 1 sibling, 0 replies; 40+ messages in thread From: JD Smith @ 2023-08-18 13:39 UTC (permalink / raw) To: Yuan Fu; +Cc: emacs-devel [-- Attachment #1: Type: text/plain, Size: 6711 bytes --] > On Aug 18, 2023, at 1:21 AM, Yuan Fu <casouri@gmail.com> wrote: > > >>> to go to the parent node from a child node, what tree-sitter actually does is go down from the root node until it hits the parent node. This process is linear to the height of the tree. >> >> Do you mean linear in the width of the tree? I’ve color-coded the plot by tree depth (i.e. how many ancestors a given node has back to root). Or maybe you are thinking of the tree lying on its side (like vundo ;)? > > Going down from the root node is proportion to the height of the tree, no? > Sure, but as you can see from the plot, depth (“height”) is subdominant to the starting line number. To me that says the width/breadth of the tree must also matter a lot here. >> >>> Also, getting the node at point isn’t free either. To get the node at point, we actually iterates from the first child node of the root node until reaching one that contains the point, then iterate from the first child node of that node until reaching one that contains the point, etc, until we reach a leaf node. So log(N) time complexity is expected. >> >> I tested node-at-point on the same file, and it is quite fast, with worst case performance only 30µs (vs. 3ms for the full parent navigation), and growing very slowly, between sqrt(log(N)) and log(N). Check the gist for a new figure. >> >> Unless I am misunderstanding, for the common case of finding parent of the node at point, it seems the algorithm you describe could be tweaked to work well. I.e. instead of "stop when reaching a leaf node containing point", just "stop when you reach a node containing point who has the original node as a child”? This should give (hypothetical) “parent-of-node-at-point” the same great speed as node-at-point. > > I should’ve been more clear. For finding the parent of a node x, we stop at the parent of x. We only go to the leaf node when finding the node at point, which always returns a leaf node. Thanks for the clarification. But does parent of x make use of the positional information node’s store, or only look for “some node who has a matching child x”? I understood the latter, which would make sense why finding a node’s parent is more expensive than finding mode at point. But this post seems to indicate it does use the bounds to accelerate all searches: https://github.com/tree-sitter/tree-sitter/discussions/2250#discussioncomment-5848411 time complexity of TSNode::* APIs · tree-sitter tree-sitter · Discussion #2250 github.com So I added a plot for the ratio in time taken to find a node’s (single) direct parent, and find the node at point. Parent is more expensive, and the ratio does grow to about 10 at high line number. But better than the 100x growth in time-to-root. So I’d conclude some of the increase in the navigate to root time with line is due to the natural growing tree depth with line number, and some is just being at the “end” of a long list of children at many levels. That said, I can’t understand why finding one parent is not just as cheap as finding a leaf node at point. Maybe I’ll take it up on treesitter to confirm. >> >> Then “parent-of-node-at-point-until” could do something quite clever: accumulate parent nodes all the way from root to the child’s direct parent into a list (same low cost, modulo the node storage). Then run the predicate on that list (in reverse), returning the first match. Could of course return that full list too (“ancestors-of-node-at-point”), for other uses. These should all be quite fast compared to a full breadth and depth searching of every nook and cranny in the syntax tree that node-parent seems to do (having no positional information to wield). > > Sure, there hasn’t been a clever version because so far no one have complained they can’t find the parent of a node fast enough. Also I doubt the amount of gain we can get with a more clever algorithm. You can try implement one and benchmark it. I’m curious to know the result :-) Yeah it seems this is baked into the design of TS. >> >>> I’m not too worried tho, because IIRC the absolute time is very short. The 100x variability doesn’t mean much if the 100x is still very fast. >> >> >> This is on a brand new fast machine, and 3ms is pretty slow for things that need to run dozens of times per keystroke (e.g. font-lock). > > Font-lock doesn’t need to find the parent of a node. So that hasn’t been a problem. In use-cases where finding the parent of a node is useful, 3ms hasn’t been a problem AFAIK. (Well, I guess indentation could benefit from a faster parent-finding function.) And any algorithm that responds to containing syntactic units. For example, I’m working on my indent-bars package to allow it to find tree-sitter “containing nodes of certain types” to alter the bar depth, inside lists, function parameters/argument, etc. >> >>> These are fundamental limits of tree-sitter, unless it changes its data structure. >> >> That is an interesting limitation for sure. It seems that perhaps each node's start..end information can be really helpful here for winnowing the tree, when you are mostly concerned about nodes relevant to and covering particular positions in the buffer. > >> BTW, https://tree-sitter.github.io/tree-sitter/using-parsers says: >> • >> A TSNode represents a single node in the syntax tree. It tracks its start and end positions in the source code, as well as its relation to other nodes like its parent, siblings and children. > > I’m not entirely sure what exactly do you have in mind. A node’s start and end position doesn’t really help us finding it. Suppose you have an array of numbers, and you know one of the numbers is 3. How do you find the index of the number 3 in the array? This was to some degree my confusion over whether node-at-point and node-parent are really that different in terms of how they search the tree. But the start/end very much do help as you described in the node-at-point algorithm, and as mentioned in the post above: it does pretty effective stepping down over a tree to a node by using checks of predecessor ranges I.e. you can skip searching entire branches that do not contain node’s start..end range. > While the documentation say it tracks its relation to other nodes, I think it’s more pedagogical than factual. Behind the scenes, tree-sitter’s own node API does the same thing as I described: it goes from the root node and traverses down. I see, kind of an interesting choice to call that “tracking”. Thanks again for the insights. [-- Attachment #2.1: Type: text/html, Size: 12193 bytes --] [-- Attachment #2.2: 2250.png --] [-- Type: image/png, Size: 48129 bytes --] ^ permalink raw reply [flat|nested] 40+ messages in thread
[parent not found: <87v8ddsqwe.fsf@web.de>]
* Re: Tree-sitter navigation time grows as sqrt(line-number) [not found] <87v8ddsqwe.fsf@web.de> @ 2023-08-17 14:25 ` Dmitry Gutov 2023-08-17 14:36 ` Dmitry Gutov 0 siblings, 1 reply; 40+ messages in thread From: Dmitry Gutov @ 2023-08-17 14:25 UTC (permalink / raw) To: Felix; +Cc: emacs-devel On 17/08/2023 17:08, Felix wrote: > Hi, > > I tried that patch, and it crashes emacs whenever i try to scroll inside > a tree-sitter managed buffer. > > GNU Emacs 30.0.50 (build 22, x86_64-pc-linux-gnu, GTK+ Version 3.24.38, cairo version 1.17.8) of 2023-08-17 > > compiled with pgtk running on wayland. Thanks for testing. Does that happen in any ts mode, or some particular ones? If you run Emacs from the terminal, could you post the error backtrace? (Please keep emacs-devel in Cc). ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-17 14:25 ` Dmitry Gutov @ 2023-08-17 14:36 ` Dmitry Gutov 2023-08-17 15:01 ` Dmitry Gutov 0 siblings, 1 reply; 40+ messages in thread From: Dmitry Gutov @ 2023-08-17 14:36 UTC (permalink / raw) To: Felix; +Cc: emacs-devel On 17/08/2023 17:25, Dmitry Gutov wrote: > On 17/08/2023 17:08, Felix wrote: >> Hi, >> >> I tried that patch, and it crashes emacs whenever i try to scroll inside >> a tree-sitter managed buffer. >> >> GNU Emacs 30.0.50 (build 22, x86_64-pc-linux-gnu, GTK+ Version >> 3.24.38, cairo version 1.17.8) of 2023-08-17 >> >> compiled with pgtk running on wayland. > > Thanks for testing. Does that happen in any ts mode, or some particular > ones? > > If you run Emacs from the terminal, could you post the error backtrace? > > (Please keep emacs-devel in Cc). For my part, I don't see it crash when scrolling (in ruby-ts-mdoe or js-ts-mode, including larges files), but the test scenario by JD does make it crash. When running under GDB, the backtrace simply looks like this: #0 0x00007ffff63bf93d in ts_tree_root_node (self=0x0) at lib/src/tree.c:36 #1 0x00007ffff63a54e5 in ts_node_parent (self=...) at lib/src/node.c:462 #2 0x00005555557f0d17 in Ftreesit_node_parent (node=XIL(0x5555574908c5)) at treesit.c:1906 #3 0x000055555576f62b in eval_sub (form=<optimized out>) at eval.c:2511 #4 0x000055555576f987 in Fsetq (args=<optimized out>) at eval.c:483 #5 0x000055555576f368 in eval_sub (form=<optimized out>) at eval.c:2462 #6 0x000055555576ff75 in Fprogn (body=XIL(0)) at eval.c:436 ... ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-17 14:36 ` Dmitry Gutov @ 2023-08-17 15:01 ` Dmitry Gutov 2023-08-17 15:15 ` Felix 0 siblings, 1 reply; 40+ messages in thread From: Dmitry Gutov @ 2023-08-17 15:01 UTC (permalink / raw) To: Felix; +Cc: emacs-devel [-- Attachment #1: Type: text/plain, Size: 1557 bytes --] On 17/08/2023 17:36, Dmitry Gutov wrote: > On 17/08/2023 17:25, Dmitry Gutov wrote: >> On 17/08/2023 17:08, Felix wrote: >>> Hi, >>> >>> I tried that patch, and it crashes emacs whenever i try to scroll inside >>> a tree-sitter managed buffer. >>> >>> GNU Emacs 30.0.50 (build 22, x86_64-pc-linux-gnu, GTK+ Version >>> 3.24.38, cairo version 1.17.8) of 2023-08-17 >>> >>> compiled with pgtk running on wayland. >> >> Thanks for testing. Does that happen in any ts mode, or some >> particular ones? >> >> If you run Emacs from the terminal, could you post the error backtrace? >> >> (Please keep emacs-devel in Cc). > > For my part, I don't see it crash when scrolling (in ruby-ts-mdoe or > js-ts-mode, including larges files), but the test scenario by JD does > make it crash. > > When running under GDB, the backtrace simply looks like this: > > #0 0x00007ffff63bf93d in ts_tree_root_node (self=0x0) at lib/src/tree.c:36 > #1 0x00007ffff63a54e5 in ts_node_parent (self=...) at lib/src/node.c:462 > #2 0x00005555557f0d17 in Ftreesit_node_parent > (node=XIL(0x5555574908c5)) at treesit.c:1906 > #3 0x000055555576f62b in eval_sub (form=<optimized out>) at eval.c:2511 > #4 0x000055555576f987 in Fsetq (args=<optimized out>) at eval.c:483 > #5 0x000055555576f368 in eval_sub (form=<optimized out>) at eval.c:2462 > #6 0x000055555576ff75 in Fprogn (body=XIL(0)) at eval.c:436 > ... Try this one. I've read the tree-sitter docs a bit more, and also dropped the "else" case because it was unreachable anyway. I doesn't crash anymore here. [-- Attachment #2: treesit-node-parent.diff --] [-- Type: text/x-patch, Size: 1199 bytes --] diff --git a/src/treesit.c b/src/treesit.c index 1f694e47201..46fbacaefe3 100644 --- a/src/treesit.c +++ b/src/treesit.c @@ -52,6 +52,7 @@ Copyright (C) 2021-2023 Free Software Foundation, Inc. #undef ts_node_named_descendant_for_byte_range #undef ts_node_next_named_sibling #undef ts_node_next_sibling +#undef ts_node_parent #undef ts_node_prev_named_sibling #undef ts_node_prev_sibling #undef ts_node_start_byte @@ -1895,21 +1896,15 @@ DEFUN ("treesit-node-parent", treesit_check_node (node); treesit_initialize (); - Lisp_Object return_value = Qnil; - TSNode treesit_node = XTS_NODE (node)->node; Lisp_Object parser = XTS_NODE (node)->parser; - TSTreeCursor cursor; - if (!treesit_cursor_helper (&cursor, treesit_node, parser)) - return return_value; - if (ts_tree_cursor_goto_parent (&cursor)) - { - TSNode parent = ts_tree_cursor_current_node (&cursor); - return_value = make_treesit_node (parser, parent); - } - ts_tree_cursor_delete (&cursor); - return return_value; + TSNode parent = ts_node_parent (treesit_node); + + if (ts_node_is_null (parent)) + return Qnil; + + return make_treesit_node (parser, parent); } DEFUN ("treesit-node-child", ^ permalink raw reply related [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-17 15:01 ` Dmitry Gutov @ 2023-08-17 15:15 ` Felix 2023-08-18 2:49 ` Yuan Fu 0 siblings, 1 reply; 40+ messages in thread From: Felix @ 2023-08-17 15:15 UTC (permalink / raw) To: Felix, Dmitry Gutov; +Cc: emacs-devel Dmitry Gutov <dmitry@gutov.dev> writes: > On 17/08/2023 17:36, Dmitry Gutov wrote: >> On 17/08/2023 17:25, Dmitry Gutov wrote: >>> On 17/08/2023 17:08, Felix wrote: >>>> Hi, >>>> >>>> I tried that patch, and it crashes emacs whenever i try to scroll inside >>>> a tree-sitter managed buffer. >>>> >>>> GNU Emacs 30.0.50 (build 22, x86_64-pc-linux-gnu, GTK+ Version >>>> 3.24.38, cairo version 1.17.8) of 2023-08-17 >>>> >>>> compiled with pgtk running on wayland. >>> >>> Thanks for testing. Does that happen in any ts mode, or some >>> particular ones? >>> >>> If you run Emacs from the terminal, could you post the error backtrace? >>> >>> (Please keep emacs-devel in Cc). >> For my part, I don't see it crash when scrolling (in ruby-ts-mdoe or >> js-ts-mode, including larges files), but the test scenario by JD >> does make it crash. >> When running under GDB, the backtrace simply looks like this: >> #0 0x00007ffff63bf93d in ts_tree_root_node (self=0x0) at >> lib/src/tree.c:36 >> #1 0x00007ffff63a54e5 in ts_node_parent (self=...) at lib/src/node.c:462 >> #2 0x00005555557f0d17 in Ftreesit_node_parent >> (node=XIL(0x5555574908c5)) at treesit.c:1906 >> #3 0x000055555576f62b in eval_sub (form=<optimized out>) at eval.c:2511 >> #4 0x000055555576f987 in Fsetq (args=<optimized out>) at eval.c:483 >> #5 0x000055555576f368 in eval_sub (form=<optimized out>) at eval.c:2462 >> #6 0x000055555576ff75 in Fprogn (body=XIL(0)) at eval.c:436 >> ... > > Try this one. > > I've read the tree-sitter docs a bit more, and also dropped the "else" > case because it was unreachable anyway. I doesn't crash anymore here. > > [2. text/x-patch; treesit-node-parent.diff]... This one doesn't crash while scrolling. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-17 15:15 ` Felix @ 2023-08-18 2:49 ` Yuan Fu 2023-08-18 13:29 ` Dmitry Gutov 0 siblings, 1 reply; 40+ messages in thread From: Yuan Fu @ 2023-08-18 2:49 UTC (permalink / raw) To: Felix; +Cc: Dmitry Gutov, emacs-devel > On Aug 17, 2023, at 8:15 AM, Felix <felix.dick@web.de> wrote: > > > Dmitry Gutov <dmitry@gutov.dev> writes: > >> On 17/08/2023 17:36, Dmitry Gutov wrote: >>> On 17/08/2023 17:25, Dmitry Gutov wrote: >>>> On 17/08/2023 17:08, Felix wrote: >>>>> Hi, >>>>> >>>>> I tried that patch, and it crashes emacs whenever i try to scroll inside >>>>> a tree-sitter managed buffer. >>>>> >>>>> GNU Emacs 30.0.50 (build 22, x86_64-pc-linux-gnu, GTK+ Version >>>>> 3.24.38, cairo version 1.17.8) of 2023-08-17 >>>>> >>>>> compiled with pgtk running on wayland. >>>> >>>> Thanks for testing. Does that happen in any ts mode, or some >>>> particular ones? >>>> >>>> If you run Emacs from the terminal, could you post the error backtrace? >>>> >>>> (Please keep emacs-devel in Cc). >>> For my part, I don't see it crash when scrolling (in ruby-ts-mdoe or >>> js-ts-mode, including larges files), but the test scenario by JD >>> does make it crash. >>> When running under GDB, the backtrace simply looks like this: >>> #0 0x00007ffff63bf93d in ts_tree_root_node (self=0x0) at >>> lib/src/tree.c:36 >>> #1 0x00007ffff63a54e5 in ts_node_parent (self=...) at lib/src/node.c:462 >>> #2 0x00005555557f0d17 in Ftreesit_node_parent >>> (node=XIL(0x5555574908c5)) at treesit.c:1906 >>> #3 0x000055555576f62b in eval_sub (form=<optimized out>) at eval.c:2511 >>> #4 0x000055555576f987 in Fsetq (args=<optimized out>) at eval.c:483 >>> #5 0x000055555576f368 in eval_sub (form=<optimized out>) at eval.c:2462 >>> #6 0x000055555576ff75 in Fprogn (body=XIL(0)) at eval.c:436 >>> ... >> >> Try this one. >> >> I've read the tree-sitter docs a bit more, and also dropped the "else" >> case because it was unreachable anyway. I doesn't crash anymore here. >> >> [2. text/x-patch; treesit-node-parent.diff]... > > This one doesn't crash while scrolling. So I guess the cursor stuff caused the crash. But we shouldn’t fix it by using tree_sitter_node_parent, we purposefully replaced it using the cursor API, see bug#60054. Yuan ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Tree-sitter navigation time grows as sqrt(line-number) 2023-08-18 2:49 ` Yuan Fu @ 2023-08-18 13:29 ` Dmitry Gutov 0 siblings, 0 replies; 40+ messages in thread From: Dmitry Gutov @ 2023-08-18 13:29 UTC (permalink / raw) To: Yuan Fu, Felix; +Cc: emacs-devel On 18/08/2023 05:49, Yuan Fu wrote: > So I guess the cursor stuff caused the crash. Not at all, the crash was entirely the fault of v1 of my patch (where I didn't check whether the returned node is_null. > But we shouldn’t fix it by using tree_sitter_node_parent, we purposefully replaced it using the cursor API, see bug#60054. Yeah, I've found this thread a little after posting the v2 patch. Even so, trying it out might be useful to see whether the built-in ts_node_parent is faster in some specific usage scenarios than what we currently have to use. If that is found to be the case, we'll know the potential direction for improvement, at least. ^ permalink raw reply [flat|nested] 40+ messages in thread
end of thread, other threads:[~2023-09-01 5:33 UTC | newest] Thread overview: 40+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-08-17 4:01 Tree-sitter navigation time grows as sqrt(line-number) JD Smith 2023-08-17 10:56 ` Dmitry Gutov 2023-08-17 11:41 ` Eli Zaretskii 2023-08-17 11:51 ` tomas 2023-08-17 12:21 ` JD Smith 2023-08-17 12:34 ` Dmitry Gutov 2023-08-17 13:19 ` Dmitry Gutov 2023-08-19 14:24 ` JD Smith 2023-08-19 22:16 ` Yuan Fu 2023-08-20 0:18 ` JD Smith 2023-08-20 0:39 ` Dmitry Gutov 2023-08-20 2:01 ` Yuan Fu 2023-08-20 12:40 ` JD Smith 2023-08-20 20:26 ` Dmitry Gutov 2023-08-22 1:41 ` Yuan Fu 2023-08-22 21:07 ` JD Smith 2023-08-31 4:26 ` Yuan Fu 2023-08-31 6:03 ` Eli Zaretskii 2023-08-31 11:04 ` Dmitry Gutov 2023-08-31 11:42 ` Po Lu 2023-08-31 17:32 ` Dmitry Gutov 2023-08-31 12:51 ` Eli Zaretskii 2023-08-31 13:58 ` JD Smith 2023-08-31 17:49 ` Dmitry Gutov 2023-08-31 19:03 ` Yuan Fu 2023-08-31 19:06 ` Eli Zaretskii 2023-08-31 20:24 ` Stefan Kangas 2023-09-01 5:33 ` Eli Zaretskii 2023-08-20 6:18 ` Eli Zaretskii 2023-08-18 3:00 ` Yuan Fu 2023-08-18 4:19 ` JD Smith 2023-08-18 5:20 ` Yuan Fu 2023-08-18 13:21 ` Dmitry Gutov 2023-08-18 13:39 ` JD Smith [not found] <87v8ddsqwe.fsf@web.de> 2023-08-17 14:25 ` Dmitry Gutov 2023-08-17 14:36 ` Dmitry Gutov 2023-08-17 15:01 ` Dmitry Gutov 2023-08-17 15:15 ` Felix 2023-08-18 2:49 ` Yuan Fu 2023-08-18 13:29 ` Dmitry Gutov
Code repositories for project(s) associated with this public inbox https://git.savannah.gnu.org/cgit/emacs.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).