* Treesit Regression In ec4d29c4494f32acf0ff7c5632a1d951d957f084 @ 2023-09-09 19:26 Danny Freeman 2023-09-10 1:15 ` Yuan Fu 0 siblings, 1 reply; 8+ messages in thread From: Danny Freeman @ 2023-09-09 19:26 UTC (permalink / raw) To: emacs-devel; +Cc: Yuan Fu Hello, I believe I have found a regression in the treesit package introduced by commit ec4d29c4494f32acf0ff7c5632a1d951d957f084 I noticed that indentation does not work properly on Emacs master branch in clojure-ts-mode. After doing a git bisect on Emacs source code I landed on the above commit. To reproduce, I build emacs with ec4d29c44 [0][1] and load up clojure-ts-mode from the current main branch (commit 58f3c835 [2]). I open up a new file, say test.clj, and run `M-x clojure-ts-mode`. In that file I can paste the following contents (defn foo [] 1) The buffer should have the following tree structure, which can be seen with M-x treesit-explore-mode <RET> clojure <RET> (source (list_lit close: ( (sym_lit name: (sym_name)) (sym_lit (sym_name)) (vec_lit [ ]) (num_lit) ))) If I place my cursor directly before the 1 character like so, where | is my cursor (defn foo [] |1) and then press <RET>, I would expect the following indentation (defn foo [] |1) Instead I get (defn foo [] |1) and if `treesit--indent-verbose` is set to `t` then I see these messages in *Messages* Matched rule: ((parent-is "source") parent-bol 0) PARENT is nil, not indenting Before this change I would see Matched rule: ((parent-is "source") parent-bol 0) Matched rule: (clojure-ts--match-expression-in-body parent 2) While my cursor is before the 1 character, running M-x eval-expression <RET> (treesit-node-parent (treesit-node-at (point))) will return nil. Before this change, it would return the parent list_lit node as I would expect. Please let me know if you have any question, or if there is something I can do to make this easier to test. I see the commit the git bisect pointed me to contains changes to the C code for treesit. I do not know C well enough to know why this breaks, unfortunately. I can only say something is not quite right. [0] - the configure flags I use to build emacs https://git.sr.ht/~dannyfreeman/emacs/tree/ba4a5f3678bf83a28ce382fc33611e0d7aed2a86/item/flake.nix#L63-81 [1] - The init file I use https://git.sr.ht/~dannyfreeman/emacs/tree/main/item/init/init.el [2] - clojure-ts-mode version used to reproduce this error: https://github.com/clojure-emacs/clojure-ts-mode/commit/58f3c835aeafe0378ab9693731b95096827dbb24 Thank you, -- Danny Freeman ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Treesit Regression In ec4d29c4494f32acf0ff7c5632a1d951d957f084 2023-09-09 19:26 Treesit Regression In ec4d29c4494f32acf0ff7c5632a1d951d957f084 Danny Freeman @ 2023-09-10 1:15 ` Yuan Fu 2023-09-10 4:00 ` Danny Freeman 0 siblings, 1 reply; 8+ messages in thread From: Yuan Fu @ 2023-09-10 1:15 UTC (permalink / raw) To: Danny Freeman; +Cc: emacs-devel > On Sep 9, 2023, at 12:26 PM, Danny Freeman <danny@dfreeman.email> wrote: > > Hello, > > I believe I have found a regression in the treesit package introduced by > commit ec4d29c4494f32acf0ff7c5632a1d951d957f084 > > I noticed that indentation does not work properly on Emacs master branch > in clojure-ts-mode. After doing a git bisect on Emacs source code I > landed on the above commit. > > > > To reproduce, I build emacs with ec4d29c44 [0][1] and load up > clojure-ts-mode from the current main branch (commit 58f3c835 [2]). > > I open up a new file, say test.clj, and run `M-x clojure-ts-mode`. > > In that file I can paste the following contents > > (defn foo [] 1) > > The buffer should have the following tree structure, which can be seen > with M-x treesit-explore-mode <RET> clojure <RET> > > (source > (list_lit close: ( > (sym_lit name: (sym_name)) > (sym_lit (sym_name)) > (vec_lit [ ]) > (num_lit) ))) > > If I place my cursor directly before the 1 character like so, where | is > my cursor > > (defn foo [] |1) > > and then press <RET>, I would expect the following indentation > > (defn foo [] > |1) > > Instead I get > > (defn foo [] > |1) > > and if `treesit--indent-verbose` is set to `t` then I see these messages > in *Messages* > > Matched rule: ((parent-is "source") parent-bol 0) > PARENT is nil, not indenting > > Before this change I would see > > Matched rule: ((parent-is "source") parent-bol 0) > Matched rule: (clojure-ts--match-expression-in-body parent 2) > > While my cursor is before the 1 character, running > > M-x eval-expression <RET> (treesit-node-parent (treesit-node-at (point))) > > will return nil. Before this change, it would return the parent list_lit > node as I would expect. > > > Please let me know if you have any question, or if there is something I > can do to make this easier to test. I see the commit the git bisect > pointed me to contains changes to the C code for treesit. I do not know > C well enough to know why this breaks, unfortunately. I can only say > something is not quite right. > > [0] - the configure flags I use to build emacs > https://git.sr.ht/~dannyfreeman/emacs/tree/ba4a5f3678bf83a28ce382fc33611e0d7aed2a86/item/flake.nix#L63-81 > [1] - The init file I use > https://git.sr.ht/~dannyfreeman/emacs/tree/main/item/init/init.el > [2] - clojure-ts-mode version used to reproduce this error: > https://github.com/clojure-emacs/clojure-ts-mode/commit/58f3c835aeafe0378ab9693731b95096827dbb24 > > Thank you, > -- > Danny Freeman Hmmm, I can’t reproduce this. I’m getting Matched rule: ((parent-is "source") parent-bol 0) Matched rule: (clojure-ts--match-expression-in-body parent 2) The commit you pointed to does change treesit-node-parent, so it does match with parent-is matcher not working. Might be because I recently upgraded to a newer tree-sitter library. If you compile with the newest tree-sitter build (from their repo), do you still see this problem? Yuan ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Treesit Regression In ec4d29c4494f32acf0ff7c5632a1d951d957f084 2023-09-10 1:15 ` Yuan Fu @ 2023-09-10 4:00 ` Danny Freeman 2023-09-10 5:26 ` Yuan Fu 0 siblings, 1 reply; 8+ messages in thread From: Danny Freeman @ 2023-09-10 4:00 UTC (permalink / raw) To: Yuan Fu; +Cc: emacs-devel Yuan Fu <casouri@gmail.com> writes: > > Hmmm, I can’t reproduce this. I’m getting > > Matched rule: ((parent-is "source") parent-bol 0) > Matched rule: (clojure-ts--match-expression-in-body parent 2) > > The commit you pointed to does change treesit-node-parent, so it does match with parent-is matcher not working. > > Might be because I recently upgraded to a newer tree-sitter library. If you compile with the newest tree-sitter build (from their repo), do you still see this problem? Before I was just using whatever tree-sitter version my distro packages. Apparently v0.20.8 (built from the git tag of the same name) I tried compiling with this version https://github.com/tree-sitter/tree-sitter/commit/524bf7e2c664d4a5dbd0c20d4d10f1e58f99e8ce which at the time of writing is the latest, and still experience the same problem. I also made sure to re-compile my grammars. I may have done something wrong though when building. I'm going to try to build using a more traditional process rather than with nix and see if that helps. Thank you, -- Danny Freeman ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Treesit Regression In ec4d29c4494f32acf0ff7c5632a1d951d957f084 2023-09-10 4:00 ` Danny Freeman @ 2023-09-10 5:26 ` Yuan Fu 2023-09-10 7:12 ` Yuan Fu 0 siblings, 1 reply; 8+ messages in thread From: Yuan Fu @ 2023-09-10 5:26 UTC (permalink / raw) To: Danny Freeman; +Cc: emacs-devel > On Sep 9, 2023, at 9:00 PM, Danny Freeman <danny@dfreeman.email> wrote: > > > Yuan Fu <casouri@gmail.com> writes: >> >> Hmmm, I can’t reproduce this. I’m getting >> >> Matched rule: ((parent-is "source") parent-bol 0) >> Matched rule: (clojure-ts--match-expression-in-body parent 2) >> >> The commit you pointed to does change treesit-node-parent, so it does match with parent-is matcher not working. >> >> Might be because I recently upgraded to a newer tree-sitter library. If you compile with the newest tree-sitter build (from their repo), do you still see this problem? > > Before I was just using whatever tree-sitter version my distro packages. > Apparently v0.20.8 (built from the git tag of the same name) > > I tried compiling with this version > https://github.com/tree-sitter/tree-sitter/commit/524bf7e2c664d4a5dbd0c20d4d10f1e58f99e8ce > > which at the time of writing is the latest, and still experience the > same problem. I also made sure to re-compile my grammars. > > I may have done something wrong though when building. I'm > going to try to build using a more traditional process rather than with > nix and see if that helps. > > Thank you, > -- > Danny Freeman I take that back. It is indeed caused by that commit. I just had a wrong Emacs build when testing. In that commit, ts_tree_cursor_goto_first_child_for_byte seem to be the problem. Yuan ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Treesit Regression In ec4d29c4494f32acf0ff7c5632a1d951d957f084 2023-09-10 5:26 ` Yuan Fu @ 2023-09-10 7:12 ` Yuan Fu 2023-09-11 12:41 ` Danny Freeman 0 siblings, 1 reply; 8+ messages in thread From: Yuan Fu @ 2023-09-10 7:12 UTC (permalink / raw) To: Danny Freeman; +Cc: emacs-devel > On Sep 9, 2023, at 10:26 PM, Yuan Fu <casouri@gmail.com> wrote: > > > >> On Sep 9, 2023, at 9:00 PM, Danny Freeman <danny@dfreeman.email> wrote: >> >> >> Yuan Fu <casouri@gmail.com> writes: >>> >>> Hmmm, I can’t reproduce this. I’m getting >>> >>> Matched rule: ((parent-is "source") parent-bol 0) >>> Matched rule: (clojure-ts--match-expression-in-body parent 2) >>> >>> The commit you pointed to does change treesit-node-parent, so it does match with parent-is matcher not working. >>> >>> Might be because I recently upgraded to a newer tree-sitter library. If you compile with the newest tree-sitter build (from their repo), do you still see this problem? >> >> Before I was just using whatever tree-sitter version my distro packages. >> Apparently v0.20.8 (built from the git tag of the same name) >> >> I tried compiling with this version >> https://github.com/tree-sitter/tree-sitter/commit/524bf7e2c664d4a5dbd0c20d4d10f1e58f99e8ce >> >> which at the time of writing is the latest, and still experience the >> same problem. I also made sure to re-compile my grammars. >> >> I may have done something wrong though when building. I'm >> going to try to build using a more traditional process rather than with >> nix and see if that helps. >> >> Thank you, >> -- >> Danny Freeman > > I take that back. It is indeed caused by that commit. I just had a wrong Emacs build when testing. In that commit, ts_tree_cursor_goto_first_child_for_byte seem to be the problem. > > Yuan What I found is that, for some reason, ts_tree_cursor_goto_first_child_for_byte doesn’t always work as expected in clojure [1]. In your example, if I evaluate (treesit-node-parent (treesit-node-at (point))) with point at the beginning of “defn”, “foo”, “[]”, and “1”, only “defn” correctly evaluates to the parent node. I tried this in other modes like c-ts-mode, but things seems to work fine in all the few cases I tried. So now the question is, is there anything tree-sitter-clojure does differently? Not saying it does, but that seems like a starting point. [1] Instead of returning the first child node of NODE that extends beyond POS, ts_tree_cursor_goto_first_child_for_byte (NODE, POS) returns false (meaning not found). Yuan ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Treesit Regression In ec4d29c4494f32acf0ff7c5632a1d951d957f084 2023-09-10 7:12 ` Yuan Fu @ 2023-09-11 12:41 ` Danny Freeman 2023-09-11 22:19 ` Yuan Fu 0 siblings, 1 reply; 8+ messages in thread From: Danny Freeman @ 2023-09-11 12:41 UTC (permalink / raw) To: Yuan Fu; +Cc: emacs-devel Yuan Fu <casouri@gmail.com> writes: >> I take that back. It is indeed caused by that commit. I just had a wrong Emacs build when testing. In that commit, ts_tree_cursor_goto_first_child_for_byte seem to be the problem. Well I'm glad to know I am capable of building Emacs with the latest tree sitter. Not as glad that the problem still exists lol > What I found is that, for some reason, ts_tree_cursor_goto_first_child_for_byte doesn’t always work > as expected in clojure [1]. In your example, if I evaluate (treesit-node-parent (treesit-node-at > (point))) with point at the beginning of “defn”, “foo”, “[]”, and “1”, only “defn” correctly > evaluates to the parent node. I tried this in other modes like c-ts-mode, but things seems to work > fine in all the few cases I tried. > > So now the question is, is there anything tree-sitter-clojure does differently? Not saying it does, but that seems like a starting point. Maybe there is. The grammar has not changed in a couple months, and those things that did change were related to keyword literals, which are not involved in the example I've been using to reproduce this. Even knowing that. I will try to narrow the problem down even further, and see if I can find another grammar the exhibits this behavior. > [1] Instead of returning the first child node of NODE that extends beyond POS, ts_tree_cursor_goto_first_child_for_byte (NODE, POS) returns false (meaning not found). I remember we found an issue with a similarly named function here https://github.com/tree-sitter/tree-sitter/issues/2012 https://debbugs.gnu.org/cgi/bugreport.cgi?bug=60127 You ended up writing a workaround in emacs for this called treesit_cursor_first_child_for_byte I don't think there is overlap between that change and this ec4d29 one. However, maybe there is overlap in the tree-sitter implementation? I'm not sure how to tell right now, but maybe this is an upstream tree-sitter problem that is just brought into the light by the commit in question here. -- Danny Freeman ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Treesit Regression In ec4d29c4494f32acf0ff7c5632a1d951d957f084 2023-09-11 12:41 ` Danny Freeman @ 2023-09-11 22:19 ` Yuan Fu 2023-09-12 3:49 ` Danny Freeman 0 siblings, 1 reply; 8+ messages in thread From: Yuan Fu @ 2023-09-11 22:19 UTC (permalink / raw) To: Danny Freeman; +Cc: emacs-devel > On Sep 11, 2023, at 5:41 AM, Danny Freeman <danny@dfreeman.email> wrote: > > > Yuan Fu <casouri@gmail.com> writes: > >>> I take that back. It is indeed caused by that commit. I just had a wrong Emacs build when testing. In that commit, ts_tree_cursor_goto_first_child_for_byte seem to be the problem. > > Well I'm glad to know I am capable of building Emacs with the latest > tree sitter. Not as glad that the problem still exists lol > > >> What I found is that, for some reason, ts_tree_cursor_goto_first_child_for_byte doesn’t always work >> as expected in clojure [1]. In your example, if I evaluate (treesit-node-parent (treesit-node-at >> (point))) with point at the beginning of “defn”, “foo”, “[]”, and “1”, only “defn” correctly >> evaluates to the parent node. I tried this in other modes like c-ts-mode, but things seems to work >> fine in all the few cases I tried. >> >> So now the question is, is there anything tree-sitter-clojure does differently? Not saying it does, but that seems like a starting point. > > Maybe there is. The grammar has not changed in a couple months, and > those things that did change were related to keyword literals, which are > not involved in the example I've been using to reproduce this. > > Even knowing that. I will try to narrow the problem down even further, > and see if I can find another grammar the exhibits this behavior. > >> [1] Instead of returning the first child node of NODE that extends beyond POS, ts_tree_cursor_goto_first_child_for_byte (NODE, POS) returns false (meaning not found). > > I remember we found an issue with a similarly named function here > > https://github.com/tree-sitter/tree-sitter/issues/2012 > https://debbugs.gnu.org/cgi/bugreport.cgi?bug=60127 > > You ended up writing a workaround in emacs for this called > treesit_cursor_first_child_for_byte > > I don't think there is overlap between that change and this ec4d29 one. > However, maybe there is overlap in the tree-sitter implementation? I'm > not sure how to tell right now, but maybe this is an upstream > tree-sitter problem that is just brought into the light by the commit in > question here. Duh, I can’t believe I completely forgot about this. Yes, this is exactly it, thank you for binging it up. Before tree-sitter can fix treesit_cursor_first_child_for_byte, I’ll just replicate the workaround and we should be golden. I’ve pushed a fix to emacs-29. Once master pulls from emacs-29, the problem should go away. BTW, I should’ve opened a bug report and move the discussion there. But since we are mostly done here, and there’s another bug report to refer to, there’s no need to do that now. Yuan ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Treesit Regression In ec4d29c4494f32acf0ff7c5632a1d951d957f084 2023-09-11 22:19 ` Yuan Fu @ 2023-09-12 3:49 ` Danny Freeman 0 siblings, 0 replies; 8+ messages in thread From: Danny Freeman @ 2023-09-12 3:49 UTC (permalink / raw) To: Yuan Fu; +Cc: emacs-devel Yuan Fu <casouri@gmail.com> writes: > Duh, I can’t believe I completely forgot about this. Yes, this is exactly it, thank you for binging > it up. Before tree-sitter can fix treesit_cursor_first_child_for_byte, I’ll just replicate the > workaround and we should be golden. I’ve pushed a fix to emacs-29. Once master pulls from emacs-29, > the problem should go away. It took me a minute to connect the dots there, but that did seem to be the cause. I just tried it against Emacs 30 with the commit in question cherry picked and it works perfectly. Thank you for fixing it. > BTW, I should’ve opened a bug report and move the discussion there. But since we are mostly done here, and there’s another bug report to refer to, there’s no need to do that now. Understood. I probably should have started with a bug report but for some reason sent my initial email to emacs-devel. I'll be better about that next time I run into something in the core. Thanks again for your help, all the best. -- Danny Freeman ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2023-09-12 3:49 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-09-09 19:26 Treesit Regression In ec4d29c4494f32acf0ff7c5632a1d951d957f084 Danny Freeman 2023-09-10 1:15 ` Yuan Fu 2023-09-10 4:00 ` Danny Freeman 2023-09-10 5:26 ` Yuan Fu 2023-09-10 7:12 ` Yuan Fu 2023-09-11 12:41 ` Danny Freeman 2023-09-11 22:19 ` Yuan Fu 2023-09-12 3:49 ` Danny Freeman
Code repositories for project(s) associated with this external index https://git.savannah.gnu.org/cgit/emacs.git https://git.savannah.gnu.org/cgit/emacs/org-mode.git This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.