* Treesit Regression In ec4d29c4494f32acf0ff7c5632a1d951d957f084
@ 2023-09-09 19:26 Danny Freeman
2023-09-10 1:15 ` Yuan Fu
0 siblings, 1 reply; 8+ messages in thread
From: Danny Freeman @ 2023-09-09 19:26 UTC (permalink / raw)
To: emacs-devel; +Cc: Yuan Fu
Hello,
I believe I have found a regression in the treesit package introduced by
commit ec4d29c4494f32acf0ff7c5632a1d951d957f084
I noticed that indentation does not work properly on Emacs master branch
in clojure-ts-mode. After doing a git bisect on Emacs source code I
landed on the above commit.
To reproduce, I build emacs with ec4d29c44 [0][1] and load up
clojure-ts-mode from the current main branch (commit 58f3c835 [2]).
I open up a new file, say test.clj, and run `M-x clojure-ts-mode`.
In that file I can paste the following contents
(defn foo [] 1)
The buffer should have the following tree structure, which can be seen
with M-x treesit-explore-mode <RET> clojure <RET>
(source
(list_lit close: (
(sym_lit name: (sym_name))
(sym_lit (sym_name))
(vec_lit [ ])
(num_lit) )))
If I place my cursor directly before the 1 character like so, where | is
my cursor
(defn foo [] |1)
and then press <RET>, I would expect the following indentation
(defn foo []
|1)
Instead I get
(defn foo []
|1)
and if `treesit--indent-verbose` is set to `t` then I see these messages
in *Messages*
Matched rule: ((parent-is "source") parent-bol 0)
PARENT is nil, not indenting
Before this change I would see
Matched rule: ((parent-is "source") parent-bol 0)
Matched rule: (clojure-ts--match-expression-in-body parent 2)
While my cursor is before the 1 character, running
M-x eval-expression <RET> (treesit-node-parent (treesit-node-at (point)))
will return nil. Before this change, it would return the parent list_lit
node as I would expect.
Please let me know if you have any question, or if there is something I
can do to make this easier to test. I see the commit the git bisect
pointed me to contains changes to the C code for treesit. I do not know
C well enough to know why this breaks, unfortunately. I can only say
something is not quite right.
[0] - the configure flags I use to build emacs
https://git.sr.ht/~dannyfreeman/emacs/tree/ba4a5f3678bf83a28ce382fc33611e0d7aed2a86/item/flake.nix#L63-81
[1] - The init file I use
https://git.sr.ht/~dannyfreeman/emacs/tree/main/item/init/init.el
[2] - clojure-ts-mode version used to reproduce this error:
https://github.com/clojure-emacs/clojure-ts-mode/commit/58f3c835aeafe0378ab9693731b95096827dbb24
Thank you,
--
Danny Freeman
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Treesit Regression In ec4d29c4494f32acf0ff7c5632a1d951d957f084
2023-09-09 19:26 Treesit Regression In ec4d29c4494f32acf0ff7c5632a1d951d957f084 Danny Freeman
@ 2023-09-10 1:15 ` Yuan Fu
2023-09-10 4:00 ` Danny Freeman
0 siblings, 1 reply; 8+ messages in thread
From: Yuan Fu @ 2023-09-10 1:15 UTC (permalink / raw)
To: Danny Freeman; +Cc: emacs-devel
> On Sep 9, 2023, at 12:26 PM, Danny Freeman <danny@dfreeman.email> wrote:
>
> Hello,
>
> I believe I have found a regression in the treesit package introduced by
> commit ec4d29c4494f32acf0ff7c5632a1d951d957f084
>
> I noticed that indentation does not work properly on Emacs master branch
> in clojure-ts-mode. After doing a git bisect on Emacs source code I
> landed on the above commit.
>
>
>
> To reproduce, I build emacs with ec4d29c44 [0][1] and load up
> clojure-ts-mode from the current main branch (commit 58f3c835 [2]).
>
> I open up a new file, say test.clj, and run `M-x clojure-ts-mode`.
>
> In that file I can paste the following contents
>
> (defn foo [] 1)
>
> The buffer should have the following tree structure, which can be seen
> with M-x treesit-explore-mode <RET> clojure <RET>
>
> (source
> (list_lit close: (
> (sym_lit name: (sym_name))
> (sym_lit (sym_name))
> (vec_lit [ ])
> (num_lit) )))
>
> If I place my cursor directly before the 1 character like so, where | is
> my cursor
>
> (defn foo [] |1)
>
> and then press <RET>, I would expect the following indentation
>
> (defn foo []
> |1)
>
> Instead I get
>
> (defn foo []
> |1)
>
> and if `treesit--indent-verbose` is set to `t` then I see these messages
> in *Messages*
>
> Matched rule: ((parent-is "source") parent-bol 0)
> PARENT is nil, not indenting
>
> Before this change I would see
>
> Matched rule: ((parent-is "source") parent-bol 0)
> Matched rule: (clojure-ts--match-expression-in-body parent 2)
>
> While my cursor is before the 1 character, running
>
> M-x eval-expression <RET> (treesit-node-parent (treesit-node-at (point)))
>
> will return nil. Before this change, it would return the parent list_lit
> node as I would expect.
>
>
> Please let me know if you have any question, or if there is something I
> can do to make this easier to test. I see the commit the git bisect
> pointed me to contains changes to the C code for treesit. I do not know
> C well enough to know why this breaks, unfortunately. I can only say
> something is not quite right.
>
> [0] - the configure flags I use to build emacs
> https://git.sr.ht/~dannyfreeman/emacs/tree/ba4a5f3678bf83a28ce382fc33611e0d7aed2a86/item/flake.nix#L63-81
> [1] - The init file I use
> https://git.sr.ht/~dannyfreeman/emacs/tree/main/item/init/init.el
> [2] - clojure-ts-mode version used to reproduce this error:
> https://github.com/clojure-emacs/clojure-ts-mode/commit/58f3c835aeafe0378ab9693731b95096827dbb24
>
> Thank you,
> --
> Danny Freeman
Hmmm, I can’t reproduce this. I’m getting
Matched rule: ((parent-is "source") parent-bol 0)
Matched rule: (clojure-ts--match-expression-in-body parent 2)
The commit you pointed to does change treesit-node-parent, so it does match with parent-is matcher not working.
Might be because I recently upgraded to a newer tree-sitter library. If you compile with the newest tree-sitter build (from their repo), do you still see this problem?
Yuan
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Treesit Regression In ec4d29c4494f32acf0ff7c5632a1d951d957f084
2023-09-10 1:15 ` Yuan Fu
@ 2023-09-10 4:00 ` Danny Freeman
2023-09-10 5:26 ` Yuan Fu
0 siblings, 1 reply; 8+ messages in thread
From: Danny Freeman @ 2023-09-10 4:00 UTC (permalink / raw)
To: Yuan Fu; +Cc: emacs-devel
Yuan Fu <casouri@gmail.com> writes:
>
> Hmmm, I can’t reproduce this. I’m getting
>
> Matched rule: ((parent-is "source") parent-bol 0)
> Matched rule: (clojure-ts--match-expression-in-body parent 2)
>
> The commit you pointed to does change treesit-node-parent, so it does match with parent-is matcher not working.
>
> Might be because I recently upgraded to a newer tree-sitter library. If you compile with the newest tree-sitter build (from their repo), do you still see this problem?
Before I was just using whatever tree-sitter version my distro packages.
Apparently v0.20.8 (built from the git tag of the same name)
I tried compiling with this version
https://github.com/tree-sitter/tree-sitter/commit/524bf7e2c664d4a5dbd0c20d4d10f1e58f99e8ce
which at the time of writing is the latest, and still experience the
same problem. I also made sure to re-compile my grammars.
I may have done something wrong though when building. I'm
going to try to build using a more traditional process rather than with
nix and see if that helps.
Thank you,
--
Danny Freeman
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Treesit Regression In ec4d29c4494f32acf0ff7c5632a1d951d957f084
2023-09-10 4:00 ` Danny Freeman
@ 2023-09-10 5:26 ` Yuan Fu
2023-09-10 7:12 ` Yuan Fu
0 siblings, 1 reply; 8+ messages in thread
From: Yuan Fu @ 2023-09-10 5:26 UTC (permalink / raw)
To: Danny Freeman; +Cc: emacs-devel
> On Sep 9, 2023, at 9:00 PM, Danny Freeman <danny@dfreeman.email> wrote:
>
>
> Yuan Fu <casouri@gmail.com> writes:
>>
>> Hmmm, I can’t reproduce this. I’m getting
>>
>> Matched rule: ((parent-is "source") parent-bol 0)
>> Matched rule: (clojure-ts--match-expression-in-body parent 2)
>>
>> The commit you pointed to does change treesit-node-parent, so it does match with parent-is matcher not working.
>>
>> Might be because I recently upgraded to a newer tree-sitter library. If you compile with the newest tree-sitter build (from their repo), do you still see this problem?
>
> Before I was just using whatever tree-sitter version my distro packages.
> Apparently v0.20.8 (built from the git tag of the same name)
>
> I tried compiling with this version
> https://github.com/tree-sitter/tree-sitter/commit/524bf7e2c664d4a5dbd0c20d4d10f1e58f99e8ce
>
> which at the time of writing is the latest, and still experience the
> same problem. I also made sure to re-compile my grammars.
>
> I may have done something wrong though when building. I'm
> going to try to build using a more traditional process rather than with
> nix and see if that helps.
>
> Thank you,
> --
> Danny Freeman
I take that back. It is indeed caused by that commit. I just had a wrong Emacs build when testing. In that commit, ts_tree_cursor_goto_first_child_for_byte seem to be the problem.
Yuan
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Treesit Regression In ec4d29c4494f32acf0ff7c5632a1d951d957f084
2023-09-10 5:26 ` Yuan Fu
@ 2023-09-10 7:12 ` Yuan Fu
2023-09-11 12:41 ` Danny Freeman
0 siblings, 1 reply; 8+ messages in thread
From: Yuan Fu @ 2023-09-10 7:12 UTC (permalink / raw)
To: Danny Freeman; +Cc: emacs-devel
> On Sep 9, 2023, at 10:26 PM, Yuan Fu <casouri@gmail.com> wrote:
>
>
>
>> On Sep 9, 2023, at 9:00 PM, Danny Freeman <danny@dfreeman.email> wrote:
>>
>>
>> Yuan Fu <casouri@gmail.com> writes:
>>>
>>> Hmmm, I can’t reproduce this. I’m getting
>>>
>>> Matched rule: ((parent-is "source") parent-bol 0)
>>> Matched rule: (clojure-ts--match-expression-in-body parent 2)
>>>
>>> The commit you pointed to does change treesit-node-parent, so it does match with parent-is matcher not working.
>>>
>>> Might be because I recently upgraded to a newer tree-sitter library. If you compile with the newest tree-sitter build (from their repo), do you still see this problem?
>>
>> Before I was just using whatever tree-sitter version my distro packages.
>> Apparently v0.20.8 (built from the git tag of the same name)
>>
>> I tried compiling with this version
>> https://github.com/tree-sitter/tree-sitter/commit/524bf7e2c664d4a5dbd0c20d4d10f1e58f99e8ce
>>
>> which at the time of writing is the latest, and still experience the
>> same problem. I also made sure to re-compile my grammars.
>>
>> I may have done something wrong though when building. I'm
>> going to try to build using a more traditional process rather than with
>> nix and see if that helps.
>>
>> Thank you,
>> --
>> Danny Freeman
>
> I take that back. It is indeed caused by that commit. I just had a wrong Emacs build when testing. In that commit, ts_tree_cursor_goto_first_child_for_byte seem to be the problem.
>
> Yuan
What I found is that, for some reason, ts_tree_cursor_goto_first_child_for_byte doesn’t always work as expected in clojure [1]. In your example, if I evaluate (treesit-node-parent (treesit-node-at (point))) with point at the beginning of “defn”, “foo”, “[]”, and “1”, only “defn” correctly evaluates to the parent node. I tried this in other modes like c-ts-mode, but things seems to work fine in all the few cases I tried.
So now the question is, is there anything tree-sitter-clojure does differently? Not saying it does, but that seems like a starting point.
[1] Instead of returning the first child node of NODE that extends beyond POS, ts_tree_cursor_goto_first_child_for_byte (NODE, POS) returns false (meaning not found).
Yuan
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Treesit Regression In ec4d29c4494f32acf0ff7c5632a1d951d957f084
2023-09-10 7:12 ` Yuan Fu
@ 2023-09-11 12:41 ` Danny Freeman
2023-09-11 22:19 ` Yuan Fu
0 siblings, 1 reply; 8+ messages in thread
From: Danny Freeman @ 2023-09-11 12:41 UTC (permalink / raw)
To: Yuan Fu; +Cc: emacs-devel
Yuan Fu <casouri@gmail.com> writes:
>> I take that back. It is indeed caused by that commit. I just had a wrong Emacs build when testing. In that commit, ts_tree_cursor_goto_first_child_for_byte seem to be the problem.
Well I'm glad to know I am capable of building Emacs with the latest
tree sitter. Not as glad that the problem still exists lol
> What I found is that, for some reason, ts_tree_cursor_goto_first_child_for_byte doesn’t always work
> as expected in clojure [1]. In your example, if I evaluate (treesit-node-parent (treesit-node-at
> (point))) with point at the beginning of “defn”, “foo”, “[]”, and “1”, only “defn” correctly
> evaluates to the parent node. I tried this in other modes like c-ts-mode, but things seems to work
> fine in all the few cases I tried.
>
> So now the question is, is there anything tree-sitter-clojure does differently? Not saying it does, but that seems like a starting point.
Maybe there is. The grammar has not changed in a couple months, and
those things that did change were related to keyword literals, which are
not involved in the example I've been using to reproduce this.
Even knowing that. I will try to narrow the problem down even further,
and see if I can find another grammar the exhibits this behavior.
> [1] Instead of returning the first child node of NODE that extends beyond POS, ts_tree_cursor_goto_first_child_for_byte (NODE, POS) returns false (meaning not found).
I remember we found an issue with a similarly named function here
https://github.com/tree-sitter/tree-sitter/issues/2012
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=60127
You ended up writing a workaround in emacs for this called
treesit_cursor_first_child_for_byte
I don't think there is overlap between that change and this ec4d29 one.
However, maybe there is overlap in the tree-sitter implementation? I'm
not sure how to tell right now, but maybe this is an upstream
tree-sitter problem that is just brought into the light by the commit in
question here.
--
Danny Freeman
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Treesit Regression In ec4d29c4494f32acf0ff7c5632a1d951d957f084
2023-09-11 12:41 ` Danny Freeman
@ 2023-09-11 22:19 ` Yuan Fu
2023-09-12 3:49 ` Danny Freeman
0 siblings, 1 reply; 8+ messages in thread
From: Yuan Fu @ 2023-09-11 22:19 UTC (permalink / raw)
To: Danny Freeman; +Cc: emacs-devel
> On Sep 11, 2023, at 5:41 AM, Danny Freeman <danny@dfreeman.email> wrote:
>
>
> Yuan Fu <casouri@gmail.com> writes:
>
>>> I take that back. It is indeed caused by that commit. I just had a wrong Emacs build when testing. In that commit, ts_tree_cursor_goto_first_child_for_byte seem to be the problem.
>
> Well I'm glad to know I am capable of building Emacs with the latest
> tree sitter. Not as glad that the problem still exists lol
>
>
>> What I found is that, for some reason, ts_tree_cursor_goto_first_child_for_byte doesn’t always work
>> as expected in clojure [1]. In your example, if I evaluate (treesit-node-parent (treesit-node-at
>> (point))) with point at the beginning of “defn”, “foo”, “[]”, and “1”, only “defn” correctly
>> evaluates to the parent node. I tried this in other modes like c-ts-mode, but things seems to work
>> fine in all the few cases I tried.
>>
>> So now the question is, is there anything tree-sitter-clojure does differently? Not saying it does, but that seems like a starting point.
>
> Maybe there is. The grammar has not changed in a couple months, and
> those things that did change were related to keyword literals, which are
> not involved in the example I've been using to reproduce this.
>
> Even knowing that. I will try to narrow the problem down even further,
> and see if I can find another grammar the exhibits this behavior.
>
>> [1] Instead of returning the first child node of NODE that extends beyond POS, ts_tree_cursor_goto_first_child_for_byte (NODE, POS) returns false (meaning not found).
>
> I remember we found an issue with a similarly named function here
>
> https://github.com/tree-sitter/tree-sitter/issues/2012
> https://debbugs.gnu.org/cgi/bugreport.cgi?bug=60127
>
> You ended up writing a workaround in emacs for this called
> treesit_cursor_first_child_for_byte
>
> I don't think there is overlap between that change and this ec4d29 one.
> However, maybe there is overlap in the tree-sitter implementation? I'm
> not sure how to tell right now, but maybe this is an upstream
> tree-sitter problem that is just brought into the light by the commit in
> question here.
Duh, I can’t believe I completely forgot about this. Yes, this is exactly it, thank you for binging it up. Before tree-sitter can fix treesit_cursor_first_child_for_byte, I’ll just replicate the workaround and we should be golden. I’ve pushed a fix to emacs-29. Once master pulls from emacs-29, the problem should go away.
BTW, I should’ve opened a bug report and move the discussion there. But since we are mostly done here, and there’s another bug report to refer to, there’s no need to do that now.
Yuan
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Treesit Regression In ec4d29c4494f32acf0ff7c5632a1d951d957f084
2023-09-11 22:19 ` Yuan Fu
@ 2023-09-12 3:49 ` Danny Freeman
0 siblings, 0 replies; 8+ messages in thread
From: Danny Freeman @ 2023-09-12 3:49 UTC (permalink / raw)
To: Yuan Fu; +Cc: emacs-devel
Yuan Fu <casouri@gmail.com> writes:
> Duh, I can’t believe I completely forgot about this. Yes, this is exactly it, thank you for binging
> it up. Before tree-sitter can fix treesit_cursor_first_child_for_byte, I’ll just replicate the
> workaround and we should be golden. I’ve pushed a fix to emacs-29. Once master pulls from emacs-29,
> the problem should go away.
It took me a minute to connect the dots there, but that did seem to be
the cause. I just tried it against Emacs 30 with the commit in question
cherry picked and it works perfectly. Thank you for fixing it.
> BTW, I should’ve opened a bug report and move the discussion there. But since we are mostly done here, and there’s another bug report to refer to, there’s no need to do that now.
Understood. I probably should have started with a bug report but for
some reason sent my initial email to emacs-devel. I'll be better about
that next time I run into something in the core.
Thanks again for your help, all the best.
--
Danny Freeman
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2023-09-12 3:49 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-09-09 19:26 Treesit Regression In ec4d29c4494f32acf0ff7c5632a1d951d957f084 Danny Freeman
2023-09-10 1:15 ` Yuan Fu
2023-09-10 4:00 ` Danny Freeman
2023-09-10 5:26 ` Yuan Fu
2023-09-10 7:12 ` Yuan Fu
2023-09-11 12:41 ` Danny Freeman
2023-09-11 22:19 ` Yuan Fu
2023-09-12 3:49 ` Danny Freeman
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).