* bug#66988: 30.0.50; treesit-forward-sexp not working properly in js-ts-mode and tsx-ts-mode
2023-11-10 1:41 ` Dmitry Gutov
@ 2023-11-10 7:37 ` Juri Linkov
2023-11-11 2:41 ` Yuan Fu
2023-11-11 15:43 ` Loïc Lemaître
2 siblings, 0 replies; 15+ messages in thread
From: Juri Linkov @ 2023-11-10 7:37 UTC (permalink / raw)
To: Dmitry Gutov; +Cc: Theodor Thornhill, 66988, Loïc Lemaître, Yuan Fu
> @@ -3843,6 +3843,7 @@ js--treesit-sexp-nodes
> "undefined"
> "arguments"
> "pair"
> + "parenthesized_expression"
> "jsx")
> "Nodes that designate sexps in JavaScript.
> See `treesit-thing-settings' for more information.")
I had the same problem and to make this mode more usable had to use such patch:
diff --git a/lisp/progmodes/js.el b/lisp/progmodes/js.el
index 5a669fdbd42..4c07fbd94b7 100644
--- a/lisp/progmodes/js.el
+++ b/lisp/progmodes/js.el
@@ -3825,7 +3825,9 @@ js--treesit-sentence-nodes
See `treesit-thing-settings' for more information.")
(defvar js--treesit-sexp-nodes
- '("expression"
+ '("expression" ;; SHOULD NOT MATCH "expression_statement", BUT SHOULD MATCH "parenthesized_expression"
+ "parenthesized_expression"
+ "formal_parameters"
"pattern"
"array"
"function"
@@ -3843,7 +3845,13 @@ js--treesit-sexp-nodes
"undefined"
"arguments"
"pair"
- "jsx")
+ "jsx"
+ "statement_block"
+ "object"
+ "object_pattern"
+ "named_imports"
+ "class_body"
+ )
"Nodes that designate sexps in JavaScript.
See `treesit-thing-settings' for more information.")
PS:
Also tried to replace
(setq-local treesit-sexp-type-regexp (regexp-opt js--treesit-sexp-nodes))
with
(setq-local treesit-sexp-type-regexp (rx-to-string `(seq bol (or ,@js--treesit-sexp-nodes) eol)))
^ permalink raw reply related [flat|nested] 15+ messages in thread
* bug#66988: 30.0.50; treesit-forward-sexp not working properly in js-ts-mode and tsx-ts-mode
2023-11-10 1:41 ` Dmitry Gutov
2023-11-10 7:37 ` Juri Linkov
@ 2023-11-11 2:41 ` Yuan Fu
2023-11-11 7:35 ` Eli Zaretskii
2023-11-11 10:49 ` Dmitry Gutov
2023-11-11 15:43 ` Loïc Lemaître
2 siblings, 2 replies; 15+ messages in thread
From: Yuan Fu @ 2023-11-11 2:41 UTC (permalink / raw)
To: Dmitry Gutov; +Cc: Theodor Thornhill, 66988, Loïc Lemaître
> On Nov 9, 2023, at 5:41 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
>
> Hi!
>
> On 07/11/2023 16:56, Loïc Lemaître wrote:
>> Hi Emacs team,
>> Here the steps to demonstrate the bug :
>> 1. Compile Emacs from master branch with tree-sitter support
>> 2. Install javascript and tsx languages
>> 3. Run Emacs
>> 4. Create a new buffer
>> 5. Turn major mode to either js-ts-mode or tsx-ts-mode
>> 6. Past the following content into the buffer:
>> (
>> <div>
>> </div>
>> );
>> 7. Place point before opening parenthese
>> 8. M-x forward-sexp (which will call treesit-forward-sexp)
>> => New position is right after the semi-colon instead of being before the semi-colon.
>> Note that the bug disappear if the buffer content is changed for :
>> const component = (
>> <div>
>> </div>
>> );
>> But previous content, while not being very usefull, is valid JSX, as far as I know.
>> I use this syntax for unit test purpose, since it is very short.
>
> Thanks for the report.
>
> The patch below should fix it.
>
> Yuan, what do you think? A similar change (bos and eos anchors) might be useful for other things and other modes.
>
> Alternatively, treesit-thing-settings could be interpreted to imply full matches, then the code using it should not only match against the regexps but also check that the entire string (type name) is matched.
I regret not doing this by default for treesit-indent-rules and traverse functions. Now it’s hard to change without creating confusion and breaking backward compatibility. I wonder if there are good way to smoothly transition to match full names by default…
Treesit-thing-settings can be changed to match full names, but only if we can change treesit-indent-rules and friends too. Otherwise it would be too confusing.
Yuan
^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#66988: 30.0.50; treesit-forward-sexp not working properly in js-ts-mode and tsx-ts-mode
2023-11-11 2:41 ` Yuan Fu
@ 2023-11-11 7:35 ` Eli Zaretskii
2023-11-15 6:28 ` Yuan Fu
2023-11-11 10:49 ` Dmitry Gutov
1 sibling, 1 reply; 15+ messages in thread
From: Eli Zaretskii @ 2023-11-11 7:35 UTC (permalink / raw)
To: Yuan Fu; +Cc: dmitry, 66988, theo, loic.lemaitre
> Cc: Theodor Thornhill <theo@thornhill.no>, 66988@debbugs.gnu.org,
> Loïc Lemaître <loic.lemaitre@gmail.com>
> From: Yuan Fu <casouri@gmail.com>
> Date: Fri, 10 Nov 2023 18:41:20 -0800
>
> > Alternatively, treesit-thing-settings could be interpreted to imply full matches, then the code using it should not only match against the regexps but also check that the entire string (type name) is matched.
>
> I regret not doing this by default for treesit-indent-rules and traverse functions. Now it’s hard to change without creating confusion and breaking backward compatibility. I wonder if there are good way to smoothly transition to match full names by default…
>
> Treesit-thing-settings can be changed to match full names, but only if we can change treesit-indent-rules and friends too. Otherwise it would be too confusing.
Is it feasible to have a variable that controls whether the full
matches are implied in these APIs? Then we could start by making it
optional, and at some later time make it the default.
^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#66988: 30.0.50; treesit-forward-sexp not working properly in js-ts-mode and tsx-ts-mode
2023-11-11 7:35 ` Eli Zaretskii
@ 2023-11-15 6:28 ` Yuan Fu
2023-11-15 12:19 ` Eli Zaretskii
0 siblings, 1 reply; 15+ messages in thread
From: Yuan Fu @ 2023-11-15 6:28 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: Dmitry Gutov, 66988, Theodor Thornhill, loic.lemaitre
> On Nov 10, 2023, at 11:35 PM, Eli Zaretskii <eliz@gnu.org> wrote:
>
>> Cc: Theodor Thornhill <theo@thornhill.no>, 66988@debbugs.gnu.org,
>> Loïc Lemaître <loic.lemaitre@gmail.com>
>> From: Yuan Fu <casouri@gmail.com>
>> Date: Fri, 10 Nov 2023 18:41:20 -0800
>>
>>> Alternatively, treesit-thing-settings could be interpreted to imply full matches, then the code using it should not only match against the regexps but also check that the entire string (type name) is matched.
>>
>> I regret not doing this by default for treesit-indent-rules and traverse functions. Now it’s hard to change without creating confusion and breaking backward compatibility. I wonder if there are good way to smoothly transition to match full names by default…
>>
>> Treesit-thing-settings can be changed to match full names, but only if we can change treesit-indent-rules and friends too. Otherwise it would be too confusing.
>
> Is it feasible to have a variable that controls whether the full
> matches are implied in these APIs? Then we could start by making it
> optional, and at some later time make it the default.
It’s feasible, but I don’t think it’s TRT. The major mode author should have the control over whether the full match is implied, not the user. It doesn’t make sense to toggle the variable either. If you change that variable, major mode code must also change to be correct.
Yuan
^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#66988: 30.0.50; treesit-forward-sexp not working properly in js-ts-mode and tsx-ts-mode
2023-11-15 6:28 ` Yuan Fu
@ 2023-11-15 12:19 ` Eli Zaretskii
2023-11-18 18:57 ` Yuan Fu
0 siblings, 1 reply; 15+ messages in thread
From: Eli Zaretskii @ 2023-11-15 12:19 UTC (permalink / raw)
To: Yuan Fu; +Cc: dmitry, 66988, theo, loic.lemaitre
> From: Yuan Fu <casouri@gmail.com>
> Date: Tue, 14 Nov 2023 22:28:34 -0800
> Cc: Dmitry Gutov <dmitry@gutov.dev>,
> Theodor Thornhill <theo@thornhill.no>,
> 66988@debbugs.gnu.org,
> loic.lemaitre@gmail.com
>
> > Is it feasible to have a variable that controls whether the full
> > matches are implied in these APIs? Then we could start by making it
> > optional, and at some later time make it the default.
>
> It’s feasible, but I don’t think it’s TRT. The major mode author should have the control over whether the full match is implied, not the user. It doesn’t make sense to toggle the variable either. If you change that variable, major mode code must also change to be correct.
So what do you suggest that we do about this issue?
^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#66988: 30.0.50; treesit-forward-sexp not working properly in js-ts-mode and tsx-ts-mode
2023-11-15 12:19 ` Eli Zaretskii
@ 2023-11-18 18:57 ` Yuan Fu
2023-11-19 5:48 ` Eli Zaretskii
0 siblings, 1 reply; 15+ messages in thread
From: Yuan Fu @ 2023-11-18 18:57 UTC (permalink / raw)
To: Eli Zaretskii
Cc: Dmitry Gutov, 66988, Mattias Engdegård, Theodor Thornhill,
Loïc Lemaître
> On Nov 15, 2023, at 4:19 AM, Eli Zaretskii <eliz@gnu.org> wrote:
>
>> From: Yuan Fu <casouri@gmail.com>
>> Date: Tue, 14 Nov 2023 22:28:34 -0800
>> Cc: Dmitry Gutov <dmitry@gutov.dev>,
>> Theodor Thornhill <theo@thornhill.no>,
>> 66988@debbugs.gnu.org,
>> loic.lemaitre@gmail.com
>>
>>> Is it feasible to have a variable that controls whether the full
>>> matches are implied in these APIs? Then we could start by making it
>>> optional, and at some later time make it the default.
>>
>> It’s feasible, but I don’t think it’s TRT. The major mode author should have the control over whether the full match is implied, not the user. It doesn’t make sense to toggle the variable either. If you change that variable, major mode code must also change to be correct.
>
> So what do you suggest that we do about this issue?
We change every treesit function that takes a regexp for matching node type names to imply full match. I think most people wrote code as if these functions use full match, so the breakage should be small. And I can forecast this change in as many channels as I can.
Yuan
^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#66988: 30.0.50; treesit-forward-sexp not working properly in js-ts-mode and tsx-ts-mode
2023-11-18 18:57 ` Yuan Fu
@ 2023-11-19 5:48 ` Eli Zaretskii
2023-11-19 6:19 ` Yuan Fu
0 siblings, 1 reply; 15+ messages in thread
From: Eli Zaretskii @ 2023-11-19 5:48 UTC (permalink / raw)
To: Yuan Fu; +Cc: dmitry, 66988, mattias.engdegard, theo, loic.lemaitre
> From: Yuan Fu <casouri@gmail.com>
> Date: Sat, 18 Nov 2023 10:57:56 -0800
> Cc: Dmitry Gutov <dmitry@gutov.dev>,
> Theodor Thornhill <theo@thornhill.no>,
> 66988@debbugs.gnu.org,
> Loïc Lemaître <loic.lemaitre@gmail.com>,
> Mattias Engdegård <mattias.engdegard@gmail.com>
>
>
>
> > On Nov 15, 2023, at 4:19 AM, Eli Zaretskii <eliz@gnu.org> wrote:
> >
> >> From: Yuan Fu <casouri@gmail.com>
> >> Date: Tue, 14 Nov 2023 22:28:34 -0800
> >> Cc: Dmitry Gutov <dmitry@gutov.dev>,
> >> Theodor Thornhill <theo@thornhill.no>,
> >> 66988@debbugs.gnu.org,
> >> loic.lemaitre@gmail.com
> >>
> >>> Is it feasible to have a variable that controls whether the full
> >>> matches are implied in these APIs? Then we could start by making it
> >>> optional, and at some later time make it the default.
> >>
> >> It’s feasible, but I don’t think it’s TRT. The major mode author should have the control over whether the full match is implied, not the user. It doesn’t make sense to toggle the variable either. If you change that variable, major mode code must also change to be correct.
> >
> > So what do you suggest that we do about this issue?
>
> We change every treesit function that takes a regexp for matching node type names to imply full match. I think most people wrote code as if these functions use full match, so the breakage should be small. And I can forecast this change in as many channels as I can.
Fine by me, so let's do it.
^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#66988: 30.0.50; treesit-forward-sexp not working properly in js-ts-mode and tsx-ts-mode
2023-11-19 5:48 ` Eli Zaretskii
@ 2023-11-19 6:19 ` Yuan Fu
0 siblings, 0 replies; 15+ messages in thread
From: Yuan Fu @ 2023-11-19 6:19 UTC (permalink / raw)
To: Eli Zaretskii
Cc: Dmitry Gutov, 66988, Mattias Engdegård, Theodor Thornhill,
Loïc Lemaître
> On Nov 18, 2023, at 9:48 PM, Eli Zaretskii <eliz@gnu.org> wrote:
>
>> From: Yuan Fu <casouri@gmail.com>
>> Date: Sat, 18 Nov 2023 10:57:56 -0800
>> Cc: Dmitry Gutov <dmitry@gutov.dev>,
>> Theodor Thornhill <theo@thornhill.no>,
>> 66988@debbugs.gnu.org,
>> Loïc Lemaître <loic.lemaitre@gmail.com>,
>> Mattias Engdegård <mattias.engdegard@gmail.com>
>>
>>
>>
>>> On Nov 15, 2023, at 4:19 AM, Eli Zaretskii <eliz@gnu.org> wrote:
>>>
>>>> From: Yuan Fu <casouri@gmail.com>
>>>> Date: Tue, 14 Nov 2023 22:28:34 -0800
>>>> Cc: Dmitry Gutov <dmitry@gutov.dev>,
>>>> Theodor Thornhill <theo@thornhill.no>,
>>>> 66988@debbugs.gnu.org,
>>>> loic.lemaitre@gmail.com
>>>>
>>>>> Is it feasible to have a variable that controls whether the full
>>>>> matches are implied in these APIs? Then we could start by making it
>>>>> optional, and at some later time make it the default.
>>>>
>>>> It’s feasible, but I don’t think it’s TRT. The major mode author should have the control over whether the full match is implied, not the user. It doesn’t make sense to toggle the variable either. If you change that variable, major mode code must also change to be correct.
>>>
>>> So what do you suggest that we do about this issue?
>>
>> We change every treesit function that takes a regexp for matching node type names to imply full match. I think most people wrote code as if these functions use full match, so the breakage should be small. And I can forecast this change in as many channels as I can.
>
> Fine by me, so let's do it.
Ok, I’ll start preparing the patch and news. Meanwhile, folks, let me know if anyone has objections.
Yuan
^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#66988: 30.0.50; treesit-forward-sexp not working properly in js-ts-mode and tsx-ts-mode
2023-11-11 2:41 ` Yuan Fu
2023-11-11 7:35 ` Eli Zaretskii
@ 2023-11-11 10:49 ` Dmitry Gutov
2023-11-15 6:32 ` Yuan Fu
1 sibling, 1 reply; 15+ messages in thread
From: Dmitry Gutov @ 2023-11-11 10:49 UTC (permalink / raw)
To: Yuan Fu; +Cc: Theodor Thornhill, 66988, Loïc Lemaître
On 11/11/2023 04:41, Yuan Fu wrote:
> I regret not doing this by default for treesit-indent-rules and traverse functions. Now it’s hard to change without creating confusion and breaking backward compatibility. I wonder if there are good way to smoothly transition to match full names by default…
>
> Treesit-thing-settings can be changed to match full names, but only if we can change treesit-indent-rules and friends too. Otherwise it would be too confusing.
It's my understanding that the current implementation, when it doesn't
use a full match, is a potential bug in every single instance.
Perhaps you have an example of when partial match is intended and
beneficial? If so, we can just go through all other regexps and wrap
them in \` and \'. And should.
^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#66988: 30.0.50; treesit-forward-sexp not working properly in js-ts-mode and tsx-ts-mode
2023-11-11 10:49 ` Dmitry Gutov
@ 2023-11-15 6:32 ` Yuan Fu
0 siblings, 0 replies; 15+ messages in thread
From: Yuan Fu @ 2023-11-15 6:32 UTC (permalink / raw)
To: Dmitry Gutov; +Cc: Theodor Thornhill, 66988, Loïc Lemaître
> On Nov 11, 2023, at 2:49 AM, Dmitry Gutov <dmitry@gutov.dev> wrote:
>
> On 11/11/2023 04:41, Yuan Fu wrote:
>> I regret not doing this by default for treesit-indent-rules and traverse functions. Now it’s hard to change without creating confusion and breaking backward compatibility. I wonder if there are good way to smoothly transition to match full names by default…
>> Treesit-thing-settings can be changed to match full names, but only if we can change treesit-indent-rules and friends too. Otherwise it would be too confusing.
>
> It's my understanding that the current implementation, when it doesn't use a full match, is a potential bug in every single instance.
That’s a good point.
> Perhaps you have an example of when partial match is intended and beneficial? If so, we can just go through all other regexps and wrap them in \` and \'. And should.
I don’t, except for a very few cases where I saved typing a few characters.
You have a good point. I think most people instinctively write the full match in their code anyway, so changing to the full match should be fine. We can start from master and see if the world ends or not.
Yuan
^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#66988: 30.0.50; treesit-forward-sexp not working properly in js-ts-mode and tsx-ts-mode
2023-11-10 1:41 ` Dmitry Gutov
2023-11-10 7:37 ` Juri Linkov
2023-11-11 2:41 ` Yuan Fu
@ 2023-11-11 15:43 ` Loïc Lemaître
2023-11-11 23:40 ` Dmitry Gutov
2 siblings, 1 reply; 15+ messages in thread
From: Loïc Lemaître @ 2023-11-11 15:43 UTC (permalink / raw)
To: Dmitry Gutov, 66988, Yuan Fu, Theodor Thornhill
[-- Attachment #1: Type: text/plain, Size: 2849 bytes --]
Thanks for the patch ! It fixes the bug.
But unfortunatly, there is another similar bug in
/treesit-forward-sexp/, that you can reproduce with that example:
({(<A></A>)});
/(treesit-forward-sexp)/ does not work as expected for both the opening
parentheses and the brace.
I have checked that it is not a regression due to the patch. That said,
the patch changes the results (that are not what we expect in any cases).
Loïc
Le 10/11/2023 à 02:41, Dmitry Gutov a écrit :
> Hi!
>
> On 07/11/2023 16:56, Loïc Lemaître wrote:
>> Hi Emacs team,
>>
>> Here the steps to demonstrate the bug :
>>
>> 1. Compile Emacs from master branch with tree-sitter support
>> 2. Install javascript and tsx languages
>> 3. Run Emacs
>> 4. Create a new buffer
>> 5. Turn major mode to either js-ts-mode or tsx-ts-mode
>> 6. Past the following content into the buffer:
>> (
>> <div>
>> </div>
>> );
>> 7. Place point before opening parenthese
>> 8. M-x forward-sexp (which will call treesit-forward-sexp)
>>
>> => New position is right after the semi-colon instead of being before
>> the semi-colon.
>>
>> Note that the bug disappear if the buffer content is changed for :
>> const component = (
>> <div>
>> </div>
>> );
>>
>> But previous content, while not being very usefull, is valid JSX, as
>> far as I know.
>> I use this syntax for unit test purpose, since it is very short.
>
> Thanks for the report.
>
> The patch below should fix it.
>
> Yuan, what do you think? A similar change (bos and eos anchors) might
> be useful for other things and other modes.
>
> Alternatively, treesit-thing-settings could be interpreted to imply
> full matches, then the code using it should not only match against the
> regexps but also check that the entire string (type name) is matched.
>
> Also Cc'ing Theodor.
>
> diff --git a/lisp/progmodes/js.el b/lisp/progmodes/js.el
> index 5a669fdbd42..d81fa9ed322 100644
> --- a/lisp/progmodes/js.el
> +++ b/lisp/progmodes/js.el
> @@ -3843,6 +3843,7 @@ js--treesit-sexp-nodes
> "undefined"
> "arguments"
> "pair"
> + "parenthesized_expression"
> "jsx")
> "Nodes that designate sexps in JavaScript.
> See `treesit-thing-settings' for more information.")
> @@ -3886,7 +3887,7 @@ js-ts-mode
>
> (setq-local treesit-thing-settings
> `((javascript
> - (sexp ,(regexp-opt js--treesit-sexp-nodes))
> + (sexp ,(format "\\`%s\\'" (regexp-opt
> js--treesit-sexp-nodes)))
> (sentence ,(regexp-opt js--treesit-sentence-nodes))
> (text ,(regexp-opt '("comment"
> "template_string"))))))
>
[-- Attachment #2: Type: text/html, Size: 4359 bytes --]
^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#66988: 30.0.50; treesit-forward-sexp not working properly in js-ts-mode and tsx-ts-mode
2023-11-11 15:43 ` Loïc Lemaître
@ 2023-11-11 23:40 ` Dmitry Gutov
2023-11-12 12:10 ` Loïc Lemaître
0 siblings, 1 reply; 15+ messages in thread
From: Dmitry Gutov @ 2023-11-11 23:40 UTC (permalink / raw)
To: Loïc Lemaître, 66988, Yuan Fu, Theodor Thornhill
On 11/11/2023 17:43, Loïc Lemaître wrote:
> Thanks for the patch ! It fixes the bug.
> But unfortunatly, there is another similar bug in
> /treesit-forward-sexp/, that you can reproduce with that example:
> ({(<A></A>)});
The problem in this case is that the code doesn't parse (one of the
nodes in the parse tree is ERROR). Removing either the curlies, or the
outer parens pair makes the code valid and the behavior correspondingly
better.
Although for treesit-forward-sexp to jump between curlies in
{(<A></A>)};
we'll also need to add "statement_block" to js--treesit-sexp-nodes.
^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#66988: 30.0.50; treesit-forward-sexp not working properly in js-ts-mode and tsx-ts-mode
2023-11-11 23:40 ` Dmitry Gutov
@ 2023-11-12 12:10 ` Loïc Lemaître
0 siblings, 0 replies; 15+ messages in thread
From: Loïc Lemaître @ 2023-11-12 12:10 UTC (permalink / raw)
To: Dmitry Gutov, 66988, Yuan Fu, Theodor Thornhill
Sorry for my example that is actually not valid JSX...
Yours demontrates better the remaining issue.
Thanks
Loïc
Le 12/11/2023 à 00:40, Dmitry Gutov a écrit :
> On 11/11/2023 17:43, Loïc Lemaître wrote:
>> Thanks for the patch ! It fixes the bug.
>> But unfortunatly, there is another similar bug in
>> /treesit-forward-sexp/, that you can reproduce with that example:
>> ({(<A></A>)});
>
> The problem in this case is that the code doesn't parse (one of the
> nodes in the parse tree is ERROR). Removing either the curlies, or the
> outer parens pair makes the code valid and the behavior
> correspondingly better.
>
> Although for treesit-forward-sexp to jump between curlies in
>
> {(<A></A>)};
>
> we'll also need to add "statement_block" to js--treesit-sexp-nodes.
^ permalink raw reply [flat|nested] 15+ messages in thread