unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes
@ 2024-09-21  5:06 Mickey Petersen
  2024-09-26  7:42 ` Yuan Fu
  0 siblings, 1 reply; 10+ messages in thread
From: Mickey Petersen @ 2024-09-21  5:06 UTC (permalink / raw)
  To: 73404



Examples with javascript-mode. It holds for all modes I tested with a
TS equivalent. Let -!- be the starting point and ^N be the subsequent
position after a movement command.

-!-export const add = (a, b) => a + b;

Repeated `C-M-f' yields

export const add = (a, b) => a + b;

      ^1    ^2  ^3       ^4   ^5  ^6


In other words, it works as it always has.

Meanwhile, in `js-ts-mode':

export const add = (a, b) => a + b;
                ^1       ^2   ^3  ^4

From ^1 and back with `C-M-b'

export const add-!- = (a, b) => a + b;

export const add = (a, b) => a + b;
             ^1

At this point, `C-M-b' no longer goes back. It is stuck.


Another example:

-!-console.log("Addition result:", result1);

With `C-M-f':

console.log("Addition result:", result1);

       ^1                               ^2


This affects every single -sexp function that uses either
`forward-sexp-function' or `transpose-sexps-function' to do its job.

Thanks.






* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes
  2024-09-21  5:06 bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes Mickey Petersen
@ 2024-09-26  7:42 ` Yuan Fu
  2024-09-26  9:56   ` Mickey Petersen
  0 siblings, 1 reply; 10+ messages in thread
From: Yuan Fu @ 2024-09-26  7:42 UTC (permalink / raw)
  To: Mickey Petersen; +Cc: 73404



> On Sep 20, 2024, at 10:06 PM, Mickey Petersen <mickey@masteringemacs.org> wrote:
> 
> 
> 
> Examples with javascript-mode. It holds for all modes I tested with a
> TS equivalent. Let -!- be the starting point and ^N be the subsequent
> position after a movement command.
> 
> -!-export const add = (a, b) => a + b;
> 
> Repeated `C-M-f' yields
> 
> export const add = (a, b) => a + b;
> 
>      ^1    ^2  ^3       ^4   ^5  ^6
> 
> 
> In other words, it works as it always has.
> 
> Meanwhile, in `js-ts-mode':
> 
> export const add = (a, b) => a + b;
>                ^1       ^2   ^3  ^4
> 
> From ^1 and back with `C-M-b'
> 
> export const add-!- = (a, b) => a + b;
> 
> export const add = (a, b) => a + b;
>             ^1
> 
> At this point, `C-M-b' no longer goes back. It is stuck.
> 
> 
> Another example:
> 
> -!-console.log("Addition result:", result1);
> 
> With `C-M-f':
> 
> console.log("Addition result:", result1);
> 
>       ^1                               ^2
> 
> 
> This affects every single -sexp function that uses either
> `forward-sexp-function' or `transpose-sexps-function' to do its job.
> 
> Thanks.
> 

I’m aware of this problem and it’s quite inconvenient at times, but right now I don’t have a good solution for it. Ideas are welcome.

Basically, tree-sitter’s sexp movement works on subtrees. It determines the position of point in the whole parse tree and moves forward/back across the next subtree. If there are no more sibling subtrees at the same level to move over, sexp movement stops, as it does in Lisp. The parse tree is invisible and often groups tokens in unexpected ways, so the sexp movement is frequently unintuitive.
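
To see which subtree a movement command would cross, it can help to print the node at point and its siblings. The following is only a minimal sketch: the command name `my/treesit-describe-siblings' is hypothetical, and it assumes the current buffer already has a tree-sitter parser (for example, a js-ts-mode buffer).

```elisp
;; Hypothetical helper, not part of Emacs.  Assumes the buffer has a
;; tree-sitter parser; otherwise the node lookups simply yield nil.
(defun my/treesit-describe-siblings ()
  "Show the type of the leaf node at point and of its adjacent siblings."
  (interactive)
  (let* ((node (treesit-node-at (point)))
         (prev (and node (treesit-node-prev-sibling node)))
         (next (and node (treesit-node-next-sibling node))))
    (message "node: %s, prev: %s, next: %s"
             (and node (treesit-node-type node))
             (and prev (treesit-node-type prev))
             (and next (treesit-node-type next)))))
```

`M-x treesit-explore-mode' shows the whole parse tree in a side window, which is usually the quicker way to see how the tokens are grouped.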

We might need to add a user option so people can easily turn off tree-sitter sexp movement, since it isn’t a strict upgrade from the generic sexp movement—it’s more of a different flavored sexp movement.
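
Until such an option exists, one possible workaround is to drop the buffer-local sexp functions from a mode hook. This is a sketch under assumptions: the function name is hypothetical, and it assumes the tree-sitter mode installs `forward-sexp-function' and `transpose-sexps-function' as buffer-local values during its setup, before the mode hook runs.

```elisp
;; Hypothetical helper, not part of Emacs.  Removing the buffer-local
;; bindings makes the buffer fall back to the global defaults.
(defun my/use-generic-sexp-movement ()
  "Restore the default sexp movement in the current buffer."
  (kill-local-variable 'forward-sexp-function)
  (kill-local-variable 'transpose-sexps-function))

(add-hook 'js-ts-mode-hook #'my/use-generic-sexp-movement)
```

The same function can be added to the hook of any other tree-sitter mode where the generic movement is preferred.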

Yuan





* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes
  2024-09-26  7:42 ` Yuan Fu
@ 2024-09-26  9:56   ` Mickey Petersen
  2024-09-26 10:53     ` Eli Zaretskii
  0 siblings, 1 reply; 10+ messages in thread
From: Mickey Petersen @ 2024-09-26  9:56 UTC (permalink / raw)
  To: Yuan Fu; +Cc: 73404


Yuan Fu <casouri@gmail.com> writes:

>> On Sep 20, 2024, at 10:06 PM, Mickey Petersen <mickey@masteringemacs.org> wrote:
>>
>>
>>
>> Examples with javascript-mode. It holds for all modes I tested with a
>> TS equivalent. Let -!- be the starting point and ^N be the subsequent
>> position after a movement command.
>>
>> -!-export const add = (a, b) => a + b;
>>
>> Repeated `C-M-f' yields
>>
>> export const add = (a, b) => a + b;
>>
>>      ^1    ^2  ^3       ^4   ^5  ^6
>>
>>
>> In other words, it works as it always has.
>>
>> Meanwhile, in `js-ts-mode':
>>
>> export const add = (a, b) => a + b;
>>                ^1       ^2   ^3  ^4
>>
>> From ^1 and back with `C-M-b'
>>
>> export const add-!- = (a, b) => a + b;
>>
>> export const add = (a, b) => a + b;
>>             ^1
>>
>> At this point, `C-M-b' no longer goes back. It is stuck.
>>
>>
>> Another example:
>>
>> -!-console.log("Addition result:", result1);
>>
>> With `C-M-f':
>>
>> console.log("Addition result:", result1);
>>
>>       ^1                               ^2
>>
>>
>> This affects every single -sexp function that uses either
>> `forward-sexp-function' or `transpose-sexps-function' to do its job.
>>
>> Thanks.
>>
>
> I’m aware of this problem and it’s quite inconvenient at times, but right now I don’t have a good solution for it. Ideas are welcome.
>
> Basically tree-sitter’s sexp movement works on subtrees. It determines
> the position of the point in the whole parse tree and goes
> forward/back across the next subtree in the parse tree. If there are no
> more sibling subtrees at the same level to move over, sexp movement
> stops like in Lisp. The parse tree is invisible and often groups tokens
> in unexpected ways, so many times the sexp movement isn’t intuitive.
>

Hi Yuan,

In my opinion, that's not what `sexp' movement is.

Sexp movement is movement by balanced expressions -- and a fallback to
word-like behaviour absent that -- and this is not that. It would be
better to relegate this sort of thing to its own set of keybindings.


> We might need to add a user option so people can easily turn off
> tree-sitter sexp movement, since it isn’t a strict upgrade from the
> generic sexp movement—it’s more of a different flavored sexp movement.

It should be opt-in, not opt-out.

>
> Yuan







* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes
  2024-09-26  9:56   ` Mickey Petersen
@ 2024-09-26 10:53     ` Eli Zaretskii
  2024-09-26 12:13       ` Mickey Petersen
  0 siblings, 1 reply; 10+ messages in thread
From: Eli Zaretskii @ 2024-09-26 10:53 UTC (permalink / raw)
  To: Mickey Petersen; +Cc: casouri, 73404

> Cc: 73404@debbugs.gnu.org
> From: Mickey Petersen <mickey@masteringemacs.org>
> Date: Thu, 26 Sep 2024 10:56:35 +0100
> 
> In my opinion, that's not what `sexp' movement is.
> 
> Sexp movement is movement by balanced expressions -- and a fallback to
> word-like behaviour absent that -- and this is not that. It would be
> better to relegate this sort of thing to its own set of keybindings.

The term "balanced expression" is not well defined in languages other
than Lisp and Lisp-like ones.  It is clear what is expected when point is
on a brace or a parenthesis, but entirely NOT clear when you start
from something else.  For example:

  int foo = bar + 2 * baz;

Suppose you start with point at "foo": what would you expect
forward-sexp to do? nothing?

> > We might need to add a user option so people can easily turn off
> > tree-sitter sexp movement, since it isn’t a strict upgrade from the
> > generic sexp movement—it’s more of a different flavored sexp movement.
> 
> It should be opt-in, not opt-out.

I disagree.  Moving by sub-trees is a natural generalization of sexp
movement for languages where parentheses and braces are rare and far
in-between.






* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes
  2024-09-26 10:53     ` Eli Zaretskii
@ 2024-09-26 12:13       ` Mickey Petersen
  2024-09-26 13:46         ` Eli Zaretskii
  0 siblings, 1 reply; 10+ messages in thread
From: Mickey Petersen @ 2024-09-26 12:13 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: casouri, 73404


Eli Zaretskii <eliz@gnu.org> writes:

>> Cc: 73404@debbugs.gnu.org
>> From: Mickey Petersen <mickey@masteringemacs.org>
>> Date: Thu, 26 Sep 2024 10:56:35 +0100
>>
>> In my opinion, that's not what `sexp' movement is.
>>
>> Sexp movement is movement by balanced expressions -- and a fallback to
>> word-like behaviour absent that -- and this is not that. It would be
>> better to relegate this sort of thing to its own set of keybindings.
>
> The term "balanced expression" is not well defined in languages other
> than Lisp and Lisp-like ones.  It is clear what is expected when point is
> on a brace or a parenthesis, but entirely NOT clear when you start
> from something else.  For example:
>
>   int foo = bar + 2 * baz;
>
> Suppose you start with point at "foo": what would you expect
> forward-sexp to do? nothing?
>

I expect it to behave as it presently does: default to word-like
behaviour such as M-@ / M-f etc.

Balanced expression is not well defined, de jure, but it is in
practical terms, making it de facto rather well understood and
supported. It behaves reasonably consistently across languages, and I
use *-sexp commands thousands of times a day in a wide range of major modes and
contexts, both in code and also prose.

Most people who use *-sexp (or *-word commands for that matter) in
major modes come to recognise how they work and know what happens to
the text/point in their buffer before they run them.

I would challenge anyone, given even small samples of code, to do the
same with the current TS only implementation.

>> > We might need to add a user option so people can easily turn off
>> > tree-sitter sexp movement, since it isn’t a strict upgrade from the
>> > generic sexp movement—it’s more of a different flavored sexp movement.
>>
>> It should be opt-in, not opt-out.
>
> I disagree.  Moving by sub-trees is a natural generalization of sexp
> movement for languages where parentheses and braces are rare and far
> in-between.

Yes, if one can intuit the subtrees' structure, which is not so
simple; and if the selection of commands is expressive enough to let
you navigate the tree. I am not sure it is.

The CSTs are deep, wide, and nodes' ranges frequently overlap; they
are multi-dimensional structures that map to a simple 2-dimensional
'grid' in your buffer. Making heads or tails of that is no easy feat.







* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes
  2024-09-26 12:13       ` Mickey Petersen
@ 2024-09-26 13:46         ` Eli Zaretskii
  2024-09-26 15:21           ` Mickey Petersen
  0 siblings, 1 reply; 10+ messages in thread
From: Eli Zaretskii @ 2024-09-26 13:46 UTC (permalink / raw)
  To: Mickey Petersen; +Cc: casouri, 73404

> From: Mickey Petersen <mickey@masteringemacs.org>
> Cc: casouri@gmail.com, 73404@debbugs.gnu.org
> Date: Thu, 26 Sep 2024 13:13:53 +0100
> 
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >   int foo = bar + 2 * baz;
> >
> > Suppose you start with point at "foo": what would you expect
> > forward-sexp to do? nothing?
> >
> 
> I expect it to behave as it presently does: default to word-like
> behaviour such as M-@ / M-f etc.

Then we just lost an opportunity to have more useful commands, because
we already have M-f and M-@.

> Balanced expression is not well defined, de jure, but it is in
> practical terms, making it de facto rather well understood and
> supported. It behaves reasonably consistently across languages, and I
> use *-sexp commands thousands of times a day in a wide range of major modes and
> contexts, both in code and also prose.

I think the ability to move by parse sub-trees is also very useful.

> Most people who use *-sexp (or *-word commands for that matter) in
> major modes come to recognise how they work and know what happens to
> the text/point in their buffer before they run them.
> 
> I would challenge anyone, given even small samples of code, to do the
> same with the current TS only implementation.

That's just a matter of getting used to the new semantics.

> > I disagree.  Moving by sub-trees is a natural generalization of sexp
> > movement for languages where parentheses and braces are rare and far
> > in-between.
> 
> Yes, if one can intuit the subtrees' structure, which is not so
> simple; and if the selection of commands is expressive enough to let
> you navigate the tree. I am not sure it is.

There are enough situations where moving by words will also surprise
you.  For example, did you know that M-f stops when it finds a
character from a different script?  And yet we still use these
commands.






* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes
  2024-09-26 13:46         ` Eli Zaretskii
@ 2024-09-26 15:21           ` Mickey Petersen
  2024-09-26 15:45             ` Eli Zaretskii
  0 siblings, 1 reply; 10+ messages in thread
From: Mickey Petersen @ 2024-09-26 15:21 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: casouri, 73404


Eli Zaretskii <eliz@gnu.org> writes:

>> From: Mickey Petersen <mickey@masteringemacs.org>
>> Cc: casouri@gmail.com, 73404@debbugs.gnu.org
>> Date: Thu, 26 Sep 2024 13:13:53 +0100
>>
>>
>> Eli Zaretskii <eliz@gnu.org> writes:
>>
>> >   int foo = bar + 2 * baz;
>> >
>> > Suppose you start with point at "foo": what would you expect
>> > forward-sexp to do? nothing?
>> >
>>
>> I expect it to behave as it presently does: default to word-like
>> behaviour such as M-@ / M-f etc.
>
> Then we just lost an opportunity to have more useful commands, because
> we already have M-f and M-@.
>
>> Balanced expression is not well defined, de jure, but it is in
>> practical terms, making it de facto rather well understood and
>> supported. It behaves reasonably consistently across languages, and I
>> use *-sexp commands thousands of times a day in a wide range of major modes and
>> contexts, both in code and also prose.
>
> I think the ability to move by parse sub-trees is also very useful.
>

Agreed. What matters is whether the crop of new sexp commands, such as they
are, perform satisfactorily.

Do you think the examples I listed in the original bug report match
your expectations? If so, then it is probably OK to close the bug report.

>> Most people who use *-sexp (or *-word commands for that matter) in
>> major modes come to recognise how they work and know what happens to
>> the text/point in their buffer before they run them.
>>
>> I would challenge anyone, given even small samples of code, to do the
>> same with the current TS only implementation.
>
> That's just a matter of getting used to the new semantics.
>
>> > I disagree.  Moving by sub-trees is a natural generalization of sexp
>> > movement for languages where parentheses and braces are rare and far
>> > in-between.
>>
>> Yes, if one can intuit the subtrees' structure, which is not so
>> simple; and if the selection of commands is expressive enough to let
>> you navigate the tree. I am not sure it is.
>
> There are enough situations where moving by words will also surprise
> you.  For example, did you know that M-f stops when it finds a
> character from a different script?  And yet we still use these
> commands.







* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes
  2024-09-26 15:21           ` Mickey Petersen
@ 2024-09-26 15:45             ` Eli Zaretskii
  2024-09-27  5:43               ` Yuan Fu
  0 siblings, 1 reply; 10+ messages in thread
From: Eli Zaretskii @ 2024-09-26 15:45 UTC (permalink / raw)
  To: Mickey Petersen; +Cc: casouri, 73404

> From: Mickey Petersen <mickey@masteringemacs.org>
> Cc: casouri@gmail.com, 73404@debbugs.gnu.org
> Date: Thu, 26 Sep 2024 16:21:33 +0100
> 
> > I think the ability to move by parse sub-trees is also very useful.
> >
> 
> Agreed. What matters is whether the crop of new sexp commands, such as they
> are, perform satisfactorily.
> 
> Do you think the examples I listed in the original bug report match
> your expectations? If so, then it is probably OK to close the bug report.

Yes, I do, but let's wait for others to chime in if they have opinions
on this.






* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes
  2024-09-26 15:45             ` Eli Zaretskii
@ 2024-09-27  5:43               ` Yuan Fu
  2024-09-29 16:56                 ` Juri Linkov
  0 siblings, 1 reply; 10+ messages in thread
From: Yuan Fu @ 2024-09-27  5:43 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Mickey Petersen, 73404



> On Sep 26, 2024, at 8:45 AM, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Mickey Petersen <mickey@masteringemacs.org>
>> Cc: casouri@gmail.com, 73404@debbugs.gnu.org
>> Date: Thu, 26 Sep 2024 16:21:33 +0100
>> 
>>> I think the ability to move by parse sub-trees is also very useful.
>>> 
>> 
>> Agreed. What matters is whether the crop of new sexp commands, such as they
>> are, perform satisfactorily.

Note that you can affect the behavior of tree-sitter sexp movement by defining the sexp “thing” in treesit-thing-settings. js-ts-mode defines one (js--treesit-sexp-nodes), and it only considers some nodes to be sexps. You might be able to tweak the sexp movement to your liking by changing it, or by directly modifying the definition for `sexp' in treesit-thing-settings.
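
As a rough illustration of the second route, the sketch below replaces the `sexp' definition for JavaScript from a mode hook. It assumes that treesit-thing-settings has the shape ((LANGUAGE (THING PRED) ...) ...) and that js-ts-mode registers its things under the language symbol `javascript'. The function name is hypothetical, and how the resulting movement feels will depend on the Emacs version.

```elisp
;; Hypothetical helper, not part of Emacs.  Works on a copy so that the
;; list structure set up by the major mode is not modified in place.
(defun my/js-ts-sexp-only-binary-expressions ()
  "Make `binary_expression' the only node type treated as a sexp."
  (let* ((settings (copy-tree treesit-thing-settings))
         (defs (assq 'javascript settings)))
    (when defs
      ;; Each definition is (THING PRED); here PRED is a regexp that is
      ;; matched against the node type.
      (setf (alist-get 'sexp (cdr defs))
            (list (rx bos "binary_expression" eos)))
      (setq-local treesit-thing-settings settings))))

(add-hook 'js-ts-mode-hook #'my/js-ts-sexp-only-binary-expressions)
```

Evaluating (alist-get 'sexp (alist-get 'javascript treesit-thing-settings)) in a js-ts-mode buffer shows the definition currently in effect.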

>> 
>> Do you think the examples I listed in the original bug report match
>> your expectations? If so, then it is probably OK to close the bug report.
> 
> Yes, I do, but let's wait for others to chime in if they have opinions
> on this.







* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes
  2024-09-27  5:43               ` Yuan Fu
@ 2024-09-29 16:56                 ` Juri Linkov
  0 siblings, 0 replies; 10+ messages in thread
From: Juri Linkov @ 2024-09-29 16:56 UTC (permalink / raw)
  To: Yuan Fu; +Cc: Eli Zaretskii, Mickey Petersen, 73404

> Note that you can affect the behavior of tree-sitter sexp movement by
> defining the sexp “thing” in treesit-thing-settings. js-ts-mode defines one
> (js--treesit-sexp-nodes), and it only considers some nodes to be sexps. You
> might be able to tweak the sexp movement to your liking by changing it, or
> by directly modifying the definition for `sexp' in treesit-thing-settings.
>
>>> Do you think the examples I listed in the original bug report match
>>> your expectations? If so, then it is probably OK to close the bug report.
>>
>> Yes, I do, but let's wait for others to chime in if they have opinions
>> on this.

Here are some ideas on how to cover more use cases.

Suppose that a user wants to disable tree-sitter sexp movement
completely and use the default forward-sexp-default-function.
The natural way to do this would be to set the list of nodes to nil:

  (setq js--treesit-sexp-nodes nil)

However, this currently doesn't work, and requires a change like this:

  @@ -2290,10 +2290,12 @@ treesit-forward-sexp
            (treesit-node-at (point) (treesit-language-at (point)))))
       (or (when (and node-at-point
                      ;; Make sure point is strictly inside node.
  -                   (< (treesit-node-start node-at-point)
  -                      (point)
  -                      (treesit-node-end node-at-point))
  -                   (treesit-node-match-p node-at-point 'text t))
  +                   (<= (treesit-node-start node-at-point)
  +                       (point)
  +                       (treesit-node-end node-at-point))
  +                   (or (treesit-node-match-p node-at-point 'text t)
  +                       (not (treesit-node-match-p node-at-point 'sexp t))
  +                       ))
             (forward-sexp-default-function arg)
             t)
           (if (> arg 0)

Now, the next case: what if the user wants to use the default
forward-sexp-default-function except for the 'binary_expression'
like "a + b" where `C-M-f' should move from "a" to the end of "b":

  export const add = (a, b) => -!-a + b;

should move to

  export const add = (a, b) => a + b;

                                    ^1

The best way for the user would be to customize:

  (setq js--treesit-sexp-nodes '("binary_expression"))

But this is not yet handled by the condition above:

  (not (treesit-node-match-p node-at-point 'sexp t))

because 'node-at-point' is "identifier".
So we need to use 'treesit-parent-until'
to walk up the parent nodes and check whether any
of them matches 'js--treesit-sexp-nodes'.  Then it
will find the parent "binary_expression".

I believe something like this will make
treesit-forward-sexp more customizable.
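
The parent lookup could look roughly like the sketch below. The function name is hypothetical, this is not the actual patch, and it assumes an Emacs where `treesit-node-match-p' and the `sexp' thing are available, as in the diff above.

```elisp
;; Hypothetical helper, not part of Emacs.
(defun my/treesit-enclosing-sexp-node (&optional pos)
  "Return the node at POS, or its nearest ancestor, that is a `sexp' thing.
POS defaults to point.  Return nil if no such node exists."
  (let ((node (treesit-node-at (or pos (point)))))
    (and node
         (treesit-parent-until
          node
          (lambda (n) (treesit-node-match-p n 'sexp t))
          ;; Non-nil INCLUDE-NODE: test NODE itself before its parents.
          t))))
```

With "binary_expression" as the only sexp node and point on "a" in "a + b", this should return the enclosing binary_expression node rather than the identifier.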





