* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes @ 2024-09-21 5:06 Mickey Petersen 2024-09-26 7:42 ` Yuan Fu 0 siblings, 1 reply; 66+ messages in thread From: Mickey Petersen @ 2024-09-21 5:06 UTC (permalink / raw) To: 73404 Examples with javascript-mode. It holds for all modes i tested with a TS equivalent. Let -!- be the starting point and ^N be the subsequent position after a movement command. -!-export const add = (a, b) => a + b; Repeated `C-M-f' yields export const add = (a, b) => a + b; ^1 ^2 ^3 ^4 ^5 ^6 In other words, it works as it always has. Meanwhile, in `js-ts-mode': export const add = (a, b) => a + b; ^1 ^2 ^3 ^4 From ^1 and back with `C-M-b' export const add-!- = (a, b) => a + b; export const add = (a, b) => a + b; ^1 At this point, `C-M-b' no longer goes back. It is stuck. Another example: -!-console.log("Addition result:", result1); With `C-M-f': console.log("Addition result:", result1); ^1 ^2 This affects every single -sexp function that uses either `forward-sexp-function' or `transpose-sexp-function' to do its job. Thanks. ^ permalink raw reply [flat|nested] 66+ messages in thread
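The `js-mode' half of the report can be reproduced without tree-sitter. The following is a minimal sketch, not code from Emacs or from this thread: `my/default-sexp-stops' is a hypothetical helper name, and it assumes that binding `forward-sexp-function' to nil is enough to get the syntax-table based movement.

```elisp
;;; -*- lexical-binding: t -*-
(require 'js)

(defun my/default-sexp-stops (text steps)
  "Return the sexps crossed by STEPS default `forward-sexp' calls over TEXT.
Hypothetical helper for illustration; not part of Emacs."
  (with-temp-buffer
    (js-mode)
    (insert text)
    (goto-char (point-min))
    ;; Assumption: a nil `forward-sexp-function' makes `forward-sexp'
    ;; use the syntax-table scanner, whatever the mode installed.
    (let ((forward-sexp-function nil)
          (stops '()))
      (dotimes (_ steps)
        (forward-sexp 1)
        ;; Record the text between the start of the sexp just crossed
        ;; and point.
        (push (buffer-substring-no-properties
               (save-excursion (forward-sexp -1) (point))
               (point))
              stops))
      (nreverse stops))))

(my/default-sexp-stops "export const add = (a, b) => a + b;" 6)
;; => ("export" "const" "add" "(a, b)" "a" "b")
```

The six results correspond to the positions ^1 through ^6 in the first example: punctuation such as "=", "=>" and "+" is skipped, words are crossed one at a time, and the parenthesized list is crossed as a unit.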
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-09-21 5:06 bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes Mickey Petersen @ 2024-09-26 7:42 ` Yuan Fu 2024-09-26 9:56 ` Mickey Petersen 0 siblings, 1 reply; 66+ messages in thread From: Yuan Fu @ 2024-09-26 7:42 UTC (permalink / raw) To: Mickey Petersen; +Cc: 73404 > On Sep 20, 2024, at 10:06 PM, Mickey Petersen <mickey@masteringemacs.org> wrote: > > > > Examples with javascript-mode. It holds for all modes i tested with a > TS equivalent. Let -!- be the starting point and ^N be the subsequent > position after a movement command. > > -!-export const add = (a, b) => a + b; > > Repeated `C-M-f' yields > > export const add = (a, b) => a + b; > > ^1 ^2 ^3 ^4 ^5 ^6 > > > In other words, it works as it always has. > > Meanwhile, in `js-ts-mode': > > export const add = (a, b) => a + b; > ^1 ^2 ^3 ^4 > > From ^1 and back with `C-M-b' > > export const add-!- = (a, b) => a + b; > > export const add = (a, b) => a + b; > ^1 > > At this point, `C-M-b' no longer goes back. It is stuck. > > > Another example: > > -!-console.log("Addition result:", result1); > > With `C-M-f': > > console.log("Addition result:", result1); > > ^1 ^2 > > > This affects every single -sexp function that uses either > `forward-sexp-function' or `transpose-sexp-function' to do its job. > > Thanks. > I’m aware of this problem and it’s quite inconvenient at times, but right now I don’t have a good solution for it. Ideas are welcome. Basically tree-sitter’s sexp movement works on subtrees. It determines the position of the point in the whole parse tree and goes forward/back across the next subtree in the parse tree. If there’s no more sibling subtrees in the same level to move over, sexp movement stops like in lisp. The parse tree is invisible and often groups token in unexpected ways, so many times the sexp movement isn’t intuitive. 
We might need to add a user option so people can easily turn off tree-sitter sexp movement, since it isn’t a strict upgrade from the generic sexp movement—it’s more of a different flavored sexp movement. Yuan ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-09-26 7:42 ` Yuan Fu @ 2024-09-26 9:56 ` Mickey Petersen 2024-09-26 10:53 ` Eli Zaretskii 0 siblings, 1 reply; 66+ messages in thread From: Mickey Petersen @ 2024-09-26 9:56 UTC (permalink / raw) To: Yuan Fu; +Cc: 73404 Yuan Fu <casouri@gmail.com> writes: >> On Sep 20, 2024, at 10:06 PM, Mickey Petersen <mickey@masteringemacs.org> wrote: >> >> >> >> Examples with javascript-mode. It holds for all modes i tested with a >> TS equivalent. Let -!- be the starting point and ^N be the subsequent >> position after a movement command. >> >> -!-export const add = (a, b) => a + b; >> >> Repeated `C-M-f' yields >> >> export const add = (a, b) => a + b; >> >> ^1 ^2 ^3 ^4 ^5 ^6 >> >> >> In other words, it works as it always has. >> >> Meanwhile, in `js-ts-mode': >> >> export const add = (a, b) => a + b; >> ^1 ^2 ^3 ^4 >> >> From ^1 and back with `C-M-b' >> >> export const add-!- = (a, b) => a + b; >> >> export const add = (a, b) => a + b; >> ^1 >> >> At this point, `C-M-b' no longer goes back. It is stuck. >> >> >> Another example: >> >> -!-console.log("Addition result:", result1); >> >> With `C-M-f': >> >> console.log("Addition result:", result1); >> >> ^1 ^2 >> >> >> This affects every single -sexp function that uses either >> `forward-sexp-function' or `transpose-sexp-function' to do its job. >> >> Thanks. >> > > I’m aware of this problem and it’s quite inconvenient at times, but right now I don’t have a good solution for it. Ideas are welcome. > > Basically tree-sitter’s sexp movement works on subtrees. It determines > the position of the point in the whole parse tree and goes > forward/back across the next subtree in the parse tree. If there’s no > more sibling subtrees in the same level to move over, sexp movement > stops like in lisp. The parse tree is invisible and often groups token > in unexpected ways, so many times the sexp movement isn’t intuitive. 
> Hi Yuan, In my opinion, that's not what `sexp' movement is. Sexp movement is movement by balanced expressions -- and a fallback to word-like behaviour absent that -- and this is not that. It would be better to relegate this sort of thing to its own set of keybindings. > We might need to add a user option so people can easily turn off > tree-sitter sexp movement, since it isn’t a strict upgrade from the > generic sexp movement—it’s more of a different flavored sexp movement. It should be opt-in, not opt-out. > > Yuan ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-09-26 9:56 ` Mickey Petersen @ 2024-09-26 10:53 ` Eli Zaretskii 2024-09-26 12:13 ` Mickey Petersen 0 siblings, 1 reply; 66+ messages in thread From: Eli Zaretskii @ 2024-09-26 10:53 UTC (permalink / raw) To: Mickey Petersen; +Cc: casouri, 73404 > Cc: 73404@debbugs.gnu.org > From: Mickey Petersen <mickey@masteringemacs.org> > Date: Thu, 26 Sep 2024 10:56:35 +0100 > > In my opinion, that's not what `sexp' movement is. > > Sexp movement is movement by balanced expressions -- and a fallback to > word-like behaviour absent that -- and this is not that. It would be > better to relegate this sort of thing to its own set of keybindings. The term "balanced expression" is not well defined in languages other than Lisp and Lisp-like ones. It is clear what expected when point is on a brace or a parenthesis, but entirely NOT clear when you start from something else. For example: int foo = bar + 2 * baz; Suppose you start with point at "foo": what would you expect forward-sexp to do? nothing? > > We might need to add a user option so people can easily turn off > > tree-sitter sexp movement, since it isn’t a strict upgrade from the > > generic sexp movement—it’s more of a different flavored sexp movement. > > It should be opt-in, not opt-out. I disagree. Moving by sub-trees is a natural generalization of sexp movement for languages where parentheses and braces are rare and far in-between. ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-09-26 10:53 ` Eli Zaretskii @ 2024-09-26 12:13 ` Mickey Petersen 2024-09-26 13:46 ` Eli Zaretskii 0 siblings, 1 reply; 66+ messages in thread From: Mickey Petersen @ 2024-09-26 12:13 UTC (permalink / raw) To: Eli Zaretskii; +Cc: casouri, 73404 Eli Zaretskii <eliz@gnu.org> writes: >> Cc: 73404@debbugs.gnu.org >> From: Mickey Petersen <mickey@masteringemacs.org> >> Date: Thu, 26 Sep 2024 10:56:35 +0100 >> >> In my opinion, that's not what `sexp' movement is. >> >> Sexp movement is movement by balanced expressions -- and a fallback to >> word-like behaviour absent that -- and this is not that. It would be >> better to relegate this sort of thing to its own set of keybindings. > > The term "balanced expression" is not well defined in languages other > than Lisp and Lisp-like ones. It is clear what expected when point is > on a brace or a parenthesis, but entirely NOT clear when you start > from something else. For example: > > int foo = bar + 2 * baz; > > Suppose you start with point at "foo": what would you expect > forward-sexp to do? nothing? > I expect it to behave as it presently does: default to word-like behaviour such as M-@ / M-f etc. Balanced expression is not well defined, de jure, but it is in practical terms, making it de facto rather well understood and supported. It behaves reasonably consistently across languages, and I use *-sexp commands thousands of times a day in a wide range of major modes and contexts, both in code and also prose. Most people who use *-sexp (or *-word commands for that matter) in major modes come to recognise how they work and know what happens to the text/point in their buffer before they run them. I would challenge anyone, given even small samples of code, to do the same with the current TS only implementation. 
>> > We might need to add a user option so people can easily turn off >> > tree-sitter sexp movement, since it isn’t a strict upgrade from the >> > generic sexp movement—it’s more of a different flavored sexp movement. >> >> It should be opt-in, not opt-out. > > I disagree. Moving by sub-trees is a natural generalization of sexp > movement for languages where parentheses and braces are rare and far > in-between. Yes, if one can intuit the sub trees' structure, which is not so simple; and if the selection of commands are sufficiently expressive enough to let you navigate the tree. I am not sure they are. The CSTs are deep, wide, and nodes' ranges frequently overlap; they are multi-dimensional structures that map to a simple 2-dimensional 'grid' in your buffer. Making heads or tails of that is no easy feat. ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-09-26 12:13 ` Mickey Petersen @ 2024-09-26 13:46 ` Eli Zaretskii 2024-09-26 15:21 ` Mickey Petersen 0 siblings, 1 reply; 66+ messages in thread From: Eli Zaretskii @ 2024-09-26 13:46 UTC (permalink / raw) To: Mickey Petersen; +Cc: casouri, 73404 > From: Mickey Petersen <mickey@masteringemacs.org> > Cc: casouri@gmail.com, 73404@debbugs.gnu.org > Date: Thu, 26 Sep 2024 13:13:53 +0100 > > > Eli Zaretskii <eliz@gnu.org> writes: > > > int foo = bar + 2 * baz; > > > > Suppose you start with point at "foo": what would you expect > > forward-sexp to do? nothing? > > > > I expect it to behave as it presently does: default to word-like > behaviour such as M-@ / M-f etc. Then we just lost an opportunity to have more useful commands, because we already have M-f and M-@. > Balanced expression is not well defined, de jure, but it is in > practical terms, making it de facto rather well understood and > supported. It behaves reasonably consistently across languages, and I > use *-sexp commands thousands of times a day in a wide range of major modes and > contexts, both in code and also prose. I think the ability to move by parse sub-trees is also very useful. > Most people who use *-sexp (or *-word commands for that matter) in > major modes come to recognise how they work and know what happens to > the text/point in their buffer before they run them. > > I would challenge anyone, given even small samples of code, to do the > same with the current TS only implementation. That's just a matter of getting used to the new semantics. > > I disagree. Moving by sub-trees is a natural generalization of sexp > > movement for languages where parentheses and braces are rare and far > > in-between.
> > Yes, if one can intuit the sub trees' structure, which is not so > simple; and if the selection of commands are sufficiently expressive > enough to let you navigate the tree. I am not sure they are. There are enough situations where moving by words will also surprise you. For example, did you know that M-f stops when it finds a character from a different script? And yet we still use these commands. ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-09-26 13:46 ` Eli Zaretskii @ 2024-09-26 15:21 ` Mickey Petersen 2024-09-26 15:45 ` Eli Zaretskii 0 siblings, 1 reply; 66+ messages in thread From: Mickey Petersen @ 2024-09-26 15:21 UTC (permalink / raw) To: Eli Zaretskii; +Cc: casouri, 73404 Eli Zaretskii <eliz@gnu.org> writes: >> From: Mickey Petersen <mickey@masteringemacs.org> >> Cc: casouri@gmail.com, 73404@debbugs.gnu.org >> Date: Thu, 26 Sep 2024 13:13:53 +0100 >> >> >> Eli Zaretskii <eliz@gnu.org> writes: >> >> > int foo = bar + 2 * baz; >> > >> > Suppose you start with point at "foo": what would you expect >> > forward-sexp to do? nothing? >> > >> >> I expect it to behave as it presently does: default to word-like >> behaviour such as M-@ / M-f etc. > > Then we just lost an opportunity to have more useful commands, because > we already have M-f and M-@. > >> Balanced expression is not well defined, de jure, but it is in >> practical terms, making it de facto rather well understood and >> supported. It behaves reasonably consistently across languages, and I >> use *-sexp commands thousands of times a day in a wide range of major modes and >> contexts, both in code and also prose. > > I think the ability to move by parse sub-trees is also very useful. > Agreed. What matters is whether the crop of new sexp commands, such as they are, perform satisfactorily. Do you think the examples I listed in the original bug report match your expectations? If so, then it is probably OK to close the bug report. >> Most people who use *-sexp (or *-word commands for that matter) in >> major modes come to recognise how they work and know what happens to >> the text/point in their buffer before they run them.
>> >> I would challenge anyone, given even small samples of code, to do the >> same with the current TS only implementation. > > That's just a matter of getting used to the new semantics. > >> > I disagree. Moving by sub-trees is a natural generalization of sexp >> > movement for languages where parentheses and braces are rare and far >> > in-between. >> >> Yes, if one can intuit the sub trees' structure, which is not so >> simple; and if the selection of commands are sufficiently expressive >> enough to let you navigate the tree. I am not sure they are. > > There are enough situations where moving by words will also surprise > you. For example, did you know that M-f stops when it finds a > character from a different script? And yet we still use these > commands. ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-09-26 15:21 ` Mickey Petersen @ 2024-09-26 15:45 ` Eli Zaretskii 2024-09-27 5:43 ` Yuan Fu 0 siblings, 1 reply; 66+ messages in thread From: Eli Zaretskii @ 2024-09-26 15:45 UTC (permalink / raw) To: Mickey Petersen; +Cc: casouri, 73404 > From: Mickey Petersen <mickey@masteringemacs.org> > Cc: casouri@gmail.com, 73404@debbugs.gnu.org > Date: Thu, 26 Sep 2024 16:21:33 +0100 > > > I think the ability to move by parse sub-trees is also very useful. > > > > Agreed. What matters is whether the crop of new sexp commands, such as they > are, perform satisfactorily. > > Do you think the examples I listed in the original bug report match > your expectations? If so, then it is probably OK to close the bug report. Yes, I do, but let's wait for others to chime in if they have opinions on this. ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-09-26 15:45 ` Eli Zaretskii @ 2024-09-27 5:43 ` Yuan Fu 2024-09-29 16:56 ` Juri Linkov 0 siblings, 1 reply; 66+ messages in thread From: Yuan Fu @ 2024-09-27 5:43 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Mickey Petersen, 73404 > On Sep 26, 2024, at 8:45 AM, Eli Zaretskii <eliz@gnu.org> wrote: > >> From: Mickey Petersen <mickey@masteringemacs.org> >> Cc: casouri@gmail.com, 73404@debbugs.gnu.org >> Date: Thu, 26 Sep 2024 16:21:33 +0100 >> >>> I think the ability to move by parse sub-trees is also very useful. >>> >> >> Agreed. What matters is whether the crop of new sexp commands, such as they >> are, perform satisfactorily. Note that you can affect the behavior of tree-sitter sexp movement by defining the sexp “thing” in treesit-thing-settings. Js-ts-mode defines one (js--treesit-sexp-nodes) and it only considers some nodes as sexps. You might be able to tweak the sexp movement to your liking by changing it, or by directly modifying the definition for `sexp' in treesit-thing-settings. >> >> Do you think the examples I listed in the original bug report match >> your expectations? If so, then it is probably OK to close the bug report. > > Yes, I do, but let's wait for others to chime in if they have opinions > on this. ^ permalink raw reply [flat|nested] 66+ messages in thread
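The suggestion above, changing the `sexp' definition in treesit-thing-settings, might look roughly like the sketch below. It assumes the Emacs 30 shape of that variable, ((LANGUAGE (THING PREDICATE) ...) ...), and that the language symbol used by js-ts-mode is `javascript'. The names `my/with-sexp-thing' and `my/js-ts-narrow-sexp' are hypothetical, and the node types are only the ones mentioned in this thread.

```elisp
;;; -*- lexical-binding: t -*-
(defvar treesit-thing-settings)         ; defined in treesit.el

(defun my/with-sexp-thing (settings language node-types)
  "Return a copy of SETTINGS whose `sexp' thing for LANGUAGE is NODE-TYPES.
SETTINGS is assumed to have the shape of `treesit-thing-settings':
  ((LANGUAGE (THING PREDICATE) ...) ...)
Hypothetical helper for illustration; not part of Emacs."
  (let ((new (copy-tree settings))
        ;; Anchor the regexp so it matches whole node type names only.
        (pred (concat "\\`" (regexp-opt node-types) "\\'")))
    (setf (alist-get 'sexp (alist-get language new)) (list pred))
    new))

(defun my/js-ts-narrow-sexp ()
  "Treat only a few grouping nodes as sexps in the current buffer."
  (setq-local treesit-thing-settings
              (my/with-sexp-thing treesit-thing-settings 'javascript
                                  '("statement_block"
                                    "parenthesized_expression"
                                    "binary_expression"))))

(add-hook 'js-ts-mode-hook #'my/js-ts-narrow-sexp)
```

The helper copies the settings instead of mutating them in place, so the other things in the mode's own definition (such as `text') are carried over unchanged. Whether the narrowed definition gives the desired movement depends on the treesit-forward-sexp logic discussed in the rest of the thread.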
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-09-27 5:43 ` Yuan Fu @ 2024-09-29 16:56 ` Juri Linkov 2024-10-01 3:57 ` Yuan Fu 0 siblings, 1 reply; 66+ messages in thread From: Juri Linkov @ 2024-09-29 16:56 UTC (permalink / raw) To: Yuan Fu; +Cc: Eli Zaretskii, Mickey Petersen, 73404 > Note that you can affect the behavior of tree-sitter sexp movement by > defining the sexp “thing” in treesit-thing-settings. Js-ts-mode defines one > (js--treesit-sexp-nodes) and it only consider some nodes as sexp. You might > be able to tweak the sexp movement to your liking by changing it, or > directly modifying the definition for `sexp’ in treesit-thing-settings. > >>> Do you think the examples I listed in the original bug report match >>> your expectations? If so, then it is probably OK to close the bug report. >> >> Yes, I do, but let's wait for others to chime in if they have opinions >> on this. Here are some ideas how to cover more use cases. Suppose that a user wants to disable tree-sitter sexp movement completely to use the default forward-sexp-default-function. The natural way to do this would be set the list of nodes to nil: (setq js--treesit-sexp-nodes nil) However, this currently doesn't work, and requires a change like this: @@ -2290,10 +2290,12 @@ treesit-forward-sexp (treesit-node-at (point) (treesit-language-at (point))))) (or (when (and node-at-point ;; Make sure point is strictly inside node. 
- (< (treesit-node-start node-at-point) - (point) - (treesit-node-end node-at-point)) - (treesit-node-match-p node-at-point 'text t)) + (<= (treesit-node-start node-at-point) + (point) + (treesit-node-end node-at-point)) + (or (treesit-node-match-p node-at-point 'text t) + (not (treesit-node-match-p node-at-point 'sexp t)) + )) (forward-sexp-default-function arg) t) (if (> arg 0) Now, the next case: what if the user wants to use the default forward-sexp-default-function except for the 'binary_expression' like "a + b" where `C-M-f' should move from "a" to the end of "b": export const add = (a, b) => -!-a + b; should move to export const add = (a, b) => a + b; ^1 The best way for the user would be to customize: (setq js--treesit-sexp-nodes '("binary_expression")) But this is not yet handled by the condition above: (not (treesit-node-match-p node-at-point 'sexp t)) because 'node-at-point' is "identifier". So we need to use 'treesit-parent-until' to check if all parent nodes match 'js--treesit-sexp-nodes'. Then it will find the parent "binary_expression". I believe something like this will make treesit-forward-sexp more customizable. ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-09-29 16:56 ` Juri Linkov @ 2024-10-01 3:57 ` Yuan Fu 2024-10-01 17:49 ` Juri Linkov 0 siblings, 1 reply; 66+ messages in thread From: Yuan Fu @ 2024-10-01 3:57 UTC (permalink / raw) To: Juri Linkov; +Cc: Eli Zaretskii, Mickey Petersen, 73404 > On Sep 29, 2024, at 9:56 AM, Juri Linkov <juri@linkov.net> wrote: > >> Note that you can affect the behavior of tree-sitter sexp movement by >> defining the sexp “thing” in treesit-thing-settings. Js-ts-mode defines one >> (js--treesit-sexp-nodes) and it only consider some nodes as sexp. You might >> be able to tweak the sexp movement to your liking by changing it, or >> directly modifying the definition for `sexp’ in treesit-thing-settings. >> >>>> Do you think the examples I listed in the original bug report match >>>> your expectations? If so, then it is probably OK to close the bug report. >>> >>> Yes, I do, but let's wait for others to chime in if they have opinions >>> on this. > > Here are some ideas how to cover more use cases. > > Suppose that a user wants to disable tree-sitter sexp movement > completely to use the default forward-sexp-default-function. > The natural way to do this would be set the list of nodes to nil: > > (setq js--treesit-sexp-nodes nil) > > However, this currently doesn't work, and requires a change like this: > > @@ -2290,10 +2290,12 @@ treesit-forward-sexp > (treesit-node-at (point) (treesit-language-at (point))))) > (or (when (and node-at-point > ;; Make sure point is strictly inside node. 
> - (< (treesit-node-start node-at-point) > - (point) > - (treesit-node-end node-at-point)) > - (treesit-node-match-p node-at-point 'text t)) > + (<= (treesit-node-start node-at-point) > + (point) > + (treesit-node-end node-at-point)) > + (or (treesit-node-match-p node-at-point 'text t) > + (not (treesit-node-match-p node-at-point 'sexp t)) > + )) > (forward-sexp-default-function arg) > t) > (if (> arg 0) > > Now, the next case: what if the user wants to use the default > forward-sexp-default-function except for the 'binary_expression' > like "a + b" where `C-M-f' should move from "a" to the end of "b": > > export const add = (a, b) => -!-a + b; > > should move to > > export const add = (a, b) => a + b; > > ^1 > > The best way for the user would be to customize: > > (setq js--treesit-sexp-nodes '("binary_expression")) > > But this is not yet handled by the condition above: > > (not (treesit-node-match-p node-at-point 'sexp t)) > > because 'node-at-point' is "identifier". > So we need to use 'treesit-parent-until' > to check if all parent nodes match > 'js--treesit-sexp-nodes'. Then it will find > the parent "binary_expression". > > I believe something like this will make > treesit-forward-sexp more customizable. The user can modify treesit-thing-settings to alter the behavior of sexp navigation, they don’t necessarily need to use js--treesit-sexp-nodes. Maybe we should add a test for (treesit-thing-defined-p 'sexp nil) in treesit-forward-sexp? Your second example sounds useful, but right now the premise of tree-sitter sexp movement is to use the parse tree primarily, and only use the default sexp movement for comments and strings. What you envisioned seems to be the other way around: use default sexp movement by default, and only use tree-sitter movement under certain conditions. Is that few lines of change able to make such big difference in the logic? Yuan ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-10-01 3:57 ` Yuan Fu @ 2024-10-01 17:49 ` Juri Linkov 2024-10-02 6:14 ` Yuan Fu 2024-12-05 18:52 ` Juri Linkov 0 siblings, 2 replies; 66+ messages in thread From: Juri Linkov @ 2024-10-01 17:49 UTC (permalink / raw) To: Yuan Fu; +Cc: Eli Zaretskii, Mickey Petersen, 73404 > The user can modify treesit-thing-settings to alter the behavior of > sexp navigation, they don’t necessarily need to use > js--treesit-sexp-nodes. Maybe we should add a test for > (treesit-thing-defined-p 'sexp nil) in treesit-forward-sexp? I tried to do something like this, and everything works nicely with: @@ -2289,11 +2289,10 @@ treesit-forward-sexp (node-at-point (treesit-node-at (point) (treesit-language-at (point))))) (or (when (and node-at-point - ;; Make sure point is strictly inside node. - (< (treesit-node-start node-at-point) - (point) - (treesit-node-end node-at-point)) - (treesit-node-match-p node-at-point 'text t)) + (or (treesit-node-match-p node-at-point 'text t) + (not (treesit-thing-at + (if (> arg 0) (point) (1- (point))) + (treesit-thing-definition 'sexp nil))))) (forward-sexp-default-function arg) t) (if (> arg 0) The new logic is the following: if there is no sexp thing defined at point, then fall back to 'forward-sexp-default-function'. Then after (setq js--treesit-sexp-nodes '("binary_expression")) 'C-M-f' in e.g. export const add = (a, b) => -!-a + b; moves point to export const add = (a, b) => a + b-!-; The condition (if (> arg 0) (point) (1- (point))) above is necessary to allow 'C-M-b' to move back to: export const add = (a, b) => -!-a + b; Also the condition to make sure point is strictly inside node was removed to handle the case when point was at the beginning of the buffer: -!- export const add = (a, b) => a + b; to move after export-!- const add = (a, b) => a + b; by 'forward-sexp-default-function'. 
> Your second example sounds useful, but right now the premise of tree-sitter > sexp movement is to use the parse tree primarily, and only use the default > sexp movement for comments and strings. What you envisioned seems to be the > other way around: use default sexp movement by default, and only use > tree-sitter movement under certain conditions. Is that few lines of change > able to make such big difference in the logic? I think we need to support both ways: 1. opt-out - where sexp-thing definition is used by default, and only text-thing allows users to override it; 2. opt-in - where 'forward-sexp-default-function' is used by default, and user can explicitly define what sexp-things are preferable for navigation by treesit. Then in the latter case the users could prefer to use treesit sexp navigation only for constructions with "invisible parens". For example, in Ruby there are two interchangeable syntaxes for code blocks: 1. curly braces {...} that are already handled by 'forward-sexp-default-function'; 2. do...end that can't be handled by 'forward-sexp-default-function', so treesit is coming to the rescue for the case of such implicit braces. ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-10-01 17:49 ` Juri Linkov @ 2024-10-02 6:14 ` Yuan Fu 2024-12-05 18:52 ` Juri Linkov 1 sibling, 0 replies; 66+ messages in thread From: Yuan Fu @ 2024-10-02 6:14 UTC (permalink / raw) To: Juri Linkov; +Cc: Eli Zaretskii, Mickey Petersen, 73404 > On Oct 1, 2024, at 10:49 AM, Juri Linkov <juri@linkov.net> wrote: > >> The user can modify treesit-thing-settings to alter the behavior of >> sexp navigation, they don’t necessarily need to use >> js--treesit-sexp-nodes. Maybe we should add a test for >> (treesit-thing-defined-p 'sexp nil) in treesit-forward-sexp? > > I tried to do something like this, and everything works nicely with: > > @@ -2289,11 +2289,10 @@ treesit-forward-sexp > (node-at-point > (treesit-node-at (point) (treesit-language-at (point))))) > (or (when (and node-at-point > - ;; Make sure point is strictly inside node. > - (< (treesit-node-start node-at-point) > - (point) > - (treesit-node-end node-at-point)) > - (treesit-node-match-p node-at-point 'text t)) > + (or (treesit-node-match-p node-at-point 'text t) > + (not (treesit-thing-at > + (if (> arg 0) (point) (1- (point))) > + (treesit-thing-definition 'sexp nil))))) > (forward-sexp-default-function arg) > t) > (if (> arg 0) > > The new logic is the following: if there is no sexp thing defined at point, > then fall back to 'forward-sexp-default-function'. > > Then after (setq js--treesit-sexp-nodes '("binary_expression")) > 'C-M-f' in e.g. 
> > export const add = (a, b) => -!-a + b; > > moves point to > > export const add = (a, b) => a + b-!-; > > The condition (if (> arg 0) (point) (1- (point))) above > is necessary to allow 'C-M-b' to move back to: > > export const add = (a, b) => -!-a + b; > > Also the condition to make sure point is strictly inside node > was removed to handle the case when point was at the beginning > of the buffer: > > -!- > export const add = (a, b) => a + b; > > to move after > > export-!- const add = (a, b) => a + b; > > by 'forward-sexp-default-function'. Sounds good. Feel free to install on master if you think it works well :-) > >> Your second example sounds useful, but right now the premise of tree-sitter >> sexp movement is to use the parse tree primarily, and only use the default >> sexp movement for comments and strings. What you envisioned seems to be the >> other way around: use default sexp movement by default, and only use >> tree-sitter movement under certain conditions. Is that few lines of change >> able to make such big difference in the logic? > > I think we need to support both ways: > > 1. opt-out - where sexp-thing definition is used by default, > and only text-thing allows users to override it; > > 2. opt-in - where 'forward-sexp-default-function' is used by default, > and user can explicitly define what sexp-things are preferable > for navigation by treesit. > > Then in the latter case the users could prefer to use > treesit sexp navigation only for constructions with > "invisible parens". For example, in Ruby there are > two interchangeable syntaxes for code blocks: > > 1. curly braces {...} that are already handled > by 'forward-sexp-default-function'; > > 2. do...end that can't be handled by 'forward-sexp-default-function', > so treesit is coming to the rescue for the case of such > implicit braces. Sounds good to me, I wonder if there are clever way to implement this. 
If there isn’t, we’d need to define two sets of treesit-sexp functions and add a custom option to control which one to use. Seems a bit clunky to me. Yuan ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-10-01 17:49 ` Juri Linkov 2024-10-02 6:14 ` Yuan Fu @ 2024-12-05 18:52 ` Juri Linkov 2024-12-05 19:53 ` Juri Linkov 1 sibling, 1 reply; 66+ messages in thread From: Juri Linkov @ 2024-12-05 18:52 UTC (permalink / raw) To: Yuan Fu; +Cc: Eli Zaretskii, Mickey Petersen, 73404 > The new logic is the following: if there is no sexp thing defined at point, > then fall back to 'forward-sexp-default-function'. > > Then after (setq js--treesit-sexp-nodes '("binary_expression")) > 'C-M-f' in e.g. > > export const add = (a, b) => -!-a + b; > > moves point to > > export const add = (a, b) => a + b-!-; Unfortunately, I still can't find a way to handle the case where, from export const add = (a, b) -!- => a + b; typing 'C-M-f' should jump to the end of the next sexp (the end of the whole "binary_expression"): export const add = (a, b) => a + b-!-; Only tree-sitter knows about "binary_expression", so 'forward-sexp-default-function' can't be used here. ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-12-05 18:52 ` Juri Linkov @ 2024-12-05 19:53 ` Juri Linkov 2024-12-10 17:20 ` Juri Linkov 0 siblings, 1 reply; 66+ messages in thread From: Juri Linkov @ 2024-12-05 19:53 UTC (permalink / raw) To: Yuan Fu; +Cc: Eli Zaretskii, Mickey Petersen, 73404 >> The new logic is the following: if there is no sexp thing defined at point, >> then fall back to 'forward-sexp-default-function'. >> >> Then after (setq js--treesit-sexp-nodes '("binary_expression")) >> 'C-M-f' in e.g. >> >> export const add = (a, b) => -!-a + b; >> >> moves point to >> >> export const add = (a, b) => a + b-!-; > > Unfortunately, I still can't find a way to handle such case > that from > > export const add = (a, b) -!- => a + b; > > typing 'C-M-f' should jump to the end of the next sexp > (to the end of whole "binary_expression"): > > export const add = (a, b) => a + b-!-; > > since only tree-sitter knows about "binary_expression", > so 'forward-sexp-default-function' can't be used here. Actually, I have one idea of possible heuristics: 1. first try 'forward-sexp-default-function' 2. if it crosses the boundary of sexp defined by 'treesit-thing-settings' then use 'treesit-end-of-thing' instead This should work. Ok, will try. ^ permalink raw reply [flat|nested] 66+ messages in thread
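The two-step heuristic above can be sketched as a small pure function. This is only an illustration under stated assumptions: the name `my/pick-forward-sexp-target' and its three position arguments are hypothetical. In a real buffer the positions would presumably come from 'forward-sexp-default-function', from the nearest boundary of a thing defined in 'treesit-thing-settings', and from 'treesit-end-of-thing'. Treating a landing exactly on the boundary as a crossing is also an assumption.

```elisp
;; Hypothetical helper, not part of treesit.el.
(defun my/pick-forward-sexp-target (default-pos boundary treesit-pos)
  "Choose where a forward sexp motion should land.
DEFAULT-POS is where the syntax-table motion would stop, or nil if
it failed.  BOUNDARY is the nearest tree-sitter thing boundary ahead
of point, or nil if there is none.  TREESIT-POS is where the
tree-sitter motion would stop, or nil."
  (if (and default-pos
           ;; Step 1: keep the default motion while it stays short of
           ;; the boundary (or when no boundary exists at all).
           (or (null boundary) (< default-pos boundary)))
      default-pos
    ;; Step 2: the default motion failed or crossed the boundary,
    ;; so fall back to the tree-sitter position.
    treesit-pos))
```

With illustrative positions, (my/pick-forward-sexp-target 31 30 35) returns 35, because the default motion would step past the boundary at 30, while (my/pick-forward-sexp-target 12 30 35) returns 12.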
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-12-05 19:53 ` Juri Linkov @ 2024-12-10 17:20 ` Juri Linkov 2024-12-11 6:31 ` Yuan Fu 0 siblings, 1 reply; 66+ messages in thread From: Juri Linkov @ 2024-12-10 17:20 UTC (permalink / raw) To: Yuan Fu; +Cc: Eli Zaretskii, Mickey Petersen, 73404 [-- Attachment #1: Type: text/plain, Size: 1619 bytes --] >> export const add = (a, b) -!- => a + b; >> >> typing 'C-M-f' should jump to the end of the next sexp >> (to the end of whole "binary_expression"): >> >> export const add = (a, b) => a + b-!-; >> >> since only tree-sitter knows about "binary_expression", >> so 'forward-sexp-default-function' can't be used here. > > Actually, I have one idea of possible heuristics: > > 1. first try 'forward-sexp-default-function' > 2. if it crosses the boundary of sexp defined by 'treesit-thing-settings' > then use 'treesit-end-of-thing' instead > > This should work. Ok, will try. This is implemented now in the attached patch, and it works nicely. The main rule is the following: 'forward-sexp-default-function' should not go out of the current thing, neither go inside a sibling. So we use 'treesit-end-of-thing' in such cases. But when inside a thing or outside a thing, use the default function. This supposes that such things as "identifier" in js should be removed from 'treesit-thing-settings' since identifiers should be navigated the same way as such keywords as "export" and "const" using 'forward-sexp-default-function'. What should remain in 'treesit-thing-settings' are only grouping constructs such as "parenthesized_expression" and "statement_block". Removing "identifier" from 'treesit-thing-settings' exposed a problem in 'treesit-navigate-thing'. This line ((and (null next) (null prev)) parent) tries to go out of the current thing to its parent, thus breaking the main principle that 'forward-sexp' should move forward across siblings only. 
But removing this line fixed the problem: [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: treesit-forward-sexp.patch --] [-- Type: text/x-diff, Size: 3264 bytes --] diff --git a/lisp/treesit.el b/lisp/treesit.el index db8f7a7595d..4fcdbe7fc56 100644 --- a/lisp/treesit.el +++ b/lisp/treesit.el @@ -2373,21 +2373,41 @@ treesit-forward-sexp What constitutes as text and source code sexp is determined by `text' and `sexp' in `treesit-thing-settings'." (interactive "^p") - (let ((arg (or arg 1)) - (pred (or treesit-sexp-type-regexp 'sexp)) - (node-at-point - (treesit-node-at (point) (treesit-language-at (point))))) - (or (when (and node-at-point - ;; Make sure point is strictly inside node. - (< (treesit-node-start node-at-point) - (point) - (treesit-node-end node-at-point)) - (treesit-node-match-p node-at-point 'text t)) - (forward-sexp-default-function arg) - t) - (if (> arg 0) - (treesit-end-of-thing pred (abs arg) 'restricted) - (treesit-beginning-of-thing pred (abs arg) 'restricted)) + (let* ((arg (or arg 1)) + (pred (or treesit-sexp-type-regexp 'sexp)) + (current-thing (treesit-thing-at (point) pred t)) + (default-pos + (condition-case _ + (save-excursion + (forward-sexp-default-function arg) + (point)) + (scan-error nil))) + (default-pos (unless (eq (point) default-pos) default-pos)) + (sibling-pos + (save-excursion + (and (if (> arg 0) + (treesit-end-of-thing pred (abs arg) 'restricted) + (treesit-beginning-of-thing pred (abs arg) 'restricted)) + (point)))) + (sibling (when sibling-pos + (if (> arg 0) + (treesit-thing-prev sibling-pos pred) + (treesit-thing-next sibling-pos pred))))) + + ;; 'forward-sexp-default-function' should not go out of the current thing, + ;; neither go inside the next thing, neither go over the next thing + (or (when (and default-pos + (or (null current-thing) + (if (> arg 0) + (< default-pos (treesit-node-end current-thing)) + (> default-pos (treesit-node-start current-thing)))) + (or (null sibling) + 
(if (> arg 0) + (< default-pos (treesit-node-start sibling)) + (> default-pos (treesit-node-end sibling))))) + (goto-char default-pos)) + (when sibling-pos + (goto-char sibling-pos)) ;; If we couldn't move, we should signal an error and report ;; the obstacle, like `forward-sexp' does. If we couldn't ;; find a parent, we simply return nil without moving point, @@ -2849,8 +2869,7 @@ treesit-navigate-thing (if (eq tactic 'restricted) (setq pos (funcall advance - (cond ((and (null next) (null prev)) parent) - ((> arg 0) next) + (cond ((> arg 0) next) (t prev)))) ;; For `nested', it's a bit more work: ;; Move... ^ permalink raw reply related [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-12-10 17:20 ` Juri Linkov @ 2024-12-11 6:31 ` Yuan Fu 2024-12-11 15:12 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-12-19 7:34 ` bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes Juri Linkov 0 siblings, 2 replies; 66+ messages in thread From: Yuan Fu @ 2024-12-11 6:31 UTC (permalink / raw) To: Juri Linkov Cc: Theodor Thornhill, Eli Zaretskii, Mickey Petersen, 73404, Stefan Monnier > On Dec 10, 2024, at 9:20 AM, Juri Linkov <juri@linkov.net> wrote: > >>> export const add = (a, b) -!- => a + b; >>> >>> typing 'C-M-f' should jump to the end of the next sexp >>> (to the end of whole "binary_expression"): >>> >>> export const add = (a, b) => a + b-!-; >>> >>> since only tree-sitter knows about "binary_expression", >>> so 'forward-sexp-default-function' can't be used here. >> >> Actually, I have one idea of possible heuristics: >> >> 1. first try 'forward-sexp-default-function' >> 2. if it crosses the boundary of sexp defined by 'treesit-thing-settings' >> then use 'treesit-end-of-thing' instead >> >> This should work. Ok, will try. > > This is implemented now in the attached patch, and it works nicely. > > The main rule is the following: 'forward-sexp-default-function' > should not go out of the current thing, neither go inside a sibling. > So we use 'treesit-end-of-thing' in such cases. But when inside > a thing or outside a thing, use the default function. > > This supposes that such things as "identifier" in js should be > removed from 'treesit-thing-settings' since identifiers should be > navigated the same way as such keywords as "export" and "const" > using 'forward-sexp-default-function'. > > What should remain in 'treesit-thing-settings' are only grouping > constructs such as "parenthesized_expression" and "statement_block". 
Ah, this matches my idea of defining sexp in other languages as “repeatable construct/list-like construct”. We went with “every syntactic construct” at the time, which I didn’t object to, but I’m definitely happier with the repeatable construct approach. Including Stefan and Theo since they were part of the original sexp navigation discussion. My only concern: would the result be a bit unpredictable/confusing when we mix the results of two kinds of logic in such an involved way? We can push to master and try it out for a while. I use tree-sitter sexp navigation for work every day, albeit strictly for navigating list-like constructs; I use forward/backward-word for smaller navigation. > > Removing "identifier" from 'treesit-thing-settings' exposed a problem > in 'treesit-navigate-thing'. This line > > ((and (null next) (null prev)) parent) > > tries to go out of the current thing to its parent, > thus breaking the main principle that 'forward-sexp' > should move forward across siblings only. But removing > this line fixed the problem: Thanks, LGTM. > > diff --git a/lisp/treesit.el b/lisp/treesit.el > index db8f7a7595d..4fcdbe7fc56 100644 > --- a/lisp/treesit.el > +++ b/lisp/treesit.el > @@ -2373,21 +2373,41 @@ treesit-forward-sexp > What constitutes as text and source code sexp is determined > by `text' and `sexp' in `treesit-thing-settings'." > (interactive "^p") > - (let ((arg (or arg 1)) > - (pred (or treesit-sexp-type-regexp 'sexp)) > - (node-at-point > - (treesit-node-at (point) (treesit-language-at (point))))) > - (or (when (and node-at-point > - ;; Make sure point is strictly inside node. 
> - (< (treesit-node-start node-at-point) > - (point) > - (treesit-node-end node-at-point)) > - (treesit-node-match-p node-at-point 'text t)) > - (forward-sexp-default-function arg) > - t) > - (if (> arg 0) > - (treesit-end-of-thing pred (abs arg) 'restricted) > - (treesit-beginning-of-thing pred (abs arg) 'restricted)) > + (let* ((arg (or arg 1)) > + (pred (or treesit-sexp-type-regexp 'sexp)) > + (current-thing (treesit-thing-at (point) pred t)) > + (default-pos > + (condition-case _ > + (save-excursion > + (forward-sexp-default-function arg) > + (point)) > + (scan-error nil))) > + (default-pos (unless (eq (point) default-pos) default-pos)) > + (sibling-pos > + (save-excursion > + (and (if (> arg 0) > + (treesit-end-of-thing pred (abs arg) 'restricted) > + (treesit-beginning-of-thing pred (abs arg) 'restricted)) > + (point)))) > + (sibling (when sibling-pos > + (if (> arg 0) > + (treesit-thing-prev sibling-pos pred) > + (treesit-thing-next sibling-pos pred))))) > + > + ;; 'forward-sexp-default-function' should not go out of the current thing, > + ;; neither go inside the next thing, neither go over the next thing > + (or (when (and default-pos > + (or (null current-thing) > + (if (> arg 0) > + (< default-pos (treesit-node-end current-thing)) > + (> default-pos (treesit-node-start current-thing)))) > + (or (null sibling) > + (if (> arg 0) > + (< default-pos (treesit-node-start sibling)) > + (> default-pos (treesit-node-end sibling))))) > + (goto-char default-pos)) > + (when sibling-pos > + (goto-char sibling-pos)) > ;; If we couldn't move, we should signal an error and report > ;; the obstacle, like `forward-sexp' does. If we couldn't > ;; find a parent, we simply return nil without moving point, > @@ -2849,8 +2869,7 @@ treesit-navigate-thing > (if (eq tactic 'restricted) > (setq pos (funcall > advance > - (cond ((and (null next) (null prev)) parent) > - ((> arg 0) next) > + (cond ((> arg 0) next) > (t prev)))) > ;; For `nested', it's a bit more work: > ;; Move... 
^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-12-11 6:31 ` Yuan Fu @ 2024-12-11 15:12 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-12-11 15:29 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors ` (2 more replies) 2024-12-19 7:34 ` bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes Juri Linkov 1 sibling, 3 replies; 66+ messages in thread From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-12-11 15:12 UTC (permalink / raw) To: Yuan Fu Cc: Theodor Thornhill, Eli Zaretskii, Mickey Petersen, 73404, Juri Linkov > Ah, this matches my idea of defining sexp in other languages as “repeatable > construct/list-like construct”. We went with “every syntactic construct” at > the time, which I didn’t object to, but I’m definitely happier with the > repeatable construct approach. Including Stefan and Theo since they were > part of the original sexp navigation discussion. FWIW, we have both `forward-list` and `forward-list` and the new behavior you suggest sounds closer to the historical behavior of `forward-list` than `forward-sexp`. Stefan ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-12-11 15:12 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-12-11 15:29 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-12-11 16:50 ` Mickey Petersen 2024-12-11 18:27 ` Yuan Fu 2 siblings, 0 replies; 66+ messages in thread From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-12-11 15:29 UTC (permalink / raw) To: Yuan Fu Cc: Theodor Thornhill, Eli Zaretskii, Mickey Petersen, 73404, Juri Linkov > FWIW, we have both `forward-list` and `forward-list` and the new ^^^^ sexp Otherwise it sounds smarter than it is. Stefan ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-12-11 15:12 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-12-11 15:29 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-12-11 16:50 ` Mickey Petersen 2024-12-11 18:27 ` Yuan Fu 2 siblings, 0 replies; 66+ messages in thread From: Mickey Petersen @ 2024-12-11 16:50 UTC (permalink / raw) To: Stefan Monnier Cc: 73404, Yuan Fu, Theodor Thornhill, Eli Zaretskii, Juri Linkov Stefan Monnier <monnier@iro.umontreal.ca> writes: >> Ah, this matches my idea of defining sexp in other languages as “repeatable >> construct/list-like construct”. We went with “every syntactic construct” at >> the time, which I didn’t object to, but I’m definitely happier with the >> repeatable construct approach. Including Stefan and Theo since they were >> part of the original sexp navigation discussion. > > FWIW, we have both `forward-list` and `forward-list` and the new > behavior you suggest sounds closer to the historical behavior of > `forward-list` than `forward-sexp`. > Indeed, in Combobulate `<forward/backward>-list' is explicitly used for sibling navigation, and `<forward/backward>-sexp' instead does what it does in plain major modes, but tweaked ever so slightly. ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-12-11 15:12 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-12-11 15:29 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-12-11 16:50 ` Mickey Petersen @ 2024-12-11 18:27 ` Yuan Fu 2024-12-12 7:17 ` Juri Linkov 2 siblings, 1 reply; 66+ messages in thread From: Yuan Fu @ 2024-12-11 18:27 UTC (permalink / raw) To: Stefan Monnier Cc: Theodor Thornhill, Eli Zaretskii, Mickey Petersen, 73404, Juri Linkov > On Dec 11, 2024, at 7:12 AM, Stefan Monnier <monnier@iro.umontreal.ca> wrote: > >> Ah, this matches my idea of defining sexp in other languages as “repeatable >> construct/list-like construct”. We went with “every syntactic construct” at >> the time, which I didn’t object to, but I’m definitely happier with the >> repeatable construct approach. Including Stefan and Theo since they were >> part of the original sexp navigation discussion. > > FWIW, we have both `forward-list` and `forward-list` and the new > behavior you suggest sounds closer to the historical behavior of > `forward-list` than `forward-sexp`. > > > Stefan > Actually, what’s the difference between forward-list and forward-sexp? I always thought they are the same at least for Lisp. Yuan ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-12-11 18:27 ` Yuan Fu @ 2024-12-12 7:17 ` Juri Linkov 2024-12-12 7:40 ` Eli Zaretskii 0 siblings, 1 reply; 66+ messages in thread From: Juri Linkov @ 2024-12-12 7:17 UTC (permalink / raw) To: Yuan Fu Cc: Theodor Thornhill, Eli Zaretskii, Mickey Petersen, Stefan Monnier, 73404 >>> Ah, this matches my idea of defining sexp in other languages as “repeatable >>> construct/list-like construct”. We went with “every syntactic construct” at >>> the time, which I didn’t object to, but I’m definitely happier with the >>> repeatable construct approach. Including Stefan and Theo since they were >>> part of the original sexp navigation discussion. >> >> FWIW, we have both `forward-list` and `forward-list` and the new >> behavior you suggest sounds closer to the historical behavior of >> `forward-list` than `forward-sexp`. > > Actually, what’s the difference between forward-list and forward-sexp? > I always thought they are the same at least for Lisp. forward-sexp moves over a balanced parenthetical group like forward-list does. In addition, forward-sexp also moves over an atom such as a symbol or a number. The problem is that treesit adds too much structural information to such simple things as a symbol and a number. For example, in js a simple keyword "export" gets the "(export_statement export" subtree, and another keyword "const" gets "(lexical_declaration kind: const", etc. Therefore for such symbols forward-sexp needs to bypass the structure and use simpler syntactic information to move over them as on a flat list. ^ permalink raw reply [flat|nested] 66+ messages in thread
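The difference between the two commands described above can be seen in a plain Lisp buffer, with no tree-sitter involved. The following is a minimal sketch; the helper name `my/sexp-vs-list-demo' is hypothetical, and it only uses the standard `forward-sexp' and `forward-list' commands in a temporary buffer.

```elisp
;; Hypothetical demo helper: compare one step of `forward-sexp'
;; with one step of `forward-list' from the start of the same text.
(defun my/sexp-vs-list-demo ()
  "Return (SEXP-POS . LIST-POS) for the text \"foo (bar baz) qux\"."
  (with-temp-buffer
    (emacs-lisp-mode)
    (insert "foo (bar baz) qux")
    (let (sexp-pos list-pos)
      ;; `forward-sexp' treats the atom `foo' as one sexp and stops
      ;; right after it.
      (goto-char (point-min))
      (forward-sexp 1)
      (setq sexp-pos (point))
      ;; `forward-list' ignores atoms and stops after the next
      ;; balanced parenthetical group.
      (goto-char (point-min))
      (forward-list 1)
      (setq list-pos (point))
      (cons sexp-pos list-pos))))
```

Evaluating (my/sexp-vs-list-demo) returns (4 . 14): forward-sexp stops after "foo", while forward-list skips the atom and stops after "(bar baz)".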
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-12-12 7:17 ` Juri Linkov @ 2024-12-12 7:40 ` Eli Zaretskii 2024-12-12 7:58 ` Juri Linkov 0 siblings, 1 reply; 66+ messages in thread From: Eli Zaretskii @ 2024-12-12 7:40 UTC (permalink / raw) To: Juri Linkov; +Cc: theo, casouri, mickey, monnier, 73404 > From: Juri Linkov <juri@linkov.net> > Cc: Stefan Monnier <monnier@iro.umontreal.ca>, Eli Zaretskii > <eliz@gnu.org>, Mickey Petersen <mickey@masteringemacs.org>, > 73404@debbugs.gnu.org, Theodor Thornhill <theo@thornhill.no> > Date: Thu, 12 Dec 2024 09:17:44 +0200 > > >>> Ah, this matches my idea of defining sexp in other languages as “repeatable > >>> construct/list-like construct”. We went with “every syntactic construct” at > >>> the time, which I didn’t object to, but I’m definitely happier with the > >>> repeatable construct approach. Including Stefan and Theo since they were > >>> part of the original sexp navigation discussion. > >> > >> FWIW, we have both `forward-list` and `forward-list` and the new > >> behavior you suggest sounds closer to the historical behavior of > >> `forward-list` than `forward-sexp`. > > > > Actually, what’s the difference between forward-list and forward-sexp? > > I always thought they are the same at least for Lisp. > > forward-sexp moves over a balanced parenthetical group like > forward-list does. Plus forward-sexp also moves over an atom > such as a symbol, a number. > > The problem is that treesit adds too much structural information > to such simple things as a symbol and a number. For example, in js > a simple keyword "export" gets the "(export_statement export" subtree, > Another keyword "const" gets "(lexical_declaration kind: const", etc. > > Therefore for such symbols forward-sexp needs to bypass the structure > and use simpler syntactic information to move over them like on a flat list. 
If you mean we should ignore the information provided by tree-sitter and instead use our own syntactic information, then that sounds wrong to me, FWIW. Why cannot we understand enough of the tree-sitter structural information to move like we want? Presumably, the structural information provided by tree-sitter is a portion of a parse tree, which to me means we should be able to move between the parse tree's nodes as long as we understand the tree and can interpret it in our terms. Aren't there some grammar-agnostic traits of tree-sitter nodes that would allow us to interpret the nodes in language-independent terms? If that is not available, then each major mode will have to provide treesit.el with a way to interpret the tree-sitter nodes of the corresponding grammar in a way that will allow sexp movement, thus providing an abstraction layer that treesit.el could use for the movement commands. ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-12-12 7:40 ` Eli Zaretskii @ 2024-12-12 7:58 ` Juri Linkov 2024-12-12 8:14 ` Juri Linkov 0 siblings, 1 reply; 66+ messages in thread From: Juri Linkov @ 2024-12-12 7:58 UTC (permalink / raw) To: Eli Zaretskii; +Cc: theo, casouri, mickey, monnier, 73404 >> forward-sexp moves over a balanced parenthetical group like >> forward-list does. Plus forward-sexp also moves over an atom >> such as a symbol, a number. >> >> The problem is that treesit adds too much structural information >> to such simple things as a symbol and a number. For example, in js >> a simple keyword "export" gets the "(export_statement export" subtree, >> Another keyword "const" gets "(lexical_declaration kind: const", etc. >> >> Therefore for such symbols forward-sexp needs to bypass the structure >> and use simpler syntactic information to move over them like on a flat list. > > If you mean we should ignore the information provided by tree-sitter > and instead use our own syntactic information, then that sounds wrong > to me, FWIW. Why cannot we understand enough of the tree-sitter > structural information to move like we want? Presumably, the > structural information provided by tree-sitter is a portion of a parse > tree, which to me means we should be able to move between the parse > tree's nodes as long as we understand the tree and can interpret it in > our terms. > > Aren't there some grammar-agnostic traits of tree-sitter nodes that > would allow us to interpret the nodes in language-independent terms? > If that is not available, then each major mode will have to provide > treesit.el with a way to interpret the tree-sitter nodes of the > corresponding grammar in a way that will allow sexp movement, thus > providing an abstraction layer that treesit.el could use for the > movement commands. Maybe it would be possible to use something like 'flatten-tree' on the treesit's syntax tree? 
But this will require the addition of a lot of rules to specify what nodes should be flattened. ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-12-12 7:58 ` Juri Linkov @ 2024-12-12 8:14 ` Juri Linkov 2024-12-12 16:31 ` Juri Linkov 0 siblings, 1 reply; 66+ messages in thread From: Juri Linkov @ 2024-12-12 8:14 UTC (permalink / raw) To: Eli Zaretskii; +Cc: mickey, casouri, theo, monnier, 73404 > Maybe it would be possible to use something like 'flatten-tree' > on the treesit's syntax tree? But this will require the addition > of a lot of rules to specify what nodes should be flattened. A better idea: instead of 'sexp' in treesit-thing-settings define separately 'list' and 'atom', e.g. replace (setq-local treesit-thing-settings `((javascript (sexp ,(js--regexp-opt-symbol js--treesit-sexp-nodes))))) with (setq-local treesit-thing-settings `((javascript (list ,(js--regexp-opt-symbol js--treesit-list-nodes)) (atom ,(js--regexp-opt-symbol js--treesit-atom-nodes))))) ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-12-12 8:14 ` Juri Linkov @ 2024-12-12 16:31 ` Juri Linkov 2024-12-12 17:49 ` Juri Linkov 0 siblings, 1 reply; 66+ messages in thread From: Juri Linkov @ 2024-12-12 16:31 UTC (permalink / raw) To: Eli Zaretskii; +Cc: theo, casouri, mickey, monnier, 73404 > A better idea: instead of 'sexp' in treesit-thing-settings > define separately 'list' and 'atom', e.g. replace > > (setq-local treesit-thing-settings > `((javascript > (sexp ,(js--regexp-opt-symbol js--treesit-sexp-nodes))))) > > with > > (setq-local treesit-thing-settings > `((javascript > (list ,(js--regexp-opt-symbol js--treesit-list-nodes)) > (atom ,(js--regexp-opt-symbol js--treesit-atom-nodes))))) Still the problem is that using the atoms from the tree doesn't provide backward-compatibility with non-ts modes and how C-M-f moves on non-list atoms there. One way would be to extract anonymous text leaf nodes such as "export" and "const" from (export_statement "export" declaration: (lexical_declaration kind: "const" But still need to check symbol/word syntax to omit such nodes as "+" from (binary_expression left: (identifier) operator: "+" Therefore to provide backward-compatibility with non-ts modes in regard to C-M-f navigation, navigation on atoms should follow the Sword/Ssymbol rules of 'scan_lists' with non-nil 'sexpflag'. So an atom should be a syntactical entity, not structural. This means that treesit-forward-sexp should use the 'list' thing with syntactical atoms. For example, for 'C-M-f' on var p = { case: 'zzzz', -!-default: 'donkey', tee: 'ornery' }; in js-ts-mode it would be unexpected to move to var p = { case: 'zzzz', default: 'donkey'-!-, tee: 'ornery' }; because js-mode moves to var p = { case: 'zzzz', default-!-: 'donkey', tee: 'ornery' }; But anyway 'list' should be customizable as 'sexp' already is. 
OTOH, transpose-sexps should use treesit 'sexp' with more fine-grained lists that are not suitable for forward-sexp. (And we have no transpose-lists.) For example, it's expected for transpose-sexps to transpose a key:value pair defined by 'sexp': var p = { default: 'donkey', -!-case: 'zzzz', tee: 'ornery' }; which will be a big improvement compared to js-mode. This will also close bug#60655. And if someone wants to transpose adjacent atoms (symbols or words), there is transpose-words. In summary: The current implementation in treesit-forward-sexp is more like forward-list. So let's rename it to treesit-forward-list, then create a new implementation of treesit-forward-sexp that uses the 'list' thing together with syntactical atoms. Another variant is to leave treesit-forward-sexp as is, and create a new function treesit-forward-sexp-with-list that uses the 'list' thing. In either case, let's keep the current implementation of treesit-transpose-sexps that uses the 'sexp' thing. ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-12-12 16:31 ` Juri Linkov @ 2024-12-12 17:49 ` Juri Linkov 2024-12-12 19:13 ` Eli Zaretskii 2024-12-18 7:37 ` Juri Linkov 0 siblings, 2 replies; 66+ messages in thread From: Juri Linkov @ 2024-12-12 17:49 UTC (permalink / raw) To: Eli Zaretskii; +Cc: mickey, casouri, theo, monnier, 73404 [-- Attachment #1: Type: text/plain, Size: 372 bytes --] > Another variant is to leave treesit-forward-sexp as is, > and create a new function treesit-forward-sexp-with-list > that uses the 'list' thing. This patch keeps the current function treesit-forward-sexp, and creates a new function treesit-forward-sexp-list that uses the 'sexp-list' thing to navigate lists while using forward-sexp-default-function to navigate atoms: [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: treesit-forward-sexp-list.patch --] [-- Type: text/x-diff, Size: 5601 bytes --] diff --git a/lisp/treesit.el b/lisp/treesit.el index db8f7a7595d..f064be55b9c 100644 --- a/lisp/treesit.el +++ b/lisp/treesit.el @@ -2400,6 +2400,68 @@ treesit-forward-sexp (treesit-node-start boundary) (treesit-node-end boundary))))))) +(defun treesit-forward-sexp-list (&optional arg) + "Tree-sitter implementation for `forward-sexp-function'. + +ARG is described in the docstring of `forward-sexp-function'. + +If point is inside a text environment where tree-sitter is not +supported, go forward a sexp using `forward-sexp-default-function'. +If point is inside code, use tree-sitter functions with the +following behavior. If there are no further sexps to move across, +signal `scan-error' like `forward-sexp' does. If point is already +at top-level, return nil without moving point. + +What constitutes as text and source code sexp is determined +by `text' and `sexp' in `treesit-thing-settings'." 
+ (interactive "^p") + (let* ((arg (or arg 1)) + (pred (or treesit-sexp-type-regexp 'sexp-list)) + (current-thing (treesit-thing-at (point) pred t)) + (default-pos + (condition-case _ + (save-excursion + (forward-sexp-default-function arg) + (point)) + (scan-error nil))) + (default-pos (unless (eq (point) default-pos) default-pos)) + (sibling-pos + (save-excursion + (and (if (> arg 0) + (treesit-end-of-thing pred (abs arg) 'restricted) + (treesit-beginning-of-thing pred (abs arg) 'restricted)) + (point)))) + (sibling (when sibling-pos + (if (> arg 0) + (treesit-thing-prev sibling-pos pred) + (treesit-thing-next sibling-pos pred))))) + + ;; 'forward-sexp-default-function' should not go out of the current thing, + ;; neither go inside the next thing, neither go over the next thing + (or (when (and default-pos + (or (null current-thing) + (if (> arg 0) + (< default-pos (treesit-node-end current-thing)) + (> default-pos (treesit-node-start current-thing)))) + (or (null sibling) + (if (> arg 0) + (< default-pos (treesit-node-start sibling)) + (> default-pos (treesit-node-end sibling))))) + (goto-char default-pos)) + (when sibling-pos + (goto-char sibling-pos)) + ;; If we couldn't move, we should signal an error and report + ;; the obstacle, like `forward-sexp' does. If we couldn't + ;; find a parent, we simply return nil without moving point, + ;; then functions like `up-list' will signal "at top level". + (when-let* ((parent (treesit-thing-at (point) pred t)) + (boundary (if (> arg 0) + (treesit-node-child parent -1) + (treesit-node-child parent 0)))) + (signal 'scan-error (list "No more sexp to move across" + (treesit-node-start boundary) + (treesit-node-end boundary))))))) + (defun treesit-transpose-sexps (&optional arg) "Tree-sitter `transpose-sexps' function. ARG is the same as in `transpose-sexps'. 
@@ -2849,7 +2911,7 @@ treesit-navigate-thing (if (eq tactic 'restricted) (setq pos (funcall advance - (cond ((and (null next) (null prev)) parent) + (cond ((and (null next) (null prev) (not (eq thing 'sexp-list))) parent) ((> arg 0) next) (t prev)))) ;; For `nested', it's a bit more work: @@ -3246,6 +3308,9 @@ treesit-major-mode-setup (setq-local forward-sexp-function #'treesit-forward-sexp) (setq-local transpose-sexps-function #'treesit-transpose-sexps)) + (when (treesit-thing-defined-p 'sexp-list nil) + (setq-local forward-sexp-function #'treesit-forward-sexp-list)) + (when (treesit-thing-defined-p 'sentence nil) (setq-local forward-sentence-function #'treesit-forward-sentence)) diff --git a/lisp/progmodes/js.el b/lisp/progmodes/js.el index dbf721e8d0f..c4d33564e80 100644 --- a/lisp/progmodes/js.el +++ b/lisp/progmodes/js.el @@ -3878,6 +3878,19 @@ js--treesit-sexp-nodes "Nodes that designate sexps in JavaScript. See `treesit-thing-settings' for more information.") +(defvar js--treesit-sexp-list-nodes + '("formal_parameters" + "arguments" + "statement_block" + "parenthesized_expression" + "switch_body" + "array" + "object" + "string" + "regex") + "Nodes that designate lists in JavaScript. +See `treesit-thing-settings' for more information.") + (defvar js--treesit-jsdoc-beginning-regexp (rx bos "/**") "Regular expression matching the beginning of a jsdoc block comment.") @@ -3921,6 +3934,7 @@ js-ts-mode (setq-local treesit-thing-settings `((javascript (sexp ,(js--regexp-opt-symbol js--treesit-sexp-nodes)) + (sexp-list ,(js--regexp-opt-symbol js--treesit-sexp-list-nodes)) (sentence ,(js--regexp-opt-symbol js--treesit-sentence-nodes)) (text ,(js--regexp-opt-symbol '("comment" "string_fragment")))))) ^ permalink raw reply related [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-12-12 17:49 ` Juri Linkov @ 2024-12-12 19:13 ` Eli Zaretskii 2024-12-13 7:06 ` Juri Linkov 2024-12-18 7:37 ` Juri Linkov 1 sibling, 1 reply; 66+ messages in thread From: Eli Zaretskii @ 2024-12-12 19:13 UTC (permalink / raw) To: Juri Linkov; +Cc: mickey, casouri, theo, monnier, 73404 > From: Juri Linkov <juri@linkov.net> > Cc: theo@thornhill.no, casouri@gmail.com, mickey@masteringemacs.org, > monnier@iro.umontreal.ca, 73404@debbugs.gnu.org > Date: Thu, 12 Dec 2024 19:49:30 +0200 > > > Another variant is to leave treesit-forward-sexp as is, > > and create a new function treesit-forward-sexp-with-list > > that uses the 'list' thing. > > This patch keeps the current function treesit-forward-sexp, > and creates a new function treesit-forward-sexp-list > that uses the 'sexp-list' thing to navigate lists while > using forward-sexp-default-function to navigate atoms: Introducing a new command raises the question of how users should navigate using the two. Will C-M-<RIGHT> invoke forward-sexp or the new command (or maybe one or the other in some dwim-ish way)? What are the problems with teaching treesit-forward-sexp to do what the new command does? IOW, why do we need a new command? ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-12-12 19:13 ` Eli Zaretskii @ 2024-12-13 7:06 ` Juri Linkov 2024-12-14 11:02 ` Eli Zaretskii 0 siblings, 1 reply; 66+ messages in thread From: Juri Linkov @ 2024-12-13 7:06 UTC (permalink / raw) To: Eli Zaretskii; +Cc: mickey, casouri, theo, monnier, 73404 >> This patch keeps the current function treesit-forward-sexp, >> and creates a new function treesit-forward-sexp-list >> that uses the 'sexp-list' thing to navigate lists while >> using forward-sexp-default-function to navigate atoms: > > Introducing a new command raises the question of how users should > navigate using the two. Will C-M-<RIGHT> invoke forward-sexp or the > new command (or maybe one or the other in some dwim-ish way)? > > What are the problems with teaching treesit-forward-sexp to do what the > new command does? IOW, why do we need a new command? Actually, it's not a new command, but a new function for C-M-<RIGHT> via forward-sexp-function. This is from treesit-major-mode-setup: (when (treesit-thing-defined-p 'sexp nil) (setq-local forward-sexp-function #'treesit-forward-sexp) (setq-local transpose-sexps-function #'treesit-transpose-sexps)) (when (treesit-thing-defined-p 'sexp-list nil) (setq-local forward-sexp-function #'treesit-forward-sexp-list)) When a ts-mode defines the 'sexp-list' thing, only then it's used. Otherwise the current implementation with 'sexp' is used for C-M-<RIGHT>. ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-12-13 7:06 ` Juri Linkov @ 2024-12-14 11:02 ` Eli Zaretskii 2024-12-14 18:14 ` Juri Linkov 0 siblings, 1 reply; 66+ messages in thread From: Eli Zaretskii @ 2024-12-14 11:02 UTC (permalink / raw) To: Juri Linkov; +Cc: mickey, casouri, theo, monnier, 73404 > From: Juri Linkov <juri@linkov.net> > Cc: theo@thornhill.no, casouri@gmail.com, mickey@masteringemacs.org, > monnier@iro.umontreal.ca, 73404@debbugs.gnu.org > Date: Fri, 13 Dec 2024 09:06:04 +0200 > > >> This patch keeps the current function treesit-forward-sexp, > >> and creates a new function treesit-forward-sexp-list > >> that uses the 'sexp-list' thing to navigate lists while > >> using forward-sexp-default-function to navigate atoms: > > > > Introducing a new command raises the question of how users should > > navigate using the two. Will C-M-<RIGHT> invoke forward-sexp or the > > new command (or maybe one or the other in some dwim-ish way)? > > > > What are the problems with teaching treesit-forward-sexp to do what the > > new command does? IOW, why do we need a new command? > > Actually, it's not a new command, but a new function for C-M-<RIGHT> > via forward-sexp-function. It's an interactive function, so it's a command. > This is from treesit-major-mode-setup: > > (when (treesit-thing-defined-p 'sexp nil) > (setq-local forward-sexp-function #'treesit-forward-sexp) > (setq-local transpose-sexps-function #'treesit-transpose-sexps)) > > (when (treesit-thing-defined-p 'sexp-list nil) > (setq-local forward-sexp-function #'treesit-forward-sexp-list)) > > When a ts-mode defines the 'sexp-list' thing, only then it's used. > Otherwise the current implementation with 'sexp' is used for C-M-<RIGHT>. Then why is the new function interactive? ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-12-14 11:02 ` Eli Zaretskii @ 2024-12-14 18:14 ` Juri Linkov 0 siblings, 0 replies; 66+ messages in thread From: Juri Linkov @ 2024-12-14 18:14 UTC (permalink / raw) To: Eli Zaretskii; +Cc: mickey, casouri, theo, monnier, 73404 [-- Attachment #1: Type: text/plain, Size: 937 bytes --] >> Actually, it's not a new command, but a new function for C-M-<RIGHT> >> via forward-sexp-function. > > It's an interactive function, so it's a command. > >> When a ts-mode defines the 'sexp-list' thing, only then it's used. >> Otherwise the current implementation with 'sexp' is used for C-M-<RIGHT>. > > Then why is the new function interactive? Ah, I didn't notice it's interactive! treesit-forward-sexp-list was based on treesit-forward-sexp that has the interactive spec added by Theo in the commit 207901457c01. I guess that Theo decided to make this function interactive for such use case that users could use it separately or bind it to a key. What really needs to be interactive is the new function treesit-forward-list added in the following patch because there is no special variable forward-list-function for forward-list like there is forward-sexp-function for forward-sexp. So users might want to use it separately: [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: treesit-forward-list.patch --] [-- Type: text/x-diff, Size: 5315 bytes --] diff --git a/lisp/treesit.el b/lisp/treesit.el index 18200acf53f..a1c012b6d2f 100644 --- a/lisp/treesit.el +++ b/lisp/treesit.el @@ -2366,6 +2366,11 @@ treesit-sexp-type-regexp however, smaller in scope than sentences. 
This is used by `treesit-forward-sexp' and friends.") +(defun treesit-forward-list (&optional arg) + (interactive "^p") + (let ((treesit-sexp-type-regexp 'sexp-list)) + (treesit-forward-sexp arg))) + (defun treesit-forward-sexp (&optional arg) "Tree-sitter implementation for `forward-sexp-function'. @@ -2382,18 +2387,8 @@ treesit-forward-sexp by `text' and `sexp' in `treesit-thing-settings'." (interactive "^p") (let ((arg (or arg 1)) - (pred (or treesit-sexp-type-regexp 'sexp)) - (node-at-point - (treesit-node-at (point) (treesit-language-at (point))))) - (or (when (and node-at-point - ;; Make sure point is strictly inside node. - (< (treesit-node-start node-at-point) - (point) - (treesit-node-end node-at-point)) - (treesit-node-match-p node-at-point 'text t)) - (forward-sexp-default-function arg) - t) - (if (> arg 0) + (pred (or treesit-sexp-type-regexp 'sexp))) + (or (if (> arg 0) (treesit-end-of-thing pred (abs arg) 'restricted) (treesit-beginning-of-thing pred (abs arg) 'restricted)) ;; If we couldn't move, we should signal an error and report @@ -2408,6 +2403,63 @@ treesit-forward-sexp (treesit-node-start boundary) (treesit-node-end boundary))))))) +(defun treesit-forward-sexp-list (&optional arg) + "Tree-sitter implementation for `forward-sexp-function'. + +ARG is described in the docstring of `forward-sexp-function'. + +If point is inside a text environment where tree-sitter is not +supported, go forward a sexp using `forward-sexp-default-function'. +If point is inside code, use tree-sitter functions with the +following behavior. If there are no further sexps to move across, +signal `scan-error' like `forward-sexp' does. If point is already +at top-level, return nil without moving point. + +What constitutes as text and source code sexp is determined +by `text' and `sexp' in `treesit-thing-settings'." 
+ (interactive "^p") + (let* ((arg (or arg 1)) + (pred 'sexp-list) + (default-pos + (condition-case _ + (save-excursion + (forward-sexp-default-function arg) + (point)) + (scan-error nil))) + (default-pos (unless (eq (point) default-pos) default-pos)) + (sibling-pos + (when default-pos + (save-excursion + (and (if (> arg 0) + (treesit-end-of-thing pred (abs arg) 'restricted) + (treesit-beginning-of-thing pred (abs arg) 'restricted)) + (point))))) + (sibling (when sibling-pos + (if (> arg 0) + (treesit-thing-prev sibling-pos pred) + (treesit-thing-next sibling-pos pred)))) + (sibling (when (and sibling + (if (> arg 0) + (<= (point) (treesit-node-start sibling)) + (>= (point) (treesit-node-start sibling)))) + sibling)) + (current-thing (when default-pos + (treesit-thing-at (point) pred t)))) + + ;; 'forward-sexp-default-function' should not go out of the current thing, + ;; neither go inside the next thing, neither go over the next thing + (or (when (and default-pos + (or (null current-thing) + (if (> arg 0) + (< default-pos (treesit-node-end current-thing)) + (> default-pos (treesit-node-start current-thing)))) + (or (null sibling) + (if (> arg 0) + (<= default-pos (treesit-node-start sibling)) + (>= default-pos (treesit-node-end sibling))))) + (goto-char default-pos)) + (treesit-forward-list arg)))) + (defun treesit-transpose-sexps (&optional arg) "Tree-sitter `transpose-sexps' function. ARG is the same as in `transpose-sexps'. 
@@ -2857,7 +2909,7 @@ treesit-navigate-thing (if (eq tactic 'restricted) (setq pos (funcall advance - (cond ((and (null next) (null prev)) parent) + (cond ((and (null next) (null prev) (not (eq thing 'sexp-list))) parent) ((> arg 0) next) (t prev)))) ;; For `nested', it's a bit more work: @@ -3254,6 +3306,9 @@ treesit-major-mode-setup (setq-local forward-sexp-function #'treesit-forward-sexp) (setq-local transpose-sexps-function #'treesit-transpose-sexps)) + (when (treesit-thing-defined-p 'sexp-list nil) + (setq-local forward-sexp-function #'treesit-forward-sexp-list)) + (when (treesit-thing-defined-p 'sentence nil) (setq-local forward-sentence-function #'treesit-forward-sentence)) ^ permalink raw reply related [flat|nested] 66+ messages in thread
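The acceptance test that the patch above applies to the position found by `forward-sexp-default-function' can be sketched on plain buffer positions, without any parser. This is only an illustration under assumptions: the function name `my-sexp-default-pos-ok-p' and the (START . END) cons representation are hypothetical stand-ins for the `treesit-node-start' and `treesit-node-end' calls in the patch.

```elisp
;; Hypothetical helper, not part of treesit.el.  CURRENT and SIBLING
;; are (START . END) conses standing in for tree-sitter nodes, or nil.
(defun my-sexp-default-pos-ok-p (arg default-pos current sibling)
  "Return non-nil if DEFAULT-POS is an acceptable movement target.
ARG > 0 means forward movement, otherwise backward.  DEFAULT-POS
is where the syntax-table based movement would land, or nil."
  (and default-pos
       ;; Must not leave the list that currently encloses point.
       (or (null current)
           (if (> arg 0)
               (< default-pos (cdr current))
             (> default-pos (car current))))
       ;; Must not enter or cross the next sibling list.
       (or (null sibling)
           (if (> arg 0)
               (<= default-pos (car sibling))
             (>= default-pos (cdr sibling))))))

;; Forward inside a list spanning 1-50 whose next sibling list is 10-20:
(my-sexp-default-pos-ok-p 1 8 '(1 . 50) '(10 . 20))   ; => t
(my-sexp-default-pos-ok-p 1 20 '(1 . 50) '(10 . 20))  ; => nil
```

When the check fails, the patch falls back to `treesit-forward-list', so the tree-sitter list is crossed as one unit.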
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-12-12 17:49 ` Juri Linkov 2024-12-12 19:13 ` Eli Zaretskii @ 2024-12-18 7:37 ` Juri Linkov 2024-12-19 4:04 ` Yuan Fu 1 sibling, 1 reply; 66+ messages in thread From: Juri Linkov @ 2024-12-18 7:37 UTC (permalink / raw) To: Yuan Fu; +Cc: theo, mickey, monnier, 73404 While testing treesit-forward-sexp-list, I discovered that thing-navigation functions are not restricted to named nodes. I wonder if there is a reason to find anonymous nodes as things? The problem was found with the node "unless" in Ruby: unless cond a += 1 else b -= 1 end Here the named node 'unless' has exactly the same name as the anonymous node with the text "unless": (unless "unless" condition: (identifier) Finding anonymous nodes breaks forward-sexp when point is on "unless": un-!-less cond a += 1 else b -= 1 end because (treesit-thing-at (point) 'sexp t) finds #<treesit-node "unless" in 156-162> instead of #<treesit-node unless in 156-203> Also this breaks backward-sexp and backward-up-list because treesit--thing-sibling finds the anonymous node "unless" as a previous sibling instead of the named node 'unless' as a parent. Would the right solution be to check if the found thing is a named node?
With something like: diff --git a/lisp/treesit.el b/lisp/treesit.el index 18200acf53f..9ad879ee40c 100644 --- a/lisp/treesit.el +++ b/lisp/treesit.el @@ -2711,6 +2774,7 @@ treesit--thing-sibling (lambda (n) (>= (treesit-node-start n) pos)))) (iter-pred (lambda (node) (and (treesit-node-match-p node thing t) + (treesit-node-check node 'named) (funcall pos-pred node)))) (sibling nil)) (when cursor @@ -2760,6 +2824,7 @@ treesit-thing-at (let* ((cursor (treesit-node-at pos)) (iter-pred (lambda (node) (and (treesit-node-match-p node thing t) + (treesit-node-check node 'named) (if strict (< (treesit-node-start node) pos) (<= (treesit-node-start node) pos)) ^ permalink raw reply related [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-12-18 7:37 ` Juri Linkov @ 2024-12-19 4:04 ` Yuan Fu 2024-12-19 7:14 ` Juri Linkov 2024-12-19 7:18 ` bug#74963: Ambiguous treesit named and anonymous nodes in ruby-ts-mode Juri Linkov 0 siblings, 2 replies; 66+ messages in thread From: Yuan Fu @ 2024-12-19 4:04 UTC (permalink / raw) To: Juri Linkov; +Cc: theo, Mickey Petersen, monnier, 73404 > On Dec 17, 2024, at 11:37 PM, Juri Linkov <juri@linkov.net> wrote: > > While testing treesit-forward-sexp-list, I discovered that > thing-navigation functions are not restricted to named nodes. > > I wonder if there a reason to find anonymous nodes as things? We should rather ask is there any reason to not find anonymous nodes as things? Even ruby-ts-mode defines a bunch of anonymous nodes as sexp, no? In any case, excluding anonymous nodes from things doesn’t sound right. > > The problem was found with the node "unless" in Ruby: > > unless cond > a += 1 > else > b -= 1 > end > > Here the named node 'unless' has exactly the same name > as the anonymous node with the text "unless": > > (unless "unless" condition: (identifier) I feel like Ruby’s grammar should call the named node something else, like unless_statement. > > Finding anonymous nodes breaks forward-sexp when point is on "unless": > > un-!-less cond > a += 1 > else > b -= 1 > end > > because (treesit-thing-at (point) 'sexp t) finds > > #<treesit-node "unless" in 156-162> > > instead of > > #<treesit-node unless in 156-203> > > Also this breaks backward-sexp and backward-up-list > because treesit--thing-sibling finds > the anonymous node "unless" as a previous sibling > instead of the named node 'unless' as a parent. > > Would the right solution be to check if the found thing > is a named node? 
With something like: > > diff --git a/lisp/treesit.el b/lisp/treesit.el > index 18200acf53f..9ad879ee40c 100644 > --- a/lisp/treesit.el > +++ b/lisp/treesit.el > @@ -2711,6 +2774,7 @@ treesit--thing-sibling > (lambda (n) (>= (treesit-node-start n) pos)))) > (iter-pred (lambda (node) > (and (treesit-node-match-p node thing t) > + (treesit-node-check node 'named) > (funcall pos-pred node)))) > (sibling nil)) > (when cursor > @@ -2760,6 +2824,7 @@ treesit-thing-at > (let* ((cursor (treesit-node-at pos)) > (iter-pred (lambda (node) > (and (treesit-node-match-p node thing t) > + (treesit-node-check node 'named) > (if strict > (< (treesit-node-start node) pos) > (<= (treesit-node-start node) pos)) A better solution IMO is to add some way to distinguish between named and anonymous nodes. I can think of two ways, either add “and” and “named/anonymous” predicate, so (and named “unless”) only matches the named “unless” node; or we add a special syntax such that “(unless)” only matches named nodes, and “\”unless\”” only matches anonymous nodes. Yuan ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-12-19 4:04 ` Yuan Fu @ 2024-12-19 7:14 ` Juri Linkov 2024-12-19 7:18 ` bug#74963: Ambiguous treesit named and anonymous nodes in ruby-ts-mode Juri Linkov 1 sibling, 0 replies; 66+ messages in thread From: Juri Linkov @ 2024-12-19 7:14 UTC (permalink / raw) To: Yuan Fu; +Cc: theo, Mickey Petersen, monnier, 73404 > A better solution IMO is to add some way to distinguish between named and > anonymous nodes. I can think of two ways, either add “and” and > “named/anonymous” predicate, so (and named “unless”) only matches the named > “unless” node; or we add a special syntax such that “(unless)” only matches > named nodes, and “\”unless\”” only matches anonymous nodes. I agree, so will create a new feature request. ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#74963: Ambiguous treesit named and anonymous nodes in ruby-ts-mode 2024-12-19 4:04 ` Yuan Fu 2024-12-19 7:14 ` Juri Linkov @ 2024-12-19 7:18 ` Juri Linkov 2024-12-24 3:02 ` Yuan Fu 1 sibling, 1 reply; 66+ messages in thread From: Juri Linkov @ 2024-12-19 7:18 UTC (permalink / raw) To: 74963; +Cc: Yuan Fu, Dmitry Gutov [This is a separate bug report from bug#73404] >> While testing treesit-forward-sexp-list, I discovered that >> thing-navigation functions are not restricted to named nodes. >> >> I wonder if there a reason to find anonymous nodes as things? > > We should rather ask is there any reason to not find anonymous nodes > as things? Even ruby-ts-mode defines a bunch of anonymous nodes as > sexp, no? In any case, excluding anonymous nodes from things doesn’t > sound right. Indeed, there are many anonymous nodes used in ruby-ts-mode. >> The problem was found with the node "unless" in Ruby: >> >> unless cond >> a += 1 >> else >> b -= 1 >> end >> >> Here the named node 'unless' has exactly the same name >> as the anonymous node with the text "unless": >> >> (unless "unless" condition: (identifier) > > I feel like Ruby’s grammar should call the named node something else, > like unless_statement. Agreed, the problem is that nodes defined in Ruby’s grammar are too ambiguous. There are more such nodes with the same name for named and anonymous: "if", "while", "until", etc. >> Finding anonymous nodes breaks forward-sexp when point is on "unless": >> >> un-!-less cond >> a += 1 >> else >> b -= 1 >> end >> >> because (treesit-thing-at (point) 'sexp t) finds >> >> #<treesit-node "unless" in 156-162> >> >> instead of >> >> #<treesit-node unless in 156-203> >> >> Also this breaks backward-sexp and backward-up-list >> because treesit--thing-sibling finds >> the anonymous node "unless" as a previous sibling >> instead of the named node 'unless' as a parent. >> >> Would the right solution be to check if the found thing >> is a named node? 
With something like: >> >> diff --git a/lisp/treesit.el b/lisp/treesit.el >> index 18200acf53f..9ad879ee40c 100644 >> --- a/lisp/treesit.el >> +++ b/lisp/treesit.el >> @@ -2711,6 +2774,7 @@ treesit--thing-sibling >> (lambda (n) (>= (treesit-node-start n) pos)))) >> (iter-pred (lambda (node) >> (and (treesit-node-match-p node thing t) >> + (treesit-node-check node 'named) >> (funcall pos-pred node)))) >> (sibling nil)) >> (when cursor >> @@ -2760,6 +2824,7 @@ treesit-thing-at >> (let* ((cursor (treesit-node-at pos)) >> (iter-pred (lambda (node) >> (and (treesit-node-match-p node thing t) >> + (treesit-node-check node 'named) >> (if strict >> (< (treesit-node-start node) pos) >> (<= (treesit-node-start node) pos)) > > A better solution IMO is to add some way to distinguish between named and > anonymous nodes. I can think of two ways, either add “and” and > “named/anonymous” predicate, so (and named “unless”) only matches the named > “unless” node; or we add a special syntax such that “(unless)” only matches > named nodes, and “\”unless\”” only matches anonymous nodes. Either predicate or a special syntax is welcome. This would be more handy than writing a lambda with implicit calls of treesit-node-check. ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#74963: Ambiguous treesit named and anonymous nodes in ruby-ts-mode 2024-12-19 7:18 ` bug#74963: Ambiguous treesit named and anonymous nodes in ruby-ts-mode Juri Linkov @ 2024-12-24 3:02 ` Yuan Fu 2024-12-24 7:17 ` Juri Linkov 2024-12-24 17:52 ` Juri Linkov 0 siblings, 2 replies; 66+ messages in thread From: Yuan Fu @ 2024-12-24 3:02 UTC (permalink / raw) To: Juri Linkov; +Cc: Dmitry Gutov, 74963 > On Dec 18, 2024, at 11:18 PM, Juri Linkov <juri@linkov.net> wrote: > > [This is a separate bug report from bug#73404] > >>> While testing treesit-forward-sexp-list, I discovered that >>> thing-navigation functions are not restricted to named nodes. >>> >>> I wonder if there a reason to find anonymous nodes as things? >> >> We should rather ask is there any reason to not find anonymous nodes >> as things? Even ruby-ts-mode defines a bunch of anonymous nodes as >> sexp, no? In any case, excluding anonymous nodes from things doesn’t >> sound right. > > Indeed, there are many anonymous nodes used in ruby-ts-mode. > >>> The problem was found with the node "unless" in Ruby: >>> >>> unless cond >>> a += 1 >>> else >>> b -= 1 >>> end >>> >>> Here the named node 'unless' has exactly the same name >>> as the anonymous node with the text "unless": >>> >>> (unless "unless" condition: (identifier) >> >> I feel like Ruby’s grammar should call the named node something else, >> like unless_statement. > > Agreed, the problem is that nodes defined in Ruby’s grammar > are too ambiguous. There are more such nodes with the same name > for named and anonymous: "if", "while", "until", etc. 
> >>> Finding anonymous nodes breaks forward-sexp when point is on "unless": >>> >>> un-!-less cond >>> a += 1 >>> else >>> b -= 1 >>> end >>> >>> because (treesit-thing-at (point) 'sexp t) finds >>> >>> #<treesit-node "unless" in 156-162> >>> >>> instead of >>> >>> #<treesit-node unless in 156-203> >>> >>> Also this breaks backward-sexp and backward-up-list >>> because treesit--thing-sibling finds >>> the anonymous node "unless" as a previous sibling >>> instead of the named node 'unless' as a parent. >>> >>> Would the right solution be to check if the found thing >>> is a named node? With something like: >>> >>> diff --git a/lisp/treesit.el b/lisp/treesit.el >>> index 18200acf53f..9ad879ee40c 100644 >>> --- a/lisp/treesit.el >>> +++ b/lisp/treesit.el >>> @@ -2711,6 +2774,7 @@ treesit--thing-sibling >>> (lambda (n) (>= (treesit-node-start n) pos)))) >>> (iter-pred (lambda (node) >>> (and (treesit-node-match-p node thing t) >>> + (treesit-node-check node 'named) >>> (funcall pos-pred node)))) >>> (sibling nil)) >>> (when cursor >>> @@ -2760,6 +2824,7 @@ treesit-thing-at >>> (let* ((cursor (treesit-node-at pos)) >>> (iter-pred (lambda (node) >>> (and (treesit-node-match-p node thing t) >>> + (treesit-node-check node 'named) >>> (if strict >>> (< (treesit-node-start node) pos) >>> (<= (treesit-node-start node) pos)) >> >> A better solution IMO is to add some way to distinguish between named and >> anonymous nodes. I can think of two ways, either add “and” and >> “named/anonymous” predicate, so (and named “unless”) only matches the named >> “unless” node; or we add a special syntax such that “(unless)” only matches >> named nodes, and “\”unless\”” only matches anonymous nodes. > > Either predicate or a special syntax is welcome. > > This would be more handy than writing a lambda with implicit calls > of treesit-node-check. 
I’ll go with the (and named “unless”) route because after thinking about it more, “(unless)” will be hard to work with because the string predicate is actually a regexp. Yuan ^ permalink raw reply [flat|nested] 66+ messages in thread
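For readers following along, the predicate form chosen above might be written as below. This is a sketch only: it assumes a tree-sitter build in which thing predicates accept the `and' and `named' forms discussed in this message, and the variable names `my-ruby-sexp-thing' and `my-ruby-thing-settings' are hypothetical. The node list is limited to the ambiguous Ruby names mentioned in the thread.

```elisp
;; Sketch only: `and' and `named' are the predicate forms proposed in
;; this thread; they are not assumed to exist in every Emacs build.
(defvar my-ruby-sexp-thing
  `(and named ,(rx bos (or "unless" "if" "while" "until") eos))
  "Hypothetical `sexp' definition that skips anonymous keyword nodes.")

;; Same shape as the `treesit-thing-settings' values shown in the
;; patches earlier in this thread: ((LANGUAGE (THING PREDICATE) ...)).
(defvar my-ruby-thing-settings
  `((ruby (sexp ,my-ruby-sexp-thing)))
  "Hypothetical value for `treesit-thing-settings'.")
```

The regexp is anchored so that it matches whole node type names, because, as noted above, the string predicate is a regexp and not a literal name.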
* bug#74963: Ambiguous treesit named and anonymous nodes in ruby-ts-mode 2024-12-24 3:02 ` Yuan Fu @ 2024-12-24 7:17 ` Juri Linkov 2024-12-24 7:41 ` Juri Linkov 2024-12-25 3:25 ` Dmitry Gutov 2024-12-24 17:52 ` Juri Linkov 1 sibling, 2 replies; 66+ messages in thread From: Juri Linkov @ 2024-12-24 7:17 UTC (permalink / raw) To: Yuan Fu; +Cc: Dmitry Gutov, 74963 >>> A better solution IMO is to add some way to distinguish between named and >>> anonymous nodes. I can think of two ways, either add “and” and >>> “named/anonymous” predicate, so (and named “unless”) only matches the named >>> “unless” node; or we add a special syntax such that “(unless)” only matches >>> named nodes, and “\”unless\”” only matches anonymous nodes. >> >> Either predicate or a special syntax is welcome. >> >> This would be more handy than writing a lambda with implicit calls >> of treesit-node-check. > > I’ll go with the (and named “unless”) route because after thinking > about it more, “(unless)” will be hard to work with because the string > predicate is actually a regexp. Thanks. While the addition of '(and named "unless")' would be appreciated, I see that currently it's possible to do this by providing a predicate like there is 'ruby-ts--sexp-p' in (setq-local treesit-thing-settings `((ruby (sexp ,(cons (rx bol (or "class" ... ) eol) #'ruby-ts--sexp-p)) Then 'ruby-ts--sexp-p' could check for the named node "unless" as well. But it seems such a solution is less efficient than adding '(and named "unless")'. ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#74963: Ambiguous treesit named and anonymous nodes in ruby-ts-mode 2024-12-24 7:17 ` Juri Linkov @ 2024-12-24 7:41 ` Juri Linkov 2024-12-25 3:25 ` Dmitry Gutov 1 sibling, 0 replies; 66+ messages in thread From: Juri Linkov @ 2024-12-24 7:41 UTC (permalink / raw) To: Yuan Fu; +Cc: Dmitry Gutov, 74963 > (setq-local treesit-thing-settings > `((ruby > (sexp ,(cons (rx > bol > (or > "class" > ... > ) > eol) > #'ruby-ts--sexp-p)) BTW, I just fixed a bug in typescript-ts-mode where "string_fragment" was mismatched by "string", because its regexp-opt matched node names too widely, so it needed to be enclosed in regexp anchors. I see that all ts-modes solve this common problem each in its own way (here 'list' indicates a list of strings that should match node names): c-ts-mode: (regexp-opt list 'symbols) js-ts-mode: (concat "\\_<" (regexp-opt list t) "\\_>") java-ts-mode: (rx (or list)) ruby-ts-mode: (rx bol (or list) eol) Currently there is no uniform way to handle this frequent need. 'concat' like above looks too ugly, but 'regexp-opt' with the 'symbols' arg produces a strange regexp for matching symbols. Maybe it would be better to create a new argument for 'regexp-opt', e.g.: (regexp-opt list 'complete) that will expand to: (concat "^" list "$") ^ permalink raw reply [flat|nested] 66+ messages in thread
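The "string" versus "string_fragment" mismatch described above can be reproduced with plain regexps, no parser needed. The following is a minimal sketch, assuming a hypothetical helper `my-ts-node-type-regexp' (not part of treesit.el or regexp-opt); it anchors with `bos'/`eos' rather than the "^"/"$" suggested above, which should be equivalent for node type names since they contain no newlines.

```elisp
(require 'rx)

;; Hypothetical helper, not part of Emacs.
(defun my-ts-node-type-regexp (types)
  "Return a regexp matching exactly one of the node type names in TYPES."
  (rx-to-string `(seq bos (or ,@types) eos) t))

(let ((loose (regexp-opt '("string" "regex")))
      (exact (my-ts-node-type-regexp '("string" "regex"))))
  (list (string-match-p loose "string_fragment")  ; 0: matches too widely
        (string-match-p exact "string_fragment")  ; nil: rejected
        (string-match-p exact "string")))         ; 0: exact name matches
```

A mode could pass the result of such a helper wherever a regexp predicate is expected in `treesit-thing-settings', which would give the ts-modes listed above one shared way of anchoring.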
* bug#74963: Ambiguous treesit named and anonymous nodes in ruby-ts-mode 2024-12-24 7:17 ` Juri Linkov 2024-12-24 7:41 ` Juri Linkov @ 2024-12-25 3:25 ` Dmitry Gutov 2024-12-25 7:52 ` Juri Linkov 1 sibling, 1 reply; 66+ messages in thread From: Dmitry Gutov @ 2024-12-25 3:25 UTC (permalink / raw) To: Juri Linkov, Yuan Fu; +Cc: 74963 Hi Juri, On 24/12/2024 09:17, Juri Linkov wrote: > While the addition of '(and named "unless")' would be appreciated, > I see that currently it's possible to do this by providing a predicate > like there is 'ruby-ts--sexp-p' in > > (setq-local treesit-thing-settings > `((ruby > (sexp ,(cons (rx > bol > (or > "class" > ... > ) > eol) > #'ruby-ts--sexp-p)) > > Then 'ruby-ts--sexp-p' could check for the named node "unless" as well. > > But it seems such a solution is less efficient than adding '(and named "unless")'. Given that we're already calling a predicate every time (in ruby-ts-mode), we might as well add one more check. See the patch at the end. Speaking of tricky examples though, here's a definition: module Bar class Foo def baz end end end If you move point inside the keyword "module" or "class", C-M-f wouldn't move forward either as of the latest master. No such problem with "def". Adding the check for "named" fixes the first two cases, but then C-M-f inside "def" jumps to after "baz". Could be worked around with a special case, but I wonder what this difference comes from (haven't properly debugged yet).
diff --git a/lisp/progmodes/ruby-ts-mode.el b/lisp/progmodes/ruby-ts-mode.el index 4ef0cb18eae..4b15c6cbf27 100644 --- a/lisp/progmodes/ruby-ts-mode.el +++ b/lisp/progmodes/ruby-ts-mode.el @@ -1120,6 +1120,10 @@ ruby-ts--sexp-p (equal (treesit-node-type (treesit-node-child node 0)) "("))) +(defun ruby-ts--sexp-list-p (node) + (when (treesit-node-check node 'named) + (ruby-ts--sexp-p node))) + (defvar-keymap ruby-ts-mode-map :doc "Keymap used in Ruby mode" :parent prog-mode-map @@ -1235,7 +1239,7 @@ ruby-ts-mode "array" "hash") eol) - #'ruby-ts--sexp-p)) + #'ruby-ts--sexp-list-p)) (text ,(lambda (node) (or (member (treesit-node-type node) '("comment" "string_content" "heredoc_content")) ^ permalink raw reply related [flat|nested] 66+ messages in thread
* bug#74963: Ambiguous treesit named and anonymous nodes in ruby-ts-mode 2024-12-25 3:25 ` Dmitry Gutov @ 2024-12-25 7:52 ` Juri Linkov 2024-12-26 1:00 ` Dmitry Gutov 0 siblings, 1 reply; 66+ messages in thread From: Juri Linkov @ 2024-12-25 7:52 UTC (permalink / raw) To: Dmitry Gutov; +Cc: Yuan Fu, 74963 >> While the addition of '(and named "unless")' would be appreciated, >> I see that currently it's possible to do this by providing a predicate >> like there is 'ruby-ts--sexp-p' in >> >> (setq-local treesit-thing-settings >> `((ruby >> (sexp ,(cons (rx >> bol >> (or >> "class" >> ... >> ) >> eol) >> #'ruby-ts--sexp-p)) >> >> Then 'ruby-ts--sexp-p' could check for the named node "unless" as well. >> >> But it seems such a solution is less efficient than adding '(and named "unless")'. > > Given that we're already calling a predicate every time (in > ruby-ts-mode), we might as well add one more check. See the patch at the > end. Thanks, I tried the patch. It was broken, so I needed to edit it manually. Also the new key 'w' doesn't work in diff buffers, need to fix it as well. > Speaking of tricky examples though, here's a definition: > > module Bar > class Foo > def baz > end > end > end > > If you move point inside the keyword "module" or "class", C-M-f wouldn't > move forward either as of the latest master. No such problem with "def". > > Adding the check for "named" fixes the first two cases, but then C-M-f > inside "def" jumps to after "baz". Could be worked around with a > special case, but I wonder what this difference comes from (haven't > properly debugged yet). I see no problems with your patch. Everything works nicely. ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#74963: Ambiguous treesit named and anonymous nodes in ruby-ts-mode 2024-12-25 7:52 ` Juri Linkov @ 2024-12-26 1:00 ` Dmitry Gutov 2024-12-27 7:42 ` Juri Linkov 0 siblings, 1 reply; 66+ messages in thread From: Dmitry Gutov @ 2024-12-26 1:00 UTC (permalink / raw) To: Juri Linkov; +Cc: Yuan Fu, 74963 On 25/12/2024 09:52, Juri Linkov wrote: >> Given that we're already calling a predicate every time (in >> ruby-ts-mode), we might as well add one more check. See the patch at the >> end. > > Thanks, I tried the patch. It was broken, so I needed to edit it manually. Maybe something regarding whitespace at the end? > Also the new key 'w' doesn't work in diff buffers, need to fix it as well. The binding for 'diff-kill-ring-save'? Seems to work here, as long as the diff buffer is in read-only mode. >> Adding the check for "named" fixes the first two cases, but then C-M-f >> inside "def" jumps to after "baz". Could be worked around with a >> special case, but I wonder what this difference comes from (haven't >> properly debugged yet). > > I see no problems with your patch. Everything works nicely. Hmm, I can't reproduce it either anymore. Thanks for testing, pushed to master now (unfortunately the commit message refers to bug#73404). ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#74963: Ambiguous treesit named and anonymous nodes in ruby-ts-mode 2024-12-26 1:00 ` Dmitry Gutov @ 2024-12-27 7:42 ` Juri Linkov 0 siblings, 0 replies; 66+ messages in thread From: Juri Linkov @ 2024-12-27 7:42 UTC (permalink / raw) To: Dmitry Gutov; +Cc: Yuan Fu, 74963 >>> Given that we're already calling a predicate every time (in >>> ruby-ts-mode), we might as well add one more check. See the patch at the >>> end. >> Thanks, I tried the patch. It was broken, so I needed to edit it manually. > > Maybe something regarding whitespace at the end? Something with whitespace, but not a big problem. >> Also the new key 'w' doesn't work in diff buffers, need to fix it as well. > > The binding for 'diff-kill-ring-save'? Seems to work here, as long as the > diff buffer is in read-only mode. Yes, 'W' with 'diff-kill-ring-save'. Single keys are still a problem in visited diff files. >>> Adding the check for "named" fixes the first two cases, but then C-M-f >>> inside "def" jumps to after "baz". Could be worked around with a >>> special case, but I wonder what this difference comes from (haven't >>> properly debugged yet). >> I see no problems with your patch. Everything works nicely. > > Hmm, I can't reproduce it either anymore. > > Thanks for testing, pushed to master now (unfortunately the commit message > refers to bug#73404). Thanks. Maybe a helper for other ts-modes will be handy: (defun treesit-node-named (node) (treesit-node-check node 'named)) to be used like this (sexp ,(cons (treesit-match-nodes strings) 'treesit-node-named)) ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#74963: Ambiguous treesit named and anonymous nodes in ruby-ts-mode 2024-12-24 3:02 ` Yuan Fu 2024-12-24 7:17 ` Juri Linkov @ 2024-12-24 17:52 ` Juri Linkov 2024-12-24 21:03 ` Yuan Fu 1 sibling, 1 reply; 66+ messages in thread From: Juri Linkov @ 2024-12-24 17:52 UTC (permalink / raw) To: Yuan Fu; +Cc: Dmitry Gutov, 74963 > I’ll go with the (and named “unless”) route because after thinking > about it more, “(unless)” will be hard to work with because the string > predicate is actually a regexp. Is it possible to mark all node names specified in treesit-thing-settings as named? I just discovered a new problem: 1. With typescript-ts-mode on the following snippet: type NodeInfo = | (BaseNode & { subtypes: BaseNode[]; }) | (BaseNode & { fields: { [name: string]: ChildNode }; children: ChildNode[]; }); You can move point inside "string" and type C-M-f or C-M-b. But point doesn't move. This is because treesit-thing-settings defines a named node "string". But anonymous node has the same name "string": (index_signature [ name: (identifier) : index_type: (predefined_type string) and (treesit-node-at (point)) returns #<treesit-node "string" in 111-117> This mismatched "string" in TypeScript is even more unexpected than "unless" in Ruby. So probably we need a way to mark all used nodes as named to avoid such unexpected matches. Maybe matching anonymous nodes should be opt-in, and by default match only named nodes. ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#74963: Ambiguous treesit named and anonymous nodes in ruby-ts-mode 2024-12-24 17:52 ` Juri Linkov @ 2024-12-24 21:03 ` Yuan Fu 2024-12-25 7:49 ` Juri Linkov 0 siblings, 1 reply; 66+ messages in thread From: Yuan Fu @ 2024-12-24 21:03 UTC (permalink / raw) To: Juri Linkov; +Cc: Dmitry Gutov, 74963 > On Dec 24, 2024, at 9:52 AM, Juri Linkov <juri@linkov.net> wrote: > >> I’ll go with the (and named “unless”) route because after thinking >> about it more, “(unless)” will be hard to work with because the string >> predicate is actually a regexp. > > Is it possible to mark all node names specified in treesit-thing-settings > as named? > > I just discovered a new problem: > > 1. With typescript-ts-mode on the following snippet: > > type NodeInfo = > | (BaseNode & { > subtypes: BaseNode[]; > }) > | (BaseNode & { > fields: { [name: string]: ChildNode }; > children: ChildNode[]; > }); > > You can move point inside "string" and type C-M-f or C-M-b. > But point doesn't move. > > This is because treesit-thing-settings defines a named node "string". > But anonymous node has the same name "string": > > (index_signature [ name: (identifier) : > index_type: (predefined_type string) > > and (treesit-node-at (point)) returns > #<treesit-node "string" in 111-117> > > This mismatched "string" in TypeScript is even more > unexpected than "unless" in Ruby. > > So probably we need a way to mark all used nodes as named > to avoid such unexpected matches. Maybe matching anonymous nodes > should be opt-in, and by default match only named nodes. IMHO this is just an unfortunate bug that needs to be fixed. I agree that this type of bug are hard to avoid, which is a bad thing, but that doesn’t mean we should try to alleviate it at any cost. Making predicates named by default just adds complexity and inflexibility for not much benefit. Yuan ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#74963: Ambiguous treesit named and anonymous nodes in ruby-ts-mode 2024-12-24 21:03 ` Yuan Fu @ 2024-12-25 7:49 ` Juri Linkov 2024-12-25 9:11 ` Yuan Fu 0 siblings, 1 reply; 66+ messages in thread From: Juri Linkov @ 2024-12-25 7:49 UTC (permalink / raw) To: Yuan Fu; +Cc: Dmitry Gutov, 74963 >> This mismatched "string" in TypeScript is even more >> unexpected than "unless" in Ruby. >> >> So probably we need a way to mark all used nodes as named >> to avoid such unexpected matches. Maybe matching anonymous nodes >> should be opt-in, and by default match only named nodes. > > IMHO this is just an unfortunate bug that needs to be fixed. I agree that > this type of bug are hard to avoid, which is a bad thing, but that doesn’t > mean we should try to alleviate it at any cost. Making predicates named by > default just adds complexity and inflexibility for not much benefit. Not sure if a possible flexibility is better than unintended matches. When the authors of a ts-mode carefully selected a list of named nodes to match, why treesit should try to match some random and unintended anonymous nodes? ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#74963: Ambiguous treesit named and anonymous nodes in ruby-ts-mode 2024-12-25 7:49 ` Juri Linkov @ 2024-12-25 9:11 ` Yuan Fu 2024-12-25 17:39 ` Juri Linkov 0 siblings, 1 reply; 66+ messages in thread From: Yuan Fu @ 2024-12-25 9:11 UTC (permalink / raw) To: Juri Linkov; +Cc: Dmitry Gutov, 74963 > On Dec 24, 2024, at 11:49 PM, Juri Linkov <juri@linkov.net> wrote: > >>> This mismatched "string" in TypeScript is even more >>> unexpected than "unless" in Ruby. >>> >>> So probably we need a way to mark all used nodes as named >>> to avoid such unexpected matches. Maybe matching anonymous nodes >>> should be opt-in, and by default match only named nodes. >> >> IMHO this is just an unfortunate bug that needs to be fixed. I agree that >> this type of bug are hard to avoid, which is a bad thing, but that doesn’t >> mean we should try to alleviate it at any cost. Making predicates named by >> default just adds complexity and inflexibility for not much benefit. > > Not sure if a possible flexibility is better than unintended matches. > > When the authors of a ts-mode carefully selected a list of named nodes to match, > why treesit should try to match some random and unintended anonymous nodes? I don’t know and can’t prove how much the flexibility is worth, but the cost on complexity is real. If everywhere else uses thing predicates as-is, but sexp navigation auto-converts thing predicates into named predicate, that’s a cognitive burden and a special case that’s guaranteed to trip people over. OTOH, what’s the downside of wrapping the sexp predicate with (and named …), if you only want named nodes to match? I just think the cost outweighs the benefit, if there is any to begin with. Yuan ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#74963: Ambiguous treesit named and anonymous nodes in ruby-ts-mode
  2024-12-25 9:11 ` Yuan Fu
@ 2024-12-25 17:39 ` Juri Linkov
  0 siblings, 0 replies; 66+ messages in thread
From: Juri Linkov @ 2024-12-25 17:39 UTC (permalink / raw)
  To: Yuan Fu; +Cc: Dmitry Gutov, 74963

>> Not sure if a possible flexibility is better than unintended matches.
>>
>> When the authors of a ts-mode carefully selected a list of named nodes to match,
>> why treesit should try to match some random and unintended anonymous nodes?
>
> I don’t know and can’t prove how much the flexibility is worth, but the
> cost on complexity is real. If everywhere else uses thing predicates as-is,
> but sexp navigation auto-converts thing predicates into named predicate,
> that’s a cognitive burden and a special case that’s guaranteed to trip
> people over.
>
> OTOH, what’s the downside of wrapping the sexp predicate with (and named …),
> if you only want named nodes to match?
>
> I just think the cost outweighs the benefit, if there is any to begin with.

Actually, what I had in mind is not to enable named-only mode by default,
but only to allow the authors of ts-modes to specify this condition.

For example, if it is possible to write

  (setq-local treesit-thing-settings
              `((typescript
                 (sexp (and named
                            ,(regexp-opt typescript-ts-mode--sexp-nodes
                                         'symbols))))))

this should be fine. This is similar to how the authors of ts-modes
decide whether to restrict matches to exact names by using "^...$"
with regexp-opt.

BTW, I'm thinking about adding a simple helper like this:

  (defun treesit-regexp-opt (strings)
    (concat "^" (regexp-opt strings) "$"))

to use like this:

  (setq-local treesit-thing-settings
              `((typescript
                 (sexp (and named
                            ,(treesit-regexp-opt
                              typescript-ts-mode--sexp-nodes))))))

^ permalink raw reply [flat|nested] 66+ messages in thread
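Editorial illustration, not part of the thread's patches: a minimal sketch of why the "^...$" anchoring mentioned above matters when a thing predicate is a regexp. The helper name `example-anchored-regexp-opt` is hypothetical and simply mirrors the helper proposed in the message; "template_string" is used only as an example of a longer node type that contains "string".

```elisp
;; Hypothetical helper name, for illustration only.
(defun example-anchored-regexp-opt (strings)
  "Return a regexp that matches exactly one of STRINGS and nothing longer."
  (concat "^" (regexp-opt strings) "$"))

;; Unanchored: "string" is also found inside a longer node type,
;; so this returns non-nil.
(string-match-p (regexp-opt '("string")) "template_string")

;; Anchored: the longer node type no longer matches, so this returns nil.
(string-match-p (example-anchored-regexp-opt '("string")) "template_string")

;; Anchored: the exact node type still matches.
(string-match-p (example-anchored-regexp-opt '("string")) "string")
```

Anchoring only deals with partial-name matches. It does not distinguish a named node "string" from an anonymous node of the same type, which is what the `named` condition discussed above is for.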
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-12-11 6:31 ` Yuan Fu 2024-12-11 15:12 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-12-19 7:34 ` Juri Linkov 2024-12-24 19:05 ` Juri Linkov 2024-12-30 7:15 ` Juri Linkov 1 sibling, 2 replies; 66+ messages in thread From: Juri Linkov @ 2024-12-19 7:34 UTC (permalink / raw) To: Yuan Fu Cc: Theodor Thornhill, Eli Zaretskii, Mickey Petersen, 73404, Stefan Monnier >> What should remain in 'treesit-thing-settings' are only grouping >> constructs such as "parenthesized_expression" and "statement_block". > > Ah, this matches my idea of defining sexp in other languages as “repeatable > construct/list-like construct”. We went with “every syntactic construct” at > the time, which I didn’t object to, but I’m definitely happier with the > repeatable construct approach. Including Stefan and Theo since they were > part of the original sexp navigation discussion. > > My only concern is that would the result be a bit unpredictable/confusing > when we mix the result of two logic together in such an involved way? We > can push to master and try it out for a while. I use tree-sitter sexp > navigation for work every day, albeit strictly for navigating list-like > constructs—I use forward/backward-word for smaller navigation. > >> tries to go out of the current thing to its parent, >> thus breaking the main principle that 'forward-sexp' >> should move forward across siblings only. But removing >> this line fixed the problem: > > Thanks, LGTM. Ok, so now pushed to master in such backwards-compatible way that when a ts-mode doesn't define the 'sexp-list' thing, then the existing 'sexp' is used. Also added 'sexp-list' to c-ts-mode, js-ts-mode, ruby-ts-mode and html-ts-mode. Addition of 'sexp-list' to other ts-modes is underway. ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-12-19 7:34 ` bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes Juri Linkov @ 2024-12-24 19:05 ` Juri Linkov 2024-12-24 21:14 ` Yuan Fu 2024-12-25 17:19 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-12-30 7:15 ` Juri Linkov 1 sibling, 2 replies; 66+ messages in thread From: Juri Linkov @ 2024-12-24 19:05 UTC (permalink / raw) To: Yuan Fu Cc: Mickey Petersen, Eli Zaretskii, Theodor Thornhill, 73404, Stefan Monnier > Ok, so now pushed to master in such backwards-compatible way > that when a ts-mode doesn't define the 'sexp-list' thing, > then the existing 'sexp' is used. Also I'm looking into allowing more list-navigation commands to be usable in ts-modes. E.g. instead of limiting list-navigation only to the current 'forward-sexp', another useful command is 'down-list'. Currently to get inside an HTML element in html-ts-mode is possible by using 'M-e' (forward-sentence) to skip HTML tag. This is not quite obvious. More natural would be to support 'C-M-d' (down-list). But this requires overriding more low-level 'scan-lists' and 'scan-sexps' in treesit, instead of overriding the current top-level 'forward-sexp-function'. Also treesit-thing-settings for 'down-list' would require pairs of node names: one node to define a list, and another node to skip a node to get inside the list. For html-ts-mode a list node is "element", and the node to skip is "tag". So for example: -!-text <html></html> 'C-M-f' will use node "element" to move to the beginning of "element": text -!-<html></html> then to get inside also need to skip the <html> tag: text <html>-!-</html> ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-12-24 19:05 ` Juri Linkov @ 2024-12-24 21:14 ` Yuan Fu 2024-12-25 7:44 ` Juri Linkov 2024-12-25 17:19 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 1 sibling, 1 reply; 66+ messages in thread From: Yuan Fu @ 2024-12-24 21:14 UTC (permalink / raw) To: Juri Linkov Cc: Mickey Petersen, Eli Zaretskii, Theodor Thornhill, 73404, Stefan Monnier > On Dec 24, 2024, at 11:05 AM, Juri Linkov <juri@linkov.net> wrote: > >> Ok, so now pushed to master in such backwards-compatible way >> that when a ts-mode doesn't define the 'sexp-list' thing, >> then the existing 'sexp' is used. > > Also I'm looking into allowing more list-navigation commands > to be usable in ts-modes. E.g. instead of limiting list-navigation > only to the current 'forward-sexp', another useful command is > 'down-list'. > > Currently to get inside an HTML element in html-ts-mode > is possible by using 'M-e' (forward-sentence) to skip HTML tag. > This is not quite obvious. > > More natural would be to support 'C-M-d' (down-list). > > But this requires overriding more low-level 'scan-lists' and > 'scan-sexps' in treesit, instead of overriding the current top-level > 'forward-sexp-function'. > > Also treesit-thing-settings for 'down-list' would require > pairs of node names: one node to define a list, and another node > to skip a node to get inside the list. > > For html-ts-mode a list node is "element", and the node to skip > is "tag". So for example: > > -!-text <html></html> > > 'C-M-f' will use node "element" to move to the beginning of "element": > > text -!-<html></html> > > then to get inside also need to skip the <html> tag: > > text <html>-!-</html> Right, the thing navigation only supports going up/outside, but not going down/inside. We can add a new thing for beginning and end of balanced pairs. 
Then down-list would go from the start of a balanced-pair-open to the end of it, and up-list would go from the start of a balanced-pair-close to the end of it.

We might also want a way to jump from pair-open to pair-end. Going to pair-open’s parent’s end will almost always be correct. (The exceptions are grammars that do weird stuff, like tree-sitter-c’s for statement: the condition is not a node by itself, but is split into an opening paren, initializer, loop condition, increment, and closing paren, all of which are direct children of the for_statement rather than being grouped into a condition node.)

Yuan

^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-12-24 21:14 ` Yuan Fu @ 2024-12-25 7:44 ` Juri Linkov 2024-12-25 8:34 ` Yuan Fu 0 siblings, 1 reply; 66+ messages in thread From: Juri Linkov @ 2024-12-25 7:44 UTC (permalink / raw) To: Yuan Fu Cc: Mickey Petersen, Eli Zaretskii, Theodor Thornhill, 73404, Stefan Monnier >> For html-ts-mode a list node is "element", and the node to skip >> is "tag". So for example: >> >> -!-text <html></html> >> >> 'C-M-f' will use node "element" to move to the beginning of "element": >> >> text -!-<html></html> >> >> then to get inside also need to skip the <html> tag: >> >> text <html>-!-</html> > > Right, the thing navigation only supports going up/outside, but not going > down/inside. We can add a new thing for beginning and end of balanced > pairs. Then down-list will be going from the start of a balanced-pair-open > to the end of it. Up-list will be going from the start of > a balanced-pair-close to the end of it. Probably we could use just such heuristics that 'down-list' should skip the first node of the balanced pair. This should work for most ts-modes. For example, for 'jsx_element' the first child to skip is 'jsx_opening_element'. For 'argument_list' the first child to skip is an anonymous node "(". For 'statement_block' the first child to skip is an anonymous node "{". ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes
  2024-12-25 7:44 ` Juri Linkov
@ 2024-12-25 8:34 ` Yuan Fu
  2024-12-25 17:36 ` Juri Linkov
  0 siblings, 1 reply; 66+ messages in thread
From: Yuan Fu @ 2024-12-25 8:34 UTC (permalink / raw)
  To: Juri Linkov
  Cc: Mickey Petersen, Eli Zaretskii, Theodor Thornhill, 73404, Stefan Monnier

> On Dec 24, 2024, at 11:44 PM, Juri Linkov <juri@linkov.net> wrote:
>
>>> For html-ts-mode a list node is "element", and the node to skip
>>> is "tag". So for example:
>>>
>>> -!-text <html></html>
>>>
>>> 'C-M-f' will use node "element" to move to the beginning of "element":
>>>
>>> text -!-<html></html>
>>>
>>> then to get inside also need to skip the <html> tag:
>>>
>>> text <html>-!-</html>
>>
>> Right, the thing navigation only supports going up/outside, but not going
>> down/inside. We can add a new thing for beginning and end of balanced
>> pairs. Then down-list will be going from the start of a balanced-pair-open
>> to the end of it. Up-list will be going from the start of
>> a balanced-pair-close to the end of it.
>
> Probably we could use just such heuristics that 'down-list' should skip
> the first node of the balanced pair. This should work for most ts-modes.
>
> For example, for 'jsx_element' the first child to skip is 'jsx_opening_element'.
> For 'argument_list' the first child to skip is an anonymous node "(".
> For 'statement_block' the first child to skip is an anonymous node "{".

Yes, that should work for the vast majority of grammars. I can’t think of a counterexample other than for_statement in tree-sitter-c.

Yuan

^ permalink raw reply [flat|nested] 66+ messages in thread
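Editorial illustration, not code from the thread: a rough sketch of the "skip the first child" heuristic for down-list described above. It assumes Emacs 30 or later (for `treesit-thing-next`), a buffer with a tree-sitter parser, and a major mode that defines a list-like thing in `treesit-thing-settings`. The thing name `sexp-list` is the one used at this point in the thread and may differ in the final code; the command name `example-treesit-down-list` is hypothetical.

```elisp
(require 'treesit)

(defun example-treesit-down-list ()
  "Move point just past the first child of the next list-like node.
Sketch only: assumes the current major mode defines the thing
`sexp-list' in `treesit-thing-settings'."
  (interactive)
  (let* (;; Next list-like node starting at or after point, if any.
         (list-node (treesit-thing-next (point) 'sexp-list))
         ;; Its first child is the opener: a start tag, "(", "{", etc.
         (opener (and list-node (treesit-node-child list-node 0))))
    (if opener
        (goto-char (treesit-node-end opener))
      (user-error "No list-like node to descend into"))))
```

With point before `<html></html>` in html-ts-mode, the first child of the "element" node would be its start tag, so point should land right after `<html>`. The for_statement case in tree-sitter-c is exactly where this heuristic breaks down, because the parentheses there are direct children of the statement.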
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes
  2024-12-25 8:34 ` Yuan Fu
@ 2024-12-25 17:36 ` Juri Linkov
  2024-12-27 7:59 ` Juri Linkov
  0 siblings, 1 reply; 66+ messages in thread
From: Juri Linkov @ 2024-12-25 17:36 UTC (permalink / raw)
  To: Yuan Fu
  Cc: Mickey Petersen, Eli Zaretskii, Theodor Thornhill, 73404, Stefan Monnier

>> Probably we could use just such heuristics that 'down-list' should skip
>> the first node of the balanced pair. This should work for most ts-modes.
>>
>> For example, for 'jsx_element' the first child to skip is 'jsx_opening_element'.
>> For 'argument_list' the first child to skip is an anonymous node "(".
>> For 'statement_block' the first child to skip is an anonymous node "{".
>
> Yes, that should work for the vast majority of grammars. I can’t think
> of a counterexample other than for_statement in tree-sitter-c.

Indeed, the 'for_statement' is a hard problem, and I see no good solution.
I have already encountered the same problem that you described in
different languages:

> We might also want a way to jump from pair-open to pair-end. Going to
> pair-open’s parent’s end will be almost always correct. (Except for the
> grammars that do weird stuff, like tree-sitter-c’s for statement, the
> condition is not a node by itself, but split into opening parentheses,
> initializer, loop condition, increment, closing paren, all of which direct
> child of the for_statement, not grouped into a condition node.)

Maybe it is indeed worth trying to define pair-open and pair-end as
particular child nodes of the 'for_statement', i.e. only the opening
paren and the closing paren. Something like (completely untested):

  (and (member (treesit-node-text node) '("(" ")"))
       (equal (treesit-node-type (treesit-node-parent node))
              "for_statement"))

^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes
  2024-12-25 17:36 ` Juri Linkov
@ 2024-12-27 7:59 ` Juri Linkov
  0 siblings, 0 replies; 66+ messages in thread
From: Juri Linkov @ 2024-12-27 7:59 UTC (permalink / raw)
  To: Yuan Fu
  Cc: Theodor Thornhill, Eli Zaretskii, Mickey Petersen, 73404, Stefan Monnier

>> I can’t think of a counterexample other than for_statement in
>> tree-sitter-c.
>
> Indeed, the 'for_statement' is a hard problem, and I see
> no good solution.

A possible solution would be to find a way to create "virtual" nodes.
This means transforming the existing syntax tree by inserting
additional nodes into it. For example, transforming

  (for_statement "for" "("
   condition: (assignment_expression left: (identifier) operator: "=" right: (number_literal))
   ;
   body: (binary_expression left: (identifier) operator: "<" right: (number_literal))
   ;
   (update_expression operator: "++" argument: (identifier))
   ")"

into

  (for_statement "for"
   (for_parameters "("
    condition: (assignment_expression left: (identifier) operator: "=" right: (number_literal))
    ;
    body: (binary_expression left: (identifier) operator: "<" right: (number_literal))
    ;
    (update_expression operator: "++" argument: (identifier))
    ")"

by inserting the virtual node "for_parameters". Not sure if such
a transformation is supported by tree-sitter.

Generally, we have two problems with syntax trees created by
the authors of tree-sitter grammars:

1. Insufficient structure
2. Excessive structure

For insufficient structure a possible solution would be
to insert virtual nodes like above.

And for excessive structure we need to flatten some nodes.
For example, in c-ts-mode:

  static -!-struct atimer *free_atimers;

'C-M-f' unexpectedly jumped not to the next symbol, but to

  static struct atimer-!- *free_atimers;

because of too much structure built in the syntax tree:

  (declaration
   type: (storage_class_specifier "static")
   declarator: (struct_specifier "struct" name: (type_identifier))
   (pointer_declarator "*" declarator: (identifier))
   ";")

So the solution could be to flatten 'struct_specifier' to move
'type_identifier' to be a sibling in the same list inside 'declaration':

  (declaration
   type: (storage_class_specifier "static")
   declarator: (struct_specifier "struct")
   (type_identifier)
   (pointer_declarator "*" declarator: (identifier))
   ";")

Also not sure if such a thing is possible in tree-sitter.

^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-12-24 19:05 ` Juri Linkov 2024-12-24 21:14 ` Yuan Fu @ 2024-12-25 17:19 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-12-25 18:01 ` Juri Linkov 1 sibling, 1 reply; 66+ messages in thread From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-12-25 17:19 UTC (permalink / raw) To: Juri Linkov Cc: Mickey Petersen, Yuan Fu, Theodor Thornhill, Eli Zaretskii, 73404 > Also I'm looking into allowing more list-navigation commands > to be usable in ts-modes. E.g. instead of limiting list-navigation > only to the current 'forward-sexp', another useful command is > 'down-list'. Gentle reminder that `forward-sexp` is not a "list-navigation" function. That would be `forward-list`. We very often use sexp commands and functions to manipulate non-lists such as identifiers. Stefan ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-12-25 17:19 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-12-25 18:01 ` Juri Linkov 2024-12-25 19:29 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 0 siblings, 1 reply; 66+ messages in thread From: Juri Linkov @ 2024-12-25 18:01 UTC (permalink / raw) To: Stefan Monnier Cc: Mickey Petersen, Yuan Fu, Theodor Thornhill, Eli Zaretskii, 73404 >> Also I'm looking into allowing more list-navigation commands >> to be usable in ts-modes. E.g. instead of limiting list-navigation >> only to the current 'forward-sexp', another useful command is >> 'down-list'. > > Gentle reminder that `forward-sexp` is not a "list-navigation" function. > That would be `forward-list`. We very often use sexp commands and > functions to manipulate non-lists such as identifiers. Do you think it would be better to override low-level functions 'scan-lists' and 'scan-sexps' with new variables like 'scan-lists-function' and 'scan-sexps-function', instead of adding more variables for overriding top-level commands such as a new variable 'forward-list-function' and 'down-list-function', like the existing 'forward-sexp-function'? ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-12-25 18:01 ` Juri Linkov @ 2024-12-25 19:29 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-12-27 7:54 ` Juri Linkov 0 siblings, 1 reply; 66+ messages in thread From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-12-25 19:29 UTC (permalink / raw) To: Juri Linkov Cc: Mickey Petersen, Yuan Fu, Theodor Thornhill, Eli Zaretskii, 73404 >> Gentle reminder that `forward-sexp` is not a "list-navigation" function. >> That would be `forward-list`. We very often use sexp commands and >> functions to manipulate non-lists such as identifiers. > Do you think it would be better to override low-level functions 'scan-lists' > and 'scan-sexps' with new variables like 'scan-lists-function' > and 'scan-sexps-function', instead of adding more variables for > overriding top-level commands such as a new variable 'forward-list-function' > and 'down-list-function', like the existing 'forward-sexp-function'? Don't know. What I do know is that in general we'd also want an `up-sexp` operation. Currently we have an ugly kludge in `up-list` to try and use `forward-sexp-function` (which is ugly both because `forward-sexp-function` doesn't really provide the functionality we need, and because it mixes up sexp and list navigation), and it would be good to clean it up. AFAIK `up-list` is the only place where I've needed sexp-based navigation and where `forward-sexp-function` didn't do the job. In theory I guess `down-list` is another, but I've never found a use for it. Stefan ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes 2024-12-25 19:29 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-12-27 7:54 ` Juri Linkov 2024-12-29 17:58 ` Juri Linkov 0 siblings, 1 reply; 66+ messages in thread From: Juri Linkov @ 2024-12-27 7:54 UTC (permalink / raw) To: Stefan Monnier Cc: Mickey Petersen, Yuan Fu, Theodor Thornhill, Eli Zaretskii, 73404 [-- Attachment #1: Type: text/plain, Size: 2710 bytes --] >>> Gentle reminder that `forward-sexp` is not a "list-navigation" function. >>> That would be `forward-list`. We very often use sexp commands and >>> functions to manipulate non-lists such as identifiers. >> Do you think it would be better to override low-level functions 'scan-lists' >> and 'scan-sexps' with new variables like 'scan-lists-function' >> and 'scan-sexps-function', instead of adding more variables for >> overriding top-level commands such as a new variable 'forward-list-function' >> and 'down-list-function', like the existing 'forward-sexp-function'? > > Don't know. I tried to override `scan-lists` and `scan-sexps` with advices (a proof of concept attached below), and it works nicely. For example, `C-M-d` moves down to the HTML element in html-ts-mode. But then I realized there are not too many places where such overriding might be useful. In fact, there is only 1 place in `show-paren--default` that uses `scan-sexps`. But even this occurrence can't be used, so I created `treesit-show-paren-data` for `show-paren-data-function` in bug#75122. And overriding `scan-lists` is useful only in `forward-list`, `down-list` and `up-list`. That's all. So clearly instead of overriding `scan-lists` and `scan-sexps`, better would be to add 3 new variables: `forward-list-function`, `down-list-function` and `up-list-function`. 
This gives the users more flexibility to choose for example navigation for C-M-f with a limited number of treesit lists + syntax symbols, and for C-M-n everything that is a list in treesit. This means that with start_atimer (-!-enum atimer_type type, struct timespec timestamp) C-M-f could skip only the next symbol and move to start_atimer (enum-!- atimer_type type, struct timespec timestamp) while C-M-n could skip the whole next `parameter_declaration` start_atimer (enum atimer_type type-!-, struct timespec timestamp) or vice versa depending on the values of all new options `...-function` in ts-modes. > What I do know is that in general we'd also want an `up-sexp` operation. > Currently we have an ugly kludge in `up-list` to try and use > `forward-sexp-function` (which is ugly both because > `forward-sexp-function` doesn't really provide the functionality we > need, and because it mixes up sexp and list navigation), and it would be > good to clean it up. Agreed, this distinction is required for treesit. Hopefully, this can be achieved by separating `forward-sexp-function` and `up-list-function` that in ts-modes could be set to new functions either `treesit-up-list` or `treesit-up-sexp`. PS: will try to refactor everything in this attachment to new `forward-list-function`, `down-list-function` and `up-list-function`: [-- Attachment #2: scan-lists-advice.el --] [-- Type: application/emacs-lisp, Size: 1594 bytes --] ^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes
  2024-12-27 7:54 ` Juri Linkov
@ 2024-12-29 17:58 ` Juri Linkov
  0 siblings, 0 replies; 66+ messages in thread
From: Juri Linkov @ 2024-12-29 17:58 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: Theodor Thornhill, Yuan Fu, Mickey Petersen, Eli Zaretskii, 73404

> So clearly instead of overriding `scan-lists` and `scan-sexps`,
> better would be to add 3 new variables: `forward-list-function`,
> `down-list-function` and `up-list-function`.

Ok, now these 3 functions are added in lisp.el and overridden in treesit.el.

>> What I do know is that in general we'd also want an `up-sexp` operation.
>> Currently we have an ugly kludge in `up-list` to try and use
>> `forward-sexp-function` (which is ugly both because
>> `forward-sexp-function` doesn't really provide the functionality we
>> need, and because it mixes up sexp and list navigation), and it would be
>> good to clean it up.
>
> Agreed, this distinction is required for treesit. Hopefully,
> this can be achieved by separating `forward-sexp-function`
> and `up-list-function` that in ts-modes could be set to new
> functions either `treesit-up-list` or `treesit-up-sexp`.

Just as `treesit-forward-sexp-list` uses `forward-sexp-default-function`
to move between symbols inside lists, I'm going to add a syntax-based
fallback in `treesit-down-list` and `treesit-up-list` as well.
This is useful in strings and comments, and it is what `up-list`
already does.

Unfortunately, `down-list` currently has this limitation:

  (when (ppss-comment-or-string-start (syntax-ppss))
    (user-error "This command doesn't work in strings or comments"))

But this is not a problem for `treesit-down-list`.

^ permalink raw reply [flat|nested] 66+ messages in thread
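Editorial illustration, not code from the thread: a hedged sketch of how a mode might point the three new variables at tree-sitter implementations. The variable names and `treesit-down-list` / `treesit-up-list` come from the messages above; `treesit-forward-list` is an assumed name for the third function, and `example-enable-treesit-list-navigation` is hypothetical. Since the message says the variables are already overridden in treesit.el, a real ts-mode may not need any of this.

```elisp
(defun example-enable-treesit-list-navigation ()
  "Set the list-navigation function variables buffer-locally.
Sketch only: a variable is set only when both the variable and
the corresponding function exist in the running Emacs."
  (dolist (pair '((forward-list-function . treesit-forward-list) ; assumed name
                  (down-list-function    . treesit-down-list)
                  (up-list-function      . treesit-up-list)))
    (when (and (boundp (car pair)) (fboundp (cdr pair)))
      (set (make-local-variable (car pair)) (cdr pair)))))
```

The guards make the function a no-op on an Emacs that lacks these variables or functions, so it could be called from a mode hook without version checks.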
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes
  2024-12-19 7:34 ` bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes Juri Linkov
  2024-12-24 19:05 ` Juri Linkov
@ 2024-12-30 7:15 ` Juri Linkov
  2024-12-30 8:00 ` Yuan Fu
  ` (2 more replies)
  1 sibling, 3 replies; 66+ messages in thread
From: Juri Linkov @ 2024-12-30 7:15 UTC (permalink / raw)
  To: Yuan Fu
  Cc: Mickey Petersen, Eli Zaretskii, Theodor Thornhill, 73404, Stefan Monnier

> Also added 'sexp-list' to c-ts-mode, js-ts-mode, ruby-ts-mode and
> html-ts-mode. Addition of 'sexp-list' to other ts-modes is underway.

BTW, my initial intention was to add the thing 'list'.
But then I discovered that (treesit-thing-next (point) 'list)
uses the function 'list' instead of the thing 'list'.

However, the current 'sexp-list' as a two-word composite is too ugly.
Now I found a better replacement: 'group'. This word is already used
in lisp.el such as in the docstring of 'forward-list':

  "Move forward across one balanced group of parentheses."

And the error messages: "No next group".

So exactly the same message will be used by

  (format-message "No next %S" pred)

where 'pred' will be 'group'. IMHO, this looks better:

  (setq-local treesit-thing-settings
              `((html
                 (sexp ,(regexp-opt '("element" "text" "attribute" "value")))
                 (group ,(regexp-opt '("element")))
                 (sentence "tag")
                 (text ,(regexp-opt '("comment" "text"))))))

^ permalink raw reply [flat|nested] 66+ messages in thread
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes
  2024-12-30 7:15 ` Juri Linkov
@ 2024-12-30 8:00 ` Yuan Fu
  2024-12-30 13:40 ` Eli Zaretskii
  2024-12-30 16:30 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2 siblings, 0 replies; 66+ messages in thread
From: Yuan Fu @ 2024-12-30 8:00 UTC (permalink / raw)
To: Juri Linkov
Cc: Mickey Petersen, Eli Zaretskii, Theodor Thornhill, 73404, Stefan Monnier

> On Dec 29, 2024, at 11:15 PM, Juri Linkov <juri@linkov.net> wrote:
>
>> Also added 'sexp-list' to c-ts-mode, js-ts-mode, ruby-ts-mode and
>> html-ts-mode. Addition of 'sexp-list' to other ts-modes is underway.
>
> BTW, my initial intention was to add the thing 'list'.
> But then I discovered that (treesit-thing-next (point) 'list)
> uses the function 'list' instead of the thing 'list'.
>
> However, the current 'sexp-list' as a two-word composite is too ugly.
> Now I found a better replacement: 'group'. This word is already used
> in lisp.el such as in the docstring of 'forward-list':
>
> "Move forward across one balanced group of parentheses."
>
> And the error messages: "No next group".
>
> So exactly the same message will be used by
>
> (format-message "No next %S" pred)
>
> where 'pred' will be 'group'. IMHO, this looks better:
>
> (setq-local treesit-thing-settings
> `((html
> (sexp ,(regexp-opt '("element" "text" "attribute" "value")))
> (group ,(regexp-opt '("element")))
> (sentence "tag")
> (text ,(regexp-opt '("comment" "text"))))))

I don’t have an opinion on this. "Group" sounds a bit abstract, but so does "sexp-list". So as long as we do a good job explaining the concept, it should be fine.

Yuan
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes
  2024-12-30 7:15 ` Juri Linkov
  2024-12-30 8:00 ` Yuan Fu
@ 2024-12-30 13:40 ` Eli Zaretskii
  2024-12-30 18:54 ` Juri Linkov
  2024-12-30 16:30 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2 siblings, 1 reply; 66+ messages in thread
From: Eli Zaretskii @ 2024-12-30 13:40 UTC (permalink / raw)
To: Juri Linkov; +Cc: mickey, casouri, theo, 73404, monnier

> From: Juri Linkov <juri@linkov.net>
> Cc: Theodor Thornhill <theo@thornhill.no>, Eli Zaretskii <eliz@gnu.org>,
> Mickey Petersen <mickey@masteringemacs.org>, 73404@debbugs.gnu.org,
> Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Mon, 30 Dec 2024 09:15:42 +0200
>
> BTW, my initial intention was to add the thing 'list'.
> But then I discovered that (treesit-thing-next (point) 'list)
> uses the function 'list' instead of the thing 'list'.
>
> However, the current 'sexp-list' as a two-word composite is too ugly.
> Now I found a better replacement: 'group'. This word is already used
> in lisp.el such as in the docstring of 'forward-list':
>
> "Move forward across one balanced group of parentheses."
>
> And the error messages: "No next group".

"Group" is too abstract and too removed from the actual thing. I'm quite sure we can do better.

What are the possible kinds of "groups", which are not balanced parenthetical expressions? Can you show a more-or-less exhaustive list of them? With that, we could try looking for a proper terminology.

Thanks.
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes
  2024-12-30 13:40 ` Eli Zaretskii
@ 2024-12-30 18:54 ` Juri Linkov
  2024-12-30 19:36 ` Eli Zaretskii
  0 siblings, 1 reply; 66+ messages in thread
From: Juri Linkov @ 2024-12-30 18:54 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: mickey, casouri, theo, 73404, monnier

>> BTW, my initial intention was to add the thing 'list'.
>> But then I discovered that (treesit-thing-next (point) 'list)
>> uses the function 'list' instead of the thing 'list'.
>>
>> However, the current 'sexp-list' as a two-word composite is too ugly.
>> Now I found a better replacement: 'group'. This word is already used
>> in lisp.el such as in the docstring of 'forward-list':
>>
>> "Move forward across one balanced group of parentheses."
>>
>> And the error messages: "No next group".
>
> "Group" is too abstract and too removed from the actual thing. I'm
> quite sure we can do better.

The proper name is "list". Since it can't be used directly, the second-best variant is "group".

> What are the possible kinds of "groups", which are not balanced
> parenthetical expressions?

All groups should be balanced. Even in languages that don't use parens and brackets, constructs such as "def" and "end" are balanced, although they are not parenthetical expressions.

> Can you show a more-or-less exhaustive list of them?

An exhaustive list is not yet finished. But basically, most groups are parenthetical expressions, while some don't use parens.
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes
  2024-12-30 18:54 ` Juri Linkov
@ 2024-12-30 19:36 ` Eli Zaretskii
  0 siblings, 0 replies; 66+ messages in thread
From: Eli Zaretskii @ 2024-12-30 19:36 UTC (permalink / raw)
To: Juri Linkov; +Cc: mickey, casouri, theo, 73404, monnier

> From: Juri Linkov <juri@linkov.net>
> Cc: casouri@gmail.com, theo@thornhill.no, mickey@masteringemacs.org,
> 73404@debbugs.gnu.org, monnier@iro.umontreal.ca
> Date: Mon, 30 Dec 2024 20:54:39 +0200
>
> > "Group" is too abstract and too removed from the actual thing. I'm
> > quite sure we can do better.
>
> The proper name is "list". Since it can't be used directly,
> the second best variant is "group".

Let's delay this part of the discussion until we see enough examples of those "lists" or "groups".

> > What are the possible kinds of "groups", which are not balanced
> > parenthetical expressions?
>
> All groups should be balanced. Even in languages that don't use
> parens and brackets, e.g. "def" and "end" are balanced
> while these are not parenthetical expressions.
>
> > Can you show a more-or-less exhaustive list of them?
>
> An exhaustive list is not yet finished. But basically most
> groups are parenthetical expressions while some don't use parens.

OK, maybe "exhaustive" is too much. Can you post the list that you have now?
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes
  2024-12-30 7:15 ` Juri Linkov
  2024-12-30 8:00 ` Yuan Fu
  2024-12-30 13:40 ` Eli Zaretskii
@ 2024-12-30 16:30 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-12-30 18:59 ` Juri Linkov
  2 siblings, 1 reply; 66+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-12-30 16:30 UTC (permalink / raw)
To: Juri Linkov
Cc: Mickey Petersen, Yuan Fu, Theodor Thornhill, Eli Zaretskii, 73404

> BTW, my initial intention was to add the thing 'list'.
> But then I discovered that (treesit-thing-next (point) 'list)
> uses the function 'list' instead of the thing 'list'.

I don't have a good suggestion for the actual naming.

Regarding the fact that this arg can take either a symbol or a function (which suffers from a risk of ambiguity, like you discovered), I think it's very important to try and avoid the intersection of the two, and a "standard" way to do that is to use keywords, like `:list`.

Stefan

PS: Tho, strictly speaking you can `(defun :list ...)`. I just hope no one ever does that (although I plead guilty to getting dangerously close to it when I suggested this very idea to Philip for `setup.el`).
* bug#73404: 30.0.50; [forward/kill/etc]-sexp commands do not behave as expected in tree-sitter modes
  2024-12-30 16:30 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-12-30 18:59 ` Juri Linkov
  0 siblings, 0 replies; 66+ messages in thread
From: Juri Linkov @ 2024-12-30 18:59 UTC (permalink / raw)
To: Stefan Monnier
Cc: Mickey Petersen, Yuan Fu, Theodor Thornhill, Eli Zaretskii, 73404

>> BTW, my initial intention was to add the thing 'list'.
>> But then I discovered that (treesit-thing-next (point) 'list)
>> uses the function 'list' instead of the thing 'list'.
>
> I don't have a good suggestion for the actual naming.
>
> Regarding the fact that this arg can take a either symbol or a function
> (which suffers from a risk of ambiguity, like you discovered), I think
> it's very important to try and avoid the intersection of the two, and
> a "standard" way to do that is to use keywords, like `:list`.

Then, for backward compatibility, existing things could support both variants, e.g. 'sentence' and ':sentence', 'text' and ':text' (luckily, no one has defined such functions yet). And the new thing could use ':list', e.g.

    (setq-local treesit-thing-settings
                `((html
                   (:sexp ,(regexp-opt '("element" "text" "attribute" "value")))
                   (:list ,(regexp-opt '("element")))
                   (:sentence "tag")
                   (:text ,(regexp-opt '("comment" "text"))))))

Another variant that avoids the ambiguity would be to use a special syntax in calls, e.g. (treesit-thing-next pos '(thing list)) instead of (treesit-thing-next pos 'list).
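[Editor's illustration] The ambiguity discussed in the last two messages can be reproduced without tree-sitter. The sketch below is a hypothetical classifier, not the treesit.el implementation. It assumes only what the thread reports: a predicate that names a function is treated as a function before it is looked up as a thing. The name `my/classify-pred` is invented for this example.

```elisp
;; Hypothetical dispatcher, for illustration only; not part of treesit.el.
(defun my/classify-pred (pred)
  "Return how a predicate dispatcher might interpret PRED."
  (cond ((stringp pred) 'regexp)
        ((functionp pred) 'function) ; tested before the thing lookup
        ((symbolp pred) 'thing)
        (t 'unknown)))

(my/classify-pred 'list)     ; => function, because `list' is a defined function
(my/classify-pred :list)     ; => thing, as long as nobody defines a function named :list
(my/classify-pred "element") ; => regexp
```

The second call shows why a keyword sidesteps the clash in a default session, and also why the PS above still applies: defining a function named `:list` would turn the result back into `function`.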