all messages for Emacs-related lists mirrored at yhetil.org
* bug#60894: 30.0.50; [PATCH] Add treesit-forward-sexp
@ 2023-01-17 20:44 Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-01-17 20:53 ` Dmitry Gutov
  2023-01-17 21:13 ` Mickey Petersen
  0 siblings, 2 replies; 25+ messages in thread
From: Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-01-17 20:44 UTC (permalink / raw)
  To: 60894; +Cc: Juri Linkov, Mickey Petersen, Stefan Monnier

[-- Attachment #1: Type: text/plain, Size: 1683 bytes --]


Hi Emacs (also Juri and Mickey, as you've expressed some interest in this)

This is an example patch for sexp movement with tree sitter.  I want to
put it out here to hopefully produce some discussion, sooner rather than
later.

Three initial questions:

1. What should a sexp be?

  Is it basically "everything", or is there a distinction between
  "word", "sexp" and "sentence"?  For Lisp, forward-sexp looks like
  "jump over words, or a balanced pair of parens".  In other languages
  that can look a little weird - consider:

  ```
  foo().|bar().baz(); -> foo().bar|().baz(); -> foo().bar()|.baz();
  ```
  
  In a sense it could be considered "better", or at least distinct from
  forward-word, to move like this:

  ```
  foo().|bar().baz(); -> foo().bar()|.baz(); -> foo().bar().baz()|;
  ```

2. Should this new function be leveraged in transpose-sexps?

  IMO if forward-sexp gets too close to forward-word or
  forward-sentence, we lose some nice properties of the current
  'treesit-transpose-sexps', namely (among others):

  ```
  f(String foo,| Integer bar) -> f(Integer bar, String foo|)
  ```

  I know you, Mickey, have expressed some dissatisfaction with the
  current implementation - now is a good time to make some worthwhile
  improvements.


3. What are the "rules"?

  In c-mode and elisp-mode without paredit, forward-sexp won't jump out
  of the current scope; with paredit enabled it will.
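
  To make the difference concrete, here is a minimal sketch (not part
  of the patch) of a command that behaves like the syntax-based
  forward-sexp inside a list, but steps out of the enclosing list when
  there is nothing left to move over, which is roughly the
  paredit-style behaviour mentioned above.  The name
  `my/forward-sexp-or-up' is hypothetical.

```elisp
;; Sketch only: `my/forward-sexp-or-up' is a made-up name.
(defun my/forward-sexp-or-up (&optional arg)
  "Move forward ARG sexps; at the end of a list, move out of it instead."
  (interactive "^p")
  (condition-case nil
      ;; Bind the hook to nil so the syntax-table based motion is used.
      (let ((forward-sexp-function nil))
        (forward-sexp (or arg 1)))
    ;; Called from Lisp, syntax-based `forward-sexp' signals
    ;; `scan-error' when the containing expression ends first.
    (scan-error (up-list 1))))
```

  Whether a tree-sitter based forward-sexp should stop at the scope
  boundary or leave it is exactly the open question; the sketch only
  shows that either rule is cheap to express.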


If we simply want some code similar to this to live and slowly evolve I
guess we can install something like this patch after some tweaks and
iterate when we have more experience.

Anyway, I hope these questions and thoughts will spark some discussion.

Theo



[-- Attachment #2: 0001-Add-treesit-forward-sexp.patch --]
[-- Type: text/x-patch, Size: 2961 bytes --]

From b02d09216cad2833f96decf46587d51a5ce98ea9 Mon Sep 17 00:00:00 2001
From: Theodor Thornhill <theo@thornhill.no>
Date: Tue, 17 Jan 2023 21:18:29 +0100
Subject: [PATCH] Add treesit-forward-sexp

* lisp/progmodes/java-ts-mode.el (java-ts-mode): Use
treesit-sexp-type-regexp.
* lisp/treesit.el (treesit-sexp-type-regexp): New defvar.
(treesit-forward-sexp): New command.
(treesit-major-mode-setup): Conditionally set forward-sexp-function.
---
 lisp/progmodes/java-ts-mode.el | 15 +++++++++++++++
 lisp/treesit.el                | 17 +++++++++++++++++
 2 files changed, 32 insertions(+)

diff --git a/lisp/progmodes/java-ts-mode.el b/lisp/progmodes/java-ts-mode.el
index 83c437d307..03093e0980 100644
--- a/lisp/progmodes/java-ts-mode.el
+++ b/lisp/progmodes/java-ts-mode.el
@@ -328,6 +328,21 @@ java-ts-mode
                             "package_declaration"
                             "import_declaration")))
 
+  (setq-local treesit-sexp-type-regexp
+              (regexp-opt '("annotation"
+                            "parenthesized_expression"
+                            "argument_list"
+                            "identifier"
+                            "modifiers"
+                            "block"
+                            "body"
+                            "literal"
+                            "access"
+                            "reference"
+                            "_type"
+                            "true"
+                            "false")))
+
   ;; Font-lock.
   (setq-local treesit-font-lock-settings java-ts-mode--font-lock-settings)
   (setq-local treesit-font-lock-feature-list
diff --git a/lisp/treesit.el b/lisp/treesit.el
index 69bfff21df..735df7a8f4 100644
--- a/lisp/treesit.el
+++ b/lisp/treesit.el
@@ -1622,6 +1622,21 @@ treesit-search-forward-goto
       (goto-char current-pos)))
     node))
 
+(defvar-local treesit-sexp-type-regexp nil
+  "A regexp that matches the node type of sexp nodes.
+
+A sexp node is a node that is bigger than punctuation, and
+delimits medium sized statements in the source code.  It is,
+however, smaller in scope than sentences.  This is used by
+`treesit-forward-sexp' and friends.")
+
+(defun treesit-forward-sexp (&optional arg)
+  (interactive "^p")
+  (or arg (setq arg 1))
+  (funcall
+   (if (> arg 0) #'treesit-end-of-thing #'treesit-beginning-of-thing)
+   treesit-sexp-type-regexp (abs arg)))
+
 (defun treesit-transpose-sexps (&optional arg)
   "Tree-sitter `transpose-sexps' function.
 Arg is the same as in `transpose-sexps'.
@@ -2287,6 +2302,8 @@ treesit-major-mode-setup
     (setq-local add-log-current-defun-function
                 #'treesit-add-log-current-defun))
 
+  (when treesit-sexp-type-regexp
+    (setq-local forward-sexp-function #'treesit-forward-sexp))
   (setq-local transpose-sexps-function #'treesit-transpose-sexps)
   (when treesit-sentence-type-regexp
     (setq-local forward-sentence-function #'treesit-forward-sentence))
-- 
2.34.1



* bug#60894: 30.0.50; [PATCH] Add treesit-forward-sexp
  2023-01-17 20:44 bug#60894: 30.0.50; [PATCH] Add treesit-forward-sexp Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-01-17 20:53 ` Dmitry Gutov
  2023-01-17 21:07   ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-01-17 21:13 ` Mickey Petersen
  1 sibling, 1 reply; 25+ messages in thread
From: Dmitry Gutov @ 2023-01-17 20:53 UTC (permalink / raw)
  To: Theodor Thornhill, 60894; +Cc: Mickey Petersen, Stefan Monnier, Juri Linkov

On 17/01/2023 22:44, Theodor Thornhill via Bug reports for GNU Emacs, 
the Swiss army knife of text editors wrote:
> 1. What should a sexp be?
> 
>    Is it basically "everything", or is there a distinction between
>    "word", "sexp" and "sentence"?  For lisp forward-sexp looks like a
>    "jump over words, or a balanced pair of parens".  In other languages
>    that can look a little weird - consider:
> 
>    ```
>    foo().|bar().baz(); -> foo().bar|().baz(); -> foo().bar()|.baz();
>    ```
>    
>    In a sense it could be considered "better", or at least distinct from
>    forward-word to:
> 
>    ```
>    foo().|bar().baz(); -> foo().bar()|.baz(); -> foo().bar().baz()|;

One of the key things for Ruby, I think, is to jump over expressions.

E.g. when the point is before 'def' in

   def foo
     ...
   end

forward-sexp jumps to after 'end'. And backward-sexp jumps back.

Same for

   if 2 == 3
     ...
   end

, parenthesized expressions and (less important) method calls and 
statements as well.






* bug#60894: 30.0.50; [PATCH] Add treesit-forward-sexp
  2023-01-17 20:53 ` Dmitry Gutov
@ 2023-01-17 21:07   ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-01-18  1:50     ` Dmitry Gutov
  0 siblings, 1 reply; 25+ messages in thread
From: Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-01-17 21:07 UTC (permalink / raw)
  To: Dmitry Gutov, 60894; +Cc: Mickey Petersen, Stefan Monnier, Juri Linkov

[-- Attachment #1: Type: text/plain, Size: 1269 bytes --]

Dmitry Gutov <dgutov@yandex.ru> writes:

> On 17/01/2023 22:44, Theodor Thornhill via Bug reports for GNU Emacs, 
> the Swiss army knife of text editors wrote:
>> 1. What should a sexp be?
>> 
>>    Is it basically "everything", or is there a distinction between
>>    "word", "sexp" and "sentence"?  For lisp forward-sexp looks like a
>>    "jump over words, or a balanced pair of parens".  In other languages
>>    that can look a little weird - consider:
>> 
>>    ```
>>    foo().|bar().baz(); -> foo().bar|().baz(); -> foo().bar()|.baz();
>>    ```
>>    
>>    In a sense it could be considered "better", or at least distinct from
>>    forward-word to:
>> 
>>    ```
>>    foo().|bar().baz(); -> foo().bar()|.baz(); -> foo().bar().baz()|;
>
> One of the key things for Ruby, I think, is to jump over expressions.
>
> E.g. when the point is before 'def' in
>
>    def foo
>      ...
>    end
>
> forward-sexp jumps to after 'end'. And backward-sexp jumps back.
>
> Same for
>
>    if 2 == 3
>      ...
>    end
>
> , parenthesized expressions and (less important) method calls and 
> statements as well.


Test this very untested addition to the patch (I know no Ruby).  It
seems to do what you want.  I'd consider this sentence movement, though,
i.e. something for M-e.

Theo


[-- Attachment #2: 0001-Add-treesit-forward-sexp.patch --]
[-- Type: text/x-patch, Size: 3583 bytes --]

From 9934c4225633ff0524269902b8c574666cdfe0cb Mon Sep 17 00:00:00 2001
From: Theodor Thornhill <theo@thornhill.no>
Date: Tue, 17 Jan 2023 21:18:29 +0100
Subject: [PATCH] Add treesit-forward-sexp

* lisp/progmodes/java-ts-mode.el (java-ts-mode): Use
treesit-sexp-type-regexp.
* lisp/treesit.el (treesit-sexp-type-regexp): New defvar.
(treesit-forward-sexp): New command.
(treesit-major-mode-setup): Conditionally set forward-sexp-function.
* lisp/progmodes/ruby-ts-mode.el: Add some types to ruby-ts-mode.
---
 lisp/progmodes/java-ts-mode.el | 15 +++++++++++++++
 lisp/progmodes/ruby-ts-mode.el |  4 ++++
 lisp/treesit.el                | 17 +++++++++++++++++
 3 files changed, 36 insertions(+)

diff --git a/lisp/progmodes/java-ts-mode.el b/lisp/progmodes/java-ts-mode.el
index 83c437d307..03093e0980 100644
--- a/lisp/progmodes/java-ts-mode.el
+++ b/lisp/progmodes/java-ts-mode.el
@@ -328,6 +328,21 @@ java-ts-mode
                             "package_declaration"
                             "import_declaration")))
 
+  (setq-local treesit-sexp-type-regexp
+              (regexp-opt '("annotation"
+                            "parenthesized_expression"
+                            "argument_list"
+                            "identifier"
+                            "modifiers"
+                            "block"
+                            "body"
+                            "literal"
+                            "access"
+                            "reference"
+                            "_type"
+                            "true"
+                            "false")))
+
   ;; Font-lock.
   (setq-local treesit-font-lock-settings java-ts-mode--font-lock-settings)
   (setq-local treesit-font-lock-feature-list
diff --git a/lisp/progmodes/ruby-ts-mode.el b/lisp/progmodes/ruby-ts-mode.el
index 939c054b04..78186c78d5 100644
--- a/lisp/progmodes/ruby-ts-mode.el
+++ b/lisp/progmodes/ruby-ts-mode.el
@@ -984,6 +984,10 @@ ruby-ts-mode
   ;; Navigation.
   (setq-local treesit-defun-type-regexp ruby-ts--method-regex)
 
+  (setq-local treesit-sexp-type-regexp
+              (regexp-opt '("method"
+                            "if")))
+
   ;; AFAIK, Ruby can not nest methods
   (setq-local treesit-defun-prefer-top-level nil)
 
diff --git a/lisp/treesit.el b/lisp/treesit.el
index 69bfff21df..735df7a8f4 100644
--- a/lisp/treesit.el
+++ b/lisp/treesit.el
@@ -1622,6 +1622,21 @@ treesit-search-forward-goto
       (goto-char current-pos)))
     node))
 
+(defvar-local treesit-sexp-type-regexp nil
+  "A regexp that matches the node type of sexp nodes.
+
+A sexp node is a node that is bigger than punctuation, and
+delimits medium sized statements in the source code.  It is,
+however, smaller in scope than sentences.  This is used by
+`treesit-forward-sexp' and friends.")
+
+(defun treesit-forward-sexp (&optional arg)
+  (interactive "^p")
+  (or arg (setq arg 1))
+  (funcall
+   (if (> arg 0) #'treesit-end-of-thing #'treesit-beginning-of-thing)
+   treesit-sexp-type-regexp (abs arg)))
+
 (defun treesit-transpose-sexps (&optional arg)
   "Tree-sitter `transpose-sexps' function.
 Arg is the same as in `transpose-sexps'.
@@ -2287,6 +2302,8 @@ treesit-major-mode-setup
     (setq-local add-log-current-defun-function
                 #'treesit-add-log-current-defun))
 
+  (when treesit-sexp-type-regexp
+    (setq-local forward-sexp-function #'treesit-forward-sexp))
   (setq-local transpose-sexps-function #'treesit-transpose-sexps)
   (when treesit-sentence-type-regexp
     (setq-local forward-sentence-function #'treesit-forward-sentence))
-- 
2.34.1



* bug#60894: 30.0.50; [PATCH] Add treesit-forward-sexp
  2023-01-17 20:44 bug#60894: 30.0.50; [PATCH] Add treesit-forward-sexp Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-01-17 20:53 ` Dmitry Gutov
@ 2023-01-17 21:13 ` Mickey Petersen
  2023-01-18 13:31   ` Dmitry Gutov
  2023-01-18 17:09   ` Juri Linkov
  1 sibling, 2 replies; 25+ messages in thread
From: Mickey Petersen @ 2023-01-17 21:13 UTC (permalink / raw)
  To: Theodor Thornhill; +Cc: 60894, monnier, juri


Theodor Thornhill <theo@thornhill.no> writes:

> Hi Emacs (also Juri and Mickey as you've expressed some interest for this)
>
> This is an example patch for sexp movement with tree sitter.  I want to
> put it out here to hopefully produce some discussion, sooner rather than
> later.
>
> Three initial questions:
>
> 1. What should a sexp be?
>

`forward-sexp' and friends behave exactly as they should behave. If
there are sexp-like things on the left and/or right-hand side of
point, notwithstanding certain punctuation in the direction of travel,
then it uses that to determine where to move; where to kill to; and
the boundaries for transposition.

Absent an sexp, it defaults to word-based movement. Exactly as it should.

With that out of the way: that does not mean we cannot extend its
behaviour in ways that are both predictable and useful.

It should know about things the default `syntax-ppss' machinery cannot
figure out for itself.

Consider HTML (or anything in the SGML club of languages). I would expect
sexp movement to move over a matched pair of tags. It currently does
not; the reason why is understandable when you know how `syntax-ppss'
does (or does not, as it were) work. (You can equally make an argument
that it should simply go to the end of an opening/closing node and not
the pair, but that is personal preference.)

`nxml-mode' handles it properly; Combobulate handles it properly, too.
But Combobulate also falls back to the classic sexp behaviour if it
cannot find a suitable node in the direction of travel.

And that is the key thing here. It augments; it does not replace.
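
The "augment, do not replace" idea can be sketched as a wrapper that
tries a node-based mover first and falls back to the classic
syntax-based motion when point did not move.  This is an illustration
only, not Combobulate's code and not part of the patch; both
`my/primary-sexp-mover' and `my/forward-sexp-with-fallback' are
hypothetical names.

```elisp
;; Sketch only: both names below are hypothetical.
(defvar my/primary-sexp-mover nil
  "Function of one argument (a count) that tries node-based motion.
It should leave point unchanged when it finds nothing suitable.")

(defun my/forward-sexp-with-fallback (&optional arg)
  "Move by sexp, trying `my/primary-sexp-mover' first."
  (interactive "^p")
  (let ((arg (or arg 1))
        (start (point)))
    (when (functionp my/primary-sexp-mover)
      (condition-case nil
          (funcall my/primary-sexp-mover arg)
        ;; Treat any failure of the primary mover as "found nothing".
        (error (goto-char start))))
    (when (= (point) start)
      ;; Bind the hook to nil to get the syntax-table based behaviour,
      ;; and to avoid recursion if this function is itself the hook.
      (let ((forward-sexp-function nil))
        (forward-sexp arg)))))
```

A mode could then install it with
(setq-local forward-sexp-function #'my/forward-sexp-with-fallback)
and point `my/primary-sexp-mover' at whatever tree-sitter based
function it prefers.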


>   Is it basically "everything", or is there a distinction between
>   "word", "sexp" and "sentence"?  For lisp forward-sexp looks like a
>   "jump over words, or a balanced pair of parens".  In other languages
>   that can look a little weird - consider:
>

Yes, there are distinctions, and no, sexp-based movement is not the
only game in town. Certainly not pre-TS; and definitely not after.

Word movement is a subset of sexp-based movement. Sentence and
paragraph-based movement is rarely altered in coding major modes. But M-{
and M-} are still useful for moving through contiguous blocks of code
separated by blank lines. Not its intended use, but absent something
better, it's OK. Room for improvement for sure.


>   ```
>   foo().|bar().baz(); -> foo().bar|().baz(); -> foo().bar()|.baz();
>   ```
>

Subject to the vagaries of the syntax table, yes, that's right. It's
also useful. I can punch `C-M-k' twice and gobble up the caller and
whatever is in the brackets. That is not possible with word-based
killing alone.

>   In a sense it could be considered "better", or at least distinct from
>   forward-word to:
>
>   ```
>   foo().|bar().baz(); -> foo().bar()|.baz(); -> foo().bar().baz()|;
>   ```
>

That is not how word movement works on my machine:

    foo().bar().baz();
          |  |     |

That is the path of travel for `M-f' in C-mode. This is how `C-M-f'
moves:

    foo().bar().baz();
          |  | |   |
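
The two paths can be reproduced outside of C-mode with a small helper.
This is a sketch under one assumption: it runs in a temporary buffer
with the standard syntax table rather than the C-mode one, which for
this particular line should give the same stops.  `my/motion-stops' is
a hypothetical name.

```elisp
;; Sketch only: `my/motion-stops' is a made-up helper.
(defun my/motion-stops (motion text start steps)
  "Return the positions reached by calling MOTION STEPS times.
TEXT is inserted into a temporary buffer and point starts at START.
The buffer uses the standard syntax table, not the one from `c-mode'."
  (with-temp-buffer
    (insert text)
    (goto-char start)
    (let ((forward-sexp-function nil)
          (stops '()))
      (dotimes (_ steps)
        (funcall motion 1)
        (push (point) stops))
      (nreverse stops))))

;; Point starts just before "bar" (buffer position 7).
(my/motion-stops #'forward-word "foo().bar().baz();" 7 2)
(my/motion-stops #'forward-sexp "foo().bar().baz();" 7 3)
```

The word-based call stops after "bar" and after "baz"; the sexp-based
call additionally stops after the "()" that follows "bar".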

> 2. Should this new function be leveraged in transpose-sexps?
>
>   IMO if the forward-sexp gets too close to forward-word, or
>   forward-sentence we lose some nice properties with the current
>   'treesit-transpose-sexps', namely (among others):
>
>   ```
>   f(String foo,| Integer bar) -> f(Integer bar, String foo|)
>   ```
>
>   I know you Mickey have expressed some dissatisfaction with the current
>   implementation - now is a good time to make some worthwhile
>   improvements.

My dissatisfaction was merely the approach. You cannot transpose
sibling nodes and expect that to behave in a manner that is logical to
anyone who is not fluent in the specifics of every TS grammar they
interact with.

Things that appear to be 'siblings' to the human eye may in fact not
be that at all.

And some grammars are rather more generous with their level of
granularity than others: both in how expansive they are, but also in
how deep the trees are. So when you go foraging for nodes to
transpose, you may end up finding sub-nodes to the ones you'd actually
want to transpose because they are smaller and closer units.

>
> 3. What are the "rules"?
>
>   In c-mode, elisp-mode without paredit forward-sexp won't jump out of
>   the current scope, however with paredit enabled it will.
>
>
> If we simply want some code similar to this to live and slowly evolve I
> guess we can install something like this patch after some tweaks and
> iterate when we have more experience.
>

The problem with that patch is this (caveat: I do not use Java, so I
am not too versed in the details of the node types used): most of
the things I see are things that a normal sexp would work with anyway:

- true, false: they are just words, right?
- identifier: presumably a symbol or something that a decent syntax
  table would probably catch.
- parenthesized_expression: delimited by ( ), etc.? Sexp handles that
  just fine.
- argument_list: presumably uses ( ) also


> Anyway, I hope these questions and thoughts will spark some discussion.
>
> Theo
>
>
> [2. text/x-patch; 0001-Add-treesit-forward-sexp.patch]...







* bug#60894: 30.0.50; [PATCH] Add treesit-forward-sexp
  2023-01-17 21:07   ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-01-18  1:50     ` Dmitry Gutov
  2023-01-18  5:35       ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 25+ messages in thread
From: Dmitry Gutov @ 2023-01-18  1:50 UTC (permalink / raw)
  To: Theodor Thornhill, 60894; +Cc: Mickey Petersen, Stefan Monnier, Juri Linkov

On 17/01/2023 23:07, Theodor Thornhill via Bug reports for GNU Emacs, 
the Swiss army knife of text editors wrote:
> Test this very untested addition to the patch (I know no ruby).  It
> seems to do what you want.  I'd consider this sentence movement, though.
> For M-e

That seems to be working rather well, thanks. I just needed to extend 
the list of nodes:

   (setq-local treesit-sexp-type-regexp
               (regexp-opt '("class"
                             "module"
                             "method"
                             "argument_list"
                             "array"
                             "hash"
                             "parenthesized_statements"
                             "if"
                             "case"
                             "block"
                             "do_block"
                             "begin")))

With array, hash, etc, you see it's not exactly like a sentence.
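
One thing to keep in mind with a list like this: `regexp-opt' returns
an unanchored regexp, so if node types are matched with a plain regexp
search (an assumption about how the patch ends up using the variable),
any node type that merely contains one of these strings matches too.
A minimal sketch; the node type names tested are examples that the
Ruby grammar is believed to use, and `my/anchored-type-regexp' is a
hypothetical helper:

```elisp
;; Sketch only: the tested node type names are assumptions.
(defun my/anchored-type-regexp (types)
  "Return a regexp matching exactly one of the strings in TYPES."
  (concat "\\`" (regexp-opt types t) "\\'"))

(let* ((types '("class" "module" "method" "argument_list"
                "array" "hash" "parenthesized_statements"
                "if" "case" "block" "do_block" "begin"))
       (loose (regexp-opt types))
       (strict (my/anchored-type-regexp types))
       (case-fold-search nil))
  (list
   ;; Unanchored: substrings are enough to match.
   (and (string-match-p loose "if") t)                ; listed type
   (and (string-match-p loose "elsif") t)             ; contains "if"
   (and (string-match-p loose "singleton_method") t)  ; contains "method"
   (and (string-match-p loose "call") t)              ; no listed substring
   ;; Anchored: only whole node type names match.
   (and (string-match-p strict "if") t)
   (and (string-match-p strict "elsif") t)))
```

Whether the looser matching is wanted here is a design choice; the
anchored form just makes the set of sexp nodes explicit.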

Regarding your previous question -- whether forward-sexp should jump 
over the arglist together with the called method name -- ruby-mode's 
answer to that is:

- If point is before ".", jump over ".bar(...)".
- If point is after ".", jump over "bar" only.

But the difference is more subtle here, and different people might have 
different preferences. This also seems more difficult to express via 
node types since "." is in the middle of the (call) node.
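
For illustration, the two-case rule above can be written without any
node types at all, by looking at the character after point.  This is a
rough sketch that relies only on the syntax table; it is not
ruby-mode's actual implementation, and `my/forward-call-part' is a
hypothetical name.

```elisp
;; Sketch only: `my/forward-call-part' is a made-up name, and this
;; relies on the syntax table rather than on tree-sitter nodes.
(defun my/forward-call-part ()
  "Before \".\", move over \".name(...)\"; otherwise over the name only."
  (interactive)
  (if (eq (char-after) ?.)
      (progn
        (forward-char 1)              ; over the "."
        (skip-syntax-forward "w_")    ; over the method name
        (when (eq (char-after) ?\()
          (forward-list 1)))          ; over the balanced argument list
    (skip-syntax-forward "w_")))
```

A node-based version would presumably need to inspect the (call) node
around point and decide from point's offset within it, which is the
part that a plain list of node types cannot express.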






* bug#60894: 30.0.50; [PATCH] Add treesit-forward-sexp
  2023-01-18  1:50     ` Dmitry Gutov
@ 2023-01-18  5:35       ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-01-18 12:21         ` Eli Zaretskii
                           ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-01-18  5:35 UTC (permalink / raw)
  To: Dmitry Gutov, 60894; +Cc: Mickey Petersen, Stefan Monnier, Juri Linkov

[-- Attachment #1: Type: text/plain, Size: 1754 bytes --]

Dmitry Gutov <dgutov@yandex.ru> writes:

> On 17/01/2023 23:07, Theodor Thornhill via Bug reports for GNU Emacs, 
> the Swiss army knife of text editors wrote:
>> Test this very untested addition to the patch (I know no ruby).  It
>> seems to do what you want.  I'd consider this sentence movement, though.
>> For M-e
>
> That seems to be working rather well, thanks. I just needed to extend 
> the list of nodes:
>
>    (setq-local treesit-sexp-type-regexp
>                (regexp-opt '("class"
>                              "module"
>                              "method"
>                              "argument_list"
>                              "array"
>                              "hash"
>                              "parenthesized_statements"
>                              "if"
>                              "case"
>                              "block"
>                              "do_block"
>                              "begin")))
>
> With array, hash, etc, you see it's not exactly like a sentence.
>
> Regarding your previous question -- whether forward-sexp should jump 
> over the arglist together with the called method name -- ruby-mode's 
> answer to that is:
>
> - If point is before ".", jump over ".bar(...)".
> - If point is after ".", jump over "bar" only.
>
> But the difference is more subtle here, and different people might have 
> different preferences. This also seems more difficult to express via 
> node types since "." is in the middle of the (call) node.

Yeah, it's not the easiest problem, but we can wait until we get more
experience in other modes too to solidify this design.  It is not for
emacs-29 anyway, so we have time.

Added some words to the manual and your node types to ruby-ts-mode.

Theo


[-- Attachment #2: 0001-Add-treesit-forward-sexp.patch --]
[-- Type: text/x-patch, Size: 6094 bytes --]

From 3a12fdd4280d7ebbe84e08490a4bc4a249f5dea5 Mon Sep 17 00:00:00 2001
From: Theodor Thornhill <theo@thornhill.no>
Date: Tue, 17 Jan 2023 21:18:29 +0100
Subject: [PATCH] Add treesit-forward-sexp

* lisp/progmodes/java-ts-mode.el (java-ts-mode): Use
treesit-sexp-type-regexp.
* lisp/treesit.el (treesit-sexp-type-regexp): New defvar.
(treesit-forward-sexp): New command.
(treesit-major-mode-setup): Conditionally set forward-sexp-function.
* lisp/progmodes/ruby-ts-mode.el: Add some types to ruby-ts-mode.
* doc/lispref/positions.texi (List Motion): Mention the change in the
manual.
* etc/NEWS: Mention the change.
---
 doc/lispref/positions.texi     | 17 +++++++++++++++++
 etc/NEWS                       |  9 +++++++++
 lisp/progmodes/java-ts-mode.el | 15 +++++++++++++++
 lisp/progmodes/ruby-ts-mode.el | 14 ++++++++++++++
 lisp/treesit.el                | 17 +++++++++++++++++
 5 files changed, 72 insertions(+)

diff --git a/doc/lispref/positions.texi b/doc/lispref/positions.texi
index 8d95ecee7a..8b7431096f 100644
--- a/doc/lispref/positions.texi
+++ b/doc/lispref/positions.texi
@@ -875,6 +875,23 @@ List Motion
 @code{backward-sentence}(@pxref{Moving by Sentences,,, emacs, The
 extensible self-documenting text editor}).
 
+@defvar treesit-sexp-type-regexp
+The value of this variable is a regexp matching the node type of sexp
+nodes.  (For ``node'' and ``node type'', @pxref{Parsing Program
+Source}.)
+@end defvar
+
+@findex treesit-forward-sexp
+@findex forward-sexp
+@findex backward-sexp
+If Emacs is compiled with tree-sitter, it can use the tree-sitter
+parser information to move across syntax constructs.  Since what
+exactly is considered a sexp varies between languages, a major mode
+should set @code{treesit-sexp-type-regexp} to determine that.
+Then the mode can get navigation-by-sexp functionality for free, by
+using @code{forward-sexp} and @code{backward-sexp}(@pxref{Moving by
+Sentences,,, emacs, The extensible self-documenting text editor}).
+
 @node Skipping Characters
 @subsection Skipping Characters
 @cindex skipping characters
diff --git a/etc/NEWS b/etc/NEWS
index cde6783349..56e689b239 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -84,6 +84,15 @@ define "sentences" in Tree-sitter enabled modes.
 All tree-sitter modes that define 'treesit-sentence-type-regexp' now
 set 'forward-sentence-function' to call 'treesit-forward-sentence'.
 
+*** New defvar-local 'treesit-sexp-type-regexp'.
+Similarly to 'treesit-defun-type-regexp', this variable is used to
+define "sexps" in Tree-sitter enabled modes.
+
+*** New function 'treesit-forward-sexp'.
+treesit.el conditionally sets 'forward-sexp-function' for major modes
+that have defined 'treesit-sexp-type-regexp' to enable sexp related
+commands.
+
 \f
 * Changes in Specialized Modes and Packages in Emacs 30.1
 ---
diff --git a/lisp/progmodes/java-ts-mode.el b/lisp/progmodes/java-ts-mode.el
index 83c437d307..03093e0980 100644
--- a/lisp/progmodes/java-ts-mode.el
+++ b/lisp/progmodes/java-ts-mode.el
@@ -328,6 +328,21 @@ java-ts-mode
                             "package_declaration"
                             "import_declaration")))
 
+  (setq-local treesit-sexp-type-regexp
+              (regexp-opt '("annotation"
+                            "parenthesized_expression"
+                            "argument_list"
+                            "identifier"
+                            "modifiers"
+                            "block"
+                            "body"
+                            "literal"
+                            "access"
+                            "reference"
+                            "_type"
+                            "true"
+                            "false")))
+
   ;; Font-lock.
   (setq-local treesit-font-lock-settings java-ts-mode--font-lock-settings)
   (setq-local treesit-font-lock-feature-list
diff --git a/lisp/progmodes/ruby-ts-mode.el b/lisp/progmodes/ruby-ts-mode.el
index 939c054b04..d99be96475 100644
--- a/lisp/progmodes/ruby-ts-mode.el
+++ b/lisp/progmodes/ruby-ts-mode.el
@@ -984,6 +984,20 @@ ruby-ts-mode
   ;; Navigation.
   (setq-local treesit-defun-type-regexp ruby-ts--method-regex)
 
+  (setq-local treesit-sexp-type-regexp
+              (regexp-opt '("class"
+                            "module"
+                            "method"
+                            "argument_list"
+                            "array"
+                            "hash"
+                            "parenthesized_statements"
+                            "if"
+                            "case"
+                            "block"
+                            "do_block"
+                            "begin")))
+
   ;; AFAIK, Ruby can not nest methods
   (setq-local treesit-defun-prefer-top-level nil)
 
diff --git a/lisp/treesit.el b/lisp/treesit.el
index 69bfff21df..735df7a8f4 100644
--- a/lisp/treesit.el
+++ b/lisp/treesit.el
@@ -1622,6 +1622,21 @@ treesit-search-forward-goto
       (goto-char current-pos)))
     node))
 
+(defvar-local treesit-sexp-type-regexp nil
+  "A regexp that matches the node type of sexp nodes.
+
+A sexp node is a node that is bigger than punctuation, and
+delimits medium sized statements in the source code.  It is,
+however, smaller in scope than sentences.  This is used by
+`treesit-forward-sexp' and friends.")
+
+(defun treesit-forward-sexp (&optional arg)
+  (interactive "^p")
+  (or arg (setq arg 1))
+  (funcall
+   (if (> arg 0) #'treesit-end-of-thing #'treesit-beginning-of-thing)
+   treesit-sexp-type-regexp (abs arg)))
+
 (defun treesit-transpose-sexps (&optional arg)
   "Tree-sitter `transpose-sexps' function.
 Arg is the same as in `transpose-sexps'.
@@ -2287,6 +2302,8 @@ treesit-major-mode-setup
     (setq-local add-log-current-defun-function
                 #'treesit-add-log-current-defun))
 
+  (when treesit-sexp-type-regexp
+    (setq-local forward-sexp-function #'treesit-forward-sexp))
   (setq-local transpose-sexps-function #'treesit-transpose-sexps)
   (when treesit-sentence-type-regexp
     (setq-local forward-sentence-function #'treesit-forward-sentence))
-- 
2.34.1



* bug#60894: 30.0.50; [PATCH] Add treesit-forward-sexp
  2023-01-18  5:35       ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-01-18 12:21         ` Eli Zaretskii
  2023-01-18 13:39         ` Dmitry Gutov
  2023-01-18 17:16         ` Juri Linkov
  2 siblings, 0 replies; 25+ messages in thread
From: Eli Zaretskii @ 2023-01-18 12:21 UTC (permalink / raw)
  To: Theodor Thornhill; +Cc: 60894, juri, mickey, monnier, dgutov

> Cc: Mickey Petersen <mickey@masteringemacs.org>,
>  Stefan Monnier <monnier@iro.umontreal.ca>, Juri Linkov <juri@linkov.net>
> Date: Wed, 18 Jan 2023 06:35:36 +0100
> From:  Theodor Thornhill via "Bug reports for GNU Emacs,
>  the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org>
> 
> diff --git a/doc/lispref/positions.texi b/doc/lispref/positions.texi
> index 8d95ecee7a..8b7431096f 100644
> --- a/doc/lispref/positions.texi
> +++ b/doc/lispref/positions.texi
> @@ -875,6 +875,23 @@ List Motion
>  @code{backward-sentence}(@pxref{Moving by Sentences,,, emacs, The
>  extensible self-documenting text editor}).
>  
> +@defvar treesit-sexp-type-regexp
> +The value of this variable is a regexp matching the node type of sexp
> +nodes.  (For ``node'' and ``node type'', @pxref{Parsing Program
> +Source}.)
> +@end defvar
> +
> +@findex treesit-forward-sexp
> +@findex forward-sexp
> +@findex backward-sexp
> +If Emacs is compiled with tree-sitter, it can use the tree-sitter
> +parser information to move across syntax constructs.  Since what
> +exactly is considered a sexp varies between languages, a major mode
> +should set @code{treesit-sexp-type-regexp} to determine that.
> +Then the mode can get navigation-by-sexp functionality for free, by
> +using @code{forward-sexp} and @code{backward-sexp}(@pxref{Moving by
> +Sentences,,, emacs, The extensible self-documenting text editor}).

The last 2 index entries should either not be there or be qualified:

  @findex forward-sexp@r{, and tree-sitter}

That's because these two commands are described in another place in
the manual, and the unqualified index entries should go there.
Otherwise, if the readers type in Info "i forward-s TAB", they will
see something like this:

  forward-sexp
  forward-sexp<1>

and will not know which one of these two they want.  And similarly in
the printed version of the manual: you will get "forward-sexp" with 2
page numbers -- which one is the one you are looking for?

> +*** New function 'treesit-forward-sexp'.
> +treesit.el conditionally sets 'forward-sexp-function' for major modes
> +that have defined 'treesit-sexp-type-regexp' to enable sexp related
> +commands.

"sexp-related motion commands" should be better.  (If "motion" is not
inclusive enough, we could use additional descriptions.)

Thanks.






* bug#60894: 30.0.50; [PATCH] Add treesit-forward-sexp
  2023-01-17 21:13 ` Mickey Petersen
@ 2023-01-18 13:31   ` Dmitry Gutov
  2023-01-18 17:09   ` Juri Linkov
  1 sibling, 0 replies; 25+ messages in thread
From: Dmitry Gutov @ 2023-01-18 13:31 UTC (permalink / raw)
  To: Mickey Petersen, Theodor Thornhill; +Cc: 60894, monnier, juri

On 17/01/2023 23:13, Mickey Petersen wrote:
>> 3. What are the "rules"?
>>
>>    In c-mode, elisp-mode without paredit forward-sexp won't jump out of
>>    the current scope, however with paredit enabled it will.
>>
>>
>> If we simply want some code similar to this to live and slowly evolve I
>> guess we can install something like this patch after some tweaks and
>> iterate when we have more experience.
>>
> The problem with that patch is -- and caveat, I do not use Java, so I
> am not too versed in the details of the node types used -- but most of
> the things I see are things that a normal sexp would work with anyway:
> 
> - true, false: they are just words, right?
> - identifier: presumably a symbol or something that a decent syntax
>    table would probably catch.
> - parenthesized_expression: delimited by ( ), etc. ? Sexp handles that
>    just fine.
> - argument_list: presumably uses ( ) also

Yeah, it seems like in C-based languages (where any block of code is 
surrounded by curlies) the default forward-sexp should behave decently 
already.

Whether forward-sexp should jump over the whole if () { ... } 
expression, for example, is a matter of taste. Not so with e.g. Python, 
where an 'if' statement can have zero parens and no visible terminator.
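
To make the C-like case concrete, here is a minimal sketch (not part of the
patch) of where the default, syntax-table-based forward-sexp stops on a
brace-style 'if'.  It assumes a plain temporary buffer in fundamental-mode
with the standard syntax table; the helper name `demo-sexp-stops' is made up
for illustration.

```elisp
(defun demo-sexp-stops (text)
  "Return the positions where successive `forward-sexp' calls stop in TEXT.
TEXT is inserted into a temporary buffer and scanned from its start."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let ((forward-sexp-function nil)   ; force the syntax-table-based default
          (stops '()))
      ;; `forward-sexp' signals `scan-error' on unbalanced input; stop there.
      (condition-case nil
          (while (< (point) (point-max))
            (forward-sexp)
            (push (point) stops))
        (scan-error nil))
      (nreverse stops))))

;; Stops after "if", after "(x)", and after the brace block.
(demo-sexp-stops "if (x) { y; }")
```

So the default needs three steps to cross the statement, each of them
sensible on its own; jumping over the whole 'if' in one step is the part
that only a parser-aware function can offer.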







* bug#60894: 30.0.50; [PATCH] Add treesit-forward-sexp
  2023-01-18  5:35       ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-01-18 12:21         ` Eli Zaretskii
@ 2023-01-18 13:39         ` Dmitry Gutov
  2023-01-18 18:28           ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-01-18 19:01           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-01-18 17:16         ` Juri Linkov
  2 siblings, 2 replies; 25+ messages in thread
From: Dmitry Gutov @ 2023-01-18 13:39 UTC (permalink / raw)
  To: Theodor Thornhill, 60894; +Cc: Mickey Petersen, Stefan Monnier, Juri Linkov

On 18/01/2023 07:35, Theodor Thornhill wrote:
> Yeah, it's not the easiest problem, but we can wait until we get more
> experience in other modes too to solidify this design.  It is not for
> emacs-29 anyway, so we have time.
> 
> Added some words to the manual and your node types to ruby-ts-mode

Thanks! As far as I'm concerned, this is good for master, thank you.

Consider looking into an additional couple of commands which (probably) 
could be implemented using the same starting data: backward-up-list and 
down-list. These have been pretty useful to me.

backward-up-list might even work automatically with a sufficiently 
conforming forward-sexp-function (it does with SMIE); not sure about 
down-list -- SMIE has a dedicated implementation of it.

And show-paren-mode support would be nice to have too -- though it'll 
probably require a separate defcustom with node types.






* bug#60894: 30.0.50; [PATCH] Add treesit-forward-sexp
  2023-01-17 21:13 ` Mickey Petersen
  2023-01-18 13:31   ` Dmitry Gutov
@ 2023-01-18 17:09   ` Juri Linkov
  2023-01-18 18:27     ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 1 reply; 25+ messages in thread
From: Juri Linkov @ 2023-01-18 17:09 UTC (permalink / raw)
  To: Mickey Petersen; +Cc: 60894, Theodor Thornhill, Stefan Monnier

> Consider HTML (or anything in SGML club of languages). I would expect
> sexp movement to move over a matched pair of tags. It currently does
> not; the reason why is understandable when you know how `syntax-ppss'
> does (or does not, as it were) work. (You can equally make an argument
> that it should simply go to the end of an opening/closing node and not
> the pair, but that is personal preference.)
>
> `nxml-mode' handles it properly; Combobulate handles it properly, too.
> But Combobulate also falls back to the classic sexp behaviour if it
> cannot find a suitable node in the direction of travel.

While ‘forward-sexp’ moves over the next tag, there is also ‘C-c C-f’
(‘sgml-skip-tag-forward’) that moves over the whole element to the end tag.
I'm not sure if sexp movement in nxml-mode is an improvement since
there is no way to move over the tag only.

To support both cases maybe ‘forward-sexp’ should move over the tag,
and ‘forward-sentence’ over the whole element?






* bug#60894: 30.0.50; [PATCH] Add treesit-forward-sexp
  2023-01-18  5:35       ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-01-18 12:21         ` Eli Zaretskii
  2023-01-18 13:39         ` Dmitry Gutov
@ 2023-01-18 17:16         ` Juri Linkov
  2023-01-18 18:27           ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2 siblings, 1 reply; 25+ messages in thread
From: Juri Linkov @ 2023-01-18 17:16 UTC (permalink / raw)
  To: Theodor Thornhill; +Cc: 60894, Mickey Petersen, Stefan Monnier, Dmitry Gutov

>> That seems to be working rather well, thanks. I just needed to extend
>> the list of nodes:
>>
>>    (setq-local treesit-sexp-type-regexp
>>                (regexp-opt '("class"
>>                              "module"
>>                              "method"
>>                              "argument_list"
>>                              "array"
>>                              "hash"
>>                              "parenthesized_statements"
>>                              "if"
>>                              "case"
>>                              "block"
>>                              "do_block"
>>                              "begin")))
>
> Added some words to the manual and your node types to ruby-ts-mode

Thanks, I tried it out, and it works even better than in ruby-mode.
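
One detail worth checking when building such a list: `regexp-opt' returns an
unanchored regexp.  If the node type is matched with an ordinary regexp
search (an assumption about how the patch uses the variable), then "if"
also matches inside a type such as "elsif".  A small sketch of the
difference, using plain string matching and no tree-sitter; the `demo-'
names are made up here:

```elisp
(defvar demo-sexp-types '("class" "module" "method" "if" "case" "block")
  "A shortened copy of the node type list quoted above.")

(defvar demo-unanchored-regexp (regexp-opt demo-sexp-types)
  "Regexp matching any listed type anywhere inside a string.")

(defvar demo-anchored-regexp
  (concat "\\`" (regexp-opt demo-sexp-types) "\\'")
  "Regexp matching only strings that are exactly one of the listed types.")

(string-match-p demo-unanchored-regexp "elsif")  ; non-nil: "if" occurs inside
(string-match-p demo-anchored-regexp "elsif")    ; nil
(string-match-p demo-anchored-regexp "if")       ; non-nil: exact match
```

Whether the looser match matters depends on the grammar; anchoring just
makes the list mean exactly what it says.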






* bug#60894: 30.0.50; [PATCH] Add treesit-forward-sexp
  2023-01-18 17:09   ` Juri Linkov
@ 2023-01-18 18:27     ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-01-18 18:55       ` Juri Linkov
  2023-01-18 22:06       ` Dmitry Gutov
  0 siblings, 2 replies; 25+ messages in thread
From: Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-01-18 18:27 UTC (permalink / raw)
  To: Juri Linkov, Mickey Petersen; +Cc: 60894, Stefan Monnier



On 18 January 2023 18:09:29 CET, Juri Linkov <juri@linkov.net> wrote:
>> Consider HTML (or anything in SGML club of languages). I would expect
>> sexp movement to move over a matched pair of tags. It currently does
>> not; the reason why is understandable when you know how `syntax-ppss'
>> does (or does not, as it were) work. (You can equally make an argument
>> that it should simply go to the end of an opening/closing node and not
>> the pair, but that is personal preference.)
>>
>> `nxml-mode' handles it properly; Combobulate handles it properly, too.
>> But Combobulate also falls back to the classic sexp behaviour if it
>> cannot find a suitable node in the direction of travel.
>
>While ‘forward-sexp’ moves over the next tag, there is also ‘C-c C-f’
>(‘sgml-skip-tag-forward’) that moves over the whole element to the end tag.
>I'm not sure if sexp movement in nxml-mode is an improvement since
>there is no way to move over the tag only.
>
>To support both cases maybe ‘forward-sexp’ should move over the tag,
>and ‘forward-sentence’ over the whole element?

Yes! This is what I find intuitive, and have tried to explain in my docstrings of the two.

But I'm new in town and want all to at least discuss:)

I can extract a forward-sexp-default-function to amend some of the damage?

Theo






* bug#60894: 30.0.50; [PATCH] Add treesit-forward-sexp
  2023-01-18 17:16         ` Juri Linkov
@ 2023-01-18 18:27           ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 25+ messages in thread
From: Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-01-18 18:27 UTC (permalink / raw)
  To: Juri Linkov; +Cc: 60894, Mickey Petersen, Stefan Monnier, Dmitry Gutov



On 18 January 2023 18:16:06 CET, Juri Linkov <juri@linkov.net> wrote:
>>> That seems to be working rather well, thanks. I just needed to extend
>>> the list of nodes:
>>>
>>>    (setq-local treesit-sexp-type-regexp
>>>                (regexp-opt '("class"
>>>                              "module"
>>>                              "method"
>>>                              "argument_list"
>>>                              "array"
>>>                              "hash"
>>>                              "parenthesized_statements"
>>>                              "if"
>>>                              "case"
>>>                              "block"
>>>                              "do_block"
>>>                              "begin")))
>>
>> Added some words to the manual and your node types to ruby-ts-mode
>
>Thanks, I tried it out, and it works even better than in ruby-mode.

I'm glad!

Theo






* bug#60894: 30.0.50; [PATCH] Add treesit-forward-sexp
  2023-01-18 13:39         ` Dmitry Gutov
@ 2023-01-18 18:28           ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-01-18 19:01           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 0 replies; 25+ messages in thread
From: Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-01-18 18:28 UTC (permalink / raw)
  To: Dmitry Gutov, 60894; +Cc: Mickey Petersen, Stefan Monnier, Juri Linkov



On 18 January 2023 14:39:07 CET, Dmitry Gutov <dgutov@yandex.ru> wrote:
>On 18/01/2023 07:35, Theodor Thornhill wrote:
>> Yeah, it's not the easiest problem, but we can wait until we get more
>> experience in other modes too to solidify this design.  It is not for
>> emacs-29 anyway, so we have time.
>> 
>> Added some words to the manual and your node types to ruby-ts-mode
>
>Thanks! As far as I'm concerned, this is good for master, thank you.
>
>Consider looking into an additional couple of commands which (probably) could be implemented using the same starting data: backward-up-list and down-list. These have been pretty useful to me.
>
>backward-up-list might even work automatically with a sufficiently conforming forward-sexp-function (it does with SMIE); not sure about down-list -- SMIE has a dedicated implementation of it.
>
>And show-paren-mode support would be nice to have too -- though it'll probably require a separate defcustom with node types.


Yep next on my list:)

Thanks for the feedback,
Theo






* bug#60894: 30.0.50; [PATCH] Add treesit-forward-sexp
  2023-01-18 18:27     ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-01-18 18:55       ` Juri Linkov
  2023-01-18 22:06       ` Dmitry Gutov
  1 sibling, 0 replies; 25+ messages in thread
From: Juri Linkov @ 2023-01-18 18:55 UTC (permalink / raw)
  To: Theodor Thornhill; +Cc: 60894, Mickey Petersen, Stefan Monnier

>>> Consider HTML (or anything in SGML club of languages). I would expect
>>> sexp movement to move over a matched pair of tags. It currently does
>>> not; the reason why is understandable when you know how `syntax-ppss'
>>> does (or does not, as it were) work. (You can equally make an argument
>>> that it should simply go to the end of an opening/closing node and not
>>> the pair, but that is personal preference.)
>>>
>>> `nxml-mode' handles it properly; Combobulate handles it properly, too.
>>> But Combobulate also falls back to the classic sexp behaviour if it
>>> cannot find a suitable node in the direction of travel.
>>
>>While ‘forward-sexp’ moves over the next tag, there is also ‘C-c C-f’
>>(‘sgml-skip-tag-forward’) that moves over the whole element to the end tag.
>>I'm not sure if sexp movement in nxml-mode is an improvement since
>>there is no way to move over the tag only.
>>
>>To support both cases maybe ‘forward-sexp’ should move over the tag,
>>and ‘forward-sentence’ over the whole element?
>
> Yes! This is what I find intuitive, and have tried to explain in my docstrings of the two.
>
> But I'm new in town and want all to at least discuss:)
>
> I can extract a forward-sexp-default-function to amend some of the damage?

Maybe in a new html-ts-mode?  I see there is libtree-sitter-html,
but can't find the corresponding ts mode.






* bug#60894: 30.0.50; [PATCH] Add treesit-forward-sexp
  2023-01-18 13:39         ` Dmitry Gutov
  2023-01-18 18:28           ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-01-18 19:01           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-01-18 21:59             ` Dmitry Gutov
  1 sibling, 1 reply; 25+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-01-18 19:01 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Mickey Petersen, 60894, Theodor Thornhill, Juri Linkov

> backward-up-list might even work automatically with a sufficiently
>  conforming forward-sexp-function (it does with SMIE);

I'm happy to hear that opinion, but it's a bit too optimistic:
`backward-up-list` doesn't really work sufficiently well (IMO) just
using `forward-sexp-function`.

My SMIE experience with it is that we really need a specific hook
for that if we want to avoid wonky behaviors in "corner" cases.


        Stefan







* bug#60894: 30.0.50; [PATCH] Add treesit-forward-sexp
  2023-01-18 19:01           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-01-18 21:59             ` Dmitry Gutov
  2023-01-19  2:43               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 25+ messages in thread
From: Dmitry Gutov @ 2023-01-18 21:59 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Mickey Petersen, 60894, Theodor Thornhill, Juri Linkov

On 18/01/2023 21:01, Stefan Monnier wrote:
>> backward-up-list might even work automatically with a sufficiently
>>   conforming forward-sexp-function (it does with SMIE);
> 
> I'm happy to hear that opinion, but it's a bit too optimistic:
> `backward-up-list` doesn't really work sufficiently well (IMO) just
> using `forward-sexp-function`.

It does work fairly well in my experience.

> My SMIE experience with it is that we really need a specific hook
> for that if we want to avoid wonky behaviors in "corner" cases.

A bug report with repro would probably help, I guess. Or you could start 
on that hook yourself.

Do you mean a hook like backward-up-list-function, or something smaller?






* bug#60894: 30.0.50; [PATCH] Add treesit-forward-sexp
  2023-01-18 18:27     ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-01-18 18:55       ` Juri Linkov
@ 2023-01-18 22:06       ` Dmitry Gutov
  2023-01-19  6:24         ` Eli Zaretskii
  2023-01-19  7:58         ` Juri Linkov
  1 sibling, 2 replies; 25+ messages in thread
From: Dmitry Gutov @ 2023-01-18 22:06 UTC (permalink / raw)
  To: Theodor Thornhill, Juri Linkov, Mickey Petersen; +Cc: 60894, Stefan Monnier

On 18/01/2023 20:27, Theodor Thornhill via Bug reports for GNU Emacs, 
the Swiss army knife of text editors wrote:
> 
> On 18 January 2023 18:09:29 CET, Juri Linkov<juri@linkov.net>  wrote:
>>> Consider HTML (or anything in SGML club of languages). I would expect
>>> sexp movement to move over a matched pair of tags. It currently does
>>> not; the reason why is understandable when you know how `syntax-ppss'
>>> does (or does not, as it were) work. (You can equally make an argument
>>> that it should simply go to the end of an opening/closing node and not
>>> the pair, but that is personal preference.)
>>>
>>> `nxml-mode' handles it properly; Combobulate handles it properly, too.
>>> But Combobulate also falls back to the classic sexp behaviour if it
>>> cannot find a suitable node in the direction of travel.
>> While ‘forward-sexp’ moves over the next tag, there is also ‘C-c C-f’
>> (‘sgml-skip-tag-forward’) that moves over the whole element to the end tag.
>> I'm not sure if sexp movement in nxml-mode is an improvement since
>> there is no way to move over the tag only.
>>
>> To support both cases maybe ‘forward-sexp’ should move over the tag,
>> and ‘forward-sentence’ over the whole element?
> Yes! This is what I find intuitive, and have tried to explain in my docstrings of the two.
> 
> But I'm new in town and want all to at least discuss:)

In my intuition, sexps are expressions which can be nested (a lot).

Sentences are "flat" expressions. I would probably say that in 
tree-sitter modes sentences should be equivalent to "statements".

So, sexps could be small, and they could be large. For them various list 
navigation operations make sense (like the previously mentioned ones).

Sentences stand somewhere in the middle, and they're more like 
sequential. An operation like backward-up-statement wouldn't make a lot 
of sense, however.

Not quite sure what would correspond to statement in html-mode, but I 
would put separate tag elements in the category of word, or symbols, or, 
okay, statements, rather than sexps. A sexp is the tag opener, plus tag 
contents, plus its closer.






* bug#60894: 30.0.50; [PATCH] Add treesit-forward-sexp
  2023-01-18 21:59             ` Dmitry Gutov
@ 2023-01-19  2:43               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-01-19  3:30                 ` Dmitry Gutov
  0 siblings, 1 reply; 25+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-01-19  2:43 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Mickey Petersen, 60894, Theodor Thornhill, Juri Linkov

>> My SMIE experience with it is that we really need a specific hook
>> for that if we want to avoid wonky behaviors in "corner" cases.
> A bug report with repro would probably help, I guess. Or you could start on
> that hook yourself.

I think `C-M-u` from within a LaTeX environment was one of the cases
where it misbehaved (tho that one is not using SMIE).

> Do you mean a hook like backward-up-list-function, or something smaller?

Something like that.  Maybe it could also be used for `expand-region`
and `thing-at-point` kinds of purposes, and could work even for
treesit nodes that aren't "matching begin..end thingies".


        Stefan







* bug#60894: 30.0.50; [PATCH] Add treesit-forward-sexp
  2023-01-19  2:43               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-01-19  3:30                 ` Dmitry Gutov
  2023-01-19  3:58                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 25+ messages in thread
From: Dmitry Gutov @ 2023-01-19  3:30 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Theodor Thornhill, 60894, Mickey Petersen, Juri Linkov

On 19/01/2023 04:43, Stefan Monnier via Bug reports for GNU Emacs, the 
Swiss army knife of text editors wrote:
>>> My SMIE experience with it is that we really need a specific hook
>>> for that if we want to avoid wonky behaviors in "corner" cases.
>> A bug report with repro would probably help, I guess. Or you could start on
>> that hook yourself.
> 
> I think `C-M-u` from within a LaTeX environment was one of the cases
> where it misbehaved (tho that one is not using SMIE).

Any chance SMIE is doing something different, or something particularly 
correct?

Because in my experience as well backward-up-list seems to work fine 
with SMIE, but not with the tree-sitter forward-sexp implementation 
under discussion.

>> Do you mean a hook like backward-up-list-function, or something smaller?
> 
> Something like that.  Maybe it could also be used for `expand-region`
> and `thing-at-point` kind of purposes maybe and could work even for
> treesit nodes that aren't "matching begin..end thingies".

A treesit node doesn't need an explicit "end" token, though.






* bug#60894: 30.0.50; [PATCH] Add treesit-forward-sexp
  2023-01-19  3:30                 ` Dmitry Gutov
@ 2023-01-19  3:58                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-01-19 18:44                     ` Dmitry Gutov
  0 siblings, 1 reply; 25+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-01-19  3:58 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Theodor Thornhill, 60894, Mickey Petersen, Juri Linkov

>> I think `C-M-u` from within a LaTeX environment was one of the cases
>> where it misbehaved (tho that one is not using SMIE).
> Any chance SMIE is doing something different, or something
> particularly correct?

Could be.  Maybe its simplistic approach rules out the bad cases?

>>> Do you mean a hook like backward-up-list-function, or something smaller?
>> Something like that.  Maybe it could also be used for `expand-region`
>> and `thing-at-point` kind of purposes maybe and could work even for
>> treesit nodes that aren't "matching begin..end thingies".
> A treesit node doesn't need an explicit "end" token, though.

And that's what I want: I want to use successive `C-M-u` (or
`expand-region`) to consider ever greater subexpressions that include
the position from which I started and to do that at a fine grain.
E.g. if I start with point on `b` in:

       a + b * c

I'd like to first consider "b" then "b * c" then the whole thing.
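
A rough sketch of that stepping, on a toy tree rather than a real parser:
each node is a list (TYPE BEG END . CHILDREN) with 1-based positions into
"a + b * c", and the function collects every node range containing a
position, innermost first.  The tree shape and all `demo-' names are
assumptions for illustration; real code would presumably walk upwards with
`treesit-node-parent', `treesit-node-start' and `treesit-node-end' instead.

```elisp
(require 'cl-lib)

(defvar demo-tree
  '(binary 1 10
           (identifier 1 2)               ; a
           (binary 5 10                   ; b * c
                   (identifier 5 6)       ; b
                   (identifier 9 10)))    ; c
  "Toy parse tree for the text \"a + b * c\".")

(defun demo-enclosing-ranges (node pos)
  "Return the ranges of NODE and its descendants that contain POS.
Each range is a cons (BEG . END); the innermost range comes first."
  (pcase-let ((`(,_type ,beg ,end . ,children) node))
    (when (and (<= beg pos) (< pos end))
      ;; At most one child can contain POS, so take the first that does.
      (let ((inner (cl-some (lambda (child)
                              (demo-enclosing-ranges child pos))
                            children)))
        (append inner (list (cons beg end)))))))

;; With point on "b" (position 5): "b", then "b * c", then the whole thing.
(demo-enclosing-ranges demo-tree 5)
```

Each further `C-M-u` would then move to the next range in that list.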


        Stefan







* bug#60894: 30.0.50; [PATCH] Add treesit-forward-sexp
  2023-01-18 22:06       ` Dmitry Gutov
@ 2023-01-19  6:24         ` Eli Zaretskii
  2023-01-19  7:58         ` Juri Linkov
  1 sibling, 0 replies; 25+ messages in thread
From: Eli Zaretskii @ 2023-01-19  6:24 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: mickey, 60894, theo, monnier, juri

> Cc: 60894@debbugs.gnu.org, Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Thu, 19 Jan 2023 00:06:22 +0200
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> In my intuition, sexps are expressions which can be nested (a lot).
> 
> Sentences are "flat" expressions. I would probably say that in 
> tree-sitter modes sentences should be equivalent to "statements".
> 
> So, sexps could be small, and they could be large. For them various list 
> navigation operations make sense (like the previously mentioned ones).
> 
> Sentences stand somewhere in the middle, and they're more like 
> sequential. An operation like backward-up-statement wouldn't make a lot 
> of sense, however.

Intuitively, the above SGTM wrt programming languages.

> Not quite sure what would correspond to statement in html-mode, but I 
> would put separate tag elements in the category of word, or symbols, or, 
> okay, statements, rather than sexps. A sexp is the tag opener, plus tag 
> contents, plus its closer.

If we could find HTML equivalents of the above notions, that would be
very good for consistent UX.  But if that turns out to be hard or
far-fetched or impossible, we could alternatively come up with
HTML-specific interpretation of these abstract notions.  As long as
the result makes sense to users who edit or view HTML files, it would
still be okay, I think, even if the relation to programming language
construct is more remote.






* bug#60894: 30.0.50; [PATCH] Add treesit-forward-sexp
  2023-01-18 22:06       ` Dmitry Gutov
  2023-01-19  6:24         ` Eli Zaretskii
@ 2023-01-19  7:58         ` Juri Linkov
  1 sibling, 0 replies; 25+ messages in thread
From: Juri Linkov @ 2023-01-19  7:58 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Mickey Petersen, 60894, Theodor Thornhill, Stefan Monnier

> Not quite sure what would correspond to statement in html-mode, but
> I would put separate tag elements in the category of word, or symbols, or,
> okay, statements, rather than sexps. A sexp is the tag opener, plus tag
> contents, plus its closer.

An HTML tag with attributes as a statement and an HTML element
with start/end tags as a sexp makes sense.
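
As a sketch only, a hypothetical HTML tree-sitter mode could express that
split with the two variables named in this thread.  The setup function name
is made up, and the node type names ("element", "start_tag", "end_tag",
"self_closing_tag") are assumptions that should be checked against the
grammar actually in use:

```elisp
(defun demo-html-ts-setup ()
  "Sketch of the sexp/sentence split discussed above for an HTML mode."
  ;; Sexp: a whole element, i.e. start tag, contents and end tag.
  (setq-local treesit-sexp-type-regexp
              (rx bos "element" eos))
  ;; Sentence (\"statement\"): a single tag together with its attributes.
  (setq-local treesit-sentence-type-regexp
              (rx bos (or "start_tag" "end_tag" "self_closing_tag") eos)))
```

A mode would call something like this from its body; both regexps are
anchored so that "element" does not also match longer type names that
contain it.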






* bug#60894: 30.0.50; [PATCH] Add treesit-forward-sexp
  2023-01-19  3:58                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-01-19 18:44                     ` Dmitry Gutov
  2023-01-19 19:03                       ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 25+ messages in thread
From: Dmitry Gutov @ 2023-01-19 18:44 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Mickey Petersen, 60894, Theodor Thornhill, Juri Linkov

On 19/01/2023 05:58, Stefan Monnier via Bug reports for GNU Emacs, the 
Swiss army knife of text editors wrote:
>>> I think `C-M-u` from within a LaTeX environment was one of the cases
>>> where it misbehaved (tho that one is not using SMIE).
>> Any chance SMIE is doing something different, or something
>> particularly correct?
> 
> Could be.  Maybe its simplistic approach rules out the bad cases?

Simplistic meaning one that uses a list of openers and closers?

>>>> Do you mean a hook like backward-up-list-function, or something smaller?
>>> Something like that.  Maybe it could also be used for `expand-region`
>>> and `thing-at-point` kind of purposes maybe and could work even for
>>> treesit nodes that aren't "matching begin..end thingies".
>> A treesit node doesn't need an explicit "end" token, though.
> 
> And that's what I want: I want to use successive `C-M-u` (or
> `expand-region`) to consider ever greater subexpressions that include
> the position from which I started and to do that at a fine grain.
> E.g. if I start with point on `b` in:
> 
>         a + b * c
> 
> I'd like to first consider "b" then "b * c" then the whole thing.

That should be easy enough to do using the provided tree-sitter 
framework, just by adding binary nodes to the list of types.

Whether this behavior is preferable is a matter of opinion, though. My 
guess is Ruby users will find it too fiddly, and the end result is that 
one will have to press 'C-M-u' more times to get to the same result 
(which would usually be to get to the beginning of a block, or a 
method). But people can customize it.






* bug#60894: 30.0.50; [PATCH] Add treesit-forward-sexp
  2023-01-19 18:44                     ` Dmitry Gutov
@ 2023-01-19 19:03                       ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 25+ messages in thread
From: Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-01-19 19:03 UTC (permalink / raw)
  To: Dmitry Gutov, Stefan Monnier; +Cc: 60894, Mickey Petersen, Juri Linkov

Dmitry Gutov <dgutov@yandex.ru> writes:

> On 19/01/2023 05:58, Stefan Monnier via Bug reports for GNU Emacs, the 
> Swiss army knife of text editors wrote:
>>>> I think `C-M-u` from within a LaTeX environment was one of the cases
>>>> where it misbehaved (tho that one is not using SMIE).
>>> Any chance SMIE is doing something different, or something
>>> particularly correct?
>> 
>> Could be.  Maybe its simplistic approach rules out the bad cases?
>
> Simplistic meaning one that uses a list of openers and closers?
>
>>>>> Do you mean a hook like backward-up-list-function, or something smaller?
>>>> Something like that.  Maybe it could also be used for `expand-region`
>>>> and `thing-at-point` kind of purposes maybe and could work even for
>>>> treesit nodes that aren't "matching begin..end thingies".
>>> A treesit node doesn't need an explicit "end" token, though.
>> 
>> And that's what I want: I want to use successive `C-M-u` (or
>> `expand-region`) to consider ever greater subexpressions that include
>> the position from which I started and to do that at a fine grain.
>> E.g. if I start with point on `b` in:
>> 
>>         a + b * c
>> 
>> I'd like to first consider "b" then "b * c" then the whole thing.
>
> That should be easy enough to do using the provided tree-sitter 
> framework, just by adding binary nodes to the list of types.
>
> Whether this behavior is preferable is a matter of opinion, though. My 
> guess is Ruby users will find it too fiddly, and the end result is that 
> one will have to press 'C-M-u' more times to get to the same result 
> (which would usually be to get to the beginning of a block, or a 
> method). But people can customize it.

So I pushed the changes I've made so far, after addressing Eli's
comments.  Let's try it for a while and see how we feel about it going
forward.


Theo






end of thread, other threads:[~2023-01-19 19:03 UTC | newest]

Thread overview: 25+ messages
2023-01-17 20:44 bug#60894: 30.0.50; [PATCH] Add treesit-forward-sexp Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-01-17 20:53 ` Dmitry Gutov
2023-01-17 21:07   ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-01-18  1:50     ` Dmitry Gutov
2023-01-18  5:35       ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-01-18 12:21         ` Eli Zaretskii
2023-01-18 13:39         ` Dmitry Gutov
2023-01-18 18:28           ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-01-18 19:01           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-01-18 21:59             ` Dmitry Gutov
2023-01-19  2:43               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-01-19  3:30                 ` Dmitry Gutov
2023-01-19  3:58                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-01-19 18:44                     ` Dmitry Gutov
2023-01-19 19:03                       ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-01-18 17:16         ` Juri Linkov
2023-01-18 18:27           ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-01-17 21:13 ` Mickey Petersen
2023-01-18 13:31   ` Dmitry Gutov
2023-01-18 17:09   ` Juri Linkov
2023-01-18 18:27     ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-01-18 18:55       ` Juri Linkov
2023-01-18 22:06       ` Dmitry Gutov
2023-01-19  6:24         ` Eli Zaretskii
2023-01-19  7:58         ` Juri Linkov
