all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
* bug#68993: treesitter support for forward-sexp-default-function
@ 2024-02-08 17:38 Juri Linkov
  2024-02-10 17:46 ` Juri Linkov
  2024-02-12  1:42 ` Yuan Fu
  0 siblings, 2 replies; 5+ messages in thread
From: Juri Linkov @ 2024-02-08 17:38 UTC (permalink / raw)
  To: 68993; +Cc: Yuan Fu

[-- Attachment #1: Type: text/plain, Size: 327 bytes --]

'treesit-forward-sentence' supports the node type 'text',
and for matching nodes it uses the fallback to
'forward-sentence-default-function'.

This patch does exactly the same for 'treesit-forward-sexp':
for nodes that match a new node type 'comment',
it uses the fallback to the new function
'forward-sexp-default-function'.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: forward-sexp-default-function.patch --]
[-- Type: text/x-diff, Size: 1880 bytes --]

diff --git a/lisp/treesit.el b/lisp/treesit.el
index 82b2f97b4a5..284c4915f3a 100644
--- a/lisp/treesit.el
+++ b/lisp/treesit.el
@@ -2137,7 +2137,10 @@ treesit-forward-sexp
   (interactive "^p")
   (let ((arg (or arg 1))
         (pred (or treesit-sexp-type-regexp 'sexp)))
-    (or (if (> arg 0)
+    (or (when (treesit-node-match-p (treesit-node-at (point)) 'comment t)
+          (funcall #'forward-sexp-default-function arg)
+          t)
+        (if (> arg 0)
             (treesit-end-of-thing pred (abs arg) 'restricted)
           (treesit-beginning-of-thing pred (abs arg) 'restricted))
         ;; If we couldn't move, we should signal an error and report
diff --git a/lisp/emacs-lisp/lisp.el b/lisp/emacs-lisp/lisp.el
index 4b722b4e9a7..d3c3bf55de3 100644
--- a/lisp/emacs-lisp/lisp.el
+++ b/lisp/emacs-lisp/lisp.el
@@ -45,7 +45,12 @@ parens-require-spaces
   :type 'boolean
   :group 'lisp)
 
-(defvar forward-sexp-function nil
+(defun forward-sexp-default-function (&optional arg)
+  "Default function for `forward-sexp-function'."
+  (goto-char (or (scan-sexps (point) arg) (buffer-end arg)))
+  (if (< arg 0) (backward-prefix-chars)))
+
+(defvar forward-sexp-function #'forward-sexp-default-function
   ;; FIXME:
   ;; - for some uses, we may want a "sexp-only" version, which only
   ;;   jumps over a well-formed sexp, rather than some dwimish thing
@@ -74,10 +79,7 @@ forward-sexp
                                     "No next sexp"
                                   "No previous sexp"))))
     (or arg (setq arg 1))
-    (if forward-sexp-function
-        (funcall forward-sexp-function arg)
-      (goto-char (or (scan-sexps (point) arg) (buffer-end arg)))
-      (if (< arg 0) (backward-prefix-chars)))))
+    (funcall forward-sexp-function arg)))
 
 (defun backward-sexp (&optional arg interactive)
   "Move backward across one balanced expression (sexp).

[-- Attachment #3: Type: text/plain, Size: 483 bytes --]


Maybe the node type 'comment' is not the best name,
but it was intended to allow using the default function
to be able to move with 'M-C-f' in the comments and strings
there tree-sitter has no information.

It makes sense to support the default movement with 'M-C-f'
in the comments and strings of all ts modes.  The second patch
shows how this could be achieved by adding the default
'comment' match to 'treesit-thing-settings' of all modes.
Or maybe this should be customizable?


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #4: treesit-major-mode-setup.patch --]
[-- Type: text/x-diff, Size: 1071 bytes --]

diff --git a/lisp/treesit.el b/lisp/treesit.el
index 82b2f97b4a5..284c4915f3a 100644
--- a/lisp/treesit.el
+++ b/lisp/treesit.el
@@ -3054,6 +3057,18 @@ treesit-major-mode-setup
     (setq-local outline-search-function #'treesit-outline-search
                 outline-level #'treesit-outline-level))
 
+  (dolist (parser (treesit-parser-list))
+    (let ((language (treesit-parser-language parser))
+          (comment (regexp-opt '("comment" "string" "string_content"))))
+      (unless (treesit-thing-defined-p 'comment language)
+        (if-let ((l (alist-get language treesit-thing-settings)))
+            (progn
+              (setf (alist-get 'comment l) (list comment))
+              (setf (alist-get language treesit-thing-settings) l))
+          (setq-local treesit-thing-settings
+                      (append `((,language (comment ,comment)))
+                              treesit-thing-settings))))))
+
   ;; Remove existing local parsers.
   (dolist (ov (overlays-in (point-min) (point-max)))
     (when-let ((parser (overlay-get ov 'treesit-parser)))

[-- Attachment #5: Type: text/plain, Size: 389 bytes --]


The third patch demonstrates how it's possible to close bug#67036
that was impossible to fix without more general changes in treesit.el.

The problem is that e.g. Ruby parser to such text:

  hash[:key]

produces such syntax tree:

  (element_reference object: (identifier) [ (simple_symbol) ])

so when point is on [ then 'M-C-f' can't move to ].

This is fixed now by the third patch:


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #6: ruby-ts-mode.patch --]
[-- Type: text/x-diff, Size: 1409 bytes --]

diff --git a/lisp/progmodes/ruby-ts-mode.el b/lisp/progmodes/ruby-ts-mode.el
index 598eaa461ff..4d0ae2e9303 100644
--- a/lisp/progmodes/ruby-ts-mode.el
+++ b/lisp/progmodes/ruby-ts-mode.el
@@ -1170,7 +1170,20 @@ ruby-ts-mode
                                 "global_variable"
                                 )
                                eol)
-                              #'ruby-ts--sexp-p)))))
+                              #'ruby-ts--sexp-p))
+                 (comment ,(lambda (node)
+                             (or (member (treesit-node-type node)
+                                         '("comment" "string_content"))
+                                 (and (member (treesit-node-text node)
+                                              '("[" "]"))
+                                      (equal (treesit-node-type
+                                              (treesit-node-parent node))
+                                             "element_reference"))
+                                 (and (member (treesit-node-text node)
+                                              '("#{" "}"))
+                                      (equal (treesit-node-type
+                                              (treesit-node-parent node))
+                                             "interpolation"))))))))
 
   ;; AFAIK, Ruby can not nest methods
   (setq-local treesit-defun-prefer-top-level nil)

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* bug#68993: treesitter support for forward-sexp-default-function
  2024-02-08 17:38 bug#68993: treesitter support for forward-sexp-default-function Juri Linkov
@ 2024-02-10 17:46 ` Juri Linkov
  2024-02-12  1:28   ` Yuan Fu
  2024-02-12  1:42 ` Yuan Fu
  1 sibling, 1 reply; 5+ messages in thread
From: Juri Linkov @ 2024-02-10 17:46 UTC (permalink / raw)
  To: Yuan Fu; +Cc: 68993

Hi Yuan,

Do you think 'comment' is a suitable name for 'treesit-forward-sexp'?
I'm unsure even if 'text' is a good name for 'treesit-forward-sentence'.
But at least these two should be consistent with each another.

Maybe better to add the prefix 'default-' to both?
This will hint that the default function is used.

Then 'treesit-forward-sentence' will support types
'sentence' and 'default-sentence'.
And 'treesit-forward-sexp' will support types
'sexp' and 'default-sexp'.

> @@ -2137,7 +2137,10 @@ treesit-forward-sexp
>    (interactive "^p")
>    (let ((arg (or arg 1))
>          (pred (or treesit-sexp-type-regexp 'sexp)))
> -    (or (if (> arg 0)
> +    (or (when (treesit-node-match-p (treesit-node-at (point)) 'comment t)
> +          (funcall #'forward-sexp-default-function arg)
> +          t)
> +        (if (> arg 0)
>              (treesit-end-of-thing pred (abs arg) 'restricted)
>            (treesit-beginning-of-thing pred (abs arg) 'restricted))
>          ;; If we couldn't move, we should signal an error and report





^ permalink raw reply	[flat|nested] 5+ messages in thread

* bug#68993: treesitter support for forward-sexp-default-function
  2024-02-10 17:46 ` Juri Linkov
@ 2024-02-12  1:28   ` Yuan Fu
  0 siblings, 0 replies; 5+ messages in thread
From: Yuan Fu @ 2024-02-12  1:28 UTC (permalink / raw)
  To: Juri Linkov; +Cc: 68993



> On Feb 10, 2024, at 9:46 AM, Juri Linkov <juri@linkov.net> wrote:
> 
> Hi Yuan,
> 
> Do you think 'comment' is a suitable name for 'treesit-forward-sexp'?
> I'm unsure even if 'text' is a good name for 'treesit-forward-sentence'.
> But at least these two should be consistent with each another.
> 
> Maybe better to add the prefix 'default-' to both?
> This will hint that the default function is used.
> 
> Then 'treesit-forward-sentence' will support types
> 'sentence' and 'default-sentence'.
> And 'treesit-forward-sexp' will support types
> 'sexp' and 'default-sexp'.

First there’s the “text” definition, then treesit-forward-sentence uses it as a heuristic to get better results in comments and strings, rather than the other way around. So for me it doesn’t make much sense to say if “text” is good or bad name for treesit-forward-sentence—it’s not for treesit-forward-sentence to start with.

My suggestion would be for both treesit-forward-sentence and -sexp to use “text” for their heuristic. If someone wants more customized behavior, they can always write a custom forward-sentence/sexp function.

Yuan




^ permalink raw reply	[flat|nested] 5+ messages in thread

* bug#68993: treesitter support for forward-sexp-default-function
  2024-02-08 17:38 bug#68993: treesitter support for forward-sexp-default-function Juri Linkov
  2024-02-10 17:46 ` Juri Linkov
@ 2024-02-12  1:42 ` Yuan Fu
  2024-02-12 18:41   ` Juri Linkov
  1 sibling, 1 reply; 5+ messages in thread
From: Yuan Fu @ 2024-02-12  1:42 UTC (permalink / raw)
  To: Juri Linkov; +Cc: 68993



> On Feb 8, 2024, at 9:38 AM, Juri Linkov <juri@linkov.net> wrote:
> 
> 'treesit-forward-sentence' supports the node type 'text',
> and for matching nodes it uses the fallback to
> 'forward-sentence-default-function'.
> 
> This patch does exactly the same for 'treesit-forward-sexp':
> for nodes that match a new node type 'comment',
> it uses the fallback to the new function
> 'forward-sexp-default-function'.
> 
> diff --git a/lisp/treesit.el b/lisp/treesit.el
> index 82b2f97b4a5..284c4915f3a 100644
> --- a/lisp/treesit.el
> +++ b/lisp/treesit.el
> @@ -2137,7 +2137,10 @@ treesit-forward-sexp
>   (interactive "^p")
>   (let ((arg (or arg 1))
>         (pred (or treesit-sexp-type-regexp 'sexp)))
> -    (or (if (> arg 0)
> +    (or (when (treesit-node-match-p (treesit-node-at (point)) 'comment t)
> +          (funcall #'forward-sexp-default-function arg)
> +          t)
> +        (if (> arg 0)
>             (treesit-end-of-thing pred (abs arg) 'restricted)
>           (treesit-beginning-of-thing pred (abs arg) 'restricted))
>         ;; If we couldn't move, we should signal an error and report
> diff --git a/lisp/emacs-lisp/lisp.el b/lisp/emacs-lisp/lisp.el
> index 4b722b4e9a7..d3c3bf55de3 100644
> --- a/lisp/emacs-lisp/lisp.el
> +++ b/lisp/emacs-lisp/lisp.el
> @@ -45,7 +45,12 @@ parens-require-spaces
>   :type 'boolean
>   :group 'lisp)
> 
> -(defvar forward-sexp-function nil
> +(defun forward-sexp-default-function (&optional arg)
> +  "Default function for `forward-sexp-function'."
> +  (goto-char (or (scan-sexps (point) arg) (buffer-end arg)))
> +  (if (< arg 0) (backward-prefix-chars)))
> +
> +(defvar forward-sexp-function #'forward-sexp-default-function
>   ;; FIXME:
>   ;; - for some uses, we may want a "sexp-only" version, which only
>   ;;   jumps over a well-formed sexp, rather than some dwimish thing
> @@ -74,10 +79,7 @@ forward-sexp
>                                     "No next sexp"
>                                   "No previous sexp"))))
>     (or arg (setq arg 1))
> -    (if forward-sexp-function
> -        (funcall forward-sexp-function arg)
> -      (goto-char (or (scan-sexps (point) arg) (buffer-end arg)))
> -      (if (< arg 0) (backward-prefix-chars)))))
> +    (funcall forward-sexp-function arg)))
> 
> (defun backward-sexp (&optional arg interactive)
>   "Move backward across one balanced expression (sexp).
> 
> Maybe the node type 'comment' is not the best name,
> but it was intended to allow using the default function
> to be able to move with 'M-C-f' in the comments and strings
> there tree-sitter has no information.
> 
> It makes sense to support the default movement with 'M-C-f'
> in the comments and strings of all ts modes.  The second patch
> shows how this could be achieved by adding the default
> 'comment' match to 'treesit-thing-settings' of all modes.
> Or maybe this should be customizable?

I think treesit-thing-settings is something we want to left for major mode’s to set. They’ll need to define other “things” in treesit-thing-settings anyway. Sure, it’s nice if we can set a few definitions automatically, but I don’t think the gain is worth that much; OTOH, it’s nice to have clear boundaries, and minimizes the possibility of confusion.

> 
> diff --git a/lisp/treesit.el b/lisp/treesit.el
> index 82b2f97b4a5..284c4915f3a 100644
> --- a/lisp/treesit.el
> +++ b/lisp/treesit.el
> @@ -3054,6 +3057,18 @@ treesit-major-mode-setup
>     (setq-local outline-search-function #'treesit-outline-search
>                 outline-level #'treesit-outline-level))
> 
> +  (dolist (parser (treesit-parser-list))
> +    (let ((language (treesit-parser-language parser))
> +          (comment (regexp-opt '("comment" "string" "string_content"))))
> +      (unless (treesit-thing-defined-p 'comment language)
> +        (if-let ((l (alist-get language treesit-thing-settings)))
> +            (progn
> +              (setf (alist-get 'comment l) (list comment))
> +              (setf (alist-get language treesit-thing-settings) l))
> +          (setq-local treesit-thing-settings
> +                      (append `((,language (comment ,comment)))
> +                              treesit-thing-settings))))))
> +
>   ;; Remove existing local parsers.
>   (dolist (ov (overlays-in (point-min) (point-max)))
>     (when-let ((parser (overlay-get ov 'treesit-parser)))
> 
> The third patch demonstrates how it's possible to close bug#67036
> that was impossible to fix without more general changes in treesit.el.
> 
> The problem is that e.g. Ruby parser to such text:
> 
>  hash[:key]
> 
> produces such syntax tree:
> 
>  (element_reference object: (identifier) [ (simple_symbol) ])
> 
> so when point is on [ then 'M-C-f' can't move to ].
> 
> This is fixed now by the third patch:
> 
> diff --git a/lisp/progmodes/ruby-ts-mode.el b/lisp/progmodes/ruby-ts-mode.el
> index 598eaa461ff..4d0ae2e9303 100644
> --- a/lisp/progmodes/ruby-ts-mode.el
> +++ b/lisp/progmodes/ruby-ts-mode.el
> @@ -1170,7 +1170,20 @@ ruby-ts-mode
>                                 "global_variable"
>                                 )
>                                eol)
> -                              #'ruby-ts--sexp-p)))))
> +                              #'ruby-ts--sexp-p))
> +                 (comment ,(lambda (node)
> +                             (or (member (treesit-node-type node)
> +                                         '("comment" "string_content"))
> +                                 (and (member (treesit-node-text node)
> +                                              '("[" "]"))
> +                                      (equal (treesit-node-type
> +                                              (treesit-node-parent node))
> +                                             "element_reference"))
> +                                 (and (member (treesit-node-text node)
> +                                              '("#{" "}"))
> +                                      (equal (treesit-node-type
> +                                              (treesit-node-parent node))
> +                                             "interpolation"))))))))
> 
>   ;; AFAIK, Ruby can not nest methods
>   (setq-local treesit-defun-prefer-top-level nil)

IIUC, this doesn’t look like a good idea: you don’t want to mark something that’s not comment as comment. In the future, other packages will start using these thing definitions, and I’m sure you don’t want them consider regular code as comments.

For the specific problem you described, maybe the change made in #68899 can help?

Yuan




^ permalink raw reply	[flat|nested] 5+ messages in thread

* bug#68993: treesitter support for forward-sexp-default-function
  2024-02-12  1:42 ` Yuan Fu
@ 2024-02-12 18:41   ` Juri Linkov
  0 siblings, 0 replies; 5+ messages in thread
From: Juri Linkov @ 2024-02-12 18:41 UTC (permalink / raw)
  To: Yuan Fu; +Cc: 68993

>> The problem is that e.g. Ruby parser to such text:
>>
>>  hash[:key]
>>
>> produces such syntax tree:
>>
>>  (element_reference object: (identifier) [ (simple_symbol) ])
>>
>> so when point is on [ then 'C-M-f' can't move to ].
>>
>> This is fixed now by the third patch:
>> @@ -1170,7 +1170,20 @@ ruby-ts-mode
>> +                 (comment ,(lambda (node)
>> +                             (or (member (treesit-node-type node)
>> +                                         '("comment" "string_content"))
>> +                                 (and (member (treesit-node-text node)
>> +                                              '("[" "]"))
>> +                                      (equal (treesit-node-type
>> +                                              (treesit-node-parent node))
>> +                                             "element_reference"))
>> +                                 (and (member (treesit-node-text node)
>> +                                              '("#{" "}"))
>> +                                      (equal (treesit-node-type
>> +                                              (treesit-node-parent node))
>> +                                             "interpolation"))))))))
>
> IIUC, this doesn’t look like a good idea: you don’t want to mark
> something that’s not comment as comment. In the future, other packages
> will start using these thing definitions, and I’m sure you don’t want
> them consider regular code as comments.
>
> For the specific problem you described, maybe the change made in #68899 can help?

bug#68899 can't help because it only fixes 'C-M-f' to move point to the
end of the current symbol.

Whereas for the problem above we need a predicate that will instruct
'treesit-forward-sexp' to fall back to 'forward-sexp-default-function'.

Clearly neither the name 'text' nor 'comment' would be suitable
for such usage.  Maybe it's possible to find a different name?





^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2024-02-12 18:41 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-08 17:38 bug#68993: treesitter support for forward-sexp-default-function Juri Linkov
2024-02-10 17:46 ` Juri Linkov
2024-02-12  1:28   ` Yuan Fu
2024-02-12  1:42 ` Yuan Fu
2024-02-12 18:41   ` Juri Linkov

Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.