* bug#74386: Tree-sitter javascript indentation
@ 2024-11-16 23:26 Marius Kjeldahl
2024-11-17 19:18 ` Dmitry Gutov
0 siblings, 1 reply; 19+ messages in thread
From: Marius Kjeldahl @ 2024-11-16 23:26 UTC (permalink / raw)
To: 74386
When using js-ts-mode, after multiline function parameter lists, or
multiline if clauses, Emacs typically adds two spaces indentation
after where the previous line ends. But because that line is already
indented, the effective indentation becomes four spaces (compare to
where the function definition or if clause starts). I would like this
to be only two. This is most likely a preference, but still.
I haven't been able to figure out if it is possible to customize Emacs
to do what I want. Or even learn where I can dig in and figure out
what I need to change to accomplish it. I've tried a reddit group and
also posted an issue on the tree-sitter github repo. But was told the
correct place to report it is as an Emacs bug.
So any pointers or suggestion?
Thanks,
Marius K.
^ permalink raw reply [flat|nested] 19+ messages in thread
* bug#74386: Tree-sitter javascript indentation
2024-11-16 23:26 bug#74386: Tree-sitter javascript indentation Marius Kjeldahl
@ 2024-11-17 19:18 ` Dmitry Gutov
2024-11-17 19:21 ` Marius Kjeldahl
0 siblings, 1 reply; 19+ messages in thread
From: Dmitry Gutov @ 2024-11-17 19:18 UTC (permalink / raw)
To: Marius Kjeldahl, 74386
Hi!
On 17/11/2024 01:26, Marius Kjeldahl wrote:
> When using js-ts-mode, after multiline function parameter lists, or
> multiline if clauses, Emacs typically adds two spaces indentation
> after where the previous line ends. But because that line is already
> indented, the effective indentation becomes four spaces (compare to
> where the function definition or if clause starts). I would like this
> to be only two. This is most likely a preference, but still.
>
> I haven't been able to figure out if it is possible to customize Emacs
> to do what I want. Or even learn where I can dig in and figure out
> what I need to change to accomplish it. I've tried a reddit group and
> also posted an issue on the tree-sitter github repo. But was told the
> correct place to report it is as an Emacs bug.
>
> So any pointers or suggestion?
It would help if you also give specific examples of code where incorrect
indentation occurs. People can guess, but they might not guess all the
cases you want.
Just paste the code inside email, assuming monospaced text.
(And to clarify how this mailing list works: please use "reply all", so
that the bug# email address is retained in To:).
^ permalink raw reply [flat|nested] 19+ messages in thread
* bug#74386: Tree-sitter javascript indentation
2024-11-17 19:18 ` Dmitry Gutov
@ 2024-11-17 19:21 ` Marius Kjeldahl
2024-11-17 22:12 ` Dmitry Gutov
0 siblings, 1 reply; 19+ messages in thread
From: Marius Kjeldahl @ 2024-11-17 19:21 UTC (permalink / raw)
To: Dmitry Gutov; +Cc: 74386
Great, thanks.
Here are two examples (fingers crossed):
function a(b,
c) {
....d();
and
if (clause1
&& clause2) {
....callSomeFunc();
Notice in both cases the four dots "...." representing spaces used for
indentation. I would like only two dots (two spaces) for those
specific examples.
Thanks,
Marius K.
On Sun, 17 Nov 2024 at 20:18, Dmitry Gutov <dmitry@gutov.dev> wrote:
>
> Hi!
>
> On 17/11/2024 01:26, Marius Kjeldahl wrote:
> > When using js-ts-mode, after multiline function parameter lists, or
> > multiline if clauses, Emacs typically adds two spaces indentation
> > after where the previous line ends. But because that line is already
> > indented, the effective indentation becomes four spaces (compare to
> > where the function definition or if clause starts). I would like this
> > to be only two. This is most likely a preference, but still.
> >
> > I haven't been able to figure out if it is possible to customize Emacs
> > to do what I want. Or even learn where I can dig in and figure out
> > what I need to change to accomplish it. I've tried a reddit group and
> > also posted an issue on the tree-sitter github repo. But was told the
> > correct place to report it is as an Emacs bug.
> >
> > So any pointers or suggestion?
>
> It would help if you also give specific examples of code where incorrect
> indentation occurs. People can guess, but they might not guess all the
> cases you want.
>
> Just paste the code inside email, assuming monospaced text.
>
> (And to clarify how this mailing list works: please use "reply all", so
> that the bug# email address is retained in To:).
^ permalink raw reply [flat|nested] 19+ messages in thread
* bug#74386: Tree-sitter javascript indentation
2024-11-17 19:21 ` Marius Kjeldahl
@ 2024-11-17 22:12 ` Dmitry Gutov
2024-11-17 22:21 ` Marius Kjeldahl
2024-11-18 8:35 ` Marius Kjeldahl
0 siblings, 2 replies; 19+ messages in thread
From: Dmitry Gutov @ 2024-11-17 22:12 UTC (permalink / raw)
To: Marius Kjeldahl, Theodor Thornhill; +Cc: 74386
On 17/11/2024 21:21, Marius Kjeldahl wrote:
> Great, thanks.
>
> Here are two examples (fingers crossed):
>
> function a(b,
> c) {
> ....d();
>
> and
>
> if (clause1
> && clause2) {
> ....callSomeFunc();
>
> Notice in both cases the four dots "...." representing spaces used for
> indentation. I would like only two dots (two spaces) for those
> specific examples.
Thank you, these two examples should be handled by the patch below.
It's a change which could affect a fair amount of other examples too -
for example
Foobar
.find()
.catch((err) => {
return 2;
})
.then((num) => {
console.log(num);
});
turns into
Foobar
.find()
.catch((err) => {
return 2;
})
.then((num) => {
console.log(num);
});
I'm not 100% sure if that's good or bad (maybe worse compatibility with
js-mode, that we could say), so it would be useful to go through a bunch
of other examples and decide whether what we had is a bug, or these
should be supported as different styles, or etc.
Cc'ing the original author of js-ts-mode, Theodor, for consultation.
BTW, to try out such a patch you need to apply it, evaluate the defvar
form (C-M-x, for example), and then re-enable the major mode in a given
buffer (M-x js-ts-mode).
diff --git a/lisp/progmodes/js.el b/lisp/progmodes/js.el
index f74b8ab1c46..17a9509dd45 100644
--- a/lisp/progmodes/js.el
+++ b/lisp/progmodes/js.el
@@ -3453,7 +3453,7 @@ js--treesit-indent-rules
(let ((switch-case (rx "switch_" (or "case" "default"))))
`((javascript
((parent-is "program") parent-bol 0)
- ((node-is "}") parent-bol 0)
+ ((node-is "}") standalone-parent 0)
((node-is ")") parent-bol 0)
((node-is "]") parent-bol 0)
((node-is ">") parent-bol 0)
@@ -3466,7 +3466,7 @@ js--treesit-indent-rules
;; "{" on the newline.
((node-is "statement_block") parent-bol js-indent-level)
((parent-is "named_imports") parent-bol js-indent-level)
- ((parent-is "statement_block") parent-bol js-indent-level)
+ ((parent-is "statement_block") standalone-parent js-indent-level)
((parent-is "variable_declarator") parent-bol js-indent-level)
((parent-is "arguments") parent-bol js-indent-level)
((parent-is "array") parent-bol js-indent-level)
@@ -3481,7 +3481,6 @@ js--treesit-indent-rules
((parent-is "binary_expression") parent-bol js-indent-level)
((parent-is "class_body") parent-bol js-indent-level)
((parent-is ,switch-case) parent-bol js-indent-level)
- ((parent-is "statement_block") parent-bol js-indent-level)
((match "while" "do_statement") parent-bol 0)
((match "else" "if_statement") parent-bol 0)
((parent-is ,(rx (or (seq (or "if" "for" "for_in" "while" "do")
"_statement")
^ permalink raw reply related [flat|nested] 19+ messages in thread
* bug#74386: Tree-sitter javascript indentation
2024-11-17 22:12 ` Dmitry Gutov
@ 2024-11-17 22:21 ` Marius Kjeldahl
2024-11-17 22:41 ` Dmitry Gutov
2024-11-18 8:35 ` Marius Kjeldahl
1 sibling, 1 reply; 19+ messages in thread
From: Marius Kjeldahl @ 2024-11-17 22:21 UTC (permalink / raw)
To: Dmitry Gutov; +Cc: 74386, Theodor Thornhill
Sounds good, will try. I'm not sure I like you other examples much, so
customizability for these things would be preferable.
Thanks,
Marius K.
On Sun, 17 Nov 2024 at 23:12, Dmitry Gutov <dmitry@gutov.dev> wrote:
>
> On 17/11/2024 21:21, Marius Kjeldahl wrote:
> > Great, thanks.
> >
> > Here are two examples (fingers crossed):
> >
> > function a(b,
> > c) {
> > ....d();
> >
> > and
> >
> > if (clause1
> > && clause2) {
> > ....callSomeFunc();
> >
> > Notice in both cases the four dots "...." representing spaces used for
> > indentation. I would like only two dots (two spaces) for those
> > specific examples.
>
> Thank you, these two examples should be handled by the patch below.
>
> It's a change which could affect a fair amount of other examples too -
> for example
>
> Foobar
> .find()
> .catch((err) => {
> return 2;
> })
> .then((num) => {
> console.log(num);
> });
>
> turns into
>
> Foobar
> .find()
> .catch((err) => {
> return 2;
> })
> .then((num) => {
> console.log(num);
> });
>
>
> I'm not 100% sure if that's good or bad (maybe worse compatibility with
> js-mode, that we could say), so it would be useful to go through a bunch
> of other examples and decide whether what we had is a bug, or these
> should be supported as different styles, or etc.
>
> Cc'ing the original author of js-ts-mode, Theodor, for consultation.
>
> BTW, to try out such a patch you need to apply it, evaluate the defvar
> form (C-M-x, for example), and then re-enable the major mode in a given
> buffer (M-x js-ts-mode).
>
> diff --git a/lisp/progmodes/js.el b/lisp/progmodes/js.el
> index f74b8ab1c46..17a9509dd45 100644
> --- a/lisp/progmodes/js.el
> +++ b/lisp/progmodes/js.el
> @@ -3453,7 +3453,7 @@ js--treesit-indent-rules
> (let ((switch-case (rx "switch_" (or "case" "default"))))
> `((javascript
> ((parent-is "program") parent-bol 0)
> - ((node-is "}") parent-bol 0)
> + ((node-is "}") standalone-parent 0)
> ((node-is ")") parent-bol 0)
> ((node-is "]") parent-bol 0)
> ((node-is ">") parent-bol 0)
> @@ -3466,7 +3466,7 @@ js--treesit-indent-rules
> ;; "{" on the newline.
> ((node-is "statement_block") parent-bol js-indent-level)
> ((parent-is "named_imports") parent-bol js-indent-level)
> - ((parent-is "statement_block") parent-bol js-indent-level)
> + ((parent-is "statement_block") standalone-parent js-indent-level)
> ((parent-is "variable_declarator") parent-bol js-indent-level)
> ((parent-is "arguments") parent-bol js-indent-level)
> ((parent-is "array") parent-bol js-indent-level)
> @@ -3481,7 +3481,6 @@ js--treesit-indent-rules
> ((parent-is "binary_expression") parent-bol js-indent-level)
> ((parent-is "class_body") parent-bol js-indent-level)
> ((parent-is ,switch-case) parent-bol js-indent-level)
> - ((parent-is "statement_block") parent-bol js-indent-level)
> ((match "while" "do_statement") parent-bol 0)
> ((match "else" "if_statement") parent-bol 0)
> ((parent-is ,(rx (or (seq (or "if" "for" "for_in" "while" "do")
> "_statement")
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* bug#74386: Tree-sitter javascript indentation
2024-11-17 22:21 ` Marius Kjeldahl
@ 2024-11-17 22:41 ` Dmitry Gutov
0 siblings, 0 replies; 19+ messages in thread
From: Dmitry Gutov @ 2024-11-17 22:41 UTC (permalink / raw)
To: Marius Kjeldahl; +Cc: 74386, Theodor Thornhill
On 18/11/2024 00:21, Marius Kjeldahl wrote:
> Sounds good, will try. I'm not sure I like you other examples much, so
> customizability for these things would be preferable.
These particular cases could also be distinguished using custom logic,
or even using a separate query.
But we might still want to allow customizing them separately (or not?) -
and check other examples as well.
^ permalink raw reply [flat|nested] 19+ messages in thread
* bug#74386: Tree-sitter javascript indentation
2024-11-17 22:12 ` Dmitry Gutov
2024-11-17 22:21 ` Marius Kjeldahl
@ 2024-11-18 8:35 ` Marius Kjeldahl
2024-11-18 15:29 ` Dmitry Gutov
1 sibling, 1 reply; 19+ messages in thread
From: Marius Kjeldahl @ 2024-11-18 8:35 UTC (permalink / raw)
To: Dmitry Gutov; +Cc: 74386, Theodor Thornhill
Can confirm your patch solves both the variants I showed. This will
also help me to dig into and understand more of what is going on, and
allows me to experiment more with my preferences (and tree-sitter in
general). So thank you very much.
When I get a clue, or when someone with a clue comes up with it, a
patch which make these things configurable with easy settings would
probably be. a good idea (instead of diving into "tree-sitter
internals").
Thanks,
Marius K.
On Sun, 17 Nov 2024 at 23:12, Dmitry Gutov <dmitry@gutov.dev> wrote:
>
> On 17/11/2024 21:21, Marius Kjeldahl wrote:
> > Great, thanks.
> >
> > Here are two examples (fingers crossed):
> >
> > function a(b,
> > c) {
> > ....d();
> >
> > and
> >
> > if (clause1
> > && clause2) {
> > ....callSomeFunc();
> >
> > Notice in both cases the four dots "...." representing spaces used for
> > indentation. I would like only two dots (two spaces) for those
> > specific examples.
>
> Thank you, these two examples should be handled by the patch below.
>
> It's a change which could affect a fair amount of other examples too -
> for example
>
> Foobar
> .find()
> .catch((err) => {
> return 2;
> })
> .then((num) => {
> console.log(num);
> });
>
> turns into
>
> Foobar
> .find()
> .catch((err) => {
> return 2;
> })
> .then((num) => {
> console.log(num);
> });
>
>
> I'm not 100% sure if that's good or bad (maybe worse compatibility with
> js-mode, that we could say), so it would be useful to go through a bunch
> of other examples and decide whether what we had is a bug, or these
> should be supported as different styles, or etc.
>
> Cc'ing the original author of js-ts-mode, Theodor, for consultation.
>
> BTW, to try out such a patch you need to apply it, evaluate the defvar
> form (C-M-x, for example), and then re-enable the major mode in a given
> buffer (M-x js-ts-mode).
>
> diff --git a/lisp/progmodes/js.el b/lisp/progmodes/js.el
> index f74b8ab1c46..17a9509dd45 100644
> --- a/lisp/progmodes/js.el
> +++ b/lisp/progmodes/js.el
> @@ -3453,7 +3453,7 @@ js--treesit-indent-rules
> (let ((switch-case (rx "switch_" (or "case" "default"))))
> `((javascript
> ((parent-is "program") parent-bol 0)
> - ((node-is "}") parent-bol 0)
> + ((node-is "}") standalone-parent 0)
> ((node-is ")") parent-bol 0)
> ((node-is "]") parent-bol 0)
> ((node-is ">") parent-bol 0)
> @@ -3466,7 +3466,7 @@ js--treesit-indent-rules
> ;; "{" on the newline.
> ((node-is "statement_block") parent-bol js-indent-level)
> ((parent-is "named_imports") parent-bol js-indent-level)
> - ((parent-is "statement_block") parent-bol js-indent-level)
> + ((parent-is "statement_block") standalone-parent js-indent-level)
> ((parent-is "variable_declarator") parent-bol js-indent-level)
> ((parent-is "arguments") parent-bol js-indent-level)
> ((parent-is "array") parent-bol js-indent-level)
> @@ -3481,7 +3481,6 @@ js--treesit-indent-rules
> ((parent-is "binary_expression") parent-bol js-indent-level)
> ((parent-is "class_body") parent-bol js-indent-level)
> ((parent-is ,switch-case) parent-bol js-indent-level)
> - ((parent-is "statement_block") parent-bol js-indent-level)
> ((match "while" "do_statement") parent-bol 0)
> ((match "else" "if_statement") parent-bol 0)
> ((parent-is ,(rx (or (seq (or "if" "for" "for_in" "while" "do")
> "_statement")
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* bug#74386: Tree-sitter javascript indentation
2024-11-18 8:35 ` Marius Kjeldahl
@ 2024-11-18 15:29 ` Dmitry Gutov
2024-11-30 10:01 ` Eli Zaretskii
0 siblings, 1 reply; 19+ messages in thread
From: Dmitry Gutov @ 2024-11-18 15:29 UTC (permalink / raw)
To: Marius Kjeldahl; +Cc: 74386, Theodor Thornhill
On 18/11/2024 10:35, Marius Kjeldahl wrote:
> Can confirm your patch solves both the variants I showed. This will
> also help me to dig into and understand more of what is going on, and
> allows me to experiment more with my preferences (and tree-sitter in
> general). So thank you very much.
Great!
For more experimenting, also check out 'M-x treesit-explore-mode'.
And here's how to immediately find out which indentation rule was used:
(setq treesit--indent-verbose t)
^ permalink raw reply [flat|nested] 19+ messages in thread
* bug#74386: Tree-sitter javascript indentation
2024-11-18 15:29 ` Dmitry Gutov
@ 2024-11-30 10:01 ` Eli Zaretskii
2024-12-01 5:23 ` Yuan Fu
0 siblings, 1 reply; 19+ messages in thread
From: Eli Zaretskii @ 2024-11-30 10:01 UTC (permalink / raw)
To: Dmitry Gutov, theo; +Cc: 74386, marius.kjeldahl
> Cc: 74386@debbugs.gnu.org, Theodor Thornhill <theo@thornhill.no>
> Date: Mon, 18 Nov 2024 17:29:18 +0200
> From: Dmitry Gutov <dmitry@gutov.dev>
>
> On 18/11/2024 10:35, Marius Kjeldahl wrote:
> > Can confirm your patch solves both the variants I showed. This will
> > also help me to dig into and understand more of what is going on, and
> > allows me to experiment more with my preferences (and tree-sitter in
> > general). So thank you very much.
>
> Great!
>
> For more experimenting, also check out 'M-x treesit-explore-mode'.
>
> And here's how to immediately find out which indentation rule was used:
>
> (setq treesit--indent-verbose t)
Theo, please chime in. Should Dmitry install his changes?
^ permalink raw reply [flat|nested] 19+ messages in thread
* bug#74386: Tree-sitter javascript indentation
2024-11-30 10:01 ` Eli Zaretskii
@ 2024-12-01 5:23 ` Yuan Fu
2024-12-01 13:11 ` Dmitry Gutov
0 siblings, 1 reply; 19+ messages in thread
From: Yuan Fu @ 2024-12-01 5:23 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: Dmitry Gutov, theo, marius.kjeldahl, 74386
> On Nov 30, 2024, at 2:01 AM, Eli Zaretskii <eliz@gnu.org> wrote:
>
>> Cc: 74386@debbugs.gnu.org, Theodor Thornhill <theo@thornhill.no>
>> Date: Mon, 18 Nov 2024 17:29:18 +0200
>> From: Dmitry Gutov <dmitry@gutov.dev>
>>
>> On 18/11/2024 10:35, Marius Kjeldahl wrote:
>>> Can confirm your patch solves both the variants I showed. This will
>>> also help me to dig into and understand more of what is going on, and
>>> allows me to experiment more with my preferences (and tree-sitter in
>>> general). So thank you very much.
>>
>> Great!
>>
>> For more experimenting, also check out 'M-x treesit-explore-mode'.
>>
>> And here's how to immediately find out which indentation rule was used:
>>
>> (setq treesit--indent-verbose t)
>
> Theo, please chime in. Should Dmitry install his changes?
The standalone-parent anchor is designed specifically to solve the bug described here, and changing parent-bol to standalone-parent is very safe. So IMO we can merge this (either to emacs-30 or master).
Yuan
^ permalink raw reply [flat|nested] 19+ messages in thread
* bug#74386: Tree-sitter javascript indentation
2024-12-01 5:23 ` Yuan Fu
@ 2024-12-01 13:11 ` Dmitry Gutov
2024-12-01 19:10 ` Yuan Fu
0 siblings, 1 reply; 19+ messages in thread
From: Dmitry Gutov @ 2024-12-01 13:11 UTC (permalink / raw)
To: Yuan Fu, Eli Zaretskii; +Cc: 74386, theo, marius.kjeldahl
On 01/12/2024 07:23, Yuan Fu wrote:
>>> For more experimenting, also check out 'M-x treesit-explore-mode'.
>>>
>>> And here's how to immediately find out which indentation rule was used:
>>>
>>> (setq treesit--indent-verbose t)
>> Theo, please chime in. Should Dmitry install his changes?
> The standalone-parent anchor is designed specifically to solve the bug described here, and changing parent-bol to standalone-parent is very safe. So IMO we can merge this (either to emacs-30 or master).
Thanks for your answer.
What I was wondering, is whether we might need some user option(s) to
switch back to parent-bol instead of standalone-parent for some
constructs, and whether we should replace more uses of parent-bol with
standaline-parent, and if so, which ones.
Also see the example of the indentation change with that patch
(switching from one fairly popular indentation style to one slightly
less popular - AFAIK).
^ permalink raw reply [flat|nested] 19+ messages in thread
* bug#74386: Tree-sitter javascript indentation
2024-12-01 13:11 ` Dmitry Gutov
@ 2024-12-01 19:10 ` Yuan Fu
2024-12-01 22:33 ` Dmitry Gutov
0 siblings, 1 reply; 19+ messages in thread
From: Yuan Fu @ 2024-12-01 19:10 UTC (permalink / raw)
To: Dmitry Gutov; +Cc: 74386, Eli Zaretskii, Theodor Thornhill, marius.kjeldahl
> On Dec 1, 2024, at 5:11 AM, Dmitry Gutov <dmitry@gutov.dev> wrote:
>
> On 01/12/2024 07:23, Yuan Fu wrote:
>>>> For more experimenting, also check out 'M-x treesit-explore-mode'.
>>>>
>>>> And here's how to immediately find out which indentation rule was used:
>>>>
>>>> (setq treesit--indent-verbose t)
>>> Theo, please chime in. Should Dmitry install his changes?
>> The standalone-parent anchor is designed specifically to solve the bug described here, and changing parent-bol to standalone-parent is very safe. So IMO we can merge this (either to emacs-30 or master).
>
> Thanks for your answer.
>
> What I was wondering, is whether we might need some user option(s) to switch back to parent-bol instead of standalone-parent for some constructs, and whether we should replace more uses of parent-bol with standaline-parent, and if so, which ones.
>
> Also see the example of the indentation change with that patch (switching from one fairly popular indentation style to one slightly less popular - AFAIK).
Ah, I see. That’s a good point, and I definitely prefer the indentation result of parent-bol here. The one produced by standalone-parent is just wrong. What we can do is make standalone-parent ignore “.” when checking for “standaloneness”. And perhaps make it configurable so it’s enabled only for modes that this waiver makes sense (C-like languages excluding C and C++).
Yuan
^ permalink raw reply [flat|nested] 19+ messages in thread
* bug#74386: Tree-sitter javascript indentation
2024-12-01 19:10 ` Yuan Fu
@ 2024-12-01 22:33 ` Dmitry Gutov
2024-12-02 2:31 ` Yuan Fu
0 siblings, 1 reply; 19+ messages in thread
From: Dmitry Gutov @ 2024-12-01 22:33 UTC (permalink / raw)
To: Yuan Fu; +Cc: 74386, Eli Zaretskii, Theodor Thornhill, marius.kjeldahl
On 01/12/2024 21:10, Yuan Fu wrote:
> Ah, I see. That’s a good point, and I definitely prefer the indentation result of parent-bol here. The one produced by standalone-parent is just wrong. What we can do is make standalone-parent ignore “.” when checking for “standaloneness”. And perhaps make it configurable so it’s enabled only for modes that this waiver makes sense (C-like languages excluding C and C++).
Maybe not by hardcoding this in inside the 'standalone-parent' matcher,
but writing this in the indentation rules? Different languages might
have differing ASTs for such construct.
Or if you meant to do a text search, a period might start a method call,
but it could also continue a "range" literal in some other language, or
some struct initializer (I think?) in C/C++. Also, some languages allow
(and style guides suggest) to have the previous at the end of the line,
then followed by newline and then the method name.
^ permalink raw reply [flat|nested] 19+ messages in thread
* bug#74386: Tree-sitter javascript indentation
2024-12-01 22:33 ` Dmitry Gutov
@ 2024-12-02 2:31 ` Yuan Fu
2024-12-11 6:18 ` Yuan Fu
0 siblings, 1 reply; 19+ messages in thread
From: Yuan Fu @ 2024-12-02 2:31 UTC (permalink / raw)
To: Dmitry Gutov; +Cc: 74386, Eli Zaretskii, Theodor Thornhill, marius.kjeldahl
[-- Attachment #1: Type: text/plain, Size: 1101 bytes --]
> On Dec 1, 2024, at 2:33 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
>
> On 01/12/2024 21:10, Yuan Fu wrote:
>> Ah, I see. That’s a good point, and I definitely prefer the indentation result of parent-bol here. The one produced by standalone-parent is just wrong. What we can do is make standalone-parent ignore “.” when checking for “standaloneness”. And perhaps make it configurable so it’s enabled only for modes that this waiver makes sense (C-like languages excluding C and C++).
>
> Maybe not by hardcoding this in inside the 'standalone-parent' matcher, but writing this in the indentation rules? Different languages might have differing ASTs for such construct.
>
> Or if you meant to do a text search, a period might start a method call, but it could also continue a "range" literal in some other language, or some struct initializer (I think?) in C/C++. Also, some languages allow (and style guides suggest) to have the previous at the end of the line, then followed by newline and then the method name.
Not hard-coded, but customizable, like this:
Yuan
[-- Attachment #2: standalone-predicate-poc.patch --]
[-- Type: application/octet-stream, Size: 2553 bytes --]
From 301e8a55a14f61258c505a48a973379ac3156079 Mon Sep 17 00:00:00 2001
From: Yuan Fu <casouri@gmail.com>
Date: Sun, 1 Dec 2024 18:26:40 -0800
Subject: [PATCH] Standalone predicate POC
---
lisp/treesit.el | 37 +++++++++++++++++++++++++++++++++++--
1 file changed, 35 insertions(+), 2 deletions(-)
diff --git a/lisp/treesit.el b/lisp/treesit.el
index 2acb46ab105..513f58eeffd 100644
--- a/lisp/treesit.el
+++ b/lisp/treesit.el
@@ -1597,6 +1597,37 @@ treesit--indent-prev-line-node
(back-to-indentation)
(treesit--indent-largest-node-at (point)))))
+(defvar treesit-simple-indent-standalone-predicate nil
+ "Function used to determine if a node is \"standalone\".
+
+\"Standalone\" means the node starts on a new line. For example, if we
+look at the opening bracket, then it's standalone in this case:
+
+ { <-- Standalone.
+ return 1;
+ }
+
+but not in this case:
+
+ if (true) { <-- Not standalone.
+ return 1;
+ }
+
+The value of this variable affects the `standalone-parent' indent preset
+for treesit-simple-indent. If the value is nil, the standlone condition
+is as described. Some major mode might want to relax the condition a
+little bit, so that it ignores some punctuation like \".\". For
+example, a Javascript mode might want to consider the method call below
+to be standalone too:
+
+ obj
+ .method(() => { <-- Consider \".method\" to be standalone,
+ return 1; <-- so this line anchors on \".method\".
+ });
+
+The value should be a function that takes a node, and return t if it's
+standalone.")
+
(defvar treesit-simple-indent-presets
(list (cons 'match
(lambda
@@ -1736,8 +1767,10 @@ treesit-simple-indent-presets
(catch 'term
(while parent
(goto-char (treesit-node-start parent))
- (when (looking-back (rx bol (* whitespace))
- (line-beginning-position))
+ (when (if (null treesit-simple-indent-standalone-predicate)
+ (looking-back (rx bol (* whitespace))
+ (line-beginning-position))
+ (funcall parent))
(throw 'term (point)))
(setq parent (treesit-node-parent parent)))))))
(cons 'prev-sibling (lambda (node parent bol &rest _)
--
2.39.5 (Apple Git-151)
^ permalink raw reply related [flat|nested] 19+ messages in thread
* bug#74386: Tree-sitter javascript indentation
2024-12-02 2:31 ` Yuan Fu
@ 2024-12-11 6:18 ` Yuan Fu
2024-12-12 3:20 ` Dmitry Gutov
0 siblings, 1 reply; 19+ messages in thread
From: Yuan Fu @ 2024-12-11 6:18 UTC (permalink / raw)
To: Dmitry Gutov; +Cc: 74386, Eli Zaretskii, Theodor Thornhill, marius.kjeldahl
> On Dec 1, 2024, at 6:31 PM, Yuan Fu <casouri@gmail.com> wrote:
>
>
>
>> On Dec 1, 2024, at 2:33 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
>>
>> On 01/12/2024 21:10, Yuan Fu wrote:
>>> Ah, I see. That’s a good point, and I definitely prefer the indentation result of parent-bol here. The one produced by standalone-parent is just wrong. What we can do is make standalone-parent ignore “.” when checking for “standaloneness”. And perhaps make it configurable so it’s enabled only for modes that this waiver makes sense (C-like languages excluding C and C++).
>>
>> Maybe not by hardcoding this in inside the 'standalone-parent' matcher, but writing this in the indentation rules? Different languages might have differing ASTs for such construct.
>>
>> Or if you meant to do a text search, a period might start a method call, but it could also continue a "range" literal in some other language, or some struct initializer (I think?) in C/C++. Also, some languages allow (and style guides suggest) to have the previous at the end of the line, then followed by newline and then the method name.
>
> Not hard-coded, but customizable, like this:
>
> Yuan
>
> <standalone-predicate-poc.patch>
Circling back on this. WDYT? I think this would benefit all “modern” languages with chaining method calls.
Yuan
^ permalink raw reply [flat|nested] 19+ messages in thread
* bug#74386: Tree-sitter javascript indentation
2024-12-11 6:18 ` Yuan Fu
@ 2024-12-12 3:20 ` Dmitry Gutov
2024-12-12 5:28 ` Yuan Fu
0 siblings, 1 reply; 19+ messages in thread
From: Dmitry Gutov @ 2024-12-12 3:20 UTC (permalink / raw)
To: Yuan Fu; +Cc: 74386, Eli Zaretskii, Theodor Thornhill, marius.kjeldahl
On 11/12/2024 08:18, Yuan Fu wrote:
>>> Maybe not by hardcoding this in inside the 'standalone-parent' matcher, but writing this in the indentation rules? Different languages might have differing ASTs for such construct.
>>>
>>> Or if you meant to do a text search, a period might start a method call, but it could also continue a "range" literal in some other language, or some struct initializer (I think?) in C/C++. Also, some languages allow (and style guides suggest) to have the previous at the end of the line, then followed by newline and then the method name.
>> Not hard-coded, but customizable, like this:
>>
>> Yuan
>>
>> <standalone-predicate-poc.patch>
> Circling back on this. WDYT? I think this would benefit all “modern” languages with chaining method calls.
It's an interesting suggestion - and the docstring is very readable.
I'd be great to see how it works with some existing modes' indentation
code - e.g. to try to rewrite any of the rules in ruby-ts-mode (do we
have any other ts mode with as many options affecting indentation?).
OT2H the use of 'standalone-parent' is optional, and it's combined with
a matcher anyway, so it's good even if it covers like 80% of the cases.
What would be our next step in this? Replacing all 'parent-bol' anchors
with 'standalone-parent' across most ts modes?
^ permalink raw reply [flat|nested] 19+ messages in thread
* bug#74386: Tree-sitter javascript indentation
2024-12-12 3:20 ` Dmitry Gutov
@ 2024-12-12 5:28 ` Yuan Fu
2024-12-13 3:34 ` Dmitry Gutov
0 siblings, 1 reply; 19+ messages in thread
From: Yuan Fu @ 2024-12-12 5:28 UTC (permalink / raw)
To: Dmitry Gutov; +Cc: 74386, Eli Zaretskii, Theodor Thornhill, marius.kjeldahl
> On Dec 11, 2024, at 7:20 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
>
> On 11/12/2024 08:18, Yuan Fu wrote:
>>>> Maybe not by hardcoding this in inside the 'standalone-parent' matcher, but writing this in the indentation rules? Different languages might have differing ASTs for such construct.
>>>>
>>>> Or if you meant to do a text search, a period might start a method call, but it could also continue a "range" literal in some other language, or some struct initializer (I think?) in C/C++. Also, some languages allow (and style guides suggest) to have the previous at the end of the line, then followed by newline and then the method name.
>>> Not hard-coded, but customizable, like this:
>>>
>>> Yuan
>>>
>>> <standalone-predicate-poc.patch>
>> Circling back on this. WDYT? I think this would benefit all “modern” languages with chaining method calls.
>
> It's an interesting suggestion - and the docstring is very readable.
>
> I'd be great to see how it works with some existing modes' indentation code - e.g. to try to rewrite any of the rules in ruby-ts-mode (do we have any other ts mode with as many options affecting indentation?).
>
> OT2H the use of 'standalone-parent' is optional, and it's combined with a matcher anyway, so it's good even if it covers like 80% of the cases.
>
> What would be our next step in this? Replacing all 'parent-bol' anchors with 'standalone-parent' across most ts modes?
Speaking of next step, I recently added another handy tool for languages with C-like syntax: c-ts-common-baseline-indent-rule. I figured out an indent logic that can work on all C-like languages and covers a wide range of cases. This one rule can give you all theses indentation:
1. Statements align to their previous sibling:
int main() {
int a = 1;
int b = 2; <-- Align to prev line’s sibling.
}
2. Indents one level for blocks: function, if, for, struct, etc.
int main() {
return 0; <-- Indent one level.
{ <-- Align to prev line’s sibling.
return 1; <-- Indent one level.
}
}
3. Elements in parenthesis and brackets:
return [1, 2, 3,
4, 5, 6]; <-- Align to first sibling.
return [
1, 2, 3, <-- Indent one level (option 1).
4, 5, 6, <-- Align to prev line’s sibling.
];
return [
1, 2, 3, <-- Align to opening bracket (option 2).
4, 5, 6, <-- Align to prev line’s sibling.
]; <-- Align to opening bracket.
for (int i = 0;
i < 10; <-- Align to first sibling.
i++) { <-- Align to prev line’s sibling.
continue;
}
4. Statement expressions indent one level when it’s broken into two
lines:
int main() {
int var
= 1287; <-- Indent one level.
int var =
1287; <-- Indent one level.
}
Then a C-like language’s major mode only need to add special cases over the baseline indent rule. And if we add the configurable heuristic for standalone-parent, the baseline indent rules would make use of it.
I brought it up because if we’re going to do some renovations to indent rules, might as well make use of c-ts-common-baseline-indent-rule, and we probably don’t even need to replace parent-box with standalone-parent, because the baseline indent rule would cover most cases.
I’ve already used it to rewrite c-ts-mode indent rules and it’s been a success; this baseline + override approach has been very helpful. c-ts-mode still has a lot of indent rules because of things like preproc directive, etc, but it’s much more manageable than before.
I don’t know how much it would help modes that has simpler indent rules. Go-ts-mode and rust-ts-mode only has a handful of indent rules, maybe they don’t really need this baseline rule. OTOH Lua and Ruby has more involved indent rules, maybe they can benefit and reduce the number of rules they need to define.
Yuan
^ permalink raw reply [flat|nested] 19+ messages in thread
* bug#74386: Tree-sitter javascript indentation
2024-12-12 5:28 ` Yuan Fu
@ 2024-12-13 3:34 ` Dmitry Gutov
2024-12-13 5:34 ` Yuan Fu
0 siblings, 1 reply; 19+ messages in thread
From: Dmitry Gutov @ 2024-12-13 3:34 UTC (permalink / raw)
To: Yuan Fu; +Cc: 74386, Eli Zaretskii, Theodor Thornhill, marius.kjeldahl
On 12/12/2024 07:28, Yuan Fu wrote:
>> What would be our next step in this? Replacing all 'parent-bol' anchors with 'standalone-parent' across most ts modes?
>
> Speaking of next step, I recently added another handy tool for languages with C-like syntax: c-ts-common-baseline-indent-rule. I figured out an indent logic that can work on all C-like languages and covers a wide range of cases. This one rule can give you all theses indentation:
Looks pretty great. I guess it depends on the grammars being to an
extent compatible, right?
> 1. Statements align to their previous sibling:
>
> int main() {
> int a = 1;
> int b = 2; <-- Align to prev line’s sibling.
> }
>
> 2. Indents one level for blocks: function, if, for, struct, etc.
>
> int main() {
> return 0; <-- Indent one level.
> { <-- Align to prev line’s sibling.
> return 1; <-- Indent one level.
> }
> }
>
> 3. Elements in parenthesis and brackets:
>
> return [1, 2, 3,
> 4, 5, 6]; <-- Align to first sibling.
>
> return [
> 1, 2, 3, <-- Indent one level (option 1).
> 4, 5, 6, <-- Align to prev line’s sibling.
> ];
>
> return [
> 1, 2, 3, <-- Align to opening bracket (option 2).
> 4, 5, 6, <-- Align to prev line’s sibling.
> ]; <-- Align to opening bracket.
>
> for (int i = 0;
> i < 10; <-- Align to first sibling.
> i++) { <-- Align to prev line’s sibling.
> continue;
> }
>
> 4. Statement expressions indent one level when it’s broken into two
> lines:
>
> int main() {
> int var
> = 1287; <-- Indent one level.
> int var =
> 1287; <-- Indent one level.
> }
Should there be an example with a method call starting on a new line,
line in the arrow literal example (for JS) that we discussed?
> Then a C-like language’s major mode only need to add special cases over the baseline indent rule. And if we add the configurable heuristic for standalone-parent, the baseline indent rules would make use of it.
Sounds good.
> I brought it up because if we’re going to do some renovations to indent rules, might as well make use of c-ts-common-baseline-indent-rule, and we probably don’t even need to replace parent-box with standalone-parent, because the baseline indent rule would cover most cases.
I'm now sure how safe that is - my point was that for each of the
languages it'd be great to have somebody motivated go over the main
syntactic cases and see that the behavior is still reasonable. But we
can also make the switch and wait for reports.
> I’ve already used it to rewrite c-ts-mode indent rules and it’s been a success; this baseline + override approach has been very helpful. c-ts-mode still has a lot of indent rules because of things like preproc directive, etc, but it’s much more manageable than before.
>
> I don’t know how much it would help modes that has simpler indent rules. Go-ts-mode and rust-ts-mode only has a handful of indent rules, maybe they don’t really need this baseline rule. OTOH Lua and Ruby has more involved indent rules, maybe they can benefit and reduce the number of rules they need to define.
Ruby has different delimiters (do...end or def...end or etc), and the
curlies don't do exactly the same job that they do in C. So I'm not sure
how feasible it is. A half of the function would be a fit, though.
^ permalink raw reply [flat|nested] 19+ messages in thread
* bug#74386: Tree-sitter javascript indentation
2024-12-13 3:34 ` Dmitry Gutov
@ 2024-12-13 5:34 ` Yuan Fu
0 siblings, 0 replies; 19+ messages in thread
From: Yuan Fu @ 2024-12-13 5:34 UTC (permalink / raw)
To: Dmitry Gutov; +Cc: 74386, Eli Zaretskii, Theodor Thornhill, marius.kjeldahl
> On Dec 12, 2024, at 7:34 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
>
> On 12/12/2024 07:28, Yuan Fu wrote:
>
>>> What would be our next step in this? Replacing all 'parent-bol' anchors with 'standalone-parent' across most ts modes?
>> Speaking of next step, I recently added another handy tool for languages with C-like syntax: c-ts-common-baseline-indent-rule. I figured out an indent logic that can work on all C-like languages and covers a wide range of cases. This one rule can give you all theses indentation:
>
> Looks pretty great. I guess it depends on the grammars being to an extent compatible, right?
Yes, but most C-like language should be compatible. The rule relies on the grammar to put brackets like “(“ “[“ “{“ as the first child node and last child node of the contract that contains them, which is what grammars naturally do. (The only exception I found is the for statement in C.) Beyond that, the rule takes advantage of how parse tress are usually structured: when the previous line is a sibling node of the current lines, usually you want to align the two lines; and when you indent, the indent anchor is usually the "standalone-parent”.
>
>> 1. Statements align to their previous sibling:
>> int main() {
>> int a = 1;
>> int b = 2; <-- Align to prev line’s sibling.
>> }
>> 2. Indents one level for blocks: function, if, for, struct, etc.
>> int main() {
>> return 0; <-- Indent one level.
>> { <-- Align to prev line’s sibling.
>> return 1; <-- Indent one level.
>> }
>> }
>> 3. Elements in parenthesis and brackets:
>> return [1, 2, 3,
>> 4, 5, 6]; <-- Align to first sibling.
>> return [
>> 1, 2, 3, <-- Indent one level (option 1).
>> 4, 5, 6, <-- Align to prev line’s sibling.
>> ];
>> return [
>> 1, 2, 3, <-- Align to opening bracket (option 2).
>> 4, 5, 6, <-- Align to prev line’s sibling.
>> ]; <-- Align to opening bracket.
>> for (int i = 0;
>> i < 10; <-- Align to first sibling.
>> i++) { <-- Align to prev line’s sibling.
>> continue;
>> }
>> 4. Statement expressions indent one level when it’s broken into two
>> lines:
>> int main() {
>> int var
>> = 1287; <-- Indent one level.
>> int var =
>> 1287; <-- Indent one level.
>> }
>
> Should there be an example with a method call starting on a new line, line in the arrow literal example (for JS) that we discussed?
Yes, once we add that.
>
>> Then a C-like language’s major mode only need to add special cases over the baseline indent rule. And if we add the configurable heuristic for standalone-parent, the baseline indent rules would make use of it.
>
> Sounds good.
>
>> I brought it up because if we’re going to do some renovations to indent rules, might as well make use of c-ts-common-baseline-indent-rule, and we probably don’t even need to replace parent-box with standalone-parent, because the baseline indent rule would cover most cases.
>
> I'm now sure how safe that is - my point was that for each of the languages it'd be great to have somebody motivated go over the main syntactic cases and see that the behavior is still reasonable. But we can also make the switch and wait for reports.
I agree, sweeping change in unfamiliar packages maintained by other people is obviously a no-go. I’m thinking of the maintainers making the change should they see the baseline-indent-rule beneficial. (Same goes to standalone-parent, I’d much rather the maintainers take that call even it’s a smaller change.)
For immediate next step we can just apply the standalone-parent patch, and use it in js. And we make baseline-indent rule support the standalone-parent customization, and let major mode maintainers know of both. What they want to do is up to them.
>
>> I’ve already used it to rewrite c-ts-mode indent rules and it’s been a success; this baseline + override approach has been very helpful. c-ts-mode still has a lot of indent rules because of things like preproc directive, etc, but it’s much more manageable than before.
>> I don’t know how much it would help modes that has simpler indent rules. Go-ts-mode and rust-ts-mode only has a handful of indent rules, maybe they don’t really need this baseline rule. OTOH Lua and Ruby has more involved indent rules, maybe they can benefit and reduce the number of rules they need to define.
>
> Ruby has different delimiters (do...end or def...end or etc), and the curlies don't do exactly the same job that they do in C. So I'm not sure how feasible it is. A half of the function would be a fit, though.
Right, so I think it’ll still be more helpful than not.
Yuan
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2024-12-13 5:34 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-16 23:26 bug#74386: Tree-sitter javascript indentation Marius Kjeldahl
2024-11-17 19:18 ` Dmitry Gutov
2024-11-17 19:21 ` Marius Kjeldahl
2024-11-17 22:12 ` Dmitry Gutov
2024-11-17 22:21 ` Marius Kjeldahl
2024-11-17 22:41 ` Dmitry Gutov
2024-11-18 8:35 ` Marius Kjeldahl
2024-11-18 15:29 ` Dmitry Gutov
2024-11-30 10:01 ` Eli Zaretskii
2024-12-01 5:23 ` Yuan Fu
2024-12-01 13:11 ` Dmitry Gutov
2024-12-01 19:10 ` Yuan Fu
2024-12-01 22:33 ` Dmitry Gutov
2024-12-02 2:31 ` Yuan Fu
2024-12-11 6:18 ` Yuan Fu
2024-12-12 3:20 ` Dmitry Gutov
2024-12-12 5:28 ` Yuan Fu
2024-12-13 3:34 ` Dmitry Gutov
2024-12-13 5:34 ` Yuan Fu
Code repositories for project(s) associated with this external index
https://git.savannah.gnu.org/cgit/emacs.git
https://git.savannah.gnu.org/cgit/emacs/org-mode.git
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.