unofficial mirror of bug-gnu-emacs@gnu.org 
From: Yuan Fu <casouri@gmail.com>
To: Dmitry Gutov <dmitry@gutov.dev>
Cc: 74386@debbugs.gnu.org, Eli Zaretskii <eliz@gnu.org>,
	Theodor Thornhill <theo@thornhill.no>,
	marius.kjeldahl@gmail.com
Subject: bug#74386: Tree-sitter javascript indentation
Date: Thu, 12 Dec 2024 21:34:29 -0800
Message-ID: <057D3C98-8503-4C6F-80F1-B54BAFE624BF@gmail.com>
In-Reply-To: <cb45b90e-cc66-4b3f-895b-d96f641bb9dd@gutov.dev>



> On Dec 12, 2024, at 7:34 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
> 
> On 12/12/2024 07:28, Yuan Fu wrote:
> 
>>> What would be our next step in this? Replacing all 'parent-bol' anchors with 'standalone-parent' across most ts modes?
>> Speaking of next steps, I recently added another handy tool for languages with C-like syntax: c-ts-common-baseline-indent-rule. I figured out an indent logic that can work on all C-like languages and covers a wide range of cases. This one rule can give you all of these indentations:
> 
> Looks pretty great. I guess it depends on the grammars being to an extent compatible, right?

Yes, but most C-like languages should be compatible. The rule relies on the grammar to put brackets like "(", "[", and "{" as the first and last child nodes of the construct that contains them, which is what grammars naturally do. (The only exception I found is the for statement in C.) Beyond that, the rule takes advantage of how parse trees are usually structured: when the previous line is a sibling node of the current line, you usually want to align the two lines; and when you indent, the indent anchor is usually the "standalone-parent".
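
A minimal sketch of how a major mode might layer its own special cases over the baseline rule. The language symbol `my-lang', the variables `my-lang-indent-offset' and `my-lang--indent-rules', and the "preproc" node-type pattern are hypothetical. It also assumes an Emacs recent enough to ship c-ts-common-baseline-indent-rule and to accept a bare function as an entry in treesit-simple-indent-rules; check both against your Emacs before relying on this.

```elisp
;; Sketch only, not taken from any existing mode.
(require 'treesit)
(require 'c-ts-common)

(defvar my-lang-indent-offset 2
  "Hypothetical indent offset for the imaginary `my-lang' mode.")

(defvar my-lang--indent-rules
  `((my-lang
     ;; Language-specific overrides go first so they take precedence.
     ;; "preproc" is a made-up node-type pattern for illustration.
     ((node-is "preproc") column-0 0)
     ;; Everything else falls through to the shared baseline rule.
     ;; Assumption: a bare function is a valid rule entry here.
     c-ts-common-baseline-indent-rule))
  "Hypothetical indent rules: overrides first, baseline last.")

;; In the mode's setup function one would then do something like:
;;   (setq-local c-ts-common-indent-offset 'my-lang-indent-offset)
;;   (setq-local treesit-simple-indent-rules my-lang--indent-rules)
```

The ordering is the point of the sketch: rules are tried top to bottom, so the baseline rule only handles lines that no earlier special case claimed.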

> 
>> 1. Statements align to their previous sibling:
>>     int main() {
>>       int a = 1;
>>       int b = 2; <-- Align to prev line’s sibling.
>>     }
>> 2. Indents one level for blocks: function, if, for, struct, etc.
>>     int main() {
>>       return 0;   <-- Indent one level.
>>       {           <-- Align to prev line’s sibling.
>>         return 1; <-- Indent one level.
>>       }
>>     }
>> 3. Elements in parenthesis and brackets:
>>     return [1, 2, 3,
>>             4, 5, 6]; <-- Align to first sibling.
>>     return [
>>       1, 2, 3,  <-- Indent one level (option 1).
>>       4, 5, 6,  <-- Align to prev line’s sibling.
>>     ];
>>     return [
>>             1, 2, 3,  <-- Align to opening bracket (option 2).
>>             4, 5, 6,  <-- Align to prev line’s sibling.
>>            ];         <-- Align to opening bracket.
>>     for (int i = 0;
>>          i < 10; <-- Align to first sibling.
>>          i++) {  <-- Align to prev line’s sibling.
>>       continue;
>>     }
>> 4. Statement expressions indent one level when they’re broken into two
>>    lines:
>>     int main() {
>>       int var
>>         = 1287;  <-- Indent one level.
>>       int var =
>>         1287;    <-- Indent one level.
>>     }
> 
> Should there be an example with a method call starting on a new line, like in the arrow literal example (for JS) that we discussed?

Yes, once we add that.

> 
>> Then a C-like language’s major mode only needs to add special cases over the baseline indent rule. And if we add the configurable heuristic for standalone-parent, the baseline indent rule would make use of it.
> 
> Sounds good.
> 
>> I brought it up because if we’re going to do some renovations to indent rules, we might as well make use of c-ts-common-baseline-indent-rule, and we probably don’t even need to replace parent-bol with standalone-parent, because the baseline indent rule would cover most cases.
> 
> I'm not sure how safe that is - my point was that for each of the languages it'd be great to have somebody motivated go over the main syntactic cases and see that the behavior is still reasonable. But we can also make the switch and wait for reports.

I agree, sweeping changes in unfamiliar packages maintained by other people are obviously a no-go. I’m thinking of the maintainers making the change should they find the baseline-indent-rule beneficial. (Same goes for standalone-parent; I’d much rather the maintainers make that call even if it’s a smaller change.)

For the immediate next step, we can just apply the standalone-parent patch and use it in js. Then we make the baseline indent rule support the standalone-parent customization and let major mode maintainers know of both. What they want to do is up to them.

> 
>> I’ve already used it to rewrite c-ts-mode indent rules and it’s been a success; this baseline + override approach has been very helpful. c-ts-mode still has a lot of indent rules because of things like preproc directives, etc., but it’s much more manageable than before.
>> I don’t know how much it would help modes that have simpler indent rules. Go-ts-mode and rust-ts-mode only have a handful of indent rules, so maybe they don’t really need this baseline rule. OTOH Lua and Ruby have more involved indent rules, so maybe they can benefit and reduce the number of rules they need to define.
> 
> Ruby has different delimiters (do...end or def...end or etc), and the curlies don't do exactly the same job that they do in C. So I'm not sure how feasible it is. A half of the function would be a fit, though.

Right, so I think it’ll still be more helpful than not.

Yuan



