Re: Tree-sitter indentation for js-mode & cc-mode


From: Yuan Fu <casouri@gmail.com>
To: Theodor Thornhill <theo@thornhill.no>
Cc: emacs-devel <emacs-devel@gnu.org>,
	Stefan Monnier <monnier@iro.umontreal.ca>
Subject: Re: Tree-sitter indentation for js-mode & cc-mode
Date: Thu, 27 Oct 2022 08:21:43 -0700
Message-ID: <A9F73048-7234-41D8-B092-DE1E277CD689@gmail.com>
In-Reply-To: <87k04lljh6.fsf@thornhill.no>



> On Oct 27, 2022, at 2:11 AM, Theodor Thornhill <theo@thornhill.no> wrote:
> 
> 
> Hi Yuan!
> 
>> I did some work to allow tree-sitter indentation engine to plug in to
>> c-offset-alist. Currently in a tree-sitter indent rule, we have
>> 
>> (MATCHER ANCHOR OFFSET)
>> 
>> OFFSET is normally an integer, but now it can also be a syntax symbol
>> recognized by cc-mode’s indentation engine. In that case, tree-sitter
>> indent calculates the indent using c-calc-offset, passing the syntax
>> symbol and anchor position to it, and c-calc-offset will give us the
>> integer offset based on c-offset-alist.
>> 
> 
> This is cool, but do we really want/need this?  I mean, now we're really
> binding these implementations together and allowing all the legacy of CC
> mode to blend in.  We also need knowledge of how CC mode names their
> syntactic definitions.  IMO one of the big selling points of tree sitter
> is that you can look at other editors' implementations and get inspired
> immediately.  Now we need deep knowledge of cc mode, don't we?  Also,
> why would we want cc mode to calculate this for us?  I see what you're
> trying to do, but _I_ think this is a step in the wrong direction.

You have a point. I tried to blend in cc-mode because that would let us support “styles” and existing user customization. (I also started out thinking it would be easier to write indentation rules this way, which turned out not to be true.) Perhaps it’s better to come up with a new system for customizing indentation style. I’ll revert this change.

> 
> 
>> I’ve written indent rules for js-mode, they are in
>> js-treesit-cc-indent-rules. Overall it works pretty well. Theo, could
>> you give it a try? From my testing it is already an improvement from
>> the original rules. I didn’t finish the JSX part and just copied your
>> original rules for JSX. In the future I can probably port that to
>> cc-style too.
>> 
> 
> I don't think this is better, for example:
> 
> The old variant renders this snippet correctly:
> ```
> const fooClient = new Foo({
>  bucket: process.env.foo,
>  region: process.env.foo,
> });
> ```
> 
> But the new one renders it like this:
> ```
> const fooClient = new Foo({
>                            bucket: process.env.foo,
>                          region: process.env.foo,
>                          });
> ```
> 
> I know this is a matter of tweaking, but it immediately makes me
> question the reasoning to blend them.

It’s largely my slip-up rather than an inherent defect of the system, but I agree with your opinion above.
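
For reference, here is a minimal sketch of what plain tree-sitter rules for the object-literal case might look like, using integer offsets in the (MATCHER ANCHOR OFFSET) form quoted above. The names my-js-indent-offset and my-js-indent-rules are hypothetical, and the sketch assumes the JavaScript grammar names object literals "object". It is not the rule set that js-mode actually uses.

```elisp
;; Sketch only: `my-js-indent-offset' and `my-js-indent-rules' are
;; hypothetical names, not the rules that ship with js-mode.
(defvar my-js-indent-offset 2
  "Number of columns per indentation level (illustrative).")

(defvar my-js-indent-rules
  `((javascript
     ;; A closing brace lines up with the first non-blank character
     ;; of the line on which its parent node starts.
     ((node-is "}") parent-bol 0)
     ;; Members of an object literal (assumed node type "object") are
     ;; indented one level from the start of the opening line.
     ((parent-is "object") parent-bol ,my-js-indent-offset)))
  "Illustrative value for `treesit-simple-indent-rules'.")

;; In a buffer that already has a JavaScript tree-sitter parser:
;; (setq-local treesit-simple-indent-rules my-js-indent-rules)
;; (setq-local indent-line-function #'treesit-indent)
```

Because both rules anchor at parent-bol rather than at the opening brace itself, the members and the closing brace should follow the start of the "const fooClient = ..." line instead of drifting to the column of the "{".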

> 
> In addition I profiled indenting a 50k lines js file with messed up
> indentation, and received some surprising results.  The pure cc mode
> variant is slow, but MUCH faster than tree-sitter.  It seems we are
> looking up the root of the tree way too much, but you know the internals
> here better than me.  Is this something we can optimize away? See the
> attached report at the bottom.

This is very strange, I need to look into it.
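
A rough way to reproduce this kind of measurement, assuming the built-in CPU profiler is good enough for the job. The helper name my-profile-reindent-buffer is hypothetical, and this is only a sketch of one possible setup, not necessarily how the attached report was produced.

```elisp
;; Sketch: profile re-indenting the current buffer.
;; `my-profile-reindent-buffer' is a hypothetical helper name.
(require 'profiler)

(defun my-profile-reindent-buffer ()
  "Re-indent the whole buffer under the CPU profiler and show a report."
  (interactive)
  ;; Signals an error if a profiler is already running.
  (profiler-start 'cpu)
  (unwind-protect
      (progn
        (indent-region (point-min) (point-max))
        ;; Take the report while the profiler is still active.
        (profiler-report))
    ;; Always stop the profiler, even if indentation fails.
    (profiler-stop)))
```

Running it once in a buffer with tree-sitter enabled and once without should make the two call trees directly comparable.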

> 
>> I also added imenu support for js-mode and ts-mode, and navigation for
>> python-mode.
>> 
> 
> Very cool - it seems to work pretty nicely!

Nice.

> 
> Anyways - can we please revisit the idea that we init and/or use cc mode
> in tandem with tree-sitter?  I know we want "feature parity", but I
> think we lose too much of the simplicity gained by adding in the old
> complexity.  My prediction for the future is that this will result in
> numerous bug reports where it's hard to know whether this is a fix for
> cc mode or treesitter.  And in the end people _will_ just skip these
> modes altogether and put simpler ones in (m)elpa that only use treesit,
> to avoid this.  The cc mode won't go away at all, for the people who
> consider it superior.  We can still use the treesit-settings as a
> centralized variable to get most, if not all of the "auto-enabling"
> benefits by just lifting it up:
> 
> ```
>  ;;....
> 
>  (cond
>   ;; Tree-sitter.
>   ((treesit-ready-p 'js-mode 'javascript)
>    ;; init all treesitter relevant stuff - can add in _some_ other
>    ;; non-cc-mode settings, such as comment-start, etc. above this.
>    ;; We don't need the cache, detection of js-jsx or any of the
>    ;; before-change-functions
> 
>    (treesit-major-mode-setup))
>   ;; Elisp.
>   (t
>    ;; enable the normal cc mode stuff
>    )))
> 
> ```
> 
> This way other hypothetical tree-sitter-v2 in the future is just a
> simple cond, and no need to worry.
> 
> If I'm missing something important here, please let me know, but I
> _really_ don't understand the reason for merging these implementations.

I don’t have an educated opinion on this. If no one has objections I’ll follow your professional advice ;-)

> Anyway, thanks for your continued hard work!

Many thanks to you, too!

> 
> Theo
> 
> <indent-report.txt>
