From: Yuan Fu <casouri@gmail.com>
To: Theodor Thornhill <theo@thornhill.no>
Cc: emacs-devel <emacs-devel@gnu.org>,
Stefan Monnier <monnier@iro.umontreal.ca>
Subject: Re: Tree-sitter indentation for js-mode & cc-mode
Date: Thu, 27 Oct 2022 08:21:43 -0700
Message-ID: <A9F73048-7234-41D8-B092-DE1E277CD689@gmail.com>
In-Reply-To: <87k04lljh6.fsf@thornhill.no>
> On Oct 27, 2022, at 2:11 AM, Theodor Thornhill <theo@thornhill.no> wrote:
>
>
> Hi Yuan!
>
>> I did some work to allow the tree-sitter indentation engine to plug into
>> c-offset-alist. Currently in a tree-sitter indent rule, we have
>>
>> (MATCHER ANCHOR OFFSET)
>>
>> OFFSET is normally an integer, but now it can also be a syntax symbol
>> recognized by cc-mode’s indentation engine. In that case, tree-sitter
>> indent calculates the indent using c-calc-offset, passing the syntax
>> symbol and anchor position to it, and c-calc-offset will give us the
>> integer offset based on c-offset-alist.
>>
>
> This is cool, but do we really want/need this? I mean, now we're really
> binding these implementations together and allowing all the legacy of CC
> mode to blend in. We also need knowledge of how CC mode names their
> syntactic definitions. IMO one of the big selling points of tree-sitter
> is that you can look at other editors' implementations and get inspired
> immediately. Now we need deep knowledge of CC mode, don't we? Also,
> why would we want cc mode to calculate this for us? I see what you're
> trying to do, but _I_ think this is a step in the wrong direction.
You have a point. I tried to blend in cc-mode because that would let us support “styles” and existing user customization. (I also started out thinking it would be easier to write indentation rules this way, which turned out not to be true.) Perhaps it’s better to come up with a new system for customizing indentation style. I’ll revert this change.
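
For readers following along, the idea in the rule form quoted above (an OFFSET that is either an integer or a syntax symbol resolved through an alist) can be sketched roughly as below. This is a toy illustration only: `demo-offset-alist`, `demo-resolve-offset` and `demo-indent-column` are hypothetical names, the alist values are made up, and nothing here calls treesit.el or cc-mode. The real `c-offset-alist` also allows values such as `+`, `-` and functions, which this sketch does not handle.

```elisp
;; Hypothetical stand-in for `c-offset-alist': syntax symbol -> columns.
(defvar demo-offset-alist
  '((statement-block-intro . 4)
    (block-close . 0))
  "Toy mapping from syntax symbols to integer offsets (assumed values).")

(defun demo-resolve-offset (offset)
  "Return OFFSET as an integer number of columns.
An integer is returned unchanged; a symbol is looked up in
`demo-offset-alist'.  Unknown symbols signal an error."
  (cond
   ((integerp offset) offset)
   ((symbolp offset)
    (let ((entry (assq offset demo-offset-alist)))
      (if entry
          (cdr entry)
        (error "Unknown syntax symbol: %S" offset))))
   (t (error "Unsupported OFFSET: %S" offset))))

(defun demo-indent-column (anchor-column offset)
  "Return the target column: ANCHOR-COLUMN plus the resolved OFFSET."
  (+ anchor-column (demo-resolve-offset offset)))
```

With these assumed values, `(demo-indent-column 2 4)` and `(demo-indent-column 2 'statement-block-intro)` both return 6, which is the sense in which a symbolic OFFSET would defer to the user's style settings.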
>
>
>> I’ve written indent rules for js-mode, they are in
>> js-treesit-cc-indent-rules. Overall it works pretty well. Theo, could
>> you give it a try? From my testing it is already an improvement over
>> the original rules. I didn’t finish the JSX part and just copied your
>> original rules for JSX. In the future I can probably port that to
>> cc-style too.
>>
>
> I don't think this is better, for example:
>
> The old variant renders this snippet correctly:
> ```
> const fooClient = new Foo({
> bucket: process.env.foo,
> region: process.env.foo,
> });
> ```
>
> But the new one renders it like this:
> ```
> const fooClient = new Foo({
> bucket: process.env.foo,
> region: process.env.foo,
> });
> ```
>
> I know this is a matter of tweaking, but it immediately makes me
> question the reasoning to blend them.
It’s largely my slip-up rather than an inherent defect of the system, but I agree with your opinion above.
>
> In addition I profiled indenting a 50k lines js file with messed up
> indentation, and received some surprising results. The pure cc mode
> variant is slow, but MUCH faster than tree-sitter. It seems we are
> looking up the root of the tree way too much, but you know the internals
> here better than me. Is this something we can optimize away? See the
> attached report at the bottom.
This is very strange, I need to look into it.
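
To compare the two indentation paths with a single number, a rough timing helper along these lines might be handy. It is a sketch, not the method used for the attached report: it swaps the profiler for `benchmark-run` from benchmark.el, and `demo-time-reindent` is a hypothetical name. It measures wall-clock time for one `indent-region` pass over a temporary buffer.

```elisp
(require 'benchmark)

(defun demo-time-reindent (text mode)
  "Insert TEXT in a temp buffer, enable MODE, and time `indent-region'.
MODE is a major mode function such as `js-mode'.  Return the elapsed
wall-clock time in seconds as a float."
  (with-temp-buffer
    (insert text)
    (funcall mode)
    ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-ELAPSED);
    ;; only the total elapsed time is kept here.
    (car (benchmark-run 1
           (indent-region (point-min) (point-max))))))
```

For example, `(demo-time-reindent "function f() {\nreturn 1;\n}\n" 'js-mode)` times a tiny snippet; feeding it the contents of a large file once with tree-sitter enabled and once without would give a coarse comparison, while the profiler is still the tool for finding where the time goes.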
>
>> I also added imenu support for js-mode and ts-mode, and navigation for
>> python-mode.
>>
>
> Very cool - it seems to work pretty nicely!
Nice.
>
> Anyways - can we please revisit the idea that we init and/or use cc mode
> in tandem with tree-sitter? I know we want "feature parity", but I
> think we lose too much of the simplicity gained by adding in the old
> complexity. My prediction for the future is that this will result in
> numerous bug reports where it's hard to know whether this is a fix for
> cc mode or treesitter. And in the end people _will_ just skip these
> modes altogether and put simpler ones in (m)elpa that only use treesit,
> to avoid this. The cc mode won't go away at all, for the people that
> consider that superior. We can still use the treesit-settings as a
> centralized variable to get most, if not all of the "auto-enabling"
> benefits by just lifting it up:
>
> ```
> ;;....
>
> (cond
>  ;; Tree-sitter.
>  ((treesit-ready-p 'js-mode 'javascript)
>   ;; init all treesitter relevant stuff - can add in _some_ other
>   ;; non-cc-mode settings, such as comment-start, etc. above this.
>   ;; We don't need the cache, detection of js-jsx or any of the
>   ;; before-change-functions
>
>   (treesit-major-mode-setup))
>  ;; Elisp.
>  (t
>   ;; enable the normal cc mode stuff
>   )))
>
> ```
>
> This way other hypothetical tree-sitter-v2 in the future is just a
> simple cond, and no need to worry.
>
> If I'm missing something important here, please let me know, but I
> _really_ don't understand the reason for merging these implementations.
I don’t have an educated opinion on this. If no one has objections I’ll follow your professional advice ;-)
> Anyway, thanks for your continued hard work!
Many thanks to you, too!
>
> Theo
>
> <indent-report.txt>