From: Yuan Fu <casouri@gmail.com>
To: Theodor Thornhill <theo@thornhill.no>
Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org, dev@rjt.dev
Subject: Re: feature/tree-sitter: Where to Put C/C++ Stuff
Date: Tue, 1 Nov 2022 02:22:37 -0700
Message-ID: <FF0EE33A-0B07-4ABC-8A45-97DAD88CAE5B@gmail.com>
In-Reply-To: <878rkv3y7z.fsf@thornhill.no>
Before we jump into discussions, I want to note that many of your (Theo’s) arguments seem to be against cc-mode rather than against “using the same major mode”. For major modes that don’t use cc-mode (like python-mode), tree-sitter and non-tree-sitter features have so far coexisted just fine.
>>
>> That'd mean people will need either to invent all the other goodies in
>> CC mode (everything except fontifications and indentation) from
>> scratch, or give up all those other goodies. Does that make sense?
>>
>
> Yes, well, partially. I think that we are too likely to create unwanted
> issues by merging the two too closely. I have seen several of these
> issues the last couple of years while implementing c-sharp mode in cc
> mode, emacs-tree-sitter and treesit. There are several things that are
> happening. I'll try to expand on some of them just to create some
> perspective, but also for some specific points where we can improve so
> that maybe we don't have a problem with this at all.
>
> 1: Use CC mode for one thing and tree-sitter for the rest
> While first implementing tree-sitter in c-sharp mode we tried just
> applying font-locking, and use cc mode for indentation and the rest.
> What happened was that we immediately inherited the performance issues
> from cc mode straight into our code. Specifically, when typing in a
> file with too many (from cc mode's perspective) strings, typing lag rose
> to several seconds per press. I filed several bug reports on this both
> here and to Alan. After some time and much heroics we got some
> improvement on this from Alan, but c-sharp already had moved on.
>
> 2: Using separate names for modes.
> The great advantage here is easy to understand. You have no inheritance
> issues, and are free to develop features without regards to legacy. A
> disadvantage is that some users depend on that major mode name for other
> stuff. We had some issues filed with us to flip over to tree-sitter
> completely, because that name (csharp-mode) was so important compared to
> (csharp-tree-sitter-mode). We almost made the change, but then Yuan
> started his work so we waited. This would have sunsetted the cc mode
> almost immediately
>
> 3: Confusion with where to file bugs
> We have many bugs in c-sharp mode where some things are emacs bugs, some
> things are cc mode bugs, some are treesitter bugs and some are our own
> bugs. There is a real issue with understanding cc mode and figuring out
> where a bug fix should end up. It has taken me many weeks worth of
> digging to understand only the simplest mechanisms of cc mode.
> Tree-sitter takes contributors only a couple of hours to be immediately
> productive. To disregard this point with only compatibility with cc
> mode is a huge mistake, IMO.
>
> 4: How do we know what to disable?
> If there's a problem somewhere in the tree-sitter variant of the cc mode
> derived new mode, and we see some issue - who makes the fix? For
> example, previously there was limited support for multiline strings in
> cc mode, which took almost a year to finalize. The tree-sitter variant
> with more performance and accuracy took me maybe 20 minutes in a
> work-meeting. Should a feature that is simple to implement in the
> tree-sitter variant wait for a similar cc mode implementation? The
> namespacing seems to suggest that yes, it should.
I don’t think it should (on which I think we both agree). Nor do I think it’s a problem if a major mode has some tree-sitter-powered feature that the non-tree-sitter version doesn’t have.
>
> 5: While tree-sitter is only an engine, it provides a lot more goodies
> We have a huge opportunity to create real new frameworks for emacs now,
> but limiting us to merge the features/modes suggests that we cannot
> reliably do overarching advancements such as we see now in the
> feature/tree-sitter branch. For example, many small hacks I've made in
> the modes I've submitted thus far have made it into general mechanisms in
> treesit.el. All modes that enable tree-sitter should be able to use
> these and all the new ones that come _without_ worrying whether or not some
> issue will crop up from inheriting from cc mode or some other thing.
> Examples are indentation styles, paredit-like functionalities,
> refactorings and more.
>
> 6: What are the goodies that we really need from CC mode?
> CC mode provides indentation and font locking. What else does it
> provide that isn't replaceable pretty quickly? I mean this not as a
> contrarian, but out of real curiosity.
One thing I found, which might be the only thing, is filling: specifically, filling /* */ style comments while respecting all the styles of drawing stars in them. I mean styles like

/*
 *
 */

/*=====================================
=======================================*/

and so on. I tried to look at c-mask-paragraph, and it is very complicated. Maybe we can use c-fill-paragraph without setting up the rest of cc-mode?
> My guess is that we can get to
> feature parity and well beyond that in a very short amount of time, if
> we're not hindered by merging everything.
>
>
> Sorry for the long mail, but I think we are missing the point by viewing
> tree-sitter simply as an engine to plop in aside cc mode for
> convenience, and not the real infrastructure change it is. There is no
> need to sunset cc mode, but equally there is no need to limit tree-sitter.
>
If mixing cc-mode and tree-sitter causes more problems than it solves, maybe we can adopt a mutually exclusive policy, where a major mode either sets up cc-mode or uses tree-sitter, but never both.
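
To make that policy concrete, here is a minimal sketch, not a proposal for actual code. The mode name my-c-mode, the variable my-c--backend and the my-c--* helpers are all hypothetical, and the helper bodies are placeholders. The only real calls are treesit-available-p, treesit-language-available-p and treesit-parser-create, and I am assuming a build where treesit.el and a C grammar may or may not be present.

```elisp
;; Sketch only.  `my-c-mode', `my-c--backend' and the `my-c--*' helpers
;; are hypothetical names used for illustration.

(defvar-local my-c--backend nil
  "Which backend this buffer uses: `tree-sitter' or `cc-mode'.")

(defun my-c--tree-sitter-usable-p ()
  "Return non-nil if tree-sitter and a C grammar are available."
  (and (require 'treesit nil t)         ; nil on Emacs without treesit.el
       (treesit-available-p)
       (treesit-language-available-p 'c)))

(defun my-c--setup-tree-sitter ()
  "Set up tree-sitter features only; nothing from cc-mode is touched."
  ;; Tree-sitter font-lock and indentation rules would also be
  ;; installed here.
  (treesit-parser-create 'c)
  (setq my-c--backend 'tree-sitter))

(defun my-c--setup-cc-mode ()
  "Set up cc-mode features only; no tree-sitter parser is created."
  ;; The usual cc-mode initialization would go here.
  (setq my-c--backend 'cc-mode))

(define-derived-mode my-c-mode prog-mode "My-C"
  "Hypothetical C mode that uses exactly one backend per buffer."
  (if (my-c--tree-sitter-usable-p)
      (my-c--setup-tree-sitter)
    (my-c--setup-cc-mode)))
```

The point of the single `if' in the mode body is that a buffer never gets both setups, so a bug can always be attributed to one backend or the other.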
>
>> Tree-sitter doesn't (and cannot) replace everything a major mode does
>> for a programming language. So a completely new mode means we throw
>> the baby out with the bathwater.
>
> I don't agree, but I'm very curious about what else, _apart_ from
> indentation feature parity with cc mode, would take significant effort.
Tree-sitter is just a tool. Obviously there are things a major mode provides that don’t involve a parser, e.g., Python’s REPL. But I see no problem putting such a feature alongside tree-sitter features in the same major mode.
>
> One thing I know of is integration with package managers such as what
> elm-mode and go-mode does, but that is an easy fix. The upstream
> go-mode, if not possible to move to core can just derive from a simple
> go-treesit, skip all indentation and font-locking in its own mode, but
> supply the goodies.
>
> --
> Theo