* feature/tree-sitter: Where to Put C/C++ Stuff
From: Randy Taylor @ 2022-11-01 2:30 UTC
To: emacs-devel@gnu.org

Hi.

Where specifically should the C and C++ tree-sitter stuff go? I've been using it for a couple months and would like to upstream syntax highlighting for both. I'll focus on getting C done first.

I see there are a lot of cc- files; would it be appropriate to add the tree-sitter stuff into a new cc-treesit.el file?

Thanks.
* Re: feature/tree-sitter: Where to Put C/C++ Stuff
From: Theodor Thornhill @ 2022-11-01 5:44 UTC
To: emacs-devel, Randy Taylor, emacs-devel@gnu.org

On 1 November 2022 03:30:54 CET, Randy Taylor <dev@rjt.dev> wrote:
>Hi.
>
>Where specifically should the C and C++ tree-sitter stuff go? I've been using it for a couple months and would like to upstream syntax highlighting for both. I'll focus on getting C done first.
>
>I see there are a lot of cc- files; would it be appropriate to add the tree-sitter stuff into a new cc-treesit.el file?
>Thanks.

I'm no authority on the matter, but I'd love for us not to complicate things too much. I vote for separate, non-cc-prefixed _new_ modes that derive from prog-mode. I understand that this is a controversial opinion, but that's what I want. I believe people will do that anyway if we don't.

Theo
* Re: feature/tree-sitter: Where to Put C/C++ Stuff
From: Eli Zaretskii @ 2022-11-01 7:24 UTC
To: Theodor Thornhill; +Cc: emacs-devel, dev, emacs-devel

> Date: Tue, 01 Nov 2022 06:44:38 +0100
> From: Theodor Thornhill <theo@thornhill.no>
>
> >Where specifically should the C and C++ tree-sitter stuff go? I've been using it for a couple months and would like to upstream syntax highlighting for both. I'll focus on getting C done first.
> >
> >I see there are a lot of cc- files; would it be appropriate to add the tree-sitter stuff into a new cc-treesit.el file?
> >Thanks.
>
> I'm no authority on the matter, but I'd love for us not to complicate things too much. I vote for separate, non-cc-prefixed _new_ modes that derive from prog-mode.

That'd mean people will need either to invent all the other goodies in
CC mode (everything except fontifications and indentation) from
scratch, or give up all those other goodies. Does that make sense?

Tree-sitter doesn't (and cannot) replace everything a major mode does
for a programming language. So a completely new mode means we throw
the baby out with the bathwater.
* Re: feature/tree-sitter: Where to Put C/C++ Stuff
From: Theodor Thornhill @ 2022-11-01 7:55 UTC
To: Eli Zaretskii; +Cc: emacs-devel, dev, emacs-devel

Hi Eli!

>> Date: Tue, 01 Nov 2022 06:44:38 +0100
>> From: Theodor Thornhill <theo@thornhill.no>
>>
>> >Where specifically should the C and C++ tree-sitter stuff go? I've
>> >been using it for a couple months and would like to upstream syntax
>> >highlighting for both. I'll focus on getting C done first.
>> >
>> >I see there are a lot of cc- files; would it be appropriate to add
>> >the tree-sitter stuff into a new cc-treesit.el file? Thanks.
>>
>> I'm no authority on the matter, but I'd love for us not to complicate
>> things too much. I vote for separate, non-cc-prefixed _new_ modes
>> that derive from prog-mode.
>
> That'd mean people will need either to invent all the other goodies in
> CC mode (everything except fontifications and indentation) from
> scratch, or give up all those other goodies. Does that make sense?
>

Yes, well, partially. I think that we are too likely to create unwanted
issues by merging the two too closely. I have seen several of these
issues the last couple of years while implementing c-sharp mode in cc
mode, emacs-tree-sitter and treesit. There are several things that are
happening. I'll try to expand on some of them just to create some
perspective, but also for some specific points where we can improve to
maybe not have a problem with this at all.

1: Use CC mode for one thing and tree-sitter for the rest
While first implementing tree-sitter in c-sharp mode we tried just
applying font-locking, and use cc mode for indentation and the rest.
What happened was that we immediately inherited the performance issues
from cc mode straight into our code.
Specifically, when typing in a
file with too many (from cc mode's perspective) strings, typing lag rose
to several seconds per press. I filed several bug reports on this both
here and to Alan. After some time and much heroics we got some
improvement on this from Alan, but c-sharp already had moved on.

2: Using separate names for modes.
The great advantage here is easy to understand. You have no inheritance
issues, and are free to develop features without regards to legacy. A
disadvantage is that some users depend on that major mode name for other
stuff. We had some issues filed with us to flip over to tree-sitter
completely, because that name (csharp-mode) was so important compared to
(csharp-tree-sitter-mode). We almost made the change, but then Yuan
started his work so we waited. This would have sunsetted the cc mode
almost immediately.

3: Confusion with where to file bugs
We have many bugs in c-sharp mode where some things are emacs bugs, some
things are cc mode bugs, some are tree-sitter bugs and some are our own
bugs. There is a real issue with understanding cc mode and figuring out
where a bug fix should end up. It has taken me many weeks worth of
digging to understand only the simplest mechanisms of cc mode.
Tree-sitter takes contributors only a couple of hours to be immediately
productive. To disregard this point with only compatibility with cc
mode is a huge mistake, IMO.

4: How do we know what to disable?
If there's a problem somewhere in the tree-sitter variant of the cc mode
derived new mode, and we see some issue - who makes the fix? For
example, previously there was limited support for multiline strings in
cc mode, which took almost a year to finalize. The tree-sitter variant
with more performance and accuracy took me maybe 20 minutes in a
work-meeting. Should a feature that is simple to implement in the
tree-sitter variant wait for a similar cc mode implementation? The
namespacing seems to suggest that yes, it should.
5: While tree-sitter is only an engine, it provides a lot more goodies
We have a huge opportunity to create real new frameworks for emacs now,
but limiting us to merge the features/modes suggests that we cannot
reliably do overarching advancements such as we see now in the
feature/tree-sitter branch. For example, many small hacks I've made in
the modes I've submitted thus far have made it into general mechanisms in
treesit.el. All modes that enable tree-sitter should be able to use
these and all the new ones that come _without_ worrying whether or not some
issue will crop up from inheriting from cc mode or some other thing.
Examples are indentation styles, paredit-like functionalities,
refactorings and more.

6: What are the goodies that we really need from CC mode?
CC mode provides indentation and font locking. What else does it
provide that isn't replaceable pretty quickly? I mean this not as a
contrarian, but out of real curiosity. My guess is that we can get to
feature parity and well beyond that in a very short amount of time, if
we're not hindered by merging everything.


Sorry for the long mail, but I think we are missing the point by viewing
tree-sitter simply as an engine to plop in aside cc mode for
convenience, and not the real infrastructure change it is. There is no
need to sunset cc mode, but equally there is no need to limit tree-sitter.

> Tree-sitter doesn't (and cannot) replace everything a major mode does
> for a programming language. So a completely new mode means we throw
> the baby out with the bathwater.

I don't agree, but I'm very curious as to what else would take a
significant effort _apart_ from indentation feature parity with cc mode.

One thing I know of is integration with package managers such as what
elm-mode and go-mode do, but that is an easy fix. The upstream
go-mode, if not possible to move to core can just derive from a simple
go-treesit, skip all indentation and font-locking in its own mode, but
supply the goodies.
--
Theo
* Re: feature/tree-sitter: Where to Put C/C++ Stuff
From: Yuan Fu @ 2022-11-01 9:22 UTC
To: Theodor Thornhill; +Cc: Eli Zaretskii, emacs-devel, dev

Before we jump into discussions, I want to note that many of your
(Theo’s) arguments seem to be against cc-mode rather than “using the
same major mode”. For major modes that don’t use cc-mode (like
python-mode), tree-sitter and non-tree-sitter features so far coexist
just fine.

>>
>> That'd mean people will need either to invent all the other goodies in
>> CC mode (everything except fontifications and indentation) from
>> scratch, or give up all those other goodies. Does that make sense?
>>
>
> Yes, well, partially. I think that we are too likely to create unwanted
> issues by merging the two too closely. I have seen several of these
> issues the last couple of years while implementing c-sharp mode in cc
> mode, emacs-tree-sitter and treesit. There are several things that are
> happening. I'll try to expand on some of them just to create some
> perspective, but also for some specific points where we can improve to
> maybe not have a problem with this at all.
>
> 1: Use CC mode for one thing and tree-sitter for the rest
> While first implementing tree-sitter in c-sharp mode we tried just
> applying font-locking, and use cc mode for indentation and the rest.
> What happened was that we immediately inherited the performance issues
> from cc mode straight into our code. Specifically, when typing in a
> file with too many (from cc mode's perspective) strings, typing lag rose
> to several seconds per press. I filed several bug reports on this both
> here and to Alan. After some time and much heroics we got some
> improvement on this from Alan, but c-sharp already had moved on.
>
> 2: Using separate names for modes.
> The great advantage here is easy to understand. You have no inheritance
> issues, and are free to develop features without regards to legacy. A
> disadvantage is that some users depend on that major mode name for other
> stuff. We had some issues filed with us to flip over to tree-sitter
> completely, because that name (csharp-mode) was so important compared to
> (csharp-tree-sitter-mode). We almost made the change, but then Yuan
> started his work so we waited. This would have sunsetted the cc mode
> almost immediately.
>
> 3: Confusion with where to file bugs
> We have many bugs in c-sharp mode where some things are emacs bugs, some
> things are cc mode bugs, some are tree-sitter bugs and some are our own
> bugs. There is a real issue with understanding cc mode and figuring out
> where a bug fix should end up. It has taken me many weeks worth of
> digging to understand only the simplest mechanisms of cc mode.
> Tree-sitter takes contributors only a couple of hours to be immediately
> productive. To disregard this point with only compatibility with cc
> mode is a huge mistake, IMO.
>
> 4: How do we know what to disable?
> If there's a problem somewhere in the tree-sitter variant of the cc mode
> derived new mode, and we see some issue - who makes the fix? For
> example, previously there was limited support for multiline strings in
> cc mode, which took almost a year to finalize. The tree-sitter variant
> with more performance and accuracy took me maybe 20 minutes in a
> work-meeting. Should a feature that is simple to implement in the
> tree-sitter variant wait for a similar cc mode implementation? The
> namespacing seems to suggest that yes, it should.

I don’t think it should (on which I think we both agree). And I don’t
think it’s any problem if a major mode has some tree-sitter-powered
feature that the non-tree-sitter version doesn’t have.
>
> 5: While tree-sitter is only an engine, it provides a lot more goodies
> We have a huge opportunity to create real new frameworks for emacs now,
> but limiting us to merge the features/modes suggests that we cannot
> reliably do overarching advancements such as we see now in the
> feature/tree-sitter branch. For example, many small hacks I've made in
> the modes I've submitted thus far have made it into general mechanisms in
> treesit.el. All modes that enable tree-sitter should be able to use
> these and all the new ones that come _without_ worrying whether or not some
> issue will crop up from inheriting from cc mode or some other thing.
> Examples are indentation styles, paredit-like functionalities,
> refactorings and more.
>
> 6: What are the goodies that we really need from CC mode?
> CC mode provides indentation and font locking. What else does it
> provide that isn't replaceable pretty quickly? I mean this not as a
> contrarian, but out of real curiosity.

One thing I found, which might be the only thing, is filling,
specifically filling the /* */ style comments while respecting all
styles of drawing stars in these comments. I mean all the styles like

/*
 *
 */

/*=====================================

=======================================*/

Etc, etc. I tried to look at c-mask-paragraph, and it is very
complicated. Maybe we can use c-fill-paragraph without setting up the
rest of cc-mode?

> My guess is that we can get to
> feature parity and well beyond that in a very short amount of time, if
> we're not hindered by merging everything.
>
>
> Sorry for the long mail, but I think we are missing the point by viewing
> tree-sitter simply as an engine to plop in aside cc mode for
> convenience, and not the real infrastructure change it is. There is no
> need to sunset cc mode, but equally there is no need to limit tree-sitter.
>

If mixing cc-mode and tree-sitter brings more problems than merit,
maybe we can adopt a mutually exclusive policy, where a major mode
either sets up cc-mode or uses tree-sitter, but never both together.

>
>> Tree-sitter doesn't (and cannot) replace everything a major mode does
>> for a programming language. So a completely new mode means we throw
>> the baby out with the bathwater.
>
> I don't agree, but I'm very curious as to what else would take a
> significant effort _apart_ from indentation feature parity with cc mode.

Tree-sitter is just a tool; obviously there are things a major mode
provides that don’t involve a parser, e.g., python’s REPL. But I see
no problem putting this feature alongside tree-sitter features in the
same major mode.

>
> One thing I know of is integration with package managers such as what
> elm-mode and go-mode do, but that is an easy fix. The upstream
> go-mode, if not possible to move to core can just derive from a simple
> go-treesit, skip all indentation and font-locking in its own mode, but
> supply the goodies.
>
> --
> Theo
* Re: feature/tree-sitter: Where to Put C/C++ Stuff
From: Theodor Thornhill @ 2022-11-01 9:41 UTC
To: Yuan Fu; +Cc: Eli Zaretskii, emacs-devel, dev

Hi Yuan!

> Before we jump into discussions, I want to note that many of your
> (Theo’s) arguments seem to be against cc-mode rather than “using the
> same major mode”. For major modes that don’t use cc-mode (like
> python-mode), tree-sitter and non-tree-sitter features so far coexist
> just fine.
>

Yes, absolutely, but that's mostly because of the nature of complexity
in cc-mode. For the record - I'm not against cc-mode, I'm actually
pretty impressed. But I'm wary of the consequences of mixing
complexities here. Also I'm not sure if cc-mode "owning" java-mode,
js-mode, c-mode, c++-mode etc makes sense. They are their own
languages, and should maybe live as such.

>> 4: How do we know what to disable?
>> If there's a problem somewhere in the tree-sitter variant of the cc mode
>> derived new mode, and we see some issue - who makes the fix? For
>> example, previously there was limited support for multiline strings in
>> cc mode, which took almost a year to finalize. The tree-sitter variant
>> with more performance and accuracy took me maybe 20 minutes in a
>> work-meeting. Should a feature that is simple to implement in the
>> tree-sitter variant wait for a similar cc mode implementation? The
>> namespacing seems to suggest that yes, it should.
>
> I don’t think it should (on which I think we both agree). And I don’t
> think it’s any problem if a major mode has some tree-sitter-powered
> feature that the non-tree-sitter version doesn’t have.
>

I agree. For example I'm all for some variant of what we're doing in
js-mode. I think we're still not there, but mixing _can_ be done.

>> 6: What are the goodies that we really need from CC mode?
>> CC mode provides indentation and font locking.
>> What else does it
>> provide that isn't replaceable pretty quickly? I mean this not as a
>> contrarian, but out of real curiosity.
>
> One thing I found, which might be the only thing, is filling,
> specifically filling the /* */ style comments while respecting all
> styles of drawing stars in these comments. I mean all the styles like
>
> /*
>  *
>  */
>
> /*=====================================
>
> =======================================*/
>
> Etc, etc. I tried to look at c-mask-paragraph, and it is very
> complicated. Maybe we can use c-fill-paragraph without setting up the
> rest of cc-mode?

Yes, this is true. Either we can see if it's possible to reuse, or we
can roll our own down the line. For the record, many things that try
to use fill, even in cc mode, aren't 100%. I doubt that using only
parts of cc mode is really feasible without it bleeding in other
places, but I don't have expertise to judge that alone.

>
>> My guess is that we can get to
>> feature parity and well beyond that in a very short amount of time, if
>> we're not hindered by merging everything.
>>
>>
>> Sorry for the long mail, but I think we are missing the point by viewing
>> tree-sitter simply as an engine to plop in aside cc mode for
>> convenience, and not the real infrastructure change it is. There is no
>> need to sunset cc mode, but equally there is no need to limit tree-sitter.
>>
>
> If mixing cc-mode and tree-sitter brings more problems than merit,
> maybe we can adopt a mutually exclusive policy, where a major mode
> either sets up cc-mode or uses tree-sitter, but never both together.
>

This is my hope :-)

>>
>>> Tree-sitter doesn't (and cannot) replace everything a major mode does
>>> for a programming language. So a completely new mode means we throw
>>> the baby out with the bathwater.
>>
>> I don't agree, but I'm very curious as to what else would take a
>> significant effort _apart_ from indentation feature parity with cc mode.
>
> Tree-sitter is just a tool; obviously there are things a major mode
> provides that don’t involve a parser, e.g., python’s REPL. But I see
> no problem putting this feature alongside tree-sitter features in the
> same major mode.
>

I agree with you here as well.

--
Theo
* Re: feature/tree-sitter: Where to Put C/C++ Stuff
From: Eli Zaretskii @ 2022-11-01 9:57 UTC
To: Theodor Thornhill; +Cc: emacs-devel, dev, emacs-devel

> From: Theodor Thornhill <theo@thornhill.no>
> Cc: emacs-devel@gnu.org, dev@rjt.dev, emacs-devel@gnu.org
> Date: Tue, 01 Nov 2022 08:55:44 +0100
>
> Yes, well, partially. I think that we are too likely to create unwanted
> issues by merging the two too closely.

Then we should merge them "not too closely", I guess. The challenge
is to merge them so that we gain the most and lose the least.

> 1: Use CC mode for one thing and tree-sitter for the rest
> While first implementing tree-sitter in c-sharp mode we tried just
> applying font-locking, and use cc mode for indentation and the rest.
> What happened was that we immediately inherited the performance issues
> from cc mode straight into our code.

If those same performance issues exist today, then we don't lose
anything, do we? We just gain less than we could. But the amount of
work required for rewriting the other parts of CC Mode is huge, and we
don't want to leave users of CC Mode in a dilemma whether to switch to
a new mode and lose everything else for a significant amount of time,
or give up tree-sitter and stay with CC Mode. Not something I'd agree
to.

I also have a hard time believing that you can reimplement those slow
parts of CC Mode to be much faster, but if you have code to show which
does that, I'm sure I'd be interested to look at it and consider
improving CC Mode using that code.

> Specifically, when typing in a
> file with too many (from cc mode's perspective) strings, typing lag rose
> to several seconds per press. I filed several bug reports on this both
> here and to Alan.
> After some time and much heroics we got some
> improvement on this from Alan, but c-sharp already had moved on.

I don't know what c-sharp mode does besides fontification and
indentation, but CC Mode does a lot more, see below. If you
disregarded a significant part of that, or if it is not relevant for
editing C# code, then your particular experience is not very
educational for the purposes of this discussion, and could lead us to
wrong conclusions.

It is trivially correct that a new mode can move much faster and make
breaking changes, but this is unacceptable for a mode that comes with
Emacs. We respect our users much more than 3rd-party packages out
there do, and we do that for good reasons.

> 2: Using separate names for modes.
> The great advantage here is easy to understand. You have no inheritance
> issues, and are free to develop features without regards to legacy. A
> disadvantage is that some users depend on that major mode name for other
> stuff.

That's a _huge_ disadvantage, in my book.

> 3: Confusion with where to file bugs

Not relevant in our case: the bugs should be filed with Emacs.

> 4: How do we know what to disable?
> If there's a problem somewhere in the tree-sitter variant of the cc mode
> derived new mode, and we see some issue - who makes the fix?

Also not relevant: the answer is "we the Emacs project make the fix".

> 5: While tree-sitter is only an engine, it provides a lot more goodies
> We have a huge opportunity to create real new frameworks for emacs now,
> but limiting us to merge the features/modes suggests that we cannot
> reliably do overarching advancements such as we see now in the
> feature/tree-sitter branch.

Yes. And trying to make breaking changes in important Emacs features
such as CC Mode is really a non-starter. It isn't going to happen.

> 6: What are the goodies that we really need from CC mode?
> CC mode provides indentation and font locking. What else does it
> provide that isn't replaceable pretty quickly?
> I mean this not as a
> contrarian, but out of real curiosity.

CC Mode has a full-blown manual, where this question is answered.
Here's a partial list of features outside of the fontification and
indentation area, which I collected just by looking at the top-level
menus of that manual:

  . filling and breaking text in comments and strings
  . automatic insertion of newlines after braces, colons, commas, semi-colons
  . whitespace cleanups
  . minor modes: electric, hungry-delete, comment-style
  . c-offsets-alist and interactive indentation customization (related
    to indentation, but still extremely important, and not directly in
    tree-sitter)

> My guess is that we can get to feature parity and well beyond that
> in a very short amount of time, if we're not hindered by merging
> everything.

As they say, "show me the code". If you can write up a C/C++ mode
from scratch which supports most everything in the CC Mode manual, do
it better/cleaner than CC Mode does, and do it before the emacs-29
branch is cut, in a month or so, I might change my mind.

> Sorry for the long mail, but I think we are missing the point by viewing
> tree-sitter simply as an engine to plop in aside cc mode for
> convenience, and not the real infrastructure change it is.

Who said we view tree-sitter that way? What actually happens is that
we gradually introduce tree-sitter as an engine for replacing the
implementation of Emacs features where it is faster and/or better.
That is the plan. There's no limit to these replacements, except what
tree-sitter can do and how we can use that. But one thing we will NOT
do is throw away existing important features before we have equivalent
replacements and before users tell us the replacements are indeed
better.

> There is no need to sunset cc mode, but equally there is no need to
> limit tree-sitter.

There are no limits.
The fact that we use tree-sitter for what we use it
now is just because _we_ decided to do that initially, in order to
have it in Emacs 29 as a useful infrastructure that users can take
advantage of. I don't believe in releasing Emacs with infrastructure
that has no user-level features built on it.

> > Tree-sitter doesn't (and cannot) replace everything a major mode does
> > for a programming language. So a completely new mode means we throw
> > the baby out with the bathwater.
>
> I don't agree, but I'm very curious as to what else would take a
> significant effort _apart_ from indentation feature parity with cc mode.

See above: just read the CC Mode manual, and see for yourself.
* Re: feature/tree-sitter: Where to Put C/C++ Stuff
From: Theodor Thornhill @ 2022-11-01 11:53 UTC
To: Eli Zaretskii; +Cc: emacs-devel, dev, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Theodor Thornhill <theo@thornhill.no>
>> Cc: emacs-devel@gnu.org, dev@rjt.dev, emacs-devel@gnu.org
>> Date: Tue, 01 Nov 2022 08:55:44 +0100
>>
>> Yes, well, partially. I think that we are too likely to create unwanted
>> issues by merging the two too closely.
>
> Then we should merge them "not too closely", I guess. The challenge
> is to merge them so that we gain the most and lose the least.
>

That is reasonable. It's just the sentiment that we should do a
full-on merge between tree-sitter and cc mode I don't like. If we can
find a way to blend and still keep them distinct, we are on the
correct path. I don't have a clear solution, I'm afraid. Personally
I like how I did it in ts-mode, where we fall back to cc mode if we
cannot enable tree-sitter. That's not as easy an option for, e.g.,
java because java already exists. So some code has to end up in
cc-mode, unless we make separate modes.

>> 1: Use CC mode for one thing and tree-sitter for the rest
>> While first implementing tree-sitter in c-sharp mode we tried just
>> applying font-locking, and use cc mode for indentation and the rest.
>> What happened was that we immediately inherited the performance issues
>> from cc mode straight into our code.
>
> If those same performance issues exist today, then we don't lose
> anything, do we? We just gain less than we could. But the amount of
> work required for rewriting the other parts of CC Mode is huge, and we
> don't want to leave users of CC Mode in a dilemma whether to switch to
> a new mode and lose everything else for a significant amount of time,
> or give up tree-sitter and stay with CC Mode.
> Not something I'd agree
> to.
>

That is also reasonable.

> I also have a hard time believing that you can reimplement those slow
> parts of CC Mode to be much faster, but if you have code to show which
> does that, I'm sure I'd be interested to look at it and consider
> improving CC Mode using that code.
>

You'd be surprised.

- https://github.com/emacs-csharp/csharp-mode/pull/251
- https://github.com/emacs-csharp/csharp-mode/issues/207
- https://github.com/emacs-csharp/csharp-mode/issues/164
- https://debbugs.gnu.org/db/43/43631.html
- https://github.com/emacs-csharp/csharp-mode/issues/151
- https://github.com/emacs-csharp/csharp-mode/issues/200

All of these are solved with [0], no implementation needed for
anything (apart from generic tree-sitter machinery of course).

>> Specifically, when typing in a
>> file with too many (from cc mode's perspective) strings, typing lag rose
>> to several seconds per press. I filed several bug reports on this both
>> here and to Alan. After some time and much heroics we got some
>> improvement on this from Alan, but c-sharp already had moved on.
>
> I don't know what c-sharp mode does besides fontification and
> indentation, but CC Mode does a lot more, see below. If you
> disregarded a significant part of that, or if it is not relevant for
> editing C# code, then your particular experience is not very
> educational for the purposes of this discussion, and could lead us to
> wrong conclusions.
>
> It is trivially correct that a new mode can move much faster and make
> breaking changes, but this is unacceptable for a mode that comes with
> Emacs. We respect our users much more than 3rd-party packages out
> there do, and we do that for good reasons.
>

I don't believe I disregard much here. Yes it is trivially correct,
but I've spent a lot of time to improve on the c#-cc-mode support, for
the same reasons you mention.

>> 2: Using separate names for modes.
>> The great advantage here is easy to understand.
>> You have no inheritance
>> issues, and are free to develop features without regards to legacy. A
>> disadvantage is that some users depend on that major mode name for other
>> stuff.
>
> That's a _huge_ disadvantage, in my book.
>

Yes, I agree.

>> 3: Confusion with where to file bugs
>
> Not relevant in our case: the bugs should be filed with Emacs.
>

Well, are you sure? Diagnosing a bug and its origin is as important
as actually writing the code. Trying to make that diagnosing step
easier isn't worthless. Even though all bugs end up in Emacs, the
likelihood that some casual reader of this list submits some queries
and a function to tree-sitter is _much_ bigger than almost anyone on
this list trying to grok cc.

>> 4: How do we know what to disable?
>> If there's a problem somewhere in the tree-sitter variant of the cc mode
>> derived new mode, and we see some issue - who makes the fix?
>
> Also not relevant: the answer is "we the Emacs project make the fix".
>

Sure, but we want as many as possible to be able to fix them, no?

>> 5: While tree-sitter is only an engine, it provides a lot more goodies
>> We have a huge opportunity to create real new frameworks for emacs now,
>> but limiting us to merge the features/modes suggests that we cannot
>> reliably do overarching advancements such as we see now in the
>> feature/tree-sitter branch.
>
> Yes. And trying to make breaking changes in important Emacs features
> such as CC Mode is really a non-starter. It isn't going to happen.
>

Ok. Let me be clear. I'm not suggesting breaking changes. I'm not
saying that CC mode should go. I agree with you here. I'm trying to
be mindful with how, and offering some real, hard-won experiences in
this exact tree-sitter/cc-mode gap. It is trivially easy to say that
we should just add it to cc mode, not so much to know what some of the
hidden issues are.

>> 6: What are the goodies that we really need from CC mode?
>> CC mode provides indentation and font locking.
What else does it >> provide that isn't replaceable pretty quickly? I mean this not as a >> contrarian, but out of real curiosity. > > CC Mode has a full-blown manual, where this question is answered. > Here's a partial list of features outside of the fontification and > indentation area, which I collected just by looking at the top-level > menus of that manual: > > . filling and breaking text in comments and strings > . automatic insertion of newlines after braces, colons, commas, semi-colons > . whitespace cleanups > . minor modes: electric, hungry-delete, comment-style > . c-offsets-alist and interactive indentation customization (related > to indentation, but still extremely important, and not directly in > tree-sitter) > Yes, I've read the manual many times. Filling is one nice thing, agreed. electric, hungry-delete is just sitting there waiting for us to create a framework using tree-sitter that would benefit _all_ languages supported by tree-sitter, not just cc. >> My guess is that we can get to feature parity and well beyond that >> in a very short amount of time, if we're not hindered by merging >> everything. > > As they say, "show me the code". If you can write up a C/C++ mode > from scratch which supports most everything in the CC Mode manual, do > it better/cleaner than CC Mode does, and do it before the emacs-29 > branch is cut, in a month or so, I might change my mind. > Challenge accepted. Can I create it for java, which is a language I'm writing a lot in these days? It would be simpler for me to test with stuff I use daily, but still very much related to CC mode functionality. I can branch out from feature/tree-sitter and create progmodes/java-ts-mode.el in scratch/tree-sitter/java, then we can decide if some variant of it should be merged in to tree-sitter before the branch is cut. What do you think? If so, it would be nice to be able to commit myself to simplify rebasing/merging with feature/tree-sitter, and also not littering Yuan with reviews. 
>> Sorry for the long mail, but I think we are missing the point by viewing >> tree-sitter simply as an engine to plop in aside cc mode for >> convenience, and not the real infrastructure change it is. > > Who said we view tree-sitter that way? > > What actually happens is that we gradually introduce tree-sitter as an > engine for replacing the implementation of Emacs features where it is > faster and/or better. That is the plan. There's no limit to these > replacements, except what tree-sitter can do and how we can use that. > But one thing we will NOT do is throw away existing important features > before we have equivalent replacements and before users tell us the > replacements are indeed better. > Yes, I don't disagree and never said we should. If I did then I misspoke. >> There is no need to sunset cc mode, but equally there is no need to >> limit tree-sitter. > > There's no limits. The fact that we use tree-sitter for what we use > it now is just because _we_ decided to do that initially, in order to > have it in Emacs 29 as a useful infrastructure that users can take > advantage of. I don't believe in releasing Emacs with infrastructure > that has no user-level features built on it. > Which is why I am trying to create some actual, useful modes for us for the merge. >> > Tree-sitter doesn't (and cannot) replace everything a major mode does >> > for a programming language. So a completely new mode means we through >> > the baby with the bathwater. >> >> I don't agree, but I'm very curious to what else would take a >> significant effort _apart_ from indentation feature parity with cc mode is. > > See above: just read the CC Mode manual, and see for yourself. I have, many times :-) -- Theo [0]: https://github.com/emacs-csharp/csharp-mode/blob/master/csharp-tree-sitter.el#L69-L78 ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: feature/tree-sitter: Where to Put C/C++ Stuff 2022-11-01 11:53 ` Theodor Thornhill @ 2022-11-01 12:28 ` Eli Zaretskii 2022-11-01 13:05 ` Theodor Thornhill ` (2 more replies) 0 siblings, 3 replies; 28+ messages in thread From: Eli Zaretskii @ 2022-11-01 12:28 UTC (permalink / raw) To: Theodor Thornhill; +Cc: emacs-devel, dev, emacs-devel > From: Theodor Thornhill <theo@thornhill.no> > Cc: emacs-devel@gnu.org, dev@rjt.dev, emacs-devel@gnu.org > Date: Tue, 01 Nov 2022 12:53:11 +0100 > > > I also have hard time believing that you can reimplement those slow > > parts of CC Mode to be much faster, but if you have code to show which > > does that, I'm sure I'd be interested to look at it and consider > > improving CC Mode using that code. > > > > You'd be surprised. > > - https://github.com/emacs-csharp/csharp-mode/pull/251 > - https://github.com/emacs-csharp/csharp-mode/issues/207 > - https://github.com/emacs-csharp/csharp-mode/issues/164 > - https://debbugs.gnu.org/db/43/43631.html > - https://github.com/emacs-csharp/csharp-mode/issues/151 > - https://github.com/emacs-csharp/csharp-mode/issues/200 > > All of these are solved with [0], no implementation needed for anything > (apart from generic tree-sitter machinery of course). That's for C#, not for C/C++. But if you can do the same for C/C++, sure, let's see the code and judge its relative merits and demerits. > >> 3: Confusion with where to file bugs > > > > Not relevant in our case: the bugs should be filed with Emacs. > > > > Well, are you sure? You asked where to file the bugs. The answer is: on debbugs. If it eventually turns out the bug is in tree-sitter, we will file a bug there. Just like we do with any other library we use. Nothing new here, IMO. > > . filling and breaking text in comments and strings > > . automatic insertion of newlines after braces, colons, commas, semi-colons > > . whitespace cleanups > > . minor modes: electric, hungry-delete, comment-style > > . 
c-offsets-alist and interactive indentation customization (related > > to indentation, but still extremely important, and not directly in > > tree-sitter) > > > > Yes, I've read the manual many times. Filling is one nice thing, > agreed. electric, hungry-delete is just sitting there waiting for us to > create a framework using tree-sitter that would benefit _all_ languages > supported by tree-sitter, not just cc. If tree-sitter can make these easier or faster or better, I see no reason not to use tree-sitter for (some of) those features as well. There's no decision to limit tree-sitter's use to fontification and indentation, and I don't think we will ever make such decisions, except if we have some bitter experience. > > As they say, "show me the code". If you can write up a C/C++ mode > > from scratch which supports most everything in the CC Mode manual, do > > it better/cleaner than CC Mode does, and do it before the emacs-29 > > branch is cut, in a month or so, I might change my mind. > > Challenge accepted. Can I create it for java, which is a language I'm > writing a lot in these days? Sorry, no. It has to support all the languages supported by CC Mode now. That's the challenge. It is fine by me to have a separate java-mode, but then I personally will not be very interested in this, since editing the Emacs C code, which I do a lot, will still need to use CC Mode. Without decent support for C/C++, CC Mode cannot be retired. (Do people really use Emacs to develop Java? I'd be surprised.) ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: feature/tree-sitter: Where to Put C/C++ Stuff 2022-11-01 12:28 ` Eli Zaretskii @ 2022-11-01 13:05 ` Theodor Thornhill 2022-11-01 13:10 ` Eli Zaretskii 2022-11-01 13:12 ` Manuel Uberti 2022-11-04 14:49 ` Benjamin Riefenstahl 2 siblings, 1 reply; 28+ messages in thread From: Theodor Thornhill @ 2022-11-01 13:05 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel, dev On 1 November 2022 13:28:05 CET, Eli Zaretskii <eliz@gnu.org> wrote: >> From: Theodor Thornhill <theo@thornhill.no> >> Cc: emacs-devel@gnu.org, dev@rjt.dev, emacs-devel@gnu.org >> Date: Tue, 01 Nov 2022 12:53:11 +0100 >> >> > I also have hard time believing that you can reimplement those slow >> > parts of CC Mode to be much faster, but if you have code to show which >> > does that, I'm sure I'd be interested to look at it and consider >> > improving CC Mode using that code. >> > >> >> You'd be surprised. >> >> - https://github.com/emacs-csharp/csharp-mode/pull/251 >> - https://github.com/emacs-csharp/csharp-mode/issues/207 >> - https://github.com/emacs-csharp/csharp-mode/issues/164 >> - https://debbugs.gnu.org/db/43/43631.html >> - https://github.com/emacs-csharp/csharp-mode/issues/151 >> - https://github.com/emacs-csharp/csharp-mode/issues/200 >> >> All of these are solved with [0], no implementation needed for anything >> (apart from generic tree-sitter machinery of course). > >That's for C#, not for C/C++. > >But if you can do the same for C/C++, sure, let's see the code and >judge its relative merits and demerits. > >> >> 3: Confusion with where to file bugs >> > >> > Not relevant in our case: the bugs should be filed with Emacs. >> > >> >> Well, are you sure? > >You asked where to file the bugs. The answer is: on debbugs. If it >eventually turns out the bug is in tree-sitter, we will file a bug >there. Just like we do with any other library we use. Nothing new >here, IMO. > >> > . filling and breaking text in comments and strings >> > . 
automatic insertion of newlines after braces, colons, commas, semi-colons >> > . whitespace cleanups >> > . minor modes: electric, hungry-delete, comment-style >> > . c-offsets-alist and interactive indentation customization (related >> > to indentation, but still extremely important, and not directly in >> > tree-sitter) >> > >> >> Yes, I've read the manual many times. Filling is one nice thing, >> agreed. electric, hungry-delete is just sitting there waiting for us to >> create a framework using tree-sitter that would benefit _all_ languages >> supported by tree-sitter, not just cc. > >If tree-sitter can make these easier or faster or better, I see no >reason not to use tree-sitter for (some of) those features as well. >There's no decision to limit tree-sitter's use to fontification and >indentation, and I don't think we will ever make such decisions, >except if we have some bitter experience. > >> > As they say, "show me the code". If you can write up a C/C++ mode >> > from scratch which supports most everything in the CC Mode manual, do >> > it better/cleaner than CC Mode does, and do it before the emacs-29 >> > branch is cut, in a month or so, I might change my mind. >> >> Challenge accepted. Can I create it for java, which is a language I'm >> writing a lot in these days? > >Sorry, no. It has to support all the languages supported by CC Mode >now. That's the challenge. > Ok let's do it. But let's restrict it to languages considered stable from https://tree-sitter.github.io/tree-sitter/#available-parsers - c - c++ - c# - java - javascript - typescript - json Ok? >It is fine by me to have a separate java-mode, but then I personally >will not be very interested in this, since editing the Emacs C code, >which I do a lot, will still need to use CC Mode. Without decent >support for C/C++, CC Mode cannot be retired. > >(Do people really use Emacs to develop Java? I'd be surprised.) Yes. I do, no problem Theo ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: feature/tree-sitter: Where to Put C/C++ Stuff 2022-11-01 13:05 ` Theodor Thornhill @ 2022-11-01 13:10 ` Eli Zaretskii 2022-11-01 13:27 ` Theodor Thornhill 2022-11-01 16:09 ` tomas 0 siblings, 2 replies; 28+ messages in thread From: Eli Zaretskii @ 2022-11-01 13:10 UTC (permalink / raw) To: Theodor Thornhill; +Cc: emacs-devel, dev > Date: Tue, 01 Nov 2022 14:05:39 +0100 > From: Theodor Thornhill <theo@thornhill.no> > CC: emacs-devel@gnu.org, dev@rjt.dev > > >> Challenge accepted. Can I create it for java, which is a language I'm > >> writing a lot in these days? > > > >Sorry, no. It has to support all the languages supported by CC Mode > >now. That's the challenge. > > > > Ok let's do it. But let's restrict it to languages considered stable from https://tree-sitter.github.io/tree-sitter/#available-parsers > > - c > - c++ > - c# > - java > - javascript > - typescript > - json > > Ok? You mean, C, C++, and Java? Yes, SGTM. That'd leave Objective C, IDL, Awk, and Pike out. > >(Do people really use Emacs to develop Java? I'd be surprised.) > > Yes. I do, no problem I meant the stuff that's missing in Emacs which is present in any decent Java IDE. Maybe you use Emacs for Java with many add-on packages? ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: feature/tree-sitter: Where to Put C/C++ Stuff 2022-11-01 13:10 ` Eli Zaretskii @ 2022-11-01 13:27 ` Theodor Thornhill 2022-11-01 13:49 ` Eli Zaretskii 2022-11-01 16:09 ` tomas 1 sibling, 1 reply; 28+ messages in thread From: Theodor Thornhill @ 2022-11-01 13:27 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel, dev On 1 November 2022 14:10:43 CET, Eli Zaretskii <eliz@gnu.org> wrote: >> Date: Tue, 01 Nov 2022 14:05:39 +0100 >> From: Theodor Thornhill <theo@thornhill.no> >> CC: emacs-devel@gnu.org, dev@rjt.dev >> >> >> Challenge accepted. Can I create it for java, which is a language I'm >> >> writing a lot in these days? >> > >> >Sorry, no. It has to support all the languages supported by CC Mode >> >now. That's the challenge. >> > >> >> Ok let's do it. But let's restrict it to languages considered stable from https://tree-sitter.github.io/tree-sitter/#available-parsers >> >> - c >> - c++ >> - c# >> - java >> - javascript >> - typescript >> - json >> >> Ok? > >You mean, C, C++, and Java? Yes, SGTM. That'd leave Objective C, >IDL, Awk, and Pike out. > Yes, they have no parser apart from objc, which is in development. >> >(Do people really use Emacs to develop Java? I'd be surprised.) >> >> Yes. I do, no problem > >I meant the stuff that's missing in Emacs which is present in any >decent Java IDE. Maybe you use Emacs for Java with many add-on >packages? Nope. Eglot. That's it. Some details and integration are not as nice, but it's 80% there ootb. A little config gets me 95% there. Should I begin? I understand there's no obligation from anyone to accept this, but I think it's worth a shot. Can it live in a scratch branch? Theo ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: feature/tree-sitter: Where to Put C/C++ Stuff 2022-11-01 13:27 ` Theodor Thornhill @ 2022-11-01 13:49 ` Eli Zaretskii 2022-11-01 13:54 ` Theodor Thornhill 0 siblings, 1 reply; 28+ messages in thread From: Eli Zaretskii @ 2022-11-01 13:49 UTC (permalink / raw) To: Theodor Thornhill; +Cc: emacs-devel, dev > Date: Tue, 01 Nov 2022 14:27:19 +0100 > From: Theodor Thornhill <theo@thornhill.no> > CC: emacs-devel@gnu.org, dev@rjt.dev > > Should I begin? Yes, please. > I understand there's no obligation from anyone to accept this, but I think it's worth a shot. Can it live in a scratch branch? I'd prefer the tree-sitter branch. Why make one more? ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: feature/tree-sitter: Where to Put C/C++ Stuff 2022-11-01 13:49 ` Eli Zaretskii @ 2022-11-01 13:54 ` Theodor Thornhill 2022-11-01 14:03 ` Eli Zaretskii 0 siblings, 1 reply; 28+ messages in thread From: Theodor Thornhill @ 2022-11-01 13:54 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel, dev Eli Zaretskii <eliz@gnu.org> writes: >> Date: Tue, 01 Nov 2022 14:27:19 +0100 >> From: Theodor Thornhill <theo@thornhill.no> >> CC: emacs-devel@gnu.org, dev@rjt.dev >> >> Should I begin? > > Yes, please. > >> I understand there's no obligation from anyone to accept this, but I think it's worth a shot. Can it live in a scratch branch? > > I'd prefer the tree-sitter branch. Why make one more? No objections here. I was just trying to enable us to more easily reject without messing with git history. Can I get push access? -- Theo ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: feature/tree-sitter: Where to Put C/C++ Stuff 2022-11-01 13:54 ` Theodor Thornhill @ 2022-11-01 14:03 ` Eli Zaretskii 2022-11-01 14:12 ` Theodor Thornhill 0 siblings, 1 reply; 28+ messages in thread From: Eli Zaretskii @ 2022-11-01 14:03 UTC (permalink / raw) To: Theodor Thornhill; +Cc: emacs-devel, dev > From: Theodor Thornhill <theo@thornhill.no> > Cc: emacs-devel@gnu.org, dev@rjt.dev > Date: Tue, 01 Nov 2022 14:54:14 +0100 > > Can I get push access? Is it a necessary condition? ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: feature/tree-sitter: Where to Put C/C++ Stuff 2022-11-01 14:03 ` Eli Zaretskii @ 2022-11-01 14:12 ` Theodor Thornhill 0 siblings, 0 replies; 28+ messages in thread From: Theodor Thornhill @ 2022-11-01 14:12 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel, dev On 1 November 2022 15:03:43 CET, Eli Zaretskii <eliz@gnu.org> wrote: >> From: Theodor Thornhill <theo@thornhill.no> >> Cc: emacs-devel@gnu.org, dev@rjt.dev >> Date: Tue, 01 Nov 2022 14:54:14 +0100 >> >> Can I get push access? > >Is it a necessary condition? No of course not, but it's simpler. I can work around it, but it'll be slower :-) Theo ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: feature/tree-sitter: Where to Put C/C++ Stuff 2022-11-01 13:10 ` Eli Zaretskii 2022-11-01 13:27 ` Theodor Thornhill @ 2022-11-01 16:09 ` tomas 1 sibling, 0 replies; 28+ messages in thread From: tomas @ 2022-11-01 16:09 UTC (permalink / raw) To: emacs-devel On Tue, Nov 01, 2022 at 03:10:43PM +0200, Eli Zaretskii wrote: [...] > > >(Do people really use Emacs to develop Java? I'd be surprised.) > > > > Yes. I do, no problem > > I meant the stuff that's missing in Emacs which is present in any > decent Java IDE. Maybe you use Emacs for Java with many add-on > packages? It's a while ago, but I did participate in a Java project, and Emacs was absolutely fine back then. No add-ons. Now I know this isn't everyone's "way of working", but it suits me perfectly. Cheers -- t ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: feature/tree-sitter: Where to Put C/C++ Stuff 2022-11-01 12:28 ` Eli Zaretskii 2022-11-01 13:05 ` Theodor Thornhill @ 2022-11-01 13:12 ` Manuel Uberti 2022-11-04 14:49 ` Benjamin Riefenstahl 2 siblings, 0 replies; 28+ messages in thread From: Manuel Uberti @ 2022-11-01 13:12 UTC (permalink / raw) To: Eli Zaretskii, Theodor Thornhill; +Cc: emacs-devel, dev On 01/11/22 13:28, Eli Zaretskii wrote: > (Do people really use Emacs to develop Java? I'd be surprised.) I've not coded in Java professionally for a while, but I still need some Java for work. A combination of LSP (via Eglot), project.el, and shell-mode covers my needs these days without having to run to Eclipse. But again, it's pet projects and proofs of concept nowadays, nothing big or business-related. Still, Emacs is way more useful with Java than it used to be when I first picked it up ~10yrs ago. -- Manuel Uberti https://manueluberti.eu ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: feature/tree-sitter: Where to Put C/C++ Stuff 2022-11-01 12:28 ` Eli Zaretskii 2022-11-01 13:05 ` Theodor Thornhill 2022-11-01 13:12 ` Manuel Uberti @ 2022-11-04 14:49 ` Benjamin Riefenstahl 2022-11-04 16:17 ` Pascal Quesseveur 2 siblings, 1 reply; 28+ messages in thread From: Benjamin Riefenstahl @ 2022-11-04 14:49 UTC (permalink / raw) To: emacs-devel Eli Zaretskii writes: > (Do people really use Emacs to develop Java? I'd be surprised.) Just FTR, I do. And I know some other people that use Emacs for Java, too. I only use core Emacs for this and a few functions that I wrote myself. benny ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: feature/tree-sitter: Where to Put C/C++ Stuff 2022-11-04 14:49 ` Benjamin Riefenstahl @ 2022-11-04 16:17 ` Pascal Quesseveur 0 siblings, 0 replies; 28+ messages in thread From: Pascal Quesseveur @ 2022-11-04 16:17 UTC (permalink / raw) To: emacs-devel >"BR" == Benjamin Riefenstahl <b.riefenstahl@turtle-trading.net> writes: BR> Just FTR, I do. Me too. In the past I used jde (or jdee) but not anymore. I developed some functions to use ant and jswat and I'm pretty happy with the result. -- Pascal Quesseveur pquessev@gmail.com ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: feature/tree-sitter: Where to Put C/C++ Stuff 2022-11-01 7:24 ` Eli Zaretskii 2022-11-01 7:55 ` Theodor Thornhill @ 2022-11-01 13:32 ` Stefan Monnier 2022-11-01 14:02 ` Eli Zaretskii 1 sibling, 1 reply; 28+ messages in thread From: Stefan Monnier @ 2022-11-01 13:32 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Theodor Thornhill, emacs-devel, dev >> I'm no authority on the matter, but I'd love for us not to complicate >> things too much. I vote for separate, non-cc-prefixed _new_ modes, that >> derives from prog-mode. > > That'd mean people will need either to invent all the other goodies in > CC mode (everything except fontifications and indentation) from > scratch, or give up all those other goodies. Does that make sense? I'm a strong proponent of keeping "one mode" but from what I've seen so far, trying to mix tree-sitter with CC-mode's `c-mode`, I agree with Theodor that it might be better to start from scratch :-( I have not looked at other languages in CC-mode, so I don't know if the same should apply to all CC-mode's modes (my guess is that it does, tho). My best hope so far is to: - Rename `c-mode` to `cc-c-mode`. - Make a new `c-mode` which delegates to `cc-c-mode` by default unless the user asked for the "new, tree-sitter based, c-mode" in which case it uses the brand new code base. `cc-c-mode` would still set `major-mode` to `c-mode`, so from the users's point of view there's still only one `c-mode` but the two variants (tree-sitter and CC-mode) are almost completely separate. We should make some effort to avoid users thinking "oh, there's the legacy CC-mode-based c-mode and the shiny new tree-sitter-based C-mode", but rather think "should I stay with the trusty CC-mode-based c-mode, or try the toddler c-mode". > Tree-sitter doesn't (and cannot) replace everything a major mode does > for a programming language. No, indeed. But it's hard to use one part of CC-mode without another. 
One of the great things about CC-mode is how it is all nicely integrated. But that cuts both ways :-( > So a completely new mode means we through the baby with the bathwater. The way I see it is that it will not break backward compatibility, and in the short term it may fail to provide a strict superset of CC-mode's `c-mode` features, but it's still going to be better than mixing the two and then trying to fix the corresponding breakage. > CC Mode has a full-blown manual, where this question is answered. > Here's a partial list of features outside of the fontification and > indentation area, which I collected just by looking at the top-level > menus of that manual: > > . filling and breaking text in comments and strings This should be broken out of CC-mode so that all modes can benefit from it. AFAIK this is the most valuable feature of CC-mode that's sorely missing in our generic infrastructure (lots and lots of other major modes suffer from it, so making it available to all major modes will be a great improvement). > . automatic insertion of newlines after braces, colons, commas, semi-colons This is already provided by `electric-layout-mode`. [ More specifically it's one of the parts of CC-mode which I "broke out of CC-mode so that all modes can benefit from it". Of course, CC-mode doesn't use it, because when you try to implement something to be more generic, you rarely end up with 100% identical behavior; and CC-mode wants to be backward compatible with old Emacsen that don't have `electric-layout-mode`. ] > . whitespace cleanups Not very familiar with this, but I'd be surprised if it wouldn't benefit from "break out of CC-mode so that all modes can benefit from it". > . minor modes: electric, hungry-delete, comment-style "Break out of CC-mode so that all modes can benefit from it". > . 
c-offsets-alist and interactive indentation customization (related > to indentation, but still extremely important, and not directly in > tree-sitter) This is indeed important, but we can't use CC-mode's code for that in any case: it needs to be reimplemented for tree-sitter's indentation. And it'd be better if we could do that without having to worry about backward compatibility with existing CC-mode users's settings (i.e. we're free to cover the same functionality in a different way). Stefan ^ permalink raw reply [flat|nested] 28+ messages in thread
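The dispatch arrangement proposed in the message above (rename the existing mode, then have a thin `c-mode` delegate to one of two separate code bases) could look roughly like the following Emacs Lisp. This is only a minimal sketch under stated assumptions, not code from the thread or from Emacs: every `sketch-` name is hypothetical, both implementations are empty stand-ins derived from `prog-mode`, and the user option that selects the variant is invented for illustration.

```elisp
;; -*- lexical-binding: t; -*-
;; Illustration only.  All `sketch-' names are hypothetical; nothing here
;; is the real `c-mode', CC Mode, or any tree-sitter API.

(defcustom sketch-c-use-tree-sitter nil
  "Non-nil means `sketch-c-mode' uses the tree-sitter based variant."
  :type 'boolean
  :group 'languages)

(define-derived-mode sketch-cc-c-mode prog-mode "C"
  "Stand-in for the existing CC Mode implementation (the renamed mode)."
  ;; Report the user-facing name, so code that inspects `major-mode'
  ;; keeps seeing a single C mode regardless of the variant in use.
  (setq major-mode 'sketch-c-mode))

(define-derived-mode sketch-ts-c-mode prog-mode "C"
  "Stand-in for a separate, tree-sitter based code base."
  (setq major-mode 'sketch-c-mode))

(defun sketch-c-mode ()
  "Enter C editing mode, delegating to one of two implementations."
  (interactive)
  (if sketch-c-use-tree-sitter
      (sketch-ts-c-mode)
    (sketch-cc-c-mode)))
```

With this shape, entries such as `auto-mode-alist` would only ever name the dispatcher, and the two variants share nothing but the reported mode name. The sketch deliberately leaves open points a real implementation would have to settle, for example which mode hook runs and how `derived-mode-p` should treat the dispatcher symbol.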
* Re: feature/tree-sitter: Where to Put C/C++ Stuff 2022-11-01 13:32 ` Stefan Monnier @ 2022-11-01 14:02 ` Eli Zaretskii 2022-11-01 15:09 ` Stefan Monnier 0 siblings, 1 reply; 28+ messages in thread From: Eli Zaretskii @ 2022-11-01 14:02 UTC (permalink / raw) To: Stefan Monnier; +Cc: theo, emacs-devel, dev > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: Theodor Thornhill <theo@thornhill.no>, emacs-devel@gnu.org, dev@rjt.dev > Date: Tue, 01 Nov 2022 09:32:18 -0400 > > My best hope so far is to: > > - Rename `c-mode` to `cc-c-mode`. > - Make a new `c-mode` which delegates to `cc-c-mode` by default unless > the user asked for the "new, tree-sitter based, c-mode" in which case > it uses the brand new code base. > > `cc-c-mode` would still set `major-mode` to `c-mode`, so from the users's > point of view there's still only one `c-mode` but the two variants > (tree-sitter and CC-mode) are almost completely separate. > > We should make some effort to avoid users thinking "oh, there's the > legacy CC-mode-based c-mode and the shiny new tree-sitter-based C-mode", > but rather think "should I stay with the trusty CC-mode-based c-mode, or > try the toddler c-mode". > > > Tree-sitter doesn't (and cannot) replace everything a major mode does > > for a programming language. > > No, indeed. But it's hard to use one part of CC-mode without another. > One of the great things about CC-mode is how it is all > nicely integrated. But that cuts both ways :-( > > > So a completely new mode means we through the baby with the bathwater. > > The way I see it is that it will not break backward compatibility, and > in the short term it may fail to provide a strict superset of CC-mode's > `c-mode` features, but it's still going to be better than mixing the two > and then trying to fix the corresponding breakage. > > > CC Mode has a full-blown manual, where this question is answered. 
> > Here's a partial list of features outside of the fontification and > > indentation area, which I collected just by looking at the top-level > > menus of that manual: > > > > . filling and breaking text in comments and strings > > This should be broken out of CC-mode so that all modes can benefit from it. > AFAIK this is the most valuable feature of CC-mode that's sorely missing > in our generic infrastructure (lots and lots of other major modes suffer > from it, so making it available to all major modes will be a great > improvement). > > > . automatic insertion of newlines after braces, colons, commas, semi-colons > > This is already provided by `electric-layout-mode`. > [ More specifically it's one of the parts of CC-mode which I > "broke out of CC-mode so that all modes can benefit from it". > Of course, CC-mode doesn't use it, because when you try to > implement something to be more generic, you rarely end up with > 100% identical behavior; and CC-mode wants to be backward compatible > with old Emacsen that don't have `electric-layout-mode`. ] > > > . whitespace cleanups > > Not very familiar with this, but I'd be surprised if it wouldn't benefit > from "break out of CC-mode so that all modes can benefit from it". > > > . minor modes: electric, hungry-delete, comment-style > > "Break out of CC-mode so that all modes can benefit from it". > > > . c-offsets-alist and interactive indentation customization (related > > to indentation, but still extremely important, and not directly in > > tree-sitter) > > This is indeed important, but we can't use CC-mode's code for that in > any case: it needs to be reimplemented for tree-sitter's indentation. > And it'd be better if we could do that without having to worry about > backward compatibility with existing CC-mode users's settings > (i.e. we're free to cover the same functionality in a different way). Sorry for being blunt, but you've presented a plan for Emacs 32 if not 42. 
If that's what we need, we should first make sure that Theodor (or whoever picks up the gauntlet) will be willing to work on such a branch for that long a time ;-) What _I_ want is to have some decent tree-sitter supported modes in Emacs 29, and I still hope C/C++ editing could benefit from that, in Emacs 29. That calls for a completely different plan, if my experience with Emacs development is of any significance. Bottom line: I don't see how we could make a "revolution" the size you are envisioning in such a short time. Not unless you somehow can summon a team of talented and motivated individuals to work on it starting today. The only practical way I see is by _evolution_, gradually replacing CC Mode's features with tree-sitter supported ones where that makes sense, and at first as opt-in. And yes, this means no "breaking out of CC-mode", at least not as part of this particular effort: it simply is too much, too high a bar to jump. It could well enough kill the effort, for all practical purposes. Of course, I'd be happy to be proven wrong, and be dazzled by a full-fledged, backward-compatible C/C++ mode based on tree-sitter, with all of the stuff you mentioned on top of that, within the month. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: feature/tree-sitter: Where to Put C/C++ Stuff 2022-11-01 14:02 ` Eli Zaretskii @ 2022-11-01 15:09 ` Stefan Monnier 2022-11-01 15:36 ` Theodor Thornhill 2022-11-01 16:43 ` Eli Zaretskii 0 siblings, 2 replies; 28+ messages in thread From: Stefan Monnier @ 2022-11-01 15:09 UTC (permalink / raw) To: Eli Zaretskii; +Cc: theo, emacs-devel, dev > Sorry for being blunt, but you've presented a plan for Emacs 32 if > not 42. Huh? What makes you think that? On the contrary it's a plan that lets us get quickly a working tree-sitter-based C-mode. Not one that's a strict superset of CC-mode's `c-mode`, but a quite decent `c-mode` nevertheless. > Bottom line: I don't see how we could make a "revolution" the size you > are envisioning in such a short time. It's not at all a revolution. It's a very smooth path that breaks nothing and lets us move progressively. It's a mini "revolution" maybe for users who will have to choose between two different flavors of `c-mode`, each one with its current strengths and downsides, but that's the cost to pay for a much smoother job on the implementation. > Not unless you somehow can summon a team of talented and motivated > individuals to work on it starting today. The only practical way > I see is by _evolution_, gradually replacing CC Mode's features with > tree-sitter supported ones where that makes sense, and at first as > opt-in. And yes, this means no "breaking out of CC-mode", at least > not as part of this particular effort: it simply is too much, too high > a bar to jump. It could well enough kill the effort, for all > practical purposes. Slowly evolving CC-mode itself to use tree-sitter is something I can't even begin to imagine how to do. That's what I would expect to take years :-) > Of course, I'd be happy to be proven wrong, and be dazzled by a > full-fledged, backward-compatible C/C++ mode based on tree-sitter, > with all of the stuff you mentioned on top of that, within the month. 
I don't foresee "all of the stuff" to be done immediately, no. [ Tho I do think the filling code at least can be extracted from CC-mode within a month (or at least, an important subset of it). ] Which is why users will have to choose (and we'll stick to CC-mode by default, of course). Stefan ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: feature/tree-sitter: Where to Put C/C++ Stuff
From: Theodor Thornhill @ 2022-11-01 15:36 UTC
To: Stefan Monnier, Eli Zaretskii; +Cc: emacs-devel, dev

On 1 November 2022 16:09:39 CET, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
>> Sorry for being blunt, but you've presented a plan for Emacs 32 if
>> not 42.
>
>Huh?  What makes you think that?
>
>On the contrary it's a plan that lets us get quickly a working
>tree-sitter-based C-mode.  Not one that's a strict superset of CC-mode's
>`c-mode`, but a quite decent `c-mode` nevertheless.
>

No matter what we decide on, I'll make these modes and submit them for
review in a few weeks' time.  I'm no C++ expert, so I'm bound to make
mistakes there, but for the others I think I have an idea of how to do it.

>
>> Not unless you somehow can summon a team of talented and motivated
>> individuals to work on it starting today.  The only practical way
>> I see is by _evolution_, gradually replacing CC Mode's features with
>> tree-sitter supported ones where that makes sense, and at first as
>> opt-in.  And yes, this means no "breaking out of CC-mode", at least
>> not as part of this particular effort: it simply is too much, too high
>> a bar to jump.  It could well enough kill the effort, for all
>> practical purposes.
>

I'll try to prove you wrong.  It seems someone is trying to add it to the
proposed cc-treesit.el, so maybe we can have the cake and eat it too ;-)

>
>I don't foresee "all of the stuff" to be done immediately, no.
>[ Tho I do think the filling code at least can be extracted from CC-mode
>within a month (or at least, an important subset of it). ]
>

I think I'll try to make a tree-sitter-powered auto-fill.

>Which is why users will have to choose (and we'll stick to CC-mode by
>default, of course).
>

Of course.

Theo

* Re: feature/tree-sitter: Where to Put C/C++ Stuff
From: Eli Zaretskii @ 2022-11-01 16:43 UTC
To: Stefan Monnier; +Cc: theo, emacs-devel, dev

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: theo@thornhill.no, emacs-devel@gnu.org, dev@rjt.dev
> Date: Tue, 01 Nov 2022 11:09:39 -0400
>
> > Sorry for being blunt, but you've presented a plan for Emacs 32 if
> > not 42.
>
> Huh?  What makes you think that?

A bit of gray hair, nothing more.

> On the contrary it's a plan that lets us get quickly a working
> tree-sitter-based C-mode.  Not one that's a strict superset of CC-mode's
> `c-mode`, but a quite decent `c-mode` nevertheless.

I just disagree with the "quickly" part, that's all.

> I don't foresee "all of the stuff" to be done immediately, no.

So we basically agree.

> [ Tho I do think the filling code at least can be extracted from CC-mode
> within a month (or at least, an important subset of it). ]

Let's see.  I'll be happy if that happens.

* Re: feature/tree-sitter: Where to Put C/C++ Stuff
From: João Távora @ 2022-11-02 20:43 UTC
To: Theodor Thornhill; +Cc: Randy Taylor, emacs-devel

On Tue, Nov 1, 2022, 05:46 Theodor Thornhill <theo@thornhill.no> wrote:

> On 1 November 2022 03:30:54 CET, Randy Taylor <dev@rjt.dev> wrote:
> >Hi.
> >
> >Where specifically should the C and C++ tree-sitter stuff go? I've been
> >using it for a couple months and would like to upstream syntax
> >highlighting for both. I'll focus on getting C done first.
> >
> >I see there are a lot of cc- files; would it be appropriate to add the
> >tree-sitter stuff into a new cc-treesit.el file?
> >Thanks.
>
> I'm no authority on the matter, but I'd love for us not to complicate
> things too much. I vote for separate, non-cc-prefixed _new_ modes, that
> derives from prog-mode.
>
> I understand that this is a controversial opinion, but that's what I want.
> I believe people will do that anyway if we don't.

+1

João

* Re: feature/tree-sitter: Where to Put C/C++ Stuff
From: Eli Zaretskii @ 2022-11-01 7:20 UTC
To: Randy Taylor, Alan Mackenzie; +Cc: emacs-devel

> Date: Tue, 01 Nov 2022 02:30:54 +0000
> From: Randy Taylor <dev@rjt.dev>
>
> Where specifically should the C and C++ tree-sitter stuff go? I've been using it for a couple months and would
> like to upstream syntax highlighting for both. I'll focus on getting C done first.
>
> I see there are a lot of cc- files; would it be appropriate to add the tree-sitter stuff into a new cc-treesit.el file?

I suggest a separate cc-*.el file (e.g., cc-treesit.el), and some user
option to trigger its use instead of (or maybe in addition to, as the
case may be) the equivalent CC mode stuff.

Alan, are you okay with this approach?

* Re: feature/tree-sitter: Where to Put C/C++ Stuff
From: Alan Mackenzie @ 2022-11-01 12:10 UTC
To: Eli Zaretskii; +Cc: Randy Taylor, emacs-devel

Hello, Eli.

On Tue, Nov 01, 2022 at 09:20:33 +0200, Eli Zaretskii wrote:
> > Date: Tue, 01 Nov 2022 02:30:54 +0000
> > From: Randy Taylor <dev@rjt.dev>

> > Where specifically should the C and C++ tree-sitter stuff go? I've been using it for a couple months and would
> > like to upstream syntax highlighting for both. I'll focus on getting C done first.

> > I see there are a lot of cc- files; would it be appropriate to add the tree-sitter stuff into a new cc-treesit.el file?

> I suggest a separate cc-*.el file (e.g., cc-treesit.el), and some user
> option to trigger its use instead of (or maybe in addition to, as the
> case may be) the equivalent CC mode stuff.

> Alan, are you okay with this approach?

Yes, certainly.  It is the approach I would have chosen myself.  The key
sequence C-c C-t is currently unused in CC Mode, and it would seem ideal
to toggle tree-sitter with.

-- 
Alan Mackenzie (Nuremberg, Germany).

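[ Illustration, not part of the original messages.  The arrangement Eli
and Alan agree on here (a user option that selects the tree-sitter code,
plus C-c C-t as a toggle) could look roughly like the sketch below.  The
names `my-cc-use-tree-sitter' and `my-cc-toggle-tree-sitter' are
hypothetical and exist neither in Emacs nor in CC Mode;
`c-mode-base-map' is CC Mode's keymap shared by its modes.  The sketch
only flips a flag; what the flag would actually switch is not settled
in this thread. ]

```elisp
;; Hypothetical sketch: the `my-cc-' names are invented for illustration.
(require 'cc-mode)                      ; provides `c-mode-base-map'

(defcustom my-cc-use-tree-sitter nil
  "Non-nil means prefer tree-sitter features over the CC Mode ones.
Hypothetical option; nothing in Emacs consults it."
  :type 'boolean
  :group 'languages)

(defun my-cc-toggle-tree-sitter ()
  "Toggle `my-cc-use-tree-sitter' in the current buffer."
  (interactive)
  (setq-local my-cc-use-tree-sitter (not my-cc-use-tree-sitter))
  ;; A real implementation would switch fontification and indentation
  ;; settings here; this sketch only records and reports the choice.
  (message "Tree-sitter support %s"
           (if my-cc-use-tree-sitter "enabled" "disabled")))

;; C-c C-t is the key sequence Alan reports as unused in CC Mode.
(define-key c-mode-base-map (kbd "C-c C-t") #'my-cc-toggle-tree-sitter)
```

[ Keeping the choice buffer-local is one possible design: it would let
a user compare the two flavors side by side in different buffers before
changing the default. ]
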