* tree-sitter: conceptional problem solvable at Emacs' level?
From: Holger Schurig @ 2023-02-09 8:09 UTC
To: Emacs-devel

Hi,

I have been running the emacs-29 branch for some time with great success.
Now I wanted to try out tree-sitter and c++-ts-mode. Unfortunately, I
stumbled into some conceptual problems and wonder whether they are
actually solvable by Emacs, or whether one would need a completely new
grammar.

The issue is: tree-sitter doesn't work well with C macros.

I program a lot in C++/Qt. So let's look at this (valid) C++ program:

-----------------------------------------------------------------------------
#include <QObject>

class Test : public QObject
{
    Q_OBJECT
public:
    Test() : QObject() {};
public slots:
    void someSlot() {};
};
-----------------------------------------------------------------------------

If you have the libraries installed (e.g. qtbase5-dev on Debian), you can
compile this perfectly.

However, tree-sitter produces a garbage syntax tree:

- it contains a bitfield node (which isn't really there)
- it contains an error node (despite the code being compilable)

As a result, BOTH the indentation and the font-locking are wrong.

Would I need to create a tree-sitter grammar in JavaScript that
understands this macro-enhanced C++? That would be quite difficult.
Or will there be a method to add some kind of tiny preprocessor to
c++-ts-mode, so that it can substitute "Q_OBJECT", "signals" and "slots"
with nothing before handing things over to tree-sitter?

In comparison, I could teach the old cc-mode about this macro-enriched
C++ just with

    (c-add-style "qt-gnu"
      '("gnu" (c-access-key .
         "\\<\\(signals\\|public\\|protected\\|private\\|public slots\\|protected slots\\|private slots\\):")))

I guess that a lot of C and C++ programs use macros. And if there is no
simple way to aid tree-sitter in understanding this, then I fear
tree-sitter enhanced modes will often be unusable on them.
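For readers unfamiliar with the c-access-key regexp quoted above, here is a minimal C++ sketch of the labels it is meant to recognise. It is an illustration only: `is_access_label` is a hypothetical name, the pattern is an ECMAScript rendering of the Emacs regexp (with `\b` standing in for `\<`), and the leading `^\s*` anchor is an assumption made for this sketch, not a statement about how CC Mode applies the regexp.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of Emacs): returns true if `line` begins,
// after optional whitespace, with an access label such as "public:",
// "signals:" or "public slots:".
bool is_access_label(const std::string& line) {
    // ECMAScript approximation of the c-access-key regexp from the message.
    // The shorter alternative "public" is tried first; when the following
    // ':' does not match, the engine backtracks and tries "public slots".
    static const std::regex access_key(
        R"(^\s*\b(signals|public|protected|private|public slots|protected slots|private slots):)");
    return std::regex_search(line, access_key);
}
```

Called on the lines of the Qt example, this would report `public:` and `public slots:` as access labels and `Q_OBJECT` as something else, which is roughly the distinction the one-line CC Mode style provides.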
* Re: tree-sitter: conceptional problem solvable at Emacs' level?
From: Po Lu @ 2023-02-09 8:17 UTC
To: Holger Schurig; +Cc: Emacs-devel

Holger Schurig <holgerschurig@gmail.com> writes:

> I guess that a lot of C and C++ programs use macros. And if there is no
> simple way to aid tree-sitter in understanding this, then I fear
> tree-sitter enhanced modes will often be unusable on them.

My suggestion is simply to stay with CC Mode.

Parsers (without a full C preprocessor inside) can only work for
languages like Python, which cannot be enhanced with syntax-modifying
macros.
* Re: tree-sitter: conceptional problem solvable at Emacs' level?
From: Eli Zaretskii @ 2023-02-09 8:50 UTC
To: Po Lu; +Cc: holgerschurig, Emacs-devel

> From: Po Lu <luangruo@yahoo.com>
> Cc: Emacs-devel@gnu.org
> Date: Thu, 09 Feb 2023 16:17:27 +0800
>
> My suggestion is simply to stay with CC Mode.

Suggestions for what to do for now aside, I would still want us to try
to figure out the possibilities for better handling of C/C++ macros in
tree-sitter supported modes. I don't want to give up yet, because the
kludges similar to c-add-style used by CC mode might be possible with
tree-sitter modes as well. Or maybe some other solution could work,
including the idea of letting tree-sitter see preprocessed source code
(although this is probably harder to implement, and must be done on
the C level).

We just started using these modes in Emacs, so it is small wonder that
issues like this are popping up, and will probably keep popping up for
some time to come. I see no reason whatsoever to give up on
tree-sitter just because these minor problems in marginal cases are
brought up; we should instead solve them one by one. Being minor
problems, they in no way invalidate the basic decision to try using
tree-sitter in Emacs, not from where I stand.
* Re: tree-sitter: conceptional problem solvable at Emacs' level?
From: Po Lu @ 2023-02-09 10:13 UTC
To: Eli Zaretskii; +Cc: holgerschurig, Emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> Suggestions for what to do for now aside, I would still want us to try
> to figure out the possibilities for better handling of C/C++ macros in
> tree-sitter supported modes. I don't want to give up yet, because the
> kludges similar to c-add-style used by CC mode might be possible with
> tree-sitter modes as well. Or maybe some other solution could work,
> including the idea of letting tree-sitter see preprocessed source code
> (although this is probably harder to implement, and must be done on
> the C level).

I don't oppose us trying, I just see problems with trying to parse C
macros with anything other than guesswork whilst maintaining reasonable
speed.

Preprocessing C can be very slow, and might not be easy to set up to run
on each edit, which would be necessary to let tree-sitter see
preprocessed source code. And once you start trying to do that, you
have to determine how to run whatever C preprocessor is supposed to
preprocess a file.

Even language servers, which are typically entire C compilers, fail to
understand some kinds of macros not written with them in mind. I guess
one example would be the very compiler-specific ``macro''-like keywords
used by various compilers to designate long and short interrupt service
routines, or Intel's ``near'' and ``far'' keywords.
* Re: tree-sitter: conceptional problem solvable at Emacs' level?
From: Eli Zaretskii @ 2023-02-09 10:55 UTC
To: Po Lu; +Cc: holgerschurig, Emacs-devel

> From: Po Lu <luangruo@yahoo.com>
> Cc: holgerschurig@gmail.com, Emacs-devel@gnu.org
> Date: Thu, 09 Feb 2023 18:13:59 +0800
>
> I don't oppose us trying, I just see problems with trying to parse C
> macros with anything other than guesswork whilst maintaining reasonable
> speed.

Well, guesswork or not, CC mode does it, so it's definitely doable. I
see no reason to give up on this without trying.
* Re: tree-sitter: conceptional problem solvable at Emacs' level?
From: Yuan Fu @ 2023-02-10 7:33 UTC
To: Po Lu; +Cc: Holger Schurig, Emacs-devel

> On Feb 9, 2023, at 12:17 AM, Po Lu <luangruo@yahoo.com> wrote:
>
> My suggestion is simply to stay with CC Mode.
>
> Parsers (without a full C preprocessor inside) can only work for
> languages like Python, which cannot be enhanced with syntax-modifying
> macros.

Right. Our best hope is for someone to try to extend the current
tree-sitter-c grammar, but I don’t know how feasible that is. Emacs can
also apply some limited workarounds, but the potential in that
department is slim.

Yuan
* Re: tree-sitter: conceptional problem solvable at Emacs' level?
From: Eli Zaretskii @ 2023-02-10 8:42 UTC
To: Yuan Fu; +Cc: luangruo, holgerschurig, Emacs-devel

> From: Yuan Fu <casouri@gmail.com>
> Date: Thu, 9 Feb 2023 23:33:10 -0800
> Cc: Holger Schurig <holgerschurig@gmail.com>,
>  Emacs-devel@gnu.org
>
> > Parsers (without a full C preprocessor inside) can only work for
> > languages like Python, which cannot be enhanced with syntax-modifying
> > macros.
>
> Right. Our best hope is for someone to try to extend the current
> tree-sitter-c grammar, but I don’t know how feasible it is. Emacs can
> also do some limited workaround, but the potential in that department
> is slim.

I think we still have a way to go before we reach the above
conclusions (which basically mean we give up on improving the
situation with C/C++ macros). We should explore other approaches.

One such approach would be to perform our own analysis when the parser
returns an error node due to macros.

Another possibility is to complicate the function we pass to
tree-sitter with which to read buffer text, in a way that replaces the
text of a macro with something else (in the simplest case, just space
characters), so as to avoid errors in the parser, and again analyze
the macros in our own code.

And I'm sure there are other alternatives. This issue is not unique
to Emacs, so studying how other IDEs deal with it could also yield
ideas.

Volunteers interested in improving support for C/C++ based on
tree-sitter are very welcome to step forward and work on these issues.

Thanks.
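The simplest variant described above, replacing the text of a macro with space characters before the parser sees it, can be sketched roughly as follows. This is only an illustration under stated assumptions: `blank_out_words` is a hypothetical helper, not an Emacs or tree-sitter API, and how such a step would be wired into the function that feeds buffer text to tree-sitter is not shown. The one property the sketch tries to demonstrate is that the text keeps its length, so positions reported by the parser would still line up with the buffer.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: overwrite every whole-word occurrence of the given
// identifiers with spaces. The result has the same length as the input,
// so byte offsets in a parse of the result still refer to the original text.
std::string blank_out_words(const std::string& text,
                            const std::vector<std::string>& words) {
    auto is_ident = [](char c) {
        return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
    };
    std::string out = text;
    for (const std::string& w : words) {
        if (w.empty()) continue;
        std::size_t pos = 0;
        while ((pos = text.find(w, pos)) != std::string::npos) {
            const std::size_t end = pos + w.size();
            // Only blank the match if it is not part of a longer identifier.
            const bool left_ok = pos == 0 || !is_ident(text[pos - 1]);
            const bool right_ok = end == text.size() || !is_ident(text[end]);
            if (left_ok && right_ok) {
                out.replace(pos, w.size(), w.size(), ' ');
            }
            pos = end;
        }
    }
    assert(out.size() == text.size());  // offsets must not shift
    return out;
}
```

With the words `Q_OBJECT` and `slots`, the Qt example from the start of the thread would turn `public slots:` into `public` followed by six spaces and a colon. Blanking `signals` the same way would leave a bare colon, so a real implementation would probably need an equal-length stand-in there; that part is left open in this sketch.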
[parent not found: <CAOpc7mHX6s0B8vdDee+9FMvQejGTSL3jzgwVekS7Esg-AOf=jw@mail.gmail.com>]
* Re: tree-sitter: conceptional problem solvable at Emacs' level?
From: Eli Zaretskii @ 2023-02-10 11:48 UTC
To: Holger Schurig; +Cc: luangruo, holgerschurig, Emacs-devel

> From: Holger Schurig <holgerschurig@gmail.com>
> Date: Fri, 10 Feb 2023 11:31:39 +0000
>
> > And I'm sure there are other alternatives. This issue is not unique
> > to Emacs, so studying how other IDEs deal with it could also yield
> > ideas.
>
> I'm aware of three FOSS IDEs that are somewhat linked to C++ and/or Qt:
>
> * KDevelop (from the KDE project)
> * Kate (ditto, although some might not call it an IDE)
> * Qt Creator
>
> For the first two I have no info on whether they even looked into
> Tree-Sitter.
>
> However, Qt Creator doesn't want to implement Tree-Sitter at all; they
> are happy with KSyntaxHighlighting. See
> https://bugreports.qt.io/browse/QTCREATORBUG-26348
>
> KSyntaxHighlighting is in turn from the KDE project, and is used in
> KDevelop and Kate. So basically all three use the same technology,
> which seems to work nicely for them.

Thanks.

However, I meant the IDEs which are using tree-sitter and support
developing C/C++ programs. I believe some do.
* Re: tree-sitter: conceptional problem solvable at Emacs' level?
From: Po Lu @ 2023-02-11 2:17 UTC
To: Eli Zaretskii; +Cc: Holger Schurig, Emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> However, I meant the IDEs which are using tree-sitter and support
> developing C/C++ programs. I believe some do.

I think most of those have similar problems supporting macros.
Who knows their names? I may be able to ask some of their users.
* Re: tree-sitter: conceptional problem solvable at Emacs' level?
From: Konstantin Kharlamov @ 2023-02-11 6:25 UTC
To: Po Lu, Eli Zaretskii; +Cc: Holger Schurig, Emacs-devel

On Sat, 2023-02-11 at 10:17 +0800, Po Lu wrote:
> Eli Zaretskii <eliz@gnu.org> writes:
>
> > However, I meant the IDEs which are using tree-sitter and support
> > developing C/C++ programs. I believe some do.
>
> I think most of those have similar problems supporting macros.
> Who knows their names? I may be able to ask some of their users.

From my experience on and off work, there are just two IDEs (as in, not
editors) used most widely for C++ code: QtCreator and Visual Studio. The
first you discussed, the second is proprietary.

Then again, people most often code in C++ and C with text editors, in
which case the popular choices, in my experience, are Sublime Text and
VS Code. These two don't use tree-sitter either.
* Re: tree-sitter: conceptional problem solvable at Emacs' level?
From: Konstantin Kharlamov @ 2023-02-11 6:36 UTC
To: Po Lu, Eli Zaretskii; +Cc: Holger Schurig, Emacs-devel

On Sat, 2023-02-11 at 09:25 +0300, Konstantin Kharlamov wrote:
> Then again, people most often code in C++ and C with text editors, in
> that case popular choices from my experience: Sublime Text and VS Code.
> These two don't use tree-sitter either.

I installed Sublime Text on my Archlinux and tested with the C++ code OP
posted.

What I see is that ST does seem confused about indentation when making
a newline right after the `slots:` line.

However, if you make a newline after the `void someSlot() {};` line, it
will use the indentation of the previous line.

The default cc-mode in Emacs works similarly. The cc-ts-mode on the
other hand doesn't make use of the previous indentation, and I think it
should. It would resolve that problem and others, because in my
experience it happens very often in C and C++ code that you want some
custom indentation level, so you just make one and you expect the editor
to keep it while creating more new lines.
* Re: tree-sitter: conceptional problem solvable at Emacs' level?
From: Theodor Thornhill @ 2023-02-11 6:51 UTC
To: emacs-devel, Konstantin Kharlamov, Po Lu, Eli Zaretskii
Cc: Holger Schurig, Emacs-devel

On 11 February 2023 07:36:26 CET, Konstantin Kharlamov <hi-angel@yandex.ru> wrote:
>The default cc-mode in Emacs works similarly. The cc-ts-mode on the other hand
>doesn't make use of the previous indentation, and I think it should. It would
>resolve that problem and others, because in my experience it happens very often
>in C and C++ code that you want some custom indentation level, so you just make
>one and you expect the editor to keep it while creating more new lines.
>

That last statement sounds easily solvable. Can you send me a short
example describing exactly what you want in a code snippet and I'll add
it.

Thanks,
Theo
* Re: tree-sitter: conceptional problem solvable at Emacs' level?
From: Konstantin Kharlamov @ 2023-02-11 7:11 UTC
To: Theodor Thornhill, emacs-devel, Po Lu, Eli Zaretskii; +Cc: Holger Schurig

On Sat, 2023-02-11 at 07:51 +0100, Theodor Thornhill wrote:
> That last statement sounds easily solvable. Can you send me a short example
> describing exactly what you want in a code snippet and I'll add it.
>
> Thanks,
> Theo

Thank you! The example is below, but please wait a bit just to make sure
there's no opposition from other people, because I don't know whether it
works like this on purpose or not.

Given this C++ code with weird class member indentation:

    class Foo {
            int a;
            bool b;
    };

Now, suppose you put the caret after the `bool b;` text and press Enter
to make a new line (all tests are done with `emacs -Q`). The behaviour:

* cc-mode and Sublime Text: creates a new line with exactly the same
  indentation as the previous one.
* cc-ts-mode: re-indents the `bool b;` line, then creates a new one with
  a custom indentation that is different from the one on the `int a;`
  line.

The cc-mode and Sublime Text behaviour seems less annoying to me,
because if I wanted to reindent the prev. line, most likely I'd have
done it by pressing an indentation hotkey (e.g. `=` in the Evil mode I
use).
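The "keep the previous line's indentation" behaviour described above can be sketched in a few lines. This is a minimal illustration in C++, not how cc-mode, Sublime Text or any tree-sitter mode is actually implemented; `leading_whitespace` and `newline_keep_indent` are hypothetical names, and the buffer is modelled as a plain string with the caret at its end.

```cpp
#include <string>

// Hypothetical helper: the run of spaces and tabs at the start of `line`.
std::string leading_whitespace(const std::string& line) {
    const std::size_t n = line.find_first_not_of(" \t");
    return line.substr(0, n == std::string::npos ? line.size() : n);
}

// Hypothetical model of pressing Enter with the caret at the end of
// `buffer`: the existing lines are left untouched, and the new line starts
// with the same leading whitespace as the line the caret was on.
std::string newline_keep_indent(const std::string& buffer) {
    const std::size_t bol = buffer.rfind('\n');
    const std::string last =
        buffer.substr(bol == std::string::npos ? 0 : bol + 1);
    return buffer + "\n" + leading_whitespace(last);
}
```

Applied to the `class Foo` example with the caret after `bool b;`, the sketch leaves the `bool b;` line as it is and opens the next line at the same column, which is the first of the two behaviours listed above.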
* Re: tree-sitter: conceptional problem solvable at Emacs' level?
From: Konstantin Kharlamov @ 2023-02-11 7:53 UTC
To: Theodor Thornhill, emacs-devel, Po Lu, Eli Zaretskii; +Cc: Holger Schurig

On Sat, 2023-02-11 at 10:11 +0300, Konstantin Kharlamov wrote:
> The cc-mode and Sublime Text behaviour seems like less annoying to me, because
> if I wanted to reindent the prev. line, most likely I'd did it by pressing an
> indentation hotkey (e.g. `=` in Evil mode I use).

FTR, re-using the indentation of the prev. line seems to have been the
default in all Emacs modes, and one that proved to be useful. To support
that, here are more examples.

While writing C, depending on circumstances the pre-existing code might
have function arguments indented like this:

    foo(arg1,
        arg2);

Or like this:

    foo(
        foo1,
        foo2);

You might want to add another argument to the call, but you don't want
to re-indent everything. So when you press Enter, you expect the new
line to have the prev. indentation.

Another example from elisp-mode: I have this snippet in my Evil config:

    (use-package evil
      ;; […]
      :bind (:map evil-insert-state-map
             ;; after having insert-state keymap wiped out make [escape] switch back
             ;; to normal state
             ([escape] . 'evil-normal-state)
             :map evil-normal-state-map
             ("C-u" . 'evil-scroll-up)
             ("k" . 'evil-previous-visual-line)
             ("j" . 'evil-next-visual-line)
             ;; […]
             :map evil-visual-state-map
             ("k" . 'evil-previous-visual-line)
             ("j" . 'evil-next-visual-line)
             :map isearch-mode-map
             ;; allow for "up/down" history scrolling in / search
             ("<down>" . 'isearch-ring-advance)
             ("<up>" . 'isearch-ring-retreat)
             )
      )

If I were to re-indent everything, the indentation in the `:bind` body
would be different. But I want to keep it the way it is now, and because
elisp-mode re-uses the prev. line's indentation, whenever I create a new
line it always has the indentation I wanted.
* Re: tree-sitter: conceptional problem solvable at Emacs' level?
From: Konstantin Kharlamov @ 2023-02-11  8:22 UTC
  To: Theodor Thornhill, emacs-devel, Po Lu, Eli Zaretskii; +Cc: Holger Schurig

On Sat, 2023-02-11 at 10:11 +0300, Konstantin Kharlamov wrote:
> On Sat, 2023-02-11 at 07:51 +0100, Theodor Thornhill wrote:
> > On 11 February 2023 07:36:26 CET, Konstantin Kharlamov <hi-angel@yandex.ru>
> > wrote:
> > > On Sat, 2023-02-11 at 09:25 +0300, Konstantin Kharlamov wrote:
> > > > On Sat, 2023-02-11 at 10:17 +0800, Po Lu wrote:
> > > > > Eli Zaretskii <eliz@gnu.org> writes:
> > > > >
> > > > > > However, I meant the IDEs which are using tree-sitter and support
> > > > > > developing C/C++ programs. I believe some do.
> > > > >
> > > > > I think most of those have similar problems supporting macros.
> > > > > Who knows their names? I may be able to ask some of their users.
> > > >
> > > > From my experience on and off work, there are just two IDEs (as in, not
> > > > editors) used most widely for C++ code: QtCreator and Visual Studio. The
> > > > first you discussed, the second is proprietary.
> > > >
> > > > Then again, people most often code in C++ and C with text editors, in
> > > > that case popular choices from my experience: Sublime Text and VS Code.
> > > > These two don't use tree-sitter either.
> > >
> > > I installed Sublime Text on my Archlinux and tested with the C++ code OP
> > > posted.
> > >
> > > What I see is that ST does seem confused about indentation, while trying
> > > to make a newline right after `slots:` line.
> > >
> > > However, if you try to make a newline after the `void someSlot() {};`
> > > line, it will use the indentation used on the previous line.
> > >
> > > The default cc-mode in Emacs works similarly. The cc-ts-mode on the other
> > > hand doesn't make use of the previous indentation, and I think it should.
> > > It would resolve that problem and others, because in my experience it
> > > happens very often in C and C++ code that you want some custom indentation
> > > level, so you just make one and you expect the editor to keep it while
> > > creating more new lines.
> >
> > That last statement sounds easily solvable. Can you send me a short example
> > describing exactly what you want in a code snippet and I'll add it.
> >
> > Thanks,
> > Theo
>
> Thank you! The example is below, but please wait a bit just to make sure
> there's no opposition from other people, because I don't know if it works like
> this on purpose, or not.
>
> Given this C++ code with weird class members indentation:
>
>     class Foo {
>             int a;
>             bool b;
>     };
>
> Now, suppose you put a caret after `bool b;` text and press Enter to make a
> new line (all tests are done with `emacs -Q`). The behaviour:
>
> * cc-mode and Sublime Text: creates a newline with the indentation exactly as
>   on the previous one.
> * cc-ts-mode: re-indents the `bool b;` line, then creates a new one with a
>   custom indentation that is different from one on the `int a;` line.
>
> The cc-mode and Sublime Text behaviour seems less annoying to me, because
> if I wanted to reindent the prev. line, most likely I'd do it by pressing an
> indentation hotkey (e.g. `=` in Evil mode I use).

Oh, wait, I mistakenly used c-mode instead of c++-mode. c-mode works this way:
it keeps the previous indentation, whereas c++-mode applies a new one. It's odd
that they behave differently, and it certainly differs from other modes (e.g.
emacs-lisp-mode). In that case I think the question is whether they should
reuse the previous line's indentation, which I think they should.
* Re: tree-sitter: conceptional problem solvable at Emacs' level?
From: Theodor Thornhill @ 2023-02-11  8:41 UTC
  To: Konstantin Kharlamov, emacs-devel, Po Lu, Eli Zaretskii; +Cc: Holger Schurig

On 11 February 2023 09:22:06 CET, Konstantin Kharlamov <hi-angel@yandex.ru> wrote:
> […]
>
> Oh, wait, I mistakenly used c-mode instead of c++-mode. c-mode works this way:
> it keeps the previous indentation, whereas c++-mode applies a new one. It's odd
> that they behave differently, and it certainly differs from other modes (e.g.
> emacs-lisp-mode). In that case I think the question is whether they should
> reuse the previous line's indentation, which I think they should.

c-mode or c-ts-mode?

Yeah, this is what I'm thinking too. I'll look at it tonight or tomorrow :)

Theo
* Re: tree-sitter: conceptional problem solvable at Emacs' level?
From: Konstantin Kharlamov @ 2023-02-11  9:37 UTC
  To: Theodor Thornhill, emacs-devel, Po Lu, Eli Zaretskii; +Cc: Holger Schurig

On Sat, 2023-02-11 at 09:41 +0100, Theodor Thornhill wrote:
> On 11 February 2023 09:22:06 CET, Konstantin Kharlamov <hi-angel@yandex.ru>
> wrote:
> > […]
> >
> > Oh, wait, I mistakenly used c-mode instead of c++-mode. c-mode works this
> > way: it keeps the previous indentation, whereas c++-mode applies a new one.
> > It's odd that they behave differently, and it certainly differs from other
> > modes (e.g. emacs-lisp-mode). In that case I think the question is whether
> > they should reuse the previous line's indentation, which I think they
> > should.
>
> c-mode or c-ts-mode?
>
> Yeah, this is what I'm thinking too. I'll look at it tonight or tomorrow :)

c-ts-mode works the same way as c++-ts-mode does.

Upon further inspection I realised that the vanilla c-mode keeps the previous
indentation in the aforementioned case only because it doesn't recognise the
`class` keyword. If you replace it with `struct`, c-mode uses whatever
indentation it thinks is correct instead of the one from the previous line.

However, the vanilla c-mode and c++-mode actually behave inconsistently:
depending on the code, they may or may not reuse the previous indentation.

So anyway, below I re-created an example where indentation is kept in ST,
c-mode and c++-mode, but not in c-ts-mode or c++-ts-mode. I also threw in other
editors for comparison.

Given this code:

    int main() {
        foobar(
            arg1,
            arg2
        );
    }

Suppose you put a caret after the `arg2` text and press Enter to make a new
line (all tests are done with `emacs -Q`). The behaviour:

* c-mode, c++-mode, Sublime Text (both with `.c` and `.cpp` file), VS Code
  (both with `.c` and `.cpp` file): creates a new line indented the same way as
  the previous one.
* c-ts-mode, c++-ts-mode: re-indents the `arg2` line to have indentation
  different from the `arg1,` line, and creates a new line that also has the new
  indentation.
* QtCreator: lol, it does no indentation whatsoever in this case.

Overall, "using the previous indentation" seems like the way to go; it is also
what VS Code and Sublime Text do.

As a side note, if a user explicitly wants to re-indent the code, the behaviour
should depend on how much text they selected for re-indentation (at least
c-mode and c++-mode behave this way), which is intuitive. For example:

* If I select only the `arg2` line, re-indentation uses the previous offsets,
  so basically nothing happens.
* If I select both the `arg1` and `arg2` lines, the indentation changes,
  because the line before `arg1` is a different syntactic construct (an
  "opening parenthesis"), so the default indentation for that case is used,
  which is "indent arguments to the opening parenthesis of the function".
* Re: tree-sitter: conceptional problem solvable at Emacs' level?
From: Konstantin Kharlamov @ 2023-02-11 10:25 UTC
  To: Theodor Thornhill, emacs-devel, Po Lu, Eli Zaretskii; +Cc: Holger Schurig

On Sat, 2023-02-11 at 12:37 +0300, Konstantin Kharlamov wrote:
> Given this code:
>
>     int main() {
>         foobar(
>             arg1,
>             arg2
>         );
>     }
>
> Suppose you put a caret after `arg2` text and press Enter to make a new line
> (all tests are done with `emacs -Q`). The behaviour:
>
> * c-mode, c++-mode, Sublime Text (both with `.c` and `.cpp` file), VS Code
>   (both with `.c` and `.cpp` file): creates a new line indented same way as
>   previous one.
> * c-ts-mode, c++-ts-mode: re-indents the `arg2` line to have indentation
>   different from `arg1,` line, and creates a new line that also has new
>   indentation.
> * QtCreator: lol, it does no indentation whatsoever in this case.

Ah, QtCreator does indent; it just isn't clear right away because of the way
its GUI behaves. Anyway, it creates a new line with its own indentation (a
weird one, btw: 5 spaces further than the opening parenthesis).
* Re: tree-sitter: conceptional problem solvable at Emacs' level?
From: Eli Zaretskii @ 2023-02-11  8:43 UTC
  To: Konstantin Kharlamov; +Cc: theo, emacs-devel, luangruo, holgerschurig

> From: Konstantin Kharlamov <hi-angel@yandex.ru>
> Cc: Holger Schurig <holgerschurig@gmail.com>
> Date: Sat, 11 Feb 2023 10:11:24 +0300
>
> On Sat, 2023-02-11 at 07:51 +0100, Theodor Thornhill wrote:
> > […]
> >
> > That last statement sounds easily solvable. Can you send me a short example
> > describing exactly what you want in a code snippet and I'll add it.
> >
> > Thanks,
> > Theo
>
> Thank you! The example is below, but please wait a bit just to make sure
> there's no opposition from other people, because I don't know if it works like
> this on purpose, or not.

Since we are close to a pretest, I think we should have a defcustom which
controls this behavior, and leave it to users whether to turn this on or off.
* Re: tree-sitter: conceptional problem solvable at Emacs' level?
From: Konstantin Kharlamov @ 2023-04-16 19:21 UTC
  To: Theodor Thornhill, emacs-devel, Po Lu, Eli Zaretskii; +Cc: Holger Schurig

On Sat, 2023-02-11 at 07:51 +0100, Theodor Thornhill wrote:
> On 11 February 2023 07:36:26 CET, Konstantin Kharlamov <hi-angel@yandex.ru>
> wrote:
> > […]
> >
> > The default cc-mode in Emacs works similarly. The cc-ts-mode on the other
> > hand doesn't make use of the previous indentation, and I think it should.
> > It would resolve that problem and others, because in my experience it
> > happens very often in C and C++ code that you want some custom indentation
> > level, so you just make one and you expect the editor to keep it while
> > creating more new lines.
>
> That last statement sounds easily solvable. Can you send me a short example
> describing exactly what you want in a code snippet and I'll add it.
>
> Thanks,
> Theo

Incidentally, I'm reading Stefan Monnier's paper presenting SMIE¹, and it has
the following example:

> […] It also means that the indentation code should strive to obey previous
> choices that the user made. For example if the user wants to indent its code
> in the following unconventional way:
>
>     longfunctionname(argument1, argument2,
>             argument3,
>             argument4);
>
> while the user should not be surprised if the auto-indenter tries to align
> ‘argument3’ with ‘argument1’, it would be reasonable for them to expect that
> ‘argument4’ stays put by simply aligning it with its nearest sibling rather
> than with the earlier ‘argument1’.

That is to say: yeah, the previous indentation level should be used by default
if present.

1: https://arxiv.org/abs/2006.03103
* Re: tree-sitter: conceptional problem solvable at Emacs' level?
From: Ihor Radchenko @ 2023-02-11  9:34 UTC
  To: Eli Zaretskii; +Cc: Yuan Fu, luangruo, holgerschurig, Emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> Another possibility is to complicate the function we pass to
> tree-sitter with which to read buffer text, in a way that replaces the
> text of a macro with something else (in the simplest case, just space
> characters), so as to avoid errors in the parser, and again analyze
> the macros in our own code.

Another idea is delegating parts of the buffer to an Elisp/alternative parser.

Tree-sitter supports documents written in a mixture of grammars:
https://tree-sitter.github.io/tree-sitter/using-parsers#multi-language-documents
Macros can be considered such a "mixed" grammar, with macros being a grammar of
their own.

AFAIU, tree-sitter allows excluding certain file ranges from parsing and
parsing the excluded ranges with an alternative grammar instead. If Elisp can
somehow tell the tree-sitter backend to skip parsing macro-looking lines, it
should solve the problem at least partially.

--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
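The range idea in this message comes down to handing the main parser only the
parts of the buffer that are not macro uses. Tree-sitter's C API takes such a
list through `ts_parser_set_included_ranges`, and Emacs exposes it as
`treesit-parser-set-included-ranges`. The sketch below shows only the
bookkeeping step, turning "spans to hide" into "spans to parse"; it does not
call tree-sitter, and the `Range` alias and the `included_ranges` function are
hypothetical names chosen for illustration. How the hidden spans are found is
left open here.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Half-open byte range [start, end).
using Range = std::pair<std::size_t, std::size_t>;

// Given the buffer length and the spans that should be hidden from the main
// parser (for example macro uses), return the complementary spans that the
// parser should still read. Overlapping or unsorted input is tolerated.
std::vector<Range> included_ranges(std::size_t length,
                                   std::vector<Range> excluded) {
    std::sort(excluded.begin(), excluded.end());
    std::vector<Range> included;
    std::size_t pos = 0;  // first byte not yet covered by the output
    for (const Range& r : excluded) {
        std::size_t start = std::min(r.first, length);
        std::size_t end = std::min(r.second, length);
        if (start > pos) included.emplace_back(pos, start);
        pos = std::max(pos, end);
    }
    if (pos < length) included.emplace_back(pos, length);
    return included;
}
```

For a 40-byte buffer with macro uses at bytes 10..18 and 30..35, the result is
the three ranges 0..10, 18..30 and 35..40. Whether the C++ grammar then parses
the remaining text cleanly is a separate question that this sketch does not
answer.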
* Re: tree-sitter: conceptional problem solvable at Emacs' level?
From: Eli Zaretskii @ 2023-02-11 10:42 UTC
  To: Ihor Radchenko; +Cc: casouri, luangruo, holgerschurig, Emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: Yuan Fu <casouri@gmail.com>, luangruo@yahoo.com,
>  holgerschurig@gmail.com, Emacs-devel@gnu.org
> Date: Sat, 11 Feb 2023 09:34:42 +0000
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> > Another possibility is to complicate the function we pass to
> > tree-sitter with which to read buffer text, in a way that replaces the
> > text of a macro with something else (in the simplest case, just space
> > characters), so as to avoid errors in the parser, and again analyze
> > the macros in our own code.
>
> Another idea is delegating parts of the buffer to an Elisp/alternative parser.

Could be. However:

> Tree-sitter supports documents written in a mixture of grammars:
> https://tree-sitter.github.io/tree-sitter/using-parsers#multi-language-documents
> Macros can be considered such a "mixed" grammar, with macros being a
> grammar of their own.
>
> AFAIU, tree-sitter allows excluding certain file ranges from parsing and
> parsing the excluded ranges with an alternative grammar instead. If Elisp
> can somehow tell the tree-sitter backend to skip parsing macro-looking
> lines, it should solve the problem at least partially.

I believe the problem is with handling the parts which _use_ the macro, not
those parts which _define_ macros.

Still, this idea should be explored, I think.

Thanks.
* Re: tree-sitter: conceptional problem solvable at Emacs' level?
From: Lynn Winebarger @ 2023-02-11 13:58 UTC
  To: Ihor Radchenko; +Cc: Eli Zaretskii, Yuan Fu, luangruo, holgerschurig, Emacs-devel

On Sat, Feb 11, 2023 at 4:34 AM Ihor Radchenko <yantar92@posteo.net> wrote:
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> > Another possibility is to complicate the function we pass to
> > tree-sitter with which to read buffer text, in a way that replaces the
> > text of a macro with something else (in the simplest case, just space
> > characters), so as to avoid errors in the parser, and again analyze
> > the macros in our own code.
>
> Another idea is delegating parts of the buffer to an Elisp/alternative parser.
>
> Tree-sitter supports documents written in a mixture of grammars:
> https://tree-sitter.github.io/tree-sitter/using-parsers#multi-language-documents
> Macros can be considered such a "mixed" grammar, with macros being a
> grammar of their own.
>
> AFAIU, tree-sitter allows excluding certain file ranges from parsing and
> parsing the excluded ranges with an alternative grammar instead. If Elisp
> can somehow tell the tree-sitter backend to skip parsing macro-looking
> lines, it should solve the problem at least partially.
>

What's needed is (a) a generic "macro keyword" terminal in the tree-sitter
grammar, recognized by the lexer because such keywords can appear anywhere, and
(b) a parser for macro definitions. Then (b) maintains the set of macros that
the lexer uses to recognize instances of (a).

For extra credit, the macros could be hypothetically expanded, the results
parsed, and the annotations generated on instances of the macro arguments
mapped back to their occurrences as arguments. Maybe some kind of "unfold"
notation could be used to see the results of the expansion and the resulting
annotations in context.

Lynn
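The split into (a) and (b) proposed in this message can be illustrated with a
toy model. The C++ sketch below is an assumption-laden illustration, not a
tree-sitter lexer or grammar: `collect_macro_names` stands in for (b) and only
understands simple `#define NAME` lines, while `find_macro_uses` stands in for
the lexer side of (a) and reports where known macro names occur. All names are
hypothetical, and comments, string literals, `#undef` and conditional
compilation are ignored.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// True for characters that may appear inside a C identifier.
bool is_ident_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// (b) Collect the names introduced by `#define NAME ...` lines.
std::set<std::string> collect_macro_names(const std::string& source) {
    std::set<std::string> names;
    std::istringstream in(source);
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream words(line);
        std::string directive, name;
        if (!(words >> directive) || directive != "#define") continue;
        if (!(words >> name)) continue;
        // Function-like macro: keep only the part before the parameter list.
        name = name.substr(0, name.find('('));
        if (!name.empty()) names.insert(name);
    }
    return names;
}

struct MacroUse {
    std::size_t offset;  // byte offset of the identifier in the source
    std::string name;
};

// (a) Scan identifiers and report those that are known macro names.
std::vector<MacroUse> find_macro_uses(const std::string& source,
                                      const std::set<std::string>& macros) {
    std::vector<MacroUse> uses;
    std::size_t i = 0;
    while (i < source.size()) {
        if (!is_ident_char(source[i])) { ++i; continue; }
        std::size_t start = i;
        while (i < source.size() && is_ident_char(source[i])) ++i;
        // Words starting with a digit are numbers, not identifiers.
        if (std::isdigit(static_cast<unsigned char>(source[start]))) continue;
        std::string word = source.substr(start, i - start);
        if (macros.count(word) > 0) uses.push_back({start, word});
    }
    return uses;
}
```

In this model the definitions would typically come from a header and the uses
from the file being edited, so the two functions are run on different texts.
The "extra credit" step of expanding macros and mapping annotations back is not
attempted here.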
* Re: tree-sitter: conceptional problem solvable at Emacs' level?
From: Ergus @ 2023-02-09 16:25 UTC
  To: Holger Schurig; +Cc: Emacs-devel

Hi:

I am probably stating the obvious, but did you try to report this in the issue
tracker of the tree-sitter C++ grammar repository?

https://github.com/tree-sitter/tree-sitter-cpp/issues

Maybe they could offer a better native solution and fix it there, so we won't
need to reinvent the wheel.

Best,
Ergus

On Thu, Feb 09, 2023 at 12:09:10AM -0800, Holger Schurig wrote:
>Hi, I have been running branch emacs-29 for some time with great success. And
>now I wanted to test out tree-sitter and c++-ts-mode. Unfortunately, I
>stumbled into some conceptional problems and wonder if this is actually
>solvable by Emacs, or if someone would need a completely new grammar.
>
>The issue is: tree-sitter doesn't work well with C macros.
>
>I program a lot in C++/Qt. So let's look at this (valid) C++ program:
>
>-----------------------------------------------------------------------------
>#include <QObject>
>
>class Test : public QObject
>{
>    Q_OBJECT
>public:
>    Test() : QObject() {};
>public slots:
>    void someSlot() {};
>};
>-----------------------------------------------------------------------------
>
>If you have the libraries installed (e.g. qtbase5-dev on Debian), you can
>compile this perfectly.
>
>However, tree-sitter produces a garbage syntax tree:
>
>- contains some bitfield node (which isn't really there)
>- contains an error node (despite the code being compilable)
>
>And as a result, BOTH the indentation and the font-locking are wrong.
>
>
>Would I need to create a tree-sitter grammar in JavaScript that
>understands this macro-enhanced C++? That would be quite difficult.
>
>Or will there be a method to add some kind of tiny-preprocessor to
>c++-ts-mode, so that it can substitute "Q_OBJECT", "signals" and "slots"
>with nothing before handing things over to tree-sitter?
>
>
>In comparison, I could teach the old cc-mode about this macro-enriched
>C++ just with
>
>  (c-add-style "qt-gnu"
>               '("gnu" (c-access-key .
>                        "\\<\\(signals\\|public\\|protected\\|private\\|public slots\\|protected slots\\|private slots\\):")))
>
>
>I guess that a lot of C and C++ programs use macros. And if there is no
>simple way to aid tree-sitter in understanding this, then I fear
>tree-sitter enhanced modes will often be unusable on them.
>
* Re: tree-sitter: conceptional problem solvable at Emacs' level?
From: Dmitry Gutov @ 2023-02-09 20:09 UTC
  To: Ergus, Holger Schurig; +Cc: Emacs-devel

On 09/02/2023 18:25, Ergus wrote:
> Hi:
>
> I am probably stating the obvious, but did you try to report this in the
> issue tracker of the tree-sitter C++ grammar repository?
>
> https://github.com/tree-sitter/tree-sitter-cpp/issues
>
> Maybe they could offer a better native solution and fix it there, so we
> won't need to reinvent the wheel.

These might be relevant:

https://github.com/tree-sitter/tree-sitter-cpp/issues/85
https://github.com/tree-sitter/tree-sitter-cpp/issues/40
https://github.com/tree-sitter/tree-sitter-cpp/issues/146
* Re: tree-sitter: conceptional problem solvable at Emacs' level?
From: Holger Schurig @ 2023-02-10  7:41 UTC
  To: Dmitry Gutov, Ergus; +Cc: Emacs-devel

> https://github.com/tree-sitter/tree-sitter-cpp/issues/85

From January 2021. Inconclusive.

> https://github.com/tree-sitter/tree-sitter-cpp/issues/40

From June 2019. Speaks about regenerating the parser based on environment
variables. That means you'll have to have the whole NPM toolchain installed.

The final suggestion is replacing things with spaces before feeding the data to
tree-sitter. That would keep offsets intact.

> https://github.com/tree-sitter/tree-sitter-cpp/issues/146

From January 2022. Points to alternate parsers for C++ dialects, here for
OpenFOAM code.

Given that these bugs have been sitting there, some of them for years, I can
only conclude that the tree-sitter project doesn't care at all. Or lacks the
manpower. Or assigns it differently. Or that its core developers have different
itches to scratch. Whatever the reason, I wouldn't hold my breath that anything
changes on their side soon.

The last idea from their bug 40 sounds like it could be implemented on the
Emacs side.
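The "replace things with spaces" suggestion from issue 40 is easy to sketch.
The C++ function below is a minimal illustration under stated assumptions, not
an Emacs or tree-sitter API: `blank_macros` is a hypothetical name, the caller
supplies the set of words to hide (here the Qt keywords from the original
report), and comments and string literals are not treated specially. Whether
tree-sitter-cpp then parses the blanked text without error nodes is not
verified here.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>

// True for characters that may appear inside a C identifier.
bool is_ident_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Return a copy of `source` in which every whole identifier listed in
// `macros` is overwritten with spaces. The text keeps its length, so byte
// offsets, line numbers and columns of everything else stay the same.
std::string blank_macros(const std::string& source,
                         const std::set<std::string>& macros) {
    std::string out = source;
    std::size_t i = 0;
    while (i < out.size()) {
        if (!is_ident_char(out[i])) { ++i; continue; }
        std::size_t start = i;
        while (i < out.size() && is_ident_char(out[i])) ++i;
        std::size_t len = i - start;
        if (macros.count(out.substr(start, len)) > 0)
            out.replace(start, len, len, ' ');
    }
    return out;
}
```

Applied to the class from the original report with the set `Q_OBJECT`,
`signals`, `slots`, the `Q_OBJECT` line becomes blank and `public slots:`
becomes `public      :`, while `someSlot` keeps its exact position. Because
offsets are preserved, positions reported by a parser run on the blanked text
can be used directly in the real buffer.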