unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#62412: 29.0.60; strange c++ indentation behavior with tree sitter
@ 2023-03-23 20:43 Herman, Geza
  2023-03-24 18:17 ` Yuan Fu
  0 siblings, 1 reply; 16+ messages in thread
From: Herman, Geza @ 2023-03-23 20:43 UTC (permalink / raw)
  To: 62412

Copy this half-written program to a c++-ts-mode buffer:

----- 8< -----------------------

void foo() {
   for (int i=0
}

int main(int argc, char *argv[]) {
}

----- 8< -----------------------

Move the point to the end of the line of the for loop, and press ";" (as 
if you continued to write the loop). Notice that the line will lose its 
indentation ("for" will be moved to column 1). If you continue writing 
the for loop, it will be correctly re-indented after the closing 
parenthesis (for example, continue the line with "; i<10; i++)", and 
notice that after pressing ")", the line will be re-indented).

This doesn't happen if the main function is deleted. I'm not sure
whether this is a tree-sitter or Emacs problem, but I reported it here
because I think it's more likely to be an Emacs problem.


In GNU Emacs 29.0.60 (build 1, x86_64-pc-linux-gnu, GTK+ Version
  3.24.36, cairo version 1.16.0) of 2023-03-23 built on okoska
Repository revision: d93a439846f03dfb2be28d6b5c2e963ef6be0c22
Repository branch: emacs-29
Windowing system distributor 'The X.Org Foundation', version 11.0.12101006
System Description: Debian GNU/Linux bookworm/sid

Configured using:
  'configure --with-native-compilation=aot --without-compress-install
  --with-json --with-xinput2 --with-xwidgets --with-tree-sitter
  --with-cairo'

Configured features:
ACL CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GPM GSETTINGS HARFBUZZ JPEG
JSON LCMS2 LIBOTF LIBSELINUX LIBSYSTEMD LIBXML2 M17N_FLT MODULES
NATIVE_COMP NOTIFY INOTIFY PDUMPER PNG SECCOMP SOUND SQLITE3 THREADS
TIFF TOOLKIT_SCROLL_BARS TREE_SITTER WEBP X11 XDBE XIM XINPUT2 XPM
XWIDGETS GTK3 ZLIB

Important settings:
   value of $LC_ALL: C.UTF-8
   value of $LANG: en_US.UTF-8
   value of $XMODIFIERS: @im=none
   locale-coding-system: utf-8-unix

Major mode: C++//

Minor modes in effect:
   tooltip-mode: t
   global-eldoc-mode: t
   show-paren-mode: t
   electric-indent-mode: t
   mouse-wheel-mode: t
   tool-bar-mode: t
   menu-bar-mode: t
   file-name-shadow-mode: t
   global-font-lock-mode: t
   font-lock-mode: t
   blink-cursor-mode: t
   line-number-mode: t
   indent-tabs-mode: t
   transient-mark-mode: t
   auto-composition-mode: t
   auto-encryption-mode: t
   auto-compression-mode: t

Load-path shadows:
None found.

Features:
(shadow sort mail-extr emacsbug message mailcap yank-media puny dired
dired-loaddefs rfc822 mml mml-sec password-cache epa derived epg rfc6068
epg-config gnus-util text-property-search time-date mm-decode mm-bodies
mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader sendmail
rfc2047 rfc2045 ietf-drums mm-util mail-prsvr mail-utils c-ts-mode
c-ts-common treesit comp comp-cstr warnings icons subr-x rx cl-seq
cl-macs gv cl-extra help-mode bytecomp byte-compile cc-mode cc-fonts
cc-guess cc-menus cc-cmds cc-styles cc-align cc-engine cc-vars cc-defs
cl-loaddefs cl-lib rmc iso-transl tooltip cconv eldoc paren electric
uniquify ediff-hook vc-hooks lisp-float-type elisp-mode mwheel
term/x-win x-win term/common-win x-dnd tool-bar dnd fontset image
regexp-opt fringe tabulated-list replace newcomment text-mode lisp-mode
prog-mode register page tab-bar menu-bar rfn-eshadow isearch easymenu
timer select scroll-bar mouse jit-lock font-lock syntax font-core
term/tty-colors frame minibuffer nadvice seq simple cl-generic
indonesian philippine cham georgian utf-8-lang misc-lang vietnamese
tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek
romanian slovak czech european ethiopic indian cyrillic chinese
composite emoji-zwj charscript charprop case-table epa-hook
jka-cmpr-hook help abbrev obarray oclosure cl-preloaded button loaddefs
theme-loaddefs faces cus-face macroexp files window text-properties
overlay sha1 md5 base64 format env code-pages mule custom widget keymap
hashtable-print-readable backquote threads xwidget-internal dbusbind
inotify lcms2 dynamic-setting system-font-setting font-render-setting
cairo move-toolbar gtk x-toolkit xinput2 x multi-tty
make-network-process native-compile emacs)

Memory information:
((conses 16 109621 7046)
  (symbols 48 9139 0)
  (strings 32 26509 2321)
  (string-bytes 1 916195)
  (vectors 16 18595)
  (vector-slots 8 371251 16029)
  (floats 8 33 41)
  (intervals 56 351 0)
  (buffers 984 12))







* bug#62412: 29.0.60; strange c++ indentation behavior with tree sitter
  2023-03-23 20:43 bug#62412: 29.0.60; strange c++ indentation behavior with tree sitter Herman, Geza
@ 2023-03-24 18:17 ` Yuan Fu
  2023-03-24 20:04   ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 16+ messages in thread
From: Yuan Fu @ 2023-03-24 18:17 UTC (permalink / raw)
  To: geza.herman; +Cc: 62412, theodor thornhill


"Herman, Geza" <geza.herman@gmail.com> writes:

> Copy this half-written program to a c++-ts-mode buffer:
>
> ----- 8< -----------------------
>
> void foo() {
>   for (int i=0
> }
>
> int main(int argc, char *argv[]) {
> }
>
> ----- 8< -----------------------
>
> Move the point to the end of the line of the for loop, and press ";"
> (as if you continued to write the loop). Notice that the line will
> lose its indentation ("for" will be moved to column 1). If you
> continue writing the for loop, it will be correctly re-indented after
> the closing parenthesis (for example, continue the line with "; i<10;
> i++)", and notice that after pressing ")", the line will be
> re-indented).
>
> This doesn't happen if the main function is deleted. I'm not sure
> whether this is a tree-sitter or emacs problem, but I reported here
> because I think it's more likely that this is some emacs problem.

I believe this is due to this rule:

((query "(ERROR (ERROR)) @indent") column-0 0)

I’m not sure about the original purpose for this rule, CC’ing Theo.
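
A minimal sketch for checking whether that query matches anything in the current buffer, assuming Emacs 29 built with tree-sitter and the `cpp` grammar installed. The command name `my/show-nested-error-nodes` is made up for illustration; the query string is the one from the rule above.

```elisp
(defun my/show-nested-error-nodes ()
  "Report the nodes captured by the (ERROR (ERROR)) query in this buffer."
  (interactive)
  (let* ((root (treesit-buffer-root-node 'cpp))
         ;; Each capture is a (NAME . NODE) pair; keep only the nodes.
         (nodes (mapcar #'cdr
                        (treesit-query-capture
                         root "(ERROR (ERROR)) @indent"))))
    (if nodes
        (dolist (node nodes)
          (message "ERROR node spanning %d-%d"
                   (treesit-node-start node)
                   (treesit-node-end node)))
      (message "No nested ERROR nodes"))))
```

Running it in the reproduction buffer before and after typing ";" should show whether the rule has anything to match at that point.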

Yuan






* bug#62412: 29.0.60; strange c++ indentation behavior with tree sitter
  2023-03-24 18:17 ` Yuan Fu
@ 2023-03-24 20:04   ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-03-25  8:53     ` João Távora
  2023-03-26 13:25     ` Daniel Martín via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 2 replies; 16+ messages in thread
From: Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-03-24 20:04 UTC (permalink / raw)
  To: Yuan Fu, geza.herman; +Cc: 62412



On 24 March 2023 19:17:28 CET, Yuan Fu <casouri@gmail.com> wrote:
>
>"Herman, Geza" <geza.herman@gmail.com> writes:
>
>> Copy this half-written program to a c++-ts-mode buffer:
>>
>> ----- 8< -----------------------
>>
>> void foo() {
>>   for (int i=0
>> }
>>
>> int main(int argc, char *argv[]) {
>> }
>>
>> ----- 8< -----------------------
>>
>> Move the point to the end of the line of the for loop, and press ";"
>> (as if you continued to write the loop). Notice that the line will
>> lose its indentation ("for" will be moved to column 1). If you
>> continue writing the for loop, it will be correctly re-indented after
>> the closing parenthesis (for example, continue the line with "; i<10;
>> i++)", and notice that after pressing ")", the line will be
>> re-indented).
>>
>> This doesn't happen if the main function is deleted. I'm not sure
>> whether this is a tree-sitter or emacs problem, but I reported here
>> because I think it's more likely that this is some emacs problem.
>
>I believe this is due to this rule:
>
>((query "(ERROR (ERROR)) @indent") column-0 0)
>
>I’m not sure about the original purpose for this rule, CC’ing Theo.
>
>Yuan
I'll look more deeply into the cause of this, but the rule is covering some preproc directives iirc.

Unfortunately tree-sitter behaves better when auto pairs is used. I would advise people to use electric-pairs-mode (if that's the correct name, on mobile now) to avoid these sorts of issues. 

Theo






* bug#62412: 29.0.60; strange c++ indentation behavior with tree sitter
  2023-03-24 20:04   ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-03-25  8:53     ` João Távora
  2023-03-25 10:19       ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-03-25 10:26       ` Herman, Géza
  2023-03-26 13:25     ` Daniel Martín via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 2 replies; 16+ messages in thread
From: João Távora @ 2023-03-25  8:53 UTC (permalink / raw)
  To: Theodor Thornhill; +Cc: Yuan Fu, 62412, geza.herman

On Fri, Mar 24, 2023 at 10:02 PM Theodor Thornhill via Bug reports for
GNU Emacs, the Swiss army knife of text editors
<bug-gnu-emacs@gnu.org> wrote:
> >I’m not sure about the original purpose for this rule, CC’ing Theo.
> >
> >Yuan
> I'll look more deeply into the cause of this, but the rule is covering some preproc directives iirc.
>
> Unfortunately tree-sitter behaves better when auto pairs is used. I would advise people to use electric-pairs-mode (if that's the correct name, on mobile now) to avoid these sorts of issues.

electric-pair-mode, it's not on by default.

But, for some reason, electric-indent-mode _is_ on by default,
at least in c++-ts-mode.

So this has nothing to do with tree-sitter IMO, it's just
electric-indent-mode doing its thing.

Why is it on by default?  A fair number of users don't like
this electricity, or prefer to have it toned down.  At least
this has been the argument for not turning on electric-pair-mode
by default, which is a much less jarring mode IMO, and one which
would solve these problems.

João






* bug#62412: 29.0.60; strange c++ indentation behavior with tree sitter
  2023-03-25  8:53     ` João Távora
@ 2023-03-25 10:19       ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-03-25 10:28         ` João Távora
  2023-03-25 10:26       ` Herman, Géza
  1 sibling, 1 reply; 16+ messages in thread
From: Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-03-25 10:19 UTC (permalink / raw)
  To: João Távora; +Cc: Yuan Fu, 62412, geza.herman



On 25 March 2023 09:53:31 CET, "João Távora" <joaotavora@gmail.com> wrote:
>On Fri, Mar 24, 2023 at 10:02 PM Theodor Thornhill via Bug reports for
>GNU Emacs, the Swiss army knife of text editors
><bug-gnu-emacs@gnu.org> wrote:
>> >I’m not sure about the original purpose for this rule, CC’ing Theo.
>> >
>> >Yuan
>> I'll look more deeply into the cause of this, but the rule is covering some preproc directives iirc.
>>
>> Unfortunately tree-sitter behaves better when auto pairs is used. I would advise people to use electric-pairs-mode (if that's the correct name, on mobile now) to avoid these sorts of issues.
>
>electric-pair-mode, it's not on by default.
>
>But, for some reason, electric-indent-mode _is_ on by default,
>at least in c++-ts-mode.
>
>So this has nothing to do with tree-sitter IMO, it's just
>electric-indent-mode doing its thing.
>
>Why is it on by default?  A fair number of users don't like
>this electricity, or prefer to have it toned down.  At least
>this has been the  argument for not turning on electric-pair-mode
>by default, which is a much less jarring mode IMO, and one which
>would solve these problems.
>
>João

Yeah, maybe! I was under the impression that indentation was electric by default in most modes, but I may be mistaken.

The reason I mentioned electric-pair-mode is that the parser fails less often when the closing paren or bracket is inserted, as it is then much simpler to get a functional AST.






* bug#62412: 29.0.60; strange c++ indentation behavior with tree sitter
  2023-03-25  8:53     ` João Távora
  2023-03-25 10:19       ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-03-25 10:26       ` Herman, Géza
  2023-03-25 11:43         ` João Távora
  1 sibling, 1 reply; 16+ messages in thread
From: Herman, Géza @ 2023-03-25 10:26 UTC (permalink / raw)
  To: João Távora, Theodor Thornhill; +Cc: Yuan Fu, 62412

That's the difference. c++-ts-mode appends "{}():;,#" to 
electric-indent-chars, while c++-mode doesn't do this.

Nevertheless, I think that the calculated indentation should be a
correct number. Even if the user turns off electric-indent-mode but
decides to re-indent a half-written for loop manually, Emacs
should re-indent the line properly. At least, this specific example
works OK with c++-mode.

(note that I understand that this problem is not trivial, tree 
sitter/emacs may get confused if the buffer cannot be properly parsed)

On 3/25/23 09:53, João Távora wrote:
> On Fri, Mar 24, 2023 at 10:02 PM Theodor Thornhill via Bug reports for
> GNU Emacs, the Swiss army knife of text editors
> <bug-gnu-emacs@gnu.org> wrote:
> > >I’m not sure about the original purpose for this rule, CC’ing Theo.
>>> Yuan
>> I'll look more deeply into the cause of this, but the rule is covering some preproc directives iirc.
>>
>> Unfortunately tree-sitter behaves better when auto pairs is used. I would advise people to use electric-pairs-mode (if that's the correct name, on mobile now) to avoid these sorts of issues.
> electric-pair-mode, it's not on by default.
>
> But, for some reason, electric-indent-mode _is_ on by default,
> at least in c++-ts-mode.
>
> So this has nothing to do with tree-sitter IMO, it's just
> electric-indent-mode doing its thing.
>
> Why is it on by default?  A fair number of users don't like
> this electricity, or prefer to have it toned down.  At least
> this has been the  argument for not turning on electric-pair-mode
> by default, which is a much less jarring mode IMO, and one which
> would solve these problems.
>
> João







* bug#62412: 29.0.60; strange c++ indentation behavior with tree sitter
  2023-03-25 10:19       ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-03-25 10:28         ` João Távora
  0 siblings, 0 replies; 16+ messages in thread
From: João Távora @ 2023-03-25 10:28 UTC (permalink / raw)
  To: Theodor Thornhill; +Cc: Yuan Fu, 62412, geza.herman

On Sat, Mar 25, 2023 at 10:19 AM Theodor Thornhill <theo@thornhill.no> wrote:
>
>
>
> On 25 March 2023 09:53:31 CET, "João Távora" <joaotavora@gmail.com> wrote:
> >On Fri, Mar 24, 2023 at 10:02 PM Theodor Thornhill via Bug reports for
> >GNU Emacs, the Swiss army knife of text editors
> ><bug-gnu-emacs@gnu.org> wrote:
> >> >I’m not sure about the original purpose for this rule, CC’ing Theo.
> >> >
> >> >Yuan
> >> I'll look more deeply into the cause of this, but the rule is covering some preproc directives iirc.
> >>
> >> Unfortunately tree-sitter behaves better when auto pairs is used. I would advise people to use electric-pairs-mode (if that's the correct name, on mobile now) to avoid these sorts of issues.
> >
> >electric-pair-mode, it's not on by default.
> >
> >But, for some reason, electric-indent-mode _is_ on by default,
> >at least in c++-ts-mode.
> >
> >So this has nothing to do with tree-sitter IMO, it's just
> >electric-indent-mode doing its thing.
> >
> >Why is it on by default?  A fair number of users don't like
> >this electricity, or prefer to have it toned down.  At least
> >this has been the  argument for not turning on electric-pair-mode
> >by default, which is a much less jarring mode IMO, and one which
> >would solve these problems.
> >
> >João
>
> Yeah, maybe! But I was under the impression that indentation was electric by default in most modes, but I may be mistaken.
>
> The reason I mentioned electric-pair-mode is that the parser fails less often when the closing paren or bracket is inserted, as it is much simpler to have a functional ast.

Sure.  I wrote electric-pair-mode and that's exactly the point.

electric-indent-mode is on by default, but many modes just
use the default value for electric-indent-chars, which only
contains a newline, and so they aren't affected by this problem.

In c++-ts-mode, you gave electric-indent-chars a richer value,
including characters such as '}', ':' and ';'.  This is not
unreasonable, but, as you've discovered, only really goes together
well with electric-pair-mode.
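
A minimal sketch of pairing the richer electric-indent-chars with automatic pairs, assuming pairing is wanted only in c++-ts-mode buffers (hence the buffer-local variant rather than the global `electric-pair-mode`):

```elisp
;; Assumption: enable pairing per buffer, only for c++-ts-mode.
;; `electric-pair-local-mode' is the buffer-local counterpart of
;; `electric-pair-mode'.
(add-hook 'c++-ts-mode-hook #'electric-pair-local-mode)
```

With the closing delimiter inserted automatically, the buffer is more often in a state the parser can handle when an electric character triggers re-indentation.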

João






* bug#62412: 29.0.60; strange c++ indentation behavior with tree sitter
  2023-03-25 10:26       ` Herman, Géza
@ 2023-03-25 11:43         ` João Távora
  2023-03-25 13:48           ` Herman, Géza
  2023-03-26 13:54           ` Herman, Géza
  0 siblings, 2 replies; 16+ messages in thread
From: João Távora @ 2023-03-25 11:43 UTC (permalink / raw)
  To: Herman, Géza; +Cc: Yuan Fu, Theodor Thornhill, 62412

On Sat, Mar 25, 2023 at 10:26 AM Herman, Géza <geza.herman@gmail.com> wrote:
>
> That's the difference. c++-ts-mode appends "{}():;,#" to
> electric-indent-chars, while c++-mode doesn't do this.
>
> Nevertheless, I think that the calculated indentation should be a
> correct number. Even if the user turns off electric-indent-mode but
> decides to re-indent a half-written for loop manually, Emacs
> should re-indent the line properly. At least, this specific example
> works OK with c++-mode.
>
> (note that I understand that this problem is not trivial, tree
> sitter/emacs may get confused if the buffer cannot be properly parsed)

There can be no "correct" indentation in a buffer with an invalid state.

But there are heuristics.  Here, it can be argued that c++-mode's
heuristics are better.

Let's assume you turn off electric-indent-mode. In c++-mode, pressing RET
after:

   int main() {

"correctly" indents the next line.  In c++-ts-mode, it doesn't.

Both programs are ill-formed but you're right that after correcting
that, by say adding 'return 0; RET }', the c++-mode version of the
same program is closer to being correctly indented.

But this heuristic is not always great, so it's a stick with two ends.

Now let's take another invalid program:
   int foo()
   class bar { | <- cursor here
   }

In c++-mode typing TAB indents the class line to the second column,
which is arguably worse than c++-ts-mode, which doesn't do anything.
That's because you may well want to work on that class and then only
remember that you need the ';' for the declaration of 'foo'.

IMO, it's a question of getting used to it in the end.  And using
electric-pair-mode helps a lot, as some have pointed out.

João






* bug#62412: 29.0.60; strange c++ indentation behavior with tree sitter
  2023-03-25 11:43         ` João Távora
@ 2023-03-25 13:48           ` Herman, Géza
  2023-03-25 16:23             ` João Távora
  2023-03-26 13:54           ` Herman, Géza
  1 sibling, 1 reply; 16+ messages in thread
From: Herman, Géza @ 2023-03-25 13:48 UTC (permalink / raw)
  To: João Távora; +Cc: Yuan Fu, Theodor Thornhill, 62412



On 3/25/23 12:43, João Távora wrote:
>
> There can be no "correct" indentation in a buffer with an invalid state.
>
> But there are heuristics.  Here, it can be argued that c++-mode's
> heuristics are better.
I agree. In my opinion, c++-mode's heuristics are good. Tree-sitter
support is new, so it's expected that it won't work perfectly. Also, it
doesn't have to handle every invalid program. But, while writing a
program, it should handle indentation sensibly. I don't think it's
a good approach that everybody who uses electric indent should get used
to the line jumping around whenever they are writing a for loop. It's a
bad experience.

Anyways, feel free to close this issue if you think otherwise. I just 
disabled ';'-caused auto indenting, so I don't see this unpleasant 
behavior any more.
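
The message does not say how the ';'-triggered indenting was disabled. One possible way, sketched here under that caveat, is to drop ';' from the buffer-local electric-indent-chars in a mode hook; the function name is hypothetical:

```elisp
(defun my/c++-ts-drop-semicolon-electric-indent ()
  "Stop `;' from triggering electric indentation in this buffer."
  ;; `remq' returns a list without the character; the other electric
  ;; characters set up by the mode are kept.
  (setq-local electric-indent-chars
              (remq ?\; electric-indent-chars)))

;; The hook runs after the mode has set up `electric-indent-chars'.
(add-hook 'c++-ts-mode-hook #'my/c++-ts-drop-semicolon-electric-indent)
```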






* bug#62412: 29.0.60; strange c++ indentation behavior with tree sitter
  2023-03-25 13:48           ` Herman, Géza
@ 2023-03-25 16:23             ` João Távora
  2023-03-25 17:41               ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-03-25 17:47               ` Herman, Géza
  0 siblings, 2 replies; 16+ messages in thread
From: João Távora @ 2023-03-25 16:23 UTC (permalink / raw)
  To: Herman, Géza; +Cc: Yuan Fu, Theodor Thornhill, 62412

"Herman, Géza" <geza.herman@gmail.com> writes:

> On 3/25/23 12:43, João Távora wrote:
>>
>> There can be no "correct" indentation in a buffer with an invalid state.
>>
>> But there are heuristics.  Here, it can be argued that c++-mode's
>> heuristics are better.
> I agree. In my opinion, c++-mode's heuristics are good.

That's probably only because we're _used_ to c++-mode.  If we had been
using c++-ts-mode for years, we would be equally surprised.

> Tree-sitter support is new, so it's expected that it won't work
> perfectly. Also, it doesn't have to handle every invalid program. But,
> while writing a program, it should handle indentation sensibly. I
> don't think that it's a good approach that everybody who uses electric
> indent should get used to the fact that whenever they are writing a for
> loop, the line will jump around. It's a bad experience.

But writing a for loop from scratch is only one of the editing
activities you do in a C++ file.  Other activities involve editing
existing code.  In those situations, c++-ts-mode's heuristics could
"win".  Unless you're willing to posit that writing code from scratch is
more frequent than editing existing code, there's no right answer here.

> Anyways, feel free to close this issue if you think otherwise. I just
> disabled ';'-caused auto indenting, so I don't see this unpleasant
> behavior any more.

Yes, I'm inclined to think that c++-ts-mode shouldn't add any chars to
electric-indent-chars.  It's just not useful.

João






* bug#62412: 29.0.60; strange c++ indentation behavior with tree sitter
  2023-03-25 16:23             ` João Távora
@ 2023-03-25 17:41               ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-09-12  0:03                 ` Stefan Kangas
  2023-03-25 17:47               ` Herman, Géza
  1 sibling, 1 reply; 16+ messages in thread
From: Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-03-25 17:41 UTC (permalink / raw)
  To: João Távora, Herman, Géza; +Cc: Yuan Fu, 62412



On 25 March 2023 17:23:35 CET, "João Távora" <joaotavora@gmail.com> wrote:
>"Herman, Géza" <geza.herman@gmail.com> writes:
>
>> On 3/25/23 12:43, João Távora wrote:
>>>
>>> There can be no "correct" indentation in a buffer with an invalid state.
>>>
>>> But there are heuristics.  Here, it can be argued that c++-mode's
>>> heuristics are better.
>> I agree. In my opinion, c++-mode's heuristics are good.
>
>That's probably only because we're _used_ to c++-mode.  If we had been
>using c++-ts-mode for years, we would be equally surprised.
>
>> Tree-sitter support is new, so it's expected that it won't work
>> perfectly. Also, it doesn't have to handle every invalid program. But,
>> while writing a program, it should handle indentation sensibly. I
>> don't think that it's a good approach that everybody who uses electric
>> indent should get used to the fact that whenever they are writing a for
>> loop, the line will jump around. It's a bad experience.
>
>But writing a for loop from scratch is only one of the editing
>activities you do in a C++ file.  Other activities involve editing
>existing code.  In those situations, c++-ts-mode's heuristics could
>"win".  Unless you're willing to posit that writing code from scratch is
>more frequent than editing existing code, there's no right answer here.
>
>> Anyways, feel free to close this issue if you think otherwise. I just
>> disabled ';'-caused auto indenting, so I don't see this unpleasant
>> behavior any more.
>
>Yes, I'm inclined to think that c++-ts-mode shouldn't add any chars to
>electric-indent-chars.  It's just not useful.
>
>João

I won't object to this, as I hold no strong opinions either way. Would removing it cause less confusion, yet still reindent correctly in most cases?

If so, feel free to remove it, unless anyone else objects :)

Theo






* bug#62412: 29.0.60; strange c++ indentation behavior with tree sitter
  2023-03-25 16:23             ` João Távora
  2023-03-25 17:41               ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-03-25 17:47               ` Herman, Géza
  2023-03-25 17:53                 ` João Távora
  1 sibling, 1 reply; 16+ messages in thread
From: Herman, Géza @ 2023-03-25 17:47 UTC (permalink / raw)
  To: João Távora; +Cc: Yuan Fu, Theodor Thornhill, 62412



On 3/25/23 17:23, João Távora wrote:
>
>> I agree. In my opinion, c++-mode's heuristics are good.
> That's probably only because we're _used_ to c++-mode.  If we had been
> using c++-ts-mode for years, we would be equally surprised.
Yes, that can be true.

>
>> Tree-sitter support is new, so it's expected that it won't work
>> perfectly. Also, it doesn't have to handle every invalid program. But,
>> while writing a program, it should handle indentation sensibly. I
>> don't think that it's a good approach that everybody who uses electric
>> indent should get used to the fact that whenever they are writing a for
>> loop, the line will jump around. It's a bad experience.
> But writing a for loop from scratch is only one of the editing
> activities you do in a C++ file.  Other activities involve editing
> existing code.  In those situations, c++-ts-mode's heuristics could
> "win".  Unless you're willing to posit that writing code from scratch is
> more frequent than editing existing code, there's no right answer here.
What is the c++-ts-mode heuristic here that makes it remove the
indentation? I don't really understand why it does that. Is there a
similar-looking situation where removing the indentation is the sensible
behavior?


>> Anyways, feel free to close this issue if you think otherwise. I just
>> disabled ';'-caused auto indenting, so I don't see this unpleasant
>> behavior any more.
> Yes, I'm inclined to think that c++-ts-mode shouldn't add any chars to
> electric-indent-chars.  It's just not useful.
>
As far as I know, c++-mode has ';'-caused auto indenting, it just works 
with a different mechanism. So if the aim is that the two c++ modes 
should work similarly out of the box, then it'd make sense to keep 
electric-indent-chars as is.






* bug#62412: 29.0.60; strange c++ indentation behavior with tree sitter
  2023-03-25 17:47               ` Herman, Géza
@ 2023-03-25 17:53                 ` João Távora
  0 siblings, 0 replies; 16+ messages in thread
From: João Távora @ 2023-03-25 17:53 UTC (permalink / raw)
  To: Herman, Géza; +Cc: Yuan Fu, Theodor Thornhill, 62412

On Sat, Mar 25, 2023 at 5:47 PM Herman, Géza <geza.herman@gmail.com> wrote:

> So if the aim is that the two c++ modes
> should work similarly out of the box, then it'd make sense to keep
> electric-indent-chars as is.

Only _if_ it is generally understood that c++-ts-mode's heuristics are
worse.  And only _then_ if they can somehow be fixed to become better.

Until these conditions are satisfied, I think electric-indent-chars
has to be left alone.

João






* bug#62412: 29.0.60; strange c++ indentation behavior with tree sitter
  2023-03-24 20:04   ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-03-25  8:53     ` João Távora
@ 2023-03-26 13:25     ` Daniel Martín via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 0 replies; 16+ messages in thread
From: Daniel Martín via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-03-26 13:25 UTC (permalink / raw)
  To: Theodor Thornhill; +Cc: Yuan Fu, 62412, geza.herman

Theodor Thornhill <theo@thornhill.no> writes:

> I'll look more deeply into the cause of this, but the rule is covering some preproc directives iirc.
>
> Unfortunately tree-sitter behaves better when auto pairs is used. I
> would advise people to use electric-pairs-mode (if that's the correct
> name, on mobile now) to avoid these sorts of issues.
>

Yes, I think that having an error node in the indentation rules is not a
good idea.  It can cause unexpected issues like the one described in
this thread.  I'd explore the idea of removing error nodes from the
rules before resorting to tweaking the electric or pairing features of
Emacs.

Let's look into the problem that the introduction of the error node
tried to solve, the indentation of preprocessor directives.  Starting
with this code:

int
main()
{
  |
}

if I type '#', automatic indentation does not happen because, at that
stage, the AST doesn't recognize the full preprocessor directive (the
node in the AST is an error node).  If I continue writing the
preprocessor directive (say, "#ifdef DEBUG"), the preprocessor node is
created correctly, but it would require a manual TAB to go to column 0
because we haven't inserted any electric character while we completed
the directive.  The same manual TAB is required by c++-mode, so I
wouldn't see this as a regression.

However, there might be still minor divergences between c++-mode and
c++-ts-mode.  For example:

int
main()
{
#if
}

This in-progress code would indent correctly in c++-mode, but on
c-ts-mode the node is an error node, so we won't reliably know that it
should indent to column 0.  If we want to fix these minor divergences, I
see two possible approaches:

- Investigate if the C/C++ grammars can be improved to cover these cases
  better.

- Without changing the grammars, could we insert our own preprocessor
  nodes in the AST tree by checking if the first non-whitespace
  character of the line is the beginning of #assert, #define, #include,
  #if, #ifndef, #elif, etc.?
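
As a rough illustration of the second idea, here is a text-based matcher for `treesit-simple-indent-rules`. Note that this is a simpler, different technique from inserting nodes into the AST: it only looks at the line's text. The function name `my/preproc-line-p` and the exact directive list are assumptions made for the sketch.

```elisp
(defun my/preproc-line-p (_node _parent bol &rest _)
  "Return non-nil if the text at BOL looks like a preprocessor directive.
BOL is the position of the first non-whitespace character on the line."
  (save-excursion
    (goto-char bol)
    (looking-at-p
     (rx "#" (* blank)
         (or "assert" "define" "include" "ifdef" "ifndef" "if"
             "elif" "else" "endif")))))

;; Shape of an indent rule using the matcher (MATCHER ANCHOR OFFSET):
;; anchor at column 0 with offset 0.
;;   (my/preproc-line-p column-0 0)
```

Because the matcher ignores the parse tree, it would apply whether or not the directive is an error node; whether that is acceptable for lines inside strings or comments is left open here.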






* bug#62412: 29.0.60; strange c++ indentation behavior with tree sitter
  2023-03-25 11:43         ` João Távora
  2023-03-25 13:48           ` Herman, Géza
@ 2023-03-26 13:54           ` Herman, Géza
  1 sibling, 0 replies; 16+ messages in thread
From: Herman, Géza @ 2023-03-26 13:54 UTC (permalink / raw)
  To: João Távora; +Cc: Yuan Fu, Theodor Thornhill, 62412



On 3/25/23 12:43, João Távora wrote:
> Let's assume you turn off electric-indent-mode. In c++-mode, pressing RET
> after:
>
>     int main() {
>
> "correctly" indents the next line.  In c++-ts-mode, it doesn't.
>
> Both programs are ill-formed but you're right that after correcting
> that, by say adding 'return 0; RET }', the c++-mode version of the
> same program is closer to being correctly indented.
I'd say that the current c++-ts-mode behavior is very bad for this example.

If you type this into an empty buffer:

int main() {

and press RET, the new line won't get indented. It gets worse, though,
because TAB doesn't work either (it doesn't do anything). Presumably this
is because tree-sitter has the wrong idea of the indentation: if I add
indentation by using spaces, pressing TAB deletes the spaces.

(Note: it doesn't matter whether electric-indent-mode is turned on or 
off, same thing happens).
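
To see what the parser produced at point in a case like this, a small helper can print the chain of node types; this is only a debugging sketch, the command name is made up, and it assumes the buffer already has a tree-sitter parser (as it does in c++-ts-mode). The built-in `treesit-explore-mode` gives a fuller view of the tree.

```elisp
(defun my/treesit-node-chain-at-point ()
  "Show node types from the root down to the node at point."
  (interactive)
  (let ((node (treesit-node-at (point)))
        (types nil))
    ;; Walk upward; pushing each type leaves the root first in the list.
    (while node
      (push (treesit-node-type node) types)
      (setq node (treesit-node-parent node)))
    (message "%s" (mapconcat #'identity types " > "))))
```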







* bug#62412: 29.0.60; strange c++ indentation behavior with tree sitter
  2023-03-25 17:41               ` Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-09-12  0:03                 ` Stefan Kangas
  0 siblings, 0 replies; 16+ messages in thread
From: Stefan Kangas @ 2023-09-12  0:03 UTC (permalink / raw)
  To: Theodor Thornhill
  Cc: 62412-done, Yuan Fu, João Távora, Herman Géza

Theodor Thornhill <theo@thornhill.no> writes:

> On 25 March 2023 17:23:35 CET, "João Távora" <joaotavora@gmail.com> wrote:
>>"Herman, Géza" <geza.herman@gmail.com> writes:
>>
>>> Anyways, feel free to close this issue if you think otherwise. I
>>> just disabled ';'-caused auto indenting, so I don't see this
>>> unpleasant behavior any more.
>>
>>Yes, I'm inclined to think that c++-ts-mode shouldn't add any chars to
>>electric-indent-chars.  It's just not useful.
>
> I won't object to this, as I hold no strong opinions either way.

It seems like this is not something we want to do, so I'm closing this
bug.





