unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
* bug#67417: 29.1.50; c-ts-mode syntax issues with no brackets
@ 2023-11-23 20:55 Arteen Abrishami via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-11-24  7:23 ` Eli Zaretskii
  0 siblings, 1 reply; 12+ messages in thread
From: Arteen Abrishami via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-11-23 20:55 UTC (permalink / raw)
  To: 67417

This is specifically for the usage of `c-ts-mode` and is not a problem
in `c-mode`. Sometimes, when you type something like:

else
break

it won't indent the "break" until you type a semicolon. In this below
scenario, it does not indent the break at all, but `c-mode` does, and
switching from `c-mode` to `c-ts-mode` with correct indentation leaves
it fixed, but `c-ts-mode` cannot detect or fix it itself.

You can put it into a `.c` buffer all by itself and see:

```
unsigned
heap_pop(struct heapq * heap)
{
  if (heap->sz == 0)
    return -1;

  unsigned ret_val = heap->vals[0];
  heap->vals[0] = heap->vals[heap->len];
  heap->len -= 1;
  unsigned i = 0;
  unsigned lc;

  while ((lc = HEAPQ_L_CHILD(i)) < heap->len)
    {
      unsigned rc = HEAPQ_R_CHILD(i);
      /* no right child for our guy, special case */      
      if (rc == heap->len)
        {
          if (heap->vals[lc] < heap->vals[i])
            SWAP(heap->vals[lc], heap->vals[i]);
          break;
        }

      if (heap->vals[lc] < heap->vals[i])
        {
          SWAP(heap->vals[lc], heap->vals[i]);
          i = lc;
        }
      else if (heap->vals[rc] < heap->vals[i])
        {
          SWAP(heap->vals[rc], heap->vals[i]);
          i = rc;
        }
      else
      break;
      
    }
}
```

The very last break on the else without brackets around it will not indent.c


In GNU Emacs 29.1.50 (build 2, x86_64-apple-darwin23.0.0, NS
appkit-2487.00 Version 14.0 (Build 23A344)) of 2023-09-30 built on
Arteens-MacBook-Pro.local
Repository revision: 63ec6d998d42c970a825dca17743ece9c651d929
Repository branch: emacs-29
Windowing system distributor 'Apple', version 10.3.2487
System Description:  macOS 14.1.1

Configured using:
'configure --with-native-compilation=aot --without-pop
--with-tree-sitter --with-json 'CFLAGS=-O2 -pipe'
CPPFLAGS=-I/usr/local/opt/openjdk/bin:/usr/local/opt/openjdk/bin:/usr/local/opt/make/libexec/gnubin:/usr/local/opt/python@3.11/libexec/bin:/usr/local/sbin:/Library/Frameworks/Python.framework/Versions/3.11/bin:/usr/local/bin:/System/Cryptexes/App/usr/bin:/usr/bin:/bin:/usr/sbin:/sbin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/local/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/appleinternal/bin:/Library/Apple/usr/bin:/Library/TeX/texbin'

Configured features:
ACL GIF GMP GNUTLS JPEG JSON LCMS2 LIBXML2 MODULES NATIVE_COMP NOTIFY
KQUEUE NS PDUMPER PNG SQLITE3 THREADS TIFF TOOLKIT_SCROLL_BARS
TREE_SITTER XIM ZLIB

Important settings:
  value of $LANG: en_US.UTF-8
  locale-coding-system: utf-8-unix

Major mode: C/*

Minor modes in effect:
  override-global-mode: t
  windmove-mode: t
  winner-mode: t
  delete-selection-mode: t
  global-auto-revert-mode: t
  tooltip-mode: t
  global-eldoc-mode: t
  show-paren-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  column-number-mode: t
  line-number-mode: t
  transient-mark-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  hs-minor-mode: t

Load-path shadows:
None found.

Features:
(shadow sort mail-extr emacsbug message yank-media puny rfc822 mml
mml-sec epa derived epg rfc6068 epg-config gnus-util
text-property-search mm-decode mm-bodies mm-encode mail-parse rfc2231
mailabbrev gmm-utils mailheader sendmail rfc2047 rfc2045 ietf-drums
mm-util mail-prsvr mail-utils c++-ts-mode c-ts-mode c-ts-common treesit
misearch multi-isearch time-date comp comp-cstr warnings icons vc-git
diff-mode vc-dispatcher cc-mode cc-fonts cc-guess cc-menus cc-cmds
cc-styles cc-align cc-engine cc-vars cc-defs hideshow dired-subtree
dired-hacks-utils dired dired-loaddefs dash edmacro kmacro
use-package-bind-key bind-key easy-mmode rx doom-themes-ext-org
doom-themes-ext-visual-bell face-remap doom-vibrant-theme doom-themes
doom-themes-base exec-path-from-shell cl-extra help-mode
use-package-ensure use-package-core windmove winner ring delsel
autorevert filenotify dired-sidebar-autoloads dired-subtree-autoloads
dired-hacks-utils-autoloads info dash-autoloads doom-themes-autoloads
exec-path-from-shell-autoloads expand-region-autoloads package
browse-url url url-proxy url-privacy url-expand url-methods url-history
url-cookie generate-lisp-file url-domsuf url-util mailcap url-handlers
url-parse auth-source cl-seq eieio eieio-core cl-macs password-cache
json subr-x map byte-opt gv bytecomp byte-compile url-vars cl-loaddefs
cl-lib rmc iso-transl tooltip cconv eldoc paren electric uniquify
ediff-hook vc-hooks lisp-float-type elisp-mode mwheel term/ns-win ns-win
ucs-normalize mule-util term/common-win tool-bar dnd fontset image
regexp-opt fringe tabulated-list replace newcomment text-mode lisp-mode
prog-mode register page tab-bar menu-bar rfn-eshadow isearch easymenu
timer select scroll-bar mouse jit-lock font-lock syntax font-core
term/tty-colors frame minibuffer nadvice seq simple cl-generic
indonesian philippine cham georgian utf-8-lang misc-lang vietnamese
tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek
romanian slovak czech european ethiopic indian cyrillic chinese
composite emoji-zwj charscript charprop case-table epa-hook
jka-cmpr-hook help abbrev obarray oclosure cl-preloaded button loaddefs
theme-loaddefs faces cus-face macroexp files window text-properties
overlay sha1 md5 base64 format env code-pages mule custom widget keymap
hashtable-print-readable backquote threads kqueue cocoa ns lcms2
multi-tty make-network-process native-compile emacs)

Memory information:
((conses 16 240395 9962)
(symbols 48 13899 0)
(strings 32 41716 3048)
(string-bytes 1 1380541)
(vectors 16 26557)
(vector-slots 8 519687 16325)
(floats 8 260 374)
(intervals 56 5432 0)
(buffers 984 16))





^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#67417: 29.1.50; c-ts-mode syntax issues with no brackets
  2023-11-23 20:55 bug#67417: 29.1.50; c-ts-mode syntax issues with no brackets Arteen Abrishami via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-11-24  7:23 ` Eli Zaretskii
  2023-11-24 18:26   ` Dmitry Gutov
  0 siblings, 1 reply; 12+ messages in thread
From: Eli Zaretskii @ 2023-11-24  7:23 UTC (permalink / raw)
  To: Arteen Abrishami, Yuan Fu; +Cc: 67417

> Date: Thu, 23 Nov 2023 12:55:31 -0800
> From:  Arteen Abrishami via "Bug reports for GNU Emacs,
>  the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org>
> 
> This is specifically for the usage of `c-ts-mode` and is not a problem
> in `c-mode`. Sometimes, when you type something like:
> 
> else
> break
> 
> it won't indent the "break" until you type a semicolon. In this below
> scenario, it does not indent the break at all, but `c-mode` does, and
> switching from `c-mode` to `c-ts-mode` with correct indentation leaves
> it fixed, but `c-ts-mode` cannot detect or fix it itself.
> 
> You can put it into a `.c` buffer all by itself and see:
> 
> ```
> unsigned
> heap_pop(struct heapq * heap)
> {
>   if (heap->sz == 0)
>     return -1;
> 
>   unsigned ret_val = heap->vals[0];
>   heap->vals[0] = heap->vals[heap->len];
>   heap->len -= 1;
>   unsigned i = 0;
>   unsigned lc;
> 
>   while ((lc = HEAPQ_L_CHILD(i)) < heap->len)
>     {
>       unsigned rc = HEAPQ_R_CHILD(i);
>       /* no right child for our guy, special case */      
>       if (rc == heap->len)
>         {
>           if (heap->vals[lc] < heap->vals[i])
>             SWAP(heap->vals[lc], heap->vals[i]);
>           break;
>         }
> 
>       if (heap->vals[lc] < heap->vals[i])
>         {
>           SWAP(heap->vals[lc], heap->vals[i]);
>           i = lc;
>         }
>       else if (heap->vals[rc] < heap->vals[i])
>         {
>           SWAP(heap->vals[rc], heap->vals[i]);
>           i = rc;
>         }
>       else
>       break;
>       
>     }
> }
> ```
> 
> The very last break on the else without brackets around it will not indent.c

Yuan, any comments?

My personal take on this is that as long as typing the required
semi-colons fixes the indentation, we are okay in these cases, but if
we can do better (i.e. if the problem is not that tree-sitter returns
a tree with an error node), we should fix this even without relying on
the electric semi-colon.

In the specific example above, it looks like tree-sitter does succeed
in parsing and shows a valid tree:

	   alternative: 
	    (else_clause else
	     (break_statement break ;)))))

So I wonder why we don't indent the "break;" part here.

Comparison with c-mode is not relevant here, btw, since c-mode uses a
very different strategy for computing indentation.





^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#67417: 29.1.50; c-ts-mode syntax issues with no brackets
  2023-11-24  7:23 ` Eli Zaretskii
@ 2023-11-24 18:26   ` Dmitry Gutov
  2023-11-27  1:47     ` Yuan Fu
  0 siblings, 1 reply; 12+ messages in thread
From: Dmitry Gutov @ 2023-11-24 18:26 UTC (permalink / raw)
  To: Eli Zaretskii, Arteen Abrishami, Yuan Fu; +Cc: 67417

On 24/11/2023 09:23, Eli Zaretskii wrote:
>> Date: Thu, 23 Nov 2023 12:55:31 -0800
>> From:  Arteen Abrishami via "Bug reports for GNU Emacs,
>>   the Swiss army knife of text editors"<bug-gnu-emacs@gnu.org>
>>
>> This is specifically for the usage of `c-ts-mode` and is not a problem
>> in `c-mode`. Sometimes, when you type something like:
>>
>> else
>> break
>>
>> it won't indent the "break" until you type a semicolon. In this below
>> scenario, it does not indent the break at all, but `c-mode` does, and
>> switching from `c-mode` to `c-ts-mode` with correct indentation leaves
>> it fixed, but `c-ts-mode` cannot detect or fix it itself.
>>
>> You can put it into a `.c` buffer all by itself and see:
>>
>> ```
>> unsigned
>> heap_pop(struct heapq * heap)
>> {
>>    if (heap->sz == 0)
>>      return -1;
>>
>>    unsigned ret_val = heap->vals[0];
>>    heap->vals[0] = heap->vals[heap->len];
>>    heap->len -= 1;
>>    unsigned i = 0;
>>    unsigned lc;
>>
>>    while ((lc = HEAPQ_L_CHILD(i)) < heap->len)
>>      {
>>        unsigned rc = HEAPQ_R_CHILD(i);
>>        /* no right child for our guy, special case */
>>        if (rc == heap->len)
>>          {
>>            if (heap->vals[lc] < heap->vals[i])
>>              SWAP(heap->vals[lc], heap->vals[i]);
>>            break;
>>          }
>>
>>        if (heap->vals[lc] < heap->vals[i])
>>          {
>>            SWAP(heap->vals[lc], heap->vals[i]);
>>            i = lc;
>>          }
>>        else if (heap->vals[rc] < heap->vals[i])
>>          {
>>            SWAP(heap->vals[rc], heap->vals[i]);
>>            i = rc;
>>          }
>>        else
>>        break;
>>        
>>      }
>> }
>> ```
>>
>> The very last break on the else without brackets around it will not indent.c
> Yuan, any comments?
> 
> My personal take on this is that as long as typing the required
> semi-colons fixes the indentation, we are okay in these cases, but if
> we can do better (i.e. if the problem is not that tree-sitter returns
> a tree with an error node), we should fix this even without relying on
> the electric semi-colon.
> 
> In the specific example above, it looks like tree-sitter does succeed
> in parsing and shows a valid tree:
> 
> 	   alternative:
> 	    (else_clause else
> 	     (break_statement break ;)))))
> 
> So I wonder why we don't indent the "break;" part here.

In my testing, it indents fine when after "else" there is either:

  * some char(s) followed by closing curly
  * or (optionally) some char(s) followed by semicolon

When there is _no_ code between "else" and the closing curly, it already 
indents fine in my testing (whether the semicolon is added or not).

Without either, the text after "else" isn't parsed as "alternative:" -- 
it's parsed as a sibling of the "else" node. And, most unfortunately, 
when "else" is followed by a closing curly, it's just parsed as (ERROR 
else), so simply pressing RET does not indent the empty line properly 
even when one is working with electric-pair-mode enabled.

I'd personally consider the last one a more definite bug in the grammar, 
but maybe there is some good reason for it. I haven't found anything 
relevant in the bug tracker.

BTW, it seems like the latest C grammar changed how else without braces 
is parsed, so "break" isn't reindented even with semicolon at the end.





^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#67417: 29.1.50; c-ts-mode syntax issues with no brackets
  2023-11-24 18:26   ` Dmitry Gutov
@ 2023-11-27  1:47     ` Yuan Fu
  2023-11-27  2:22       ` Dmitry Gutov
  0 siblings, 1 reply; 12+ messages in thread
From: Yuan Fu @ 2023-11-27  1:47 UTC (permalink / raw)
  To: Dmitry Gutov, Eli Zaretskii, Arteen Abrishami; +Cc: 67417


On 11/24/23 10:26 AM, Dmitry Gutov wrote:
> On 24/11/2023 09:23, Eli Zaretskii wrote:
>>> Date: Thu, 23 Nov 2023 12:55:31 -0800
>>> From:  Arteen Abrishami via "Bug reports for GNU Emacs,
>>>   the Swiss army knife of text editors"<bug-gnu-emacs@gnu.org>
>>>
>>> This is specifically for the usage of `c-ts-mode` and is not a problem
>>> in `c-mode`. Sometimes, when you type something like:
>>>
>>> else
>>> break
>>>
>>> it won't indent the "break" until you type a semicolon. In this below
>>> scenario, it does not indent the break at all, but `c-mode` does, and
>>> switching from `c-mode` to `c-ts-mode` with correct indentation leaves
>>> it fixed, but `c-ts-mode` cannot detect or fix it itself.
>>>
>>> You can put it into a `.c` buffer all by itself and see:
>>>
>>> ```
>>> unsigned
>>> heap_pop(struct heapq * heap)
>>> {
>>>    if (heap->sz == 0)
>>>      return -1;
>>>
>>>    unsigned ret_val = heap->vals[0];
>>>    heap->vals[0] = heap->vals[heap->len];
>>>    heap->len -= 1;
>>>    unsigned i = 0;
>>>    unsigned lc;
>>>
>>>    while ((lc = HEAPQ_L_CHILD(i)) < heap->len)
>>>      {
>>>        unsigned rc = HEAPQ_R_CHILD(i);
>>>        /* no right child for our guy, special case */
>>>        if (rc == heap->len)
>>>          {
>>>            if (heap->vals[lc] < heap->vals[i])
>>>              SWAP(heap->vals[lc], heap->vals[i]);
>>>            break;
>>>          }
>>>
>>>        if (heap->vals[lc] < heap->vals[i])
>>>          {
>>>            SWAP(heap->vals[lc], heap->vals[i]);
>>>            i = lc;
>>>          }
>>>        else if (heap->vals[rc] < heap->vals[i])
>>>          {
>>>            SWAP(heap->vals[rc], heap->vals[i]);
>>>            i = rc;
>>>          }
>>>        else
>>>        break;
>>>             }
>>> }
>>> ```
>>>
>>> The very last break on the else without brackets around it will not 
>>> indent.c
>> Yuan, any comments?
>>
>> My personal take on this is that as long as typing the required
>> semi-colons fixes the indentation, we are okay in these cases, but if
>> we can do better (i.e. if the problem is not that tree-sitter returns
>> a tree with an error node), we should fix this even without relying on
>> the electric semi-colon.
>>
>> In the specific example above, it looks like tree-sitter does succeed
>> in parsing and shows a valid tree:
>>
>>        alternative:
>>         (else_clause else
>>          (break_statement break ;)))))
>>
>> So I wonder why we don't indent the "break;" part here.
>
> In my testing, it indents fine when after "else" there is either:
>
>  * some char(s) followed by closing curly
>  * or (optionally) some char(s) followed by semicolon
>
> When there is _no_ code between "else" and the closing curly, it 
> already indents fine in my testing (whether the semicolon is added or 
> not).
>
> Without either, the text after "else" isn't parsed as "alternative:" 
> -- it's parsed as a sibling of the "else" node. And, most 
> unfortunately, when "else" is followed by a closing curly, it's just 
> parsed as (ERROR else), so simply pressing RET does not indent the 
> empty line properly even when one is working with electric-pair-mode 
> enabled.
>
> I'd personally consider the last one a more definite bug in the 
> grammar, but maybe there is some good reason for it. I haven't found 
> anything relevant in the bug tracker.
>
> BTW, it seems like the latest C grammar changed how else without 
> braces is parsed, so "break" isn't reindented even with semicolon at 
> the end.

I pushed two commits which should fix the indentation for "break" after 
"else", and indentation for empty lines after if/else/for/while in 
general. The fix for the general case doesn't use the parse tree, since 
the parse tree is often incomplete when you type if (...) and hit 
return. Instead it uses a plain regexp match to see if the previous line 
starts with if/else/for/while. This seems like a reasonable heuristic to 
use before user types more things, at which point more accurate 
indentation rules would be used, since the parse tree should be more 
complete then.

Yuan






^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#67417: 29.1.50; c-ts-mode syntax issues with no brackets
  2023-11-27  1:47     ` Yuan Fu
@ 2023-11-27  2:22       ` Dmitry Gutov
  2023-11-28  6:55         ` Yuan Fu
  0 siblings, 1 reply; 12+ messages in thread
From: Dmitry Gutov @ 2023-11-27  2:22 UTC (permalink / raw)
  To: Yuan Fu, Eli Zaretskii, Arteen Abrishami; +Cc: 67417

On 27/11/2023 03:47, Yuan Fu wrote:
> I pushed two commits which should fix the indentation for "break" after 
> "else", and indentation for empty lines after if/else/for/while in 
> general. The fix for the general case doesn't use the parse tree, since 
> the parse tree is often incomplete when you type if (...) and hit 
> return. Instead it uses a plain regexp match to see if the previous line 
> starts with if/else/for/while. This seems like a reasonable heuristic to 
> use before user types more things, at which point more accurate 
> indentation rules would be used, since the parse tree should be more 
> complete then.

Sorry, two counter-examples right away:

Type 'elsewhere();' and RET -> the next line is indented 1 level extra, 
at least until you type some more and then have the line reindented 
either with pressing TAB or adding semicolon.

Type 'for (;;) {}' and RET -> same.

The first case is easy to guard against (just check that the next char 
is either space of opening paren), but the second one less so. OTOH, the 
second case is likely to have a parse tree without errors, so if we also 
check for that... the heuristic might work.





^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#67417: 29.1.50; c-ts-mode syntax issues with no brackets
  2023-11-27  2:22       ` Dmitry Gutov
@ 2023-11-28  6:55         ` Yuan Fu
  2023-11-28 14:27           ` Eli Zaretskii
  2023-11-28 15:31           ` Dmitry Gutov
  0 siblings, 2 replies; 12+ messages in thread
From: Yuan Fu @ 2023-11-28  6:55 UTC (permalink / raw)
  To: Dmitry Gutov, Eli Zaretskii, Arteen Abrishami; +Cc: 67417



On 11/26/23 6:22 PM, Dmitry Gutov wrote:
> On 27/11/2023 03:47, Yuan Fu wrote:
>> I pushed two commits which should fix the indentation for "break" 
>> after "else", and indentation for empty lines after if/else/for/while 
>> in general. The fix for the general case doesn't use the parse tree, 
>> since the parse tree is often incomplete when you type if (...) and 
>> hit return. Instead it uses a plain regexp match to see if the 
>> previous line starts with if/else/for/while. This seems like a 
>> reasonable heuristic to use before user types more things, at which 
>> point more accurate indentation rules would be used, since the parse 
>> tree should be more complete then.
>
> Sorry, two counter-examples right away:
>
> Type 'elsewhere();' and RET -> the next line is indented 1 level 
> extra, at least until you type some more and then have the line 
> reindented either with pressing TAB or adding semicolon.
>
> Type 'for (;;) {}' and RET -> same.
>
> The first case is easy to guard against (just check that the next char 
> is either space of opening paren), but the second one less so. OTOH, 
> the second case is likely to have a parse tree without errors, so if 
> we also check for that... the heuristic might work.

Well, darn it. And you're right, the second case is a bit hard to 
check... Well I guess for the moment we can remove this heuristic. (I 
tried a bit, and checking for no errors is not so easy.)

Yuan





^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#67417: 29.1.50; c-ts-mode syntax issues with no brackets
  2023-11-28  6:55         ` Yuan Fu
@ 2023-11-28 14:27           ` Eli Zaretskii
  2023-11-28 15:31           ` Dmitry Gutov
  1 sibling, 0 replies; 12+ messages in thread
From: Eli Zaretskii @ 2023-11-28 14:27 UTC (permalink / raw)
  To: Yuan Fu; +Cc: 67417, dmitry, arteen

> Date: Mon, 27 Nov 2023 22:55:31 -0800
> Cc: 67417@debbugs.gnu.org
> From: Yuan Fu <casouri@gmail.com>
> 
> 
> 
> On 11/26/23 6:22 PM, Dmitry Gutov wrote:
> > On 27/11/2023 03:47, Yuan Fu wrote:
> >> I pushed two commits which should fix the indentation for "break" 
> >> after "else", and indentation for empty lines after if/else/for/while 
> >> in general. The fix for the general case doesn't use the parse tree, 
> >> since the parse tree is often incomplete when you type if (...) and 
> >> hit return. Instead it uses a plain regexp match to see if the 
> >> previous line starts with if/else/for/while. This seems like a 
> >> reasonable heuristic to use before user types more things, at which 
> >> point more accurate indentation rules would be used, since the parse 
> >> tree should be more complete then.
> >
> > Sorry, two counter-examples right away:
> >
> > Type 'elsewhere();' and RET -> the next line is indented 1 level 
> > extra, at least until you type some more and then have the line 
> > reindented either with pressing TAB or adding semicolon.
> >
> > Type 'for (;;) {}' and RET -> same.
> >
> > The first case is easy to guard against (just check that the next char 
> > is either space of opening paren), but the second one less so. OTOH, 
> > the second case is likely to have a parse tree without errors, so if 
> > we also check for that... the heuristic might work.
> 
> Well, darn it. And you're right, the second case is a bit hard to 
> check... Well I guess for the moment we can remove this heuristic. (I 
> tried a bit, and checking for no errors is not so easy.)

Does this mean you need to revert something on the emacs-29 branch?





^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#67417: 29.1.50; c-ts-mode syntax issues with no brackets
  2023-11-28  6:55         ` Yuan Fu
  2023-11-28 14:27           ` Eli Zaretskii
@ 2023-11-28 15:31           ` Dmitry Gutov
  2023-12-01  7:58             ` Yuan Fu
  1 sibling, 1 reply; 12+ messages in thread
From: Dmitry Gutov @ 2023-11-28 15:31 UTC (permalink / raw)
  To: Yuan Fu, Eli Zaretskii, Arteen Abrishami; +Cc: 67417

On 28/11/2023 08:55, Yuan Fu wrote:
> 
> 
> On 11/26/23 6:22 PM, Dmitry Gutov wrote:
>> On 27/11/2023 03:47, Yuan Fu wrote:
>>> I pushed two commits which should fix the indentation for "break" 
>>> after "else", and indentation for empty lines after if/else/for/while 
>>> in general. The fix for the general case doesn't use the parse tree, 
>>> since the parse tree is often incomplete when you type if (...) and 
>>> hit return. Instead it uses a plain regexp match to see if the 
>>> previous line starts with if/else/for/while. This seems like a 
>>> reasonable heuristic to use before user types more things, at which 
>>> point more accurate indentation rules would be used, since the parse 
>>> tree should be more complete then.
>>
>> Sorry, two counter-examples right away:
>>
>> Type 'elsewhere();' and RET -> the next line is indented 1 level 
>> extra, at least until you type some more and then have the line 
>> reindented either with pressing TAB or adding semicolon.
>>
>> Type 'for (;;) {}' and RET -> same.
>>
>> The first case is easy to guard against (just check that the next char 
>> is either space of opening paren), but the second one less so. OTOH, 
>> the second case is likely to have a parse tree without errors, so if 
>> we also check for that... the heuristic might work.
> 
> Well, darn it. And you're right, the second case is a bit hard to 
> check... Well I guess for the moment we can remove this heuristic. (I 
> tried a bit, and checking for no errors is not so easy.)

Maybe it's possible to salvage this heuristic a bit, at least for 
"else", and as long as it's followed by "}" somewhere (e.g. when 
electric-pair-mode is used).

diff --git a/lisp/progmodes/c-ts-mode.el b/lisp/progmodes/c-ts-mode.el
index 31a9d0fc886..20689dc1862 100644
--- a/lisp/progmodes/c-ts-mode.el
+++ b/lisp/progmodes/c-ts-mode.el
@@ -373,8 +373,17 @@ c-ts-mode--indent-styles
             ;; point on the empty line to be indented; this rule
             ;; does that.
             ((and no-node
+                 ;; Could be a matcher 'prev-sibling-p'.
+                 (lambda (_ parent bol &rest _)
+                   (equal
+                    "ERROR"
+                    (treesit-node-type
+                     (treesit-node-prev-sibling
+                      (treesit-node-first-child-for-pos
+                       parent bol)
+                      t))))
                   (c-ts-mode--prev-line-match
-                  ,(rx (or "if" "else" "while" "do" "for"))))
+                  ,(rx "else" symbol-end)))
              prev-line c-ts-mode-indent-offset)

             ((parent-is "translation_unit") column-0 0)

The rest of the nodes (if/while/do/for) don't result in parse errors 
here, as long as the condition in parentheses is typed out correctly.

I tried some additional clauses looking for "previous sibling", checking 
whether it's for_statement, etc, which ends with "expression statement", 
and that one is empty... but it a fair amount of code which will likely 
miss other edge cases anyway. Or breaks when the grammar changes.





^ permalink raw reply related	[flat|nested] 12+ messages in thread

* bug#67417: 29.1.50; c-ts-mode syntax issues with no brackets
  2023-11-28 15:31           ` Dmitry Gutov
@ 2023-12-01  7:58             ` Yuan Fu
  2023-12-09  8:27               ` Eli Zaretskii
  0 siblings, 1 reply; 12+ messages in thread
From: Yuan Fu @ 2023-12-01  7:58 UTC (permalink / raw)
  To: Dmitry Gutov, Eli Zaretskii, Arteen Abrishami; +Cc: 67417



On 11/28/23 7:31 AM, Dmitry Gutov wrote:
> On 28/11/2023 08:55, Yuan Fu wrote:
>>
>>
>> On 11/26/23 6:22 PM, Dmitry Gutov wrote:
>>> On 27/11/2023 03:47, Yuan Fu wrote:
>>>> I pushed two commits which should fix the indentation for "break" 
>>>> after "else", and indentation for empty lines after 
>>>> if/else/for/while in general. The fix for the general case doesn't 
>>>> use the parse tree, since the parse tree is often incomplete when 
>>>> you type if (...) and hit return. Instead it uses a plain regexp 
>>>> match to see if the previous line starts with if/else/for/while. 
>>>> This seems like a reasonable heuristic to use before user types 
>>>> more things, at which point more accurate indentation rules would 
>>>> be used, since the parse tree should be more complete then.
>>>
>>> Sorry, two counter-examples right away:
>>>
>>> Type 'elsewhere();' and RET -> the next line is indented 1 level 
>>> extra, at least until you type some more and then have the line 
>>> reindented either with pressing TAB or adding semicolon.
>>>
>>> Type 'for (;;) {}' and RET -> same.
>>>
>>> The first case is easy to guard against (just check that the next 
>>> char is either space of opening paren), but the second one less so. 
>>> OTOH, the second case is likely to have a parse tree without errors, 
>>> so if we also check for that... the heuristic might work.
>>
>> Well, darn it. And you're right, the second case is a bit hard to 
>> check... Well I guess for the moment we can remove this heuristic. (I 
>> tried a bit, and checking for no errors is not so easy.)
>
> Maybe it's possible to salvage this heuristic a bit, at least for 
> "else", and as long as it's followed by "}" somewhere (e.g. when 
> electric-pair-mode is used).
>
> diff --git a/lisp/progmodes/c-ts-mode.el b/lisp/progmodes/c-ts-mode.el
> index 31a9d0fc886..20689dc1862 100644
> --- a/lisp/progmodes/c-ts-mode.el
> +++ b/lisp/progmodes/c-ts-mode.el
> @@ -373,8 +373,17 @@ c-ts-mode--indent-styles
>             ;; point on the empty line to be indented; this rule
>             ;; does that.
>             ((and no-node
> +                 ;; Could be a matcher 'prev-sibling-p'.
> +                 (lambda (_ parent bol &rest _)
> +                   (equal
> +                    "ERROR"
> +                    (treesit-node-type
> +                     (treesit-node-prev-sibling
> +                      (treesit-node-first-child-for-pos
> +                       parent bol)
> +                      t))))
>                   (c-ts-mode--prev-line-match
> -                  ,(rx (or "if" "else" "while" "do" "for"))))
> +                  ,(rx "else" symbol-end)))
>              prev-line c-ts-mode-indent-offset)
>
>             ((parent-is "translation_unit") column-0 0)
>
> The rest of the nodes (if/while/do/for) don't result in parse errors 
> here, as long as the condition in parentheses is typed out correctly.
>
> I tried some additional clauses looking for "previous sibling", 
> checking whether it's for_statement, etc, which ends with "expression 
> statement", and that one is empty... but it a fair amount of code 
> which will likely miss other edge cases anyway. Or breaks when the 
> grammar changes.

Yeah, for now we can match for the specific case where there's an else 
before a "}". That should reduce the chance of it matching other edge 
cases. I'll give it another look on this weekend.

Yuan





^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#67417: 29.1.50; c-ts-mode syntax issues with no brackets
  2023-12-01  7:58             ` Yuan Fu
@ 2023-12-09  8:27               ` Eli Zaretskii
  2023-12-10  9:29                 ` Yuan Fu
  0 siblings, 1 reply; 12+ messages in thread
From: Eli Zaretskii @ 2023-12-09  8:27 UTC (permalink / raw)
  To: Yuan Fu; +Cc: 67417, dmitry, arteen

> Date: Thu, 30 Nov 2023 23:58:14 -0800
> Cc: 67417@debbugs.gnu.org
> From: Yuan Fu <casouri@gmail.com>
> 
> > The rest of the nodes (if/while/do/for) don't result in parse errors 
> > here, as long as the condition in parentheses is typed out correctly.
> >
> > I tried some additional clauses looking for "previous sibling", 
> > checking whether it's for_statement, etc, which ends with "expression 
> > statement", and that one is empty... but it a fair amount of code 
> > which will likely miss other edge cases anyway. Or breaks when the 
> > grammar changes.
> 
> Yeah, for now we can match for the specific case where there's an else 
> before a "}". That should reduce the chance of it matching other edge 
> cases. I'll give it another look on this weekend.

Any progress here?





^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#67417: 29.1.50; c-ts-mode syntax issues with no brackets
  2023-12-09  8:27               ` Eli Zaretskii
@ 2023-12-10  9:29                 ` Yuan Fu
  2023-12-11  0:29                   ` Dmitry Gutov
  0 siblings, 1 reply; 12+ messages in thread
From: Yuan Fu @ 2023-12-10  9:29 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 67417, dmitry, arteen



On 12/9/23 12:27 AM, Eli Zaretskii wrote:
>> Date: Thu, 30 Nov 2023 23:58:14 -0800
>> Cc: 67417@debbugs.gnu.org
>> From: Yuan Fu <casouri@gmail.com>
>>
>>> The rest of the nodes (if/while/do/for) don't result in parse errors
>>> here, as long as the condition in parentheses is typed out correctly.
>>>
>>> I tried some additional clauses looking for "previous sibling",
>>> checking whether it's for_statement, etc, which ends with "expression
>>> statement", and that one is empty... but it a fair amount of code
>>> which will likely miss other edge cases anyway. Or breaks when the
>>> grammar changes.
>> Yeah, for now we can match for the specific case where there's an else
>> before a "}". That should reduce the chance of it matching other edge
>> cases. I'll give it another look on this weekend.
> Any progress here?
I pushed a commit that changes the too-board heuristic to only match 
"else before closing bracket". Hopefully this won't cause any false matches.

I think we can close this. Dmitry, WDYT?

Yuan





^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#67417: 29.1.50; c-ts-mode syntax issues with no brackets
  2023-12-10  9:29                 ` Yuan Fu
@ 2023-12-11  0:29                   ` Dmitry Gutov
  0 siblings, 0 replies; 12+ messages in thread
From: Dmitry Gutov @ 2023-12-11  0:29 UTC (permalink / raw)
  To: Yuan Fu, Eli Zaretskii; +Cc: 67417-done, arteen

Version: 29.2

On 10/12/2023 11:29, Yuan Fu wrote:
> 
> 
> On 12/9/23 12:27 AM, Eli Zaretskii wrote:
>>> Date: Thu, 30 Nov 2023 23:58:14 -0800
>>> Cc: 67417@debbugs.gnu.org
>>> From: Yuan Fu <casouri@gmail.com>
>>>
>>>> The rest of the nodes (if/while/do/for) don't result in parse errors
>>>> here, as long as the condition in parentheses is typed out correctly.
>>>>
>>>> I tried some additional clauses looking for "previous sibling",
>>>> checking whether it's for_statement, etc, which ends with "expression
>>>> statement", and that one is empty... but it a fair amount of code
>>>> which will likely miss other edge cases anyway. Or breaks when the
>>>> grammar changes.
>>> Yeah, for now we can match for the specific case where there's an else
>>> before a "}". That should reduce the chance of it matching other edge
>>> cases. I'll give it another look on this weekend.
>> Any progress here?
> I pushed a commit that changes the too-board heuristic to only match 
> "else before closing bracket". Hopefully this won't cause any false 
> matches.
> 
> I think we can close this. Dmitry, WDYT?

LGTM! (And closing.)

Arteen, let us know if you see any problems with the fixes that are now 
in the emacs-29 branch.





^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2023-12-11  0:29 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-11-23 20:55 bug#67417: 29.1.50; c-ts-mode syntax issues with no brackets Arteen Abrishami via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-11-24  7:23 ` Eli Zaretskii
2023-11-24 18:26   ` Dmitry Gutov
2023-11-27  1:47     ` Yuan Fu
2023-11-27  2:22       ` Dmitry Gutov
2023-11-28  6:55         ` Yuan Fu
2023-11-28 14:27           ` Eli Zaretskii
2023-11-28 15:31           ` Dmitry Gutov
2023-12-01  7:58             ` Yuan Fu
2023-12-09  8:27               ` Eli Zaretskii
2023-12-10  9:29                 ` Yuan Fu
2023-12-11  0:29                   ` Dmitry Gutov

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).