* How does c-ts-mode, tree-sitter indentation, and preprocessor directives work?
From: Björn Lindqvist @ 2024-11-27 23:27 UTC
  To: Emacs developers

Hello Emacs developers!

I've been trying to get c-ts-mode to indent like I want, but I'm
running into problems related to preprocessor directives. For
example, consider a type definition nested in two #ifdefs:

    #ifdef X
    #ifdef Y
    typedef int foo;
    #endif
    #endif

Since both the parent and the grandparent of the type_definition are
preproc_ifdef nodes, no rule matches. Another issue is that I want my
preprocessor directives kept at column 0, which unfortunately breaks
all rules that refer to the parent. E.g.:

    ((parent-is "if_statement") standalone-parent 4)

This rule doesn't work for:

    int main() {
        if (true)
    #ifdef A
            prutt();
    #else
            fis();
    #endif
    }

The rule I'd like to express is "take the indent of the closest
*indenting* parent and add one indent". That rule would match whether
that parent is a "while_statement", "if_statement", "for_statement",
etc. You can't express such rules with tree-sitter, can you?

Btw, I get that tree-sitter can't handle *all* weird preprocessor
constructs you can create, but my examples are really common and
appear in most C code bases.


-- 
mvh/best regards Björn Lindqvist




* Re: How does c-ts-mode, tree-sitter indentation, and preprocessor directives work?
From: Eli Zaretskii @ 2024-11-28  7:30 UTC
  To: Björn Lindqvist, Yuan Fu; +Cc: emacs-devel

> From: Björn Lindqvist <bjourne@gmail.com>
> Date: Thu, 28 Nov 2024 00:27:17 +0100
> 
> I've been trying to get c-ts-mode to indent like I want, but I'm
> running into problems related to preprocessor directives.

Preprocessor directives are difficult because the tree-sitter C/C++
grammars include only partial support for them.

> For
> example, consider a type definition nested in two #ifdefs:
> 
>     #ifdef X
>     #ifdef Y
>     typedef int foo;
>     #endif
>     #endif
> 
> Since both the parent and grand parent of the type_definition is a
> preproc_ifdef no rule matches.

But if you go back (up) the parent-child hierarchy, you will
eventually find a node which is not a preproc_SOMETHING, and can go
from there, no?
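
For illustration, that upward walk could be sketched in Emacs Lisp
roughly as follows, assuming Emacs 29 or later. The `my/' names are
hypothetical and not part of c-ts-mode; only the `treesit-*' functions
are existing API.

```elisp
(require 'treesit)

(defconst my/preproc-node-regexp "\\`preproc_"
  "Hypothetical regexp for preprocessor node types such as preproc_ifdef.")

(defun my/first-non-preproc-ancestor (node)
  "Return the closest ancestor of NODE that is not a preproc_* node.
Return nil if every ancestor is a preprocessor node."
  (treesit-parent-until
   node
   (lambda (ancestor)
     (not (string-match-p my/preproc-node-regexp
                          (treesit-node-type ancestor))))))

(defun my/non-preproc-ancestor-anchor (node &rest _)
  "Anchor function: start position of NODE's first non-preproc ancestor."
  (treesit-node-start (my/first-non-preproc-ancestor node)))
```

For the nested #ifdef example, the walk from the type_definition
passes both preproc_ifdef nodes and stops at translation_unit.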

> Another issue is that I want my
> preprocessor directives kept at column 0, which unfortunately screws
> up all rules that refer to the parent. E.g.:
> 
>     ((parent-is "if_statement") standalone-parent 4)
> 
> Doesn't work for
> 
>     int main() {
>         if (true)
>     #ifdef A
>             prutt();
>     #else
>             fis();
>     #endif
>     }
> 
> The rule I'd like to express is "take the indent of the closest
> *indenting* parent and add one indent". That rule would match whether
> that parent is a "while_statement", "if_statement", "for_statement",
> etc. You can't express such rules with tree-sitter, can you?

Not sure, but Yuan will know.




* Re: How does c-ts-mode, tree-sitter indentation, and preprocessor directives work?
From: Yuan Fu @ 2024-11-28 10:03 UTC
  To: Eli Zaretskii; +Cc: Björn Lindqvist, Emacs Devel


> On Nov 27, 2024, at 11:30 PM, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Björn Lindqvist <bjourne@gmail.com>
>> Date: Thu, 28 Nov 2024 00:27:17 +0100
>> 
>> I've been trying to get c-ts-mode to indent like I want, but I'm
>> running into problems related to preprocessor directives.
> 
> Preprocessor directives are difficult because the tree-sitter C/C++
> grammars include only partial support for them.
> 
>> For
>> example, consider a type definition nested in two #ifdefs:
>> 
>>    #ifdef X
>>    #ifdef Y
>>    typedef int foo;
>>    #endif
>>    #endif
>> 
>> Since both the parent and grand parent of the type_definition is a
>> preproc_ifdef no rule matches.
> 
> But if you go back (up) the parent-child hierarchy, you will
> eventually find a node which is not a preproc_SOMETHING, and can go
> from there, no?
> 
>> Another issue is that I want my
>> preprocessor directives kept at column 0, which unfortunately screws
>> up all rules that refer to the parent. E.g.:
>> 
>>    ((parent-is "if_statement") standalone-parent 4)
>> 
>> Doesn't work for
>> 
>>    int main() {
>>        if (true)
>>    #ifdef A
>>            prutt();
>>    #else
>>            fis();
>>    #endif
>>    }
>> 
>> The rule I'd like to express is "take the indent of the closest
>> *indenting* parent and add one indent". That rule would match whether
>> that parent is a "while_statement", "if_statement", "for_statement",
>> etc. You can't express such rules with tree-sitter, can you?
> 
> Not sure, but Yuan will know.

Everything is possible, it’s just elisp. The only problem is how
generic you can make the rule. Here’s a POC that only works for this
example; specifically, it only works for if statements and #ifdef
directives. It should be extensible to for statements, while
statements, etc., and maybe to other directives too.
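
One way the POC's check might be widened to other statement types is
sketched here. It is only a sketch under assumptions: the `my/' names
and the choice of statement types are hypothetical, and whether the
statement really ends up as the previous sibling of the preprocessor
block depends on how the grammar parses the buffer.

```elisp
(require 'treesit)

(defconst my/indenting-statement-regexp
  (rx bos (or "if_statement" "while_statement" "for_statement") eos)
  "Hypothetical set of statement types whose body gets one more indent.")

(defun my/preproc-anchor-node (parent)
  "Return the node that precedes the preprocessor block PARENT, or nil.
PARENT is the parent of the node being indented, as in the POC."
  (pcase (treesit-node-type parent)
    ("preproc_ifdef" (treesit-node-prev-sibling parent))
    ("preproc_else" (treesit-node-prev-sibling
                     (treesit-node-parent parent)))))

(defun my/preproc-after-statement-p (_node parent &rest _)
  "Matcher: non-nil if PARENT is a preproc block right after a statement."
  (let ((anchor (my/preproc-anchor-node parent)))
    (and anchor
         (string-match-p my/indenting-statement-regexp
                         (treesit-node-type anchor)))))

(defun my/preproc-statement-anchor (_node parent &rest _)
  "Anchor: start position of the statement before the preproc block."
  (treesit-node-start (my/preproc-anchor-node parent)))

;; A (MATCHER ANCHOR OFFSET) rule built from the two functions above.
(defvar my/preproc-after-statement-rule
  '(my/preproc-after-statement-p
    my/preproc-statement-anchor
    c-ts-mode-indent-offset))
```

Unlike the POC, this version recomputes the anchor node in the anchor
function instead of sharing it through a closure variable, which costs
a second lookup but keeps the two functions independent.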

Speaking of indent, we need to do something about c-ts-mode’s
indentation rules. They are getting too long and too complex. But I
don’t have any great idea at this point. Maybe we can replace the
rules with a hand-rolled function so it has more structure, or try
nvim’s query approach.

Yuan



[-- Attachment #2: preproc-indent.patch --]
[-- Type: application/octet-stream, Size: 1695 bytes --]

From 25de026b3eb32e7457270cd199fe0902876a2715 Mon Sep 17 00:00:00 2001
From: Yuan Fu <casouri@gmail.com>
Date: Thu, 28 Nov 2024 01:51:44 -0800
Subject: [PATCH] Preproc indent POC

---
 lisp/progmodes/c-ts-mode.el | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/lisp/progmodes/c-ts-mode.el b/lisp/progmodes/c-ts-mode.el
index c815ee35501..313dcfb5c05 100644
--- a/lisp/progmodes/c-ts-mode.el
+++ b/lisp/progmodes/c-ts-mode.el
@@ -435,6 +435,24 @@ c-ts-mode--indent-styles
            ((parent-is "labeled_statement")
             c-ts-mode--standalone-grandparent c-ts-mode-indent-offset)
 
+           ,(let (anchor)
+              (list
+               (lambda (_node parent &rest _)
+                 (let ((anchor-node
+                        (cond
+                         ((treesit-node-match-p parent "preproc_ifdef")
+                          (treesit-node-prev-sibling parent))
+                         ((treesit-node-match-p parent "preproc_else")
+                          (treesit-node-prev-sibling
+                           (treesit-node-parent parent))))))
+                   (when anchor-node
+                     (setq anchor (treesit-node-start anchor-node))
+                     ;; If parent is preproc and previous sibling is
+                     ;; if_statement, set anchor and return t.
+                     (treesit-node-match-p anchor-node "if_statement"))))
+               (lambda (&rest _) anchor)
+               c-ts-mode-indent-offset))
+
            ;; Preproc directives
            ((node-is "preproc") column-0 0)
            ((node-is "#endif") column-0 0)
-- 
2.39.5 (Apple Git-151)



* Re: How does c-ts-mode, tree-sitter indentation, and preprocessor directives work?
From: Filippo Argiolas @ 2024-11-28 18:30 UTC
  To: Eli Zaretskii, Björn Lindqvist, Yuan Fu; +Cc: emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Björn Lindqvist <bjourne@gmail.com>
>> Date: Thu, 28 Nov 2024 00:27:17 +0100
>> 
>> I've been trying to get c-ts-mode to indent like I want, but I'm
>> running into problems related to preprocessor directives.
>
> Preprocessor directives are difficult because the tree-sitter C/C++
> grammars include only partial support for them.
>
>> For
>> example, consider a type definition nested in two #ifdefs:
>> 
>>     #ifdef X
>>     #ifdef Y
>>     typedef int foo;
>>     #endif
>>     #endif
>> 
>> Since both the parent and grand parent of the type_definition is a
>> preproc_ifdef no rule matches.
>
> But if you go back (up) the parent-child hierarchy, you will
> eventually find a node which is not a preproc_SOMETHING, and can go
> from there, no?
>

I believe we might have a bug here. As far as I can tell, it does not
match

  ((n-p-gp nil "preproc" "translation_unit") column-0 0)

because both the parent and the grandparent are preproc nodes. So it
matches one of the `c-ts-mode--standalone-parent-skip-preproc' rules
right after.

After skipping the preproc nodes, the parent is translation_unit and
the line is indented by one offset from there. I guess this step could
be made smarter by checking for translation_unit, and then the rule
above could be removed?
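
A rough sketch of such a check, as a matcher that skips any number of
enclosing preproc nodes before looking for translation_unit. The `my/'
names are hypothetical, and this has not been tried against the rule
order in c-ts-mode.

```elisp
(require 'treesit)

(defun my/preproc-node-p (node)
  "Return non-nil if NODE is a preproc_* node (hypothetical helper)."
  (and node (string-prefix-p "preproc_" (treesit-node-type node))))

(defun my/top-level-in-preproc-p (_node parent &rest _)
  "Matcher: non-nil if only preproc nodes lie between the node and the root."
  (when (my/preproc-node-p parent)
    (let ((ancestor parent))
      ;; Skip every enclosing preprocessor node.
      (while (my/preproc-node-p ancestor)
        (setq ancestor (treesit-node-parent ancestor)))
      (equal (treesit-node-type ancestor) "translation_unit"))))

;; Candidate rule; `column-0' is an existing preset anchor.
(defvar my/top-level-in-preproc-rule
  '(my/top-level-in-preproc-p column-0 0))
```

With a matcher like this, the typedef nested in two #ifdefs would be
treated the same as one nested in a single #ifdef.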

>> Another issue is that I want my
>> preprocessor directives kept at column 0, which unfortunately screws
>> up all rules that refer to the parent. E.g.:
>> 
>>     ((parent-is "if_statement") standalone-parent 4)
>> 
>> Doesn't work for
>> 
>>     int main() {
>>         if (true)
>>     #ifdef A
>>             prutt();
>>     #else
>>             fis();
>>     #endif
>>     }
>> 
>> The rule I'd like to express is "take the indent of the closest
>> *indenting* parent and add one indent". That rule would match whether
>> that parent is a "while_statement", "if_statement", "for_statement",
>> etc. You can't express such rules with tree-sitter, can you?
>
> Not sure, but Yuan will know.

This can be worked around as Yuan showed, but isn't it a grammar bug?
The problem is that with the #ifdef, the function call and the if
statement become siblings; without the preprocessor directives they
have a child-parent relation.

In my experience c-ts-mode is a bit fragile with preprocessor
statements, probably because the grammar itself is fragile (see
e.g. [1]) and the problem is a hard one.

Yuan, do you think c-ts-mode could somehow benefit from LSP knowledge
about inactive preprocessor branches? The idea is that we would at
least have a good syntax tree in the active branches while allowing
some errors in the inactive ones.


Filippo


1. https://github.com/tree-sitter/tree-sitter-c/issues/108


