* How does c-ts-mode, tree-sitter indentation, and preprocessor directives work?
From: Björn Lindqvist @ 2024-11-27 23:27 UTC (permalink / raw)
To: Emacs developers
Hello Emacs developers!
I've been trying to get c-ts-mode to indent like I want, but I'm
running into problems related to preprocessor directives. For
example, consider a type definition nested in two #ifdefs:
#ifdef X
#ifdef Y
typedef int foo;
#endif
#endif
Since both the parent and the grandparent of the type_definition are
preproc_ifdef nodes, no rule matches. Another issue is that I want my
preprocessor directives kept at column 0, which unfortunately screws
up all rules that refer to the parent. E.g.:
((parent-is "if_statement") standalone-parent 4)
Doesn't work for
int main() {
if (true)
#ifdef A
prutt();
#else
fis();
#endif
}
The rule I'd like to express is "take the indent of the closest
*indenting* parent and add one indent". That rule would match whether
that parent is a "while_statement", "if_statement", "for_statement",
etc. You can't express such rules with tree-sitter, can you?
Btw, I get that tree-sitter can't handle *all* weird preprocessor
constructs you can create, but my examples are really common and
appear in most C code bases.
--
mvh/best regards Björn Lindqvist
* Re: How does c-ts-mode, tree-sitter indentation, and preprocessor directives work?
From: Eli Zaretskii @ 2024-11-28 7:30 UTC (permalink / raw)
To: Björn Lindqvist, Yuan Fu; +Cc: emacs-devel
> From: Björn Lindqvist <bjourne@gmail.com>
> Date: Thu, 28 Nov 2024 00:27:17 +0100
>
> I've been trying to get c-ts-mode to indent like I want, but I'm
> running into problems related to preprocessor directives.
Preprocessor directives are difficult because the tree-sitter C/C++
grammars include only partial support for them.
> For
> example, consider a type definition nested in two #ifdefs:
>
> #ifdef X
> #ifdef Y
> typedef int foo;
> #endif
> #endif
>
> Since both the parent and the grandparent of the type_definition are
> preproc_ifdef nodes, no rule matches.
But if you go back (up) the parent-child hierarchy, you will
eventually find a node which is not a preproc_SOMETHING, and can go
from there, no?
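For illustration, a minimal sketch of that upward walk. The `my/` names are hypothetical and are not part of c-ts-mode; the sketch assumes an Emacs built with tree-sitter support (29 or later) and that every preprocessor node type in the C grammar starts with "preproc".

```elisp
(require 'treesit)

(defun my/preproc-node-p (node)
  "Return non-nil if NODE's type starts with \"preproc\"."
  (string-prefix-p "preproc" (treesit-node-type node)))

(defun my/first-non-preproc-ancestor (node)
  "Return the closest ancestor of NODE that is not a preproc node.
Return nil if there is no such ancestor."
  ;; `treesit-parent-until' starts at NODE's parent and keeps going
  ;; up until the predicate returns non-nil.
  (treesit-parent-until
   node
   (lambda (n) (not (my/preproc-node-p n)))))
```

Called on the type_definition node from the nested #ifdef example, this walk would be expected to skip both preproc_ifdef nodes and stop at the first enclosing node of another type, which an indent rule could then use as its anchor.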
> Another issue is that I want my
> preprocessor directives kept at column 0, which unfortunately screws
> up all rules that refer to the parent. E.g.:
>
> ((parent-is "if_statement") standalone-parent 4)
>
> Doesn't work for
>
> int main() {
> if (true)
> #ifdef A
> prutt();
> #else
> fis();
> #endif
> }
>
> The rule I'd like to express is "take the indent of the closest
> *indenting* parent and add one indent". That rule would match whether
> that parent is a "while_statement", "if_statement", "for_statement",
> etc. You can't express such rules with tree-sitter, can you?
Not sure, but Yuan will know.
* Re: How does c-ts-mode, tree-sitter indentation, and preprocessor directives work?
From: Yuan Fu @ 2024-11-28 10:03 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: Björn Lindqvist, Emacs Devel
> On Nov 27, 2024, at 11:30 PM, Eli Zaretskii <eliz@gnu.org> wrote:
>
>> From: Björn Lindqvist <bjourne@gmail.com>
>> Date: Thu, 28 Nov 2024 00:27:17 +0100
>>
>> I've been trying to get c-ts-mode to indent like I want, but I'm
>> running into problems related to preprocessor directives.
>
> Preprocessor directives are difficult because the tree-sitter C/C++
> grammars include only partial support for them.
>
>> For
>> example, consider a type definition nested in two #ifdefs:
>>
>> #ifdef X
>> #ifdef Y
>> typedef int foo;
>> #endif
>> #endif
>>
>> Since both the parent and the grandparent of the type_definition are
>> preproc_ifdef nodes, no rule matches.
>
> But if you go back (up) the parent-child hierarchy, you will
> eventually find a node which is not a preproc_SOMETHING, and can go
> from there, no?
>
>> Another issue is that I want my
>> preprocessor directives kept at column 0, which unfortunately screws
>> up all rules that refer to the parent. E.g.:
>>
>> ((parent-is "if_statement") standalone-parent 4)
>>
>> Doesn't work for
>>
>> int main() {
>> if (true)
>> #ifdef A
>> prutt();
>> #else
>> fis();
>> #endif
>> }
>>
>> The rule I'd like to express is "take the indent of the closest
>> *indenting* parent and add one indent". That rule would match whether
>> that parent is a "while_statement", "if_statement", "for_statement",
>> etc. You can't express such rules with tree-sitter, can you?
>
> Not sure, but Yuan will know.
Everything is possible; it’s just elisp. The only problem is how generic you can make the rule. Here’s a POC that only works for this example; specifically, it only works for if statements and #ifdef directives. It should be extendable to for statements, while statements, etc., and maybe to other directives too.
Speaking of indentation, we need to do something about c-ts-mode’s indentation rules. They’re getting too long and too complex, but I don’t have any great ideas at this point. Maybe we can replace the rules with a hand-rolled function so that it has more structure, or try nvim’s query approach.
Yuan
[-- Attachment #2: preproc-indent.patch --]
[-- Type: application/octet-stream, Size: 1695 bytes --]
From 25de026b3eb32e7457270cd199fe0902876a2715 Mon Sep 17 00:00:00 2001
From: Yuan Fu <casouri@gmail.com>
Date: Thu, 28 Nov 2024 01:51:44 -0800
Subject: [PATCH] Preproc indent POC
---
lisp/progmodes/c-ts-mode.el | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/lisp/progmodes/c-ts-mode.el b/lisp/progmodes/c-ts-mode.el
index c815ee35501..313dcfb5c05 100644
--- a/lisp/progmodes/c-ts-mode.el
+++ b/lisp/progmodes/c-ts-mode.el
@@ -435,6 +435,24 @@ c-ts-mode--indent-styles
            ((parent-is "labeled_statement")
             c-ts-mode--standalone-grandparent c-ts-mode-indent-offset)
 
+           ,(let (anchor)
+              (list
+               (lambda (_node parent &rest _)
+                 (let ((anchor-node
+                        (cond
+                         ((treesit-node-match-p parent "preproc_ifdef")
+                          (treesit-node-prev-sibling parent))
+                         ((treesit-node-match-p parent "preproc_else")
+                          (treesit-node-prev-sibling
+                           (treesit-node-parent parent))))))
+                   (when anchor-node
+                     (setq anchor (treesit-node-start anchor-node))
+                     ;; If parent is preproc and previous sibling is
+                     ;; if_statement, set anchor and return t.
+                     (treesit-node-match-p anchor-node "if_statement"))))
+               (lambda (&rest _) anchor)
+               c-ts-mode-indent-offset))
+
            ;; Preproc directives
            ((node-is "preproc") column-0 0)
            ((node-is "#endif") column-0 0)
--
2.39.5 (Apple Git-151)
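
One way the POC above might be extended to other statement types is sketched below. This is only a sketch under assumptions: the `my/` names are hypothetical, and it has not been verified that the grammar produces the same tree shape (the preproc block as the next sibling of the statement) for while and for statements as it does for if statements. Instead of sharing a closure variable between matcher and anchor, both call the same lookup helper.

```elisp
(require 'treesit)

(defvar my/indenting-statement-regexp
  (rx bos (or "if_statement" "while_statement" "for_statement") eos)
  "Regexp matching node types whose body gets one extra indent level.")

(defun my/preproc-anchor-node (parent)
  "Return the indenting statement that precedes PARENT's preproc block.
Return nil if PARENT is not a preproc_ifdef or preproc_else node, or
if the preceding sibling is not an indenting statement."
  (let ((candidate
         (cond
          ((null parent) nil)
          ((equal (treesit-node-type parent) "preproc_ifdef")
           (treesit-node-prev-sibling parent))
          ((equal (treesit-node-type parent) "preproc_else")
           ;; The #else branch is nested inside the #ifdef node.
           (treesit-node-prev-sibling (treesit-node-parent parent))))))
    (and candidate
         (string-match-p my/indenting-statement-regexp
                         (treesit-node-type candidate))
         candidate)))

(defun my/preproc-body-matcher (_node parent &rest _)
  "Matcher: non-nil if PARENT's preproc block follows an indenting statement."
  (and (my/preproc-anchor-node parent) t))

(defun my/preproc-body-anchor (_node parent &rest _)
  "Anchor: start position of the statement found by the matcher."
  (treesit-node-start (my/preproc-anchor-node parent)))

;; Hypothetical rule in (MATCHER ANCHOR OFFSET) form.  It would have to
;; come before the generic preproc rules to take effect.
(defvar my/preproc-body-rule
  '(my/preproc-body-matcher my/preproc-body-anchor c-ts-mode-indent-offset))
```

Looking the anchor node up twice costs a little extra work per line, but it avoids relying on the matcher having run immediately before the anchor function.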
* Re: How does c-ts-mode, tree-sitter indentation, and preprocessor directives work?
From: Filippo Argiolas @ 2024-11-28 18:30 UTC (permalink / raw)
To: Eli Zaretskii, Björn Lindqvist, Yuan Fu; +Cc: emacs-devel
Eli Zaretskii <eliz@gnu.org> writes:
>> From: Björn Lindqvist <bjourne@gmail.com>
>> Date: Thu, 28 Nov 2024 00:27:17 +0100
>>
>> I've been trying to get c-ts-mode to indent like I want, but I'm
>> running into problems related to preprocessor directives.
>
> Preprocessor directives are difficult because the tree-sitter C/C++
> grammars include only partial support for them.
>
>> For
>> example, consider a type definition nested in two #ifdefs:
>>
>> #ifdef X
>> #ifdef Y
>> typedef int foo;
>> #endif
>> #endif
>>
>> Since both the parent and the grandparent of the type_definition are
>> preproc_ifdef nodes, no rule matches.
>
> But if you go back (up) the parent-child hierarchy, you will
> eventually find a node which is not a preproc_SOMETHING, and can go
> from there, no?
>
I believe we might have a bug here. As far as I can tell, it does not
match
((n-p-gp nil "preproc" "translation_unit") column-0 0)
because both the parent and the grandparent are preproc nodes. So it
matches one of the `c-ts-mode--standalone-parent-skip-preproc' rules
right after. After skipping the preproc nodes, the parent is
translation_unit, and the line is indented by one offset from there.
I guess this step could be made smarter to check for translation_unit,
and the rule above could then be removed?
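A rough sketch of what such a check could look like, written as a standalone matcher rather than as a change to the existing c-ts-mode helpers. The `my/` names are hypothetical, and the sketch assumes that column 0 is the desired result for code that sits at the top level inside any number of nested preproc nodes.

```elisp
(require 'treesit)

(defun my/top-level-through-preproc-p (_node parent &rest _)
  "Return non-nil if PARENT is a preproc node hanging off the root.
Any number of nested preproc nodes may sit between PARENT and the
translation_unit node."
  (when (and parent
             (string-prefix-p "preproc" (treesit-node-type parent)))
    (let ((ancestor (treesit-node-parent parent)))
      ;; Skip any further preproc ancestors.
      (while (and ancestor
                  (string-prefix-p "preproc" (treesit-node-type ancestor)))
        (setq ancestor (treesit-node-parent ancestor)))
      (and ancestor
           (equal (treesit-node-type ancestor) "translation_unit")))))

;; Hypothetical rule in the (MATCHER ANCHOR OFFSET) form used by
;; `treesit-simple-indent-rules'; `column-0' is a built-in anchor preset.
(defvar my/top-level-preproc-rule
  '(my/top-level-through-preproc-p column-0 0))
```

Unlike the n-p-gp rule quoted above, this matcher does not care how deep the #ifdef nesting is, so the doubly nested typedef would be covered as well.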
>> Another issue is that I want my
>> preprocessor directives kept at column 0, which unfortunately screws
>> up all rules that refer to the parent. E.g.:
>>
>> ((parent-is "if_statement") standalone-parent 4)
>>
>> Doesn't work for
>>
>> int main() {
>> if (true)
>> #ifdef A
>> prutt();
>> #else
>> fis();
>> #endif
>> }
>>
>> The rule I'd like to express is "take the indent of the closest
>> *indenting* parent and add one indent". That rule would match whether
>> that parent is a "while_statement", "if_statement", "for_statement",
>> etc. You can't express such rules with tree-sitter, can you?
>
> Not sure, but Yuan will know.
This can be worked around as Yuan showed, but isn't it a grammar bug?
The problem is that with the #ifdef, the function call and the if
statement become siblings, whereas without the preprocessor directive
they have a child-parent relation.
In my experience c-ts-mode is a bit fragile with preprocessor
statements, probably because the grammar itself is fragile (see
e.g. [1]) and the problem is a hard one.
Yuan, do you think c-ts-mode could somehow benefit from LSP knowledge
about inactive preprocessor branches? The idea is that we would at
least have a good syntax tree in the active branches while allowing
some errors in the inactive ones.
Filippo
1. https://github.com/tree-sitter/tree-sitter-c/issues/108