From: Yuan Fu <casouri@gmail.com>
To: Filippo Argiolas <filippo.argiolas@gmail.com>
Cc: "Eli Zaretskii" <eliz@gnu.org>,
	"Björn Lindqvist" <bjourne@gmail.com>,
	emacs-devel@gnu.org
Subject: Re: How does c-ts-mode, tree-sitter indentation, and preprocessor directives work?
Date: Sun, 1 Dec 2024 01:32:20 -0800
Message-ID: <FE856458-E3F4-4306-B882-602CFE9BA586@gmail.com>
In-Reply-To: <m2jzcj7qb1.fsf@gmail.com>



> On Dec 1, 2024, at 12:36 AM, Filippo Argiolas <filippo.argiolas@gmail.com> wrote:
> 
> Yuan Fu <casouri@gmail.com> writes:
> 
>>> On Nov 28, 2024, at 10:30 AM, Filippo Argiolas <filippo.argiolas@gmail.com> wrote:
>>> 
>>> Eli Zaretskii <eliz@gnu.org> writes:
>>> 
>>>>> From: Björn Lindqvist <bjourne@gmail.com>
>>>>> Date: Thu, 28 Nov 2024 00:27:17 +0100
>>>>> 
>>>>> I've been trying to get c-ts-mode to indent like I want, but I'm
>>>>> running into problems related to preprocessor directives.
>>>> 
>>>> Preprocessor directives are difficult because the tree-sitter C/C++
>>>> grammars include only partial support for them.
>>>> 
>>>>> For
>>>>> example, consider a type definition nested in two #ifdefs:
>>>>> 
>>>>>   #ifdef X
>>>>>   #ifdef Y
>>>>>   typedef int foo;
>>>>>   #endif
>>>>>   #endif
>>>>> 
>>>>> Since both the parent and grandparent of the type_definition are
>>>>> preproc_ifdef nodes, no rule matches.
>>>> 
>>>> But if you go back (up) the parent-child hierarchy, you will
>>>> eventually find a node which is not a preproc_SOMETHING, and can go
>>>> from there, no?
>>>> 
>>> 
>>> I believe we might have a bug here. As far as I can tell, it does not
>>> match
>>> 
>>> ((n-p-gp nil "preproc" "translation_unit") column-0 0)
>>> 
>>> Because both parent and grandparent are preproc nodes, it instead matches
>>> one of the `c-ts-mode--standalone-parent-skip-preproc' rules right after.
>>> 
>>> After skipping the preproc nodes, the parent is translation_unit, and the
>>> rule indents by an offset from there. I guess this step could be made
>>> smarter by checking for translation_unit, and then the rule above could be
>>> removed?
>>> 
>>>>> Another issue is that I want my
>>>>> preprocessor directives kept at column 0, which unfortunately screws
>>>>> up all rules that refer to the parent. E.g.:
>>>>> 
>>>>>   ((parent-is "if_statement") standalone-parent 4)
>>>>> 
>>>>> Doesn't work for
>>>>> 
>>>>>   int main() {
>>>>>       if (true)
>>>>>   #ifdef A
>>>>>           prutt();
>>>>>   #else
>>>>>           fis();
>>>>>   #endif
>>>>>   }
>>>>> 
>>>>> The rule I'd like to express is "take the indent of the closest
>>>>> *indenting* parent and add one indent". That rule would match whether
>>>>> that parent is a "while_statement", "if_statement", "for_statement",
>>>>> etc. You can't express such rules with tree-sitter, can you?
>>>> 
>>>> Not sure, but Yuan will know.
>>> 
>>> This can be worked around as Yuan showed, but isn't it a grammar bug?
>>> The problem is that with the #ifdef the function and the if statement
>>> become siblings, whereas without preproc they have a child-parent relation.
>>> 
>>> In my experience c-ts-mode is a bit fragile with preprocessor
>>> statements, probably because the grammar itself is fragile (see
>>> e.g. [1]) and the problem is a hard one.
>> 
>> Right.
>> 
>>> Yuan, do you think c-ts-mode could somehow benefit from LSP knowledge
>>> about inactive preprocessor branches? The idea is that we would at least
>>> have a good syntax tree in the active branches while allowing some
>>> errors in the inactive ones.
>> 
>> Maybe. Technically you can create a parser and set its range to include only the active branches. But making it work end-to-end would require major effort. I’m not sure if it’s worth it (in terms of code complexity and maintenance cost).
> 
> Interesting, maybe I'll experiment a bit with it and see where it
> goes. I agree that it already sounds like overkill for little gain.
> 
> My major annoyance, more than indentation, is when preprocessor statements
> break function detection and imenu/breadcrumb. I have one offending file
> of this kind at work which unfortunately I cannot share. I will try to
> extract a test case that reproduces the issue and open a bug. Maybe it
> can be worked around somehow from c-ts-mode.

I share the frustration. Tree-sitter for C could’ve been so much better if it weren’t for the preprocessor and macros.
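
On the range idea quoted above, here is a minimal sketch of the first step only, assuming the active branches are already known as a list of (BEG . END) buffer positions. How those ranges would be obtained (for example from an LSP server) is left open, and the function name is hypothetical. It only uses `treesit-parser-create' and `treesit-parser-set-included-ranges'; nothing here wires the result into c-ts-mode.

```elisp
(require 'treesit)

;; Hypothetical helper, not part of c-ts-mode.
(defun my-c-active-branch-parser (active-ranges)
  "Return a new C parser for the current buffer limited to ACTIVE-RANGES.
ACTIVE-RANGES is a list of (BEG . END) buffer positions, assumed to be
in ascending order and non-overlapping."
  ;; Pass NO-REUSE = t so the parser c-ts-mode already created for this
  ;; buffer is left untouched.
  (let ((parser (treesit-parser-create 'c nil t)))
    ;; Text outside the given ranges is skipped by this parser.
    (treesit-parser-set-included-ranges parser active-ranges)
    parser))
```

Something like (treesit-parser-root-node (my-c-active-branch-parser ranges)) would then give a tree that only covers the active branches. Keeping the ranges up to date after edits and making font-lock, indentation and imenu use that parser is the part that needs the major effort.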

IME, whether it can be worked around depends on the specific code. Some code just generates a parse tree that’s hard to recover.
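
As an illustration of the kind of workaround discussed earlier in this thread (walking up the parents until a node that is not a preproc_SOMETHING), here is a rough sketch of a custom anchor function for `treesit-simple-indent-rules'. The function and variable names are hypothetical, the offset is only an example, and this is not what c-ts-mode itself does. It relies on anchor functions being called with NODE, PARENT and BOL and returning a buffer position.

```elisp
(require 'treesit)

;; Hypothetical anchor, shown only as a sketch.
(defun my-c-ts-non-preproc-ancestor-bol (_node parent &rest _)
  "Return the indentation position of the closest non-preproc ancestor.
Climb from PARENT past every node whose type starts with \"preproc\",
then return the first non-whitespace position on that ancestor's line.
Return nil if no such ancestor exists."
  (let ((ancestor parent))
    (while (and ancestor
                (string-prefix-p "preproc" (treesit-node-type ancestor)))
      (setq ancestor (treesit-node-parent ancestor)))
    (when ancestor
      (save-excursion
        (goto-char (treesit-node-start ancestor))
        (back-to-indentation)
        (point)))))

;; Hypothetical rule using the anchor above: for a node whose parent is
;; a preproc node, indent relative to that ancestor.  The offset 0 is
;; an example value, not a recommendation.
(defvar my-c-ts-preproc-rule
  '((parent-is "preproc") my-c-ts-non-preproc-ancestor-bol 0))
```

This only helps when the node you want really is an ancestor. In the if/#ifdef example quoted above the statements end up as siblings, so climbing parents would not find the if_statement, which is why it depends on the specific code.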

Yuan


