all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: Yuan Fu <casouri@gmail.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: 62951@debbugs.gnu.org
Subject: bug#62951: 29.0.90; c-ts-mode: Incorrect fontification due to FOR_EACH_TAIL_SAFE
Date: Sat, 22 Apr 2023 17:28:25 -0700	[thread overview]
Message-ID: <2BDA88D0-A870-4FE3-BAF8-350FBAA943BF@gmail.com> (raw)
In-Reply-To: <83fs8s2xnk.fsf@gnu.org>



> On Apr 22, 2023, at 12:17 AM, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Yuan Fu <casouri@gmail.com>
>> Date: Fri, 21 Apr 2023 13:37:08 -0700
>> Cc: 62951@debbugs.gnu.org
>> 
>> 
>> Eli Zaretskii <eliz@gnu.org> writes:
>> 
>>> To reproduce:
>>> 
>>>  emacs -Q
>>>  C-x C-f src/fns.c RET
>>>  C-u 3365 M-g g
>>> 
>>> Observe that "if" and "STRINGP" in the body of FOR_EACH_TAIL_SAFE are
>>> fontified in font-lock-function-name-face.  This is because the
>>> FOR_EACH_TAIL_SAFE macro is misparsed by tree-sitter as a declaration.
>>> 
>>> Can we teach c-ts-mode to recognize FOR_EACH_TAIL_SAFE and
>>> FOR_EACH_TAIL for what they are, perhaps conditioned on
>>> c-ts-mode-emacs-sources-support being non-nil?
>> 
>> I’m aware of this issue, but the truth is there isn’t a good solution to
>> it. We need to recognize FOR_EACH_TAIL_SAFE (not hard) and fix arbitrary
>> code after it (hard). In this case it’s a if statement, with macro calls
>> and AND operation in it’s condition, it’s already three things we need
>> to recognize and somehow handle. It can also be a for loop, a switch
>> case, a function call, a while loop. If we want to fix FOR_EACH_TAIL we
>> would need to handle every possible thing, at that point we might as
>> well have wrote a parser :-)
> 
> Sorry, I don't understand the difficulties.  The body of FOR_EACH_TAIL
> and a few similar macros we use could be on of the following:
> 
>  . a single simple statement
>  . an 'if' clause
>  . a 'while' loop
>  . a 'do' loop
>  . a 'for' loop
>  . a brace-delimited block (this one already works, AFAICS, so we
>    perhaps need not anything about it)
> 
> (In practice, only the first 2 and the last one are used, AFAICS.)
> 
> Doesn't tree-sitter tell us enough to figure out which of the above is
> in the body?  If so, why would we need to write a full parser?

Ok, the full parser part is a bit exaggeration :-) But my point is it’s a lot of dirty code for a subpar result. Take

if (NILP (XCDR (tail)) && STRINGP (XCAR (tail)))

for example, it’s parsed as a function definition and all the tokens in the condition are parsed as weird things like macro declarator, error, function declarator, type, etc. And if the condition changes to something else, say it has another layer of function call, it’ll parse differently.

> 
>> We can probably fix this very particular case, but it’s still work and
>> overhead, and doesn’t mean much.
> 
> Please understand: good support for editing Emacs C sources is from my
> POV imperative for c-ts-mode to gain traction.  One of my gripes about
> CC Mode was insufficient support for our macro system and for various
> GCC attributes; that improved recently to some extend, but not enough,
> and at a price of introducing ugly lists of strings that cause trouble
> when used in file-local variables.  We must do better in c-ts-mode!
> 
> So please make an effort of providing reasonable built-in solutions
> for these idiosyncrasies of the Emacs C sources, conditioned on the
> new variable c-ts-mode-emacs-sources-support, at least for those of
> them that are used widely enough.  It is very important.

What do you think of extending the parser to support these macros instead? (So we fork tree-sitter-c.) If we can fix the parser, we don’t need to retrofit hacks onto font-lock, indent, etc, separately, and it truly fixes the problem. The downside is compiling from grammar source to grammar.c needs rust and node tools. But I guess depending on the grammar maintained by tree-sitter’s author isn’t too much different from depending on the grammar maintained by another individual (ie, me)?

Yuan




  reply	other threads:[~2023-04-23  0:28 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-19 16:40 bug#62951: 29.0.90; c-ts-mode: Incorrect fontification due to FOR_EACH_TAIL_SAFE Eli Zaretskii
2023-04-21 20:37 ` Yuan Fu
2023-04-22  7:17   ` Eli Zaretskii
2023-04-23  0:28     ` Yuan Fu [this message]
2023-04-23  6:25       ` Eli Zaretskii
2023-04-24  7:02         ` Yuan Fu
2023-04-23 21:04       ` Dmitry Gutov
2023-04-26 22:19         ` Yuan Fu
2023-04-27  3:14           ` Yuan Fu
2023-04-27 15:03             ` Eli Zaretskii
2023-04-27 19:56               ` Yuan Fu
2023-04-28  5:41                 ` Eli Zaretskii
2023-04-29 22:55                   ` Yuan Fu
2023-04-30  5:24                     ` Eli Zaretskii
2023-04-27  8:57           ` Dmitry Gutov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2BDA88D0-A870-4FE3-BAF8-350FBAA943BF@gmail.com \
    --to=casouri@gmail.com \
    --cc=62951@debbugs.gnu.org \
    --cc=eliz@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.