From: Theodor Thornhill via "Bug reports for GNU Emacs, the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org>
To: Eli Zaretskii <eliz@gnu.org>
Cc: 59415@debbugs.gnu.org, casouri@gmail.com
Subject: bug#59415: 29.0.50; [feature/tree-sitter] c-ts-mode fails to fontify a portion of a large C file
Date: Sun, 20 Nov 2022 21:33:06 +0100
Message-ID: <87v8n9qscd.fsf@thornhill.no>
In-Reply-To: <83h6yt4c12.fsf@gnu.org>

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Theodor Thornhill <theo@thornhill.no>
>> Cc: Yuan Fu <casouri@gmail.com>
>> Date: Sun, 20 Nov 2022 20:54:05 +0100
>>
>> > Observe that fontifications stop at this line for some reason.
>> > Fontification reappears on line 209271. Maybe it's because of the many
>> > braces that appear in warning face? Why does TS think there are syntax
>> > errors here? The C++ TS parser doesn't have that problem, btw.
>>
>> It seems the C parser definitely can't handle what it's seeing.
>
> Yes, but do you have any clue why it gives up at that line?
>

No, not yet.

> One thing that I see is that many braces around there are shown in warning
> face, so perhaps the parser is overwhelmed by the amount of parsing errors?
>

Yeah, that's my first guess, but that shouldn't be an issue; it should
still be able to font-lock _something_.

>> > P.S. Btw, isn't the treesit-max-buffer-size limit too low? 4 MiB?
>>
>> It might be! IIRC treesit uses 10x the buffer size to store the AST, so
>> raising it will cost some more memory.
>
> After lifting the limit to allow visiting the file, this file causes Emacs
> to go up to 350 MiB. Which is significant, but definitely not outrageous
> enough to prevent using TS with this file. And I'm sure "normal" C files
> (as opposed to ones written by a program) will need less memory. So 4 MiB
> sounds too restrictive to me. We should maybe increase that to 15 MiB on
> 32-bit systems and say 40 MiB on 64-bit?
>

I think it should probably be the same as at the C level, as I mentioned
in the other mail.

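For reference, a minimal sketch of a word-size-dependent limit along the lines suggested above (15 MiB on 32-bit, 40 MiB on 64-bit). `treesit-max-buffer-size` is the existing variable; the fixnum-range test used to detect a 32-bit build is an assumption of this sketch, and a 32-bit build with wide integers would need a different check:

```elisp
;; Sketch only: pick the limit from the build's fixnum range.
(let ((mib (* 1024 1024)))
  (setq treesit-max-buffer-size
        ;; Assumption: a fixnum range below 2 GiB indicates a 32-bit
        ;; build without wide integers.
        (if (< most-positive-fixnum (* 2 1024 mib))
            (* 15 mib)
          (* 40 mib))))
```

Evaluating this before visiting the file should let c-ts-mode create a parser for buffers up to the chosen size.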
>> I'll do some more digging, but in the
>> meantime I attach this profiler report that shows font-locking as the
>> culprit:
>
> Culprit for what? For slow performance?

Yeah.

> Don't get me wrong: from my POV, TS works here better than CC Mode, in
> many use cases which are much more important than scrolling through
> the entire humongous file top to bottom. For example, just visiting
> the file takes 3 times as long with CC Mode as with c-ts-mode; going
> to EOB with CC Mode takes more than 1 min 20 sec, whereas TS does it in 2.5
> sec. And likewise jumping into a random point in the file. Instead
> of Alan's 150 sec for a full scroll by CC Mode I get 27 min. The
> number of GC cycles with CC Mode is 10 times as large as with TS.
> (Caveat: my Emacs is built without optimizations, whereas Tree-sitter
> and the language support libraries are, of course, fully optimized.)
>

Ok, that's good to know!

>> In this profile I followed your repro, and did some more movement around
>> the buffer after. This isn't from emacs -Q, but I believe the results
>> will be just the same, considering where the slowness seems to be:
>>
>>
>>        16695  85% - redisplay_internal (C function)
>>        16695  85%  - jit-lock-function
>>        16695  85%   - jit-lock-fontify-now
>>        16695  85%    - jit-lock--run-functions
>>        16695  85%     - run-hook-wrapped
>>        16695  85%      - #<compiled -0x156eddb48a262583>
>>        16695  85%       - font-lock-fontify-region
>>        16695  85%        - font-lock-default-fontify-region
>>        16679  84%         - treesit-font-lock-fontify-region
>
> Yes, treesit-font-lock-fontify-region takes the lion's share. If you or
> Yuan can speed this up, please do. But I see no reason to consider this a
> catastrophe, quite to the contrary.

I think it boils down to getting the root node too many times. In an
unmodified buffer, getting the root node should be instant, yet it
seems to take some time. I'll try to figure out why.

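One rough way to check that guess is to time the root-node lookup directly in the large buffer. This is a sketch, assuming a tree-sitter-enabled build with the C grammar installed and the file visited in c-ts-mode; the repetition count of 1000 is arbitrary:

```elisp
(require 'benchmark)

;; Evaluate in the large C buffer, without modifying it first.
;; Returns a list (ELAPSED-SECONDS GC-RUNS GC-SECONDS) for 1000 lookups.
(benchmark-run 1000
  (treesit-buffer-root-node 'c))
```

If the existing parse tree is reused between calls, the elapsed time should stay close to zero; a noticeable total would suggest each lookup does more work than expected.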
Theo