unofficial mirror of bug-gnu-emacs@gnu.org 
From: Theodor Thornhill via "Bug reports for GNU Emacs, the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org>
To: Eli Zaretskii <eliz@gnu.org>
Cc: 59415@debbugs.gnu.org, casouri@gmail.com
Subject: bug#59415: 29.0.50; [feature/tree-sitter] c-ts-mode fails to fontify a portion of a large C file
Date: Sun, 20 Nov 2022 21:33:06 +0100
Message-ID: <87v8n9qscd.fsf@thornhill.no>
In-Reply-To: <83h6yt4c12.fsf@gnu.org>

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Theodor Thornhill <theo@thornhill.no>
>> Cc: Yuan Fu <casouri@gmail.com>
>> Date: Sun, 20 Nov 2022 20:54:05 +0100
>> 
>> > Observe that fontifications stop at this line for some reason.
>> > Fontification reappears on line 209271.  Maybe it's because of the many
>> > braces that appear in warning face?  Why does TS think there are syntax
>> > errors here?  The C++ TS parser doesn't have that problem, btw.
>> 
>> It seems the C parser definitely can't handle what it's seeing.
>
> Yes, but do you have any clue why it gives up at that line?
>

No, not yet.


> One thing that I see is that many braces around there are shown in warning
> face, so perhaps the parser is overwhelmed by the amount of parsing errors?
>

Yeah, that's my first guess, but that shouldn't be an issue; it should
still be able to font-lock _something_.

>> > P.S. Btw, isn't the treesit-max-buffer-size limit too low?  4 MiB?
>> 
>> It might be!  IIRC treesit uses 10x the buffer size to store the AST, so
>> it'll use some more memory.
>
> After lifting the limit to allow visiting the file, this file causes Emacs
> to go up to 350 MiB.  Which is significant, but definitely not outrageous
> enough to prevent using TS with this file.  And I'm sure "normal" C files
> (as opposed to ones written by a program) will need less memory.  So 4 MiB
> sounds too restrictive to me.  We should maybe increase that to 15 MiB on
> 32-bit systems and say 40 MiB on 64-bit?
>

I think it should probably be the same as at the C level, as I mentioned
in the other mail?
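
For anyone who wants to try the larger limits suggested above in their own
init file, here is a minimal sketch.  It only sets the existing user option
treesit-max-buffer-size to the 15 MiB / 40 MiB values proposed in this
thread.  The check on most-positive-fixnum is an assumption of this sketch
for telling 32-bit from 64-bit builds; it is only a heuristic and will
misclassify a 32-bit build configured with wide integers.

```elisp
;; Sketch only: raise the tree-sitter buffer size limit to the values
;; proposed in this thread (15 MiB on 32-bit, 40 MiB on 64-bit).
;; ASSUMPTION: a fixnum range below 2^32 is taken to mean a 32-bit build.
(let ((mib (* 1024 1024)))
  (setq treesit-max-buffer-size
        (if (< most-positive-fixnum (expt 2 32))
            (* 15 mib)
          (* 40 mib))))
```

Going by the 10x estimate quoted above, a 40 MiB buffer could need on the
order of 400 MiB for the tree, so the higher value is a memory trade-off
rather than a free win.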

>> I'll do some more digging, but in the
>> meantime I attach this profiler report that shows font-locking as the
>> culprit:
>
> Culprit for what?  For slow performance?

Yeah.

> Don't get me wrong: from my POV, TS works here better than CC Mode, in
> many use cases which are much more important than scrolling through
> the entire humongous file top to bottom.  For example, just visiting
> the file takes 3 times as much with CC Mode as with c-ts-mode; going
> to EOB with CC Mode takes more than 1 min 20 sec, whereas TS does it in 2.5
> sec.  And likewise jumping into a random point in the file.  Instead
> of Alan's 150 sec for a full scroll by CC Mode I get 27 min.  The
> number of GC cycles with CC Mode is 10 times as large as with TS.
> (Caveat: my Emacs is built without optimizations, whereas Tree-sitter
> and the language support libraries are, of course, fully optimized.)
>

Ok, that's good to know!

>> In this profile I followed your repro, and did some more movement around
>> the buffer after.  This isn't from emacs -Q, but I believe the results
>> will be just the same, considering where the slowness seems to be:
>> 
>> 
>>        16695  85% - redisplay_internal (C function)
>>        16695  85%  - jit-lock-function
>>        16695  85%   - jit-lock-fontify-now
>>        16695  85%    - jit-lock--run-functions
>>        16695  85%     - run-hook-wrapped
>>        16695  85%      - #<compiled -0x156eddb48a262583>
>>        16695  85%       - font-lock-fontify-region
>>        16695  85%        - font-lock-default-fontify-region
>>        16679  84%         - treesit-font-lock-fontify-region
>
> Yes, treesit-font-lock-fontify-region takes the lion's share.  If you or
> Yuan can speed this up, please do.  But I see no reason to consider this a
> catastrophe, quite to the contrary.

I think it boils down to getting the root node too many times.  In an
unmodified buffer, getting the root node should be instant, but
it seems to take some time.  I'll try to figure out why.
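
One way to check that suspicion is to time repeated root-node lookups
directly.  The snippet below is a rough sketch, not a proper benchmark: it
assumes it is evaluated (e.g. with M-:) in the c-ts-mode buffer holding the
large file, with no edits made between calls, and the repetition count of
100 is arbitrary.

```elisp
;; Sketch only: time 100 calls to `treesit-buffer-root-node' in the
;; current buffer.  `benchmark-run' returns a list of
;; (ELAPSED-SECONDS GC-COUNT GC-SECONDS).
(require 'benchmark)
(let ((result (benchmark-run 100 (treesit-buffer-root-node))))
  (message "100 calls: %.3fs elapsed, %d GCs, %.3fs in GC"
           (nth 0 result) (nth 1 result) (nth 2 result)))
```

If the elapsed time grows with the size of the file even though the buffer
is unmodified, that would point at the root-node lookup rather than at the
font-lock queries themselves.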

Theo






