From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#59415: 29.0.50; [feature/tree-sitter] c-ts-mode fails to fontify a portion of a large C file
Date: Sun, 20 Nov 2022 22:16:25 +0200
Message-ID: <83h6yt4c12.fsf@gnu.org>
References: <83v8n94ij9.fsf@gnu.org> <87k03pwgf6.fsf@thornhill.no>
To: Theodor Thornhill
Cc: 59415@debbugs.gnu.org, casouri@gmail.com
In-Reply-To: <87k03pwgf6.fsf@thornhill.no> (message from Theodor Thornhill on Sun, 20 Nov 2022 20:54:05 +0100)

> From: Theodor Thornhill
> Cc: Yuan Fu
> Date: Sun, 20 Nov 2022 20:54:05 +0100
>
> > Observe that fontifications stop at this line for some reason.
> > Fontification reappears on line 209271. Maybe it's because of the many
> > braces that appear in warning face? Why does TS think there are syntax
> > errors here? The C++ TS parser doesn't have that problem, btw.
>
> It seems the c parser definitely can't handle what it's seeing.

Yes, but do you have any clue why it gives up at that line? One thing
that I see is that many braces around there are shown in warning face,
so perhaps the parser is overwhelmed by the amount of parsing errors?

> > P.S. Btw, isn't the treesit-max-buffer-size limit too low? 4 MiB?
>
> It might be! IIRC treesit uses 10x the buffer size to store the ast, so
> it'll be some more memory usage.

After lifting the limit to allow visiting the file, this file causes
Emacs to go up to 350 MiB. Which is significant, but definitely not
outrageous enough to prevent using TS with this file. And I'm sure
"normal" C files (as opposed to ones written by a program) will need
less memory. So 4 MiB sounds too restrictive to me. We should maybe
increase that to 15 MiB on 32-bit systems and say 40 MiB on 64-bit?

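For illustration only, the per-architecture limit floated above could be
tried out from an init file along the following lines. This is a rough
sketch, not the change that was or will be made in Emacs: it assumes that a
small most-positive-fixnum is a good enough proxy for a 32-bit build, and
it takes the 15 MiB and 40 MiB figures directly from the suggestion above.

```elisp
;; Sketch only: pick the tree-sitter buffer size limit by platform.
;; Assumption: a small `most-positive-fixnum' indicates a 32-bit build.
;; A 32-bit build configured --with-wide-int would NOT be detected here.
(let ((mib (* 1024 1024)))
  (setq treesit-max-buffer-size
        (if (< most-positive-fixnum (expt 2 32))
            (* 15 mib)     ; 32-bit: the 15 MiB figure suggested above
          (* 40 mib))))    ; 64-bit: the 40 MiB figure suggested above
```

The value is in bytes, so a file larger than the chosen limit would still
need the limit raised by hand before c-ts-mode can be used in it.
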
> I'll do some more digging, but in the
> meantime I attach this profiler report that shows font-locking as the
> culprit:

Culprit for what? For slow performance?

Don't get me wrong: from my POV, TS works here better than CC Mode, in
many use cases which are much more important than scrolling through
the entire humongous file top to bottom. For example, just visiting
the file takes 3 times as long with CC Mode as with c-ts-mode; going
to EOB with CC Mode takes more than 1 min 20 sec, whereas TS does it in
2.5 sec. And likewise jumping into a random point in the file.
Instead of Alan's 150 sec for a full scroll by CC Mode I get 27 min.
The number of GC cycles with CC Mode is 10 times as large as with TS.
(Caveat: my Emacs is built without optimizations, whereas Tree-sitter
and the language support libraries are, of course, fully optimized.)

> In this profile I followed your repro, and did some more movement around
> the buffer after. This isn't from emacs -Q, but I believe the results
> will be just the same, considering where the slowness seems to be
>
>       16695  85% - redisplay_internal (C function)
>       16695  85%  - jit-lock-function
>       16695  85%   - jit-lock-fontify-now
>       16695  85%    - jit-lock--run-functions
>       16695  85%     - run-hook-wrapped
>       16695  85%      - #
>       16695  85%       - font-lock-fontify-region
>       16695  85%        - font-lock-default-fontify-region
>       16679  84%         - treesit-font-lock-fontify-region

Yes, treesit-font-lock-fontify-region takes the lion's share. If you
or Yuan can speed this up, please do. But I see no reason to consider
this a catastrophe, quite to the contrary.
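
For anyone who wants to repeat the "going to EOB" comparison above, a
measurement could be sketched roughly as follows. The helper name
my/time-jump-to-eob is hypothetical and not part of Emacs. The sketch
assumes an interactive session with the large file already visited, and it
reports the elapsed time and the number of GC cycles returned by
benchmark-run; it is not necessarily how the figures quoted above were
obtained.

```elisp
(require 'benchmark)

;; Hypothetical helper, not part of Emacs: time a jump to the end of the
;; buffer, including the redisplay that fontifies the newly shown text.
;; Meant to be run interactively in the buffer visiting the large file.
(defun my/time-jump-to-eob ()
  "Report elapsed time and GC count for jumping to end of buffer."
  (interactive)
  ;; Start from the beginning so every run measures the same jump.
  (goto-char (point-min))
  (redisplay t)
  (let ((result (benchmark-run 1
                  (goto-char (point-max))
                  (redisplay t))))
    ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-ELAPSED).
    (message "elapsed: %.2fs, GCs: %d, GC time: %.2fs"
             (nth 0 result) (nth 1 result) (nth 2 result))))
```

Running M-x my/time-jump-to-eob once with the buffer in c-mode and once in
c-ts-mode gives two sets of numbers that can be compared side by side.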