From: Alan Mackenzie
Newsgroups: gmane.emacs.bugs
Subject: bug#25706: 26.0.50; Slow C file fontification
Date: Tue, 1 Dec 2020 09:21:09 +0000
References: <55C404DC-1C29-449F-9A49-B20EDFFCFCEA@acm.org>
To: Mattias Engdegård
Cc: Lars Ingebrigtsen, 25706@debbugs.gnu.org

Hello, Mattias.

On Mon, Nov 30, 2020 at 17:53:04 +0100, Mattias Engdegård wrote:
> On 30 Nov 2020, at 17:38, Alan Mackenzie wrote:

> > Yes.  I've had a look at the file, and it's large and lacking in
> > braces.  There are functions in CC Mode which search backwards for
> > opening braces to establish context.  When there are none, the
> > search goes back to BOB.  Lots of these searches, not efficiently
> > cached, take a long time.

> > It's a problem with CC Mode, not with the source file.  It's a known
> > problem, and not easy to fix.

> Actually, it's the underscores!

> Demo: fill a file with the line pairs

> #define abc_defg_hij_klm__nop_qrst_uvw_xyz_w__ooa_cin_e__aoi__uynv(s) \
> 0

> repeated 1000 times, thus making it 2000 lines.  Save as something.h.  Slow!
> Now replace each underscore with a letter.  Save.  Fast!

> Fontifying the 2000 line file (with underscores) takes longer than the
> original 80000 line file.

Hey, wonderful!  I haven't tried it yet, but I did try this:
(i) Take the first 10% of the original 4MB file, and save it in a
  different file.
(ii) Fontify that file from top to bottom: according to ELP, 292s.
(iii) Insert 9 new lines containing "{}", one at every 10% of that new
  file.
(iv) Fontify the amended file top to bottom: new time 98s.

That's a difference of a factor of 3.

> I started going through c-find-decl-spots and
> c-find-decl-prefix-search (together there are while statements nested
> 4 deep) but am not sure exactly where the trouble is.  A regexp?
> Something syntax-char related (since '_' has symbol syntax, not word)?

> CC-mode in general thrashes the regexp cache; the miss rate is at 27 %
> for the original file, which is way too high.  Enlarging the cache
> enough to eliminate misses helps, but not nearly enough.

So, you reckon replacing "\\(" by "\\(?:" wherever the first isn't
really needed would make a big difference?  Have I understood you right?
If so, I've got a big job ahead of me, going through all the regexps in
CC Mode doing the replacement, and fixing all the match-beginning and
match-end calls, and so on, which depend on them.

-- 
Alan Mackenzie (Nuremberg, Germany).
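
For anyone wanting to reproduce the demo quoted above, here is a minimal sketch in Python (not part of the original exchange) that writes the 2000-line test file described by Mattias, plus a second copy with every underscore replaced by a letter. The choice of "x" as the replacement letter, the function name make_demo, and the second file name something_letters.h are assumptions made for illustration; only something.h and the macro line itself come from the message.

```python
from pathlib import Path

# The line pair quoted in the message: a long macro name full of
# underscores, continued onto a second line holding just "0".
MACRO_LINE = "#define abc_defg_hij_klm__nop_qrst_uvw_xyz_w__ooa_cin_e__aoi__uynv(s) \\"
BODY_LINE = "0"


def make_demo(repeats: int = 1000, underscores: bool = True) -> str:
    """Return the demo file contents: the line pair repeated `repeats` times.

    With underscores=False every "_" is replaced by the letter "x"
    (an arbitrary choice; the message only says "a letter").
    """
    first = MACRO_LINE if underscores else MACRO_LINE.replace("_", "x")
    return "".join(f"{first}\n{BODY_LINE}\n" for _ in range(repeats))


if __name__ == "__main__":
    # The variant reported as slow to fontify.
    Path("something.h").write_text(make_demo())
    # The variant reported as fast (hypothetical file name).
    Path("something_letters.h").write_text(make_demo(underscores=False))
```

Opening each file in C mode and comparing fontification times should show whether the difference reported in the message appears on a given Emacs build.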