From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Alan Mackenzie Newsgroups: gmane.emacs.bugs Subject: bug#25706: 26.0.50; Slow C file fontification Date: Tue, 8 Dec 2020 18:42:35 +0000 Message-ID: References: <956BCA08-0376-4FAD-B1F7-2087C03F6181@acm.org> <53CC4F6E-716E-4D4B-8903-F32CCB676163@acm.org> <05F2A660-A403-4B81-AE77-416A739160A7@acm.org> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="3064"; mail-complaints-to="usenet@ciao.gmane.io" Cc: Lars Ingebrigtsen , 25706@debbugs.gnu.org To: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Tue Dec 08 20:24:03 2020 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1kmiah-0000eq-Au for geb-bug-gnu-emacs@m.gmane-mx.org; Tue, 08 Dec 2020 20:24:03 +0100 Original-Received: from localhost ([::1]:38782 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kmiag-00051T-BY for geb-bug-gnu-emacs@m.gmane-mx.org; Tue, 08 Dec 2020 14:24:02 -0500 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:36666) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kmhx0-0008Jz-J5 for bug-gnu-emacs@gnu.org; Tue, 08 Dec 2020 13:43:02 -0500 Original-Received: from debbugs.gnu.org ([209.51.188.43]:48197) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kmhx0-0002QK-BV; Tue, 08 Dec 2020 13:43:02 -0500 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1kmhx0-0003bO-8I; Tue, 08 Dec 2020 13:43:02 -0500 X-Loop: help-debbugs@gnu.org Resent-From: Alan Mackenzie Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org, bug-cc-mode@gnu.org Resent-Date: Tue, 08 Dec 2020 18:43:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 25706 X-GNU-PR-Package: emacs,cc-mode X-GNU-PR-Keywords: moreinfo Original-Received: via spool by 25706-submit@debbugs.gnu.org id=B25706.160745296813818 (code B ref 25706); Tue, 08 Dec 2020 18:43:02 +0000 Original-Received: (at 25706) by debbugs.gnu.org; 8 Dec 2020 18:42:48 +0000 Original-Received: from localhost ([127.0.0.1]:59737 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kmhwk-0003ai-Jx for submit@debbugs.gnu.org; Tue, 08 Dec 2020 13:42:48 -0500 Original-Received: from colin.muc.de ([193.149.48.1]:40478 helo=mail.muc.de) by debbugs.gnu.org with smtp (Exim 4.84_2) (envelope-from ) id 1kmhwg-0003aH-QN for 25706@debbugs.gnu.org; Tue, 08 Dec 2020 13:42:44 -0500 Original-Received: (qmail 4787 invoked by uid 3782); 8 Dec 2020 18:42:35 -0000 Original-Received: from acm.muc.de (p4fe15bf3.dip0.t-ipconnect.de [79.225.91.243]) by localhost.muc.de (tmda-ofmipd) with ESMTP; Tue, 08 Dec 2020 19:42:35 +0100 Original-Received: (qmail 9020 invoked by uid 1000); 8 Dec 2020 18:42:35 -0000 Content-Disposition: inline In-Reply-To: <05F2A660-A403-4B81-AE77-416A739160A7@acm.org> X-Delivery-Agent: TMDA/1.1.12 (Macallan) X-Primary-Address: acm@muc.de X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Original-Sender: "bug-gnu-emacs" Xref: news.gmane.io gmane.emacs.bugs:195427 Archived-At: Hello again, Mattias. On Sat, Dec 05, 2020 at 16:20:54 +0100, Mattias Engdegård wrote: > 4 dec. 2020 kl. 22.04 skrev Alan Mackenzie : [ .... ] > That's nice, thank you! It seems to be about 19 % faster than the > previous patch on this particular file, which is not bad at all. Well, the enclosed patch improves on this a little, particularly in C++ Mode. (Trying the monster file.h in C++ Mode is now something worth trying). Just as a matter of interest, I've done a fair bit of testing with a larger monster file (~14 MB) in the Linux kernel, at linux/drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_6_1_sh_mask.h . That's 133,000 lines, give or take. Even our largest file, src/xdisp.c is only 36,000 lines. I don't understand how a file describing hardware can come to anything like 133k lines. It must be soul destroying to have to write a driver based on a file like this. That file was put together by AMD, and I suspect they didn't take all that much care to make it usable. > Somehow, the delay when inserting a newline (pressing return) at line > 83610 of osprey_reg_map_macro.h becomes longer with the patch. I think I've fixed this. Thanks for prompting me. > Of course this is more than compensated by the speed-up in general, > but it may be worth taking a look at. There's one thing which still puzzles me. In osprey_reg....h, when scrolling through it (e.g. with (time-scroll)), it stutters markedly at around 13% of the way through. I've managed to localize this, it's happening in the macro c-find-decl-prefix-search (invoked only from c-find-decl-spots), and has something to do with the call to re-search-forward there, but I've not manage to pin down exactly what the cause is. > There is also a new and noticeable delay (0.5-1 s) in the very > beginning when scrolling through the file. (This is with the frame > sized to show 41 lines of 80 chars of a window, excluding mode line > and echo area.) This seems still to be there. I'll admit, I haven't really looked at this yet. Anyhow, please try out the (?)final version of my patch before I commit it and close the bug. It should apply cleanly to the master branch. I might well split it into three changes, two small, one large, since there are, in a sense three distinct fixes there. Thanks! diff --git a/lisp/progmodes/cc-engine.el b/lisp/progmodes/cc-engine.el index 252eec138c..2365085036 100644 --- a/lisp/progmodes/cc-engine.el +++ b/lisp/progmodes/cc-engine.el @@ -972,7 +972,7 @@ c-beginning-of-statement-1 ;; that we've moved. (while (progn (setq pos (point)) - (c-backward-syntactic-ws) + (c-backward-syntactic-ws lim) ;; Protect post-++/-- operators just before a virtual semicolon. (and (not (c-at-vsemi-p)) (/= (skip-chars-backward "-+!*&~@`#") 0)))) @@ -984,7 +984,7 @@ c-beginning-of-statement-1 (if (and (memq (char-before) delims) (progn (forward-char -1) (setq saved (point)) - (c-backward-syntactic-ws) + (c-backward-syntactic-ws lim) (or (memq (char-before) delims) (memq (char-before) '(?: nil)) (eq (char-syntax (char-before)) ?\() @@ -1164,7 +1164,7 @@ c-beginning-of-statement-1 ;; HERE IS THE SINGLE PLACE INSIDE THE PDA LOOP WHERE WE MOVE ;; BACKWARDS THROUGH THE SOURCE. - (c-backward-syntactic-ws) + (c-backward-syntactic-ws lim) (let ((before-sws-pos (point)) ;; The end position of the area to search for statement ;; barriers in this round. @@ -1174,33 +1174,35 @@ c-beginning-of-statement-1 ;; Go back over exactly one logical sexp, taking proper ;; account of macros and escaped EOLs. (while - (progn - (setq comma-delimited (and (not comma-delim) - (eq (char-before) ?\,))) - (unless (c-safe (c-backward-sexp) t) - ;; Give up if we hit an unbalanced block. Since the - ;; stack won't be empty the code below will report a - ;; suitable error. - (setq pre-stmt-found t) - (throw 'loop nil)) - (cond - ;; Have we moved into a macro? - ((and (not macro-start) - (c-beginning-of-macro)) - (save-excursion - (c-backward-syntactic-ws) - (setq before-sws-pos (point))) - ;; Have we crossed a statement boundary? If not, - ;; keep going back until we find one or a "real" sexp. - (and + (and + (progn + (setq comma-delimited (and (not comma-delim) + (eq (char-before) ?\,))) + (unless (c-safe (c-backward-sexp) t) + ;; Give up if we hit an unbalanced block. Since the + ;; stack won't be empty the code below will report a + ;; suitable error. + (setq pre-stmt-found t) + (throw 'loop nil)) + (cond + ;; Have we moved into a macro? + ((and (not macro-start) + (c-beginning-of-macro)) (save-excursion - (c-end-of-macro) - (not (c-crosses-statement-barrier-p - (point) maybe-after-boundary-pos))) - (setq maybe-after-boundary-pos (point)))) - ;; Have we just gone back over an escaped NL? This - ;; doesn't count as a sexp. - ((looking-at "\\\\$"))))) + (c-backward-syntactic-ws lim) + (setq before-sws-pos (point))) + ;; Have we crossed a statement boundary? If not, + ;; keep going back until we find one or a "real" sexp. + (and + (save-excursion + (c-end-of-macro) + (not (c-crosses-statement-barrier-p + (point) maybe-after-boundary-pos))) + (setq maybe-after-boundary-pos (point)))) + ;; Have we just gone back over an escaped NL? This + ;; doesn't count as a sexp. + ((looking-at "\\\\$")))) + (>= (point) lim))) ;; Have we crossed a statement boundary? (setq boundary-pos @@ -1413,7 +1415,7 @@ c-beginning-of-statement-1 ;; Skip over the unary operators that can start the statement. (while (progn - (c-backward-syntactic-ws) + (c-backward-syntactic-ws lim) ;; protect AWK post-inc/decrement operators, etc. (and (not (c-at-vsemi-p (point))) (/= (skip-chars-backward "-.+!*&~@`#") 0))) @@ -3568,15 +3570,18 @@ c-get-fallback-scan-pos ;; Return a start position for building `c-state-cache' from ;; scratch. This will be at the top level, 2 defuns back. (save-excursion - ;; Go back 2 bods, but ignore any bogus positions returned by - ;; beginning-of-defun (i.e. open paren in column zero). - (goto-char here) - (let ((cnt 2)) - (while (not (or (bobp) (zerop cnt))) - (c-beginning-of-defun-1) ; Pure elisp BOD. - (if (eq (char-after) ?\{) - (setq cnt (1- cnt))))) - (point))) + (save-restriction + (when (> here (* 10 c-state-cache-too-far)) + (narrow-to-region (- here (* 10 c-state-cache-too-far)) here)) + ;; Go back 2 bods, but ignore any bogus positions returned by + ;; beginning-of-defun (i.e. open paren in column zero). + (goto-char here) + (let ((cnt 2)) + (while (not (or (bobp) (zerop cnt))) + (c-beginning-of-defun-1) ; Pure elisp BOD. + (if (eq (char-after) ?\{) + (setq cnt (1- cnt))))) + (point)))) (defun c-state-balance-parens-backwards (here- here+ top) ;; Return the position of the opening paren/brace/bracket before HERE- which @@ -3667,9 +3672,7 @@ c-parse-state-get-strategy how-far 0)) ((<= good-pos here) (setq strategy 'forward - start-point (if changed-macro-start - cache-pos - (max good-pos cache-pos)) + start-point (max good-pos cache-pos) how-far (- here start-point))) ((< (- good-pos here) (- here cache-pos)) ; FIXME!!! ; apply some sort of weighting. (setq strategy 'backward @@ -4337,8 +4340,12 @@ c-invalidate-state-cache-1 (if (and dropped-cons (<= too-high-pa here)) (c-append-lower-brace-pair-to-state-cache too-high-pa here here-bol)) - (setq c-state-cache-good-pos (or (c-state-cache-after-top-paren) - (c-state-get-min-scan-pos))))) + (if (and c-state-cache-good-pos (< here c-state-cache-good-pos)) + (setq c-state-cache-good-pos + (or (save-excursion + (goto-char here) + (c-literal-start)) + here))))) ;; The brace-pair desert marker: (when (car c-state-brace-pair-desert) @@ -4796,7 +4803,7 @@ c-on-identifier ;; Handle the "operator +" syntax in C++. (when (and c-overloadable-operators-regexp - (= (c-backward-token-2 0) 0)) + (= (c-backward-token-2 0 nil (c-determine-limit 500)) 0)) (cond ((and (looking-at c-overloadable-operators-regexp) (or (not c-opt-op-identifier-prefix) @@ -5065,7 +5072,8 @@ c-backward-token-2 (while (and (> count 0) (progn - (c-backward-syntactic-ws) + (c-backward-syntactic-ws + limit) (backward-char) (if (looking-at jump-syntax) (goto-char (scan-sexps (1+ (point)) -1)) @@ -5402,8 +5410,12 @@ c-syntactic-skip-backward ;; Optimize for, in particular, large blocks of comments from ;; `comment-region'. (progn (when opt-ws - (c-backward-syntactic-ws) - (setq paren-level-pos (point))) + (let ((opt-pos (point))) + (c-backward-syntactic-ws limit) + (if (or (null limit) + (> (point) limit)) + (setq paren-level-pos (point)) + (goto-char opt-pos)))) t) ;; Move back to a candidate end point which isn't in a literal ;; or in a macro we didn't start in. @@ -5423,7 +5435,11 @@ c-syntactic-skip-backward (setq macro-start (point)))) (goto-char macro-start)))) (when opt-ws - (c-backward-syntactic-ws))) + (let ((opt-pos (point))) + (c-backward-syntactic-ws limit) + (if (and limit + (<= (point) limit)) + (goto-char opt-pos))))) (< (point) pos)) ;; Check whether we're at the wrong level of nesting (when @@ -5474,7 +5490,7 @@ c-syntactic-skip-backward (progn ;; Skip syntactic ws afterwards so that we don't stop at the ;; end of a comment if `skip-chars' is something like "^/". - (c-backward-syntactic-ws) + (c-backward-syntactic-ws limit) (point))))) ;; We might want to extend this with more useful return values in @@ -5762,12 +5778,23 @@ c-literal-type (t 'c))) ; Assuming the range is valid. range)) +(defun c-determine-limit-no-macro (here org-start) + ;; If HERE is inside a macro, and ORG-START is not also in the same macro, + ;; return the beginning of the macro. Otherwise return HERE. Point is not + ;; preserved by this function. + (goto-char here) + (let ((here-BOM (and (c-beginning-of-macro) (point)))) + (if (and here-BOM + (not (eq (progn (goto-char org-start) + (and (c-beginning-of-macro) (point))) + here-BOM))) + here-BOM + here))) + (defsubst c-determine-limit-get-base (start try-size) ;; Get a "safe place" approximately TRY-SIZE characters before START. ;; This defsubst doesn't preserve point. (goto-char start) - (c-backward-syntactic-ws) - (setq start (point)) (let* ((pos (max (- start try-size) (point-min))) (s (c-semi-pp-to-literal pos)) (cand (or (car (cddr s)) pos))) @@ -5776,20 +5803,23 @@ c-determine-limit-get-base (parse-partial-sexp pos start nil nil (car s) 'syntax-table) (point)))) -(defun c-determine-limit (how-far-back &optional start try-size) +(defun c-determine-limit (how-far-back &optional start try-size org-start) ;; Return a buffer position approximately HOW-FAR-BACK non-literal ;; characters from START (default point). The starting position, either ;; point or START may not be in a comment or string. ;; ;; The position found will not be before POINT-MIN and won't be in a - ;; literal. + ;; literal. It will also not be inside a macro, unless START/point is also + ;; in the same macro. ;; ;; We start searching for the sought position TRY-SIZE (default ;; twice HOW-FAR-BACK) bytes back from START. ;; ;; This function must be fast. :-) + (save-excursion (let* ((start (or start (point))) + (org-start (or org-start start)) (try-size (or try-size (* 2 how-far-back))) (base (c-determine-limit-get-base start try-size)) (pos base) @@ -5842,21 +5872,27 @@ c-determine-limit (setq elt (car stack) stack (cdr stack)) (setq count (+ count (cdr elt)))) - - ;; Have we found enough yet? (cond ((null elt) ; No non-literal characters found. - (if (> base (point-min)) - (c-determine-limit how-far-back base (* 2 try-size)) - (point-min))) + (cond + ((> pos start) ; Nothing but literals + base) + ((> base (point-min)) + (c-determine-limit how-far-back base (* 2 try-size) org-start)) + (t base))) ((>= count how-far-back) - (+ (car elt) (- count how-far-back))) + (c-determine-limit-no-macro + (+ (car elt) (- count how-far-back)) + org-start)) ((eq base (point-min)) (point-min)) ((> base (- start try-size)) ; Can only happen if we hit point-min. - (car elt)) + (c-determine-limit-no-macro + (car elt) + org-start)) (t - (c-determine-limit (- how-far-back count) base (* 2 try-size))))))) + (c-determine-limit (- how-far-back count) base (* 2 try-size) + org-start)))))) (defun c-determine-+ve-limit (how-far &optional start-pos) ;; Return a buffer position about HOW-FAR non-literal characters forward @@ -6153,7 +6189,8 @@ c-bs-at-toplevel-p (or (null stack) ; Probably unnecessary. (<= (cadr stack) 1)))) -(defmacro c-find-decl-prefix-search () +(defmacro + c-find-decl-prefix-search () ;; Macro used inside `c-find-decl-spots'. It ought to be a defun, ;; but it contains lots of free variables that refer to things ;; inside `c-find-decl-spots'. The point is left at `cfd-match-pos' @@ -6248,8 +6285,14 @@ c-find-decl-prefix-search ;; preceding syntactic ws to set `cfd-match-pos' and to catch ;; any decl spots in the syntactic ws. (unless cfd-re-match - (c-backward-syntactic-ws) - (setq cfd-re-match (point)))) + (let ((cfd-cbsw-lim + (max (- (point) 1000) (point-min)))) + (c-backward-syntactic-ws cfd-cbsw-lim) + (setq cfd-re-match + (if (or (bobp) (> (point) cfd-cbsw-lim)) + (point) + (point-min)))) ; Set BOB case if the token's too far back. + )) ;; Choose whichever match is closer to the start. (if (< cfd-re-match cfd-prop-match) @@ -6410,7 +6453,7 @@ c-find-decl-spots (while (and (not (bobp)) (c-got-face-at (1- (point)) c-literal-faces)) (goto-char (previous-single-property-change - (point) 'face nil (point-min)))) + (point) 'face nil (point-min)))) ; No limit. FIXME, perhaps? 2020-12-07. ;; XEmacs doesn't fontify the quotes surrounding string ;; literals. @@ -6482,12 +6525,15 @@ c-find-decl-spots (c-invalidate-find-decl-cache cfd-start-pos) (setq syntactic-pos (point)) - (unless (eq syntactic-pos c-find-decl-syntactic-pos) + (unless + (eq syntactic-pos c-find-decl-syntactic-pos) ;; Don't have to do this if the cache is relevant here, ;; typically if the same line is refontified again. If ;; we're just some syntactic whitespace further down we can ;; still use the cache to limit the skipping. - (c-backward-syntactic-ws c-find-decl-syntactic-pos)) + (c-backward-syntactic-ws + (max (or c-find-decl-syntactic-pos (point-min)) + (- (point) 10000) (point-min)))) ;; If we hit `c-find-decl-syntactic-pos' and ;; `c-find-decl-match-pos' is set then we install the cached @@ -6613,7 +6659,8 @@ c-find-decl-spots ;; syntactic ws. (when (and cfd-match-pos (< cfd-match-pos syntactic-pos)) (goto-char syntactic-pos) - (c-forward-syntactic-ws) + (c-forward-syntactic-ws + (min (+ (point) 2000) (point-max))) (and cfd-continue-pos (< cfd-continue-pos (point)) (setq cfd-token-pos (point)))) @@ -6654,7 +6701,8 @@ c-find-decl-spots ;; can't be nested, and that's already been done in ;; `c-find-decl-prefix-search'. (when (> cfd-continue-pos cfd-token-pos) - (c-forward-syntactic-ws) + (c-forward-syntactic-ws + (min (+ (point) 2000) (point-max))) (setq cfd-token-pos (point))) ;; Continue if the following token fails the @@ -8817,7 +8865,7 @@ c-back-over-member-initializer-braces (or res (goto-char here)) res)) -(defmacro c-back-over-list-of-member-inits () +(defmacro c-back-over-list-of-member-inits (limit) ;; Go back over a list of elements, each looking like: ;; () , ;; or {} , (with possibly a <....> expressions @@ -8826,21 +8874,21 @@ c-back-over-list-of-member-inits ;; a comma. If either of or bracketed is missing, ;; throw nil to 'level. If the terminating } or ) is unmatched, throw nil ;; to 'done. This is not a general purpose macro! - '(while (eq (char-before) ?,) + `(while (eq (char-before) ?,) (backward-char) - (c-backward-syntactic-ws) + (c-backward-syntactic-ws ,limit) (when (not (memq (char-before) '(?\) ?}))) (throw 'level nil)) (when (not (c-go-list-backward)) (throw 'done nil)) - (c-backward-syntactic-ws) + (c-backward-syntactic-ws ,limit) (while (eq (char-before) ?>) (when (not (c-backward-<>-arglist nil)) (throw 'done nil)) - (c-backward-syntactic-ws)) + (c-backward-syntactic-ws ,limit)) (when (not (c-back-over-compound-identifier)) (throw 'level nil)) - (c-backward-syntactic-ws))) + (c-backward-syntactic-ws ,limit))) (defun c-back-over-member-initializers (&optional limit) ;; Test whether we are in a C++ member initializer list, and if so, go back @@ -8859,14 +8907,14 @@ c-back-over-member-initializers (catch 'done (setq level-plausible (catch 'level - (c-backward-syntactic-ws) + (c-backward-syntactic-ws limit) (when (memq (char-before) '(?\) ?})) (when (not (c-go-list-backward)) (throw 'done nil)) - (c-backward-syntactic-ws)) + (c-backward-syntactic-ws limit)) (when (c-back-over-compound-identifier) - (c-backward-syntactic-ws)) - (c-back-over-list-of-member-inits) + (c-backward-syntactic-ws limit)) + (c-back-over-list-of-member-inits limit) (and (eq (char-before) ?:) (save-excursion (c-backward-token-2) @@ -8880,14 +8928,14 @@ c-back-over-member-initializers (setq level-plausible (catch 'level (goto-char pos) - (c-backward-syntactic-ws) + (c-backward-syntactic-ws limit) (when (not (c-back-over-compound-identifier)) (throw 'level nil)) - (c-backward-syntactic-ws) - (c-back-over-list-of-member-inits) + (c-backward-syntactic-ws limit) + (c-back-over-list-of-member-inits limit) (and (eq (char-before) ?:) (save-excursion - (c-backward-token-2) + (c-backward-token-2 nil nil limit) (not (looking-at c-:$-multichar-token-regexp))) (c-just-after-func-arglist-p))))) @@ -12012,7 +12060,7 @@ c-looking-at-inexpr-block (goto-char haskell-op-pos)) (while (and (eq res 'maybe) - (progn (c-backward-syntactic-ws) + (progn (c-backward-syntactic-ws lim) (> (point) closest-lim)) (not (bobp)) (progn (backward-char) @@ -12783,7 +12831,7 @@ c-guess-basic-syntax (setq paren-state (cons containing-sexp paren-state) containing-sexp nil))) (setq lim (1+ containing-sexp)))) - (setq lim (point-min))) + (setq lim (c-determine-limit 1000))) ;; If we're in a parenthesis list then ',' delimits the ;; "statements" rather than being an operator (with the @@ -13025,7 +13073,9 @@ c-guess-basic-syntax ;; CASE 4: In-expression statement. C.f. cases 7B, 16A and ;; 17E. ((setq placeholder (c-looking-at-inexpr-block - (c-safe-position containing-sexp paren-state) + (or + (c-safe-position containing-sexp paren-state) + (c-determine-limit 1000 containing-sexp)) containing-sexp ;; Have to turn on the heuristics after ;; the point even though it doesn't work @@ -13150,7 +13200,8 @@ c-guess-basic-syntax ;; init lists can, in practice, be very large. ((save-excursion (when (and (c-major-mode-is 'c++-mode) - (setq placeholder (c-back-over-member-initializers))) + (setq placeholder (c-back-over-member-initializers + lim))) (setq tmp-pos (point)))) (if (= (c-point 'bosws) (1+ tmp-pos)) (progn @@ -13469,7 +13520,7 @@ c-guess-basic-syntax ;; CASE 5I: ObjC method definition. ((and c-opt-method-key (looking-at c-opt-method-key)) - (c-beginning-of-statement-1 nil t) + (c-beginning-of-statement-1 (c-determine-limit 1000) t) (if (= (point) indent-point) ;; Handle the case when it's the first (non-comment) ;; thing in the buffer. Can't look for a 'same return @@ -13542,7 +13593,16 @@ c-guess-basic-syntax (if (>= (point) indent-point) (throw 'not-in-directive t)) (setq placeholder (point))) - nil))))) + nil)) + (and macro-start + (not (c-beginning-of-statement-1 lim nil nil nil t)) + (setq placeholder + (let ((ps-top (car paren-state))) + (if (consp ps-top) + (progn + (goto-char (cdr ps-top)) + (c-forward-syntactic-ws indent-point)) + (point-min)))))))) ;; For historic reasons we anchor at bol of the last ;; line of the previous declaration. That's clearly ;; highly bogus and useless, and it makes our lives hard @@ -13591,19 +13651,30 @@ c-guess-basic-syntax (eq (char-before) ?<) (not (and c-overloadable-operators-regexp (c-after-special-operator-id lim)))) - (c-beginning-of-statement-1 (c-safe-position (point) paren-state)) + (c-beginning-of-statement-1 + (or + (c-safe-position (point) paren-state) + (c-determine-limit 1000))) (c-add-syntax 'template-args-cont (c-point 'boi))) ;; CASE 5Q: we are at a statement within a macro. - (macro-start - (c-beginning-of-statement-1 containing-sexp) + ((and + macro-start + (save-excursion + (prog1 + (not (eq (c-beginning-of-statement-1 + (or containing-sexp (c-determine-limit 1000)) + nil nil nil t) + nil))) + (setq placeholder (point)))) + (goto-char placeholder) (c-add-stmt-syntax 'statement nil t containing-sexp paren-state)) - ;;CASE 5N: We are at a topmost continuation line and the only + ;;CASE 5S: We are at a topmost continuation line and the only ;;preceding items are annotations. ((and (c-major-mode-is 'java-mode) (setq placeholder (point)) - (c-beginning-of-statement-1) + (c-beginning-of-statement-1 lim) (progn (while (and (c-forward-annotation)) (c-forward-syntactic-ws)) @@ -13615,7 +13686,9 @@ c-guess-basic-syntax ;; CASE 5M: we are at a topmost continuation line (t - (c-beginning-of-statement-1 (c-safe-position (point) paren-state)) + (c-beginning-of-statement-1 + (or (c-safe-position (point) paren-state) + (c-determine-limit 1000))) (when (c-major-mode-is 'objc-mode) (setq placeholder (point)) (while (and (c-forward-objc-directive) @@ -13671,8 +13744,9 @@ c-guess-basic-syntax (setq tmpsymbol '(block-open . inexpr-statement) placeholder (cdr-safe (c-looking-at-inexpr-block - (c-safe-position containing-sexp - paren-state) + (or + (c-safe-position containing-sexp paren-state) + (c-determine-limit 1000 containing-sexp)) containing-sexp))) ;; placeholder is nil if it's a block directly in ;; a function arglist. That makes us skip out of @@ -13804,7 +13878,9 @@ c-guess-basic-syntax (setq placeholder (c-guess-basic-syntax)))) (setq c-syntactic-context placeholder) (c-beginning-of-statement-1 - (c-safe-position (1- containing-sexp) paren-state)) + (or + (c-safe-position (1- containing-sexp) paren-state) + (c-determine-limit 1000 (1- containing-sexp)))) (c-forward-token-2 0) (while (cond ((looking-at c-specifier-key) @@ -13838,7 +13914,8 @@ c-guess-basic-syntax (c-add-syntax 'brace-list-close (point)) (setq lim (or (save-excursion (and - (c-back-over-member-initializers) + (c-back-over-member-initializers + (c-determine-limit 1000)) (point))) (c-most-enclosing-brace state-cache (point)))) (c-beginning-of-statement-1 lim nil nil t) @@ -13871,7 +13948,8 @@ c-guess-basic-syntax (c-add-syntax 'brace-list-intro (point)) (setq lim (or (save-excursion (and - (c-back-over-member-initializers) + (c-back-over-member-initializers + (c-determine-limit 1000)) (point))) (c-most-enclosing-brace state-cache (point)))) (c-beginning-of-statement-1 lim nil nil t) @@ -13927,7 +14005,9 @@ c-guess-basic-syntax ;; CASE 16A: closing a lambda defun or an in-expression ;; block? C.f. cases 4, 7B and 17E. ((setq placeholder (c-looking-at-inexpr-block - (c-safe-position containing-sexp paren-state) + (or + (c-safe-position containing-sexp paren-state) + (c-determine-limit 1000 containing-sexp)) nil)) (setq tmpsymbol (if (eq (car placeholder) 'inlambda) 'inline-close @@ -14090,7 +14170,9 @@ c-guess-basic-syntax ;; CASE 17E: first statement in an in-expression block. ;; C.f. cases 4, 7B and 16A. ((setq placeholder (c-looking-at-inexpr-block - (c-safe-position containing-sexp paren-state) + (or + (c-safe-position containing-sexp paren-state) + (c-determine-limit 1000 containing-sexp)) nil)) (setq tmpsymbol (if (eq (car placeholder) 'inlambda) 'defun-block-intro diff --git a/lisp/progmodes/cc-fonts.el b/lisp/progmodes/cc-fonts.el index bb7e5bea6e..166cbd7a49 100644 --- a/lisp/progmodes/cc-fonts.el +++ b/lisp/progmodes/cc-fonts.el @@ -947,7 +947,7 @@ c-font-lock-complex-decl-prepare ;; closest token before the region. (save-excursion (let ((pos (point))) - (c-backward-syntactic-ws) + (c-backward-syntactic-ws (max (- (point) 500) (point-min))) (c-clear-char-properties (if (and (not (bobp)) (memq (c-get-char-property (1- (point)) 'c-type) @@ -969,7 +969,7 @@ c-font-lock-complex-decl-prepare ;; The declared identifiers are font-locked correctly as types, if ;; that is what they are. (let ((prop (save-excursion - (c-backward-syntactic-ws) + (c-backward-syntactic-ws (max (- (point) 500) (point-min))) (unless (bobp) (c-get-char-property (1- (point)) 'c-type))))) (when (memq prop '(c-decl-id-start c-decl-type-start)) @@ -1008,15 +1008,24 @@ c-font-lock-<>-arglists (boundp 'parse-sexp-lookup-properties))) (c-parse-and-markup-<>-arglists t) c-restricted-<>-arglists - id-start id-end id-face pos kwd-sym) + id-start id-end id-face pos kwd-sym + old-pos) (while (and (< (point) limit) - (re-search-forward c-opt-<>-arglist-start limit t)) - - (setq id-start (match-beginning 1) - id-end (match-end 1) - pos (point)) - + (setq old-pos (point)) + (c-syntactic-re-search-forward "<" limit t nil t)) + (setq pos (point)) + (save-excursion + (backward-char) + (c-backward-syntactic-ws old-pos) + (if (re-search-backward + (concat "\\(\\`\\|" c-nonsymbol-key "\\)\\(" c-symbol-key"\\)\\=") + old-pos t) + (setq id-start (match-beginning 2) + id-end (match-end 2)) + (setq id-start nil id-end nil))) + + (when id-start (goto-char id-start) (unless (c-skip-comments-and-strings limit) (setq kwd-sym nil @@ -1033,7 +1042,7 @@ c-font-lock-<>-arglists (when (looking-at c-opt-<>-sexp-key) ;; There's a special keyword before the "<" that tells ;; that it's an angle bracket arglist. - (setq kwd-sym (c-keyword-sym (match-string 1))))) + (setq kwd-sym (c-keyword-sym (match-string 2))))) (t ;; There's a normal identifier before the "<". If we're not in @@ -1067,7 +1076,7 @@ c-font-lock-<>-arglists 'font-lock-type-face)))))) (goto-char pos))) - (goto-char pos)))))) + (goto-char pos))))))) nil) (defun c-font-lock-declarators (limit list types not-top @@ -1496,7 +1505,8 @@ c-font-lock-declarations ;; Check we haven't missed a preceding "typedef". (when (not (looking-at c-typedef-key)) - (c-backward-syntactic-ws) + (c-backward-syntactic-ws + (max (- (point) 1000) (point-min))) (c-backward-token-2) (or (looking-at c-typedef-key) (goto-char start-pos))) @@ -1536,8 +1546,10 @@ c-font-lock-declarations (c-backward-token-2) (and (not (looking-at c-opt-<>-sexp-key)) - (progn (c-backward-syntactic-ws) - (memq (char-before) '(?\( ?,))) + (progn + (c-backward-syntactic-ws + (max (- (point) 1000) (point-min))) + (memq (char-before) '(?\( ?,))) (not (eq (c-get-char-property (1- (point)) 'c-type) 'c-decl-arg-start)))))) @@ -2295,7 +2307,8 @@ c-font-lock-c++-using (and c-colon-type-list-re (c-go-up-list-backward) (eq (char-after) ?{) - (eq (car (c-beginning-of-decl-1)) 'same) + (eq (car (c-beginning-of-decl-1 + (c-determine-limit 1000))) 'same) (looking-at c-colon-type-list-re))) ;; Inherited protected member: leave unfontified ) diff --git a/lisp/progmodes/cc-langs.el b/lisp/progmodes/cc-langs.el index d6089ea295..4d1aeaa5cb 100644 --- a/lisp/progmodes/cc-langs.el +++ b/lisp/progmodes/cc-langs.el @@ -699,6 +699,7 @@ c-populate-syntax-table ;; The same thing regarding Unicode identifiers applies here as to ;; `c-symbol-key'. t (concat "[" (c-lang-const c-nonsymbol-chars) "]")) +(c-lang-defvar c-nonsymbol-key (c-lang-const c-nonsymbol-key)) (c-lang-defconst c-identifier-ops "The operators that make up fully qualified identifiers. nil in diff --git a/lisp/progmodes/cc-mode.el b/lisp/progmodes/cc-mode.el index c5201d1af5..df9709df94 100644 --- a/lisp/progmodes/cc-mode.el +++ b/lisp/progmodes/cc-mode.el @@ -499,11 +499,14 @@ c-unfind-coalesced-tokens (save-excursion (when (< beg end) (goto-char beg) + (let ((lim (c-determine-limit 1000)) + (lim+ (c-determine-+ve-limit 1000 end))) (when (and (not (bobp)) - (progn (c-backward-syntactic-ws) (eq (point) beg)) + (progn (c-backward-syntactic-ws lim) (eq (point) beg)) (/= (skip-chars-backward c-symbol-chars (1- (point))) 0) - (progn (goto-char beg) (c-forward-syntactic-ws) (<= (point) end)) + (progn (goto-char beg) (c-forward-syntactic-ws lim+) + (<= (point) end)) (> (point) beg) (goto-char end) (looking-at c-symbol-char-key)) @@ -514,14 +517,14 @@ c-unfind-coalesced-tokens (goto-char end) (when (and (not (eobp)) - (progn (c-forward-syntactic-ws) (eq (point) end)) + (progn (c-forward-syntactic-ws lim+) (eq (point) end)) (looking-at c-symbol-char-key) - (progn (c-backward-syntactic-ws) (>= (point) beg)) + (progn (c-backward-syntactic-ws lim) (>= (point) beg)) (< (point) end) (/= (skip-chars-backward c-symbol-chars (1- (point))) 0)) (goto-char (1+ end)) (c-end-of-current-token) - (c-unfind-type (buffer-substring-no-properties end (point))))))) + (c-unfind-type (buffer-substring-no-properties end (point)))))))) ;; c-maybe-stale-found-type records a place near the region being ;; changed where an element of `found-types' might become stale. It @@ -1993,10 +1996,10 @@ c-before-change ;; inserting stuff after "foo" in "foo bar;", or ;; before "foo" in "typedef foo *bar;"? ;; - ;; We search for appropriate c-type properties "near" - ;; the change. First, find an appropriate boundary - ;; for this property search. - (let (lim + ;; We search for appropriate c-type properties "near" the + ;; change. First, find an appropriate boundary for this + ;; property search. + (let (lim lim-2 type type-pos marked-id term-pos (end1 @@ -2007,8 +2010,11 @@ c-before-change (when (>= end1 beg) ; Don't hassle about changes entirely in ; comments. ;; Find a limit for the search for a `c-type' property + ;; Point is currently undefined. A `goto-char' somewhere is needed. (2020-12-06). + (setq lim-2 (c-determine-limit 1000 (point) ; that is wrong. FIXME!!! (2020-12-06) + )) (while - (and (/= (skip-chars-backward "^;{}") 0) + (and (/= (skip-chars-backward "^;{}" lim-2) 0) (> (point) (point-min)) (memq (c-get-char-property (1- (point)) 'face) '(font-lock-comment-face font-lock-string-face)))) @@ -2032,7 +2038,8 @@ c-before-change (buffer-substring-no-properties (point) type-pos))) (goto-char end1) - (skip-chars-forward "^;{}") ; FIXME!!! loop for + (setq lim-2 (c-determine-+ve-limit 1000)) + (skip-chars-forward "^;{}" lim-2) ; FIXME!!! loop for ; comment, maybe (setq lim (point)) (setq term-pos @@ -2270,9 +2277,11 @@ c-fl-decl-end ;; preserved. (goto-char pos) (let ((lit-start (c-literal-start)) + (lim (c-determine-limit 1000)) enclosing-attribute pos1) (unless lit-start - (c-backward-syntactic-ws) + (c-backward-syntactic-ws + lim) (when (setq enclosing-attribute (c-enclosing-c++-attribute)) (goto-char (car enclosing-attribute))) ; Only happens in C++ Mode. (when (setq pos1 (c-on-identifier)) @@ -2296,14 +2305,14 @@ c-fl-decl-end (setq pos1 (c-on-identifier)) (goto-char pos1) (progn - (c-backward-syntactic-ws) + (c-backward-syntactic-ws lim) (eq (char-before) ?\()) (c-fl-decl-end (1- (point)))) - (c-backward-syntactic-ws) + (c-backward-syntactic-ws lim) (point)))) (and (progn (c-forward-syntactic-ws lim) (not (eobp))) - (c-backward-syntactic-ws) + (c-backward-syntactic-ws lim) (point))))))))) (defun c-change-expand-fl-region (_beg _end _old-len) -- Alan Mackenzie (Nuremberg, Germany).