From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!.POSTED!not-for-mail From: Alan Mackenzie Newsgroups: gmane.emacs.bugs Subject: bug#25362: 26.0.50; comment-region goes into an infinite loop Date: Mon, 9 Jan 2017 21:18:01 +0000 Message-ID: <20170109211801.GA4752@acm.fritz.box> References: NNTP-Posting-Host: blaine.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Trace: blaine.gmane.org 1483996767 10135 195.159.176.226 (9 Jan 2017 21:19:27 GMT) X-Complaints-To: usenet@blaine.gmane.org NNTP-Posting-Date: Mon, 9 Jan 2017 21:19:27 +0000 (UTC) User-Agent: Mutt/1.5.24 (2015-08-30) Cc: 25362@debbugs.gnu.org To: Dima Kogan Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Mon Jan 09 22:19:18 2017 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by blaine.gmane.org with esmtp (Exim 4.84_2) (envelope-from ) id 1cQhLR-0000jS-HY for geb-bug-gnu-emacs@m.gmane.org; Mon, 09 Jan 2017 22:19:09 +0100 Original-Received: from localhost ([::1]:43134 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cQhLV-0002Pd-QO for geb-bug-gnu-emacs@m.gmane.org; Mon, 09 Jan 2017 16:19:13 -0500 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:57562) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cQhLN-0002PL-St for bug-gnu-emacs@gnu.org; Mon, 09 Jan 2017 16:19:08 -0500 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cQhLK-0003x5-I6 for bug-gnu-emacs@gnu.org; Mon, 09 Jan 2017 16:19:05 -0500 Original-Received: from debbugs.gnu.org ([208.118.235.43]:60970) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1cQhLK-0003x1-Dq; Mon, 09 Jan 2017 16:19:02 -0500 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1cQhLK-0006i7-68; Mon, 09 Jan 2017 16:19:02 -0500 X-Loop: help-debbugs@gnu.org Resent-From: Alan Mackenzie Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org, bug-cc-mode@gnu.org Resent-Date: Mon, 09 Jan 2017 21:19:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 25362 X-GNU-PR-Package: emacs,cc-mode X-GNU-PR-Keywords: confirmed Original-Received: via spool by 25362-submit@debbugs.gnu.org id=B25362.148399670025741 (code B ref 25362); Mon, 09 Jan 2017 21:19:02 +0000 Original-Received: (at 25362) by debbugs.gnu.org; 9 Jan 2017 21:18:20 +0000 Original-Received: from localhost ([127.0.0.1]:48136 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1cQhKd-0006h7-HC for submit@debbugs.gnu.org; Mon, 09 Jan 2017 16:18:19 -0500 Original-Received: from ocolin.muc.de ([193.149.48.4]:46552 helo=mail.muc.de) by debbugs.gnu.org with smtp (Exim 4.84_2) (envelope-from ) id 1cQhKb-0006gy-BK for 25362@debbugs.gnu.org; Mon, 09 Jan 2017 16:18:18 -0500 Original-Received: (qmail 45239 invoked by uid 3782); 9 Jan 2017 21:18:15 -0000 Original-Received: from acm.muc.de (p548C7205.dip0.t-ipconnect.de [84.140.114.5]) by colin.muc.de (tmda-ofmipd) with ESMTP; Mon, 09 Jan 2017 22:18:14 +0100 Original-Received: (qmail 4796 invoked by uid 1000); 9 Jan 2017 21:18:01 -0000 Content-Disposition: inline In-Reply-To: X-Delivery-Agent: TMDA/1.1.12 (Macallan) X-Primary-Address: acm@muc.de X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 208.118.235.43 X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Original-Sender: "bug-gnu-emacs" Xref: news.gmane.org gmane.emacs.bugs:127941 Archived-At: Hello again, Dima. On Wed, Jan 04, 2017 at 03:37:25PM -0800, Dima Kogan wrote: > Hi. I'm using a very recent emacs built from master: > eb3416016b4 > I'm observing (comment-region) go into an infinite loop when presented > with particular data. I'm attaching a small source file that > demonstrates the issue. To see it break, run this: > emacs --eval '(add-hook (quote c-mode-common-hook) (lambda () (setq comment-start "//" comment-end "")))' \ > --eval '(global-set-key (kbd "") (lambda () (interactive) (comment-region 16 45)))' \ > -Q tst6.c > This loads the demo file with a clean configuration, and runs two bits > of lisp: > 1. Sets up a c++ commenting style. This is required for the issue to > present itself > 2. Binds f5 to run the problematic (comment-region) > On my machine, invoking emacs that way, and then hitting f5 shows the problem. > void f(void) > { > > g( // output > s ); > > } As already observed, this bug happens simply in C Mode by attempting to comment out the four lines inside the function, doing this by marking the lines then C-c C-c. The cause was a rather nasty situation where a "syntactic whitespace cache", designed to facilitate moving rapidly over large areas of WS (particularly long comments at the beginning of files) got screwed up by the insertion of "// " onto a line already containing a "//" later on. Here is a patch. It is not yet in its final form (the comments need fixing). Would you please try it out and let me know how well/how badly it works: diff -r a3e0a2a33629 cc-engine.el --- a/cc-engine.el Sat Dec 31 16:59:41 2016 +0000 +++ b/cc-engine.el Mon Jan 09 21:03:52 2017 +0000 @@ -1712,46 +1712,122 @@ `((c-debug-remove-face beg end 'c-debug-is-sws-face) (c-debug-remove-face beg end 'c-debug-in-sws-face))))) -(defsubst c-invalidate-sws-region-after (beg end) +;; The type of literal position `end' is in in a `before-change-functions' +;; function - one of `c', `c++', `string', `pound', or nil. +(defvar c-sws-lit-type nil) +;; A cons (BEG . END) of the literal/CPP construct enclosing END, if any, else +;; nil. +(defvar c-sws-lit-limits nil) + +(defun c-invalidate-sws-region-before (end) + ;; Called from c-before-change. END is the end of the change region, the + ;; standard parameter given to all before-change-functions. + ;; + ;; Note whether END is inside a comment of CPP construct, and if so note its + ;; bounds. + (save-excursion + (goto-char end) + (let* ((limits (c-literal-limits)) + (lit-type (c-literal-type limits))) + (cond + ((memq lit-type '(c c++)) + (setq c-sws-lit-type lit-type + c-sws-lit-limits limits)) + ((c-beginning-of-macro) + (setq c-sws-lit-type 'pound + c-sws-lit-limits (cons (point) + (progn (c-end-of-macro) (point))))) + (t (setq c-sws-lit-type nil + c-sws-lit-limits nil)))))) + +(defun c-invalidate-sws-region-after-del (beg end old-len) + ;; Text has been deleted, OLD-LEN characters of it starting from position + ;; BEG. END is typically eq to BEG. Should there have been a comment or + ;; CPP construct open at END before the deletion, check whether this + ;; deletion deleted or "damaged" its opening delimiter. If so, return the + ;; current position of where the construct ended, otherwise return nil. + (when c-sws-lit-limits + (setcdr c-sws-lit-limits (- (cdr c-sws-lit-limits) old-len)) + (if (and (< beg (+ (car c-sws-lit-limits) 2)) + (or (get-text-property end 'c-in-sws) + (next-single-property-change end 'c-in-sws nil + (cdr c-sws-lit-limits)) + (get-text-property end 'c-is-sws) + (next-single-property-change end 'c-is-sws nil + (cdr c-sws-lit-limits)))) + (cdr c-sws-lit-limits)))) + +(defun c-invalidate-sws-region-after-ins (end) + ;; Text has been inserted, ending at buffer position END. Should there be a + ;; literal or CPP construct open at END, check whether there are `c-in-sws' + ;; or `c-is-sws' text properties inside this literal. If there are, return + ;; the buffer position of the end of the literal, else return nil. + (save-excursion + (let* ((limits (c-literal-limits)) + (lit-type (c-literal-type limits))) + (goto-char end) + (when (and (not (memq lit-type '(c c++))) + (c-beginning-of-macro)) + (setq lit-type 'pound + limits (cons (point) + (progn (c-end-of-macro) (point))))) + (when (memq lit-type '(c c++ pound)) + (let ((next-in (next-single-property-change (car limits) 'c-in-sws + nil (cdr limits))) + (next-is (next-single-property-change (car limits) 'c-is-sws + nil (cdr limits)))) + (and (or next-in next-is) + (cdr limits))))))) + +(defun c-invalidate-sws-region-after (beg end old-len) ;; Called from `after-change-functions'. Note that if ;; `c-forward-sws' or `c-backward-sws' are used outside ;; `c-save-buffer-state' or similar then this will remove the cache ;; properties right after they're added. ;; ;; This function does hidden buffer changes. - - (save-excursion - ;; Adjust the end to remove the properties in any following simple - ;; ws up to and including the next line break, if there is any - ;; after the changed region. This is necessary e.g. when a rung - ;; marked empty line is converted to a line comment by inserting - ;; "//" before the line break. In that case the line break would - ;; keep the rung mark which could make a later `c-backward-sws' - ;; move into the line comment instead of over it. - (goto-char end) - (skip-chars-forward " \t\f\v") - (when (and (eolp) (not (eobp))) - (setq end (1+ (point))))) - - (when (and (= beg end) - (get-text-property beg 'c-in-sws) - (> beg (point-min)) - (get-text-property (1- beg) 'c-in-sws)) - ;; Ensure that an `c-in-sws' range gets broken. Note that it isn't - ;; safe to keep a range that was continuous before the change. E.g: - ;; - ;; #define foo - ;; \ - ;; bar - ;; - ;; There can be a "ladder" between "#" and "b". Now, if the newline - ;; after "foo" is removed then "bar" will become part of the cpp - ;; directive instead of a syntactically relevant token. In that - ;; case there's no longer syntactic ws from "#" to "b". - (setq beg (1- beg))) - - (c-debug-sws-msg "c-invalidate-sws-region-after [%s..%s]" beg end) - (c-remove-is-and-in-sws beg end)) + (let ((del-end + (and (> old-len 0) + (c-invalidate-sws-region-after-del beg end old-len))) + (ins-end + (and (> end beg) + (c-invalidate-sws-region-after-ins end)))) + (save-excursion + ;; Adjust the end to remove the properties in any following simple + ;; ws up to and including the next line break, if there is any + ;; after the changed region. This is necessary e.g. when a rung + ;; marked empty line is converted to a line comment by inserting + ;; "//" before the line break. In that case the line break would + ;; keep the rung mark which could make a later `c-backward-sws' + ;; move into the line comment instead of over it. + (goto-char end) + (skip-chars-forward " \t\f\v") + (when (and (eolp) (not (eobp))) + (setq end (1+ (point))))) + + (when (and (= beg end) + (get-text-property beg 'c-in-sws) + (> beg (point-min)) + (get-text-property (1- beg) 'c-in-sws)) + ;; Ensure that an `c-in-sws' range gets broken. Note that it isn't + ;; safe to keep a range that was continuous before the change. E.g: + ;; + ;; #define foo + ;; \ + ;; bar + ;; + ;; There can be a "ladder" between "#" and "b". Now, if the newline + ;; after "foo" is removed then "bar" will become part of the cpp + ;; directive instead of a syntactically relevant token. In that + ;; case there's no longer syntactic ws from "#" to "b". + (setq beg (1- beg))) + + (setq end (max (or del-end end) + (or ins-end end) + end)) + + (c-debug-sws-msg "c-invalidate-sws-region-after [%s..%s]" beg end) + (c-remove-is-and-in-sws beg end))) (defun c-forward-sws () ;; Used by `c-forward-syntactic-ws' to implement the unbounded search. diff -r a3e0a2a33629 cc-langs.el --- a/cc-langs.el Sat Dec 31 16:59:41 2016 +0000 +++ b/cc-langs.el Mon Jan 09 21:03:52 2017 +0000 @@ -1428,6 +1428,15 @@ t "*/" awk nil) +(c-lang-defconst c-block-comment-ender-regexp + ;; Regexp which matches the end of a block comment (if such exists in the + ;; language) + t (if (c-lang-const c-block-comment-ender) + (regexp-quote (c-lang-const c-block-comment-ender)) + "\\<\\>")) +(c-lang-defvar c-block-comment-ender-regexp + (c-lang-const c-block-comment-ender-regexp)) + (c-lang-defconst c-comment-start-regexp ;; Regexp to match the start of any type of comment. t (let ((re (c-make-keywords-re nil diff -r a3e0a2a33629 cc-mode.el --- a/cc-mode.el Sat Dec 31 16:59:41 2016 +0000 +++ b/cc-mode.el Mon Jan 09 21:03:52 2017 +0000 @@ -1189,6 +1189,7 @@ ;; Are we coalescing two tokens together, e.g. "fo o" -> "foo"? (when (< beg end) (c-unfind-coalesced-tokens beg end)) + (c-invalidate-sws-region-before end) ;; Are we (potentially) disrupting the syntactic context which ;; makes a type a type? E.g. by inserting stuff after "foo" in ;; "foo bar;", or before "foo" in "typedef foo *bar;"? @@ -1314,7 +1315,7 @@ (c-clear-char-property-with-value beg end 'syntax-table nil))) (c-trim-found-types beg end old-len) ; maybe we don't need all of these. - (c-invalidate-sws-region-after beg end) + (c-invalidate-sws-region-after beg end old-len) ;; (c-invalidate-state-cache beg) ; moved to `c-before-change'. (c-invalidate-find-decl-cache beg) Thanks for this rather obscure bug report. :-) -- Alan Mackenzie (Nuremberg, Germany).