From: Alan Mackenzie <acm@muc.de>
To: Dima Kogan <dima@secretsauce.net>
Cc: 25362@debbugs.gnu.org
Subject: bug#25362: 26.0.50; comment-region goes into an infinite loop
Date: Mon, 9 Jan 2017 21:18:01 +0000 [thread overview]
Message-ID: <20170109211801.GA4752@acm.fritz.box> (raw)
In-Reply-To: <ypoulguq8ki2.fsf@jpl.nasa.gov>
Hello again, Dima.
On Wed, Jan 04, 2017 at 03:37:25PM -0800, Dima Kogan wrote:
> Hi. I'm using a very recent emacs built from master:
> eb3416016b4
> I'm observing (comment-region) go into an infinite loop when presented
> with particular data. I'm attaching a small source file that
> demonstrates the issue. To see it break, run this:
> emacs --eval '(add-hook (quote c-mode-common-hook) (lambda () (setq comment-start "//" comment-end "")))' \
> --eval '(global-set-key (kbd "<f5>") (lambda () (interactive) (comment-region 16 45)))' \
> -Q tst6.c
> This loads the demo file with a clean configuration, and runs two bits
> of lisp:
> 1. Sets up a c++ commenting style. This is required for the issue to
> present itself
> 2. Binds f5 to run the problematic (comment-region)
> On my machine, invoking emacs that way, and then hitting f5 shows the problem.
> void f(void)
> {
>
> g( // output
> s );
>
> }
As already observed, this bug happens simply in C Mode by attempting to
comment out the four lines inside the function, doing this by marking
the lines then C-c C-c.
The cause was a rather nasty situation where a "syntactic whitespace
cache", designed to facilitate moving rapidly over large areas of WS
(particularly long comments at the beginning of files) got screwed up by
the insertion of "// " onto a line already containing a "//" later on.
Here is a patch. It is not yet in its final form (the comments need
fixing). Would you please try it out and let me know how well/how badly
it works:
diff -r a3e0a2a33629 cc-engine.el
--- a/cc-engine.el Sat Dec 31 16:59:41 2016 +0000
+++ b/cc-engine.el Mon Jan 09 21:03:52 2017 +0000
@@ -1712,46 +1712,122 @@
`((c-debug-remove-face beg end 'c-debug-is-sws-face)
(c-debug-remove-face beg end 'c-debug-in-sws-face)))))
-(defsubst c-invalidate-sws-region-after (beg end)
+;; The type of literal position `end' is in in a `before-change-functions'
+;; function - one of `c', `c++', `string', `pound', or nil.
+(defvar c-sws-lit-type nil)
+;; A cons (BEG . END) of the literal/CPP construct enclosing END, if any, else
+;; nil.
+(defvar c-sws-lit-limits nil)
+
+(defun c-invalidate-sws-region-before (end)
+ ;; Called from c-before-change. END is the end of the change region, the
+ ;; standard parameter given to all before-change-functions.
+ ;;
+ ;; Note whether END is inside a comment of CPP construct, and if so note its
+ ;; bounds.
+ (save-excursion
+ (goto-char end)
+ (let* ((limits (c-literal-limits))
+ (lit-type (c-literal-type limits)))
+ (cond
+ ((memq lit-type '(c c++))
+ (setq c-sws-lit-type lit-type
+ c-sws-lit-limits limits))
+ ((c-beginning-of-macro)
+ (setq c-sws-lit-type 'pound
+ c-sws-lit-limits (cons (point)
+ (progn (c-end-of-macro) (point)))))
+ (t (setq c-sws-lit-type nil
+ c-sws-lit-limits nil))))))
+
+(defun c-invalidate-sws-region-after-del (beg end old-len)
+ ;; Text has been deleted, OLD-LEN characters of it starting from position
+ ;; BEG. END is typically eq to BEG. Should there have been a comment or
+ ;; CPP construct open at END before the deletion, check whether this
+ ;; deletion deleted or "damaged" its opening delimiter. If so, return the
+ ;; current position of where the construct ended, otherwise return nil.
+ (when c-sws-lit-limits
+ (setcdr c-sws-lit-limits (- (cdr c-sws-lit-limits) old-len))
+ (if (and (< beg (+ (car c-sws-lit-limits) 2))
+ (or (get-text-property end 'c-in-sws)
+ (next-single-property-change end 'c-in-sws nil
+ (cdr c-sws-lit-limits))
+ (get-text-property end 'c-is-sws)
+ (next-single-property-change end 'c-is-sws nil
+ (cdr c-sws-lit-limits))))
+ (cdr c-sws-lit-limits))))
+
+(defun c-invalidate-sws-region-after-ins (end)
+ ;; Text has been inserted, ending at buffer position END. Should there be a
+ ;; literal or CPP construct open at END, check whether there are `c-in-sws'
+ ;; or `c-is-sws' text properties inside this literal. If there are, return
+ ;; the buffer position of the end of the literal, else return nil.
+ (save-excursion
+ (let* ((limits (c-literal-limits))
+ (lit-type (c-literal-type limits)))
+ (goto-char end)
+ (when (and (not (memq lit-type '(c c++)))
+ (c-beginning-of-macro))
+ (setq lit-type 'pound
+ limits (cons (point)
+ (progn (c-end-of-macro) (point)))))
+ (when (memq lit-type '(c c++ pound))
+ (let ((next-in (next-single-property-change (car limits) 'c-in-sws
+ nil (cdr limits)))
+ (next-is (next-single-property-change (car limits) 'c-is-sws
+ nil (cdr limits))))
+ (and (or next-in next-is)
+ (cdr limits)))))))
+
+(defun c-invalidate-sws-region-after (beg end old-len)
;; Called from `after-change-functions'. Note that if
;; `c-forward-sws' or `c-backward-sws' are used outside
;; `c-save-buffer-state' or similar then this will remove the cache
;; properties right after they're added.
;;
;; This function does hidden buffer changes.
-
- (save-excursion
- ;; Adjust the end to remove the properties in any following simple
- ;; ws up to and including the next line break, if there is any
- ;; after the changed region. This is necessary e.g. when a rung
- ;; marked empty line is converted to a line comment by inserting
- ;; "//" before the line break. In that case the line break would
- ;; keep the rung mark which could make a later `c-backward-sws'
- ;; move into the line comment instead of over it.
- (goto-char end)
- (skip-chars-forward " \t\f\v")
- (when (and (eolp) (not (eobp)))
- (setq end (1+ (point)))))
-
- (when (and (= beg end)
- (get-text-property beg 'c-in-sws)
- (> beg (point-min))
- (get-text-property (1- beg) 'c-in-sws))
- ;; Ensure that an `c-in-sws' range gets broken. Note that it isn't
- ;; safe to keep a range that was continuous before the change. E.g:
- ;;
- ;; #define foo
- ;; \
- ;; bar
- ;;
- ;; There can be a "ladder" between "#" and "b". Now, if the newline
- ;; after "foo" is removed then "bar" will become part of the cpp
- ;; directive instead of a syntactically relevant token. In that
- ;; case there's no longer syntactic ws from "#" to "b".
- (setq beg (1- beg)))
-
- (c-debug-sws-msg "c-invalidate-sws-region-after [%s..%s]" beg end)
- (c-remove-is-and-in-sws beg end))
+ (let ((del-end
+ (and (> old-len 0)
+ (c-invalidate-sws-region-after-del beg end old-len)))
+ (ins-end
+ (and (> end beg)
+ (c-invalidate-sws-region-after-ins end))))
+ (save-excursion
+ ;; Adjust the end to remove the properties in any following simple
+ ;; ws up to and including the next line break, if there is any
+ ;; after the changed region. This is necessary e.g. when a rung
+ ;; marked empty line is converted to a line comment by inserting
+ ;; "//" before the line break. In that case the line break would
+ ;; keep the rung mark which could make a later `c-backward-sws'
+ ;; move into the line comment instead of over it.
+ (goto-char end)
+ (skip-chars-forward " \t\f\v")
+ (when (and (eolp) (not (eobp)))
+ (setq end (1+ (point)))))
+
+ (when (and (= beg end)
+ (get-text-property beg 'c-in-sws)
+ (> beg (point-min))
+ (get-text-property (1- beg) 'c-in-sws))
+ ;; Ensure that an `c-in-sws' range gets broken. Note that it isn't
+ ;; safe to keep a range that was continuous before the change. E.g:
+ ;;
+ ;; #define foo
+ ;; \
+ ;; bar
+ ;;
+ ;; There can be a "ladder" between "#" and "b". Now, if the newline
+ ;; after "foo" is removed then "bar" will become part of the cpp
+ ;; directive instead of a syntactically relevant token. In that
+ ;; case there's no longer syntactic ws from "#" to "b".
+ (setq beg (1- beg)))
+
+ (setq end (max (or del-end end)
+ (or ins-end end)
+ end))
+
+ (c-debug-sws-msg "c-invalidate-sws-region-after [%s..%s]" beg end)
+ (c-remove-is-and-in-sws beg end)))
(defun c-forward-sws ()
;; Used by `c-forward-syntactic-ws' to implement the unbounded search.
diff -r a3e0a2a33629 cc-langs.el
--- a/cc-langs.el Sat Dec 31 16:59:41 2016 +0000
+++ b/cc-langs.el Mon Jan 09 21:03:52 2017 +0000
@@ -1428,6 +1428,15 @@
t "*/"
awk nil)
+(c-lang-defconst c-block-comment-ender-regexp
+ ;; Regexp which matches the end of a block comment (if such exists in the
+ ;; language)
+ t (if (c-lang-const c-block-comment-ender)
+ (regexp-quote (c-lang-const c-block-comment-ender))
+ "\\<\\>"))
+(c-lang-defvar c-block-comment-ender-regexp
+ (c-lang-const c-block-comment-ender-regexp))
+
(c-lang-defconst c-comment-start-regexp
;; Regexp to match the start of any type of comment.
t (let ((re (c-make-keywords-re nil
diff -r a3e0a2a33629 cc-mode.el
--- a/cc-mode.el Sat Dec 31 16:59:41 2016 +0000
+++ b/cc-mode.el Mon Jan 09 21:03:52 2017 +0000
@@ -1189,6 +1189,7 @@
;; Are we coalescing two tokens together, e.g. "fo o" -> "foo"?
(when (< beg end)
(c-unfind-coalesced-tokens beg end))
+ (c-invalidate-sws-region-before end)
;; Are we (potentially) disrupting the syntactic context which
;; makes a type a type? E.g. by inserting stuff after "foo" in
;; "foo bar;", or before "foo" in "typedef foo *bar;"?
@@ -1314,7 +1315,7 @@
(c-clear-char-property-with-value beg end 'syntax-table nil)))
(c-trim-found-types beg end old-len) ; maybe we don't need all of these.
- (c-invalidate-sws-region-after beg end)
+ (c-invalidate-sws-region-after beg end old-len)
;; (c-invalidate-state-cache beg) ; moved to `c-before-change'.
(c-invalidate-find-decl-cache beg)
Thanks for this rather obscure bug report. :-)
--
Alan Mackenzie (Nuremberg, Germany).
next prev parent reply other threads:[~2017-01-09 21:18 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-01-04 23:37 bug#25362: 26.0.50; comment-region goes into an infinite loop Dima Kogan
2017-01-05 20:54 ` Alan Mackenzie
2017-01-06 7:41 ` Eli Zaretskii
2017-01-06 9:32 ` Alan Mackenzie
2017-01-06 9:37 ` Eli Zaretskii
2017-01-06 9:41 ` Alan Mackenzie
2017-01-07 8:20 ` Alan Mackenzie
2017-01-07 9:44 ` Eli Zaretskii
2017-01-07 10:27 ` Alan Mackenzie
[not found] ` <20170107162907.GC3499@acm.fritz.box>
2017-01-07 16:31 ` Eli Zaretskii
2017-01-09 21:18 ` Alan Mackenzie [this message]
2017-01-09 21:46 ` Dima Kogan
2017-01-09 22:04 ` Alan Mackenzie
2017-01-11 18:32 ` Alan Mackenzie
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170109211801.GA4752@acm.fritz.box \
--to=acm@muc.de \
--cc=25362@debbugs.gnu.org \
--cc=dima@secretsauce.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this external index
https://git.savannah.gnu.org/cgit/emacs.git
https://git.savannah.gnu.org/cgit/emacs/org-mode.git
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.