all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: Alan Mackenzie <acm@muc.de>
To: Dima Kogan <dima@secretsauce.net>
Cc: 25362@debbugs.gnu.org
Subject: bug#25362: 26.0.50; comment-region goes into an infinite loop
Date: Mon, 9 Jan 2017 21:18:01 +0000	[thread overview]
Message-ID: <20170109211801.GA4752@acm.fritz.box> (raw)
In-Reply-To: <ypoulguq8ki2.fsf@jpl.nasa.gov>

Hello again, Dima.

On Wed, Jan 04, 2017 at 03:37:25PM -0800, Dima Kogan wrote:
> Hi. I'm using a very recent emacs built from master:

>   eb3416016b4

> I'm observing (comment-region) go into an infinite loop when presented
> with particular data. I'm attaching a small source file that
> demonstrates the issue. To see it break, run this:

>     emacs --eval '(add-hook (quote c-mode-common-hook) (lambda () (setq comment-start "//" comment-end "")))'  \
>           --eval '(global-set-key (kbd "<f5>") (lambda () (interactive) (comment-region 16 45)))'              \
>           -Q tst6.c

> This loads the demo file with a clean configuration, and runs two bits
> of lisp:

> 1. Sets up a c++ commenting style. This is required for the issue to
>    present itself

> 2. Binds f5 to run the problematic (comment-region)

> On my machine, invoking emacs that way, and then hitting f5 shows the problem.


> void f(void)
> {
> 
>     g( // output
>       s );
> 
> }

As already observed, this bug happens simply in C Mode by attempting to
comment out the four lines inside the function, doing this by marking
the lines then C-c C-c.

The cause was a rather nasty situation where a "syntactic whitespace
cache", designed to facilitate moving rapidly over large areas of WS
(particularly long comments at the beginning of files) got screwed up by
the insertion of "// " onto a line already containing a "//" later on.

Here is a patch.  It is not yet in its final form (the comments need
fixing).  Would you please try it out and let me know how well/how badly
it works:



diff -r a3e0a2a33629 cc-engine.el
--- a/cc-engine.el	Sat Dec 31 16:59:41 2016 +0000
+++ b/cc-engine.el	Mon Jan 09 21:03:52 2017 +0000
@@ -1712,46 +1712,122 @@
 	 `((c-debug-remove-face beg end 'c-debug-is-sws-face)
 	   (c-debug-remove-face beg end 'c-debug-in-sws-face)))))
 
-(defsubst c-invalidate-sws-region-after (beg end)
+;; The type of literal position `end' is in in a `before-change-functions'
+;; function - one of `c', `c++', `string', `pound', or nil.
+(defvar c-sws-lit-type nil)
+;; A cons (BEG . END) of the literal/CPP construct enclosing END, if any, else
+;; nil.
+(defvar c-sws-lit-limits nil)
+
+(defun c-invalidate-sws-region-before (end)
+  ;; Called from c-before-change.  END is the end of the change region, the
+  ;; standard parameter given to all before-change-functions.
+  ;;
+  ;; Note whether END is inside a comment of CPP construct, and if so note its
+  ;; bounds.
+  (save-excursion
+    (goto-char end)
+    (let* ((limits (c-literal-limits))
+	   (lit-type (c-literal-type limits)))
+      (cond
+       ((memq lit-type '(c c++))
+	(setq c-sws-lit-type lit-type
+	      c-sws-lit-limits limits))
+       ((c-beginning-of-macro)
+	(setq c-sws-lit-type 'pound
+	      c-sws-lit-limits (cons (point)
+				     (progn (c-end-of-macro) (point)))))
+       (t (setq c-sws-lit-type nil
+		c-sws-lit-limits nil))))))
+
+(defun c-invalidate-sws-region-after-del (beg end old-len)
+  ;; Text has been deleted, OLD-LEN characters of it starting from position
+  ;; BEG.  END is typically eq to BEG.  Should there have been a comment or
+  ;; CPP construct open at END before the deletion, check whether this
+  ;; deletion deleted or "damaged" its opening delimiter.  If so, return the
+  ;; current position of where the construct ended, otherwise return nil.
+  (when c-sws-lit-limits
+    (setcdr c-sws-lit-limits (- (cdr c-sws-lit-limits) old-len))
+    (if (and (< beg (+ (car c-sws-lit-limits) 2))
+	     (or (get-text-property end 'c-in-sws)
+		 (next-single-property-change end 'c-in-sws nil
+					      (cdr c-sws-lit-limits))
+		 (get-text-property end 'c-is-sws)
+		 (next-single-property-change end 'c-is-sws nil
+					      (cdr c-sws-lit-limits))))
+	(cdr c-sws-lit-limits))))
+
+(defun c-invalidate-sws-region-after-ins (end)
+  ;; Text has been inserted, ending at buffer position END.  Should there be a
+  ;; literal or CPP construct open at END, check whether there are `c-in-sws'
+  ;; or `c-is-sws' text properties inside this literal.  If there are, return
+  ;; the buffer position of the end of the literal, else return nil.
+  (save-excursion
+    (let* ((limits (c-literal-limits))
+	   (lit-type (c-literal-type limits)))
+      (goto-char end)
+      (when (and (not (memq lit-type '(c c++)))
+		 (c-beginning-of-macro))
+	(setq lit-type 'pound
+	      limits (cons (point)
+			   (progn (c-end-of-macro) (point)))))
+      (when (memq lit-type '(c c++ pound))
+	(let ((next-in (next-single-property-change (car limits) 'c-in-sws
+						    nil (cdr limits)))
+	      (next-is (next-single-property-change (car limits) 'c-is-sws
+						    nil (cdr limits))))
+	  (and (or next-in next-is)
+	       (cdr limits)))))))
+
+(defun c-invalidate-sws-region-after (beg end old-len)
   ;; Called from `after-change-functions'.  Note that if
   ;; `c-forward-sws' or `c-backward-sws' are used outside
   ;; `c-save-buffer-state' or similar then this will remove the cache
   ;; properties right after they're added.
   ;;
   ;; This function does hidden buffer changes.
-
-  (save-excursion
-    ;; Adjust the end to remove the properties in any following simple
-    ;; ws up to and including the next line break, if there is any
-    ;; after the changed region. This is necessary e.g. when a rung
-    ;; marked empty line is converted to a line comment by inserting
-    ;; "//" before the line break. In that case the line break would
-    ;; keep the rung mark which could make a later `c-backward-sws'
-    ;; move into the line comment instead of over it.
-    (goto-char end)
-    (skip-chars-forward " \t\f\v")
-    (when (and (eolp) (not (eobp)))
-      (setq end (1+ (point)))))
-
-  (when (and (= beg end)
-	     (get-text-property beg 'c-in-sws)
-	     (> beg (point-min))
-	     (get-text-property (1- beg) 'c-in-sws))
-    ;; Ensure that an `c-in-sws' range gets broken.  Note that it isn't
-    ;; safe to keep a range that was continuous before the change.  E.g:
-    ;;
-    ;;    #define foo
-    ;;         \
-    ;;    bar
-    ;;
-    ;; There can be a "ladder" between "#" and "b".  Now, if the newline
-    ;; after "foo" is removed then "bar" will become part of the cpp
-    ;; directive instead of a syntactically relevant token.  In that
-    ;; case there's no longer syntactic ws from "#" to "b".
-    (setq beg (1- beg)))
-
-  (c-debug-sws-msg "c-invalidate-sws-region-after [%s..%s]" beg end)
-  (c-remove-is-and-in-sws beg end))
+  (let ((del-end
+	 (and (> old-len 0)
+	      (c-invalidate-sws-region-after-del beg end old-len)))
+	(ins-end
+	 (and (> end beg)
+	      (c-invalidate-sws-region-after-ins end))))
+    (save-excursion
+      ;; Adjust the end to remove the properties in any following simple
+      ;; ws up to and including the next line break, if there is any
+      ;; after the changed region. This is necessary e.g. when a rung
+      ;; marked empty line is converted to a line comment by inserting
+      ;; "//" before the line break. In that case the line break would
+      ;; keep the rung mark which could make a later `c-backward-sws'
+      ;; move into the line comment instead of over it.
+      (goto-char end)
+      (skip-chars-forward " \t\f\v")
+      (when (and (eolp) (not (eobp)))
+	(setq end (1+ (point)))))
+
+    (when (and (= beg end)
+	       (get-text-property beg 'c-in-sws)
+	       (> beg (point-min))
+	       (get-text-property (1- beg) 'c-in-sws))
+      ;; Ensure that an `c-in-sws' range gets broken.  Note that it isn't
+      ;; safe to keep a range that was continuous before the change.  E.g:
+      ;;
+      ;;    #define foo
+      ;;         \
+      ;;    bar
+      ;;
+      ;; There can be a "ladder" between "#" and "b".  Now, if the newline
+      ;; after "foo" is removed then "bar" will become part of the cpp
+      ;; directive instead of a syntactically relevant token.  In that
+      ;; case there's no longer syntactic ws from "#" to "b".
+      (setq beg (1- beg)))
+
+    (setq end (max (or del-end end)
+		   (or ins-end end)
+		   end))
+
+    (c-debug-sws-msg "c-invalidate-sws-region-after [%s..%s]" beg end)
+    (c-remove-is-and-in-sws beg end)))
 
 (defun c-forward-sws ()
   ;; Used by `c-forward-syntactic-ws' to implement the unbounded search.
diff -r a3e0a2a33629 cc-langs.el
--- a/cc-langs.el	Sat Dec 31 16:59:41 2016 +0000
+++ b/cc-langs.el	Mon Jan 09 21:03:52 2017 +0000
@@ -1428,6 +1428,15 @@
   t    "*/"
   awk  nil)
 
+(c-lang-defconst c-block-comment-ender-regexp
+  ;; Regexp which matches the end of a block comment (if such exists in the
+  ;; language)
+  t (if (c-lang-const c-block-comment-ender)
+	(regexp-quote (c-lang-const c-block-comment-ender))
+      "\\<\\>"))
+(c-lang-defvar c-block-comment-ender-regexp
+	       (c-lang-const c-block-comment-ender-regexp))
+
 (c-lang-defconst c-comment-start-regexp
   ;; Regexp to match the start of any type of comment.
   t (let ((re (c-make-keywords-re nil
diff -r a3e0a2a33629 cc-mode.el
--- a/cc-mode.el	Sat Dec 31 16:59:41 2016 +0000
+++ b/cc-mode.el	Mon Jan 09 21:03:52 2017 +0000
@@ -1189,6 +1189,7 @@
 	  ;; Are we coalescing two tokens together, e.g. "fo o" -> "foo"?
 	  (when (< beg end)
 	    (c-unfind-coalesced-tokens beg end))
+	  (c-invalidate-sws-region-before end)
 	  ;; Are we (potentially) disrupting the syntactic context which
 	  ;; makes a type a type?  E.g. by inserting stuff after "foo" in
 	  ;; "foo bar;", or before "foo" in "typedef foo *bar;"?
@@ -1314,7 +1315,7 @@
 	      (c-clear-char-property-with-value beg end 'syntax-table nil)))
 
 	  (c-trim-found-types beg end old-len) ; maybe we don't need all of these.
-	  (c-invalidate-sws-region-after beg end)
+	  (c-invalidate-sws-region-after beg end old-len)
 	  ;; (c-invalidate-state-cache beg) ; moved to `c-before-change'.
 	  (c-invalidate-find-decl-cache beg)
 


Thanks for this rather obscure bug report.  :-)

-- 
Alan Mackenzie (Nuremberg, Germany).





  parent reply	other threads:[~2017-01-09 21:18 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-04 23:37 bug#25362: 26.0.50; comment-region goes into an infinite loop Dima Kogan
2017-01-05 20:54 ` Alan Mackenzie
2017-01-06  7:41   ` Eli Zaretskii
2017-01-06  9:32     ` Alan Mackenzie
2017-01-06  9:37       ` Eli Zaretskii
2017-01-06  9:41         ` Alan Mackenzie
2017-01-07  8:20         ` Alan Mackenzie
2017-01-07  9:44           ` Eli Zaretskii
2017-01-07 10:27             ` Alan Mackenzie
     [not found]             ` <20170107162907.GC3499@acm.fritz.box>
2017-01-07 16:31               ` Eli Zaretskii
2017-01-09 21:18 ` Alan Mackenzie [this message]
2017-01-09 21:46   ` Dima Kogan
2017-01-09 22:04     ` Alan Mackenzie
2017-01-11 18:32       ` Alan Mackenzie

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170109211801.GA4752@acm.fritz.box \
    --to=acm@muc.de \
    --cc=25362@debbugs.gnu.org \
    --cc=dima@secretsauce.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.