unofficial mirror of bug-gnu-emacs@gnu.org 
From: Alan Mackenzie <acm@muc.de>
To: Jens Schmidt <jschmidt4gnu@vodafonemail.de>
Cc: Robert Weiner <rsw@gnu.org>,
	Hank Greenburg <hank.greenburg@protonmail.com>,
	Mats Lidell <mats.lidell@lidells.se>,
	Eli Zaretskii <eliz@gnu.org>,
	rswgnu@gmail.com, 61436@debbugs.gnu.org
Subject: bug#61436: Emacs Freezing With Java Files
Date: Fri, 13 Oct 2023 12:41:57 +0000
Message-ID: <ZSk7FS-pS2Euz1aA@ACM>
In-Reply-To: <875y3bbokx.fsf@sappc2.fritz.box>

[-- Attachment #1: Type: text/plain, Size: 5280 bytes --]

Hello, Jens.

On Thu, Oct 12, 2023 at 21:58:06 +0200, Jens Schmidt wrote:
> Hi Alan,

> Alan Mackenzie <acm@muc.de> writes:

[ .... ]

> >> That always freezes Emacs (29 and master) even before it has a chance to
> >> display P1.java.  The freeze happens in function
> >> `c-get-fallback-scan-pos', where the while loop inf-loops.

Yes.

> > c-get-fallback-scan-pos tries to move to the beginning of a function.
> > This probably involves defun-prompt-regexp when it is non-nil.  :-(

> Otherwise we would see hangs or exponential behavior (?) somewhere in
> the Emacs regexp machinery, but they take place in that while loop.  So
> I guess that there must be some other, additional quality that this
> regexp fulfills.  Like: "matches the empty string" (which it does not,
> as far as I can tell) or: "must only match before curlies" or whatnot.

> Unfortunately, the doc string/info doc of `defun-prompt-regexp' provides
> only exactly that latter criterion:

>   That is to say, a defun begins on a line that starts with a match for
>   this regular expression, followed by a character with open-parenthesis
>   syntax.

> I guess that only pruning that regexp until things start unfreezing
> could give an answer here.  Or more tracing to see how point moves in
> `c-get-fallback-scan-pos'.  But I need some tracing break here ...


> ... or so I thought, I just couldn't resist:

> I expanded and instrumented that function from emacs-29 as follows,
> (hopefully) not changing any of its logic:

> ------------------------- snip -------------------------
> (defun c-get-fallback-scan-pos (here)
>   ;; Return a start position for building `c-state-cache' from scratch.  This
>   ;; will be at the top level, 2 defuns back.  Return nil if we don't find
>   ;; these defun starts a reasonable way back.
>   (message "c-get-fallback-scan-pos")
>   (save-excursion
>     (save-restriction
>       (when (> here (* 10 c-state-cache-too-far))
> 	(narrow-to-region (- here (* 10 c-state-cache-too-far)) here))
>       ;; Go back 2 bods, but ignore any bogus positions returned by
>       ;; beginning-of-defun (i.e. open paren in column zero).
>       (goto-char here)
>       (let ((cnt 2))
> 	(message "beginning-of-defun-loop-00: %d %d" cnt (point))
> 	(while (not (or (bobp) (zerop cnt)))
> 	  (message "beginning-of-defun-loop-01: %d" (point))
> 	  (let (beginning-of-defun-function end-of-defun-function)
> 	    (beginning-of-defun))
> 	  (and defun-prompt-regexp
> 	       (looking-at defun-prompt-regexp)
> 	       (message "beginning-of-defun-loop-02: %d" (point))
> 	       (goto-char (match-end 0)))
> 	  (message "beginning-of-defun-loop-03: %d" (point))
> 	  (if (eq (char-after) ?\{)
> 	      (setq cnt (1- cnt)))))
>       (and (not (bobp))
> 	   (point)))))
> ------------------------- snip -------------------------

> That results in the message triple

> ------------------------- snip -------------------------
> beginning-of-defun-loop-01: 5879
> beginning-of-defun-loop-02: 5801
> beginning-of-defun-loop-03: 5879
> beginning-of-defun-loop-01: 5879
> beginning-of-defun-loop-02: 5801
> beginning-of-defun-loop-03: 5879
> ...
> ------------------------- snip -------------------------

> inf-looping.  These points are (|: 5801, ^: 5879) here in P1.java:

> ------------------------- snip -------------------------
> 178    } catch (Exception e) {
> 179|      error("symTable.addDecl", "unexpected error with a single HashMap " + e)^;
> 180    }
> 181
> ------------------------- snip -------------------------

> So the catch-block just before line 181 is recognized as a potential BOD
> (previous trailing open curly?).  But then `defun-prompt-regexp' matches
> the function call in the catch-block as defun prompt regexp (which it
> better should not?), taking point back to where, on next BOD search, the
> exact previous BOD is found again.

> So probably there are really two issues here:

> 1. The `defun-prompt-regexp' used by Hyperbole, which matches too
>    broadly, and

> 2. function `c-get-fallback-scan-pos', which could try harder to avoid
>    inf-loops when such things happen.

> But that's where I *really* stop here :-)

You've diagnosed the bug completely.  Thanks!  The hang was caused
entirely by the loop in c-get-fallback-scan-pos, not the deficiencies in
that long regexp.

defun-prompt-regexp, with \\s( appended (as is done in
beginning-of-defun-raw), matches the "      error(" on line 179 of P1.java.
The bare defun-prompt-regexp (as used in CC Mode) matches the entire
line except the terminating ;.  This regexp could do with some
amendment, but it is not the main cause of the bug.
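
To make the difference between the two matches concrete, here is a
minimal sketch.  The regexp in it is a hypothetical stand-in that is
merely "too broad" in the same way; it is NOT the regexp Hyperbole
installs, and the names my-toy-defun-prompt-regexp and
my-toy-match-strings are invented for this illustration only.

```elisp
;; Hypothetical stand-in for illustration only; this is NOT the regexp
;; Hyperbole installs.
(defvar my-toy-defun-prompt-regexp "^[ \t]*[A-Za-z_][^;\n]*"
  "Toy defun prompt regexp that matches far too broadly.")

(defun my-toy-match-strings (line)
  "Return a cons (BARE . WITH-PAREN) for LINE.
BARE is the text matched by `my-toy-defun-prompt-regexp' alone.
WITH-PAREN is the text matched once the open-paren syntax class is
appended.  Either element is nil when there is no match."
  (with-temp-buffer
    ;; A fresh temp buffer uses the standard syntax table, in which
    ;; `(', `[' and `{' all have open-parenthesis syntax.
    (insert line)
    (goto-char (point-min))
    (cons (and (looking-at my-toy-defun-prompt-regexp)
               (match-string 0))
          (and (looking-at (concat my-toy-defun-prompt-regexp "\\s("))
               (match-string 0)))))

;; Try it on a line shaped like line 179 of P1.java.
(my-toy-match-strings
 "      error(\"symTable.addDecl\", \"unexpected \" + e);")
```

With the toy regexp, the bare match should run up to (but not include)
the ;, whereas the match with \\s( appended is forced to backtrack and
end just after the ( of the function call.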

To solve the bug, I'm amending the macro c-beginning-of-defun-1 so that
it only stops at a defun-prompt-regexp position when it has also found a
{.  Otherwise it will keep looping until it finds a better position or
BOB.
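
Stripped of the XEmacs branch, the amended loop amounts to roughly the
following.  This is only a sketch as an ordinary function under a
hypothetical name (my-sketch-beginning-of-defun-1), assuming GNU Emacs;
the attached patch is what actually counts.

```elisp
(defun my-sketch-beginning-of-defun-1 ()
  "Move back to a defun start, insisting on a `{' after any prompt match.
Hypothetical GNU Emacs-only reduction of the amended macro."
  (while
      (progn
        ;; Bind both hooks to nil so the generic heuristics are used.
        (let (beginning-of-defun-function end-of-defun-function)
          (beginning-of-defun))
        (and (not (bobp))               ; BOB always ends the loop
             defun-prompt-regexp
             (looking-at (concat defun-prompt-regexp "\\s("))
             ;; The match ended on some other open paren, e.g. the `('
             ;; of a call: return non-nil so `while' goes round again.
             (or (not (eq (char-before (match-end 0)) ?{))
                 ;; The match ended on `{': step onto it and stop.
                 (progn (goto-char (1- (match-end 0)))
                        nil))))))
```

The point of the shape is that a prompt match followed by ( or [ no
longer counts as a stopping place, so point cannot be sent forward to a
position from which the same "defun start" is found again.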

Would all concerned please apply the attached patch to the Emacs master
branch, directory lisp/progmodes.  Then please byte compile CC Mode in
full (a macro has been changed), and try the result on your real Java
code.  (If anybody wants any help applying the patch or byte compiling,
feel free to send me private mail.)  Then please confirm that the bug is
indeed fixed.  Thanks!

-- 
Alan Mackenzie (Nuremberg, Germany).


[-- Attachment #2: diff.20231013.diff --]
[-- Type: text/plain, Size: 3677 bytes --]

diff -r b680bbba3141 cc-defs.el
--- a/cc-defs.el	Fri Sep 29 11:15:58 2023 +0000
+++ b/cc-defs.el	Fri Oct 13 12:23:11 2023 +0000
@@ -944,7 +944,8 @@
      (when dest (goto-char dest) t)))
 \f
 (defmacro c-beginning-of-defun-1 ()
-  ;; Wrapper around beginning-of-defun.
+  ;; Wrapper around beginning-of-defun.  Note that the return value from this
+  ;; macro has no significance.
   ;;
   ;; NOTE: This function should contain the only explicit use of
   ;; beginning-of-defun in CC Mode.  Eventually something better than
@@ -957,44 +958,49 @@
   ;; `c-parse-state'.
 
   `(progn
-     (if (and ,(fboundp 'buffer-syntactic-context-depth)
-	      c-enable-xemacs-performance-kludge-p)
-	 ,(when (fboundp 'buffer-syntactic-context-depth)
-	    ;; XEmacs only.  This can improve the performance of
-	    ;; c-parse-state to between 3 and 60 times faster when
-	    ;; braces are hung.  It can also degrade performance by
-	    ;; about as much when braces are not hung.
-	    '(let (beginning-of-defun-function end-of-defun-function
-					       pos)
-	       (while (not pos)
-		 (save-restriction
-		   (widen)
-		   (setq pos (c-safe-scan-lists
-			      (point) -1 (buffer-syntactic-context-depth))))
-		 (cond
-		  ((bobp) (setq pos (point-min)))
-		  ((not pos)
-		   (let ((distance (skip-chars-backward "^{")))
-		     ;; unbalanced parenthesis, while invalid C code,
-		     ;; shouldn't cause an infloop!  See unbal.c
-		     (when (zerop distance)
-		       ;; Punt!
-		       (beginning-of-defun)
-		       (setq pos (point)))))
-		  ((= pos 0))
-		  ((not (eq (char-after pos) ?{))
-		   (goto-char pos)
-		   (setq pos nil))
-		  ))
-	       (goto-char pos)))
-       ;; Emacs, which doesn't have buffer-syntactic-context-depth
-       (let (beginning-of-defun-function end-of-defun-function)
-	 (beginning-of-defun)))
-     ;; if defun-prompt-regexp is non-nil, b-o-d won't leave us at the
-     ;; open brace.
-     (and defun-prompt-regexp
-	  (looking-at defun-prompt-regexp)
-	  (goto-char (match-end 0)))))
+     (while
+	 (progn
+	   (if (and ,(fboundp 'buffer-syntactic-context-depth)
+		    c-enable-xemacs-performance-kludge-p)
+	       ,(when (fboundp 'buffer-syntactic-context-depth)
+		  ;; XEmacs only.  This can improve the performance of
+		  ;; c-parse-state to between 3 and 60 times faster when
+		  ;; braces are hung.  It can also degrade performance by
+		  ;; about as much when braces are not hung.
+		  '(let (beginning-of-defun-function end-of-defun-function
+						     pos)
+		     (while (not pos)
+		       (save-restriction
+			 (widen)
+			 (setq pos (c-safe-scan-lists
+				    (point) -1 (buffer-syntactic-context-depth))))
+		       (cond
+			((bobp) (setq pos (point-min)))
+			((not pos)
+			 (let ((distance (skip-chars-backward "^{")))
+			   ;; unbalanced parenthesis, while invalid C code,
+			   ;; shouldn't cause an infloop!  See unbal.c
+			   (when (zerop distance)
+			     ;; Punt!
+			     (beginning-of-defun)
+			     (setq pos (point)))))
+			((= pos 0))
+			((not (eq (char-after pos) ?{))
+			 (goto-char pos)
+			 (setq pos nil))
+			))
+		     (goto-char pos)))
+	     ;; Emacs, which doesn't have buffer-syntactic-context-depth
+	     (let (beginning-of-defun-function end-of-defun-function)
+	       (beginning-of-defun)))
+	   (and (not (bobp))
+		;; if defun-prompt-regexp is non-nil, b-o-d won't leave us at
+		;; the open brace.
+		defun-prompt-regexp
+		(looking-at (concat defun-prompt-regexp "\\s("))
+		(or (not (eq (char-before (match-end 0)) ?{))
+		    (progn (goto-char (1- (match-end 0)))
+			   nil)))))))
 
 \f
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
