Hi Alan:

This is to confirm that I have tested your cc-defs.el patch and that it works properly, eliminating the Emacs hang when using the Hyperbole java-defun-prompt-regexp.  Nice work.

Regards,

Bob


On Fri, Oct 13, 2023 at 8:42 AM Alan Mackenzie <acm@muc.de> wrote:
Hello, Jens.

On Thu, Oct 12, 2023 at 21:58:06 +0200, Jens Schmidt wrote:
> Hi Alan,

> Alan Mackenzie <acm@muc.de> writes:

[ .... ]

> >> That always freezes Emacs (29 and master) even before it has a chance to
> >> display P1.java.  The freeze happens in function
> >> `c-get-fallback-scan-pos', where the while loop inf-loops.

Yes.

> > c-get-fallback-scan-pos tries to move to the beginning of a function.
> > This probably involves defun-prompt-regexp when it is non-nil.  :-(

> Otherwise we would see hangs or exponential behavior (?) somewhere in
> the Emacs regexp machinery, but they take place in that while loop.  So
> I guess that there must be some other, additional quality that this
> regexp fulfills.  Like: "matches the empty string" (which it does not,
> as far as I can tell) or: "must only match before curlies" or whatnot.

> Unfortunately, the doc string/info doc of `defun-prompt-regexp' provides
> only exactly that latter criterion:

>   That is to say, a defun begins on a line that starts with a match for
>   this regular expression, followed by a character with open-parenthesis
>   syntax.

> I guess that only pruning that regexp until things start unfreezing
> could give an answer here.  Or more tracing to see how point moves in
> `c-get-fallback-scan-pos'.  But I need some tracing break here ...


> ... or so I thought, I just couldn't resist:

> I expanded and instrumented that function from emacs-29 as follows,
> (hopefully) not changing any of its logic:

> ------------------------- snip -------------------------
> (defun c-get-fallback-scan-pos (here)
>   ;; Return a start position for building `c-state-cache' from scratch.  This
>   ;; will be at the top level, 2 defuns back.  Return nil if we don't find
>   ;; these defun starts a reasonable way back.
>   (message "c-get-fallback-scan-pos")
>   (save-excursion
>     (save-restriction
>       (when (> here (* 10 c-state-cache-too-far))
>         (narrow-to-region (- here (* 10 c-state-cache-too-far)) here))
>       ;; Go back 2 bods, but ignore any bogus positions returned by
>       ;; beginning-of-defun (i.e. open paren in column zero).
>       (goto-char here)
>       (let ((cnt 2))
>         (message "beginning-of-defun-loop-00: %d %d" cnt (point))
>         (while (not (or (bobp) (zerop cnt)))
>           (message "beginning-of-defun-loop-01: %d" (point))
>           (let (beginning-of-defun-function end-of-defun-function)
>             (beginning-of-defun))
>           (and defun-prompt-regexp
>                (looking-at defun-prompt-regexp)
>                (message "beginning-of-defun-loop-02: %d" (point))
>                (goto-char (match-end 0)))
>           (message "beginning-of-defun-loop-03: %d" (point))
>           (if (eq (char-after) ?\{)
>               (setq cnt (1- cnt)))))
>       (and (not (bobp))
>            (point)))))
> ------------------------- snip -------------------------

> That results in the message triple

> ------------------------- snip -------------------------
> beginning-of-defun-loop-01: 5879
> beginning-of-defun-loop-02: 5801
> beginning-of-defun-loop-03: 5879
> beginning-of-defun-loop-01: 5879
> beginning-of-defun-loop-02: 5801
> beginning-of-defun-loop-03: 5879
> ...
> ------------------------- snip -------------------------

> inf-looping.  These points are (|: 5801, ^: 5879) here in P1.java:

> ------------------------- snip -------------------------
> 178    } catch (Exception e) {
> 179|      error("symTable.addDecl", "unexpected error with a single HashMap " + e)^;
> 180    }
> 181
> ------------------------- snip -------------------------

> So the catch-block just before line 181 is recognized as a potential BOD
> (previous trailing open curly?).  But then `defun-prompt-regexp' matches
> the function call in the catch-block as a defun prompt (which it really
> should not?), taking point back to where, on the next BOD search, the
> exact previous BOD is found again.

> So probably there are really two issues here:

> 1. The `defun-prompt-regexp' used by Hyperbole, which matches too
>    broadly, and

> 2. function `c-get-fallback-scan-pos', which could try harder to avoid
>    inf-loops when such things happen.

> But that's where I *really* stop here :-)

You've diagnosed the bug completely.  Thanks!  The hang was caused
entirely by the loop in c-get-fallback-scan-pos, not the deficiencies in
that long regexp.

defun-prompt-regexp, when appended with a \\s( (as is done in
beginning-of-defun-raw) matches the "      error(" on L179 of P1.java.
The bare defun-prompt-regexp (as used in CC Mode) matches the entire
line except the terminating ;.  This regexp could do with some
amendment, but it is not the main cause of the bug.

To solve the bug, I'm amending the macro c-beginning-of-defun-1 so that
it only stops at a defun-prompt-regexp position when it also found a {.
Otherwise it will keep looping until it finds a better position or BOB.

Would all concerned please apply the attached patch to the Emacs master
branch, directory lisp/progmodes.  Then please byte compile CC Mode in
full (a macro has been changed), and try the result on your real Java
code.  (If anybody wants any help applying the patch or byte compiling,
feel free to send me private mail.)  Then please confirm that the bug is
indeed fixed.  Thanks!

--
Alan Mackenzie (Nuremberg, Germany).