From: Alan Mackenzie
Newsgroups: gmane.emacs.bugs
Subject: bug#61436: Emacs Freezing With Java Files
Date: Mon, 16 Oct 2023 14:05:20 +0000
References: <87il7ew5wx.fsf@sappc2.fritz.box> <87il7dbosk.fsf@lidells.se> <87r0m1t0el.fsf@sappc2.fritz.box> <875y3bbokx.fsf@sappc2.fritz.box> <8734yew8yr.fsf@sappc2.fritz.box>
To: Robert Weiner
Cc: Robert Weiner, Hank Greenburg, Mats Lidell, 61436-done@debbugs.gnu.org, acm@muc.de, Eli Zaretskii, Jens Schmidt

Hello, Bob.

On Sun, Oct 15, 2023 at 06:20:15 -0400, Robert Weiner wrote:
> Hi Alan:

> Would be great if you can improve those two regexps.  The only
> requirement is that they be able to recognize all defuns in the two
> languages as best a regexp can, so that the whole defun can be
> selected based on finding the opening brace regardless of coding
> style.

So far, I've only looked at the Java regexp.  It had some serious
deficiencies, notably:
(i) It used "\\s-" (space syntax) a lot.  This fails to match \n, which
  in Java mode has comment-end syntax.
(ii) The bit for the parenthesis expression was in an optional part of
  the regexp, with the result that it would match "almost anything"
  rather than a defun start.

In the following regexp these faults are fixed.  Additionally, I've
included more modifiers (things like private, volatile) which Java seems
to have gathered over the years.  I've also attempted to match generic
functions.  I don't know how well this will work out.

Here's the regexp.  Would people please try it out and let me know how
well it works.


(defconst java-defun-prompt-regexp
  ;; `let*', because `modifier*' refers to the earlier binding `space+'.
  (let* ((space* "[ \t\n\r\f]*")
         (space+ "[ \t\n\r\f]+")
         (modifier*
          (concat "\\(?:"
                  (regexp-opt '("abstract" "const" "default" "final" "native"
                                "private" "protected" "public" "static"
                                "strictfp" "synchronized" "threadsafe"
                                "transient" "volatile")
                              'words)   ; Compatible with XEmacs
                  space+
                  "\\)*"))
         (ids-with-dots "[_$a-zA-Z][_$.a-zA-Z0-9]*")
         (ids-with-dot-\[\] "[[_$a-zA-Z][][_$.a-zA-Z0-9]*")
         (paren-exp "([^);{}]*)")
         (generic-exp "<[^(){};]*>"))
    (concat "^[ \t]*"
            modifier*
            "\\(?:" generic-exp space* "\\)?"
            ids-with-dot-\[\] space+    ; first part of type
            "\\(?:" ids-with-dot-\[\] space+ "\\)?" ; optional second part of type.
            "\\(?:[_a-zA-Z][^][ \t:;.,{}()=<>]*" ; defun name
                "\\|" ids-with-dots
            "\\)" space*
            paren-exp
            "\\(?:" space* "]\\)*"      ; What's this for?
            "\\(?:" space* "\\<throws\\>" space* ids-with-dot-\[\]
                "\\(?:," space* ids-with-dot-\[\] "\\)*"
            "\\)?"
            space*)))

> Thanks.
> -- Bob
>
> On Oct 14, 2023, at 8:41 PM, Alan Mackenzie wrote:

[ .... ]

> > Mats, I'm willing to work on that regular expression, and also the one
> > for C++.  As I mentioned earlier, I've got some tools which work on
> > regexps, in particular pp-regexp, which prints a regexp more readably on
> > several lines, and fix-re, which rewrites a regexp when it is
> > ill-conditioned in certain ways.

> > I foresee reverse engineering the regexps into more readable forms built
> > up by concatenating basic blocks.  For example for the java regexp I
> > would define

> > (defconst id "[a-zA-Z][][_$.a-zA-Z0-9]*")

> > , and use this id in a largish concat form.

> > I'm also willing to share pp-regexp and fix-re with you(r team), if that
> > might help, on the understanding that neither is of release quality.

-- 
Alan Mackenzie (Nuremberg, Germany).
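
Point (i) above, that "\\s-" does not match a newline once the newline has
comment-end syntax, can be checked outside Java mode with a small sketch.
It is an illustration only and not part of the proposed change: the
function name demo-space-syntax-vs-newline is hypothetical, and the
scratch syntax table merely imitates Java mode by giving \n comment-end
syntax.

```elisp
;; Sketch only.  `demo-space-syntax-vs-newline' is a hypothetical name,
;; and the scratch syntax table below is an assumption standing in for
;; Java mode's table: it gives newline comment-end syntax.
(defun demo-space-syntax-vs-newline ()
  "Return (SYNTAX-RESULT CLASS-RESULT) for matching across a newline.
Each element is t if the corresponding regexp matched, else nil."
  (let ((table (make-syntax-table)))
    (modify-syntax-entry ?\n ">" table)   ; newline now ends a comment
    (with-temp-buffer
      (set-syntax-table table)
      (insert "public\nvoid")
      (list
       ;; "\\s-+" matches only characters with whitespace syntax, so
       ;; it cannot cross the newline here.
       (progn (goto-char (point-min))
              (and (re-search-forward "public\\s-+void" nil t) t))
       ;; An explicit character class does not consult the syntax table.
       (progn (goto-char (point-min))
              (and (re-search-forward "public[ \t\n\r\f]+void" nil t) t))))))

(demo-space-syntax-vs-newline)   ; => (nil t)
```

The second element shows why the regexp above spells out "[ \t\n\r\f]"
in space* and space+ rather than relying on the syntax class.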