From: Mattias Engdegård
Newsgroups: gmane.emacs.bugs
Subject: bug#53680: Endless loop in peculiar case of string-match and string-match-p 27.02 and 28.0.50
Date: Tue, 1 Feb 2022 13:56:10 +0100
Message-ID: <965416B7-4ACC-4571-B2C2-8607A741212F@acm.org>
References: <183c66f4-463a-b372-feee-5af9f6f45719@cvj.se>
To: Christian Johansson
Cc: Lars Ingebrigtsen, Andreas Schwab, 53680@debbugs.gnu.org

> (string-match·"[\r\t·]*implements[\r\t·]+\\([\r\t·]*[\\a-zA-Z_0-9_]+,?\\)+[\r\t·]*{$"·"ariable·implements·\\Magento\\Framework\\Event\\OberserverInterface\r{\r····public·function·__construct()\r·")

The diagnosis by Lars and Andreas is correct. Let's look at it more closely, first translating the regexp to rx for ease of reasoning, and see if we can make it work:

  (rx (* (in "\t\r "))
      "implements"
      (+ (in "\t\r "))
      (+ (group (* (in "\t\r "))
                (+ (in "0-9A-Za-z" "\\_"))
                (? ",")))
      (* (in "\t\r "))
      "{" eol)

The first line is meaningless since it can match the empty string, but you probably want to anchor the start of "implements" so that it doesn't match "house_implements". Let's also drop the capture group, and we get:

  (rx symbol-start "implements"
      (+ (in "\t\r "))
      (+ (* (in "\t\r "))
         (+ (in "0-9A-Za-z" "\\_"))
         (? ","))
      (* (in "\t\r "))
      "{" eol)

You clearly want to match a non-empty sequence of 'words' separated by whitespace and/or commas, but the pattern is ambiguous -- all inter-word separators are optional. A single word can therefore be split across iterations of the outer + in exponentially many ways, and the matcher tries them all when the overall match fails.
Let's make them mandatory:

  (rx symbol-start "implements"
      ;; mandatory whitespace
      (+ (in "\t\r "))
      ;; then a word
      (+ (in "0-9A-Za-z" "\\_"))
      ;; then maybe more words, each prefixed by spaces or comma
      (* (+ (in "\t\r ,"))     ; fast and loose
         (+ (in "0-9A-Za-z" "\\_")))
      ;; finally whitespace before the curly bracket
      (* (in "\t\r "))
      "{" eol)

which is reasonably efficient, since all ambiguity is now gone: the regexp can (almost) only match in one way.

Note the "fast and loose" pattern where we accept any number of spaces or commas. Here it depends on your grammar but if you want exactly one comma separating each word, that subexpression would be something like

  (* (in "\t\r ")) "," (* (in "\t\r "))

instead.
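A rough sketch of how the two variants above could be tried out. The names my-implements-rx, my-implements-strict-rx and my-implements-match-p are made up for illustration and are not from the original report. The test strings are likewise assumptions: they use plain spaces where the quoted report shows "·". The helper binds the standard syntax table because symbol-start depends on the syntax table in effect.

```elisp
;; Hypothetical names throughout; a sketch, not the reporter's actual code.

(defconst my-implements-rx
  (rx symbol-start "implements"
      (+ (in "\t\r "))                  ; mandatory whitespace
      (+ (in "0-9A-Za-z" "\\_"))        ; first word
      (* (+ (in "\t\r ,"))              ; separators: spaces and/or commas
         (+ (in "0-9A-Za-z" "\\_")))    ; following word
      (* (in "\t\r "))
      "{" eol)
  "Loose variant: any non-empty run of spaces or commas separates words.")

(defconst my-implements-strict-rx
  (rx symbol-start "implements"
      (+ (in "\t\r "))
      (+ (in "0-9A-Za-z" "\\_"))
      (* (* (in "\t\r ")) "," (* (in "\t\r "))  ; exactly one comma
         (+ (in "0-9A-Za-z" "\\_")))
      (* (in "\t\r "))
      "{" eol)
  "Strict variant: exactly one comma between words.")

(defun my-implements-match-p (regexp string)
  "Return the position where REGEXP matches STRING, or nil.
The standard syntax table is used so that `symbol-start' is predictable."
  (with-syntax-table (standard-syntax-table)
    (string-match-p regexp string)))

;; Input modelled on the report: "{" is followed by \r, which is not an
;; end of line for `eol', so the search fails, and does so promptly.
(my-implements-match-p
 my-implements-rx
 "ariable implements \\Magento\\Framework\\Event\\OberserverInterface\r{\r    public function __construct()\r ")
;; => nil

;; "{" directly before a newline: both variants match at "implements".
(my-implements-match-p my-implements-rx "class Foo implements \\A\\B, C {\n")
;; => 10
(my-implements-match-p my-implements-strict-rx "class Foo implements \\A\\B, C {\n")
;; => 10

;; Words separated by a space only: accepted by the loose variant,
;; rejected by the strict one.
(my-implements-match-p my-implements-rx "class Foo implements A B {")
;; => 10
(my-implements-match-p my-implements-strict-rx "class Foo implements A B {")
;; => nil
```

Note that neither character set contains "\n", so a brace on its own line after the last word would not match with either variant; whether that matters depends on the grammar being parsed.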