From: Mattias Engdegård
Newsgroups: gmane.emacs.devel
Subject: Re: Scan of regexp mistakes
Date: Tue, 5 Mar 2019 16:06:16 +0100
To: Paul Eggert
Cc: emacs-devel
References: <3ef768c2-98d9-a42d-067a-4a5ffc945cf4@cs.ucla.edu>

On 5 March 2019 at 03:04, Paul Eggert wrote:
>
> Thanks for reporting that. I fixed the glitches as best I could by
> applying the attached patch. I didn't see any false alarms, which is good.

Good work!

> It'd be nice if we could catch such typos on a regular basis. Is there
> some easy way to do that?
> A simple way might be for you to run your
> trawler once a month (say) and report back here. A nicer way would be
> for "make check" to run the trawler.

I can run it periodically but would surely forget. Should I put the
trawler in the Emacs source tree (if so, where?), in ELPA, or elsewhere?

As a temporary measure, it now resides at
https://github.com/mattiase/trawl, and it has some improvements which
uncovered a few more nits. The error locations are now more precise, too.

About that massive change of yours: most of it was obvious, of course,
but perhaps you could satisfy my curiosity:

diff --git a/lisp/arc-mode.el b/lisp/arc-mode.el
index 8de0103019..2afde7ee75 100644
--- a/lisp/arc-mode.el
+++ b/lisp/arc-mode.el
@@ -2016,7 +2016,7 @@ This doesn't recover lost files, it just undoes changes in the buffer itself."
       (call-process "lsar" nil t nil "-l" (or file copy))
       (if copy (delete-file copy))
       (goto-char (point-min))
-      (re-search-forward "^\\(\s+=+\s?+\\)+\n")
+      (re-search-forward "^\\(\s+=+\s+\\)+\n")
                                    ^^^

Are you sure this shouldn't be `\s*', which was the previous semantics,
or `\s+?', in case it was a transposition mistake?
I suppose the \n at the end makes a non-greedy repetition unlikely.

diff --git a/lisp/gnus/gnus-art.el b/lisp/gnus/gnus-art.el
index 06f7be3da7..fa3abfac58 100644
--- a/lisp/gnus/gnus-art.el
+++ b/lisp/gnus/gnus-art.el
@@ -7378,7 +7378,7 @@ groups."
 
 ;; Regexp suggested by Felix Wiemann in <87oeuomcz9.fsf@news2.ososo.de>
 (defcustom gnus-button-valid-localpart-regexp
-  "[a-z0-9$%(*-=?[_][^<>\")!;:,{}\n\t @]*"
+  "[a-z$%(*-=?[_][^<>\")!;:,{}\n\t @]*"

You kept the rather odd range `*-=' which comprises
`*+,-./0123456789:;<='. Is it supposed to be that way?

diff --git a/lisp/language/ethio-util.el b/lisp/language/ethio-util.el
index afc2239fbf..512d49b9c5 100644
--- a/lisp/language/ethio-util.el
+++ b/lisp/language/ethio-util.el
@@ -804,7 +804,7 @@ The 2nd and 3rd arguments BEGIN and END specify the region."
 
     ;; Special Ethiopic punctuation.
     (goto-char (point-min))
-    (while (re-search-forward "\\ce[»\\.\\?]\\|«\\ce" nil t)
+    (while (re-search-forward "\\ce[»\\.?]\\|«\\ce" nil t)

Should `\' really be kept in the set of characters? It looks like it was
only included as an attempt to escape `.' and `?'.

diff --git a/lisp/net/goto-addr.el b/lisp/net/goto-addr.el
index 43659d2820..c25d787391 100644
--- a/lisp/net/goto-addr.el
+++ b/lisp/net/goto-addr.el
@@ -246,7 +246,7 @@ there, then load the URL at or before point."
   "Find e-mail address around or before point.
 Then search backwards to beginning of line for the start of an e-mail
 address. If no e-mail address found, return nil."
-  (re-search-backward "[^-_A-z0-9.@]" (line-beginning-position) 'lim)
+  (re-search-backward "[^-_A-Za-z0-9.@]" (line-beginning-position) 'lim)

This is good, but I should just point out that searching for A-z
uncovers more suspect regexps, some of which aren't found by the trawler.

diff --git a/lisp/nxml/rng-uri.el b/lisp/nxml/rng-uri.el
index 0e458cfd2f..d8f2884f5e 100644
--- a/lisp/nxml/rng-uri.el
+++ b/lisp/nxml/rng-uri.el
@@ -42,7 +42,7 @@ escape them using %HH."
 
 (defun rng-uri-escape-multibyte (uri)
   "Escape multibyte characters in URI."
-  (replace-regexp-in-string "[:nonascii:]"
+  (replace-regexp-in-string "[[:nonascii:]]"

Lovely one! Here is another one in the same file (line 33), but that
wasn't found by the trawler:

  (replace-regexp-in-string "[\000-\032\177<>#%\"{}|\\^[]`%?;]"

That \032 doesn't look right (number base confusion?), and it looks like
it's meant as a single character alternative but it isn't, given the
misplaced `]'.

diff --git a/lisp/org/org-mobile.el b/lisp/org/org-mobile.el
index 1ff6358403..83dcc7b0d1 100644
--- a/lisp/org/org-mobile.el
+++ b/lisp/org/org-mobile.el
@@ -845,11 +845,11 @@ If BEG and END are given, only do this in that region."
                 (cl-incf cnt-error)
                 (throw 'next t))
               (move-marker bos-marker (point))
-              (if (re-search-forward "^** Old value[ \t]*$" eos t)
+              (if (re-search-forward "^\\*\\* Old value[ \t]*$" eos t)

Shouldn't this start with "^\\**", or does it have to be exactly two
asterisks?

                   (setq old (buffer-substring
                              (1+ (match-end 0))
                              (progn (outline-next-heading) (point)))))
-              (if (re-search-forward "^** New value[ \t]*$" eos t)
+              (if (re-search-forward "^\\*\\* New value[ \t]*$" eos t)

Idem.

--- a/lisp/org/org.el
+++ b/lisp/org/org.el
@@ -10467,7 +10467,7 @@ This is still an experimental function, your mileage may vary."
      ((and (equal type "lisp") (string-match "^/" path))
       ;; Planner has a slash, we do not.
       (setq type "elisp" path (substring path 1)))
-     ((string-match "^//\\(.?*\\)/\\(<.*>\\)$" path)
+     ((string-match "^//\\(.*\\)/\\(<.*>\\)$" path)

Another repetition-of-repetition. Sure it shouldn't be `*?' instead? It
looks likely, since there is a `/' following that would be eaten by the
`.*' given half a chance.

diff --git a/lisp/progmodes/fortran.el b/lisp/progmodes/fortran.el
index be272c0922..c1a267f4c5 100644
--- a/lisp/progmodes/fortran.el
+++ b/lisp/progmodes/fortran.el
@@ -2052,7 +2052,7 @@ If ALL is nil, only match comments that start in column > 0."
             (when (<= (point) bos)
               (move-to-column (1+ fill-column))
               ;; What is this doing???
-              (or (re-search-forward "[\t\n,'+-/*)=]" eol t)
+              (or (re-search-forward "[-\t\n,'+./*)=]" eol t)

Where did the . come from? Don't you think that `+-/*' were meant to
include those four symbols only?

diff --git a/lisp/progmodes/mixal-mode.el b/lisp/progmodes/mixal-mode.el
index 1ea4b33093..a759709b5c 100644
--- a/lisp/progmodes/mixal-mode.el
+++ b/lisp/progmodes/mixal-mode.el
@@ -1044,7 +1044,7 @@ EXECUTION-TIME holds info about the time it takes, number or string.")
      . mixal-font-lock-operation-code-face)
     (,(regexp-opt mixal-assembly-pseudoinstructions 'words)
      . mixal-font-lock-assembly-pseudoinstruction-face)
-    ("^[A-Z0-9a-z]*[ \t]+[A-ZO-9a-z]+[ \t]+\\(=.*=\\)"
+    ("^[A-Z0-9a-z]*[ \t]+[A-Z0-9a-z]+[ \t]+\\(=.*=\\)"

Another glorious regexp!

diff --git a/lisp/progmodes/verilog-mode.el b/lisp/progmodes/verilog-mode.el
index a949a461c1..e1003378b2 100644
--- a/lisp/progmodes/verilog-mode.el
+++ b/lisp/progmodes/verilog-mode.el
@@ -9357,7 +9357,7 @@ Returns REGEXP and list of ( (signal_name connection_name)... )."
            ;; Regexp form??
            ((looking-at
              ;; Regexp bug in XEmacs disallows ][ inside [], and wants + last
-             "\\s-*\\.\\(\\([a-zA-Z0-9`_$+@^.*?|---]\\|[][]\\|\\\\[()|]\\)+\\)\\s-*(\\(.*\\))\\s-*\\(,\\|)\\s-*;\\)")
+             "\\s-*\\.\\(\\([-a-zA-Z0-9`_$+@^.*?]\\|[][]\\|\\\\[()|]\\)+\\)\\s-*(\\(.*\\))\\s-*\\(,\\|)\\s-*;\\)")
             (setq rep (match-string-no-properties 3))
             (goto-char (match-end 0))
             (setq tpl-wild-list

Are you sure that | shouldn't be there too? Or is this some kind of
XEmacs idiom?
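
For anyone who wants to double-check the character ranges questioned
above (`*-=' in gnus-art.el, `+-/' in fortran.el, `A-z' in goto-addr.el),
a range can be expanded by hand in *scratch*. The following is only an
illustrative sketch: `my-expand-char-range' is a hypothetical helper
name, not part of Emacs or of the trawler.

```elisp
;; Illustrative sketch; `my-expand-char-range' is a hypothetical helper,
;; not part of Emacs or of the trawler.
(defun my-expand-char-range (from to)
  "Return a string of every character from FROM to TO, inclusive."
  (apply #'string (number-sequence from to)))

(my-expand-char-range ?* ?=)  ; => "*+,-./0123456789:;<="
(my-expand-char-range ?+ ?/)  ; => "+,-./"
(my-expand-char-range ?Z ?a)  ; => "Z[\\]^_`a"
```

The second result suggests that `+-/' in the old fortran.el set already
covered `,', `-' and `.', and the third shows the six non-letters that
sit between `Z' and `a' and are therefore let through by `A-z'.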