From: Paul Eggert
Newsgroups: gmane.emacs.devel
Subject: Re: Regexp error scan (March 26)
Date: Tue, 26 Mar 2019 19:10:05 -0700
Organization: UCLA Computer Science Department
Message-ID: <77419a89-ce9f-919b-c221-c7a3b938587a@cs.ucla.edu>
References: <0E648A80-8673-44DB-B481-981474AC3D7C@acm.org>
In-Reply-To: <0E648A80-8673-44DB-B481-981474AC3D7C@acm.org>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------1FA3F32FBD69FD3558DEEE97"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.6.0
To: Mattias Engdegård
Cc: emacs-devel
Content-Language: en-US

This is a multi-part message in MIME format.
--------------1FA3F32FBD69FD3558DEEE97
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

On 3/26/19 10:38 AM, Mattias Engdegård wrote:
> This is the latest regexp error scan of Emacs source files, using relint 1.5 (in ELPA now, or when it updates), xr 1.9.
> New checks has permitted it to discover more irregularities.
>
> The log contains, at the end, checks found with an experimental version of xr that yielded too many false positives to be useful in general, but these errors seem to be genuine.

Thanks, I installed the attached patch to try to fix those issues.

--------------1FA3F32FBD69FD3558DEEE97
Content-Type: text/x-patch;
 name="0001-2019-03-26-regex-cleanup.patch"
Content-Disposition: attachment;
 filename="0001-2019-03-26-regex-cleanup.patch"
Content-Transfer-Encoding: 8bit

From d6b45e7ec0d4ced419f413ff20ce854964ae3cce Mon Sep 17 00:00:00 2001
From: Paul Eggert
Date: Tue, 26 Mar 2019 19:06:36 -0700
Subject: [PATCH] 2019-03-26 regex cleanup
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Problems reported by Mattias Engdegård in:
https://lists.gnu.org/r/emacs-devel/2019-03/msg01028.html
* lisp/align.el (align-rules-list):
* lisp/speedbar.el (speedbar-check-read-only, speedbar-check-vc):
* lisp/vc/diff-mode.el (diff-add-change-log-entries-other-window):
* lisp/woman.el (woman-parse-numeric-arg):
Put "-" at end of character alternatives, since a range was not
intended.
* lisp/erc/erc.el (font-lock):
* lisp/mail/footnote.el (cl-seq):
Avoid duplicate character alternatives by using cl-seq API.
* lisp/mail/footnote.el (footnote--current-regexp):
* lisp/textmodes/css-mode.el (css--font-lock-keywords):
Avoid repetition of repetition.
* lisp/net/webjump.el (webjump-url-encode): Add ~ to character
alternatives, and rewrite confusing range.
* lisp/progmodes/verilog-mode.el (verilog-compiler-directives)
(verilog-assignment-operator-re): Remove duplicate.
* lisp/progmodes/verilog-mode.el (verilog-preprocessor-re):
* lisp/textmodes/css-mode.el (css--font-lock-keywords):
Don’t escape a char that doesn’t need it.
* lisp/textmodes/picture.el (picture-tab-chars): In docstring,
do not say regexp characters will be quoted; merely say in another
way that the syntax is that of character alternatives.
(picture-set-tab-stops, picture-tab-search): Don’t attempt
to regexp-quote picture-tab-chars.
(picture-tab-search): Quote \ in picture-tab-chars for
skip-chars-backwards, which treats \ differently than regexp
character alternatives do.
---
 lisp/align.el                  |  2 +-
 lisp/erc/erc.el                |  7 +++----
 lisp/mail/footnote.el          | 16 ++++++++++++----
 lisp/net/webjump.el            |  2 +-
 lisp/progmodes/verilog-mode.el |  8 +++-----
 lisp/speedbar.el               |  4 ++--
 lisp/textmodes/css-mode.el     |  4 ++--
 lisp/textmodes/picture.el      | 14 ++++++++------
 lisp/vc/diff-mode.el           |  2 +-
 lisp/woman.el                  |  2 +-
 10 files changed, 34 insertions(+), 27 deletions(-)

diff --git a/lisp/align.el b/lisp/align.el
index a81498be5d..fd88d0eda4 100644
--- a/lisp/align.el
+++ b/lisp/align.el
@@ -438,7 +438,7 @@ align-rules-list
      (tab-stop . nil))
 
     (perl-assignment
-     (regexp . ,(concat "[^=!^&*-+<>/| \t\n]\\(\\s-*\\)=[~>]?"
+     (regexp . ,(concat "[^=!^&*+<>/| \t\n-]\\(\\s-*\\)=[~>]?"
                         "\\(\\s-*\\)\\([^>= \t\n]\\|$\\)"))
      (group . (1 2))
      (modes . align-perl-modes)
diff --git a/lisp/erc/erc.el b/lisp/erc/erc.el
index bcaa3e4525..e34487de27 100644
--- a/lisp/erc/erc.el
+++ b/lisp/erc/erc.el
@@ -67,6 +67,7 @@
 (load "erc-loaddefs" nil t)
 
 (eval-when-compile (require 'cl-lib))
+(require 'cl-seq)
 (require 'font-lock)
 (require 'pp)
 (require 'thingatpt)
@@ -2522,10 +2523,8 @@ erc-lurker-maybe-trim
 non-nil."
   (if erc-lurker-trim-nicks
       (replace-regexp-in-string
-       (format "[%s]"
-               (mapconcat (lambda (char)
-                            (regexp-quote (char-to-string char)))
-                          erc-lurker-ignore-chars ""))
+       (regexp-opt (cl-delete-duplicates
+                    (mapcar #'char-to-string erc-lurker-ignore-chars)))
        "" nick)
     nick))
 
diff --git a/lisp/mail/footnote.el b/lisp/mail/footnote.el
index a7802929dc..7f88e30120 100644
--- a/lisp/mail/footnote.el
+++ b/lisp/mail/footnote.el
@@ -64,6 +64,7 @@
 ;;; Code:
 
 (eval-when-compile (require 'cl-lib))
+(require 'cl-seq)
 (defvar filladapt-token-table)
 
 (defgroup footnote nil
@@ -363,7 +364,9 @@ footnote-hebrew-numeric
     ("ק" "ר" "ש" "ת" "תק" "תר" "תש" "תת" "תתק")))
 
 (defconst footnote-hebrew-numeric-regex
-  (concat "[" (apply #'concat (apply #'append footnote-hebrew-numeric)) "']+"))
+  (concat "[" (cl-delete-duplicates
+               (apply #'concat (apply #'append footnote-hebrew-numeric)))
+          "']+"))
 ;; (defconst footnote-hebrew-numeric-regex "\\([אבגדהוזחט]'\\)?\\(ת\\)?\\(ת\\)?\\([קרשת]\\)?\\([טיכלמנסעפצ]\\)?\\([אבגדהוזחט]\\)?")
 
 (defun footnote--hebrew-numeric (n)
@@ -457,9 +460,14 @@ footnote--index-to-string
 
 (defun footnote--current-regexp ()
   "Return the regexp of the index of the current style."
-  (concat (nth 2 (or (assq footnote-style footnote-style-alist)
-                     (nth 0 footnote-style-alist)))
-          "*"))
+  (let ((regexp (nth 2 (or (assq footnote-style footnote-style-alist)
+                           (nth 0 footnote-style-alist)))))
+    (concat
+     ;; Hack to avoid repetition of repetition.
+     (if (string-match "[^\\]\\\\\\{2\\}*[*+?]\\'" regexp)
+         (substring regexp 0 -1)
+       regexp)
+     "*")))
 
 (defun footnote--refresh-footnotes (&optional index-regexp)
   "Redraw all footnotes.
diff --git a/lisp/net/webjump.el b/lisp/net/webjump.el
index 40df23e174..e297b9d610 100644
--- a/lisp/net/webjump.el
+++ b/lisp/net/webjump.el
@@ -342,7 +342,7 @@ webjump-url-encode
   (mapconcat (lambda (c)
                (let ((s (char-to-string c)))
                  (cond ((string= s " ") "+")
-                       ((string-match "[a-zA-Z_.-/]" s) s)
+                       ((string-match "[a-zA-Z_./~-]" s) s)
                        (t (upcase (format "%%%02x" c))))))
              (encode-coding-string str 'utf-8)
              ""))
diff --git a/lisp/progmodes/verilog-mode.el b/lisp/progmodes/verilog-mode.el
index 9e241c70e7..f55cf0002d 100644
--- a/lisp/progmodes/verilog-mode.el
+++ b/lisp/progmodes/verilog-mode.el
@@ -2053,7 +2053,7 @@ verilog-compiler-directives
      "`resetall" "`timescale" "`unconnected_drive" "`undef" "`undefineall"
      ;; compiler directives not covered by IEEE 1800
      "`case" "`default" "`endfor" "`endprotect" "`endswitch" "`endwhile" "`for"
-     "`format" "`if" "`let" "`protect" "`switch" "`timescale" "`time_scale"
+     "`format" "`if" "`let" "`protect" "`switch" "`time_scale"
      "`while"
      ))
   "List of Verilog compiler directives.")
@@ -2414,9 +2414,7 @@ verilog-assignment-operator-re
    '(
      ;; blocking assignment_operator
      "=" "+=" "-=" "*=" "/=" "%=" "&=" "|=" "^=" "<<=" ">>=" "<<<=" ">>>="
-     ;; non blocking assignment operator
-     "<="
-     ;; comparison
+     ;; comparison (also nonblocking assignment "<=")
      "==" "!=" "===" "!==" "<=" ">=" "==?" "!=?" "<->"
      ;; event_trigger
      "->" "->>"
@@ -2973,7 +2971,7 @@ verilog-preprocessor-re
    "\\<\\(`pragma\\)\\>\\s-+.+$"
    "\\)\\|\\(?:"
    ;; `timescale time_unit / time_precision
-   "\\<\\(`timescale\\)\\>\\s-+10\\{0,2\\}\\s-*[munpf]?s\\s-*\\/\\s-*10\\{0,2\\}\\s-*[munpf]?s"
+   "\\<\\(`timescale\\)\\>\\s-+10\\{0,2\\}\\s-*[munpf]?s\\s-*/\\s-*10\\{0,2\\}\\s-*[munpf]?s"
    "\\)\\|\\(?:"
    ;; `define and `if can span multiple lines if line ends in '\'. NOTE: `if is not IEEE 1800-2012
    ;; from http://www.emacswiki.org/emacs/MultilineRegexp
diff --git a/lisp/speedbar.el b/lisp/speedbar.el
index 399ef4557b..4823e4ba56 100644
--- a/lisp/speedbar.el
+++ b/lisp/speedbar.el
@@ -2849,7 +2849,7 @@ speedbar-check-read-only
           (progn
             (goto-char speedbar-ro-to-do-point)
             (while (and (not (input-pending-p))
-                        (re-search-forward "^\\([0-9]+\\):\\s-*[[<][+-?][]>] "
+                        (re-search-forward "^\\([0-9]+\\):\\s-*[[<][+?-][]>] "
                                            nil t))
               (setq speedbar-ro-to-do-point (point))
               (let ((f (speedbar-line-file)))
@@ -2900,7 +2900,7 @@ speedbar-check-vc
           (progn
             (goto-char speedbar-vc-to-do-point)
             (while (and (not (input-pending-p))
-                        (re-search-forward "^\\([0-9]+\\):\\s-*\\[[+-?]\\] "
+                        (re-search-forward "^\\([0-9]+\\):\\s-*\\[[+?-]\\] "
                                            nil t))
               (setq speedbar-vc-to-do-point (point))
               (if (speedbar-check-vc-this-line (match-string 1))
diff --git a/lisp/textmodes/css-mode.el b/lisp/textmodes/css-mode.el
index cddcdc0947..57ecc9788e 100644
--- a/lisp/textmodes/css-mode.el
+++ b/lisp/textmodes/css-mode.el
@@ -892,7 +892,7 @@ css--font-lock-keywords
     (,(concat "@" css-ident-re) (0 font-lock-builtin-face))
     ;; Selectors.
     ;; Allow plain ":root" as a selector.
-    ("^[ \t]*\\(:root\\)\\(?:[\n \t]*\\)*{" (1 'css-selector keep))
+    ("^[ \t]*\\(:root\\)\\(?:[\n \t]*\\){" (1 'css-selector keep))
     ;; FIXME: attribute selectors don't work well because they may contain
     ;; strings which have already been highlighted as f-l-string-face and
     ;; thus prevent this highlighting from being applied (actually now that
@@ -915,7 +915,7 @@ css--font-lock-keywords
        "\\(?:\\(:" (regexp-opt (append css-pseudo-class-ids
                                        css-pseudo-element-ids)
                                t)
-       "\\|\\::" (regexp-opt css-pseudo-element-ids t) "\\)"
+       "\\|::" (regexp-opt css-pseudo-element-ids t) "\\)"
        "\\(?:([^)]+)\\)?"
        (if (not sassy)
            "[^:{}()\n]*"
diff --git a/lisp/textmodes/picture.el b/lisp/textmodes/picture.el
index f0e30135f1..b520849467 100644
--- a/lisp/textmodes/picture.el
+++ b/lisp/textmodes/picture.el
@@ -387,7 +387,8 @@ picture-tab-chars
 \\[picture-set-tab-stops] and \\[picture-tab-search].
 The syntax for this variable is like the syntax used inside of `[...]'
 in a regular expression--but without the `[' and the `]'.
-It is NOT a regular expression, any regexp special characters will be quoted.
+It is NOT a regular expression, and should follow the usual
+rules for the contents of a character alternative.
 It defines a set of \"interesting characters\" to look for when setting
 \(or searching for) tab stops, initially \"!-~\" (all printing characters).
 For example, suppose that you are editing a table which is formatted thus:
@@ -425,7 +426,7 @@ picture-set-tab-stops
     (if arg
         (setq tabs (or (default-value 'tab-stop-list)
                        (indent-accumulate-tab-stops (window-width))))
-      (let ((regexp (concat "[ \t]+[" (regexp-quote picture-tab-chars) "]")))
+      (let ((regexp (concat "[ \t]+[" picture-tab-chars "]")))
         (beginning-of-line)
         (let ((bol (point)))
           (end-of-line)
@@ -433,8 +434,8 @@ picture-set-tab-stops
             (skip-chars-forward " \t")
             (setq tabs (cons (current-column) tabs)))
           (if (null tabs)
-              (error "No characters in set %s on this line"
-                     (regexp-quote picture-tab-chars))))))
+              (error "No characters in set [%s] on this line"
+                     picture-tab-chars)))))
       (setq tab-stop-list tabs)
       (let ((blurb (make-string (1+ (nth (1- (length tabs)) tabs)) ?\ )))
         (while tabs
@@ -455,12 +456,13 @@ picture-tab-search
                    (progn
                      (beginning-of-line)
                      (skip-chars-backward
-                      (concat "^" (regexp-quote picture-tab-chars))
+                      (concat "^" (replace-regexp-in-string
+                                   "\\\\" "\\\\" picture-tab-chars nil t))
                       (point-min))
                      (not (bobp))))
             (move-to-column target))
         (if (re-search-forward
-             (concat "[ \t]+[" (regexp-quote picture-tab-chars) "]")
+             (concat "[ \t]+[" picture-tab-chars "]")
              (line-end-position)
              'move)
             (setq target (1- (current-column)))
diff --git a/lisp/vc/diff-mode.el b/lisp/vc/diff-mode.el
index b67caab7f5..dbde284da8 100644
--- a/lisp/vc/diff-mode.el
+++ b/lisp/vc/diff-mode.el
@@ -2213,7 +2213,7 @@ diff-add-change-log-entries-other-window
                     ;; `add-change-log-entry-other-window' works better in
                     ;; that case.
                     (re-search-forward
-                     (concat "\n[!+-<>]"
+                     (concat "\n[!+<>-]"
                              ;; If the hunk is a context hunk with an empty first
                              ;; half, recognize the "--- NNN,MMM ----" line
                              "\\(-- [0-9]+\\(,[0-9]+\\)? ----\n"
diff --git a/lisp/woman.el b/lisp/woman.el
index a351f788ec..39d9b806d2 100644
--- a/lisp/woman.el
+++ b/lisp/woman.el
@@ -3511,7 +3511,7 @@ woman-parse-numeric-arg
   (let ((value (if (looking-at "[+-]") 0 (woman-parse-numeric-value)))
         op)
     (while (cond
-            ((looking-at "[+-/*%]") ; arithmetic operators
+            ((looking-at "[+/*%-]") ; arithmetic operators
              (forward-char)
              (setq op (intern-soft (match-string 0)))
              (setq value (funcall op value (woman-parse-numeric-value))))
-- 
2.20.1

--------------1FA3F32FBD69FD3558DEEE97--
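
For readers wondering why the patch moves "-" to the end of several character alternatives, here is a minimal sketch that is not part of the patch. It assumes Python's `re` module as a stand-in, since it reads "x-y" inside brackets as a range in the same way Emacs regexps do. The names `CASES` and `matched_chars` are hypothetical helpers invented for this illustration.

```python
import re

# Character alternatives copied from the patch, before and after the fix.
# Assumption: Python's `re` stands in for the Emacs regexp engine here; both
# treat "x-y" inside brackets as a range and a trailing "-" as a literal.
CASES = {
    "woman.el": ("[+-/*%]", "[+/*%-]"),
    "diff-mode.el": ("[!+-<>]", "[!+<>-]"),
    "speedbar.el": ("[+-?]", "[+?-]"),
}


def matched_chars(char_class):
    """Return the set of printable ASCII characters that char_class matches."""
    pattern = re.compile(char_class)
    return {chr(c) for c in range(0x20, 0x7F) if pattern.fullmatch(chr(c))}


if __name__ == "__main__":
    for name, (before, after) in CASES.items():
        # Characters the old form accepted only because "+-X" was a range.
        extra = sorted(matched_chars(before) - matched_chars(after))
        print(f"{name}: {before} also matched {''.join(extra)!r}")
```

In the woman.el case the old form "[+-/*%]" covers the range from "+" to "/", so it also accepts "," and "."; with "-" last, only the five intended operators match. The webjump.el change goes the other way: in "[a-zA-Z_.-/]" the "-" was consumed by the range "." to "/" and so was never matched literally.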