all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: Paul Eggert <eggert@cs.ucla.edu>
To: "Mattias Engdegård" <mattiase@acm.org>
Cc: emacs-devel <emacs-devel@gnu.org>
Subject: Re: Regexp error scan (March 26)
Date: Tue, 26 Mar 2019 19:10:05 -0700	[thread overview]
Message-ID: <77419a89-ce9f-919b-c221-c7a3b938587a@cs.ucla.edu> (raw)
In-Reply-To: <0E648A80-8673-44DB-B481-981474AC3D7C@acm.org>

[-- Attachment #1: Type: text/plain, Size: 491 bytes --]

On 3/26/19 10:38 AM, Mattias Engdegård wrote:
> This is the latest regexp error scan of Emacs source files, using relint 1.5 (in ELPA now, or when it updates), xr 1.9.
> New checks has permitted it to discover more irregularities.
>
> The log contains, at the end, checks found with an experimental version of xr that yielded too many false positives to be useful in general, but these errors seem to be genuine.

Thanks, I installed the attached patch to try to fix those issues.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-2019-03-26-regex-cleanup.patch --]
[-- Type: text/x-patch; name="0001-2019-03-26-regex-cleanup.patch", Size: 12613 bytes --]

From d6b45e7ec0d4ced419f413ff20ce854964ae3cce Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Tue, 26 Mar 2019 19:06:36 -0700
Subject: [PATCH] 2019-03-26 regex cleanup
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Problems reported by Mattias Engdegård in:
https://lists.gnu.org/r/emacs-devel/2019-03/msg01028.html
* lisp/align.el (align-rules-list):
* lisp/speedbar.el (speedbar-check-read-only, speedbar-check-vc):
* lisp/vc/diff-mode.el (diff-add-change-log-entries-other-window):
* lisp/woman.el (woman-parse-numeric-arg):
Put "-" at end of character alternatives, since a range was not intended.
* lisp/erc/erc.el (font-lock):
* lisp/mail/footnote.el (cl-seq):
Avoid duplicate character alternatives by using cl-seq API.
* lisp/mail/footnote.el (footnote--current-regexp):
* lisp/textmodes/css-mode.el (css--font-lock-keywords):
Avoid repetition of repetition.
* lisp/net/webjump.el (webjump-url-encode):
Add ~ to character alternatives, and rewrite confusing range.
* lisp/progmodes/verilog-mode.el (verilog-compiler-directives)
(verilog-assignment-operator-re):
Remove duplicate.
* lisp/progmodes/verilog-mode.el (verilog-preprocessor-re):
* lisp/textmodes/css-mode.el (css--font-lock-keywords):
Don’t escape a char that doesn’t need it.
* lisp/textmodes/picture.el (picture-tab-chars): In docstring,
do not say regexp characters will be quoted; merely say in
another way that the syntax is that of character alternatives.
(picture-set-tab-stops, picture-tab-search): Don’t attempt
to regexp-quote picture-tab-chars.
(picture-tab-search): Quote \ in picture-tab-chars for
skip-chars-backwards, which treats \ differently than
regexp character alternatives do.
---
 lisp/align.el                  |  2 +-
 lisp/erc/erc.el                |  7 +++----
 lisp/mail/footnote.el          | 16 ++++++++++++----
 lisp/net/webjump.el            |  2 +-
 lisp/progmodes/verilog-mode.el |  8 +++-----
 lisp/speedbar.el               |  4 ++--
 lisp/textmodes/css-mode.el     |  4 ++--
 lisp/textmodes/picture.el      | 14 ++++++++------
 lisp/vc/diff-mode.el           |  2 +-
 lisp/woman.el                  |  2 +-
 10 files changed, 34 insertions(+), 27 deletions(-)

diff --git a/lisp/align.el b/lisp/align.el
index a81498be5d..fd88d0eda4 100644
--- a/lisp/align.el
+++ b/lisp/align.el
@@ -438,7 +438,7 @@ align-rules-list
      (tab-stop . nil))
 
     (perl-assignment
-     (regexp   . ,(concat "[^=!^&*-+<>/| \t\n]\\(\\s-*\\)=[~>]?"
+     (regexp   . ,(concat "[^=!^&*+<>/| \t\n-]\\(\\s-*\\)=[~>]?"
 			  "\\(\\s-*\\)\\([^>= \t\n]\\|$\\)"))
      (group    . (1 2))
      (modes    . align-perl-modes)
diff --git a/lisp/erc/erc.el b/lisp/erc/erc.el
index bcaa3e4525..e34487de27 100644
--- a/lisp/erc/erc.el
+++ b/lisp/erc/erc.el
@@ -67,6 +67,7 @@
 (load "erc-loaddefs" nil t)
 
 (eval-when-compile (require 'cl-lib))
+(require 'cl-seq)
 (require 'font-lock)
 (require 'pp)
 (require 'thingatpt)
@@ -2522,10 +2523,8 @@ erc-lurker-maybe-trim
 non-nil."
   (if erc-lurker-trim-nicks
       (replace-regexp-in-string
-       (format "[%s]"
-               (mapconcat (lambda (char)
-                            (regexp-quote (char-to-string char)))
-                          erc-lurker-ignore-chars ""))
+       (regexp-opt (cl-delete-duplicates
+		    (mapcar #'char-to-string erc-lurker-ignore-chars)))
        "" nick)
     nick))
 
diff --git a/lisp/mail/footnote.el b/lisp/mail/footnote.el
index a7802929dc..7f88e30120 100644
--- a/lisp/mail/footnote.el
+++ b/lisp/mail/footnote.el
@@ -64,6 +64,7 @@
 ;;; Code:
 
 (eval-when-compile (require 'cl-lib))
+(require 'cl-seq)
 (defvar filladapt-token-table)
 
 (defgroup footnote nil
@@ -363,7 +364,9 @@ footnote-hebrew-numeric
     ("ק" "ר" "ש" "ת" "תק" "תר" "תש" "תת" "תתק")))
 
 (defconst footnote-hebrew-numeric-regex
-  (concat "[" (apply #'concat (apply #'append footnote-hebrew-numeric)) "']+"))
+  (concat "[" (cl-delete-duplicates
+	       (apply #'concat (apply #'append footnote-hebrew-numeric)))
+	  "']+"))
 ;; (defconst footnote-hebrew-numeric-regex "\\([אבגדהוזחט]'\\)?\\(ת\\)?\\(ת\\)?\\([קרשת]\\)?\\([טיכלמנסעפצ]\\)?\\([אבגדהוזחט]\\)?")
 
 (defun footnote--hebrew-numeric (n)
@@ -457,9 +460,14 @@ footnote--index-to-string
 
 (defun footnote--current-regexp ()
   "Return the regexp of the index of the current style."
-  (concat (nth 2 (or (assq footnote-style footnote-style-alist)
-		     (nth 0 footnote-style-alist)))
-	  "*"))
+  (let ((regexp (nth 2 (or (assq footnote-style footnote-style-alist)
+			   (nth 0 footnote-style-alist)))))
+    (concat
+     ;; Hack to avoid repetition of repetition.
+     (if (string-match "[^\\]\\\\\\{2\\}*[*+?]\\'" regexp)
+	 (substring regexp 0 -1)
+       regexp)
+     "*")))
 
 (defun footnote--refresh-footnotes (&optional index-regexp)
   "Redraw all footnotes.
diff --git a/lisp/net/webjump.el b/lisp/net/webjump.el
index 40df23e174..e297b9d610 100644
--- a/lisp/net/webjump.el
+++ b/lisp/net/webjump.el
@@ -342,7 +342,7 @@ webjump-url-encode
   (mapconcat (lambda (c)
                (let ((s (char-to-string c)))
                  (cond ((string= s " ") "+")
-                       ((string-match "[a-zA-Z_.-/]" s) s)
+		       ((string-match "[a-zA-Z_./~-]" s) s)
                        (t (upcase (format "%%%02x" c))))))
              (encode-coding-string str 'utf-8)
              ""))
diff --git a/lisp/progmodes/verilog-mode.el b/lisp/progmodes/verilog-mode.el
index 9e241c70e7..f55cf0002d 100644
--- a/lisp/progmodes/verilog-mode.el
+++ b/lisp/progmodes/verilog-mode.el
@@ -2053,7 +2053,7 @@ verilog-compiler-directives
       "`resetall" "`timescale" "`unconnected_drive" "`undef" "`undefineall"
       ;; compiler directives not covered by IEEE 1800
       "`case" "`default" "`endfor" "`endprotect" "`endswitch" "`endwhile" "`for"
-      "`format" "`if" "`let" "`protect" "`switch" "`timescale" "`time_scale"
+      "`format" "`if" "`let" "`protect" "`switch" "`time_scale"
       "`while"
       ))
   "List of Verilog compiler directives.")
@@ -2414,9 +2414,7 @@ verilog-assignment-operator-re
      '(
        ;; blocking assignment_operator
        "=" "+=" "-=" "*=" "/=" "%=" "&=" "|=" "^=" "<<=" ">>=" "<<<=" ">>>="
-       ;; non blocking assignment operator
-       "<="
-       ;; comparison
+       ;; comparison (also nonblocking assignment "<=")
        "==" "!=" "===" "!==" "<=" ">=" "==?" "!=?" "<->"
        ;; event_trigger
        "->" "->>"
@@ -2973,7 +2971,7 @@ verilog-preprocessor-re
      "\\<\\(`pragma\\)\\>\\s-+.+$"
      "\\)\\|\\(?:"
      ;; `timescale time_unit / time_precision
-     "\\<\\(`timescale\\)\\>\\s-+10\\{0,2\\}\\s-*[munpf]?s\\s-*\\/\\s-*10\\{0,2\\}\\s-*[munpf]?s"
+     "\\<\\(`timescale\\)\\>\\s-+10\\{0,2\\}\\s-*[munpf]?s\\s-*/\\s-*10\\{0,2\\}\\s-*[munpf]?s"
      "\\)\\|\\(?:"
      ;; `define and `if can span multiple lines if line ends in '\'. NOTE: `if is not IEEE 1800-2012
      ;; from http://www.emacswiki.org/emacs/MultilineRegexp
diff --git a/lisp/speedbar.el b/lisp/speedbar.el
index 399ef4557b..4823e4ba56 100644
--- a/lisp/speedbar.el
+++ b/lisp/speedbar.el
@@ -2849,7 +2849,7 @@ speedbar-check-read-only
 	(progn
 	  (goto-char speedbar-ro-to-do-point)
 	  (while (and (not (input-pending-p))
-		      (re-search-forward "^\\([0-9]+\\):\\s-*[[<][+-?][]>] "
+		      (re-search-forward "^\\([0-9]+\\):\\s-*[[<][+?-][]>] "
 					 nil t))
 	    (setq speedbar-ro-to-do-point (point))
 	    (let ((f (speedbar-line-file)))
@@ -2900,7 +2900,7 @@ speedbar-check-vc
 	(progn
 	  (goto-char speedbar-vc-to-do-point)
 	  (while (and (not (input-pending-p))
-		      (re-search-forward "^\\([0-9]+\\):\\s-*\\[[+-?]\\] "
+		      (re-search-forward "^\\([0-9]+\\):\\s-*\\[[+?-]\\] "
 					 nil t))
 	    (setq speedbar-vc-to-do-point (point))
 	    (if (speedbar-check-vc-this-line (match-string 1))
diff --git a/lisp/textmodes/css-mode.el b/lisp/textmodes/css-mode.el
index cddcdc0947..57ecc9788e 100644
--- a/lisp/textmodes/css-mode.el
+++ b/lisp/textmodes/css-mode.el
@@ -892,7 +892,7 @@ css--font-lock-keywords
     (,(concat "@" css-ident-re) (0 font-lock-builtin-face))
     ;; Selectors.
     ;; Allow plain ":root" as a selector.
-    ("^[ \t]*\\(:root\\)\\(?:[\n \t]*\\)*{" (1 'css-selector keep))
+    ("^[ \t]*\\(:root\\)\\(?:[\n \t]*\\){" (1 'css-selector keep))
     ;; FIXME: attribute selectors don't work well because they may contain
     ;; strings which have already been highlighted as f-l-string-face and
     ;; thus prevent this highlighting from being applied (actually now that
@@ -915,7 +915,7 @@ css--font-lock-keywords
        "\\(?:\\(:" (regexp-opt (append css-pseudo-class-ids
                                        css-pseudo-element-ids)
                                t)
-       "\\|\\::" (regexp-opt css-pseudo-element-ids t) "\\)"
+       "\\|::" (regexp-opt css-pseudo-element-ids t) "\\)"
        "\\(?:([^)]+)\\)?"
        (if (not sassy)
            "[^:{}()\n]*"
diff --git a/lisp/textmodes/picture.el b/lisp/textmodes/picture.el
index f0e30135f1..b520849467 100644
--- a/lisp/textmodes/picture.el
+++ b/lisp/textmodes/picture.el
@@ -387,7 +387,8 @@ picture-tab-chars
 \\[picture-set-tab-stops] and \\[picture-tab-search].
 The syntax for this variable is like the syntax used inside of `[...]'
 in a regular expression--but without the `[' and the `]'.
-It is NOT a regular expression, any regexp special characters will be quoted.
+It is NOT a regular expression, and should follow the usual
+rules for the contents of a character alternative.
 It defines a set of \"interesting characters\" to look for when setting
 \(or searching for) tab stops, initially \"!-~\" (all printing characters).
 For example, suppose that you are editing a table which is formatted thus:
@@ -425,7 +426,7 @@ picture-set-tab-stops
       (if arg
 	  (setq tabs (or (default-value 'tab-stop-list)
 			 (indent-accumulate-tab-stops (window-width))))
-	(let ((regexp (concat "[ \t]+[" (regexp-quote picture-tab-chars) "]")))
+	(let ((regexp (concat "[ \t]+[" picture-tab-chars "]")))
 	  (beginning-of-line)
 	  (let ((bol (point)))
 	    (end-of-line)
@@ -433,8 +434,8 @@ picture-set-tab-stops
 	      (skip-chars-forward " \t")
 	      (setq tabs (cons (current-column) tabs)))
 	    (if (null tabs)
-		(error "No characters in set %s on this line"
-		       (regexp-quote picture-tab-chars))))))
+		(error "No characters in set [%s] on this line"
+		       picture-tab-chars)))))
       (setq tab-stop-list tabs)
       (let ((blurb (make-string (1+ (nth (1- (length tabs)) tabs)) ?\ )))
 	(while tabs
@@ -455,12 +456,13 @@ picture-tab-search
 	       (progn
 		 (beginning-of-line)
 		 (skip-chars-backward
-		  (concat "^" (regexp-quote picture-tab-chars))
+		  (concat "^" (replace-regexp-in-string
+			       "\\\\" "\\\\" picture-tab-chars nil t))
 		  (point-min))
 		 (not (bobp))))
 	  (move-to-column target))
       (if (re-search-forward
-	   (concat "[ \t]+[" (regexp-quote picture-tab-chars) "]")
+	   (concat "[ \t]+[" picture-tab-chars "]")
 	   (line-end-position)
 	   'move)
 	  (setq target (1- (current-column)))
diff --git a/lisp/vc/diff-mode.el b/lisp/vc/diff-mode.el
index b67caab7f5..dbde284da8 100644
--- a/lisp/vc/diff-mode.el
+++ b/lisp/vc/diff-mode.el
@@ -2213,7 +2213,7 @@ diff-add-change-log-entries-other-window
                  ;; `add-change-log-entry-other-window' works better in
                  ;; that case.
                  (re-search-forward
-                  (concat "\n[!+-<>]"
+		  (concat "\n[!+<>-]"
                           ;; If the hunk is a context hunk with an empty first
                           ;; half, recognize the "--- NNN,MMM ----" line
                           "\\(-- [0-9]+\\(,[0-9]+\\)? ----\n"
diff --git a/lisp/woman.el b/lisp/woman.el
index a351f788ec..39d9b806d2 100644
--- a/lisp/woman.el
+++ b/lisp/woman.el
@@ -3511,7 +3511,7 @@ woman-parse-numeric-arg
   (let ((value (if (looking-at "[+-]") 0 (woman-parse-numeric-value)))
 	op)
     (while (cond
-	    ((looking-at "[+-/*%]")	; arithmetic operators
+	    ((looking-at "[+/*%-]")	; arithmetic operators
 	     (forward-char)
 	     (setq op (intern-soft (match-string 0)))
 	     (setq value (funcall op value (woman-parse-numeric-value))))
-- 
2.20.1


  reply	other threads:[~2019-03-27  2:10 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-26 17:38 Regexp error scan (March 26) Mattias Engdegård
2019-03-27  2:10 ` Paul Eggert [this message]
2019-03-27  3:43   ` Basil L. Contovounesios
2019-03-27  9:14     ` Mattias Engdegård
2019-03-27 13:55       ` Andy Moreton
2019-03-27 14:36         ` Stefan Monnier
2019-03-27 14:41           ` Mattias Engdegård
2019-03-27 14:45             ` Stefan Monnier
2019-03-27 14:49               ` Noam Postavsky
2019-03-27 15:45                 ` Mattias Engdegård
2019-03-27 15:48                 ` Stefan Monnier
2019-03-27 18:50                   ` Paul Eggert
2019-03-27 15:28             ` Damien Collard
2019-03-27 12:09   ` Mattias Engdegård

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=77419a89-ce9f-919b-c221-c7a3b938587a@cs.ucla.edu \
    --to=eggert@cs.ucla.edu \
    --cc=emacs-devel@gnu.org \
    --cc=mattiase@acm.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.