From: "Mattias Engdegård" <mattiase@acm.org>
To: Emacs developers <emacs-devel@gnu.org>
Subject: Regexp scan of Emacs (April 19)
Date: Fri, 19 Apr 2019 11:39:04 +0200
Message-ID: <90232AC2-3228-4C8F-AD84-FFB6A30F51AF@acm.org>

[-- Attachment #1: Type: text/plain, Size: 561 bytes --]

This is the latest scan of errors and oddities in regexps in the Emacs source tree.
New this time is an experimental check for branch subsumption: whether one branch in an or-expression matches a superset of another, like "[ab]\\|a". Please tell me if you believe this might be useful, so that I know whether to include it in the next release of xr.

The algorithm uses some simple linear heuristics since a full regexp subset check would be quite expensive and probably require DFA construction and graph equivalence; maybe something for a future version.
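
To make the idea of branch subsumption concrete, here is a minimal sketch in Python. It is not the code used by xr or relint; it is a toy model that assumes every branch is either a single literal character or a plain bracket set without ranges or negation, and the names `parse_branch` and `check_alternatives` are hypothetical. Branches are split naively on the Emacs `\|` separator, so groups and escaped characters are not handled.

```python
from typing import List, Optional, Set, Tuple


def parse_branch(branch: str) -> Optional[Set[str]]:
    """Return the set of single characters a branch matches, or None.

    Only two toy forms are understood: a single literal character ("a")
    and a simple bracket set without ranges or negation ("[ab]").
    """
    if len(branch) == 1 and branch not in "[]\\.*+?^$":
        return {branch}
    if len(branch) >= 3 and branch[0] == "[" and branch[-1] == "]":
        body = branch[1:-1]
        if body[0] != "^" and not any(c in body for c in "-[]"):
            return set(body)
    return None  # anything else is outside this toy model


def check_alternatives(regexp: str) -> List[Tuple[int, str]]:
    """Warn when a branch matches a subset or superset of an earlier one.

    The regexp uses Emacs syntax, where branches are separated by \\|.
    Returns (position, message) pairs; position is the offset of the
    offending branch within the regexp string.
    """
    warnings: List[Tuple[int, str]] = []
    seen: List[Set[str]] = []  # character sets of earlier branches
    pos = 0
    for branch in regexp.split("\\|"):
        chars = parse_branch(branch)
        if chars is not None:
            for earlier in seen:
                if chars <= earlier:
                    warnings.append(
                        (pos, "Branch matches subset of a previous branch"))
                    break
                if chars > earlier:
                    warnings.append(
                        (pos, "Branch matches superset of a previous branch"))
                    break
            seen.append(chars)
        pos += len(branch) + 2  # skip the two-character \| separator
    return warnings


if __name__ == "__main__":
    print(check_alternatives("[ab]\\|a"))
    print(check_alternatives("a\\|[ab]"))
```

Each branch is compared only against the branches before it using plain set inclusion, which is cheap but sees nothing beyond single-character alternatives. A complete subset check on arbitrary regexps would need the heavier machinery mentioned above.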

[-- Attachment #2: relint.log --]
[-- Type: application/octet-stream, Size: 10023 bytes --]

;; -*- compilation -*-
Relint results for ~/emacs
lisp/eshell/em-hist.el:726:31: In call to string-match: Branch matches superset of a previous branch (pos 37)
  "^:?\\([0-9]+\\|[$^%*]\\)?\\(\\*\\|-[0-9]*\\|[$^%*]\\)?"
   ............................................^
lisp/international/ja-dic-cnv.el:127:29: In call to re-search-forward: Branch matches subset of a previous branch (pos 16)
  "^[#<>?]\\(\\(\\cH\\|ー\\)+\\) "
   ....................^
lisp/international/ja-dic-cnv.el:160:31: In call to re-search-forward: Branch matches subset of a previous branch (pos 10)
  "^\\(\\(\\cH\\|ー\\)+\\)[<>?] "
   ..............^
lisp/international/ja-dic-cnv.el:278:33: In call to re-search-forward: Branch matches subset of a previous branch (pos 10)
  "^\\(\\(\\cH\\|ー\\)+\\) \\(/\\cj.*\\)/$"
   ..............^
lisp/net/tramp-adb.el:56:3: In tramp-adb-prompt: Repetition of expression matching an empty string (pos 57)
  "^[[:digit:]]*|?\\(?:[[:alnum:]\e;[]*@?[[:alnum:]]*[^#\\$]*\\)?[#\\$][[:space:]]"
   .............................................................^
lisp/progmodes/cc-awk.el:98:29: In c-awk-esc-pair-re: Branch matches subset of a previous branch (pos 10)
  "\\\\\\(.\\|\n\\|\r\\|\\'\\)"
   ................^
lisp/progmodes/cc-awk.el:137:3: In c-awk-harmless-string*-re: Branch matches subset of a previous branch (pos 29)
  "\\([^_#/\"{}();\\\n\r]\\|\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|_\\([^\"]\\|\\'\\)\\)*"
   .........................................^
lisp/progmodes/cc-awk.el:141:3: In c-awk-harmless-string*-here-re: Branch matches subset of a previous branch (pos 31)
  "\\=\\([^_#/\"{}();\\\n\r]\\|\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|_\\([^\"]\\|\\'\\)\\)*"
   ............................................^
lisp/progmodes/cc-awk.el:148:3: In c-awk-harmless-line-string*-re: Branch matches subset of a previous branch (pos 24)
  "\\([^_#/\"\\\n\r]\\|\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|_\\([^\"]\\|\\'\\)\\)*"
   ....................................^
lisp/progmodes/cc-awk.el:152:3: In c-awk-harmless-line-re: Branch matches subset of a previous branch (pos 24)
  "\\([^_#/\"\\\n\r]\\|\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|_\\([^\"]\\|\\'\\)\\)*\\(#.*\\)?\\(\n\\|\r\\|\\'\\)"
   ....................................^
lisp/progmodes/cc-awk.el:159:3: In c-awk-harmless-lines+-here-re: Branch matches subset of a previous branch (pos 28)
  "\\=\\(\\([^_#/\"\\\n\r]\\|\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|_\\([^\"]\\|\\'\\)\\)*\\(#.*\\)?\\(\n\\|\r\\|\\'\\)\\)+"
   ..........................................^
lisp/progmodes/cc-awk.el:167:3: In c-awk-string-innards-re: Branch matches subset of a previous branch (pos 21)
  "\\([^\"\\\n\r]\\|\\\\\\(.\\|\n\\|\r\\|\\'\\)\\)*"
   .................................^
lisp/progmodes/cc-awk.el:170:3: In c-awk-string-without-end-here-re: Branch matches subset of a previous branch (pos 26)
  "\\=_?\"\\([^\"\\\n\r]\\|\\\\\\(.\\|\n\\|\r\\|\\'\\)\\)*"
   ........................................^
lisp/progmodes/cc-awk.el:174:3: In c-awk-possibly-open-string-re: Branch matches subset of a previous branch (pos 22)
  "\"\\([^\"\\\n\r]\\|\\\\\\(.\\|\n\\|\r\\|\\'\\)\\)*\\(\"\\|$\\|\\'\\)"
   ...................................^
lisp/progmodes/cc-awk.el:191:3: In c-awk-regexp-char-list-re: Branch matches subset of a previous branch (pos 45)
  "\\[\\(\\(\\\\[\n\r]\\)*\\^\\)?\\(\\\\[\n\r]\\)*]?\\(\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|\\[:[a-z]+:\\]\\|[^]\n\r]\\)*\\(]\\|$\\)"
   ....................................................................^
lisp/progmodes/cc-awk.el:197:3: In c-awk-regexp-innards-re: Branch matches subset of a previous branch (pos 12)
  "\\(\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|\\[\\(\\(\\\\[\n\r]\\)*\\^\\)?\\(\\\\[\n\r]\\)*]?\\(\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|\\[:[a-z]+:\\]\\|[^]\n\r]\\)*\\(]\\|$\\)\\|[^[/\\\n\r]\\)*"
   ...................^
lisp/progmodes/cc-awk.el:197:3: In c-awk-regexp-innards-re: Branch matches subset of a previous branch (pos 66)
  "\\(\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|\\[\\(\\(\\\\[\n\r]\\)*\\^\\)?\\(\\\\[\n\r]\\)*]?\\(\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|\\[:[a-z]+:\\]\\|[^]\n\r]\\)*\\(]\\|$\\)\\|[^[/\\\n\r]\\)*"
   .....................................................................................................^
lisp/progmodes/cc-awk.el:201:3: In c-awk-regexp-without-end-re: Branch matches subset of a previous branch (pos 13)
  "/\\(\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|\\[\\(\\(\\\\[\n\r]\\)*\\^\\)?\\(\\\\[\n\r]\\)*]?\\(\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|\\[:[a-z]+:\\]\\|[^]\n\r]\\)*\\(]\\|$\\)\\|[^[/\\\n\r]\\)*"
   ....................^
lisp/progmodes/cc-awk.el:201:3: In c-awk-regexp-without-end-re: Branch matches subset of a previous branch (pos 67)
  "/\\(\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|\\[\\(\\(\\\\[\n\r]\\)*\\^\\)?\\(\\\\[\n\r]\\)*]?\\(\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|\\[:[a-z]+:\\]\\|[^]\n\r]\\)*\\(]\\|$\\)\\|[^[/\\\n\r]\\)*"
   ......................................................................................................^
lisp/progmodes/cc-awk.el:255:3: In c-awk-non-/-syn-ws*-re: Branch matches subset of a previous branch (pos 69)
  "\\(\\(\\\\[\n\r]\\|[ \t]\\)*\\([^#/\"\\\n\r \t]\\|\\\\\\(.\\|\\'\\)\\|\"\\([^\"\\\n\r]\\|\\\\\\(.\\|\n\\|\r\\|\\'\\)\\)*\\(\"\\|$\\|\\'\\)\\)\\)*"
   .........................................................................................................^
lisp/progmodes/cc-mode.el:1247:26: In call to re-search-forward: Branch matches subset of a previous branch (pos 17)
  "[\n\r]?\\(\\\\\\(.\\|\n\\|\r\\)\\|[^\\\n\r]\\)*"
   ..........................^
lisp/progmodes/cc-mode.el:1374:27: In call to looking-at: Branch matches subset of a previous branch (pos 12)
  "\\(\\\\\\(.\\|\n\\|\r\\)\\|[^\"]\\)*"
   ...................^
lisp/progmodes/cperl-mode.el:7980:16: In call to looking-at: Branch matches superset of a previous branch (pos 71)
  "\\([a-zA-Z0-9]+[^*+{?]\\)\\|\\$\\([a-zA-Z0-9_]+\\([[{]\\)?\\|[^\n \t)|]\\)\\|[$^]\\|\\(\\\\.\\|[^][()#|*+?\n]\\)\\([*+{?]\\??\\)?\\|\\(\\[\\)\\|\\((\\(\\?\\)?\\)\\|\\(|\\)"
   ....................................................................................^
lisp/arc-mode.el:2020:26: In call to looking-at: Branch matches subset of a previous branch (pos 44)
  "^ +[0-9.]+ +D?-+ +\\([0-9-]+\\) +\\([-0-9.%]+\\|-+\\) +\\([0-9a-zA-Z]+\\) +\\([0-9-]+\\) +\\([0-9:]+\\) +\\(.*\\)\n"
   ................................................^
lisp/info.el:1534:41: In call to re-search-forward: Branch matches superset of a previous branch (pos 18)
  "^\\* \\([^:\n]+:\\(:\\|[^.\n]+\\).\\)"
   .......................^
lisp/xml.el:247:27: In xml-att-type-re: Branch matches superset of a previous branch (pos 191)
  "\\(?:CDATA\\|\\(?:ID\\|IDREF\\|IDREFS\\|ENTITY\\|ENTITIES\\|NMTOKEN\\|NMTOKENS\\)\\|\\(?:NOTATION\\s-+(\\s-*[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*\\(?:\\s-*|\\s-*[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*\\)*\\s-*)\\)\\|\\(?:\\(?:NOTATION\\s-+(\\s-*[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*\\(?:\\s-*|\\s-*[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*\\)*\\s-*)\\)\\|\\(?:(\\s-*[[:word:]:_.0-9\267̀-ͯ‿⁀-]+\\(?:\\s-*|\\s-*[[:word:]:_.0-9\267̀-ͯ‿⁀-]+\\)*\\s-+)\\)\\)\\)"
   ..........................................................................................................................................................................................................................^
lisp/xml.el:257:26: In xml-att-def-re: Branch matches superset of a previous branch (pos 239)
  "\\(?:\\s-*[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*\\s-*\\(?:CDATA\\|\\(?:ID\\|IDREF\\|IDREFS\\|ENTITY\\|ENTITIES\\|NMTOKEN\\|NMTOKENS\\)\\|\\(?:NOTATION\\s-+(\\s-*[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*\\(?:\\s-*|\\s-*[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*\\)*\\s-*)\\)\\|\\(?:\\(?:NOTATION\\s-+(\\s-*[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*\\(?:\\s-*|\\s-*[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*\\)*\\s-*)\\)\\|\\(?:(\\s-*[[:word:]:_.0-9\267̀-ͯ‿⁀-]+\\(?:\\s-*|\\s-*[[:word:]:_.0-9\267̀-ͯ‿⁀-]+\\)*\\s-+)\\)\\)\\)\\s-*\\(?:#REQUIRED\\|#IMPLIED\\|\\(?:#FIXED\\s-+\\)*\\(?:\"\\(?:[^&\"]\\|\\(?:&[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*;\\|\\(?:&#[0-9]+;\\|&#x[0-9a-fA-F]+;\\)\\)\\)*\"\\|'\\(?:[^&']\\|\\(?:&[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*;\\|\\(?:&#[0-9]+;\\|&#x[0-9a-fA-F]+;\\)\\)\\)*'\\)\\)\\)"
   ................................................................................................................................................................................................................................................................................^
lisp/xml.el:805:28: In call to looking-at: Branch matches superset of a previous branch (pos 304)
  "<!ATTLIST[ \t\n\r]*\\([[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*\\)[ \t\n\r]*\\(\\(?:\\s-*[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*\\s-*\\(?:CDATA\\|\\(?:ID\\|IDREF\\|IDREFS\\|ENTITY\\|ENTITIES\\|NMTOKEN\\|NMTOKENS\\)\\|\\(?:NOTATION\\s-+(\\s-*[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*\\(?:\\s-*|\\s-*[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*\\)*\\s-*)\\)\\|\\(?:\\(?:NOTATION\\s-+(\\s-*[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*\\(?:\\s-*|\\s-*[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*\\)*\\s-*)\\)\\|\\(?:(\\s-*[[:word:]:_.0-9\267̀-ͯ‿⁀-]+\\(?:\\s-*|\\s-*[[:word:]:_.0-9\267̀-ͯ‿⁀-]+\\)*\\s-+)\\)\\)\\)\\s-*\\(?:#REQUIRED\\|#IMPLIED\\|\\(?:#FIXED\\s-+\\)*\\(?:\"\\(?:[^&\"]\\|\\(?:&[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*;\\|\\(?:&#[0-9]+;\\|&#x[0-9a-fA-F]+;\\)\\)\\)*\"\\|'\\(?:[^&']\\|\\(?:&[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*;\\|\\(?:&#[0-9]+;\\|&#x[0-9a-fA-F]+;\\)\\)\\)*'\\)\\)\\)\\)*[ \t\n\r]*>"
   .............................................................................................................................................................................................................................................................................................................................................................^

Thread overview: 4+ messages
2019-04-19  9:39 Mattias Engdegård [this message]
2019-04-19 12:42 ` Regexp scan of Emacs (April 19) Michael Albinus
2019-04-19 16:04 ` Paul Eggert
2019-04-19 20:29   ` Mattias Engdegård
