unofficial mirror of emacs-devel@gnu.org 
* Regexp scan of Emacs (April 19)
@ 2019-04-19  9:39 Mattias Engdegård
  2019-04-19 12:42 ` Michael Albinus
  2019-04-19 16:04 ` Paul Eggert
  0 siblings, 2 replies; 4+ messages in thread
From: Mattias Engdegård @ 2019-04-19  9:39 UTC
  To: Emacs developers

[-- Attachment #1: Type: text/plain, Size: 561 bytes --]

This is the latest scan of errors and oddities in regexps in the Emacs source tree.
New this time is an experimental check for branch subsumption: whether one branch in an or-expression matches a superset of another, like "[ab]\\|a". Please tell me if you believe this might be useful, so that I know whether to include it in the next release of xr.
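
To illustrate why a subsumed branch is worth flagging, here is a small sketch in Python rather than Emacs Lisp. It assumes Python's `re` syntax, where `|` and `(...)` replace `\\|` and `\\(...\\)`, and relies on both engines trying alternatives from left to right. The helper name `branch_taken` is made up for this example.

```python
import re

# Python spelling of the Emacs regexp "\\([ab]\\)\\|\\(a\\)": each branch is
# wrapped in its own group so we can see which one produced the match.
pattern = re.compile(r"([ab])|(a)")

def branch_taken(text):
    """Return 1 or 2 for the branch that matched at the start of text, else None."""
    m = pattern.match(text)
    if m is None:
        return None
    return 1 if m.group(1) is not None else 2

for sample in ["a", "b", "c"]:
    print(sample, branch_taken(sample))
```

The second branch is never chosen, because everything it could match has already been claimed by the first. Such a branch is either redundant or a sign that the branches were meant to be written differently.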

The algorithm uses some simple linear heuristics, since a full regexp subset check would be quite expensive and would probably require DFA construction and a graph equivalence check; that may be something for a future version.
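
As a rough idea of what a cheap syntactic check can look like, the following Python sketch compares branches only when each is a single literal character or a plain bracket set such as `[ab]`. This is not the algorithm used by xr or relint. The function names, the tuple format, and the restriction to trivial branches are all assumptions made for illustration.

```python
def char_set(branch):
    """Return the set of characters a branch matches, or None if the branch
    is not a single literal character or a plain bracket set like [ab]."""
    if len(branch) == 1 and branch not in r"\.[*+?^$":
        return {branch}
    if len(branch) >= 3 and branch[0] == "[" and branch[-1] == "]":
        body = branch[1:-1]
        # Negation, ranges, classes and bracket literals are not handled here.
        if body[0] == "^" or "-" in body or "[" in body or "]" in body:
            return None
        return set(body)
    return None

def subsumption_warnings(branches):
    """Compare each branch with the earlier ones and report the first
    subset or superset relation found, as (index, relation, earlier_index)."""
    sets = [char_set(b) for b in branches]
    warnings = []
    for i in range(1, len(branches)):
        for j in range(i):
            if sets[i] is None or sets[j] is None:
                continue  # Too complex for this sketch: stay silent.
            if sets[i] <= sets[j]:
                warnings.append((i, "subset", j))
                break
            if sets[i] > sets[j]:
                warnings.append((i, "superset", j))
                break
    return warnings

print(subsumption_warnings(["[ab]", "a"]))
print(subsumption_warnings(["a", "[ab]"]))
```

A check like this stays silent whenever it cannot decide, so it misses many real cases. Covering those is where automata-based comparison would come in.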

[-- Attachment #2: relint.log --]
[-- Type: application/octet-stream, Size: 10023 bytes --]

;; -*- compilation -*-
Relint results for ~/emacs
lisp/eshell/em-hist.el:726:31: In call to string-match: Branch matches superset of a previous branch (pos 37)
  "^:?\\([0-9]+\\|[$^%*]\\)?\\(\\*\\|-[0-9]*\\|[$^%*]\\)?"
   ............................................^
lisp/international/ja-dic-cnv.el:127:29: In call to re-search-forward: Branch matches subset of a previous branch (pos 16)
  "^[#<>?]\\(\\(\\cH\\|ー\\)+\\) "
   ....................^
lisp/international/ja-dic-cnv.el:160:31: In call to re-search-forward: Branch matches subset of a previous branch (pos 10)
  "^\\(\\(\\cH\\|ー\\)+\\)[<>?] "
   ..............^
lisp/international/ja-dic-cnv.el:278:33: In call to re-search-forward: Branch matches subset of a previous branch (pos 10)
  "^\\(\\(\\cH\\|ー\\)+\\) \\(/\\cj.*\\)/$"
   ..............^
lisp/net/tramp-adb.el:56:3: In tramp-adb-prompt: Repetition of expression matching an empty string (pos 57)
  "^[[:digit:]]*|?\\(?:[[:alnum:]\e;[]*@?[[:alnum:]]*[^#\\$]*\\)?[#\\$][[:space:]]"
   .............................................................^
lisp/progmodes/cc-awk.el:98:29: In c-awk-esc-pair-re: Branch matches subset of a previous branch (pos 10)
  "\\\\\\(.\\|\n\\|\r\\|\\'\\)"
   ................^
lisp/progmodes/cc-awk.el:137:3: In c-awk-harmless-string*-re: Branch matches subset of a previous branch (pos 29)
  "\\([^_#/\"{}();\\\n\r]\\|\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|_\\([^\"]\\|\\'\\)\\)*"
   .........................................^
lisp/progmodes/cc-awk.el:141:3: In c-awk-harmless-string*-here-re: Branch matches subset of a previous branch (pos 31)
  "\\=\\([^_#/\"{}();\\\n\r]\\|\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|_\\([^\"]\\|\\'\\)\\)*"
   ............................................^
lisp/progmodes/cc-awk.el:148:3: In c-awk-harmless-line-string*-re: Branch matches subset of a previous branch (pos 24)
  "\\([^_#/\"\\\n\r]\\|\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|_\\([^\"]\\|\\'\\)\\)*"
   ....................................^
lisp/progmodes/cc-awk.el:152:3: In c-awk-harmless-line-re: Branch matches subset of a previous branch (pos 24)
  "\\([^_#/\"\\\n\r]\\|\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|_\\([^\"]\\|\\'\\)\\)*\\(#.*\\)?\\(\n\\|\r\\|\\'\\)"
   ....................................^
lisp/progmodes/cc-awk.el:159:3: In c-awk-harmless-lines+-here-re: Branch matches subset of a previous branch (pos 28)
  "\\=\\(\\([^_#/\"\\\n\r]\\|\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|_\\([^\"]\\|\\'\\)\\)*\\(#.*\\)?\\(\n\\|\r\\|\\'\\)\\)+"
   ..........................................^
lisp/progmodes/cc-awk.el:167:3: In c-awk-string-innards-re: Branch matches subset of a previous branch (pos 21)
  "\\([^\"\\\n\r]\\|\\\\\\(.\\|\n\\|\r\\|\\'\\)\\)*"
   .................................^
lisp/progmodes/cc-awk.el:170:3: In c-awk-string-without-end-here-re: Branch matches subset of a previous branch (pos 26)
  "\\=_?\"\\([^\"\\\n\r]\\|\\\\\\(.\\|\n\\|\r\\|\\'\\)\\)*"
   ........................................^
lisp/progmodes/cc-awk.el:174:3: In c-awk-possibly-open-string-re: Branch matches subset of a previous branch (pos 22)
  "\"\\([^\"\\\n\r]\\|\\\\\\(.\\|\n\\|\r\\|\\'\\)\\)*\\(\"\\|$\\|\\'\\)"
   ...................................^
lisp/progmodes/cc-awk.el:191:3: In c-awk-regexp-char-list-re: Branch matches subset of a previous branch (pos 45)
  "\\[\\(\\(\\\\[\n\r]\\)*\\^\\)?\\(\\\\[\n\r]\\)*]?\\(\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|\\[:[a-z]+:\\]\\|[^]\n\r]\\)*\\(]\\|$\\)"
   ....................................................................^
lisp/progmodes/cc-awk.el:197:3: In c-awk-regexp-innards-re: Branch matches subset of a previous branch (pos 12)
  "\\(\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|\\[\\(\\(\\\\[\n\r]\\)*\\^\\)?\\(\\\\[\n\r]\\)*]?\\(\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|\\[:[a-z]+:\\]\\|[^]\n\r]\\)*\\(]\\|$\\)\\|[^[/\\\n\r]\\)*"
   ...................^
lisp/progmodes/cc-awk.el:197:3: In c-awk-regexp-innards-re: Branch matches subset of a previous branch (pos 66)
  "\\(\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|\\[\\(\\(\\\\[\n\r]\\)*\\^\\)?\\(\\\\[\n\r]\\)*]?\\(\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|\\[:[a-z]+:\\]\\|[^]\n\r]\\)*\\(]\\|$\\)\\|[^[/\\\n\r]\\)*"
   .....................................................................................................^
lisp/progmodes/cc-awk.el:201:3: In c-awk-regexp-without-end-re: Branch matches subset of a previous branch (pos 13)
  "/\\(\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|\\[\\(\\(\\\\[\n\r]\\)*\\^\\)?\\(\\\\[\n\r]\\)*]?\\(\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|\\[:[a-z]+:\\]\\|[^]\n\r]\\)*\\(]\\|$\\)\\|[^[/\\\n\r]\\)*"
   ....................^
lisp/progmodes/cc-awk.el:201:3: In c-awk-regexp-without-end-re: Branch matches subset of a previous branch (pos 67)
  "/\\(\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|\\[\\(\\(\\\\[\n\r]\\)*\\^\\)?\\(\\\\[\n\r]\\)*]?\\(\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|\\[:[a-z]+:\\]\\|[^]\n\r]\\)*\\(]\\|$\\)\\|[^[/\\\n\r]\\)*"
   ......................................................................................................^
lisp/progmodes/cc-awk.el:255:3: In c-awk-non-/-syn-ws*-re: Branch matches subset of a previous branch (pos 69)
  "\\(\\(\\\\[\n\r]\\|[ \t]\\)*\\([^#/\"\\\n\r \t]\\|\\\\\\(.\\|\\'\\)\\|\"\\([^\"\\\n\r]\\|\\\\\\(.\\|\n\\|\r\\|\\'\\)\\)*\\(\"\\|$\\|\\'\\)\\)\\)*"
   .........................................................................................................^
lisp/progmodes/cc-mode.el:1247:26: In call to re-search-forward: Branch matches subset of a previous branch (pos 17)
  "[\n\r]?\\(\\\\\\(.\\|\n\\|\r\\)\\|[^\\\n\r]\\)*"
   ..........................^
lisp/progmodes/cc-mode.el:1374:27: In call to looking-at: Branch matches subset of a previous branch (pos 12)
  "\\(\\\\\\(.\\|\n\\|\r\\)\\|[^\"]\\)*"
   ...................^
lisp/progmodes/cperl-mode.el:7980:16: In call to looking-at: Branch matches superset of a previous branch (pos 71)
  "\\([a-zA-Z0-9]+[^*+{?]\\)\\|\\$\\([a-zA-Z0-9_]+\\([[{]\\)?\\|[^\n \t)|]\\)\\|[$^]\\|\\(\\\\.\\|[^][()#|*+?\n]\\)\\([*+{?]\\??\\)?\\|\\(\\[\\)\\|\\((\\(\\?\\)?\\)\\|\\(|\\)"
   ....................................................................................^
lisp/arc-mode.el:2020:26: In call to looking-at: Branch matches subset of a previous branch (pos 44)
  "^ +[0-9.]+ +D?-+ +\\([0-9-]+\\) +\\([-0-9.%]+\\|-+\\) +\\([0-9a-zA-Z]+\\) +\\([0-9-]+\\) +\\([0-9:]+\\) +\\(.*\\)\n"
   ................................................^
lisp/info.el:1534:41: In call to re-search-forward: Branch matches superset of a previous branch (pos 18)
  "^\\* \\([^:\n]+:\\(:\\|[^.\n]+\\).\\)"
   .......................^
lisp/xml.el:247:27: In xml-att-type-re: Branch matches superset of a previous branch (pos 191)
  "\\(?:CDATA\\|\\(?:ID\\|IDREF\\|IDREFS\\|ENTITY\\|ENTITIES\\|NMTOKEN\\|NMTOKENS\\)\\|\\(?:NOTATION\\s-+(\\s-*[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*\\(?:\\s-*|\\s-*[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*\\)*\\s-*)\\)\\|\\(?:\\(?:NOTATION\\s-+(\\s-*[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*\\(?:\\s-*|\\s-*[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*\\)*\\s-*)\\)\\|\\(?:(\\s-*[[:word:]:_.0-9\267̀-ͯ‿⁀-]+\\(?:\\s-*|\\s-*[[:word:]:_.0-9\267̀-ͯ‿⁀-]+\\)*\\s-+)\\)\\)\\)"
   ..........................................................................................................................................................................................................................^
lisp/xml.el:257:26: In xml-att-def-re: Branch matches superset of a previous branch (pos 239)
  "\\(?:\\s-*[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*\\s-*\\(?:CDATA\\|\\(?:ID\\|IDREF\\|IDREFS\\|ENTITY\\|ENTITIES\\|NMTOKEN\\|NMTOKENS\\)\\|\\(?:NOTATION\\s-+(\\s-*[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*\\(?:\\s-*|\\s-*[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*\\)*\\s-*)\\)\\|\\(?:\\(?:NOTATION\\s-+(\\s-*[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*\\(?:\\s-*|\\s-*[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*\\)*\\s-*)\\)\\|\\(?:(\\s-*[[:word:]:_.0-9\267̀-ͯ‿⁀-]+\\(?:\\s-*|\\s-*[[:word:]:_.0-9\267̀-ͯ‿⁀-]+\\)*\\s-+)\\)\\)\\)\\s-*\\(?:#REQUIRED\\|#IMPLIED\\|\\(?:#FIXED\\s-+\\)*\\(?:\"\\(?:[^&\"]\\|\\(?:&[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*;\\|\\(?:&#[0-9]+;\\|&#x[0-9a-fA-F]+;\\)\\)\\)*\"\\|'\\(?:[^&']\\|\\(?:&[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*;\\|\\(?:&#[0-9]+;\\|&#x[0-9a-fA-F]+;\\)\\)\\)*'\\)\\)\\)"
   ................................................................................................................................................................................................................................................................................^
lisp/xml.el:805:28: In call to looking-at: Branch matches superset of a previous branch (pos 304)
  "<!ATTLIST[ \t\n\r]*\\([[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*\\)[ \t\n\r]*\\(\\(?:\\s-*[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*\\s-*\\(?:CDATA\\|\\(?:ID\\|IDREF\\|IDREFS\\|ENTITY\\|ENTITIES\\|NMTOKEN\\|NMTOKENS\\)\\|\\(?:NOTATION\\s-+(\\s-*[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*\\(?:\\s-*|\\s-*[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*\\)*\\s-*)\\)\\|\\(?:\\(?:NOTATION\\s-+(\\s-*[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*\\(?:\\s-*|\\s-*[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*\\)*\\s-*)\\)\\|\\(?:(\\s-*[[:word:]:_.0-9\267̀-ͯ‿⁀-]+\\(?:\\s-*|\\s-*[[:word:]:_.0-9\267̀-ͯ‿⁀-]+\\)*\\s-+)\\)\\)\\)\\s-*\\(?:#REQUIRED\\|#IMPLIED\\|\\(?:#FIXED\\s-+\\)*\\(?:\"\\(?:[^&\"]\\|\\(?:&[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*;\\|\\(?:&#[0-9]+;\\|&#x[0-9a-fA-F]+;\\)\\)\\)*\"\\|'\\(?:[^&']\\|\\(?:&[[:word:]:_][[:word:]:_.0-9\267̀-ͯ‿⁀-]*;\\|\\(?:&#[0-9]+;\\|&#x[0-9a-fA-F]+;\\)\\)\\)*'\\)\\)\\)\\)*[ \t\n\r]*>"
   .............................................................................................................................................................................................................................................................................................................................................................^


