unofficial mirror of emacs-devel@gnu.org 
* Scan of regexps in emacs
@ 2019-03-09 13:26 Mattias Engdegård
  2019-03-09 14:56 ` Alan Mackenzie
  2019-03-11  2:45 ` Paul Eggert
  0 siblings, 2 replies; 13+ messages in thread
From: Mattias Engdegård @ 2019-03-09 13:26 UTC
  To: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 229 bytes --]

Here is a new regexp error scan of the Emacs source tree.
The new complaints are due to improvements in the regexp-finding
abilities of the trawler. The locations are also more precise, and
there is a caret line for extra help.


[-- Attachment #2: trawl.log --]
[-- Type: text/x-log, Size: 5385 bytes --]

;; Trawling ~/emacs  -*- compilation -*-
emacs/lisp/language/china-util.el:171:29: In call to looking-at: Unescaped literal `$' (pos 1)
  "\e$A"
   ..^
emacs/lisp/progmodes/cc-awk.el:191:3: In c-awk-regexp-char-list-re: Unescaped literal `^' (pos 13)
  "\\[\\(\\\\[\n\r]\\)*^?\\(\\\\[\n\r]\\)*]?\\(\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|\\[:[a-z]+:\\]\\|[^]\n\r]\\)*\\(]\\|$\\)"
   ....................^
emacs/lisp/progmodes/cc-awk.el:197:3: In c-awk-regexp-innards-re: Unescaped literal `^' (pos 34)
  "\\(\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|\\[\\(\\\\[\n\r]\\)*^?\\(\\\\[\n\r]\\)*]?\\(\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|\\[:[a-z]+:\\]\\|[^]\n\r]\\)*\\(]\\|$\\)\\|[^[/\\\n\r]\\)*"
   .....................................................^
emacs/lisp/progmodes/cc-awk.el:201:3: In c-awk-regexp-without-end-re: Unescaped literal `^' (pos 35)
  "/\\(\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|\\[\\(\\\\[\n\r]\\)*^?\\(\\\\[\n\r]\\)*]?\\(\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|\\[:[a-z]+:\\]\\|[^]\n\r]\\)*\\(]\\|$\\)\\|[^[/\\\n\r]\\)*"
   ......................................................^
emacs/lisp/progmodes/cperl-mode.el:4927:30: In call to looking-at: Escaped non-special character `e' (pos 2)
  "\\(\\elsif\\|if\\|unless\\|while\\|until\\|for\\(each\\)?\\>\\(\\([ \t\n]*\\(#[^\n]*\n[ \t\n]*\\)*\\(state\\|my\\|local\\|our\\)\\)?[ \t\n]*\\(#[^\n]*\n[ \t\n]*\\)*\\$[_a-zA-Z0-9]+\\)?\\)\\>"
   ...^
emacs/lisp/progmodes/idlwave.el:3692:26: In call to re-search-backward: Character `,' included in range `+-/' (pos 311)
  "\\(\\<\\(&&\\|and\\|b\\(egin\\|reak\\)\\|c\\(ase\\|o\\(mpile_opt\\|ntinue\\)\\)\\|do\\|e\\(lse\\|nd\\(case\\|else\\|for\\|if\\|rep\\|switch\\|while\\)?\\|q\\)\\|for\\(ward_function\\)?\\|g\\(oto\\|[et]\\)\\|i\\(f\\|nherits\\)\\|l[et]\\|mod\\|n\\(e\\|ot\\)\\|o\\(n_\\(error\\|ioerror\\)\\|[fr]\\)\\|re\\(peat\\|turn\\)\\|switch\\|then\\|until\\|while\\|xor\\|||\\)\\>\\|[[(*+-/=,^><]\\)\\s-*\\*"
   .........................................................................................................................................................................................................................................................................................................................................................................................^
emacs/lisp/progmodes/scheme.el:425:3: In dsssl-font-lock-keywords: Unescaped literal `*' (pos 49)
  "(\\(and\\|c\\(ase\\|ond\\)\\|else\\|if\\|l\\(ambda\\|et\\(\\|*\\|rec\\)\\)\\|map\\|or\\|with-mode\\)\\>"
   .............................................................^
emacs/lisp/textmodes/texinfmt.el:587:3: In texinfo-part-of-para-regexp: Unescaped literal `^' (pos 223)
  "^@\\(b{\\|bullet{\\|cite{\\|code{\\|email{\\|emph{\\|equiv{\\|error{\\|expansion{\\|file{\\|i{\\|inforef{\\|kbd{\\|key{\\|lisp{\\|minus{\\|point{\\|print{\\|pxref{\\|r{\\|ref{\\|result{\\|samp{\\|sc{\\|t{\\|TeX{\\|today{\\|url{\\|var{\\|w{\\|xref{\\|@-\\|@^\\|@`\\|@'\\|@\"\\|@,\\|@=\\|@~\\|@OE{\\|@oe{\\|@AA{\\|@aa{\\|@AE{\\|@ae{\\|@ss{\\|@questiondown{\\|@exclamdown{\\|@L{\\|@l{\\|@O{\\|@o{\\|@dotaccent{\\|@ubaraccent{\\|@d{\\|@H{\\|@ringaccent{\\|@tieaccent{\\|@u{\\|@v{\\|@dotless{\\)"
   ................................................................................................................................................................................................................................................................^
emacs/lisp/textmodes/texinfmt.el:647:25: In call to looking-at: Unescaped literal `^' (pos 403)
  "\\(^@\\(direntry\\|lisp\\|smalllisp\\|example\\|smallexample\\|display\\|smalldisplay\\|format\\|smallformat\\|flushleft\\|flushright\\|menu\\|multitable\\|titlepage\\|iftex\\|ifhtml\\|tex\\|html\\)\\|^@\\(b{\\|bullet{\\|cite{\\|code{\\|email{\\|emph{\\|equiv{\\|error{\\|expansion{\\|file{\\|i{\\|inforef{\\|kbd{\\|key{\\|lisp{\\|minus{\\|point{\\|print{\\|pxref{\\|r{\\|ref{\\|result{\\|samp{\\|sc{\\|t{\\|TeX{\\|today{\\|url{\\|var{\\|w{\\|xref{\\|@-\\|@^\\|@`\\|@'\\|@\"\\|@,\\|@=\\|@~\\|@OE{\\|@oe{\\|@AA{\\|@aa{\\|@AE{\\|@ae{\\|@ss{\\|@questiondown{\\|@exclamdown{\\|@L{\\|@l{\\|@O{\\|@o{\\|@dotaccent{\\|@ubaraccent{\\|@d{\\|@H{\\|@ringaccent{\\|@tieaccent{\\|@u{\\|@v{\\|@dotless{\\)\\)"
   .........................................................................................................................................................................................................................................................................................................................................................................................................................................................................^
emacs/lisp/align.el:386:3: In align-rules-list (make-assignment): Duplicated `\' inside character alternative (pos 35)
  "^\\s-*\\w+\\(\\s-*\\):?=\\(\\s-*\\)\\([^\t\n \\\\]\\|$\\)"
   ...............................................^
emacs/lisp/comint.el:2084:34: In call to string-match: Unescaped literal `^' (pos 3)
  "\\(^^\\)\\1+"
   ....^
emacs/test/src/regex-emacs-tests.el:305:33: In call to re-search-backward: Duplicated `\' inside character alternative (pos 10)
  "\\(?:^\\|[^\\\\]\\)\\(?:\\\\\\\\\\)*\\\\.\\="
   .............^
emacs/test/src/regex-emacs-tests.el:322:31: In call to re-search-forward: Duplicated `\' inside character alternative (pos 10)
  "\\(?:^\\|[^\\\\]\\)\\(?:\\\\\\\\\\)*\\\\[Ss]"
   .............^
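
The idlwave.el diagnostic in the log above can be reproduced by hand. The following is a minimal sketch, not part of the trawler: the helper `demo-matched-chars` and the two `demo-` constants are hypothetical names introduced only for illustration. It shows that `+-/` inside a character alternative is a range from `+` to `/`, which also covers `,`, `-` and `.`.

```elisp
;; Sketch only: the demo-* names are assumptions, not Emacs or relint APIs.
;; In a character alternative, "+-/" is the range from ?+ (43) to ?/ (47),
;; so it also covers the comma (44), the hyphen (45) and the period (46).
(defun demo-matched-chars (regexp candidates)
  "Return a list of the characters in string CANDIDATES that match REGEXP."
  (let (result)
    (dolist (ch (string-to-list candidates) (nreverse result))
      (when (string-match-p regexp (char-to-string ch))
        (push ch result)))))

(defconst demo-range-re "[[(*+-/=,^><]"
  "Character alternative written with an accidental range.")

(defconst demo-fixed-re "[-[(*+/=,^><]"
  "Same alternative with the hyphen moved first, where it is literal.")

(concat (demo-matched-chars demo-range-re "+,-./"))  ; => "+,-./"
(concat (demo-matched-chars demo-fixed-re "+,-./"))  ; => "+,-/"
```

The only behavioural difference between the two forms is the period, which the accidental range admits and the rewritten alternative does not.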


* Re: Scan of regexps in emacs
  2019-03-09 13:26 Scan of regexps in emacs Mattias Engdegård
@ 2019-03-09 14:56 ` Alan Mackenzie
  2019-03-09 15:09   ` Alan Mackenzie
  2019-03-09 17:06   ` Paul Eggert
  2019-03-11  2:45 ` Paul Eggert
  1 sibling, 2 replies; 13+ messages in thread
From: Alan Mackenzie @ 2019-03-09 14:56 UTC
  To: Mattias Engdegård; +Cc: emacs-devel

Hello, Mattias.

On Sat, Mar 09, 2019 at 14:26:38 +0100, Mattias Engdegård wrote:
> Here is a new regexp error scan of the Emacs source tree.
> The new complaints are due to improvements in the regexp-finding
> abilities of the trawler. The locations are also more precise, and
> there is a caret line for extra help.

[ .... ]

As a matter of interest, the current scan failed to catch an unescaped
].  This was at the second ^ (inserted by me) in the last quoted line:

> emacs/lisp/progmodes/cc-awk.el:191:3: In c-awk-regexp-char-list-re: Unescaped literal `^' (pos 13)
>   "\\[\\(\\\\[\n\r]\\)*^?\\(\\\\[\n\r]\\)*]?\\(\\\\\\(.\\|\n\\|\r\\|\\'\\)\\|\\[:[a-z]+:\\]\\|[^]\n\r]\\)*\\(]\\|$\\)"
>    ....................^                  ^

Incidentally, there is another error in this regexp.  It contains a
stanza of the form A*x?A* near the beginning, which is asking for
trouble (time exponential in the number of escaped NLs) should there be
lots of escaped NLs anywhere.  A correct way to write this is
\\(A*x\\)?A*.

I will be fixing all these glitches.
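
The rewrite from A*x?A* to \\(A*x\\)?A* can be sanity-checked on a toy case. This is only a sketch under stated assumptions: A is taken to be the single character `a`, x the character `b`, a shy group is used in place of the capturing group, and the `demo-` names are hypothetical. It checks that both forms accept the same strings; it does not measure backtracking cost.

```elisp
;; Sketch only: toy stand-ins for the cc-awk.el regexps, anchored at both
;; ends so that `string-match-p' tests the whole string.
(defconst demo-loose-re "\\`a*b?a*\\'"
  "The A*x?A* shape, with A = \"a\" and x = \"b\".")

(defconst demo-grouped-re "\\`\\(?:a*b\\)?a*\\'"
  "The \\(A*x\\)?A* shape, here with a shy group.")

(defun demo-same-verdict-p (string)
  "Return non-nil if both regexps agree on whether STRING matches."
  (eq (null (string-match-p demo-loose-re string))
      (null (string-match-p demo-grouped-re string))))

;; Strings that both forms accept, and strings that both reject.
(mapcar #'demo-same-verdict-p '("" "aaa" "b" "aabaa" "abab" "bb"))
;; => (t t t t t t)
```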

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: Scan of regexps in emacs
  2019-03-09 14:56 ` Alan Mackenzie
@ 2019-03-09 15:09   ` Alan Mackenzie
  2019-03-10 11:19     ` Mattias Engdegård
  2019-03-09 17:06   ` Paul Eggert
  1 sibling, 1 reply; 13+ messages in thread
From: Alan Mackenzie @ 2019-03-09 15:09 UTC
  To: Mattias Engdegård; +Cc: emacs-devel

Hello, Mattias.

On Sat, Mar 09, 2019 at 14:56:21 +0000, Alan Mackenzie wrote:
[ .... ]

> As a matter of interest, the current scan failed to catch an unescaped
> ].  This was at the second ^ (inserted by me) in the last quoted line:

Apologies: such a ] is not a special character, and should not be
flagged.

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: Scan of regexps in emacs
  2019-03-09 14:56 ` Alan Mackenzie
  2019-03-09 15:09   ` Alan Mackenzie
@ 2019-03-09 17:06   ` Paul Eggert
  2019-03-09 17:46     ` Alan Mackenzie
  1 sibling, 1 reply; 13+ messages in thread
From: Paul Eggert @ 2019-03-09 17:06 UTC
  To: Alan Mackenzie, Mattias Engdegård; +Cc: emacs-devel

Alan Mackenzie wrote:
> I will be fixing all these glitches.

Thanks. Do you mean all the glitches in Mattias's new report, or all the 
glitches that you mentioned in your email? If the latter, I'll take a look at 
Mattias's new report. (I don't want to duplicate your work.)




* Re: Scan of regexps in emacs
  2019-03-09 17:06   ` Paul Eggert
@ 2019-03-09 17:46     ` Alan Mackenzie
  0 siblings, 0 replies; 13+ messages in thread
From: Alan Mackenzie @ 2019-03-09 17:46 UTC
  To: Paul Eggert; +Cc: Mattias Engdegård, emacs-devel

Hello, Paul.

On Sat, Mar 09, 2019 at 09:06:35 -0800, Paul Eggert wrote:
> Alan Mackenzie wrote:
> > I will be fixing all these glitches.

> Thanks. Do you mean all the glitches in Mattias's new report, or all the 
> glitches that you mentioned in your email? If the latter, I'll take a look at 
> Mattias's new report. (I don't want to duplicate your work.)

I meant just the glitches I mentioned in my post.  Sorry for being
ambiguous.

I've just pushed this fix to master.

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: Scan of regexps in emacs
  2019-03-09 15:09   ` Alan Mackenzie
@ 2019-03-10 11:19     ` Mattias Engdegård
  0 siblings, 0 replies; 13+ messages in thread
From: Mattias Engdegård @ 2019-03-10 11:19 UTC
  To: Alan Mackenzie; +Cc: emacs-devel

On 9 March 2019 at 16:09, Alan Mackenzie <acm@muc.de> wrote:
> 
>> As a matter of interest, the current scan failed to catch an unescaped
>> ].  This was at the second ^ (inserted by me) in the last quoted line:
> 
> Apologies: such a ] is not a special character, and should not be
> flagged.

Thanks for the fixes, Alan!
In fact, re-lint doesn't complain about \] either since it's very common and usually harmless.

Inside [...] it's less smooth -- I wish there was a way to detect "escaped" ], ^ and - there. Right now such mistakes have to be discovered indirectly.
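
To illustrate why "escaped" characters inside [...] are a trap: in an Emacs character alternative the backslash is an ordinary member of the set, not an escape. The following is a small sketch; the constant `demo-special-set` is a hypothetical name, and the placement rule it relies on (`]` first, `^` not first, `-` last) is the one given in the Emacs Lisp manual.

```elisp
;; The Lisp string "[\\^]" is the regexp [\^]: a set holding a backslash
;; and a caret.  The backslash does not escape anything here.
(string-match-p "[\\^]" "\\")    ; => 0, a lone backslash matches
(string-match-p "[\\^]" "^")     ; => 0

;; Doubling the backslash inside the set changes nothing, which is what
;; the "Duplicated `\\' inside character alternative" diagnostic is about.
(string-match-p "[^\\\\]" "\\")  ; => nil
(string-match-p "[^\\]" "\\")    ; => nil

;; Sketch only: `demo-special-set' is a hypothetical name.  The three
;; awkward characters are included by position, with no backslashes.
(defconst demo-special-set "[]^-]"
  "Matches a closing bracket, a caret or a hyphen, and nothing else.")

(string-match-p demo-special-set "]")   ; => 0
(string-match-p demo-special-set "\\")  ; => nil
```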





* Re: Scan of regexps in emacs
  2019-03-09 13:26 Scan of regexps in emacs Mattias Engdegård
  2019-03-09 14:56 ` Alan Mackenzie
@ 2019-03-11  2:45 ` Paul Eggert
  2019-03-11  2:56   ` Clément Pit-Claudel
  2019-03-11  8:51   ` Mattias Engdegård
  1 sibling, 2 replies; 13+ messages in thread
From: Paul Eggert @ 2019-03-11  2:45 UTC
  To: Mattias Engdegård, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 184 bytes --]

Mattias Engdegård wrote:
> Here is a new regexp error scan of the Emacs source tree.

Thanks. Alan fixed some of them and I installed the attached, which I hope fixes 
the rest.

[-- Attachment #2: 0001-More-regexp-corrections-and-tweaks.patch --]
[-- Type: text/x-patch; name="0001-More-regexp-corrections-and-tweaks.patch", Size: 5480 bytes --]

From 7c6cdb122008ff902a3edec021b97027aa416c24 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sun, 10 Mar 2019 19:42:11 -0700
Subject: [PATCH] More regexp corrections and tweaks
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Problems reported by Mattias Engdegård in:
https://lists.gnu.org/r/emacs-devel/2019-03/msg00247.html
* lisp/align.el (align-rules-list):
* lisp/comint.el (comint-output-filter):
* lisp/language/china-util.el (encode-hz-region):
* lisp/progmodes/cperl-mode.el (cperl-indent-exp):
* lisp/progmodes/idlwave.el (idlwave-is-pointer-dereference):
* lisp/progmodes/scheme.el (dsssl-font-lock-keywords):
* lisp/textmodes/texinfmt.el (texinfo-accent-commands):
* test/src/regex-emacs-tests.el (regex-tests-re-even-escapes):
Fix some regular-expression typos.
---
 lisp/align.el                 | 2 +-
 lisp/comint.el                | 2 +-
 lisp/language/china-util.el   | 2 +-
 lisp/progmodes/cperl-mode.el  | 2 +-
 lisp/progmodes/idlwave.el     | 2 +-
 lisp/progmodes/scheme.el      | 2 +-
 lisp/textmodes/texinfmt.el    | 8 +-------
 test/src/regex-emacs-tests.el | 2 +-
 8 files changed, 8 insertions(+), 14 deletions(-)

diff --git a/lisp/align.el b/lisp/align.el
index 43918811b9..594d15eee1 100644
--- a/lisp/align.el
+++ b/lisp/align.el
@@ -399,7 +399,7 @@ align-rules-list
 		   (lambda (end reverse)
 		     (funcall (if reverse 're-search-backward
 				're-search-forward)
-			      (concat "[^ \t\n\\\\]"
+			      (concat "[^ \t\n\\]"
 				      (regexp-quote comment-start)
 				      "\\(.+\\)$") end t))))
      (modes    . align-open-comment-modes))
diff --git a/lisp/comint.el b/lisp/comint.el
index a5fca7ea2a..e5012be982 100644
--- a/lisp/comint.el
+++ b/lisp/comint.el
@@ -2081,7 +2081,7 @@ comint-output-filter
                        (prompt-re (concat "\\`" (regexp-quote prompt))))
                   (while (string-match prompt-re string)
                     (setq string (substring string (match-end 0)))))))
-            (while (string-match (concat "\\(^" comint-prompt-regexp
+            (while (string-match (concat "\\(" comint-prompt-regexp
                                          "\\)\\1+")
                                  string)
               (setq string (replace-match "\\1" nil nil string)))
diff --git a/lisp/language/china-util.el b/lisp/language/china-util.el
index 70710bac18..1638565133 100644
--- a/lisp/language/china-util.el
+++ b/lisp/language/china-util.el
@@ -168,7 +168,7 @@ encode-hz-region
 	      ;; ESC ESC -> ESC
 	      (delete-char 1)
 	    (forward-char -1)
-	    (if (looking-at iso2022-gb-designation)
+	    (if (looking-at "\e\\$A")
 		(progn
 		  (delete-region (match-beginning 0) (match-end 0))
 		  (insert hz-gb-designation)
diff --git a/lisp/progmodes/cperl-mode.el b/lisp/progmodes/cperl-mode.el
index a9402e17a9..970c5669c6 100644
--- a/lisp/progmodes/cperl-mode.el
+++ b/lisp/progmodes/cperl-mode.el
@@ -4924,7 +4924,7 @@ cperl-indent-exp
 			      (if (looking-at "\\(state\\|my\\|local\\|our\\)\\>")
 				  (forward-sexp -1))))
 			(if (looking-at
-			     (concat "\\(\\elsif\\|if\\|unless\\|while\\|until"
+			     (concat "\\(elsif\\|if\\|unless\\|while\\|until"
 				     "\\|for\\(each\\)?\\>\\(\\("
 				     cperl-maybe-white-and-comment-rex
 				     "\\(state\\|my\\|local\\|our\\)\\)?"
diff --git a/lisp/progmodes/idlwave.el b/lisp/progmodes/idlwave.el
index 25bc788ffc..5ff22571b9 100644
--- a/lisp/progmodes/idlwave.el
+++ b/lisp/progmodes/idlwave.el
@@ -3690,7 +3690,7 @@ idlwave-is-pointer-dereference
    (save-excursion
      (forward-char)
      (re-search-backward (concat "\\(" idlwave-idl-keywords
-                                 "\\|[[(*+-/=,^><]\\)\\s-*\\*") limit t))))
+                                 "\\|[-[(*+/=,^><]\\)\\s-*\\*") limit t))))
 
 
 ;; Statement templates
diff --git a/lisp/progmodes/scheme.el b/lisp/progmodes/scheme.el
index 62f521ee94..507a4c7085 100644
--- a/lisp/progmodes/scheme.el
+++ b/lisp/progmodes/scheme.el
@@ -433,7 +433,7 @@ dsssl-font-lock-keywords
               ;; (make-regexp '("case" "cond" "else" "if" "lambda"
               ;; "let" "let*" "letrec" "and" "or" "map" "with-mode"))
               "and\\|c\\(ase\\|ond\\)\\|else\\|if\\|"
-              "l\\(ambda\\|et\\(\\|*\\|rec\\)\\)\\|map\\|or\\|with-mode"
+              "l\\(ambda\\|et\\(\\|\\*\\|rec\\)\\)\\|map\\|or\\|with-mode"
               "\\)\\>")
       1)
      ;; DSSSL syntax
diff --git a/lisp/textmodes/texinfmt.el b/lisp/textmodes/texinfmt.el
index 61c31a511c..4bfecb48b6 100644
--- a/lisp/textmodes/texinfmt.el
+++ b/lisp/textmodes/texinfmt.el
@@ -552,13 +552,7 @@ texinfo-no-refill-regexp
 
 (defvar texinfo-accent-commands
   (concat
-   "@^\\|"
-   "@`\\|"
-   "@'\\|"
-   "@\"\\|"
-   "@,\\|"
-   "@=\\|"
-   "@~\\|"
+   "@[\"',=^`~]\\|"
    "@OE{\\|"
    "@oe{\\|"
    "@AA{\\|"
diff --git a/test/src/regex-emacs-tests.el b/test/src/regex-emacs-tests.el
index 9a40316573..0ae50c94d4 100644
--- a/test/src/regex-emacs-tests.el
+++ b/test/src/regex-emacs-tests.el
@@ -278,7 +278,7 @@ regex-tests-match
 
 
 (defconst regex-tests-re-even-escapes
-  "\\(?:^\\|[^\\\\]\\)\\(?:\\\\\\\\\\)*"
+  "\\(?:^\\|[^\\]\\)\\(?:\\\\\\\\\\)*"
   "Regex that matches an even number of \\ characters")
 
 (defconst regex-tests-re-odd-escapes
-- 
2.17.1



* Re: Scan of regexps in emacs
  2019-03-11  2:45 ` Paul Eggert
@ 2019-03-11  2:56   ` Clément Pit-Claudel
  2019-03-11  3:37     ` Paul Eggert
  2019-03-11  8:51   ` Mattias Engdegård
  1 sibling, 1 reply; 13+ messages in thread
From: Clément Pit-Claudel @ 2019-03-11  2:56 UTC
  To: emacs-devel

On 10/03/2019 22.45, Paul Eggert wrote:
> -            (while (string-match (concat "\\(^" comint-prompt-regexp
> +            (while (string-match (concat "\\(" comint-prompt-regexp
>                                           "\\)\\1+")
>                                   string)

I think your change altered the meaning of that regexp.  Was that intentional?  Or am I misunderstanding?  The manual says this:

    For historical compatibility reasons, ‘^’ can be used only at the beginning of the regular expression, or after ‘\(’, ‘\(?:’ or ‘\|’. 

…and indeed (string-match-p "\\(^abc\\)" "xabc") is nil but (string-match-p "\\(abc\\)" "xabc") is 1.

Clément.




* Re: Scan of regexps in emacs
  2019-03-11  2:56   ` Clément Pit-Claudel
@ 2019-03-11  3:37     ` Paul Eggert
  2019-03-11  8:39       ` Mattias Engdegård
  0 siblings, 1 reply; 13+ messages in thread
From: Paul Eggert @ 2019-03-11  3:37 UTC
  To: Clément Pit-Claudel; +Cc: emacs-devel

Clément Pit-Claudel wrote:
> On 10/03/2019 22.45, Paul Eggert wrote:
>> -            (while (string-match (concat "\\(^" comint-prompt-regexp
>> +            (while (string-match (concat "\\(" comint-prompt-regexp
>>                                            "\\)\\1+")
>>                                    string)
> I think your change altered the meaning of that regexp.

Yes and no. Yes, it altered the meaning of the regexp, but no it should fix a 
bug rather than introduce one because comint-prompt-regexp in practice always 
seems to be anchored to a line start. For example, comint-prompt-regexp defaults 
to "^", which meant that the above code's entire regexp was this:

\(^^\)\1+

which is equivalent to this regexp:

\(^\^\)\1+

which is not what was wanted.
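
The effect of the doubled caret can be checked directly. This is a minimal sketch: the variable `demo-prompt-regexp` is a hypothetical stand-in for comint-prompt-regexp, set to its default value "^" as described above.

```elisp
;; Sketch only: `demo-prompt-regexp' stands in for `comint-prompt-regexp'.
(defconst demo-prompt-regexp "^"
  "Stand-in holding the default prompt regexp.")

(defconst demo-old-re (concat "\\(^" demo-prompt-regexp "\\)\\1+")
  "The regexp built by the old code: \\(^^\\)\\1+.")

;; The first ^ follows \( and is an anchor; the second is a literal
;; caret.  So the old regexp looks for repeated "^" characters at the
;; start of a line rather than for repeated prompts.
(string-match-p demo-old-re "^^ some output")  ; => 0
(string-match-p demo-old-re "some output")     ; => nil
```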




* Re: Scan of regexps in emacs
  2019-03-11  3:37     ` Paul Eggert
@ 2019-03-11  8:39       ` Mattias Engdegård
  0 siblings, 0 replies; 13+ messages in thread
From: Mattias Engdegård @ 2019-03-11  8:39 UTC
  To: Paul Eggert; +Cc: Clément Pit-Claudel, emacs-devel

On 11 March 2019 at 04:37, Paul Eggert <eggert@cs.ucla.edu> wrote:
> 
> Clément Pit-Claudel wrote:
>> On 10/03/2019 22.45, Paul Eggert wrote:
>>> -            (while (string-match (concat "\\(^" comint-prompt-regexp
>>> +            (while (string-match (concat "\\(" comint-prompt-regexp
>>>                                           "\\)\\1+")
>>>                                   string)
>> I think your change altered the meaning of that regexp.
> 
> Yes and no. Yes, it altered the meaning of the regexp, but no it should fix a bug rather than introduce one because comint-prompt-regexp in practice always seems to be anchored to a line start. For example, comint-prompt-regexp defaults to "^", which meant that the above code's entire regexp was this:
> 
> \(^^\)\1+

One way would be to replace the "\\(^" prefix with "\\(?:^\\)\\(", which should work whether or not comint-prompt-regexp begins with a ^. However, as you say, comint-prompt-regexp always seems to include the ^ and I think it is used elsewhere with the tacit assumption that it does, so the committed change should be fine.
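
A sketch of the "\\(?:^\\)\\(" alternative, for comparison with the committed change. The function `demo-collapse-prompts` is a hypothetical name; its loop is modelled on the comint.el code quoted above but takes the prompt regexp as an argument so that both the anchored and the unanchored case can be tried.

```elisp
;; Sketch only: `demo-collapse-prompts' is not part of comint.el.
(defun demo-collapse-prompts (prompt-re string)
  "Collapse repeated prompts matching PROMPT-RE at a line start in STRING.
The line-start anchor sits in its own shy group, so PROMPT-RE may or
may not begin with ^ itself."
  (let ((re (concat "\\(?:^\\)\\(" prompt-re "\\)\\1+")))
    (while (string-match re string)
      (setq string (replace-match "\\1" nil nil string)))
    string))

(demo-collapse-prompts "^> " "> > > ls")  ; => "> ls"
(demo-collapse-prompts "> " "> > > ls")   ; => "> ls"
(demo-collapse-prompts "> " "x> > ls")    ; => "x> > ls", not at line start
```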





* Re: Scan of regexps in emacs
  2019-03-11  2:45 ` Paul Eggert
  2019-03-11  2:56   ` Clément Pit-Claudel
@ 2019-03-11  8:51   ` Mattias Engdegård
  2019-03-11 22:49     ` Paul Eggert
  1 sibling, 1 reply; 13+ messages in thread
From: Mattias Engdegård @ 2019-03-11  8:51 UTC
  To: Paul Eggert; +Cc: emacs-devel

On 11 March 2019 at 03:45, Paul Eggert <eggert@cs.ucla.edu> wrote:
> 
> Mattias Engdegård wrote:
>> Here is a new regexp error scan of the Emacs source tree.
> 
> Thanks. Alan fixed some of them and I installed the attached, which I hope fixes the rest.

Thank you, and it looks good. Maybe one tweak:

--- a/lisp/language/china-util.el
+++ b/lisp/language/china-util.el
@@ -168,4 +168,4 @@ encode-hz-region
 	      ;; ESC ESC -> ESC
 	      (delete-char 1)
 	    (forward-char -1)
-	    (if (looking-at iso2022-gb-designation)
+	    (if (looking-at "\e\\$A")

What about (regexp-quote iso2022-gb-designation) instead, possibly hoisted?
(Of course the reader then wonders why iso2022-ascii-designation isn't quoted. Oh dear.)
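
The regexp-quote variant could look roughly like this. It is a sketch only: the `demo-` names are hypothetical, and `demo-gb-designation` merely stands in for iso2022-gb-designation with the value "\e$A" shown in the scan log. "Hoisted" is read here as quoting once in a constant rather than at every call.

```elisp
;; Sketch only: the demo-* names are assumptions, not china-util.el code.
(defconst demo-gb-designation "\e$A"
  "Stand-in for `iso2022-gb-designation'.")

(defconst demo-gb-designation-re (regexp-quote demo-gb-designation)
  "The designation quoted once, so callers need not re-quote it.")

(defun demo-at-gb-designation-p ()
  "Return non-nil if the GB designation sequence starts at point."
  (looking-at-p demo-gb-designation-re))

;; `regexp-quote' yields the same string as the hand-expanded literal.
(equal demo-gb-designation-re "\e\\$A")  ; => t
```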





* Re: Scan of regexps in emacs
  2019-03-11  8:51   ` Mattias Engdegård
@ 2019-03-11 22:49     ` Paul Eggert
  2019-03-12 10:21       ` Mattias Engdegård
  0 siblings, 1 reply; 13+ messages in thread
From: Paul Eggert @ 2019-03-11 22:49 UTC
  To: Mattias Engdegård; +Cc: emacs-devel

On 3/11/19 1:51 AM, Mattias Engdegård wrote:
> -	    (if (looking-at iso2022-gb-designation)
> +	    (if (looking-at "\e\\$A")
>
> What about (regexp-quote iso2022-gb-designation) instead, possibly hoisted?
> (Of course the reader then wonders why iso2022-ascii-designation isn't quoted. Oh dear.)

I went through exactly the same thought processes. As variables like
iso2022-gb-designation are not really intended to be changed, I figured
it was OK to simply expand it inline by hand.





* Re: Scan of regexps in emacs
  2019-03-11 22:49     ` Paul Eggert
@ 2019-03-12 10:21       ` Mattias Engdegård
  0 siblings, 0 replies; 13+ messages in thread
From: Mattias Engdegård @ 2019-03-12 10:21 UTC
  To: Paul Eggert; +Cc: emacs-devel

On 11 March 2019 at 23:49, Paul Eggert <eggert@cs.ucla.edu> wrote:
> 
> On 3/11/19 1:51 AM, Mattias Engdegård wrote:
>> -	    (if (looking-at iso2022-gb-designation)
>> +	    (if (looking-at "\e\\$A")
>> 
>> What about (regexp-quote iso2022-gb-designation) instead, possibly hoisted?
>> (Of course the reader then wonders why iso2022-ascii-designation isn't quoted. Oh dear.)
> 
> I went through exactly the same thought processes. As variables like
> iso2022-gb-designation are not really intended to be changed, I figured
> it was OK to simply expand it inline by hand.

Agreed. In fact, it is so common to see looking-at with a nonliteral-free pattern that it might be worth adding a standard looking-at-string-p for that purpose. (Or is there already one? With Emacs, you can never be sure.)
That would take care of lots of silly regexp quoting worries, be more readable, and a little faster.
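
For concreteness, such a helper might be sketched as follows. This is an illustration under assumptions, not an existing Emacs function: the name `demo-looking-at-string-p` is hypothetical, and unlike looking-at this version ignores case-fold-search and does not touch the match data.

```elisp
;; Sketch only: a hypothetical helper, not a standard Emacs function.
(defun demo-looking-at-string-p (string)
  "Return non-nil if the text after point is exactly STRING.
The comparison is literal and case-sensitive, so no regexp quoting
is needed."
  (let ((end (+ (point) (length string))))
    (and (<= end (point-max))
         (string= (buffer-substring-no-properties (point) end)
                  string))))

(with-temp-buffer
  (insert "\e$Axyz")
  (goto-char (point-min))
  (list (demo-looking-at-string-p "\e$A")
        (demo-looking-at-string-p "\e$B")))
;; => (t nil)
```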




