* Emacs regexp scan (Sep 29) @ 2019-09-29 19:39 Mattias Engdegård 2019-10-04 21:42 ` Paul Eggert 0 siblings, 1 reply; 24+ messages in thread From: Mattias Engdegård @ 2019-09-29 19:39 UTC (permalink / raw) To: emacs-devel [-- Attachment #1: Type: text/plain, Size: 148 bytes --] Time for another of these, after some scanner improvements. It looks like a lot, but it's probably just a handful errors causing many warnings. [-- Attachment #2: relint.log --] [-- Type: application/octet-stream, Size: 137572 bytes --] ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Emacs regexp scan (Sep 29) 2019-09-29 19:39 Emacs regexp scan (Sep 29) Mattias Engdegård @ 2019-10-04 21:42 ` Paul Eggert 2019-10-05 8:10 ` Eli Zaretskii 2019-10-05 10:03 ` Mattias Engdegård 0 siblings, 2 replies; 24+ messages in thread From: Paul Eggert @ 2019-10-04 21:42 UTC (permalink / raw) To: Mattias Engdegård; +Cc: emacs-devel [-- Attachment #1: Type: text/plain, Size: 185 bytes --] Thanks, I installed the attached patch, which I hope fixes all the bugs and style glitches uncovered by that scan, along with some other style glitches I noticed in the neighborhood. [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: 0001-Fix-bugs-found-by-2019-09-29-regexp-scanner.patch --] [-- Type: text/x-patch; name="0001-Fix-bugs-found-by-2019-09-29-regexp-scanner.patch", Size: 15105 bytes --] From 57556fa21da431b47b2f3842ddb61273f5d39820 Mon Sep 17 00:00:00 2001 From: Paul Eggert <eggert@cs.ucla.edu> Date: Fri, 4 Oct 2019 14:38:22 -0700 Subject: [PATCH] Fix bugs found by 2019-09-29 regexp scanner MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Problems reported by Mattias EngdegÃ¥rd in: https://lists.gnu.org/r/emacs-devel/2019-09/threads.html * lisp/calendar/iso8601.el (iso8601--year-match) (iso8601--full-date-match, iso8601--without-day-match) (iso8601--week-date-match, iso8601--ordinal-date-match) (iso8601--zone-match): * lisp/textmodes/rst.el (rst-re-alist-def): Put â-â at the end of bracketed ranges, following the style suggestion in the Elisp manual. (iso8601--time-match): Use \([0-9]*\) instead of \([0-9]+\)? to pacify the regexp scanner. (iso8601-parse-time): Adjust accordingly. * lisp/language/burmese.el (burmese-composable-pattern): * lisp/language/indian.el (devanagari-composable-pattern) (bengali-composable-pattern, gurmukhi-composable-pattern) (gujarati-composable-pattern, oriya-composable-pattern) (telugu-composable-pattern, kannada-composable-pattern) (malayalam-composable-pattern): Prefer [ab] to [a-b] when the characters differ by 1, to pacify the regexp scanner. * lisp/language/burmese.el (burmese-composable-pattern): Fix missing-â\uâ typos. * lisp/language/indian.el (gurmukhi-composable-pattern): Fix missing-â\â typo. * lisp/language/tibetan.el (tibetan-regexp): Quote â+â in regexp to pacify the regexp scanner. Simplify. * lisp/textmodes/rst.el (rst-re-alist-def): Fix â[]-'...]â typo by putting the â-â at end of the bracketed expression. --- lisp/calendar/iso8601.el | 18 ++++++------- lisp/language/burmese.el | 4 +-- lisp/language/indian.el | 58 ++++++++++++++++++++-------------------- lisp/language/tibetan.el | 23 +++++++--------- lisp/textmodes/rst.el | 8 +++--- 5 files changed, 54 insertions(+), 57 deletions(-) diff --git a/lisp/calendar/iso8601.el b/lisp/calendar/iso8601.el index 3ff91d910c..78a94d47be 100644 --- a/lisp/calendar/iso8601.el +++ b/lisp/calendar/iso8601.el @@ -62,17 +62,17 @@ iso8601--concat-regexps regexps "\\|")) (defconst iso8601--year-match - "\\([-+]\\)?\\([0-9][0-9][0-9][0-9]\\)") + "\\([+-]\\)?\\([0-9][0-9][0-9][0-9]\\)") (defconst iso8601--full-date-match - "\\([-+]\\)?\\([0-9][0-9][0-9][0-9]\\)-?\\([0-9][0-9]\\)-?\\([0-9][0-9]\\)") + "\\([+-]\\)?\\([0-9][0-9][0-9][0-9]\\)-?\\([0-9][0-9]\\)-?\\([0-9][0-9]\\)") (defconst iso8601--without-day-match - "\\([-+]\\)?\\([0-9][0-9][0-9][0-9]\\)-\\([0-9][0-9]\\)") + "\\([+-]\\)?\\([0-9][0-9][0-9][0-9]\\)-\\([0-9][0-9]\\)") (defconst iso8601--outdated-date-match "--\\([0-9][0-9]\\)-?\\([0-9][0-9]\\)") (defconst iso8601--week-date-match - "\\([-+]\\)?\\([0-9][0-9][0-9][0-9]\\)-?W\\([0-9][0-9]\\)-?\\([0-9]\\)?") + "\\([+-]\\)?\\([0-9][0-9][0-9][0-9]\\)-?W\\([0-9][0-9]\\)-?\\([0-9]\\)?") (defconst iso8601--ordinal-date-match - "\\([-+]\\)?\\([0-9][0-9][0-9][0-9]\\)-?\\([0-9][0-9][0-9]\\)") + "\\([+-]\\)?\\([0-9][0-9][0-9][0-9]\\)-?\\([0-9][0-9][0-9]\\)") (defconst iso8601--date-match (iso8601--concat-regexps (list iso8601--year-match @@ -83,10 +83,10 @@ iso8601--date-match iso8601--ordinal-date-match))) (defconst iso8601--time-match - "\\([0-9][0-9]\\):?\\([0-9][0-9]\\)?:?\\([0-9][0-9]\\)?[.,]?\\([0-9]+\\)?") + "\\([0-9][0-9]\\):?\\([0-9][0-9]\\)?:?\\([0-9][0-9]\\)?[.,]?\\([0-9]*\\)") (defconst iso8601--zone-match - "\\(Z\\|\\([-+]\\)\\([0-9][0-9]\\):?\\([0-9][0-9]\\)?\\)") + "\\(Z\\|\\([+-]\\)\\([0-9][0-9]\\):?\\([0-9][0-9]\\)?\\)") (defconst iso8601--full-time-match (concat "\\(" (replace-regexp-in-string "(" "(?:" iso8601--time-match) "\\)" @@ -142,7 +142,7 @@ iso8601-parse (defun iso8601-parse-date (string) "Parse STRING (in ISO 8601 format) and return a `decode-time' value." (cond - ;; Just a year: [-+]YYYY. + ;; Just a year: [+-]YYYY. ((iso8601--match iso8601--year-match string) (iso8601--decoded-time :year (iso8601--adjust-year (match-string 1 string) @@ -236,7 +236,7 @@ iso8601-parse-time (string-to-number (match-string 2 time)))) (second (and (match-string 3 time) (string-to-number (match-string 3 time)))) - (fraction (and (match-string 4 time) + (fraction (and (not (zerop (length (match-string 4 time)))) (string-to-number (match-string 4 time))))) (when (and fraction (eq form t)) diff --git a/lisp/language/burmese.el b/lisp/language/burmese.el index 25425ec485..7f2a99a41a 100644 --- a/lisp/language/burmese.el +++ b/lisp/language/burmese.el @@ -39,11 +39,11 @@ (defvar burmese-composable-pattern (let ((table '(("K" . "[\u1004\u105A]\u103A\u1039") ; KINZI sequence - ("C" . "[\u1000-\u102A\u103F\u1041-\u1049\u104E\u105A-\u105D\u1061\u1065-\u1066\u106E\u1071\u1075\u1081\u108E\uAA60-\uAA6F\uAA71-\uAA76]") ; consonant and vowel letter + ("C" . "[\u1000-\u102A\u103F\u1041-\u1049\u104E\u105A-\u105D\u1061\u1065\u1066\u106E\u1071\u1075\u1081\u108E\uAA60-\uAA6F\uAA71-\uAA76]") ; consonant and vowel letter ("V" . "\u1039") ; VIRAMA ("A" . "\u103A") ; ASAT ("S" . "[\u1000-\u1019\u101C\u101E\u1020\u1021\u105A]") ; subscript - ("M" . "[\u103B-\u103E\105E-\1060]") ; medial + ("M" . "[\u103B-\u103E\u105E-\u1060]") ; medial ("v" . "[\u102B-\u103A\u103C-\u103E\u1062-\u1064\u1067-\u106D\u1071-\u1074\u1082-\u108D\u108F\u109A\u109C\uAA70]"))) ; vowel sign, etc. (regexp "\\(K\\)?C\\(VS\\)?\\(VS\\)?A?M*v*")) (let ((case-fold-search nil)) diff --git a/lisp/language/indian.el b/lisp/language/indian.el index f1e61a354c..4013faca7c 100644 --- a/lisp/language/indian.el +++ b/lisp/language/indian.el @@ -139,14 +139,14 @@ devanagari-composable-pattern (let ((table '(("a" . "[\u0900-\u0902]") ; vowel modifier (above) ("A" . "\u0903") ; vowel modifier (post) - ("V" . "[\u0904-\u0914\u0960-\u0961\u0972]") ; independent vowel + ("V" . "[\u0904-\u0914\u0960\u0961\u0972]") ; independent vowel ("C" . "[\u0915-\u0939\u0958-\u095F\u0979-\u097F]") ; consonant ("R" . "\u0930") ; RA ("n" . "\u093C") ; NUKTA - ("v" . "[\u093E-\u094C\u094E\u0955\u0962-\u0963]") ; vowel sign + ("v" . "[\u093E-\u094C\u094E\u0955\u0962\u0963]") ; vowel sign ("H" . "\u094D") ; HALANT - ("s" . "[\u0951-\u0952]") ; stress sign - ("t" . "[\u0953-\u0954]") ; accent + ("s" . "[\u0951\u0952]") ; stress sign + ("t" . "[\u0953\u0954]") ; accent ("N" . "\u200C") ; ZWNJ ("J" . "\u200D") ; ZWJ ("X" . "[\u0900-\u097F]")))) ; all coverage @@ -166,13 +166,13 @@ devanagari-composable-pattern (defconst bengali-composable-pattern (let ((table '(("a" . "\u0981") ; SIGN CANDRABINDU - ("A" . "[\u0982-\u0983]") ; SIGN ANUSVARA .. VISARGA - ("V" . "[\u0985-\u0994\u09E0-\u09E1]") ; independent vowel + ("A" . "[\u0982\u0983]") ; SIGN ANUSVARA .. VISARGA + ("V" . "[\u0985-\u0994\u09E0\u09E1]") ; independent vowel ("C" . "[\u0995-\u09B9\u09DC-\u09DF\u09F1]") ; consonant - ("B" . "[\u09AC\u09AF-\u09B0\u09F0]") ; BA, YA, RA + ("B" . "[\u09AC\u09AF\u09B0\u09F0]") ; BA, YA, RA ("R" . "[\u09B0\u09F0]") ; RA ("n" . "\u09BC") ; NUKTA - ("v" . "[\u09BE-\u09CC\u09D7\u09E2-\u09E3]") ; vowel sign + ("v" . "[\u09BE-\u09CC\u09D7\u09E2\u09E3]") ; vowel sign ("H" . "\u09CD") ; HALANT ("T" . "\u09CE") ; KHANDA TA ("N" . "\u200C") ; ZWNJ @@ -195,11 +195,11 @@ bengali-composable-pattern (defconst gurmukhi-composable-pattern (let ((table - '(("a" . "[\u0A01-\u0A02\u0A70]") ; SIGN ADAK BINDI .. BINDI, TIPPI + '(("a" . "[\u0A01\u0A02\u0A70]") ; SIGN ADAK BINDI .. BINDI, TIPPI ("A" . "\u0A03") ; SIGN VISARGA ("V" . "[\u0A05-\u0A14]") ; independent vowel ("C" . "[\u0A15-\u0A39\u0A59-\u0A5E]") ; consonant - ("Y" . "[\u0A2F-u0A30\u0A35\u0A39]") ; YA, RA, VA, HA + ("Y" . "[\u0A2F-\u0A30\u0A35\u0A39]") ; YA, RA, VA, HA ("n" . "\u0A3C") ; NUKTA ("v" . "[\u0A3E-\u0A4C]") ; vowel sign ("H" . "\u0A4D") ; VIRAMA @@ -221,13 +221,13 @@ gurmukhi-composable-pattern (defconst gujarati-composable-pattern (let ((table - '(("a" . "[\u0A81-\u0A82]") ; SIGN CANDRABINDU .. ANUSVARA + '(("a" . "[\u0A81\u0A82]") ; SIGN CANDRABINDU .. ANUSVARA ("A" . "\u0A83") ; SIGN VISARGA - ("V" . "[\u0A85-\u0A94\u0AE0-\u0AE1]") ; independent vowel + ("V" . "[\u0A85-\u0A94\u0AE0\u0AE1]") ; independent vowel ("C" . "[\u0A95-\u0AB9]") ; consonant ("R" . "\u0AB0") ; RA ("n" . "\u0ABC") ; NUKTA - ("v" . "[\u0ABE-\u0ACC\u0AE2-\u0AE3]") ; vowel sign + ("v" . "[\u0ABE-\u0ACC\u0AE2\u0AE3]") ; vowel sign ("H" . "\u0ACD") ; VIRAMA ("N" . "\u200C") ; ZWNJ ("J" . "\u200D") ; ZWJ @@ -248,13 +248,13 @@ gujarati-composable-pattern (defconst oriya-composable-pattern (let ((table '(("a" . "\u0B01") ; SIGN CANDRABINDU - ("A" . "[\u0B02-\u0B03]") ; SIGN ANUSVARA .. VISARGA - ("V" . "[\u0B05-\u0B14\u0B60-\u0B61]") ; independent vowel - ("C" . "[\u0B15-\u0B39\u0B5C-\u0B5D\u0B71]") ; consonant - ("B" . "[\u0B15-\u0B17\u0B1B-\u0B1D\u0B1F-\u0B21\u0B23-\u0B24\u0B27-\u0B30\u0B32-\u0B35\u0B38-\u0B39]") ; consonant with below form + ("A" . "[\u0B02\u0B03]") ; SIGN ANUSVARA .. VISARGA + ("V" . "[\u0B05-\u0B14\u0B60\u0B61]") ; independent vowel + ("C" . "[\u0B15-\u0B39\u0B5C\u0B5D\u0B71]") ; consonant + ("B" . "[\u0B15-\u0B17\u0B1B-\u0B1D\u0B1F-\u0B21\u0B23\u0B24\u0B27-\u0B30\u0B32-\u0B35\u0B38\u0B39]") ; consonant with below form ("R" . "\u0B30") ; RA ("n" . "\u0B3C") ; NUKTA - ("v" . "[\u0B3E-\u0B4C\u0B56-\u0B57\u0B62-\u0B63]") ; vowel sign + ("v" . "[\u0B3E-\u0B4C\u0B56\u0B57\u0B62\u0B63]") ; vowel sign ("H" . "\u0B4D") ; VIRAMA ("N" . "\u200C") ; ZWNJ ("J" . "\u200D") ; ZWJ @@ -296,9 +296,9 @@ tamil-composable-pattern (defconst telugu-composable-pattern (let ((table '(("a" . "[\u0C01-\u0C03]") ; SIGN CANDRABINDU .. VISARGA - ("V" . "[\u0C05-\u0C14\u0C60-\u0C61]") ; independent vowel - ("C" . "[\u0C15-\u0C39\u0C58-\u0C59]") ; consonant - ("v" . "[\u0C3E-\u0C4C\u0C55-\u0C56\u0C62-\u0C63]") ; vowel sign + ("V" . "[\u0C05-\u0C14\u0C60\u0C61]") ; independent vowel + ("C" . "[\u0C15-\u0C39\u0C58\u0C59]") ; consonant + ("v" . "[\u0C3E-\u0C4C\u0C55\u0C56\u0C62\u0C63]") ; vowel sign ("H" . "\u0C4D") ; VIRAMA ("N" . "\u200C") ; ZWNJ ("J" . "\u200D") ; ZWJ @@ -318,12 +318,12 @@ telugu-composable-pattern (defconst kannada-composable-pattern (let ((table - '(("A" . "[\u0C82-\u0C83]") ; SIGN ANUSVARA .. VISARGA - ("V" . "[\u0C85-\u0C94\u0CE0-\u0CE1]") ; independent vowel - ("C" . "[\u0C95-\u0CB9\u0CDE]") ; consonant + '(("A" . "[\u0C82\u0C83]") ; SIGN ANUSVARA .. VISARGA + ("V" . "[\u0C85-\u0C94\u0CE0\u0CE1]") ; independent vowel + ("C" . "[\u0C95-\u0CB9\u0CDE]") ; consonant ("R" . "\u0CB0") ; RA ("n" . "\u0CBC") ; NUKTA - ("v" . "[\u0CBE-\u0CCC\u0CD5-\u0CD6\u0CE2-\u0CE3]") ; vowel sign + ("v" . "[\u0CBE-\u0CCC\u0CD5\u0CD6\u0CE2\u0CE3]") ; vowel sign ("H" . "\u0CCD") ; VIRAMA ("N" . "\u200C") ; ZWNJ ("J" . "\u200D") ; ZWJ @@ -343,11 +343,11 @@ kannada-composable-pattern (defconst malayalam-composable-pattern (let ((table - '(("A" . "[\u0D02-\u0D03]") ; SIGN ANUSVARA .. VISARGA - ("V" . "[\u0D05-\u0D14\u0D60-\u0D61]") ; independent vowel + '(("A" . "[\u0D02\u0D03]") ; SIGN ANUSVARA .. VISARGA + ("V" . "[\u0D05-\u0D14\u0D60\u0D61]") ; independent vowel ("C" . "[\u0D15-\u0D39]") ; consonant - ("Y" . "[\u0D2F-\u0D30\u0D32\u0D35]") ; YA, RA, LA, VA - ("v" . "[\u0D3E-\u0D4C\u0D57\u0D62-\u0D63]") ; postbase matra + ("Y" . "[\u0D2F\u0D30\u0D32\u0D35]") ; YA, RA, LA, VA + ("v" . "[\u0D3E-\u0D4C\u0D57\u0D62\u0D63]") ; postbase matra ("H" . "\u0D4D") ; SIGN VIRAMA ("N" . "\u200C") ; ZWNJ ("J" . "\u200D") ; ZWJ diff --git a/lisp/language/tibetan.el b/lisp/language/tibetan.el index 4be25cecab..b42a1e8fb8 100644 --- a/lisp/language/tibetan.el +++ b/lisp/language/tibetan.el @@ -549,19 +549,16 @@ tibetan-precomposition-rule-alist ("སྨ" . "ö°"))) (defconst tibetan-regexp - (let ((l (list tibetan-precomposed-transcription-alist - tibetan-consonant-transcription-alist - tibetan-vowel-transcription-alist - tibetan-modifier-transcription-alist - tibetan-subjoined-transcription-alist)) - (separator "\\|") - tail pattern) - (while l - (setq tail (car l) l (cdr l)) - (while tail - (setq pattern (cons separator (cons (car (car tail)) pattern)) - tail (cdr tail)))) - (apply 'concat (nreverse (cdr pattern)))) + (let (pattern) + (dolist (alist (list tibetan-precomposed-transcription-alist + tibetan-consonant-transcription-alist + tibetan-vowel-transcription-alist + tibetan-modifier-transcription-alist + tibetan-subjoined-transcription-alist) + (apply #'concat (nreverse (cdr pattern)))) + (dolist (key-val alist) + (setq pattern (cons "\\|" (cons (regexp-quote (car key-val)) + pattern)))))) "Regexp matching a Tibetan transcription of a composable Tibetan sequence. The result of matching is to be used for indexing alists at conversion from a roman transcription to the corresponding Tibetan character.") diff --git a/lisp/textmodes/rst.el b/lisp/textmodes/rst.el index 88c44c06da..b7438fbb10 100644 --- a/lisp/textmodes/rst.el +++ b/lisp/textmodes/rst.el @@ -388,8 +388,8 @@ rst-re-alist-def ; item tag. ;; Inline markup (`ilm') - (ilm-pfx (:alt "^" hws-prt "[-'\"([{<ââ«â/:]")) - (ilm-sfx (:alt "$" hws-prt "[]-'\")}>ââ»/:.,;!?\\]")) + (ilm-pfx (:alt "^" hws-prt "['\"([{<ââ«â/:-]")) + (ilm-sfx (:alt "$" hws-prt "[]'\")}>ââ»/:.,;!?\\-]")) ;; Inline markup content (`ilc') (ilcsgl-tag "\\S ") ; A single non-white character. @@ -431,7 +431,7 @@ rst-re-alist-def (fld-tag ":" fldnam-tag ":") ; A field marker. ;; Options (`opt') - (optsta-tag (:alt "[-+/]" "--")) ; Start of an option. + (optsta-tag (:alt "[+/-]" "--")) ; Start of an option. (optnam-tag "\\sw" (:alt "-" "\\sw") "*") ; Name of an option. (optarg-tag (:shy "[ =]\\S +")) ; Option argument. (optsep-tag (:shy "," hws-prt)) ; Separator between options. @@ -457,7 +457,7 @@ rst-re-alist-def ; tag. ;; Symbol (`sym') - (sym-prt "[-+.:_]") ; Non-word part of a symbol. + (sym-prt "[+.:_-]") ; Non-word part of a symbol. (sym-tag (:shy "\\sw+" (:shy sym-prt "\\sw+") "*")) ;; URIs (`uri') -- 2.21.0 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: Emacs regexp scan (Sep 29) 2019-10-04 21:42 ` Paul Eggert @ 2019-10-05 8:10 ` Eli Zaretskii 2019-10-05 9:37 ` Mattias Engdegård ` (2 more replies) 2019-10-05 10:03 ` Mattias Engdegård 1 sibling, 3 replies; 24+ messages in thread From: Eli Zaretskii @ 2019-10-05 8:10 UTC (permalink / raw) To: Paul Eggert; +Cc: mattiase, emacs-devel > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Fri, 4 Oct 2019 14:42:41 -0700 > Cc: emacs-devel <emacs-devel@gnu.org> > > Thanks, I installed the attached patch, which I hope fixes all the bugs > and style glitches uncovered by that scan, along with some other style > glitches I noticed in the neighborhood. I question the need for "fixing" those "style glitches" (and even the very existence of a "style glitch", which sounds like a contradiction of terms to me). > (defconst iso8601--year-match > - "\\([-+]\\)?\\([0-9][0-9][0-9][0-9]\\)") > + "\\([+-]\\)?\\([0-9][0-9][0-9][0-9]\\)") What is the purpose of this and other similar changes? AFAIK, both variants are valid, so it sounds like your personal stylistic preference is for the latter. Is that the only reason? If so, let's please not make changes where this is the only reason, as doing so risks breaking code (due to typos) and complicates forensics, and otherwise serves no useful purpose. > (defconst tibetan-regexp > - (let ((l (list tibetan-precomposed-transcription-alist > - tibetan-consonant-transcription-alist > - tibetan-vowel-transcription-alist > - tibetan-modifier-transcription-alist > - tibetan-subjoined-transcription-alist)) > - (separator "\\|") > - tail pattern) > - (while l > - (setq tail (car l) l (cdr l)) > - (while tail > - (setq pattern (cons separator (cons (car (car tail)) pattern)) > - tail (cdr tail)))) > - (apply 'concat (nreverse (cdr pattern)))) > + (let (pattern) > + (dolist (alist (list tibetan-precomposed-transcription-alist > + tibetan-consonant-transcription-alist > + tibetan-vowel-transcription-alist > + tibetan-modifier-transcription-alist > + tibetan-subjoined-transcription-alist) > + (apply #'concat (nreverse (cdr pattern)))) > + (dolist (key-val alist) > + (setq pattern (cons "\\|" (cons (regexp-quote (car key-val)) > + pattern)))))) > "Regexp matching a Tibetan transcription of a composable Tibetan sequence. This non-trivial change is documented as * lisp/language/tibetan.el (tibetan-regexp): Quote `+' in regexp to pacify the regexp scanner. Simplify. If the regexp scanner needs to be pacified, isn't it better to fix the scanner instead? I also don't think I see the simplification here. In fact, the original code looks simpler to me than the new one, as the former is just a simple while loop, whereas the latter is a nested dolist. Thanks. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Emacs regexp scan (Sep 29) 2019-10-05 8:10 ` Eli Zaretskii @ 2019-10-05 9:37 ` Mattias Engdegård 2019-10-05 10:49 ` Eli Zaretskii 2019-10-05 9:52 ` Paul Eggert 2019-10-05 16:59 ` Lars Ingebrigtsen 2 siblings, 1 reply; 24+ messages in thread From: Mattias Engdegård @ 2019-10-05 9:37 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Paul Eggert, emacs-devel 5 okt. 2019 kl. 10.10 skrev Eli Zaretskii <eliz@gnu.org>: > If the regexp scanner needs to be pacified, isn't it better to fix the > scanner instead? The check is there because it is often useful. Relint/xr has been detecting and complaining about non-escaped use of special characters such as +, *, ?, ^ and $ for some time now, and for good reason: it's an error-prone exploitation of a hole in the syntax. We think that "*.^" is better written "\\*.\\^" because the latter is more regular, less likely to break when modified, and tells the reader that no, it isn't a mistake, the programmer knows what he is doing. Such non-essential escaping has been added many times before, and it has never been controversial in the slightest. > I also don't think I see the simplification here. In fact, the > original code looks simpler to me than the new one, as the former is > just a simple while loop, whereas the latter is a nested dolist. Actually the original was a nested pair of while loops, which indicates that it wasn't quite that simple. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Emacs regexp scan (Sep 29) 2019-10-05 9:37 ` Mattias Engdegård @ 2019-10-05 10:49 ` Eli Zaretskii 2019-10-05 15:16 ` Stefan Monnier 0 siblings, 1 reply; 24+ messages in thread From: Eli Zaretskii @ 2019-10-05 10:49 UTC (permalink / raw) To: Mattias Engdegård; +Cc: eggert, emacs-devel > From: Mattias Engdegård <mattiase@acm.org> > Date: Sat, 5 Oct 2019 11:37:45 +0200 > Cc: Paul Eggert <eggert@cs.ucla.edu>, emacs-devel@gnu.org > > 5 okt. 2019 kl. 10.10 skrev Eli Zaretskii <eliz@gnu.org>: > > > If the regexp scanner needs to be pacified, isn't it better to fix the > > scanner instead? > > The check is there because it is often useful. Relint/xr has been detecting and complaining about non-escaped use of special characters such as +, *, ?, ^ and $ for some time now, and for good reason: it's an error-prone exploitation of a hole in the syntax. We think that "*.^" is better written "\\*.\\^" because the latter is more regular, less likely to break when modified, and tells the reader that no, it isn't a mistake, the programmer knows what he is doing. > > Such non-essential escaping has been added many times before, and it has never been controversial in the slightest. I'm not sure I understand: is there a technical reason for producing what sounds like a false positive in these cases, or is this done with some principle in mind? If the latter, what is the underlying principle? In general, I'd rather we didn't flag valid constructs if we can, as doing that is an annoyance for programmers. In particular, I find nothing wrong with "*.^", and I find "\\*.\\^" harder to read. > > I also don't think I see the simplification here. In fact, the > > original code looks simpler to me than the new one, as the former is > > just a simple while loop, whereas the latter is a nested dolist. > > Actually the original was a nested pair of while loops, which indicates that it wasn't quite that simple. It's quite simple when I read it. So this appears to be a stylistic preference again. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Emacs regexp scan (Sep 29) 2019-10-05 10:49 ` Eli Zaretskii @ 2019-10-05 15:16 ` Stefan Monnier 2019-10-05 16:02 ` Eli Zaretskii 0 siblings, 1 reply; 24+ messages in thread From: Stefan Monnier @ 2019-10-05 15:16 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Mattias Engdegård, eggert, emacs-devel > In general, I'd rather we didn't flag valid constructs if we can, as > doing that is an annoyance for programmers. In particular, I find > nothing wrong with "*.^", and I find "\\*.\\^" harder to read. It's more error prone, e.g. if you later concat some prefix to the regexp. Stefan ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Emacs regexp scan (Sep 29) 2019-10-05 15:16 ` Stefan Monnier @ 2019-10-05 16:02 ` Eli Zaretskii 0 siblings, 0 replies; 24+ messages in thread From: Eli Zaretskii @ 2019-10-05 16:02 UTC (permalink / raw) To: Stefan Monnier; +Cc: mattiase, eggert, emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: Mattias Engdegård <mattiase@acm.org>, > eggert@cs.ucla.edu, > emacs-devel@gnu.org > Date: Sat, 05 Oct 2019 11:16:04 -0400 > > > In general, I'd rather we didn't flag valid constructs if we can, as > > doing that is an annoyance for programmers. In particular, I find > > nothing wrong with "*.^", and I find "\\*.\\^" harder to read. > > It's more error prone, e.g. if you later concat some prefix to the regexp. No one says such concatenation will ever happen. Besides, any change requires one to make sure the result is valid anyway. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Emacs regexp scan (Sep 29) 2019-10-05 8:10 ` Eli Zaretskii 2019-10-05 9:37 ` Mattias Engdegård @ 2019-10-05 9:52 ` Paul Eggert 2019-10-05 10:59 ` Eli Zaretskii 2019-10-05 16:59 ` Lars Ingebrigtsen 2 siblings, 1 reply; 24+ messages in thread From: Paul Eggert @ 2019-10-05 9:52 UTC (permalink / raw) To: Eli Zaretskii; +Cc: mattiase, emacs-devel On 10/5/19 1:10 AM, Eli Zaretskii wrote: > both variants are valid, so it sounds like your personal stylistic > preference is for the latter. Is that the only reason? No. The "Special Characters in Regular Expressions" subsubsection of the Elisp manual recommends putting '-' at the end of bracket expressions, since putting '-' elsewhere can cause confusion (one or two examples of which caused bugs that was fixed in that patch). The patch did not systematically put "-" at the end of every bracket expression, only those reasonably near bugs and other glitches. > If the regexp scanner needs to be pacified, isn't it better to fix the > scanner instead? No, as the regexp scanner was right here: it complained about unescaped literal "+" in regexps, which is poor practice (and is documented as poor practice in the Elisp manual). > the > original code looks simpler to me than the new one, as the former is > just a simple while loop, whereas the latter is a nested dolist. Actually the old code was so confusing that it was a bit hard to see that it was a nested loop (not a simple while loop). I originally had the same confusion that you did, which is why I redid the loops to make the nesting more obvious and to lessen the number of lines of code. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Emacs regexp scan (Sep 29) 2019-10-05 9:52 ` Paul Eggert @ 2019-10-05 10:59 ` Eli Zaretskii 2019-10-05 15:20 ` Stefan Monnier 2019-10-05 19:19 ` Paul Eggert 0 siblings, 2 replies; 24+ messages in thread From: Eli Zaretskii @ 2019-10-05 10:59 UTC (permalink / raw) To: Paul Eggert; +Cc: mattiase, emacs-devel > Cc: mattiase@acm.org, emacs-devel@gnu.org > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Sat, 5 Oct 2019 02:52:34 -0700 > > On 10/5/19 1:10 AM, Eli Zaretskii wrote: > > both variants are valid, so it sounds like your personal stylistic > > preference is for the latter. Is that the only reason? > > No. The "Special Characters in Regular Expressions" subsubsection of the Elisp > manual recommends putting '-' at the end of bracket expressions, since putting > '-' elsewhere can cause confusion (one or two examples of which caused bugs that > was fixed in that patch). The patch did not systematically put "-" at the end of > every bracket expression, only those reasonably near bugs and other glitches. The text in the manual says: To include a ‘-’, write ‘-’ as the first or last character of the character alternative, or as the upper bound of a range. That's it. And I personally can see no confusion in the likes of "[-+]", whereas I did need to consult the manual to learn that "[+-]" is also right. So at least for me, the confusion worked the other way around. > > If the regexp scanner needs to be pacified, isn't it better to fix the > > scanner instead? > > No, as the regexp scanner was right here: it complained about unescaped literal > "+" in regexps, which is poor practice (and is documented as poor practice in > the Elisp manual). Poor, but correct. Look, this scanner is a good tool, but forcing a particular set of stylistic preferences on the project by using that tool, let alone silently so, is more than we have bid for. If you have strong enough opinions on that, let's discuss this first, document the agreements in the manual, and only apply them after that. > > the > > original code looks simpler to me than the new one, as the former is > > just a simple while loop, whereas the latter is a nested dolist. > > Actually the old code was so confusing that it was a bit hard to see that it was > a nested loop (not a simple while loop). I originally had the same confusion > that you did, which is why I redid the loops to make the nesting more obvious > and to lessen the number of lines of code. That's again a personal stylistic preference that I'd rather not regard as a reason strong enough to change correct code. If anything, it runs the risk of introducing bugs that can lie low years before they are detected. If you think the code might be incorrect, find an test case where it misbehaves, then modifying it will be justified. In general, I find that we make too many changes whose justification is weak at best and non-existent at worst. This both wastes our limited resources and risks introducing bugs for no good reason. Let's try to restrain ourselves and not make changes just because we can. Emacs is an old and stable program; we should try not to destabilize it unless we fix real bugs or introduce new features. Thanks. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Emacs regexp scan (Sep 29) 2019-10-05 10:59 ` Eli Zaretskii @ 2019-10-05 15:20 ` Stefan Monnier 2019-10-05 16:03 ` Eli Zaretskii 2019-10-05 19:19 ` Paul Eggert 1 sibling, 1 reply; 24+ messages in thread From: Stefan Monnier @ 2019-10-05 15:20 UTC (permalink / raw) To: Eli Zaretskii; +Cc: mattiase, Paul Eggert, emacs-devel > That's it. And I personally can see no confusion in the likes of > "[-+]", whereas I did need to consult the manual to learn that "[+-]" > is also right. So at least for me, the confusion worked the other way > around. As discussed here a while back, putting `-` as the last char is actually the only choice which always works (because it can't be the first char when that first char has to be `]`), so I recommend to always put `-` as the last char. Stefan ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Emacs regexp scan (Sep 29) 2019-10-05 15:20 ` Stefan Monnier @ 2019-10-05 16:03 ` Eli Zaretskii 2019-10-06 13:42 ` Stefan Monnier 0 siblings, 1 reply; 24+ messages in thread From: Eli Zaretskii @ 2019-10-05 16:03 UTC (permalink / raw) To: Stefan Monnier; +Cc: mattiase, eggert, emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: Paul Eggert <eggert@cs.ucla.edu>, mattiase@acm.org, emacs-devel@gnu.org > Date: Sat, 05 Oct 2019 11:20:24 -0400 > > > That's it. And I personally can see no confusion in the likes of > > "[-+]", whereas I did need to consult the manual to learn that "[+-]" > > is also right. So at least for me, the confusion worked the other way > > around. > > As discussed here a while back, putting `-` as the last char is actually > the only choice which always works (because it can't be the first char > when that first char has to be `]`), so I recommend to always put `-` as > the last char. That's unrelated. I was talking about modifying _existing_ code, not about writing new code. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Emacs regexp scan (Sep 29) 2019-10-05 16:03 ` Eli Zaretskii @ 2019-10-06 13:42 ` Stefan Monnier 2019-10-06 18:01 ` Eli Zaretskii 0 siblings, 1 reply; 24+ messages in thread From: Stefan Monnier @ 2019-10-06 13:42 UTC (permalink / raw) To: Eli Zaretskii; +Cc: mattiase, eggert, emacs-devel > That's unrelated. I was talking about modifying _existing_ code, not > about writing new code. In my experience, a lot of new code is written by looking at old code (witness all the quoted lambdas around that can be found in "new" code). Stefan ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Emacs regexp scan (Sep 29) 2019-10-06 13:42 ` Stefan Monnier @ 2019-10-06 18:01 ` Eli Zaretskii 0 siblings, 0 replies; 24+ messages in thread From: Eli Zaretskii @ 2019-10-06 18:01 UTC (permalink / raw) To: Stefan Monnier; +Cc: mattiase, eggert, emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: eggert@cs.ucla.edu, mattiase@acm.org, emacs-devel@gnu.org > Date: Sun, 06 Oct 2019 09:42:31 -0400 > > > That's unrelated. I was talking about modifying _existing_ code, not > > about writing new code. > > In my experience, a lot of new code is written by looking at old code > (witness all the quoted lambdas around that can be found in "new" code). Which is why we have code reviews. Witness the many messages by yourself and others that comment on those practices we are unhappy about. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Emacs regexp scan (Sep 29) 2019-10-05 10:59 ` Eli Zaretskii 2019-10-05 15:20 ` Stefan Monnier @ 2019-10-05 19:19 ` Paul Eggert 2019-10-05 19:31 ` Eli Zaretskii 1 sibling, 1 reply; 24+ messages in thread From: Paul Eggert @ 2019-10-05 19:19 UTC (permalink / raw) To: Eli Zaretskii; +Cc: mattiase, emacs-devel On 10/5/19 3:59 AM, Eli Zaretskii wrote: > The text in the manual says: > > To include a ‘-’, write ‘-’ as the first or last character of the > character alternative, or as the upper bound of a range. That quote is from an older version of the manual, which was amended after some discussion on this very point. The manual now distinguishes more carefully between what's recommended, and what's valid even though it may cause confusion. The manual now recommends putting literal '-' at the end and says that putting it elsewhere is not the best style even when it is valid. I take your point about not modifying Emacs just for the fun of it. However, when bugs are being fixed, it's OK if the person doing the fix takes the opportunity to make nearby code clearer, as the added clarity can increase the probability of the fix being correct and can simplify later maintenance. Although one can go overboard in the process and it's sometimes better to leave code alone even when it is confusing, generally speaking we should delegate these minor decisions to whoever's actually doing the work, in order to encourage bug-fixers rather than throw cold water on them. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Emacs regexp scan (Sep 29) 2019-10-05 19:19 ` Paul Eggert @ 2019-10-05 19:31 ` Eli Zaretskii 2019-10-05 19:50 ` Paul Eggert 0 siblings, 1 reply; 24+ messages in thread From: Eli Zaretskii @ 2019-10-05 19:31 UTC (permalink / raw) To: Paul Eggert; +Cc: mattiase, emacs-devel > Cc: mattiase@acm.org, emacs-devel@gnu.org > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Sat, 5 Oct 2019 12:19:16 -0700 > > I take your point about not modifying Emacs just for the fun of it. However, > when bugs are being fixed, it's OK if the person doing the fix takes the > opportunity to make nearby code clearer, as the added clarity can increase the > probability of the fix being correct and can simplify later maintenance. Sorry, I disagree. Especially since "nearby" was IMO interpreted very leniently here. We only agreed to do "nearby" fixes that basically don't affect the code to be executed, such as whitespace changes, typo corrections, removal of the 'register' qualifier, etc. > Although one can go overboard in the process and it's sometimes better to leave > code alone even when it is confusing The code in question was not confusing by any measure, not IMO. And what's confusing for someone could not be confusing for others. E.g., I consider the new code in tibetan.el "confusing", so should I go ahead and change it back to what it was before? I don't want us to start disputes over stylistic issues, so let's respect stylistic preferences of whoever wrote the code, especially if changing them means a real non-trivial code change. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Emacs regexp scan (Sep 29) 2019-10-05 19:31 ` Eli Zaretskii @ 2019-10-05 19:50 ` Paul Eggert 2019-10-06 17:19 ` Eli Zaretskii 0 siblings, 1 reply; 24+ messages in thread From: Paul Eggert @ 2019-10-05 19:50 UTC (permalink / raw) To: Eli Zaretskii; +Cc: mattiase, emacs-devel On 10/5/19 12:31 PM, Eli Zaretskii wrote: > I consider the new code in tibetan.el "confusing", so should I go > ahead and change it back to what it was before? Sure, feel free. It's certainly not worth arguing about. But don't lose the regexp-quote fix while you change it back. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Emacs regexp scan (Sep 29) 2019-10-05 19:50 ` Paul Eggert @ 2019-10-06 17:19 ` Eli Zaretskii 2019-10-06 17:33 ` Paul Eggert 0 siblings, 1 reply; 24+ messages in thread From: Eli Zaretskii @ 2019-10-06 17:19 UTC (permalink / raw) To: Paul Eggert; +Cc: mattiase, emacs-devel > Cc: mattiase@acm.org, emacs-devel@gnu.org > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Sat, 5 Oct 2019 12:50:59 -0700 > > On 10/5/19 12:31 PM, Eli Zaretskii wrote: > > I consider the new code in tibetan.el "confusing", so should I go > > ahead and change it back to what it was before? > > Sure, feel free. I'd like to try to avoid the need for that in the future, because it leaves bad taste in people's mouth, and sometimes wastes time an energy. Can we please agree to make stylistic changes only if either (a) the change is in the code which is being modified/fixed anyway, or (b) the change doesn't affect the code produced from the modified source? ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Emacs regexp scan (Sep 29) 2019-10-06 17:19 ` Eli Zaretskii @ 2019-10-06 17:33 ` Paul Eggert 2019-10-06 18:53 ` Eli Zaretskii 0 siblings, 1 reply; 24+ messages in thread From: Paul Eggert @ 2019-10-06 17:33 UTC (permalink / raw) To: Eli Zaretskii; +Cc: mattiase, emacs-devel On 10/6/19 10:19 AM, Eli Zaretskii wrote: > Can we please agree to make stylistic changes only if either (a) the > change is in the code which is being modified/fixed anyway, or (b) the > change doesn't affect the code produced from the modified source? That's what I did in this particular instance. Obviously you disagree, but that's what I thought at the time, and still do. Let's just agree to disagree; it's not worth arguing about. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Emacs regexp scan (Sep 29) 2019-10-06 17:33 ` Paul Eggert @ 2019-10-06 18:53 ` Eli Zaretskii 2019-10-06 19:19 ` Paul Eggert 0 siblings, 1 reply; 24+ messages in thread From: Eli Zaretskii @ 2019-10-06 18:53 UTC (permalink / raw) To: Paul Eggert; +Cc: mattiase, emacs-devel > Cc: mattiase@acm.org, emacs-devel@gnu.org > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Sun, 6 Oct 2019 10:33:34 -0700 > > On 10/6/19 10:19 AM, Eli Zaretskii wrote: > > Can we please agree to make stylistic changes only if either (a) the > > change is in the code which is being modified/fixed anyway, or (b) the > > change doesn't affect the code produced from the modified source? > > That's what I did in this particular instance. Was it (a) or (b), according to your interpretation? > Obviously you disagree, but that's what I thought at the time, and > still do. Let's just agree to disagree; it's not worth arguing > about. We cannot agree to disagree on this, because it means we will keep bumping at this and arguing about such changes in the future. I'm trying to find a compromise that we all could live with. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Emacs regexp scan (Sep 29) 2019-10-06 18:53 ` Eli Zaretskii @ 2019-10-06 19:19 ` Paul Eggert 2019-10-06 19:31 ` Eli Zaretskii 0 siblings, 1 reply; 24+ messages in thread From: Paul Eggert @ 2019-10-06 19:19 UTC (permalink / raw) To: Eli Zaretskii; +Cc: mattiase, emacs-devel On 10/6/19 11:53 AM, Eli Zaretskii wrote: > Was it (a) or (b), according to your interpretation? Partly (a) and partly (b). Really, it doesn't matter. > we will keep > bumping at this and arguing about such changes in the future. I'm not arguing about the change now. As I said, please feel free to revert whatever part of it that you dislike (though I suggest keeping the bug fixes). I don't think unimportant little cases like this are worth either your time or mine (though perhaps we'll have to disagree about that as well :-). ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Emacs regexp scan (Sep 29) 2019-10-06 19:19 ` Paul Eggert @ 2019-10-06 19:31 ` Eli Zaretskii 0 siblings, 0 replies; 24+ messages in thread From: Eli Zaretskii @ 2019-10-06 19:31 UTC (permalink / raw) To: Paul Eggert; +Cc: mattiase, emacs-devel > Cc: mattiase@acm.org, emacs-devel@gnu.org > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Sun, 6 Oct 2019 12:19:17 -0700 > > > we will keep > > bumping at this and arguing about such changes in the future. > > I'm not arguing about the change now. As I said, please feel free to revert > whatever part of it that you dislike (though I suggest keeping the bug fixes). I > don't think unimportant little cases like this are worth either your time or > mine (though perhaps we'll have to disagree about that as well :-). Would you prefer that I asked you in similar cases in the future to revert the relevant parts? ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Emacs regexp scan (Sep 29) 2019-10-05 8:10 ` Eli Zaretskii 2019-10-05 9:37 ` Mattias Engdegård 2019-10-05 9:52 ` Paul Eggert @ 2019-10-05 16:59 ` Lars Ingebrigtsen 2019-10-05 18:52 ` Paul Eggert 2 siblings, 1 reply; 24+ messages in thread From: Lars Ingebrigtsen @ 2019-10-05 16:59 UTC (permalink / raw) To: Eli Zaretskii; +Cc: mattiase, Paul Eggert, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: >> Thanks, I installed the attached patch, which I hope fixes all the bugs >> and style glitches uncovered by that scan, along with some other style >> glitches I noticed in the neighborhood. > > I question the need for "fixing" those "style glitches" (and even the > very existence of a "style glitch", which sounds like a contradiction > of terms to me). What's confusing to me is why regexps in particular are being subjected to these stylistic rewrites. There's a bunch of styles being used in Emacs Lisp code, but we avoid doing mass fixups (according to one preference or another) of working code, because non-functional code churn makes maintenance more difficult. I mean, I prefer (when foo (do-something)) over (if foo (do-something)) or (and foo (do-something)) but I hope that nobody would do a mass-rewrite of the Emacs code base to prefer `when' here, because it'd make diving into the history of the code a nightmare. So what's so special about regexps that "[-+]" has to be mass-rewritten as "[+-]"? -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Emacs regexp scan (Sep 29) 2019-10-05 16:59 ` Lars Ingebrigtsen @ 2019-10-05 18:52 ` Paul Eggert 0 siblings, 0 replies; 24+ messages in thread From: Paul Eggert @ 2019-10-05 18:52 UTC (permalink / raw) To: Lars Ingebrigtsen, Eli Zaretskii; +Cc: mattiase, emacs-devel On 10/5/19 9:59 AM, Lars Ingebrigtsen wrote: > what's so special about regexps that "[-+]" has to be mass-rewritten > as "[+-]"? There's no mass rewriting here. I changed regexps near other (more-complicated) regexps that didn't put "-" at the end, regexps that were in fact incorrect evidently due to their authors' confusion. Using a consistent style in a trouble-prone area can help avoid future bugs when someone changes that code in the future. If you add regexps to Emacs in the future, I suggest putting literal "-" at the end of bracket expressions, since that's the style recommended in the manual for a solid technical reason, namely to avoid all-too-common confusion in this area that causes real bugs. That is why the regexp scanner flags these "-" instances: it finds real bugs, some of which were fixed in the most-recent patch in this area. Of course there's a cost-benefit analysis to doing lint checking like this. In this particular case, the number of actual bugs and the simplicity of pacifying the lint checker mean that the benefits of using the checker outweigh the cost. That being said, there's no pressing need to go through all the Emacs code and rewrite [-+] to [+-], just as there's no pressing need to go through all the Emacs code and remove the useless and distracting "register" keyword. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Emacs regexp scan (Sep 29) 2019-10-04 21:42 ` Paul Eggert 2019-10-05 8:10 ` Eli Zaretskii @ 2019-10-05 10:03 ` Mattias Engdegård 1 sibling, 0 replies; 24+ messages in thread From: Mattias Engdegård @ 2019-10-05 10:03 UTC (permalink / raw) To: Paul Eggert; +Cc: emacs-devel 4 okt. 2019 kl. 23.42 skrev Paul Eggert <eggert@cs.ucla.edu>: > > Thanks, I installed the attached patch, which I hope fixes all the bugs and style glitches uncovered by that scan, along with some other style glitches I noticed in the neighborhood. Thank you, I've reviewed it and found no errors. I also grepped for similar mistakes (missing \ or u in Unicode escapes) but didn't find anything else. Just a small complaint remaining: (defconst gurmukhi-composable-pattern [...] - ("Y" . "[\u0A2F-u0A30\u0A35\u0A39]") ; YA, RA, VA, HA + ("Y" . "[\u0A2F-\u0A30\u0A35\u0A39]") ; YA, RA, VA, HA The hyphen can be removed. (You can run the checker yourself by installing the `relint' package from GNU ELPA, and running M-x relint-directory RET <Emacs source dir> RET.) ^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads:[~2019-10-06 19:31 UTC | newest] Thread overview: 24+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2019-09-29 19:39 Emacs regexp scan (Sep 29) Mattias Engdegård 2019-10-04 21:42 ` Paul Eggert 2019-10-05 8:10 ` Eli Zaretskii 2019-10-05 9:37 ` Mattias Engdegård 2019-10-05 10:49 ` Eli Zaretskii 2019-10-05 15:16 ` Stefan Monnier 2019-10-05 16:02 ` Eli Zaretskii 2019-10-05 9:52 ` Paul Eggert 2019-10-05 10:59 ` Eli Zaretskii 2019-10-05 15:20 ` Stefan Monnier 2019-10-05 16:03 ` Eli Zaretskii 2019-10-06 13:42 ` Stefan Monnier 2019-10-06 18:01 ` Eli Zaretskii 2019-10-05 19:19 ` Paul Eggert 2019-10-05 19:31 ` Eli Zaretskii 2019-10-05 19:50 ` Paul Eggert 2019-10-06 17:19 ` Eli Zaretskii 2019-10-06 17:33 ` Paul Eggert 2019-10-06 18:53 ` Eli Zaretskii 2019-10-06 19:19 ` Paul Eggert 2019-10-06 19:31 ` Eli Zaretskii 2019-10-05 16:59 ` Lars Ingebrigtsen 2019-10-05 18:52 ` Paul Eggert 2019-10-05 10:03 ` Mattias Engdegård
Code repositories for project(s) associated with this public inbox https://git.savannah.gnu.org/cgit/emacs.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).