From: Paul Eggert
Newsgroups: gmane.emacs.devel
Subject: Re: Regexp error scan (March 26)
Date: Wed, 27 Mar 2019 11:50:29 -0700
Organization: UCLA Computer Science Department
To: Stefan Monnier, Noam Postavsky
Cc: Mattias Engdegård, Emacs developers
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------26475A2CBD129713274112FF"

This is a multi-part message in MIME format.
--------------26475A2CBD129713274112FF
Content-Type: text/plain; charset=utf-8

On 3/27/19 8:48 AM, Stefan Monnier wrote:
>> Or use string-to-list?
> That would be too obvious.

Thanks to everybody who helped improve that code with "obvious" changes
that weren't obvious to me. I installed the attached patches to try to
incorporate all the comments. I'm not sure what to do about
footnote.el's blithe overuse of "+" and "*" so I merely left a FIXME
comment for that, stealing its wording from Mattias's email.

I avoided regexp-opt before because its doc string implied that
(regexp-opt '("a" "a")) was invalid. The first patch attempts to fix
that confusion too.

--------------26475A2CBD129713274112FF
Content-Type: text/x-patch;
 name="0001-Use-regexp-opt-charset-to-improve-regexp-tweaks.patch"
Content-Disposition: attachment;
 filename*0="0001-Use-regexp-opt-charset-to-improve-regexp-tweaks.patch"

From 92acab73e0dd3921b53eac4f3fba327b7aa4d3aa Mon Sep 17 00:00:00 2001
From: Paul Eggert
Date: Wed, 27 Mar 2019 11:36:13 -0700
Subject: [PATCH] Use regexp-opt-charset to improve regexp tweaks

* lisp/emacs-lisp/regexp-opt.el (regexp-opt):
Reword confusing sentence in doc string.
* lisp/erc/erc.el (erc-lurker-maybe-trim):
* lisp/mail/footnote.el (footnote-hebrew-numeric-regex):
Improve by using regexp-opt-charset.
---
 lisp/emacs-lisp/regexp-opt.el |  6 +++---
 lisp/erc/erc.el               |  4 +---
 lisp/mail/footnote.el         | 12 ++++++++----
 3 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/lisp/emacs-lisp/regexp-opt.el b/lisp/emacs-lisp/regexp-opt.el
index fce6a47d98..d883752d71 100644
--- a/lisp/emacs-lisp/regexp-opt.el
+++ b/lisp/emacs-lisp/regexp-opt.el
@@ -86,9 +86,9 @@
 ;;;###autoload
 (defun regexp-opt (strings &optional paren keep-order)
   "Return a regexp to match a string in the list STRINGS.
-Each string should be unique in STRINGS and should not contain
-any regexps, quoted or not. Optional PAREN specifies how the
-returned regexp is surrounded by grouping constructs.
+Each member of STRINGS is treated as a fixed string, not as a regexp.
+Optional PAREN specifies how the returned regexp is surrounded by
+grouping constructs.
 
 If STRINGS is the empty list, the return value is a regexp that
 never matches anything.
diff --git a/lisp/erc/erc.el b/lisp/erc/erc.el
index e34487de27..d1fa5c7f12 100644
--- a/lisp/erc/erc.el
+++ b/lisp/erc/erc.el
@@ -67,7 +67,6 @@
 (load "erc-loaddefs" nil t)
 
 (eval-when-compile (require 'cl-lib))
-(require 'cl-seq)
 (require 'font-lock)
 (require 'pp)
 (require 'thingatpt)
@@ -2523,8 +2522,7 @@ erc-lurker-maybe-trim
 non-nil."
   (if erc-lurker-trim-nicks
       (replace-regexp-in-string
-       (regexp-opt (cl-delete-duplicates
-                    (mapcar #'char-to-string erc-lurker-ignore-chars)))
+       (regexp-opt-charset (string-to-list erc-lurker-ignore-chars))
        "" nick)
     nick))
 
diff --git a/lisp/mail/footnote.el b/lisp/mail/footnote.el
index 7f88e30120..81dc11de76 100644
--- a/lisp/mail/footnote.el
+++ b/lisp/mail/footnote.el
@@ -64,7 +64,6 @@
 ;;; Code:
 
 (eval-when-compile (require 'cl-lib))
-(require 'cl-seq)
 (defvar filladapt-token-table)
 
 (defgroup footnote nil
@@ -364,9 +363,9 @@ footnote-hebrew-numeric
     ("ק" "ר" "ש" "ת" "תק" "תר" "תש" "תת" "תתק")))
 
 (defconst footnote-hebrew-numeric-regex
-  (concat "[" (cl-delete-duplicates
-               (apply #'concat (apply #'append footnote-hebrew-numeric)))
-          "']+"))
+  (let ((numchars (string-to-list
+                   (apply #'concat (apply #'append footnote-hebrew-numeric)))))
+    (concat (regexp-opt-charset (cons ?' numchars)) "+")))
 ;; (defconst footnote-hebrew-numeric-regex "\\([אבגדהוזחט]'\\)?\\(ת\\)?\\(ת\\)?\\([קרשת]\\)?\\([טיכלמנסעפצ]\\)?\\([אבגדהוזחט]\\)?")
 
 (defun footnote--hebrew-numeric (n)
@@ -464,6 +463,11 @@ footnote--current-regexp
                      (nth 0 footnote-style-alist)))))
     (concat
      ;; Hack to avoid repetition of repetition.
+     ;; FIXME: I'm not sure the added * makes sense at all; there is
+     ;; always a single number within the footnote-{start,end}-tag pairs.
+     ;; Worse, the code goes on and adds yet another + later on, in
+     ;; footnote-refresh-footnotes, just in case. That makes even less sense.
+     ;; Likely, both the * and the extra + should go away.
      (if (string-match "[^\\]\\\\\\{2\\}*[*+?]\\'" regexp)
          (substring regexp 0 -1)
        regexp)
-- 
2.20.1


--------------26475A2CBD129713274112FF
Content-Type: text/x-patch;
 name="0001-Tune-css-mode-regexp.patch"
Content-Disposition: attachment;
 filename="0001-Tune-css-mode-regexp.patch"

From df167575d1ac2d056c8a2ef1fc83d768c09a3d28 Mon Sep 17 00:00:00 2001
From: Paul Eggert
Date: Wed, 27 Mar 2019 11:43:18 -0700
Subject: [PATCH] Tune css-mode regexp
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* lisp/textmodes/css-mode.el (css--font-lock-keywords):
Omit unnecessary \(?: \) in regexp.
Suggested by Mattias Engdegård in:
https://lists.gnu.org/r/emacs-devel/2019-03/msg01042.html
---
 lisp/textmodes/css-mode.el | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lisp/textmodes/css-mode.el b/lisp/textmodes/css-mode.el
index d3ca2d9558..11a77b5bb7 100644
--- a/lisp/textmodes/css-mode.el
+++ b/lisp/textmodes/css-mode.el
@@ -892,7 +892,7 @@ css--font-lock-keywords
     (,(concat "@" css-ident-re) (0 font-lock-builtin-face))
     ;; Selectors.
     ;; Allow plain ":root" as a selector.
-    ("^[ \t]*\\(:root\\)\\(?:[\n \t]*\\){" (1 'css-selector keep))
+    ("^[ \t]*\\(:root\\)[\n \t]*{" (1 'css-selector keep))
     ;; FIXME: attribute selectors don't work well because they may contain
     ;; strings which have already been highlighted as f-l-string-face and
     ;; thus prevent this highlighting from being applied (actually now that
-- 
2.20.1


--------------26475A2CBD129713274112FF--
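
The behaviour these patches rely on can be checked in a scratch buffer. What follows is only a minimal sketch and is not part of either patch: my-ignore-chars and my-trim-nick are hypothetical stand-ins for erc-lurker-ignore-chars and erc-lurker-maybe-trim, and the sample value "`__" (with a deliberate duplicate) is an assumption made for illustration.

```elisp
(require 'regexp-opt)

;; Hypothetical stand-in for `erc-lurker-ignore-chars'.  The value is an
;; assumption for illustration and contains a duplicate on purpose.
(defvar my-ignore-chars "`__"
  "Characters to delete from a nick.")

(defun my-trim-nick (nick)
  "Return NICK with every character of `my-ignore-chars' deleted.
Hypothetical helper that mirrors the shape of the patched ERC code."
  (replace-regexp-in-string
   ;; `regexp-opt-charset' takes a list of characters.  Duplicates in
   ;; the list are harmless, so `cl-delete-duplicates' is not needed.
   (regexp-opt-charset (string-to-list my-ignore-chars))
   "" nick))

(my-trim-nick "nick`_")                        ; => "nick"

;; `regexp-opt' also accepts duplicate strings.
(string-match-p (regexp-opt '("a" "a")) "a")   ; => 0

;; The simplified css-mode regexp still matches a plain ":root" selector.
(string-match-p "^[ \t]*\\(:root\\)[\n \t]*{" ":root {")   ; => 0
```

Going through regexp-opt-charset rather than concatenating "[" and "]" by hand also leaves the quoting of characters that are special inside a bracket expression, such as "]", "^" and "-", to the library.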