From: "Mattias Engdegård" <mattiase@acm.org>
To: 37659@debbugs.gnu.org
Subject: bug#37659: rx additions: anychar, unmatchable, unordered-or
Date: Wed, 9 Oct 2019 10:59:43 +0200
Message-ID: <7926BA83-93AB-47A5-875E-229BE7192874@acm.org>
In-Reply-To: <EEEF0800-40AC-4121-A55C-6B0C3804D566@acm.org>
[-- Attachment #1: Type: text/plain, Size: 193 bytes --]
Also consider changing the translation of anychar/anything from ".\\|\n" to "[^z-a]", which is faster (about 2×) and does not allocate regexp stack space. Previously, (* anything) would fail to match large strings.
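
To illustrate the two regexps mentioned above, here is a minimal sketch. It is not part of the patch: the helper `my-match-length` is a hypothetical name introduced only for this example. It relies on the documented behaviour that a reversed range such as z-a is empty, so its complement [^z-a] matches every character, newline included.

```elisp
;; Hypothetical helper, not part of the patch.
(defun my-match-length (regexp string)
  "Return the end of the match of REGEXP at the start of STRING, or nil."
  (and (eq (string-match regexp string) 0)
       (match-end 0)))

;; "." excludes newline, so ".*" stops at the end of the first line.
(my-match-length ".*" "ab\ncd")                 ; => 2

;; Old translation of anychar: an alternation of "." and newline.
(my-match-length "\\(?:.\\|\n\\)*" "ab\ncd")    ; => 5

;; Proposed translation: the complement of an empty character range.
(my-match-length "[^z-a]*" "ab\ncd")            ; => 5
```

Both translations accept the same strings on small inputs like this one; the claimed difference lies in speed and regexp stack usage on large inputs, which this sketch does not measure.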
[-- Attachment #2: 0004-Use-z-a-for-matching-any-character-anychar-anything-.patch --]
[-- Type: application/octet-stream, Size: 2502 bytes --]
From c72633774b375eaadd6117eb0b26fb9792fed1bd Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org>
Date: Wed, 9 Oct 2019 10:22:10 +0200
Subject: [PATCH 4/4] Use [^z-a] for matching any character (anychar/anything)
in rx
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
* lisp/emacs-lisp/rx.el (rx--translate-symbol):
* test/lisp/emacs-lisp/rx-tests.el (rx-any, rx-atoms):
Use [^z-a] instead of ".\\|\n" for anychar.
The new expression is faster (about 2×) and does not allocate regexp
stack space. For example, (0+ anychar) now matches strings of any
size.
---
lisp/emacs-lisp/rx.el | 2 +-
test/lisp/emacs-lisp/rx-tests.el | 6 +++---
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/lisp/emacs-lisp/rx.el b/lisp/emacs-lisp/rx.el
index 0b14144698..2f58033ffd 100644
--- a/lisp/emacs-lisp/rx.el
+++ b/lisp/emacs-lisp/rx.el
@@ -131,7 +131,7 @@ rx--translate-symbol
;; Use `list' instead of a quoted list to wrap the strings here,
;; since the return value may be mutated.
((or 'nonl 'not-newline 'any) (cons (list ".") t))
- ((or 'anychar 'anything) (rx--translate-form '(or nonl "\n")))
+ ((or 'anychar 'anything) (cons (list "[^z-a]") t))
('unmatchable (rx--empty))
((or 'bol 'line-start) (cons (list "^") 'lseq))
((or 'eol 'line-end) (cons (list "$") 'rseq))
diff --git a/test/lisp/emacs-lisp/rx-tests.el b/test/lisp/emacs-lisp/rx-tests.el
index bced74569f..a098784a85 100644
--- a/test/lisp/emacs-lisp/rx-tests.el
+++ b/test/lisp/emacs-lisp/rx-tests.el
@@ -134,9 +134,9 @@ rx-any
(should (equal (rx (not (any "!a" "0-8" digit nonascii)))
"[^!0-8a[:digit:][:nonascii:]]"))
(should (equal (rx (any) (not (any)))
- "\\`a\\`\\(?:.\\|\n\\)"))
+ "\\`a\\`[^z-a]"))
(should (equal (rx (any "") (not (any "")))
- "\\`a\\`\\(?:.\\|\n\\)")))
+ "\\`a\\`[^z-a]")))
(ert-deftest rx-pcase ()
(should (equal (pcase "a 1 2 3 1 1 b"
@@ -193,7 +193,7 @@ rx-repeat
(ert-deftest rx-atoms ()
(should (equal (rx anychar anything)
- "\\(?:.\\|\n\\)\\(?:.\\|\n\\)"))
+ "[^z-a][^z-a]"))
(should (equal (rx unmatchable)
"\\`a\\`"))
(should (equal (rx line-start not-newline nonl any line-end)
--
2.21.0 (Apple Git-122)