unofficial mirror of bug-gnu-emacs@gnu.org 
From: "Mattias Engdegård" <mattiase@acm.org>
To: 37659@debbugs.gnu.org
Subject: bug#37659: rx additions: anychar, unmatchable, unordered-or
Date: Wed, 9 Oct 2019 10:59:43 +0200
Message-ID: <7926BA83-93AB-47A5-875E-229BE7192874@acm.org>
In-Reply-To: <EEEF0800-40AC-4121-A55C-6B0C3804D566@acm.org>

[-- Attachment #1: Type: text/plain, Size: 193 bytes --]

Also consider changing the translation of anychar/anything from ".\\|\n" to "[^z-a]", which is faster (about 2×) and does not allocate regexp stack space. With the old translation, (* anything) fails to match large strings, since each repetition consumes regexp stack space.

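As a rough illustration of why the two forms are interchangeable for matching: "[^z-a]" is a negated character class over an empty range (z sorts after a), so it excludes nothing and matches any single character, newline included. The sketch below is not part of the patch; the `my-` names are hypothetical helpers introduced only for this example, and the old translation is written in the bracketed form that appears in the tests.

```elisp
;; Sketch only: the `my-' names are hypothetical, not part of rx.el.
(defconst my-anychar-old "\\(?:.\\|\n\\)"
  "Bracketed form of the previous `anychar' translation.")

(defconst my-anychar-new "[^z-a]"
  "Proposed translation: a negated class over the empty range z-a.")

(defun my-match-length (regexp string)
  "Return the end of the match of REGEXP at the start of STRING, or nil."
  (when (string-match (concat "\\`\\(?:" regexp "\\)") string)
    (match-end 0)))

;; Both forms match exactly one character, including a newline.
(my-match-length my-anychar-old "\n")   ; => 1
(my-match-length my-anychar-new "\n")   ; => 1

;; Neither form matches the empty string.
(my-match-length my-anychar-new "")     ; => nil

;; Repetition runs across line boundaries.
(my-match-length (concat my-anychar-new "*") "ab\ncd")  ; => 5
```

The example deliberately uses short strings; it shows equivalence of what is matched, not the speed or stack-usage difference described above.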

[-- Attachment #2: 0004-Use-z-a-for-matching-any-character-anychar-anything-.patch --]
[-- Type: application/octet-stream, Size: 2502 bytes --]

From c72633774b375eaadd6117eb0b26fb9792fed1bd Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org>
Date: Wed, 9 Oct 2019 10:22:10 +0200
Subject: [PATCH 4/4] Use [^z-a] for matching any character (anychar/anything)
 in rx
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* lisp/emacs-lisp/rx.el (rx--translate-symbol):
* test/lisp/emacs-lisp/rx-tests.el (rx-any, rx-atoms):
Use [^z-a] instead of ".\\|\n" for anychar.

The new expression is faster (about 2×) and does not allocate regexp
stack space.  For example, (0+ anychar) now matches strings of any
size.
---
 lisp/emacs-lisp/rx.el            | 2 +-
 test/lisp/emacs-lisp/rx-tests.el | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/lisp/emacs-lisp/rx.el b/lisp/emacs-lisp/rx.el
index 0b14144698..2f58033ffd 100644
--- a/lisp/emacs-lisp/rx.el
+++ b/lisp/emacs-lisp/rx.el
@@ -131,7 +131,7 @@ rx--translate-symbol
     ;; Use `list' instead of a quoted list to wrap the strings here,
     ;; since the return value may be mutated.
     ((or 'nonl 'not-newline 'any) (cons (list ".") t))
-    ((or 'anychar 'anything)      (rx--translate-form '(or nonl "\n")))
+    ((or 'anychar 'anything)      (cons (list "[^z-a]") t))
     ('unmatchable                 (rx--empty))
     ((or 'bol 'line-start)        (cons (list "^") 'lseq))
     ((or 'eol 'line-end)          (cons (list "$") 'rseq))
diff --git a/test/lisp/emacs-lisp/rx-tests.el b/test/lisp/emacs-lisp/rx-tests.el
index bced74569f..a098784a85 100644
--- a/test/lisp/emacs-lisp/rx-tests.el
+++ b/test/lisp/emacs-lisp/rx-tests.el
@@ -134,9 +134,9 @@ rx-any
   (should (equal (rx (not (any "!a" "0-8" digit nonascii)))
                  "[^!0-8a[:digit:][:nonascii:]]"))
   (should (equal (rx (any) (not (any)))
-                 "\\`a\\`\\(?:.\\|\n\\)"))
+                 "\\`a\\`[^z-a]"))
   (should (equal (rx (any "") (not (any "")))
-                 "\\`a\\`\\(?:.\\|\n\\)")))
+                 "\\`a\\`[^z-a]")))
 
 (ert-deftest rx-pcase ()
   (should (equal (pcase "a 1 2 3 1 1 b"
@@ -193,7 +193,7 @@ rx-repeat
 
 (ert-deftest rx-atoms ()
   (should (equal (rx anychar anything)
-                 "\\(?:.\\|\n\\)\\(?:.\\|\n\\)"))
+                 "[^z-a][^z-a]"))
   (should (equal (rx unmatchable)
                  "\\`a\\`"))
   (should (equal (rx line-start not-newline nonl any line-end)
-- 
2.21.0 (Apple Git-122)


Thread overview: 32+ messages
2019-10-08  9:36 bug#37659: rx additions: anychar, unmatchable, unordered-or Mattias Engdegård
2019-10-09  8:59 ` Mattias Engdegård [this message]
2019-10-11 23:07 ` bug#37659: Mattias Engdegård <mattiase <at> acm.org> Paul Eggert
2019-10-12 10:47   ` Mattias Engdegård
2019-10-13 16:52     ` Paul Eggert
2019-10-13 19:48       ` Mattias Engdegård
2019-10-22 15:14       ` bug#37659: rx additions: anychar, unmatchable, unordered-or Mattias Engdegård
2019-10-22 15:27         ` Robert Pluim
2019-10-22 17:33         ` Paul Eggert
2019-10-23  9:15           ` Mattias Engdegård
2019-10-23 23:14             ` Paul Eggert
2019-10-24  1:56               ` Drew Adams
2019-10-24  9:09                 ` Mattias Engdegård
2019-10-24 14:24                   ` Drew Adams
2019-10-24  9:17                 ` Phil Sainty
2019-10-24 14:32                   ` Drew Adams
2019-10-24  8:58               ` Mattias Engdegård
2019-10-27 11:53                 ` Mattias Engdegård
2020-02-11 12:57           ` Mattias Engdegård
2020-02-11 15:43             ` Eli Zaretskii
2020-02-11 19:17               ` Mattias Engdegård
2020-02-12  0:52                 ` Paul Eggert
2020-02-12 11:22                   ` Mattias Engdegård
2020-02-13 18:38                     ` Mattias Engdegård
2020-02-13 18:50                       ` Paul Eggert
2020-02-13 19:16                         ` Mattias Engdegård
2020-02-13 19:30                           ` Eli Zaretskii
2020-02-13 22:23                             ` Mattias Engdegård
2020-02-14  7:45                               ` Eli Zaretskii
2020-02-14 16:15                                 ` Paul Eggert
2020-02-14 20:49                                   ` Mattias Engdegård
2020-03-01 10:09                                   ` Mattias Engdegård