From: "Mattias Engdegård" <mattiase@acm.org>
To: Michael Heerdegen <michael_heerdegen@web.de>
Cc: Stefan Monnier <monnier@iro.umontreal.ca>, emacs-devel@gnu.org
Subject: Re: regular expressions that match nothing
Date: Wed, 15 May 2019 23:07:11 +0200
Message-ID: <128EBFB8-78FF-47C3-8F28-C1EF91BFC4BB@acm.org>
In-Reply-To: <87a7fnzd3u.fsf@web.de>
[-- Attachment #1: Type: text/plain, Size: 302 bytes --]
On 15 May 2019, at 22:17, Michael Heerdegen <michael_heerdegen@web.de> wrote:
>
> Should there be an rx regexp form for this?
We don't necessarily need a special form for it; we can just make `(or)' work.
Proposed patch attached. (I also added its dual, (seq), since it would be silly not to.)
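For illustration, here is a minimal sketch of how the zero-argument forms would behave. It assumes an Emacs that has this patch applied (or a later release with equivalent behaviour) and that defines `regexp-unmatchable'. The sample strings are arbitrary and are not part of the patch.

```elisp
;; Sketch only: assumes an Emacs whose rx accepts zero-argument
;; `or' and `seq' (that is, with this patch applied) and which
;; defines `regexp-unmatchable'.
(require 'rx)
(require 'cl-lib)

;; (or) is the identity for alternation: it never matches,
;; not even against the empty string.
(cl-assert (equal (rx (or)) regexp-unmatchable))
(cl-assert (null (string-match-p (rx (or)) "any text")))
(cl-assert (null (string-match-p (rx (or)) "")))

;; (seq) is the identity for concatenation: it is the empty
;; regexp, which matches the empty string at position 0.
(cl-assert (equal (rx (seq)) ""))
(cl-assert (eql (string-match-p (rx (seq)) "any text") 0))
```

The main use I have in mind is generated rx forms: code that builds an `or' from a list no longer needs a special case for the empty list.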
[-- Attachment #2: 0001-Allow-zero-argument-rx-or-and-seq-forms.patch --]
[-- Type: application/octet-stream, Size: 3884 bytes --]
From b7706f5b398bb360ac1405842efe852ca89b9de8 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org>
Date: Wed, 15 May 2019 22:44:00 +0200
Subject: [PATCH] Allow zero-argument rx `or' and `seq' forms
Make the rx `or' and `seq' forms accept zero arguments to produce a
never-matching regexp and an empty string, respectively.
* lisp/emacs-lisp/rx.el (rx-constituents, rx-or): Permit zero args.
(rx): Amend doc string for `or' and `seq'.
* test/lisp/emacs-lisp/rx-tests.el (rx-or, rx-seq): Test the change.
* etc/NEWS (Changes in Specialized Modes and Packages): Mention the change.
---
etc/NEWS | 5 +++++
lisp/emacs-lisp/rx.el | 13 ++++++++-----
test/lisp/emacs-lisp/rx-tests.el | 8 +++++++-
3 files changed, 20 insertions(+), 6 deletions(-)
diff --git a/etc/NEWS b/etc/NEWS
index 699a04b524..5f3468596b 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -1310,6 +1310,11 @@ when given in a string. Previously, '(any "\x80-\xff")' would match
characters U+0080...U+00FF. Now the expression matches raw bytes in
the 128...255 range, as expected.
+*** The rx 'or' and 'seq' forms no longer require any arguments.
+The zero-argument forms (or) and (seq) are now permitted: (or)
+produces a regexp that never matches anything, while (seq) produces
+the empty string, each being an identity for its operation.
+
** Frames
+++
diff --git a/lisp/emacs-lisp/rx.el b/lisp/emacs-lisp/rx.el
index fdd24317c6..5437927b9e 100644
--- a/lisp/emacs-lisp/rx.el
+++ b/lisp/emacs-lisp/rx.el
@@ -111,11 +111,11 @@
;; FIXME: support macros.
(defvar rx-constituents ;Not `const' because some modes extend it.
- '((and . (rx-and 1 nil))
+ '((and . (rx-and 0 nil))
(seq . and) ; SRE
(: . and) ; SRE
(sequence . and) ; sregex
- (or . (rx-or 1 nil))
+ (or . (rx-or 0 nil))
(| . or) ; SRE
(not-newline . ".")
(nonl . not-newline) ; SRE
@@ -391,9 +391,11 @@ FORM is of the form `(and FORM1 ...)'."
"Parse and produce code from FORM, which is `(or FORM1 ...)'."
(rx-check form)
(rx-group-if
- (if (memq nil (mapcar 'stringp (cdr form)))
- (mapconcat (lambda (x) (rx-form x '|)) (cdr form) "\\|")
+ (cond
+ ((null (cdr form)) regexp-unmatchable)
+ ((cl-every #'stringp (cdr form))
(regexp-opt (cdr form) nil t))
+ (t (mapconcat (lambda (x) (rx-form x '|)) (cdr form) "\\|")))
(and (memq rx-parent '(: * t)) rx-parent)))
@@ -1122,6 +1124,7 @@ CHAR
`(seq SEXP1 SEXP2 ...)'
`(sequence SEXP1 SEXP2 ...)'
matches what SEXP1 matches, followed by what SEXP2 matches, etc.
+ Without arguments, matches the empty string.
`(submatch SEXP1 SEXP2 ...)'
`(group SEXP1 SEXP2 ...)'
@@ -1137,7 +1140,7 @@ CHAR
`(| SEXP1 SEXP2 ...)'
matches anything that matches SEXP1 or SEXP2, etc. If all
args are strings, use `regexp-opt' to optimize the resulting
- regular expression.
+ regular expression. Without arguments, never matches anything.
`(minimal-match SEXP)'
produce a non-greedy regexp for SEXP. Normally, regexps matching
diff --git a/test/lisp/emacs-lisp/rx-tests.el b/test/lisp/emacs-lisp/rx-tests.el
index 4a5919edf0..6f392d616d 100644
--- a/test/lisp/emacs-lisp/rx-tests.el
+++ b/test/lisp/emacs-lisp/rx-tests.el
@@ -107,7 +107,13 @@
"ab"))
(should (equal (and (string-match (rx (or "a" "ab" "abc")) s)
(match-string 0 s))
- "a"))))
+ "a")))
+ ;; Test zero-argument `or'.
+ (should (equal (rx (or)) regexp-unmatchable)))
+
+(ert-deftest rx-seq ()
+ ;; Test zero-argument `seq'.
+ (should (equal (rx (seq)) "")))
(provide 'rx-tests)
;; rx-tests.el ends here.
--
2.20.1 (Apple Git-117)