unofficial mirror of emacs-devel@gnu.org 
From: "Mattias Engdegård" <mattiase@acm.org>
To: Michael Heerdegen <michael_heerdegen@web.de>
Cc: Stefan Monnier <monnier@iro.umontreal.ca>, emacs-devel@gnu.org
Subject: Re: regular expressions that match nothing
Date: Wed, 15 May 2019 23:07:11 +0200
Message-ID: <128EBFB8-78FF-47C3-8F28-C1EF91BFC4BB@acm.org>
In-Reply-To: <87a7fnzd3u.fsf@web.de>

[-- Attachment #1: Type: text/plain, Size: 302 bytes --]

On 15 May 2019, at 22:17, Michael Heerdegen <michael_heerdegen@web.de> wrote:
> 
> Should there be an rx regexp form for this?

We don't necessarily need a special form for it; we can just make `(or)' work.

Proposed patch attached. (I also added its dual, (seq), since it would be silly not to.)
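
As a rough illustration of why the zero-argument forms are handy, here is a minimal sketch of building a regexp from a list that may be empty. It assumes an Emacs with this change applied and with `regexp-unmatchable' defined; `my-words-regexp' is a hypothetical helper made up for the example, not something in rx.el.

```elisp
;; Sketch only: assumes an Emacs where zero-argument `or' and `seq'
;; are accepted by rx and `regexp-unmatchable' is defined.
(require 'rx)

;; `my-words-regexp' is a hypothetical helper, not part of rx.el.
(defun my-words-regexp (words)
  "Return a regexp matching any string in WORDS, a possibly empty list."
  (rx-to-string `(or ,@words)))

;; No alternatives: the regexp matches nothing, not even the empty string.
(string-match-p (my-words-regexp nil) "foo bar")            ; => nil
(string-match-p (my-words-regexp nil) "")                   ; => nil

;; With alternatives it behaves as before.
(string-match-p (my-words-regexp '("bar" "baz")) "foo bar") ; => 4

;; (seq) matches the empty string, so it succeeds at position 0.
(string-match-p (rx (seq)) "foo bar")                       ; => 0
```

Without (or), the caller would have to special-case the empty list before splicing it into the form.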


[-- Attachment #2: 0001-Allow-zero-argument-rx-or-and-seq-forms.patch --]
[-- Type: application/octet-stream, Size: 3884 bytes --]

From b7706f5b398bb360ac1405842efe852ca89b9de8 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org>
Date: Wed, 15 May 2019 22:44:00 +0200
Subject: [PATCH] Allow zero-argument rx `or' and `seq' forms

Make the rx `or' and `seq' forms accept zero arguments to produce a
never-matching regexp and an empty string, respectively.

* lisp/emacs-lisp/rx.el (rx-constituents, rx-or): Permit zero args.
(rx): Amend doc string for `or' and `seq'.
* test/lisp/emacs-lisp/rx-tests.el (rx-or, rx-seq): Test the change.
* etc/NEWS (Changes in Specialized Modes and Packages): Mention the change.
---
 etc/NEWS                         |  5 +++++
 lisp/emacs-lisp/rx.el            | 13 ++++++++-----
 test/lisp/emacs-lisp/rx-tests.el |  8 +++++++-
 3 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/etc/NEWS b/etc/NEWS
index 699a04b524..5f3468596b 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -1310,6 +1310,11 @@ when given in a string.  Previously, '(any "\x80-\xff")' would match
 characters U+0080...U+00FF.  Now the expression matches raw bytes in
 the 128...255 range, as expected.
 
+*** The rx 'or' and 'seq' forms no longer require any arguments.
+The zero-argument forms (or) and (seq) are now permitted: (or)
+produces a regexp that never matches anything, while (seq) produces
+the empty string, each being an identity for its operation.
+
 ** Frames
 
 +++
diff --git a/lisp/emacs-lisp/rx.el b/lisp/emacs-lisp/rx.el
index fdd24317c6..5437927b9e 100644
--- a/lisp/emacs-lisp/rx.el
+++ b/lisp/emacs-lisp/rx.el
@@ -111,11 +111,11 @@
 ;; FIXME: support macros.
 
 (defvar rx-constituents              ;Not `const' because some modes extend it.
-  '((and		. (rx-and 1 nil))
+  '((and		. (rx-and 0 nil))
     (seq		. and)		; SRE
     (:			. and)		; SRE
     (sequence		. and)		; sregex
-    (or			. (rx-or 1 nil))
+    (or			. (rx-or 0 nil))
     (|			. or)		; SRE
     (not-newline	. ".")
     (nonl		. not-newline)	; SRE
@@ -391,9 +391,11 @@ FORM is of the form `(and FORM1 ...)'."
   "Parse and produce code from FORM, which is `(or FORM1 ...)'."
   (rx-check form)
   (rx-group-if
-   (if (memq nil (mapcar 'stringp (cdr form)))
-       (mapconcat (lambda (x) (rx-form x '|)) (cdr form) "\\|")
+   (cond
+    ((null (cdr form)) regexp-unmatchable)
+    ((cl-every #'stringp (cdr form))
      (regexp-opt (cdr form) nil t))
+    (t (mapconcat (lambda (x) (rx-form x '|)) (cdr form) "\\|")))
    (and (memq rx-parent '(: * t)) rx-parent)))
 
 
@@ -1122,6 +1124,7 @@ CHAR
 `(seq SEXP1 SEXP2 ...)'
 `(sequence SEXP1 SEXP2 ...)'
      matches what SEXP1 matches, followed by what SEXP2 matches, etc.
+     Without arguments, matches the empty string.
 
 `(submatch SEXP1 SEXP2 ...)'
 `(group SEXP1 SEXP2 ...)'
@@ -1137,7 +1140,7 @@ CHAR
 `(| SEXP1 SEXP2 ...)'
      matches anything that matches SEXP1 or SEXP2, etc.  If all
      args are strings, use `regexp-opt' to optimize the resulting
-     regular expression.
+     regular expression.  Without arguments, never matches anything.
 
 `(minimal-match SEXP)'
      produce a non-greedy regexp for SEXP.  Normally, regexps matching
diff --git a/test/lisp/emacs-lisp/rx-tests.el b/test/lisp/emacs-lisp/rx-tests.el
index 4a5919edf0..6f392d616d 100644
--- a/test/lisp/emacs-lisp/rx-tests.el
+++ b/test/lisp/emacs-lisp/rx-tests.el
@@ -107,7 +107,13 @@
                    "ab"))
     (should (equal (and (string-match (rx (or "a" "ab" "abc")) s)
                         (match-string 0 s))
-                   "a"))))
+                   "a")))
+  ;; Test zero-argument `or'.
+  (should (equal (rx (or)) regexp-unmatchable)))
+
+(ert-deftest rx-seq ()
+  ;; Test zero-argument `seq'.
+  (should (equal (rx (seq)) "")))
 
 (provide 'rx-tests)
 ;; rx-tests.el ends here.
-- 
2.20.1 (Apple Git-117)

