unofficial mirror of bug-gnu-emacs@gnu.org 
From: "Mattias Engdegård" <mattiase@acm.org>
To: "Basil L. Contovounesios" <contovob@tcd.ie>
Cc: 34641@debbugs.gnu.org
Subject: bug#34641: rx: (or ...) order unpredictable
Date: Mon, 25 Feb 2019 15:26:18 +0100
Message-ID: <759EA0BC-6EE9-4711-A5C9-C631207FF7E5@acm.org>
In-Reply-To: <87mumk957a.fsf@tcd.ie>

[-- Attachment #1: Type: text/plain, Size: 224 bytes --]

On 24 Feb 2019, at 23:44, Basil L. Contovounesios <contovob@tcd.ie> wrote:
> 
> FWIW, CC Mode has used "a\\`" since the following discussion:

Thank you, I'll use that then.
Here is a patch (to be applied after the other one).

[-- Attachment #2: 0001-Correct-regexp-opt-return-value-for-empty-string-lis.patch --]
[-- Type: application/octet-stream, Size: 3955 bytes --]

From 28f34a04513254c5bb4507ec6daa510e7ba166da Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org>
Date: Mon, 25 Feb 2019 15:22:02 +0100
Subject: [PATCH] Correct regexp-opt return value for empty string list

When regexp-opt is called with an empty list of strings, return a regexp
that doesn't match anything instead of the empty string (Bug#34641).

* doc/lispref/searching.texi (Regular Expression Functions):
* etc/NEWS:
Document the new behaviour.
* lisp/emacs-lisp/regexp-opt.el (regexp-opt):
Return a never-match regexp for empty inputs.
---
 doc/lispref/searching.texi    |  3 +++
 etc/NEWS                      |  6 ++++++
 lisp/emacs-lisp/regexp-opt.el | 23 +++++++++++++++--------
 3 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/doc/lispref/searching.texi b/doc/lispref/searching.texi
index 73a7304a3b..0b944a2711 100644
--- a/doc/lispref/searching.texi
+++ b/doc/lispref/searching.texi
@@ -960,6 +960,9 @@ possible.  A hand-tuned regular expression can sometimes be slightly
 more efficient, but is almost never worth the effort.}.
 @c E.g., see https://debbugs.gnu.org/2816
 
+If @var{strings} is empty, the return value is a regexp that never
+matches anything.
+
 The optional argument @var{paren} can be any of the following:
 
 @table @asis
diff --git a/etc/NEWS b/etc/NEWS
index 5f7616429b..6506a1c6b5 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -1624,6 +1624,12 @@ in any order.  If the new third argument is non-nil, the match is
 guaranteed to be performed in the order given, as if the strings were
 made into a regexp by joining them with '\|'.
 
++++
+** The function 'regexp-opt', when given an empty list of strings, now
+returns a regexp that never matches anything, which is an identity for
+this operation.  Previously, the empty string was returned in this
+case.
+
 \f
 * Changes in Emacs 27.1 on Non-Free Operating Systems
 
diff --git a/lisp/emacs-lisp/regexp-opt.el b/lisp/emacs-lisp/regexp-opt.el
index 33a5b770a0..107b453637 100644
--- a/lisp/emacs-lisp/regexp-opt.el
+++ b/lisp/emacs-lisp/regexp-opt.el
@@ -90,6 +90,9 @@ Each string should be unique in STRINGS and should not contain
 any regexps, quoted or not.  Optional PAREN specifies how the
 returned regexp is surrounded by grouping constructs.
 
+If STRINGS is empty, the return value is a regexp that never
+matches anything.
+
 The optional argument PAREN can be any of the following:
 
 a string
@@ -139,14 +142,18 @@ usually more efficient than that of a simplified version:
 	   (sorted-strings (delete-dups
 			    (sort (copy-sequence strings) 'string-lessp)))
 	   (re
-            ;; If NOREORDER is non-nil and the list contains a prefix
-            ;; of another string, we give up all attempts at optimisation.
-            ;; There is plenty of room for improvement (Bug#34641).
-            (if (and noreorder (regexp-opt--contains-prefix sorted-strings))
-                (concat (or open "\\(?:")
-                        (mapconcat #'regexp-quote strings "\\|")
-                        "\\)")
-              (regexp-opt-group sorted-strings (or open t) (not open)))))
+            (cond
+             ;; No strings: return a\` which cannot match anything.
+             ((null strings)
+              (concat (or open "\\(?:") "a\\`\\)"))
+             ;; If we cannot reorder, give up all attempts at
+             ;; optimisation.  There is room for improvement (Bug#34641).
+             ((and noreorder (regexp-opt--contains-prefix sorted-strings))
+              (concat (or open "\\(?:")
+                      (mapconcat #'regexp-quote strings "\\|")
+                      "\\)"))
+             (t
+              (regexp-opt-group sorted-strings (or open t) (not open))))))
       (cond ((eq paren 'words)
 	     (concat "\\<" re "\\>"))
 	    ((eq paren 'symbols)
-- 
2.17.2 (Apple Git-113)
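
For readers who want to see why a\` works as a never-matching regexp, here is a minimal sketch. It uses literal regexps as stand-ins for the patched `regexp-opt' output, so it does not depend on the patch being applied. The variable name `my-never-match' is hypothetical, and its value is what the new `cond' branch above would appear to build when PAREN is nil; treat that as an assumption rather than a guaranteed return value.

```elisp
;; Sketch only: `my-never-match' is a hypothetical name, and its value is
;; assumed from the new `cond' branch in the patch (PAREN nil).
(defvar my-never-match "\\(?:a\\`\\)"
  "Regexp assumed to be returned for an empty string list.")

;; A literal "a" followed by beginning-of-string can never occur,
;; so none of these match.
(string-match-p my-never-match "a")        ; => nil
(string-match-p my-never-match "")         ; => nil
(string-match-p my-never-match "banana")   ; => nil

;; Previous behaviour: the empty regexp matches at position 0 of anything.
(string-match-p "" "banana")               ; => 0

;; Identity under alternation: adding the never-match branch does not
;; change what "foo" matches.
(string-match-p (concat my-never-match "\\|foo") "xfoo")  ; => 1
(string-match-p "foo" "xfoo")                             ; => 1
```

This is the sense in which the NEWS entry calls the result an identity: joining it to other alternatives with \| leaves their matches unchanged, whereas the empty string would have matched everywhere.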

