From: "Mattias Engdegård" <mattiase@acm.org>
To: 34641@debbugs.gnu.org
Subject: bug#34641: rx: (or ...) order unpredictable
Date: Sun, 24 Feb 2019 19:40:33 +0100
Message-ID: <836B8DC2-9358-40AC-83AF-7C4D960D9A53@acm.org>

The rx (or ...) construct sometimes reorders its subexpressions, which makes its semantics unpredictable. For example,

(rx (or "ab" "a") (or "a" "ab"))
=>
"\\(?:ab?\\)\\(?:ab?\\)"

The user reasonably expects (or e1 e2) to translate to E1\|E2, where ei translates to Ei, or a semantic equivalent. Not having this control makes rx useless or dangerous for many purposes.
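
To make the difference concrete, here is a minimal sketch of why the order of alternatives is observable. Emacs regexps try alternatives from left to right and take the first one that lets the overall match succeed, so "a\\|ab" and "ab\\|a" are not interchangeable. The helper my-match-length is a hypothetical name used only for illustration; it is not part of rx or regexp-opt.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-match-length (regexp string)
  "Return the length of the match of REGEXP anchored at the start of STRING.
Return nil if REGEXP does not match there."
  (save-match-data
    (when (string-match (concat "\\`\\(?:" regexp "\\)") string)
      (- (match-end 0) (match-beginning 0)))))

(my-match-length "ab\\|a" "ab")   ; => 2, "ab" is tried first
(my-match-length "a\\|ab" "ab")   ; => 1, "a" is tried first and succeeds
(my-match-length "ab?" "ab")      ; => 2, the form rx generated for both
```

The last line shows the problem: the generated "ab?" behaves like (or "ab" "a") even where the user wrote (or "a" "ab").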

The reason for the reordering is the use of regexp-opt behind the scenes. Whether rx is the place to do this kind of optimisation is a matter of opinion; mine is that it belongs in the regexp engine, where other, more aggressive optimisations (DFA, native-code generation, etc.) could be performed as well.

We could determine whether any string is a prefix of another. If not, regexp-opt should be safe to call. Alternatively, this check could be done in regexp-opt (activated by a flag). That would be my preferred short-term solution.
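
A rough sketch of the prefix check, assuming it is done on the rx side before calling regexp-opt. The names my-prefix-free-p and my-or-strings-regexp are hypothetical and not part of any existing API. The check relies on the fact that, after sorting, a string that is a prefix of any other element is also a prefix of its immediate successor. Duplicate strings are conservatively treated as prefixes of each other.

```elisp
;; Hypothetical helpers, for illustration only.
(defun my-prefix-free-p (strings)
  "Return non-nil if no element of STRINGS is a prefix of another element."
  (let ((sorted (sort (copy-sequence strings) #'string<))
        (ok t))
    ;; Only adjacent pairs need checking once the list is sorted.
    (while (and ok (cdr sorted))
      (when (string-prefix-p (car sorted) (cadr sorted))
        (setq ok nil))
      (setq sorted (cdr sorted)))
    ok))

(defun my-or-strings-regexp (strings)
  "Return a regexp for the non-empty list STRINGS that respects their order.
Use `regexp-opt' only when reordering cannot change the match."
  (if (my-prefix-free-p strings)
      (regexp-opt strings)
    ;; Fall back to a plain left-to-right alternation.
    (concat "\\(?:" (mapconcat #'regexp-quote strings "\\|") "\\)")))
```

With this, (my-or-strings-regexp '("a" "ab")) keeps "a" first, while a list such as '("foo" "bar") is still handed to regexp-opt.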

(Speaking of regexp-opt, it has another bug that does not affect rx: it returns the empty string if given an empty list of strings. The correct return value is a regexp that never matches anything. Fix it, document it, or turn it into an error?)
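
For the "fix it" option, one possible shape is a wrapper that special-cases the empty list. This is only a sketch: my-regexp-unmatchable and my-regexp-opt are hypothetical names, and the particular never-matching regexp is my own choice (it demands a character followed by the start of the string, which cannot happen).

```elisp
;; Hypothetical names, for illustration only.
(defconst my-regexp-unmatchable "\\`a\\`"
  "A regexp that cannot match any string.")

(defun my-regexp-opt (strings &optional paren)
  "Like `regexp-opt', but return a never-matching regexp for an empty list."
  (if (null strings)
      my-regexp-unmatchable
    (regexp-opt strings paren)))

;; The empty string as a regexp matches everywhere, which is the
;; opposite of what an empty set of alternatives should do.
(string-match "" "abc")                   ; => 0
(string-match (my-regexp-opt nil) "abc")  ; => nil
```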







Thread overview: 17+ messages
2019-02-24 18:40 Mattias Engdegård [this message]
2019-02-24 19:06 ` bug#34641: rx: (or ...) order unpredictable Eli Zaretskii
2019-02-24 21:18   ` Mattias Engdegård
2019-02-24 22:44     ` Basil L. Contovounesios
2019-02-25 14:26       ` Mattias Engdegård
2019-03-02 12:33     ` Eli Zaretskii
2019-03-02 14:05       ` Mattias Engdegård
2019-03-02 14:08         ` Mattias Engdegård
2019-03-02 14:23           ` Eli Zaretskii
2019-03-02 14:37             ` Mattias Engdegård
2019-03-02 23:48       ` Phil Sainty
2019-03-03  8:54         ` Mattias Engdegård
2019-03-07  9:00           ` Phil Sainty
2019-02-25  2:37 ` Noam Postavsky
2019-02-25  9:56   ` Mattias Engdegård
2019-02-25 14:43     ` Noam Postavsky
2019-02-25 14:48       ` Mattias Engdegård
