From: Mattias Engdegård
Subject: bug#34641: rx: (or ...) order unpredictable
Date: Sun, 24 Feb 2019 19:40:33 +0100
Message-ID: <836B8DC2-9358-40AC-83AF-7C4D960D9A53@acm.org>
To: 34641@debbugs.gnu.org

The rx (or ...) construct sometimes reorders its subexpressions, which makes its semantics unpredictable. For example,

  (rx (or "ab" "a") (or "a" "ab")) => "\\(?:ab?\\)\\(?:ab?\\)"

The user reasonably expects (or e1 e2) to translate to E1\|E2, where ei translates to Ei, or a semantic equivalent.
Not having this control makes rx useless or dangerous for many purposes.

The reason for the reordering is the use of regexp-opt behind the scenes. Whether rx is the place to do this kind of optimisation is a matter of opinion; mine is that it belongs in the regexp engine, where other, more aggressive optimisations (DFA, native-code generation, etc.) could be performed as well.

We could determine whether any string is a prefix of another. If not, regexp-opt should be safe to call. Alternatively, this check could be done in regexp-opt (activated by a flag). That would be my preferred short-term solution.

(Speaking of regexp-opt, it has another bug that does not affect rx: it returns the empty string if given an empty list of strings. The correct return value is a regexp that never matches anything. Fix it, document it, or turn it into an error?)
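
To make the prefix check proposed above concrete, here is a minimal sketch of what it might look like. The names my-rx-prefix-conflict-p and my-rx-or-strings are hypothetical and not part of rx.el or regexp-opt.el; the sketch only handles the case where every alternative is a literal string, and it is deliberately conservative (equal strings count as a conflict). It is an illustration under those assumptions, not a proposed patch.

```elisp
;; Hypothetical helpers for illustration only; neither function exists
;; in rx.el or regexp-opt.el.

(defun my-rx-prefix-conflict-p (strings)
  "Return non-nil if any element of STRINGS is a prefix of another element.
Equal strings count as a conflict, which is conservative."
  (let ((conflict nil)
        (tail strings))
    (while (and tail (not conflict))
      (let ((s (car tail)))
        ;; Compare S against every later element, in both directions.
        (dolist (other (cdr tail))
          (when (or (string-prefix-p s other)
                    (string-prefix-p other s))
            (setq conflict t))))
      (setq tail (cdr tail)))
    conflict))

(defun my-rx-or-strings (strings)
  "Translate the literal alternatives STRINGS to a regexp string.
Call `regexp-opt' only when no string is a prefix of another;
otherwise keep the order given by the caller."
  (if (my-rx-prefix-conflict-p strings)
      ;; Ordered alternation: E1\|E2\|... exactly as written.
      (concat "\\(?:" (mapconcat #'regexp-quote strings "\\|") "\\)")
    ;; No prefix relation, so reordering cannot change which
    ;; alternative matches at a given position.
    (regexp-opt strings)))

;; Why the order matters: against "ab", the ordered alternation stops
;; after "a", while the optimised form consumes "ab".
(let ((s "ab"))
  (list (and (string-match "\\(?:a\\|ab\\)" s) (match-end 0))
        (and (string-match "\\(?:ab?\\)" s) (match-end 0))))
;; => (1 2)
```

With this sketch, (my-rx-or-strings '("a" "ab")) keeps the caller's order and yields "\\(?:a\\|ab\\)", whereas a list such as '("foo" "bar") is handed to regexp-opt unchanged. Doing the same check inside regexp-opt behind a flag would avoid duplicating the logic in rx.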