From: Robert Pluim
To: Mattias Engdegård
Cc: Paul Eggert, 37659@debbugs.gnu.org
Newsgroups: gmane.emacs.bugs
Subject: bug#37659: rx additions: anychar, unmatchable, unordered-or
Date: Tue, 22 Oct 2019 17:27:48 +0200
References: <88571301-3F15-428F-82F9-60A23D817EF8@acm.org>
In-Reply-To: <88571301-3F15-428F-82F9-60A23D817EF8@acm.org>

>>>>> On Tue, 22 Oct 2019 17:14:08 +0200, Mattias Engdegård said:

    Mattias> 'regexp-opt' always generates a regexp preferring long matches.
    Mattias> This is undocumented, but useful enough that I would be
    Mattias> surprised if this property wasn't exploited (perhaps
    Mattias> unknowingly) by callers. It's quite natural: given a set of
    Mattias> strings, surely the caller wants them all to be candidates
    Mattias> for a match, even if there is no following anchoring pattern.

    Mattias> Thus, instead of 'unordered-or', define the operator in terms
    Mattias> of long matches: 'or-max' (working name) would work like 'or'
    Mattias> but guarantee a longest match, and only permit strings and
    Mattias> 'or-max' forms as arguments. Thus, the rx user gets all the
    Mattias> benefits from 'regexp-opt' in a composable way, without a
    Mattias> need to sort the strings or otherwise prepare them.

    Mattias> (The old 'or' behaviour always used 'regexp-opt' when
    Mattias> possible, which was very fragile: (or "a" "ab") would match
    Mattias> "ab", but (or "a" "ab" digit) would just match "a". 'or-max'
    Mattias> is robust, without surprises.)

    Mattias> Of course, we should also guarantee the maximum-matching
    Mattias> property of regexp-opt. This is just a matter of
    Mattias> documentation (and test); it does not restrict optimisations
    Mattias> as far as I can tell.

    Mattias> Again, I'm open to suggestions about a better name than
    Mattias> 'or-max'.

or-greedy?
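
For anyone following along, the difference between an ordered 'or' and a longest-match alternation can be sketched roughly as below. This is Python rather than Emacs Lisp, on the assumption that Python's re module tries alternatives left to right in the same way for this case. The helper names ordered_or and or_max are made up for the illustration, and sorting by length is only one simple way to get the longest literal match; it is not a description of how regexp-opt works internally.

```python
import re


def ordered_or(strings):
    """Build an alternation that keeps the caller's order (hypothetical helper)."""
    return "|".join(re.escape(s) for s in strings)


def or_max(strings):
    """Build an alternation that tries longer strings first (hypothetical helper).

    For a set of literal strings, trying the longest candidates first means
    the first alternative that matches is also the longest one that can match.
    """
    by_length = sorted(strings, key=len, reverse=True)
    return "|".join(re.escape(s) for s in by_length)


words = ["a", "ab"]

# Ordered alternation stops at the first alternative that matches.
plain = re.match(ordered_or(words), "ab").group(0)

# Longest-first alternation picks the longer string.
longest = re.match(or_max(words), "ab").group(0)

print(plain, longest)
```

With the strings given in the order "a", "ab", the ordered form stops at "a" while the longest-first form matches "ab", which is the kind of surprise the quoted text attributes to the old 'or'.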