From: Alan Mackenzie
Newsgroups: gmane.emacs.devel
Subject: Re: modern regexes in emacs
Date: Fri, 15 Feb 2019 17:54:05 +0000
Message-ID: <20190215175405.GA5438@ACM>
In-Reply-To: <20190215114728.0785e891@jabberwock.cb.piermont.com>
To: "Perry E. Metzger"
Cc: Mattias Engdegård, lokedhs@gmail.com, emacs-devel@gnu.org,
    Philippe Vaucher, jaygkamat@gmail.com, Eli Zaretskii

Hello, Perry.

On Fri, Feb 15, 2019 at 11:47:28 -0500, Perry E. Metzger wrote:
> On Fri, 15 Feb 2019 17:24:18 +0100 Mattias Engdegård wrote:
> > On 15 Feb 2019 at 15:18, Eli Zaretskii wrote:

> > > It should be possible if we introduce new functions for PCRE, or
> > > if we mark PCRE regexps in some special way, like put a special
> > > text property on the string.

> > It would be easier if those who ask for PCRE would say exactly what
> > they want:

> > (1) The syntax of PCRE -- | () {} instead of \| \(\) \{\} etc --
> > but restricted to the set of features of the Emacs regexp engine.

> Modern syntax is the main one.

Such use of "modern" always gets on my nerves. "Modern" is not the same
as "good", and likely has a very weak correlation with it. Why aren't we
all using "modern" editors, for example?

> > (2) The features of PCRE not present in Emacs regexps. Which ones,
> > exactly? Lookbehind assertions? Atomic groups?

> I'm not particularly interested in those.

That would be the sole reason for me for any switch.

> > (3) PCRE for interactive use only.
> > (4) PCRE for general Elisp programming.

> The old style syntax is repulsive.

I disagree. But that's not important. What's important is to have a
standard invariable regexp notation, otherwise confusion and unwanted
unforeseen nastinesses will occur.

> I think we should make it possible to slowly switch over to the syntax
> everyone using regexps has gotten used to over the last 30 years or so.
> BREs in the style Emacs has been using have been obsolete for longer
> than many Emacs users have been alive.

They're not obsolete: they're used in grep, sed, and in Emacs. There are
several different standards for writing regexps, all of approximately the
same age. None is better than any other (aside from extra facilities
available in some versions).

This seems to me to be the same argument as that proposing that Emacs
should change its key bindings to match those of other programs, because
"everybody" knows those other bindings.

> > Locating and wrapping the places that ask for regexps
> > interactively, such as `query-replace-regexp', would permit the
> > interactive regexp syntax to become a simple user customisation --
> > traditional, PCRE, rx or whatnot. It would be a matter of writing a
> > transformation function, and possibly some syntax highlighting, for
> > each case.

Exactly. And then we've got 10 to 20 years of confusion, with several
mutually incompatible regexp notations competing for attention in the
same Emacs. I think this would be a thoroughly bad idea.

> > I wouldn't be surprised if 99% of the requests are really about not
> > having to escape |(){} as metacharacters in interactive use.

> No, that's a lot of my complaint. I can't even remember what the
> correct syntax is half the time.

I don't suffer that difficulty in Emacs (though I sometimes do in grep,
egrep, sed and AWK, all of which have slightly different regexps). But I
would begin to suffer it if there started to be a mixture of incompatible
regexp notations in Emacs sources. Let's keep things simple.

> Anyway, I recommend Eli's approach. We create a parallel set of
> modernized syntax functions, and people can slowly adopt them.

I suggest we retain our current regexp notation, together with compatible
tools, as the sole way of writing regexps in Emacs. This notation is not
all that bad, and it is thoroughly documented and well tested. It's the
approach which will cause the least confusion. It works.

> Perry
> --
> Perry E. Metzger perry@piermont.com

--
Alan Mackenzie (Nuremberg, Germany).
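
Illustration (not part of the original message): the thread's option (1)
comes down to which of the characters | ( ) { } need a backslash to act
as operators. The sketch below is one way the "transformation function"
mentioned in the quoted text might look. It is written in Python purely
for illustration, and the name pcre_to_emacs is hypothetical. It only
swaps the escaping of those five characters, copies bracket expressions
verbatim, and assumes the pattern uses no PCRE-only features
(lookaround, \d, backslash escapes inside brackets, and so on).

```python
# Characters that are operators when bare in PCRE-style syntax but need a
# backslash in Emacs syntax (and vice versa for their literal meaning).
META = set("|(){}")


def pcre_to_emacs(pattern: str) -> str:
    """Swap the escaping of | ( ) { } (hypothetical helper, sketch only)."""
    out = []
    i, n = 0, len(pattern)
    while i < n:
        ch = pattern[i]
        if ch == "\\" and i + 1 < n:
            nxt = pattern[i + 1]
            # "\(" is a literal paren in PCRE; Emacs writes that literal bare.
            # Any other escape sequence is passed through unchanged.
            out.append(nxt if nxt in META else ch + nxt)
            i += 2
        elif ch == "[":
            # Copy a bracket expression verbatim: the five characters are
            # literal inside [...] in both notations.
            j = i + 1
            if j < n and pattern[j] == "^":
                j += 1
            if j < n and pattern[j] == "]":
                j += 1  # a leading "]" is a literal member of the set
            while j < n and pattern[j] != "]":
                j += 1
            out.append(pattern[i:j + 1])
            i = j + 1
        elif ch in META:
            # Bare operator in PCRE style becomes a backslashed one in Emacs.
            out.append("\\" + ch)
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(pcre_to_emacs(r"(foo|bar){2}"))  # prints \(foo\|bar\)\{2\}
```

The result is the pattern as the regexp engine sees it; inside an Elisp
string literal each backslash would have to be doubled again.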