unofficial mirror of emacs-devel@gnu.org 
From: "Mattias Engdegård" <mattiase@acm.org>
To: "Elias Mårtenson" <lokedhs@gmail.com>
Cc: "Perry E. Metzger" <perry@piermont.com>,
	Jay Kamat <jaygkamat@gmail.com>,
	emacs-devel <emacs-devel@gnu.org>
Subject: Re: modern regexes in emacs
Date: Mon, 11 Feb 2019 23:12:47 +0100
Message-ID: <A74AC1AD-9D65-43F9-9BD5-17B3D40E554A@acm.org>
In-Reply-To: <CADtN0WKcKW9P5u5J=rY0kq4cBvrrvcv6FbRAD-rSXsDMSsF-zg@mail.gmail.com>

On 10 Feb 2019, at 10:39, Elias Mårtenson <lokedhs@gmail.com> wrote:
> 
> While I'm sure that is true for a lot of people (and for those, the newly announced xr package helps here), others prefer to use the more compact regex syntax.
> 
> However, I don't think anyone would argue that the Emacs regex syntax has any advantages compared to pcre. I certainly need to wade through the Emacs regex manual every time I want to do slightly more advanced regex matching, followed by lots of testing. 
> 
> When using regexes in regular editing (as opposed to elisp programming) it's even worse. 
> 
> I'm most definitely in favour of pcre. 

Hello Elias,

Of course you should write "-?[0-9]+" when you need it! And for interactive use -- search-and-replace, say -- the conventional notations are not bad, since they are compact to write, you have the meaning all in your head anyway, and nobody is going to look at it later on.

Where rx shines is for the complex ones. I have written page-long regexps in Perl and Python, and despite the fact that both languages permit a "structured" regexp layout, they do not come close to rx when it counts: rx can be read, understood, maintained, evolved, and composed far better, and with fewer mistakes.

I agree that the POSIX notation is probably better than the old-style version in Emacs, since the former tends to be a tad lighter in backslashes. Some languages -- OCaml, Python, etc. -- have some form of string literal that avoids the need to escape backslashes, but fundamentally, regexps are not strings but an algebraic notation with values and operators, and deserve some kind of higher language-level support. Larry Wall understood that.

So I suggest you give rx a go next time you need to write a complicated regexp in Elisp. If you still find it too verbose, you can use short keywords, like `+' or `1+' instead of `one-or-more'. You can even speak a hybrid dialect by injecting little regexp strings inside a big rx expression with the `(regexp ...)' syntax! Take a look at the big `gnu' matcher in compile.el (around line 281) to see what that looks like.
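
As a rough illustration of the short keywords and the `(regexp ...)' hybrid form mentioned above, here is a minimal sketch. The constant names and the little "name = integer" line format are made up for this example and are not taken from compile.el.

```elisp
;; Hypothetical examples; none of these names exist in Emacs itself.
(require 'rx)

;; Long keywords: an optional minus sign followed by one or more digits.
(defconst my-int-re-long
  (rx (zero-or-one "-") (one-or-more (any "0-9"))))

;; The same pattern written with the shorter keywords `opt' and `+'.
(defconst my-int-re-short
  (rx (opt "-") (+ (any "0-9"))))

;; Hybrid dialect: a conventional regexp string injected into a larger
;; rx form with `(regexp ...)'.  Matches lines such as "x = -42".
(defconst my-assign-re
  (rx bol
      (group (+ (any "a-z")))     ; group 1: the name
      (* blank) "=" (* blank)
      (group (regexp "-?[0-9]+")) ; group 2: the integer, as a plain regexp
      eol))
```

The two integer patterns should expand to the same regexp string, so the choice between long and short keywords is purely about readability.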

Careful here -- rx is addictive, and you may very well come to use it more and more.




