unofficial mirror of emacs-devel@gnu.org 
* modern regexes in emacs
@ 2018-06-16 16:37 Perry E. Metzger
  2018-06-16 17:45 ` Radon Rosborough
  2018-06-16 22:31 ` Jay Kamat
  0 siblings, 2 replies; 57+ messages in thread
From: Perry E. Metzger @ 2018-06-16 16:37 UTC (permalink / raw)
  To: emacs-devel

I think, someday, it would be nice if users could select modern
regex syntax instead of the very very old-fashioned and awkward Emacs
regex syntax. The old syntax and functions that implement it need to
be kept around for legacy reasons, but one could easily set up a set
of parallel new functions that used modern PCRE style syntax, and
allow users to select those instead when doing things like
isearching on regexps etc.

Perry
-- 
Perry E. Metzger		perry@piermont.com



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: modern regexes in emacs
  2018-06-16 16:37 modern regexes in emacs Perry E. Metzger
@ 2018-06-16 17:45 ` Radon Rosborough
  2018-06-16 18:25   ` Perry E. Metzger
  2018-06-16 22:31 ` Jay Kamat
  1 sibling, 1 reply; 57+ messages in thread
From: Radon Rosborough @ 2018-06-16 17:45 UTC (permalink / raw)
  To: perry; +Cc: emacs-devel

> I think, someday, it would be nice if users could select modern
> regex syntax instead of the very very old-fashioned and awkward Emacs
> regex syntax

I agree. See https://github.com/benma/visual-regexp-steroids.el, which
implements this.




* Re: modern regexes in emacs
  2018-06-16 17:45 ` Radon Rosborough
@ 2018-06-16 18:25   ` Perry E. Metzger
  2018-06-16 21:01     ` Daniel Colascione
  0 siblings, 1 reply; 57+ messages in thread
From: Perry E. Metzger @ 2018-06-16 18:25 UTC (permalink / raw)
  To: Radon Rosborough; +Cc: emacs-devel

On Sat, 16 Jun 2018 11:45:40 -0600 Radon Rosborough
<radon.neon@gmail.com> wrote:
> > I think, someday, it would be nice if users could select modern
> > regex syntax instead of the very very old-fashioned and awkward
> > Emacs regex syntax  
> 
> I agree. See https://github.com/benma/visual-regexp-steroids.el,
> which implements this.
> 

That requires python, isn't integrated into emacs, etc.

-- 
Perry E. Metzger		perry@piermont.com




* Re: modern regexes in emacs
  2018-06-16 18:25   ` Perry E. Metzger
@ 2018-06-16 21:01     ` Daniel Colascione
  0 siblings, 0 replies; 57+ messages in thread
From: Daniel Colascione @ 2018-06-16 21:01 UTC (permalink / raw)
  To: Perry E. Metzger; +Cc: Radon Rosborough, emacs-devel

> On Sat, 16 Jun 2018 11:45:40 -0600 Radon Rosborough
> <radon.neon@gmail.com> wrote:
>> > I think, someday, it would be nice if users could select modern
>> > regex syntax instead of the very very old-fashioned and awkward
>> > Emacs regex syntax

Right now, I'd just settle for an easier-to-type equivalent to "\_<". This
sequence is the typing equivalent of a nasty cough. I've actually been
wondering how feasible it'd be to support an rx input mode.





* Re: modern regexes in emacs
@ 2018-06-16 21:33 Jimmy Yuen Ho Wong
  0 siblings, 0 replies; 57+ messages in thread
From: Jimmy Yuen Ho Wong @ 2018-06-16 21:33 UTC (permalink / raw)
  To: perry, emacs-devel

> That requires python, isn't integrated into emacs, etc.

It doesn't. You can select pcre2el as the Regexp syntax, which
implements a good subset of PCRE on top of Emacs Regexp. However, that
project hasn't seen any movement since late 2016. It'd be nice if Emacs
had PCRE baked in.






* Re: modern regexes in emacs
  2018-06-16 16:37 modern regexes in emacs Perry E. Metzger
  2018-06-16 17:45 ` Radon Rosborough
@ 2018-06-16 22:31 ` Jay Kamat
  2019-02-09 17:20   ` Philippe Vaucher
  2019-02-10  9:39   ` Elias Mårtenson
  1 sibling, 2 replies; 57+ messages in thread
From: Jay Kamat @ 2018-06-16 22:31 UTC (permalink / raw)
  To: Perry E. Metzger; +Cc: emacs-devel

Perry E. Metzger writes:

> I think, someday, it would be nice if users could select modern
> regex syntax instead of the very very old-fashioned and awkward Emacs
> regex syntax. The old syntax and functions that implement it need to
> be kept around for legacy reasons, but one could easily set up a set
> of parallel new functions that used modern PCRE style syntax, and
> allow users to select those instead when doing things like
> isearching on regexps etc.

I just wanted to note that `rx' is in many cases much easier to write and
understand than even PCRE. I'd recommend learning and using `rx' if you are
annoyed about backslashes or readability.
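
To make the comparison concrete, here is a minimal sketch. The keyword
list and the `my-' names are made up for illustration and do not come
from any Emacs source file. It writes one pattern twice, first as a
traditional string regexp and then with `rx', and checks that both
behave the same way.

```elisp
(require 'rx)

;; Hypothetical pattern: "foo" or "bar" as a whole symbol.
;; String form: every backslash is doubled inside the Lisp string.
(defconst my-keyword-string-re "\\_<\\(?:foo\\|bar\\)\\_>")

;; The same pattern with `rx'.
(defconst my-keyword-rx-re
  (rx symbol-start (or "foo" "bar") symbol-end))

(string-match-p my-keyword-string-re "a foo b")  ; => 2
(string-match-p my-keyword-rx-re "a foo b")      ; => 2
(string-match-p my-keyword-rx-re "foobar")       ; => nil
```

The `rx' form is longer, but the grouping and the symbol boundaries are
spelled out rather than encoded in backslash sequences.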

-Jay




* Re: modern regexes in emacs
  2018-06-16 22:31 ` Jay Kamat
@ 2019-02-09 17:20   ` Philippe Vaucher
  2019-02-10  9:39   ` Elias Mårtenson
  1 sibling, 0 replies; 57+ messages in thread
From: Philippe Vaucher @ 2019-02-09 17:20 UTC (permalink / raw)
  To: Jay Kamat; +Cc: Emacs developers, Perry E. Metzger


On Sun, Jun 17, 2018 at 12:32 AM Jay Kamat <jaygkamat@gmail.com> wrote:

> Perry E. Metzger writes:
>
> > I think, someday, it would be nice if users could select modern
> > regex syntax instead of the very very old-fashioned and awkward Emacs
> > regex syntax. The old syntax and functions that implement it need to
> > be kept around for legacy reasons, but one could easily set up a set
> > of parallel new functions that used modern PCRE style syntax, and
> > allow users to select those instead when doing things like
> > isearching on regexps etc.
>
> I just wanted to note that `rx' is in many cases much easier to write and
> understand than even PCRE. I'd recommend learning and using `rx' if you are
> annoyed about backslashes or readability.
>

I only remember the PCRE syntax, which is why I use packages like
`visual-regexp` and `pcre2el`. If Emacs supported it natively, that would
be a huge step forward for me.

Philippe


* Re: modern regexes in emacs
  2018-06-16 22:31 ` Jay Kamat
  2019-02-09 17:20   ` Philippe Vaucher
@ 2019-02-10  9:39   ` Elias Mårtenson
  2019-02-11 22:12     ` Mattias Engdegård
  1 sibling, 1 reply; 57+ messages in thread
From: Elias Mårtenson @ 2019-02-10  9:39 UTC (permalink / raw)
  To: Jay Kamat; +Cc: emacs-devel, Perry E. Metzger


On Sun, 17 Jun 2018, 06:32 Jay Kamat <jaygkamat@gmail.com> wrote:

> I just wanted to note that `rx' is in many cases much easier to write and
> understand than even PCRE. I'd recommend learning and using `rx' if you are
> annoyed about backslashes or readability.

While I'm sure that is true for a lot of people (and for those, the newly
announced xr package helps here), others prefer to use the more compact
regex syntax.

However, I don't think anyone would argue that the Emacs regex syntax has
any advantages compared to pcre. I certainly need to wade through the Emacs
regex manual every time I want to do slightly more advanced regex matching,
followed by lots of testing.

When using regexes in regular editing (as opposed to elisp programming)
it's even worse.

I'm most definitely in favour of pcre.
Regards,
Elias


* Re: modern regexes in emacs
  2019-02-10  9:39   ` Elias Mårtenson
@ 2019-02-11 22:12     ` Mattias Engdegård
  2019-02-15 13:42       ` Philippe Vaucher
  0 siblings, 1 reply; 57+ messages in thread
From: Mattias Engdegård @ 2019-02-11 22:12 UTC (permalink / raw)
  To: Elias Mårtenson; +Cc: Perry E. Metzger, Jay Kamat, emacs-devel

10 feb. 2019 kl. 10.39 skrev Elias Mårtenson <lokedhs@gmail.com>:
> 
> While I'm sure that is true for lot of people (and for those, the newly announced xr package helps here), others prefer to use the more compact regex syntax. 
> 
> However, I don't think anyone would argue that the Emacs regex syntax has any advantages compared to pcre. I certainly need to wade through the Emacs regex manual every time I want to do slightly more advanced regex matching, followed by lots of testing. 
> 
> When using regexes in regular editing (as opposed to elisp programming) it's even worse. 
> 
> I'm most definitely in favour of pcre. 

Hello Elias,

Of course you should write "-?[0-9]+" when you need it! And for interactive use -- search-and-replace, say -- the conventional notations are not bad, since they are compact to write, you have the meaning all in your head anyway, and nobody is going to look at it later on.

Where rx shines is for the complex ones. I have written page-long regexps in Perl and Python, and despite the fact that both languages permit a "structured" regexp layout, they do not come close to rx when it counts: rx can be read, understood, maintained, evolved, and composed far better, and with fewer mistakes.

I agree that the POSIX notation is probably better than the old-style version in Emacs since the former tends to be a tad lighter in backslashes. Some languages -- OCaml, Python, etc. -- have some form of string literal that avoids the need to escape backslashes, but fundamentally, regexps are not strings but an algebraic notation with values and operators, and deserve some kind of higher language-level support. Larry Wall understood that.

So I suggest you give rx a go next time you need to write a complicated regexp in Elisp. If you still find it too verbose, you can use short keywords, like `+' or `1+' instead of `one-or-more'. You can even speak a hybrid dialect by injecting little regexp strings inside a big rx expression with the `(regexp ...)' syntax! Take a look at the big `gnu' matcher in compile.el (around line 281) to see what that looks like.
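
As a rough illustration of the short keywords and the hybrid style, the
sketch below writes one pattern three ways. The dotted-version-number
pattern and the `my-' names are assumptions picked for the example; this
is not the `gnu' matcher from compile.el.

```elisp
(require 'rx)

;; Long keywords.
(defconst my-version-long
  (rx (one-or-more digit)
      (zero-or-more (seq "." (one-or-more digit)))))

;; Short keywords for the same pattern.
(defconst my-version-short
  (rx (+ digit) (* "." (+ digit))))

;; Hybrid dialect: small string regexps injected with (regexp ...).
(defconst my-version-hybrid
  (rx (regexp "[0-9]+") (* "." (regexp "[0-9]+"))))

;; All three should match the whole of "1.20.3".
(dolist (re (list my-version-long my-version-short my-version-hybrid))
  (when (string-match re "1.20.3")
    (message "%s" (match-string 0 "1.20.3"))))
```

Which spelling to use is a matter of taste; the three forms can be
mixed freely inside a single `rx' expression.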

Careful here -- rx is addictive, and you may very well come to use it more and more.





* Re: modern regexes in emacs
  2019-02-11 22:12     ` Mattias Engdegård
@ 2019-02-15 13:42       ` Philippe Vaucher
  2019-02-15 14:10         ` Clément Pit-Claudel
                           ` (2 more replies)
  0 siblings, 3 replies; 57+ messages in thread
From: Philippe Vaucher @ 2019-02-15 13:42 UTC (permalink / raw)
  To: Mattias Engdegård
  Cc: emacs-devel, Jay Kamat, Elias Mårtenson, Perry E. Metzger


> Of course you should write "-?[0-9]+" when you need it! And for
> interactive use -- search-and-replace, say -- the conventional notations
> are not bad, since they are compact to write, you have the meaning all in
> your head anyway, and nobody is going to look at it later on.


I think the purpose of this thread is to ask for Emacs to support PCRE
regexps in commands like `query-replace-regexp`.

Would this even be possible? I can imagine a whole lot of packages breaking
if the regexp syntax changed, and changing it just for the user input in
interactive functions looks a bit sketchy.


* Re: modern regexes in emacs
  2019-02-15 13:42       ` Philippe Vaucher
@ 2019-02-15 14:10         ` Clément Pit-Claudel
  2019-02-15 15:03           ` Philippe Vaucher
  2019-02-15 14:18         ` Eli Zaretskii
  2019-02-17 20:47         ` Stefan Monnier
  2 siblings, 1 reply; 57+ messages in thread
From: Clément Pit-Claudel @ 2019-02-15 14:10 UTC (permalink / raw)
  To: emacs-devel

On 15/02/2019 08.42, Philippe Vaucher wrote:
> Would this even be possible? I can imagine a whole lot of packages breaking if the regexp syntax changed, and changing it just for the user input in interactive functions looks a bit sketchy.

We could just add a special tag at the beginning of a regexp to indicate that it's a pcre regexp; something like this maybe? (re-search-forward "\\(?pcre:\\)…[pcre regexp goes here]…").  This form is currently a syntax error, so there would be no ambiguity, and we could define a (pcre …) macro so that you could write (re-search-forward (pcre "…[pcre regexp goes here]…")) instead.  Alternatively, we could use an explicit tag, something like (re-search-forward (cons 'pcre "…[pcre regexp goes here]…")).

For interactive functions, I imagine you'd have a defcustom with a preferred regexp dialect.




* Re: modern regexes in emacs
  2019-02-15 13:42       ` Philippe Vaucher
  2019-02-15 14:10         ` Clément Pit-Claudel
@ 2019-02-15 14:18         ` Eli Zaretskii
  2019-02-15 15:28           ` Perry E. Metzger
  2019-02-15 16:24           ` Mattias Engdegård
  2019-02-17 20:47         ` Stefan Monnier
  2 siblings, 2 replies; 57+ messages in thread
From: Eli Zaretskii @ 2019-02-15 14:18 UTC (permalink / raw)
  To: Philippe Vaucher; +Cc: mattiase, perry, jaygkamat, lokedhs, emacs-devel

> From: Philippe Vaucher <philippe.vaucher@gmail.com>
> Date: Fri, 15 Feb 2019 14:42:43 +0100
> Cc: emacs-devel <emacs-devel@gnu.org>, Jay Kamat <jaygkamat@gmail.com>,
> 	Elias Mårtenson <lokedhs@gmail.com>,
> 	"Perry E. Metzger" <perry@piermont.com>
> 
> I think the purpose of this thread is to ask for emacs to support PCRE regexpes in commands like
> `query-replace-regexp`.
> 
> Would this even be possible? I can imagine a whole lot of packages breaking if the regexp syntax changed,
> and changing it just for the user input in interactive functions looks a bit sketchy.

It should be possible if we introduce new functions for PCRE, or if we
mark PCRE regexps in some special way, like put a special text
property on the string.
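
A minimal sketch of the text-property idea, under stated assumptions:
the property name `my-regexp-dialect' and both helper functions are
hypothetical, and nothing in Emacs's search functions looks at such a
property today. The sketch only shows how a string could carry the tag
and how a consumer could read it back.

```elisp
(defun my-mark-pcre (string)
  "Return a copy of STRING tagged as PCRE via a text property.
The property name `my-regexp-dialect' is hypothetical."
  (propertize string 'my-regexp-dialect 'pcre))

(defun my-regexp-dialect-of (string)
  "Return the dialect recorded on STRING, defaulting to `emacs'."
  (or (and (> (length string) 0)
           (get-text-property 0 'my-regexp-dialect string))
      'emacs))

(my-regexp-dialect-of (my-mark-pcre "a(b|c)"))  ; => pcre
(my-regexp-dialect-of "a\\(b\\|c\\)")           ; => emacs

;; `equal' ignores text properties, so the tag is invisible to code
;; that only compares or prints the string.
(equal (my-mark-pcre "a(b|c)") "a(b|c)")        ; => t
```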




* Re: modern regexes in emacs
  2019-02-15 14:10         ` Clément Pit-Claudel
@ 2019-02-15 15:03           ` Philippe Vaucher
  2019-02-15 15:13             ` Clément Pit-Claudel
  0 siblings, 1 reply; 57+ messages in thread
From: Philippe Vaucher @ 2019-02-15 15:03 UTC (permalink / raw)
  To: Clément Pit-Claudel; +Cc: Emacs developers


>
> > Would this even be possible? I can imagine a whole lot of packages
> breaking if the regexp syntax changed, and changing it just for the user
> input in interactive functions looks a bit sketchy.
>
> We could just add a special tag at the beginning of a regexp to indicate
> that it's a pcre regexp; something like this maybe? (re-search-forward
> "\\(?pcre:\\)…[pcre regexp goes here]…").  This form is currently a syntax
> error, so there would be no ambiguity, and we could define a (pcre …) macro
> so that you could write (re-search-forward (pcre "…[pcre regexp goes
> here]…")) instead.  Alternatively, we could use an explicit tag, something
> like (re-search-forward (cons 'pcre "…[pcre regexp goes here]…")).
>
> For interactive functions, I imagine you'd have a defcustom with a
> preferred regexp dialect.
>

I like where this is going. Between that and Eli's suggestion of a special
text property, we have plenty of ways to implement it so that it plays
nicely with the existing code.

So far, 3 proposals:

   - Regexps are always strings, with "\\(?pcre:\\)" as part of the regexp
      - when the string is displayed you need to scan the beginning to see
        that it is a PCRE regexp
      - no separation between the regexp and its kind
   - Regexps are strings (Emacs regexps) or conses whose first element is
     a symbol naming their kind
      - when the argument is displayed you see immediately whether it's an
        Emacs regexp or one using another engine
      - the regexp is clearly separated from its kind, probably facilitating
        conversions
      - seems more "open", in the sense that we can easily imagine new types
        ('emacs, 'pcre, 'rx, 'sed, 'vim-verymagic, etc.)
   - Special text property on the string
      - not immediately visible that it is a PCRE regexp
      - harder to manipulate?

Given this I'm in favor of the 2nd option, but maybe I missed some points.

Philippe


* Re: modern regexes in emacs
  2019-02-15 15:03           ` Philippe Vaucher
@ 2019-02-15 15:13             ` Clément Pit-Claudel
  0 siblings, 0 replies; 57+ messages in thread
From: Clément Pit-Claudel @ 2019-02-15 15:13 UTC (permalink / raw)
  To: Philippe Vaucher; +Cc: Emacs developers

On 15/02/2019 10.03, Philippe Vaucher wrote:
> Given this I'm in favor of the 2nd option, but maybe I missed some points.

Thinking more about this, there is one non-trivial issue: concatenation.  It's common for code in Emacs to take a regexp, assume it's a string, and do something like (concat "\\(" some-regexp-var "\\|" some-other-regexp-var "\\)").

Solution 1 could be tweaked to wrap the whole regexp: "\\(?pcre:…[pcre regexp here]…\\)", and so could solution 3 (a text property spanning the whole length of the string), but solution 2 won't work well here.

Not to mention the fact that if the regexps are matched by different engines, we now have to make these work together :/
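
The concatenation problem can be sketched as follows. The variable and
function names are hypothetical; the cons form is the one from
solution 2. Traditional string regexps compose with `concat', while a
cons-tagged regexp makes the same helper signal an error.

```elisp
(defvar my-re-a "foo" "Hypothetical regexp held in a variable.")
(defvar my-re-b "ba[rz]" "Another hypothetical regexp.")

(defun my-either (a b)
  "Return a regexp matching A or B, the way existing code often does."
  (concat "\\(" a "\\|" b "\\)"))

;; Plain strings compose as expected.
(string-match-p (my-either my-re-a my-re-b) "xx baz")  ; => 3

;; A cons-tagged regexp is not a string, so `concat' signals an error
;; instead of building a pattern.
(condition-case err
    (my-either (cons 'pcre "fo+") my-re-b)
  (error (message "concat failed: %S" err)))
```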

Clément.




* Re: modern regexes in emacs
  2019-02-15 14:18         ` Eli Zaretskii
@ 2019-02-15 15:28           ` Perry E. Metzger
  2019-02-15 16:06             ` Stefan Monnier
  2019-02-15 16:24           ` Mattias Engdegård
  1 sibling, 1 reply; 57+ messages in thread
From: Perry E. Metzger @ 2019-02-15 15:28 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: mattiase, Philippe Vaucher, jaygkamat, lokedhs, emacs-devel

On Fri, 15 Feb 2019 16:18:12 +0200 Eli Zaretskii <eliz@gnu.org> wrote:
> > I think the purpose of this thread is to ask for emacs to support
> > PCRE regexpes in commands like `query-replace-regexp`.
> > 
> > Would this even be possible? I can imagine a whole lot of
> > packages breaking if the regexp syntax changed, and changing it
> > just for the user input in interactive functions looks a bit
> > sketchy.  
> 
> It should be possible if we introduce new functions for PCRE, or if
> we mark PCRE regexps in some special way, like put a special text
> property on the string.

I think the right thing is to introduce new functions for new-style
regexps that parallel the old ones, and to allow users to bind things
like the regexp flavors of isearch to the new-style versions if they
wish.

We can decide if we want the new-style versions to be the default
search bindings at some distant date.

One could also very slowly replace use of old-style regexp functions
with the new-style regexp functions in lisp code, but that could be
done over many many years if desired.

Perry
-- 
Perry E. Metzger		perry@piermont.com




* Re: modern regexes in emacs
  2019-02-15 15:28           ` Perry E. Metzger
@ 2019-02-15 16:06             ` Stefan Monnier
  0 siblings, 0 replies; 57+ messages in thread
From: Stefan Monnier @ 2019-02-15 16:06 UTC (permalink / raw)
  To: emacs-devel

>> It should be possible if we introduce new functions for PCRE, or if
>> we mark PCRE regexps in some special way, like put a special text
>> property on the string.

A simpler option is for Elisp users to write (pcre "foo") where `pcre`
is a function that converts to Emacs's own format.

I think it would make a lot of sense for Emacs's search functions to
accept other kinds of search specifications than regular expressions
represented as strings, e.g. to also accept precompiled regexps (or
NFAs/DFAs), so `pcre` could also return one of those representations
if/when support for it is added.
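
A very rough sketch of what such a conversion function might look like
is given below. The name `my-pcre-to-emacs' is hypothetical, and this is
a toy: it only swaps the escaping of ( ) | { } and copies bracket
expressions through. It does not handle backslash escapes inside
brackets, PCRE-only escapes such as \d, or any feature the Emacs engine
lacks. The pcre2el package mentioned earlier in the thread is a far more
complete translator.

```elisp
(defun my-pcre-to-emacs (pcre)
  "Translate PCRE-style grouping syntax in PCRE to Emacs regexp syntax.
Toy subset: only the escaping of ( ) | { } is swapped."
  (let ((i 0)
        (n (length pcre))
        (in-bracket nil)
        (out '()))
    (while (< i n)
      (let ((c (aref pcre i)))
        (cond
         ;; Inside [...]: copy verbatim up to the closing bracket.
         (in-bracket
          (push (string c) out)
          (when (eq c ?\])
            (setq in-bracket nil)))
         ;; Start of a bracket expression; a leading ^ and a leading ]
         ;; are copied here so that they do not end the set early.
         ((eq c ?\[)
          (setq in-bracket t)
          (push "[" out)
          (when (and (< (1+ i) n) (eq (aref pcre (1+ i)) ?^))
            (push "^" out)
            (setq i (1+ i)))
          (when (and (< (1+ i) n) (eq (aref pcre (1+ i)) ?\]))
            (push "]" out)
            (setq i (1+ i))))
         ;; Backslash: \( \) \| \{ \} become literals, which Emacs
         ;; writes without a backslash; other escapes pass through.
         ((eq c ?\\)
          (if (>= (1+ i) n)
              (push "\\" out)
            (let ((d (aref pcre (1+ i))))
              (push (if (memq d '(?\( ?\) ?| ?{ ?}))
                        (string d)
                      (string ?\\ d))
                    out)
              (setq i (1+ i)))))
         ;; Bare operators gain a backslash in Emacs syntax.
         ((memq c '(?\( ?\) ?| ?{ ?}))
          (push (string ?\\ c) out))
         (t
          (push (string c) out))))
      (setq i (1+ i)))
    (apply #'concat (nreverse out))))

(my-pcre-to-emacs "a(b|c){2}")                          ; => "a\\(b\\|c\\)\\{2\\}"
(string-match-p (my-pcre-to-emacs "a(b|c){2}") "xabc")  ; => 1
```

Because the result is an ordinary string, it can be passed to
`re-search-forward' or combined with `concat' like any other regexp.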


        Stefan





* Re: modern regexes in emacs
  2019-02-15 14:18         ` Eli Zaretskii
  2019-02-15 15:28           ` Perry E. Metzger
@ 2019-02-15 16:24           ` Mattias Engdegård
  2019-02-15 16:47             ` Perry E. Metzger
  1 sibling, 1 reply; 57+ messages in thread
From: Mattias Engdegård @ 2019-02-15 16:24 UTC (permalink / raw)
  To: Eli Zaretskii
  Cc: Philippe Vaucher, jaygkamat, lokedhs, emacs-devel,
	Perry E. Metzger

15 feb. 2019 kl. 15.18 skrev Eli Zaretskii <eliz@gnu.org>:
> 
> It should be possible if we introduce new functions for PCRE, or if we
> mark PCRE regexps in some special way, like put a special text
> property on the string.

It would be easier if those who ask for PCRE would say exactly what they want:

(1) The syntax of PCRE -- | () {} instead of \| \(\) \{\} etc -- but restricted to the set of features of the Emacs regexp engine.
(2) The features of PCRE not present in Emacs regexps. Which ones, exactly? Lookbehind assertions? Atomic groups?
(3) PCRE for interactive use only.
(4) PCRE for general Elisp programming.

Locating and wrapping the places that ask for regexps interactively, such as `query-replace-regexp', would permit the interactive regexp syntax to become a simple user customisation -- traditional, PCRE, rx or whatnot. It would be a matter of writing a transformation function, and possibly some syntax highlighting, for each case.

I wouldn't be surprised if 99% of the requests are really about not having to escape |(){} as metacharacters in interactive use.
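
The wrapping idea can be sketched roughly as below. Every `my-' name is
hypothetical; no such option or command exists in Emacs. To stay short,
the sketch offers only two dialects: traditional syntax (passed through
unchanged) and an rx form typed as Lisp text. A PCRE entry would be one
more line in the alist, pointing at whatever translation function is
available.

```elisp
(require 'rx)

(defcustom my-interactive-regexp-dialect 'emacs
  "Hypothetical option: dialect used when reading regexps interactively."
  :type '(choice (const :tag "Traditional Emacs syntax" emacs)
                 (const :tag "rx form, read as Lisp" rx))
  :group 'matching)

(defun my-rx-input-to-regexp (input)
  "Read INPUT as an rx form and return the equivalent Emacs regexp."
  (rx-to-string (read input) t))

(defvar my-regexp-dialect-translators
  '((emacs . identity)
    (rx . my-rx-input-to-regexp))
  "Alist mapping a dialect symbol to a function returning an Emacs regexp.")

(defun my-translate-regexp (input)
  "Translate INPUT from `my-interactive-regexp-dialect' to Emacs syntax."
  (let ((fn (alist-get my-interactive-regexp-dialect
                       my-regexp-dialect-translators)))
    (unless fn
      (error "Unknown regexp dialect: %S" my-interactive-regexp-dialect))
    (funcall fn input)))

(defun my-query-replace-regexp (input to-string)
  "Query-replace INPUT, read in the chosen dialect, with TO-STRING."
  (interactive "sQuery replace (dialect regexp): \nsReplace with: ")
  ;; `perform-replace' is the documented entry point for Lisp callers;
  ;; the two t arguments request querying and regexp matching.
  (perform-replace (my-translate-regexp input) to-string t t nil))
```

With the option set to `rx', typing (or "ab" "cd") at the prompt would
search for either string, while the default setting leaves existing
habits untouched.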





* Re: modern regexes in emacs
  2019-02-15 16:24           ` Mattias Engdegård
@ 2019-02-15 16:47             ` Perry E. Metzger
  2019-02-15 17:54               ` Alan Mackenzie
  0 siblings, 1 reply; 57+ messages in thread
From: Perry E. Metzger @ 2019-02-15 16:47 UTC (permalink / raw)
  To: Mattias Engdegård
  Cc: Eli Zaretskii, jaygkamat, lokedhs, Philippe Vaucher, emacs-devel

On Fri, 15 Feb 2019 17:24:18 +0100 Mattias Engdegård
<mattiase@acm.org> wrote:
> 15 feb. 2019 kl. 15.18 skrev Eli Zaretskii <eliz@gnu.org>:
> > 
> > It should be possible if we introduce new functions for PCRE, or
> > if we mark PCRE regexps in some special way, like put a special
> > text property on the string.  
> 
> It would be easier if those who ask for PCRE would say exactly what
> they want:
> 
> (1) The syntax of PCRE -- | () {} instead of \| \(\) \{\} etc --
> but restricted to the set of features of the Emacs regexp engine.

Modern syntax is the main one.

> (2) The features of PCRE not present in Emacs regexps. Which ones,
> exactly? Lookbehind assertions? Atomic groups?

I'm not particularly interested in those.

> (3) PCRE for interactive use only.
> (4) PCRE for general Elisp programming. 

The old style syntax is repulsive. I think we should make it possible
to slowly switch over to the syntax everyone using regexps has gotten
used to over the last 30 years or so. BREs in the style Emacs has
been using have been obsolete for longer than many Emacs users have
been alive.

> Locating and wrapping the places that ask for regexps
> interactively, such as `query-replace-regexp', would permit the
> interactive regexp syntax to become a simple user customisation --
> traditional, PCRE, rx or whatnot. It would be a matter of writing a
> transformation function, and possibly some syntax highlighting, for
> each case.
> 
> I wouldn't be surprised if 99% of the requests are really about not
> having to escape |(){} as metacharacters in interactive use.

No, that's a lot of my complaint. I can't even remember what the
correct syntax is half the time.

Anyway, I recommend Eli's approach. We create a parallel set of
modernized syntax functions, and people can slowly adopt them.

Perry
-- 
Perry E. Metzger		perry@piermont.com




* Re: modern regexes in emacs
  2019-02-15 16:47             ` Perry E. Metzger
@ 2019-02-15 17:54               ` Alan Mackenzie
  2019-02-15 18:27                 ` Drew Adams
                                   ` (3 more replies)
  0 siblings, 4 replies; 57+ messages in thread
From: Alan Mackenzie @ 2019-02-15 17:54 UTC (permalink / raw)
  To: Perry E. Metzger
  Cc: Mattias Engdegård, lokedhs, emacs-devel, Philippe Vaucher,
	jaygkamat, Eli Zaretskii

Hello, Perry.

On Fri, Feb 15, 2019 at 11:47:28 -0500, Perry E. Metzger wrote:
> On Fri, 15 Feb 2019 17:24:18 +0100 Mattias Engdegård
> <mattiase@acm.org> wrote:
> > 15 feb. 2019 kl. 15.18 skrev Eli Zaretskii <eliz@gnu.org>:

> > > It should be possible if we introduce new functions for PCRE, or
> > > if we mark PCRE regexps in some special way, like put a special
> > > text property on the string.  

> > It would be easier if those who ask for PCRE would say exactly what
> > they want:

> > (1) The syntax of PCRE -- | () {} instead of \| \(\) \{\} etc --
> > but restricted to the set of features of the Emacs regexp engine.

> Modern syntax is the main one.

Such use of "modern" always gets on my nerves.  "Modern" is not the same
as "good", and likely has a very weak correlation with it.  Why aren't we
all using "modern" editors, for example?

> > (2) The features of PCRE not present in Emacs regexps. Which ones,
> > exactly? Lookbehind assertions? Atomic groups?

> I'm not particularly interested in those.

That would be the sole reason for me for any switch.

> > (3) PCRE for interactive use only.
> > (4) PCRE for general Elisp programming. 

> The old style syntax is repulsive.

I disagree.  But that's not important.  What's important is to have a
standard invariable regexp notation, otherwise confusion and unwanted
unforeseen nastinesses will occur.

> I think we should make it possible to slowly switch over to the syntax
> everyone using regexps has gotten used to over the last 30 years or so.
> BREs in the style Emacs has been using have been obsolete for longer
> than many Emacs users have been alive.

They're not obsolete: they're used in grep, sed, and in Emacs.

There are several different standards for writing regexps, all of
approximately the same age.  None is better than any other (aside from
extra facilities available in some versions).

This seems to me to be the same argument as that proposing that Emacs
should change its key bindings to match those of other programs, because
"everybody" knows those other bindings.

> > Locating and wrapping the places that ask for regexps
> > interactively, such as `query-replace-regexp', would permit the
> > interactive regexp syntax to become a simple user customisation --
> > traditional, PCRE, rx or whatnot. It would be a matter of writing a
> > transformation function, and possibly some syntax highlighting, for
> > each case.

Exactly.  And then we've got 10 to 20 years of confusion, with several
mutually incompatible regexp notations competing for attention in the
same Emacs.  I think this would be a thoroughly bad idea.

> > I wouldn't be surprised if 99% of the requests are really about not
> > having to escape |(){} as metacharacters in interactive use.

> No, that's a lot of my complaint. I can't even remember what the
> correct syntax is half the time.

I don't suffer that difficulty in Emacs (though I sometimes do in grep,
egrep, sed and AWK, all of which have slightly different regexps).  But I
would begin to suffer it if there started to be a mixture of incompatible
regexp notations in Emacs sources.  Let's keep things simple.

> Anyway, I recommend Eli's approach. We create a parallel set of
> modernized syntax functions, and people can slowly adopt them.

I suggest we retain our current regexp notation, together with compatible
tools, as the sole way of writing regexps in Emacs.  This notation is not
all that bad, and it is thoroughly documented and well tested.  It's the
approach which will cause the least confusion.  It works.

> Perry
> -- 
> Perry E. Metzger		perry@piermont.com

-- 
Alan Mackenzie (Nuremberg, Germany).




* RE: modern regexes in emacs
  2019-02-15 17:54               ` Alan Mackenzie
@ 2019-02-15 18:27                 ` Drew Adams
  2019-02-15 23:33                   ` Perry E. Metzger
  2019-02-15 18:36                 ` Eli Zaretskii
                                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 57+ messages in thread
From: Drew Adams @ 2019-02-15 18:27 UTC (permalink / raw)
  To: Alan Mackenzie, Perry E. Metzger
  Cc: Mattias Engdegård, lokedhs, emacs-devel, Philippe Vaucher,
	jaygkamat, Eli Zaretskii

> > Modern syntax is the main one.
> 
> Such use of "modern" always gets on my nerves.  "Modern" is not the same
> as "good", and likely has a very weak correlation with it.

Not to mention that "modern" has been applied to the latest fashion, ephemeral or not, for at least 100 years.  Today's modernista is tomorrow morning's has-been, but s?he sometimes continues to tout the same old-fashioned modernisms.

There's absolutely nothing new about labeling something "modern" (or "old-fashioned", for that matter).  Nothing new about "modern".

> Why aren't we all using "modern" editors, for example?

Why indeed?

Headline: "Users of Anachronistic Editor Emacs Go 'Modern'!"

> > I think we should make it possible to slowly switch over to the syntax
> > everyone using regexps has gotten used to over the last 30 years or so.
> > BREs in the style Emacs has been using have been obsolete for longer
> > than many Emacs users have been alive.
> 
> They're not obsolete: they're used in grep, sed, and in Emacs.
> 
> There are several different standards for writing regexps, all of
> approximately the same age.  None is better than any other (aside from
> extra facilities available in some versions).

But surely some are "modern" and others are "obsolete", Alan. ;-)

(What's the equivalent of L'Academie Francaise for things technical?)

Emacs itself has been obsolete for longer than many Emacs users have been alive.  Emacs is dead.  Long live Emacs.

> This seems to me to be the same argument as that proposing that Emacs
> should change its key bindings to match those of other programs, because
> "everybody" knows those other bindings.

Emacs key bindings have been obsolete longer than many Emacs users have been alive.  Please remember this.




* Re: modern regexes in emacs
  2019-02-15 17:54               ` Alan Mackenzie
  2019-02-15 18:27                 ` Drew Adams
@ 2019-02-15 18:36                 ` Eli Zaretskii
  2019-02-15 18:43                   ` Mattias Engdegård
                                     ` (3 more replies)
  2019-02-15 18:44                 ` Clément Pit-Claudel
  2019-02-15 19:37                 ` Stefan Monnier
  3 siblings, 4 replies; 57+ messages in thread
From: Eli Zaretskii @ 2019-02-15 18:36 UTC (permalink / raw)
  To: Alan Mackenzie
  Cc: mattiase, lokedhs, emacs-devel, philippe.vaucher, jaygkamat,
	perry

> Date: Fri, 15 Feb 2019 17:54:05 +0000
> From: Alan Mackenzie <acm@muc.de>
> Cc: Mattias Engdegård <mattiase@acm.org>, lokedhs@gmail.com,
> 	emacs-devel@gnu.org, Philippe Vaucher <philippe.vaucher@gmail.com>,
> 	jaygkamat@gmail.com, Eli Zaretskii <eliz@gnu.org>
> 
> > Anyway, I recommend Eli's approach. We create a parallel set of
> > modernized syntax functions, and people can slowly adopt them.
> 
> I suggest we retain our current regexp notation, together with compatible
> tools, as the sole way of writing regexps in Emacs.  This notation is not
> all that bad, and it is thoroughly documented and well tested.  It's the
> approach which will cause the least confusion.  It works.

I proposed to have a separate set of functions that will accept PCRE
syntax.  That would allow everyone to have what they want: you to use
the "classic" regexps, and those who want PCRE to have that.  Where's
the problem with that?




* Re: modern regexes in emacs
  2019-02-15 18:36                 ` Eli Zaretskii
@ 2019-02-15 18:43                   ` Mattias Engdegård
  2019-02-15 19:48                     ` Eli Zaretskii
                                       ` (2 more replies)
  2019-02-15 18:46                   ` Clément Pit-Claudel
                                     ` (2 subsequent siblings)
  3 siblings, 3 replies; 57+ messages in thread
From: Mattias Engdegård @ 2019-02-15 18:43 UTC (permalink / raw)
  To: Eli Zaretskii
  Cc: Elias Mårtenson, perry, philippe.vaucher, jaygkamat,
	Alan Mackenzie, emacs-devel

15 feb. 2019 kl. 19.36 skrev Eli Zaretskii <eliz@gnu.org>:
> 
> I proposed to have a separate set of functions that will accept PCRE
> syntax.  That would allow everyone to have what they want: you to use
> the "classic" regexps, and those who want PCRE to have that.  Where's
> the problem with that?

If I read the petitioners correctly, they want a global setting that permits them to type a(b|c) instead of a\(b\|c\) in isearch-forward-regexp and all other interactive commands. A separate set of functions would be to hand them a box of replacement transistors and some solder.






* Re: modern regexes in emacs
  2019-02-15 17:54               ` Alan Mackenzie
  2019-02-15 18:27                 ` Drew Adams
  2019-02-15 18:36                 ` Eli Zaretskii
@ 2019-02-15 18:44                 ` Clément Pit-Claudel
  2019-02-15 19:37                 ` Stefan Monnier
  3 siblings, 0 replies; 57+ messages in thread
From: Clément Pit-Claudel @ 2019-02-15 18:44 UTC (permalink / raw)
  To: emacs-devel

On 15/02/2019 12.54, Alan Mackenzie wrote:
> That would be the sole reason for me for any switch.

Same here :)





* Re: modern regexes in emacs
  2019-02-15 18:36                 ` Eli Zaretskii
  2019-02-15 18:43                   ` Mattias Engdegård
@ 2019-02-15 18:46                   ` Clément Pit-Claudel
  2019-02-15 19:52                     ` Eli Zaretskii
  2019-02-15 19:14                   ` Alan Mackenzie
  2019-02-15 23:33                   ` Perry E. Metzger
  3 siblings, 1 reply; 57+ messages in thread
From: Clément Pit-Claudel @ 2019-02-15 18:46 UTC (permalink / raw)
  To: emacs-devel

On 15/02/2019 13.36, Eli Zaretskii wrote:
> I proposed to have a separate set of functions that will accept PCRE
> syntax.  That would allow everyone to have what they want: you to use
> the "classic" regexps, and those who want PCRE to have that.  Where's
> the problem with that?

I think that solution doesn't let you pass regexps using fancy PCRE features to existing code through defvars and defcustoms.  As a concrete example, sometimes assertions would be useful in regexps that define outlines, or in syntax highlighting, or in comment marker definitions. 

Clément.




* Re: modern regexes in emacs
  2019-02-15 18:36                 ` Eli Zaretskii
  2019-02-15 18:43                   ` Mattias Engdegård
  2019-02-15 18:46                   ` Clément Pit-Claudel
@ 2019-02-15 19:14                   ` Alan Mackenzie
  2019-02-15 20:00                     ` Eli Zaretskii
  2019-02-15 23:33                   ` Perry E. Metzger
  3 siblings, 1 reply; 57+ messages in thread
From: Alan Mackenzie @ 2019-02-15 19:14 UTC (permalink / raw)
  To: Eli Zaretskii
  Cc: mattiase, lokedhs, emacs-devel, philippe.vaucher, jaygkamat,
	perry

Hello, Eli.

On Fri, Feb 15, 2019 at 20:36:13 +0200, Eli Zaretskii wrote:
> > Date: Fri, 15 Feb 2019 17:54:05 +0000
> > From: Alan Mackenzie <acm@muc.de>
> > Cc: Mattias Engdegård <mattiase@acm.org>, lokedhs@gmail.com,
> > 	emacs-devel@gnu.org, Philippe Vaucher <philippe.vaucher@gmail.com>,
> > 	jaygkamat@gmail.com, Eli Zaretskii <eliz@gnu.org>

> > > Anyway, I recommend Eli's approach. We create a parallel set of
> > > modernized syntax functions, and people can slowly adopt them.

> > I suggest we retain our current regexp notation, together with compatible
> > tools, as the sole way of writing regexps in Emacs.  This notation is not
> > all that bad, and it is thoroughly documented and well tested.  It's the
> > approach which will cause the least confusion.  It works.

> I proposed to have a separate set of functions that will accept PCRE
> syntax.  That would allow everyone to have what they want: you to use
> the "classic" regexps, and those who want PCRE to have that.  Where's
> the problem with that?

This will end up with a mixture of the two incompatible styles of regexp
in the Emacs sources.  I can see there being such a mixture even within
single source files.  This will be confusing to everybody, particularly
to beginners.

Regexps are difficult.  Whether one has to escape a literal parenthesis,
or a parenthesis used as a grouping token makes little difference, IMAO,
to the overall difficulty of regexps.

And we will have yet one more technical choice where "modernists" will
attempt to force "traditionalists" to do what the "modernists" want.
This was even explicit in somebody's post in this thread (though they
pretended that it would just happen without force).

I think the costs of an alternative regexp style will outweigh any
benefits, and this will affect everybody, not just those in favour of
some alternative style.

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: modern regexes in emacs
  2019-02-15 17:54               ` Alan Mackenzie
                                   ` (2 preceding siblings ...)
  2019-02-15 18:44                 ` Clément Pit-Claudel
@ 2019-02-15 19:37                 ` Stefan Monnier
  2019-02-19 12:29                   ` Van L
  3 siblings, 1 reply; 57+ messages in thread
From: Stefan Monnier @ 2019-02-15 19:37 UTC (permalink / raw)
  To: emacs-devel

>> > (1) The syntax of PCRE -- | () {} instead of \| \(\) \{\} etc --
>> > but restricted to the set of features of the Emacs regexp engine.
>> Modern syntax is the main one.
> Such use of "modern" always gets on my nerves.  "Modern" is not the same
> as "good", and likely has a very weak correlation with it.  Why aren't we
> all using "modern" editors, for example?

I much prefer the |(){} syntax over the current \|\(\)\{\}.
This said, such a switch would probably be rather painful.
It probably makes sense to do it at the same time we switch to
Scheme/CommonLisp/Agda.


        Stefan





* Re: modern regexes in emacs
  2019-02-15 18:43                   ` Mattias Engdegård
@ 2019-02-15 19:48                     ` Eli Zaretskii
  2019-02-17  3:17                       ` Richard Stallman
  2019-02-15 23:35                     ` Perry E. Metzger
  2019-02-17 20:01                     ` Juri Linkov
  2 siblings, 1 reply; 57+ messages in thread
From: Eli Zaretskii @ 2019-02-15 19:48 UTC (permalink / raw)
  To: Mattias Engdegård
  Cc: lokedhs, perry, philippe.vaucher, jaygkamat, acm, emacs-devel

> From: Mattias Engdegård <mattiase@acm.org>
> Date: Fri, 15 Feb 2019 19:43:15 +0100
> Cc: Alan Mackenzie <acm@muc.de>,
>         Elias Mårtenson <lokedhs@gmail.com>,
>         emacs-devel@gnu.org, philippe.vaucher@gmail.com, jaygkamat@gmail.com,
>         perry@piermont.com
> 
> 15 feb. 2019 kl. 19.36 skrev Eli Zaretskii <eliz@gnu.org>:
> > 
> > I proposed to have a separate set of functions that will accept PCRE
> > syntax.  That would allow everyone to have what they want: you to use
> > the "classic" regexps, and those who want PCRE to have that.  Where's
> > the problem with that?
> 
> If I read the petitioners correctly, they want a global setting that
> permits them to type a(b|c) instead of a\(b\|c\) in
> isearch-forward-regexp and all other interactive commands.

That is not going to happen any time soon, because it will break gobs
of Emacs code.  I don't see anyone proposing that, because they all
understand the impossible implications.




* Re: modern regexes in emacs
  2019-02-15 18:46                   ` Clément Pit-Claudel
@ 2019-02-15 19:52                     ` Eli Zaretskii
  2019-02-15 20:08                       ` Clément Pit-Claudel
  0 siblings, 1 reply; 57+ messages in thread
From: Eli Zaretskii @ 2019-02-15 19:52 UTC (permalink / raw)
  To: Clément Pit-Claudel; +Cc: emacs-devel

> From: Clément Pit-Claudel <cpitclaudel@gmail.com>
> Date: Fri, 15 Feb 2019 13:46:23 -0500
> 
> On 15/02/2019 13.36, Eli Zaretskii wrote:
> > I proposed to have a separate set of functions that will accept PCRE
> > syntax.  That would allow everyone to have what they want: you to use
> > the "classic" regexps, and those who want PCRE to have that.  Where's
> > the problem with that?
> 
> I think that solution doesn't let you pass regexps using fancy PCRE features to existing code through defvars and defcustoms.  As a concrete example, sometimes assertions would be useful in regexps that define outlines, or in syntax highlighting, or in comment marker definitions. 

Then those who want that will need to come up with either a function
that converts PCRE to the traditional syntax, or invent new functions
for outlines, syntax highlighting, etc. for that.  You cannot
seriously propose a global setting that will switch to PCRE syntax, as
that will cause a lot of breakage.

The original motivation was to provide PCRE in _commands_.  Suddenly
we are talking about using PCRE everywhere?  Please hold your horses.




* Re: modern regexes in emacs
  2019-02-15 19:14                   ` Alan Mackenzie
@ 2019-02-15 20:00                     ` Eli Zaretskii
  2019-02-15 20:40                       ` Alan Mackenzie
  0 siblings, 1 reply; 57+ messages in thread
From: Eli Zaretskii @ 2019-02-15 20:00 UTC (permalink / raw)
  To: Alan Mackenzie
  Cc: mattiase, lokedhs, emacs-devel, philippe.vaucher, jaygkamat,
	perry

> Date: Fri, 15 Feb 2019 19:14:47 +0000
> Cc: perry@piermont.com, mattiase@acm.org, lokedhs@gmail.com,
>   emacs-devel@gnu.org, philippe.vaucher@gmail.com, jaygkamat@gmail.com
> From: Alan Mackenzie <acm@muc.de>
> 
> > > I suggest we retain our current regexp notation, together with compatible
> > > tools, as the sole way of writing regexps in Emacs.  This notation is not
> > > all that bad, and it is thoroughly documented and well tested.  It's the
> > > approach which will cause the least confusion.  It works.
> 
> > I proposed to have a separate set of functions that will accept PCRE
> > syntax.  That would allow everyone to have what they want: you to use
> > the "classic" regexps, and those who want PCRE to have that.  Where's
> > the problem with that?
> 
> This will end up with a mixture of the two incompatible styles of regexp
> in the Emacs sources.  I can see there being such a mixture even within
> single source files.  This will be confusing to everybody, particularly
> to beginners.

How is that different from having rx.el?  And how is that different
from having pcase.el, which invents a whole new sub-language on top
of Lisp?  Etc. etc. -- that ship has already sailed.

IMO, we'd be silly (let alone look and sound silly) to try to stop
this.  The net result will be an unbundled package which everyone will
use, while we bury our heads in the sand.




* Re: modern regexes in emacs
  2019-02-15 19:52                     ` Eli Zaretskii
@ 2019-02-15 20:08                       ` Clément Pit-Claudel
  0 siblings, 0 replies; 57+ messages in thread
From: Clément Pit-Claudel @ 2019-02-15 20:08 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

On 15/02/2019 14.52, Eli Zaretskii wrote:
> You cannot seriously propose a global setting that will switch to 
> PCRE syntax

No, not at all. Where did you see a hint of me suggesting that?

> as that will cause a lot of breakage.

Just to clarify, I'm looking for a solution that would not break anything.  Using a flag in the regexp might be one such solution ("\(?pcre:…regexp here…\)"), or it might not.

> Suddenly we are talking about using PCRE everywhere?  Please hold
> your horses.
I'm talking about allowing PCRE everywhere, not necessarily using it everywhere.  And FWIW, I don't care whether it's PCRE, I just would like to use some of the more advanced features.




* Re: modern regexes in emacs
  2019-02-15 20:00                     ` Eli Zaretskii
@ 2019-02-15 20:40                       ` Alan Mackenzie
  0 siblings, 0 replies; 57+ messages in thread
From: Alan Mackenzie @ 2019-02-15 20:40 UTC (permalink / raw)
  To: Eli Zaretskii
  Cc: mattiase, lokedhs, emacs-devel, philippe.vaucher, jaygkamat,
	perry

Hello, Eli.

On Fri, Feb 15, 2019 at 22:00:53 +0200, Eli Zaretskii wrote:
> > Date: Fri, 15 Feb 2019 19:14:47 +0000
> > Cc: perry@piermont.com, mattiase@acm.org, lokedhs@gmail.com,
> >   emacs-devel@gnu.org, philippe.vaucher@gmail.com, jaygkamat@gmail.com
> > From: Alan Mackenzie <acm@muc.de>

> > > > I suggest we retain our current regexp notation, together with compatible
> > > > tools, as the sole way of writing regexps in Emacs.  This notation is not
> > > > all that bad, and it is thoroughly documented and well tested.  It's the
> > > > approach which will cause the least confusion.  It works.

> > > I proposed to have a separate set of functions that will accept PCRE
> > > syntax.  That would allow everyone to have what they want: you to use
> > > the "classic" regexps, and those who want PCRE to have that.  Where's
> > > the problem with that?

> > This will end up with a mixture of the two incompatible styles of regexp
> > in the Emacs sources.  I can see there being such a mixture even within
> > single source files.  This will be confusing to everybody, particularly
> > to beginners.

> How is that different from having rx.el?

rx.el is fully compatible with standard regexps, and can be viewed as a
tool to construct them.
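
To illustrate that point, here is a minimal sketch (the constant name
`my-rx-example' is made up for this example): `rx' expands into an
ordinary Emacs regexp string, which the usual matching functions accept
unchanged.

```elisp
(require 'rx)

;; `rx' is a macro: the form below expands into a plain regexp string
;; in the classic Emacs syntax.  `my-rx-example' is a hypothetical name.
(defconst my-rx-example (rx "a" (or "foo" "bar"))
  "Classic regexp string built by `rx'.")

;; The string goes to the standard matchers like any hand-written regexp.
(string-match-p my-rx-example "xabar")
```

Nothing downstream of the `rx' form needs to know how the string was
written, which is the sense in which it is only a construction tool.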

> And how is that different from having pcase.el, which invents a whole
> new sub-language on top of Lisp?

I fear it might not be.  I don't think pcase.el was a good addition to
Emacs.

> Etc. etc. -- that ship has already sailed.

No, just because we have several questionable extensions to Emacs Lisp
doesn't mean we shouldn't be careful when considering further extensions.

> IMO, we'd be silly (let alone look and sound silly) to try to stop
> this.  The net result will be an unbundled package which everyone will
> use, while we bury our heads in the sand.

Maybe you're right.  But nobody in the thread before me had brought up
the disadvantages of the proposal, so I did.

To implement a full PCRE inside Emacs Lisp will involve extending it to
handle Emacs specific elements (such as \s<character>), and thus closely
tying it with syntax.c, etc.  "Trying to stop" the proposal may need
little more than not actively supporting it.

Maybe it won't be as bad as I foresee.  There are "but"s, though.

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: modern regexes in emacs
  2019-02-15 18:27                 ` Drew Adams
@ 2019-02-15 23:33                   ` Perry E. Metzger
  2019-02-16  0:34                     ` Jay Kamat
  0 siblings, 1 reply; 57+ messages in thread
From: Perry E. Metzger @ 2019-02-15 23:33 UTC (permalink / raw)
  To: Drew Adams
  Cc: Mattias Engdegård, lokedhs, emacs-devel, Philippe Vaucher,
	jaygkamat, Alan Mackenzie, Eli Zaretskii

On Fri, 15 Feb 2019 10:27:44 -0800 (PST) Drew Adams
<drew.adams@oracle.com> wrote:
> > > Modern syntax is the main one.  
> > 
> > Such use of "modern" always gets on my nerves.  "Modern" is not
> > the same as "good", and likely has a very weak correlation with
> > it.  
> 
> Not to mention that "modern" has been applied to the latest
> fashion, ephemeral or not, for at least 100 years.  Today's
> modernista is tomorrow morning's has-been, but s?he sometimes
> continues to tout the same old-fashioned modernisms.

Look, the old syntax was replaced by the Unix people in the early
1980s because it was garbage. Everyone uses the new syntax, and
everyone is used to it. Sure, new doesn't always mean better, but in
this case, yes, the newer regex syntax is a whole lot better, not to
mention that it's what everyone on earth is used to.

> > They're not obsolete: they're used in grep, sed, and in Emacs.

They are not used in egrep which is now 35 years old, and all modern
seds take modern RE syntax if you ask, and everyone who uses sed asks.

> Emacs itself has been obsolete for longer than many Emacs users
> have been alive.  Emacs is dead.  Long live Emacs.

No, Emacs is not obsolete, because it's a living editor that adapts
to the times. The Emacs of 2019 is very much not the Emacs of
1983 that I started with. It has shifted and adapted well with time.
But I think that many people seem to want to encase it in amber and
kill it by making it irrelevant to modern users. Luckily they won't
get their way.

Perry
-- 
Perry E. Metzger		perry@piermont.com




* Re: modern regexes in emacs
  2019-02-15 18:36                 ` Eli Zaretskii
                                     ` (2 preceding siblings ...)
  2019-02-15 19:14                   ` Alan Mackenzie
@ 2019-02-15 23:33                   ` Perry E. Metzger
  3 siblings, 0 replies; 57+ messages in thread
From: Perry E. Metzger @ 2019-02-15 23:33 UTC (permalink / raw)
  To: Eli Zaretskii
  Cc: mattiase, lokedhs, emacs-devel, philippe.vaucher, jaygkamat,
	Alan Mackenzie

On Fri, 15 Feb 2019 20:36:13 +0200 Eli Zaretskii <eliz@gnu.org> wrote:
> I proposed to have a separate set of functions that will accept PCRE
> syntax.  That would allow everyone to have what they want: you to
> use the "classic" regexps, and those who want PCRE to have that.
> Where's the problem with that?

There is no problem. It is the way Emacs has always adapted to the
times, and I think it's a great solution.

Perry
-- 
Perry E. Metzger		perry@piermont.com




* Re: modern regexes in emacs
  2019-02-15 18:43                   ` Mattias Engdegård
  2019-02-15 19:48                     ` Eli Zaretskii
@ 2019-02-15 23:35                     ` Perry E. Metzger
  2019-02-17 20:01                     ` Juri Linkov
  2 siblings, 0 replies; 57+ messages in thread
From: Perry E. Metzger @ 2019-02-15 23:35 UTC (permalink / raw)
  To: Mattias Engdegård
  Cc: Elias Mårtenson, emacs-devel, philippe.vaucher, jaygkamat,
	Alan Mackenzie, Eli Zaretskii

On Fri, 15 Feb 2019 19:43:15 +0100 Mattias Engdegård
<mattiase@acm.org> wrote:
> 15 feb. 2019 kl. 19.36 skrev Eli Zaretskii <eliz@gnu.org>:
> > 
> > I proposed to have a separate set of functions that will accept
> > PCRE syntax.  That would allow everyone to have what they want:
> > you to use the "classic" regexps, and those who want PCRE to have
> > that.  Where's the problem with that?  
> 
> If I read the petitioners correctly, they want a global setting
> that permits them to type a(b|c) instead of a\(b\|c\) in
> isearch-forward-regexp and all other interactive commands. A
> separate set of functions would be to hand them a box of
> replacement transistors and some solder.

That's hardly true. Rebinding a couple of functions is no work at all.
Note that I'm one of the people who wants the distinct functions. If
need be, we can give people a function to call to rebind everything
one way or the other but it is hardly a big deal even without it.

Perry
-- 
Perry E. Metzger		perry@piermont.com




* Re: modern regexes in emacs
  2019-02-15 23:33                   ` Perry E. Metzger
@ 2019-02-16  0:34                     ` Jay Kamat
  2019-02-16  1:46                       ` Perry E. Metzger
  0 siblings, 1 reply; 57+ messages in thread
From: Jay Kamat @ 2019-02-16  0:34 UTC (permalink / raw)
  To: Perry E. Metzger, Drew Adams
  Cc: Mattias Engdegård, lokedhs, emacs-devel, Philippe Vaucher,
	Alan Mackenzie, Eli Zaretskii



On February 15, 2019 3:33:17 PM PST, "Perry E. Metzger" <perry@piermont.com> wrote:

>Look, the old syntax was replaced by the Unix people in the early
>1980s because it was garbage. Everyone uses the new syntax, and
>everyone is used to it. Sure, new doesn't always mean better, but in
>this case, yes, the newer regex syntax is a whole lot better, not to
>mention that it's what everyone on earth is used to.

I started using Emacs less than 4 years ago (so I could probably consider myself a 'modern user'), and I honestly find the new syntax much more confusing. I have never used the flags in those programs to get the new syntax, and I have seen those flags in a shrinking minority of shell scripts. I find the new syntax much more unreadable and arcane than the original, although the original is difficult by itself.

I don't oppose adding more syntaxes, but I will never willingly use the new format personally.




* Re: modern regexes in emacs
  2019-02-16  0:34                     ` Jay Kamat
@ 2019-02-16  1:46                       ` Perry E. Metzger
  2019-02-16  2:44                         ` Jay Kamat
  0 siblings, 1 reply; 57+ messages in thread
From: Perry E. Metzger @ 2019-02-16  1:46 UTC (permalink / raw)
  To: Jay Kamat
  Cc: Mattias Engdegård, lokedhs, emacs-devel, Philippe Vaucher,
	Alan Mackenzie, Eli Zaretskii, Drew Adams

On Fri, 15 Feb 2019 16:34:57 -0800 Jay Kamat <jaygkamat@gmail.com>
wrote:
> On February 15, 2019 3:33:17 PM PST, "Perry E. Metzger"
> <perry@piermont.com> wrote:
> 
> >Look, the old syntax was replaced by the Unix people in the early
> >1980s because it was garbage. Everyone uses the new syntax, and
> >everyone is used to it. Sure, new doesn't always mean better, but
> >in this case, yes, the newer regex syntax is a whole lot better,
> >not to mention that it's what everyone on earth is used to.  
> 
> I started using Emacs less than 4 years ago (so I could probably
> consider myself a 'modern user'), and I honestly find the new
> syntax much more confusing.

Other people get to pick what they want, too.

Eli has proposed an entirely reasonable solution where you can pick
the ancient syntax or the modern one. You can still use the old
syntax, but people who want the syntax Perl and Ruby and Python and
Awk and pretty much every other programming language uses can get it
if they opt in.

If you don't want to let other people get the option of using
the modern syntax, that's not demanding the right to use the old one,
which won't be taken from you, that's demanding the right to keep
other people from using the reasonable modern syntax. I think that's
unreasonable.

Perry
-- 
Perry E. Metzger		perry@piermont.com




* Re: modern regexes in emacs
  2019-02-16  1:46                       ` Perry E. Metzger
@ 2019-02-16  2:44                         ` Jay Kamat
  0 siblings, 0 replies; 57+ messages in thread
From: Jay Kamat @ 2019-02-16  2:44 UTC (permalink / raw)
  To: Perry E. Metzger
  Cc: Mattias Engdegård, lokedhs, emacs-devel, Philippe Vaucher,
	Jay Kamat, Alan Mackenzie, Eli Zaretskii, Drew Adams


Perry E. Metzger writes:

> Other people get to pick what they want, too.

> If you don't want to let other people get the option of using
> the modern syntax, that's not demanding the right to use the old one,
> which won't be taken from you, that's demanding the right to keep
> other people from using the reasonable modern syntax. I think that's
> unreasonable.

I suppose you didn't read my previous mail past the first sentence, so I'll
quote it below for convenience:

>> I started using Emacs less than 4 years ago (so I could probably consider myself a 'modern user'), and I honestly find the new syntax much more confusing. I have never used the flags in those programs to get the new syntax, and I have seen those flags in a shrinking minority of shell scripts. I find the new syntax much more unreadable and arcane than the original, although the original is difficult by itself.
>>
>> I don't oppose adding more syntaxes, but I will never willingly use the new format personally.

Although, now that you've proposed it, a world where everyone must use rx does
sound very nice :).

I haven't read too much in the regexp code of Emacs, but it might be easier to
maintain one 'core' regexp language and translate 'alternative syntaxes' down
to some form of common representation. It would also be nice to provide some
facility for writing your own regexp flavor (Extended, Basic, Perl, Fixed,
<whatever's 'modern' in 50 years>) and integrating it into Emacs. Because of
that, I think it would be a good idea to make whatever interface used support
multiple possible flavors. I don't think that can easily be done with multiple
standalone functions (by itself). It would also be nice to have some facility
to pre 'compile' down to the intermediate representation ahead of time for
performance reasons (if it's not all implemented via macros like rx is).

-Jay




* Re: modern regexes in emacs
  2019-02-15 19:48                     ` Eli Zaretskii
@ 2019-02-17  3:17                       ` Richard Stallman
  2019-02-25 14:47                         ` Lars Ingebrigtsen
  0 siblings, 1 reply; 57+ messages in thread
From: Richard Stallman @ 2019-02-17  3:17 UTC (permalink / raw)
  To: Eli Zaretskii
  Cc: mattiase, lokedhs, perry, philippe.vaucher, jaygkamat, acm,
	emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > > If I read the petitioners correctly, they want a global setting that
  > > permits them to type a(b|c) instead of a\(b\|c\) in
  > > isearch-forward-regexp and all other interactive commands.

  > That is not going to happen any time soon, because it will break gobs
  > of Emacs code.  I don't see anyone proposing that, because they all
  > understand the impossible implications.

That's true, of course.  But I wonder if some other interface could
make it possible to offer optional use of egrep syntax in a compatible
way that would not break anything.

A text property on the string could specify this.  We could make a
convenient function to add that text property -- that's how users
would specify "new syntax".

  > If I read the petitioners correctly, they want a global setting
  > that permits them to type a(b|c) instead of a\(b\|c\) in
  > isearch-forward-regexp and all other interactive commands.

The incremental search commands are a separate issue -- having a flag
to control what syntax they use wouldn't break Lisp code.

If we create a new letter in interactive specs to read a regexp,
we could have a global variable specify that those should be
treated as "new syntax".  Arg reading would convert them to old syntax,
or add the text property; after that, they could be simply passed along.
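
A rough sketch of the text-property idea, under stated assumptions: the
property name `my-regexp-syntax' and both function names are
hypothetical, and nothing in Emacs consults such a property today.  It
only shows how a string could carry the tag; the matcher would still
have to be taught to honor it.

```elisp
;; Hypothetical helper: tag a regexp string with the syntax it is written in.
(defun my-mark-regexp-syntax (regexp syntax)
  "Return a copy of REGEXP carrying SYNTAX in a text property.
SYNTAX is a symbol such as `egrep'.  The property name
`my-regexp-syntax' is an assumption made for this sketch."
  (propertize regexp 'my-regexp-syntax syntax))

;; Hypothetical helper: read the tag back, defaulting to the classic syntax.
(defun my-regexp-syntax-of (regexp)
  "Return the syntax tag on REGEXP, or `emacs' if it has none."
  (or (and (> (length regexp) 0)
           (get-text-property 0 'my-regexp-syntax regexp))
      'emacs))
```

Untagged strings report `emacs', so existing Lisp code that passes
plain regexps would see no change in behavior.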

-- 
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)






* Re: modern regexes in emacs
  2019-02-15 18:43                   ` Mattias Engdegård
  2019-02-15 19:48                     ` Eli Zaretskii
  2019-02-15 23:35                     ` Perry E. Metzger
@ 2019-02-17 20:01                     ` Juri Linkov
  2019-02-18  0:38                       ` Stefan Monnier
  2 siblings, 1 reply; 57+ messages in thread
From: Juri Linkov @ 2019-02-17 20:01 UTC (permalink / raw)
  To: Mattias Engdegård
  Cc: Elias Mårtenson, perry, philippe.vaucher, jaygkamat,
	Alan Mackenzie, Eli Zaretskii, emacs-devel

> If I read the petitioners correctly, they want a global setting that
> permits them to type a(b|c) instead of a\(b\|c\) in isearch-forward-regexp
> and all other interactive commands.

This would be really trivial to do once we have a function that
will convert an extended regexp string into rx form like what
the xr package does, i.e. like the function ‘xr’, a new function
‘exr’ could convert an extended regexp - as opposed to basic regexp
as noted in https://www.regular-expressions.info/gnu.html

Then similarly to ‘isearch-symbol-regexp’, the implementation will be just
4 lines:

  (isearch-define-mode-toggle extended "\\" isearch-extended-regexp "\
  Turning on extended regexp search turns off basic regexp mode.")

  (defun isearch-extended-regexp (string &optional _lax)
    (rx-to-string (exr string)))

that will also automatically support extended regexp interactively in
occur, query-replace-regexp...




* Re: modern regexes in emacs
  2019-02-15 13:42       ` Philippe Vaucher
  2019-02-15 14:10         ` Clément Pit-Claudel
  2019-02-15 14:18         ` Eli Zaretskii
@ 2019-02-17 20:47         ` Stefan Monnier
  2019-02-18  8:40           ` Philippe Vaucher
  2019-02-18  8:55           ` Mattias Engdegård
  2 siblings, 2 replies; 57+ messages in thread
From: Stefan Monnier @ 2019-02-17 20:47 UTC (permalink / raw)
  To: emacs-devel

> Would this even be possible? I can imagine a whole lot of packages breaking
> if the regexp syntax changed, and changing it just for the user input in
> interactive functions looks a bit sketchy.

Other than isearch, most other commands should (ideally) read their
regexps interactively with `read-regexp`, so it should be easy for
a third party package to advise `read-regexp` so it accepts the PCRE
syntax (or the RX syntax, ...) and then converts it to Emacs's
own syntax.  Shouldn't break any package at all.


        Stefan





* Re: modern regexes in emacs
  2019-02-17 20:01                     ` Juri Linkov
@ 2019-02-18  0:38                       ` Stefan Monnier
  0 siblings, 0 replies; 57+ messages in thread
From: Stefan Monnier @ 2019-02-18  0:38 UTC (permalink / raw)
  To: emacs-devel

> This would be really trivial to do when we'll have a function that
> will convert an extended regexp string into rx form like what
> the xr package does, i.e. like the function ‘xr’, a new function
> ‘exr’ could convert an extended regexp - as opposed to basic regexp
> as noted in https://www.regular-expressions.info/gnu.html

FWIW, lex.el's lex-parse-re supports both ERE and BRE syntaxes.
It's more a "proof of concept" for lex.el so it's currently not as
thorough as xr.el is.


        Stefan





* Re: modern regexes in emacs
  2019-02-17 20:47         ` Stefan Monnier
@ 2019-02-18  8:40           ` Philippe Vaucher
  2019-02-18  8:55           ` Mattias Engdegård
  1 sibling, 0 replies; 57+ messages in thread
From: Philippe Vaucher @ 2019-02-18  8:40 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Emacs developers


>
> Other than isearch, most other commands should (ideally) read their
> regexps interactively with `read-regexp`, so it should be easy for
> a third party package to advise `read-regexp` so it accepts the PCRE
> syntax (or the RX syntax, ...) and then converts it to Emacs's
> own syntax.  Shouldn't break any package at all.
>

This is what `pcre-mode` does in pcre2el
https://github.com/joddie/pcre2el/blob/master/pcre2el.el#L707

The package pcre2el is already quite satisfying for me, but it needs
occasional updates to keep working and is always a bit kludgy because it
has to fight Emacs.  I would like Emacs to offer better PCRE support so that
packages like this become obsolete or really trivial to write.


* Re: modern regexes in emacs
  2019-02-17 20:47         ` Stefan Monnier
  2019-02-18  8:40           ` Philippe Vaucher
@ 2019-02-18  8:55           ` Mattias Engdegård
  1 sibling, 0 replies; 57+ messages in thread
From: Mattias Engdegård @ 2019-02-18  8:55 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

17 feb. 2019 kl. 21.47 skrev Stefan Monnier <monnier@iro.umontreal.ca>:
> 
> Other than isearch, most other commands should (ideally) read their
> regexps interactively with `read-regexp`, so it should be easy for
> a third party package to advise `read-regexp` so it accepts the PCRE
> syntax (or the RX syntax, ...) and then converts it to Emacs's
> own syntax.  Shouldn't break any package at all.

Like this?

(defun backslash-mod-regexp (re)
  "Invert the backslash requirements for |(){} in RE."
  (with-temp-buffer
    (insert re)
    (goto-char (point-min))
    (while (re-search-forward
	    (rx (or (seq "["
			 (opt "^")
			 (opt "]")
			 (* (not (any "]")))
			 "]")
		    (seq "\\" (or (group (any "|(){}"))
				  (seq (any "sScC") anything)
				  anything))
		    (group (any "|(){}"))))
	    nil t)
      (cond
       ((match-beginning 1) (replace-match "\\1"))
       ((match-beginning 2) (replace-match "\\\\\\&"))))
    (buffer-string)))

(defadvice read-regexp (after read-regexp-backslash-mod last activate)
  (when (stringp ad-return-value)
    (setq ad-return-value (backslash-mod-regexp ad-return-value))))

Of course it won't help with commands that display previously entered regexps, which users naturally want to see in the form entered, not converted. However, that should not matter functionally.
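
A minimal sketch of the same hook written with the newer `advice-add' API instead of `defadvice'. The name `my-read-regexp-swap-backslashes' is made up for illustration, and the wrapper assumes `backslash-mod-regexp' from above is loaded; without it the value passes through unchanged. It is meant to be used instead of the `defadvice' form, not alongside it, since applying the conversion twice would largely undo it.

```elisp
(defun my-read-regexp-swap-backslashes (result)
  "Return RESULT converted by `backslash-mod-regexp' when that is possible.
RESULT is the value returned by `read-regexp'.  Non-string values, and all
values when `backslash-mod-regexp' is not defined, are returned unchanged."
  (if (and (stringp result) (fboundp 'backslash-mod-regexp))
      (backslash-mod-regexp result)
    result))

;; :filter-return passes the advised function's return value through the
;; advice function and returns whatever the advice function returns.
(advice-add 'read-regexp :filter-return #'my-read-regexp-swap-backslashes)

;; To undo:
;; (advice-remove 'read-regexp #'my-read-regexp-swap-backslashes)
```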





* Re: modern regexes in emacs
  2019-02-15 19:37                 ` Stefan Monnier
@ 2019-02-19 12:29                   ` Van L
  0 siblings, 0 replies; 57+ messages in thread
From: Van L @ 2019-02-19 12:29 UTC (permalink / raw)
  To: emacs-devel

Stefan Monnier writes:

> I much prefer the |(){} syntax over the current \|\(\)\{\}.

Me too, I prefer the shorter syntax. A use case is for lightening the
cognitive load on a source code auditor.

> This said, such a switch would probably be rather painful.
> It probably makes sense to do it at the same time we switch to
> Scheme/CommonLisp/Agda.

(2038)

-- 
© 2019 Van L
gpg using EEF2 37E9 3840 0D5D 9183  251E 9830 384E 9683 B835
"What's so strange when you know that you're a Wizard at 3?" -Joni Mitchell





* Re: modern regexes in emacs
  2019-02-17  3:17                       ` Richard Stallman
@ 2019-02-25 14:47                         ` Lars Ingebrigtsen
  2019-02-25 15:46                           ` Clément Pit-Claudel
  2019-02-26 12:00                           ` Mattias Engdegård
  0 siblings, 2 replies; 57+ messages in thread
From: Lars Ingebrigtsen @ 2019-02-25 14:47 UTC (permalink / raw)
  To: emacs-devel

Richard Stallman <rms@gnu.org> writes:

> That's true, of course.  But I wonder if some other interface could
> make it possible to offer optional use of egrep syntax in a compatible
> way that would not break anything.

I have not followed this discussion closely, so I'm sorry if I'm
repeating points others have made before, but:

I think it would be nice if Emacs had a regexp object.  That would allow
us to extend Emacs in a compatible way somewhat seamlessly over a period
of time.

We could have a number of different functions that use different regexp
syntaxes, but they'd all return a regexp object.

For instance, we could have (erx "foo\\|bar") and (pcre "foo|bar") etc,
and this would allow us to experiment with different syntaxes more
freely.

The base Emacs functions that operate on regexps -- `re-search-forward'
etc -- would probably have to be adjusted en masse to take these objects
in addition to strings, but there aren't thousands of them.

And: The final endpoint (many years in the future) of all this would be
to deprecate the `re-' functions altogether, and allow functions like
`search-forward' to take either a string or a regexp object and behave
in the obvious fashion...

(search-forward (pcre "foo|bar"))
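
A rough sketch of how such an object could be prototyped in Lisp today,
to make the idea concrete.  Every `my-' name below is hypothetical, and
only a constructor for the existing Emacs syntax is shown; a `pcre'
constructor would additionally need a syntax translator, which is not
included here.

```elisp
(require 'cl-lib)

;; Hypothetical regexp object: a record holding the pattern in traditional
;; Emacs regexp syntax plus a symbol naming the notation it was written in.
(cl-defstruct (my-regexp (:constructor my-regexp--make))
  string   ; pattern in Emacs regexp syntax
  syntax)  ; source notation, e.g. the symbol `emacs'

(defun my-erx (string)
  "Wrap STRING, already in Emacs regexp syntax, in a `my-regexp' object."
  (my-regexp--make :string string :syntax 'emacs))

(defun my-search-forward (pattern &optional bound noerror count)
  "Like `search-forward', but PATTERN may be a string or a `my-regexp'.
A plain string is searched for literally; a `my-regexp' is matched as a
regular expression."
  (if (my-regexp-p pattern)
      (re-search-forward (my-regexp-string pattern) bound noerror count)
    (search-forward pattern bound noerror count)))

;; Usage: the object is matched as a regexp, so this finds "bar".
(with-temp-buffer
  (insert "xx bar yy")
  (goto-char (point-min))
  (when (my-search-forward (my-erx "foo\\|bar") nil t)
    (match-string 0)))
```

Dispatching on the argument type keeps existing callers that pass plain
strings working unchanged, which is the compatibility property argued
for above.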

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




* Re: modern regexes in emacs
  2019-02-25 14:47                         ` Lars Ingebrigtsen
@ 2019-02-25 15:46                           ` Clément Pit-Claudel
  2019-02-26  2:57                             ` Richard Stallman
  2019-02-26  3:47                             ` Elias Mårtenson
  2019-02-26 12:00                           ` Mattias Engdegård
  1 sibling, 2 replies; 57+ messages in thread
From: Clément Pit-Claudel @ 2019-02-25 15:46 UTC (permalink / raw)
  To: emacs-devel

On 25/02/2019 09.47, Lars Ingebrigtsen wrote:
> I think it would be nice if Emacs had a regexp object.  That would allow
> us to extend Emacs in a compatible way somewhat seamlessly over a period
> of time.

I think many others on this thread agree :) The one issue with this scheme is that we might need to change string manipulation functions too (many programs concatenate regexps using a plain `concat').
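
To illustrate the concern, here is a small sketch. The variable names are invented for the example, and a bare record (Emacs 26 or later) stands in for a hypothetical regexp object; it is not a proposal for how such an object would actually be represented.

```elisp
;; A common pattern: regexps are plain strings, so they are glued together
;; with `concat'.
(defvar my-keyword-re (regexp-opt '("if" "else" "while"))
  "Regexp matching a few keywords (illustrative).")
(defvar my-number-re "[0-9]+"
  "Regexp matching an unsigned integer (illustrative).")
(defvar my-token-re (concat "\\(?:" my-keyword-re "\\|" my-number-re "\\)")
  "Regexp matching either a keyword or a number.")

;; A record stands in for a hypothetical regexp object.  It is not a
;; sequence, so `concat' refuses it with a `wrong-type-argument' error.
(defvar my-concat-outcome
  (condition-case err
      (concat (record 'my-regexp my-keyword-re) "\\|" my-number-re)
    (wrong-type-argument (list 'rejected (cadr err))))
  "Outcome of passing a non-string object to `concat'.")
```

Code of the first kind would keep working only if regexp objects were accepted by `concat', or if such call sites were rewritten to use a dedicated combinator.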

Clément.




* Re: modern regexes in emacs
  2019-02-25 15:46                           ` Clément Pit-Claudel
@ 2019-02-26  2:57                             ` Richard Stallman
  2019-02-26 12:39                               ` Lars Ingebrigtsen
  2019-02-26  3:47                             ` Elias Mårtenson
  1 sibling, 1 reply; 57+ messages in thread
From: Richard Stallman @ 2019-02-26  2:57 UTC (permalink / raw)
  To: Clément Pit-Claudel; +Cc: emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

If we add a regexp type, we would need to add several primitive
functions.  I estimate it would make the Emacs Lisp Reference Manual
10 pages longer.  We would need a read syntax for them too.

Let's choose a solution which does not add a new data type.

-- 
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)






* Re: modern regexes in emacs
  2019-02-25 15:46                           ` Clément Pit-Claudel
  2019-02-26  2:57                             ` Richard Stallman
@ 2019-02-26  3:47                             ` Elias Mårtenson
  1 sibling, 0 replies; 57+ messages in thread
From: Elias Mårtenson @ 2019-02-26  3:47 UTC (permalink / raw)
  To: Clément Pit-Claudel; +Cc: emacs-devel

On Tue, 26 Feb 2019 at 00:34, Clément Pit-Claudel <cpitclaudel@gmail.com>
wrote:

> On 25/02/2019 09.47, Lars Ingebrigtsen wrote:
> > I think it would be nice if Emacs had a regexp object.  That would allow
> > us to extend Emacs in a compatible way somewhat seamlessly over a period
> > of time.
>
> I think many others on this thread agree :) The one issue with this scheme
> is that we might need to change string manipulation functions too (many
> programs concatenate regexps using a plain `concat').
>

Not saying it's a good idea, but ‘concat’ could be overloaded to support
concatenation of regexes.


* Re: modern regexes in emacs
  2019-02-25 14:47                         ` Lars Ingebrigtsen
  2019-02-25 15:46                           ` Clément Pit-Claudel
@ 2019-02-26 12:00                           ` Mattias Engdegård
  1 sibling, 0 replies; 57+ messages in thread
From: Mattias Engdegård @ 2019-02-26 12:00 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: emacs-devel

25 feb. 2019 kl. 15.47 skrev Lars Ingebrigtsen <larsi@gnus.org>:
> 
> I think it would be nice if Emacs had a regexp object.  That would allow
> us to extend Emacs in a compatible way somewhat seamlessly over a period
> of time.

Yes. Furthermore, there would be less need to rely on a cache of regexp compilation objects. Right now these are user-inaccessible; it is not possible to keep, query or manipulate them. The cache is finite and small, and occasionally thrashed. Enlarging the cache increases lookup times, and entries can still be evicted.

Reliably persistent regexp objects make it more viable to use expensive compilation methods, triggered by use thresholds or on request.





* Re: modern regexes in emacs
  2019-02-26  2:57                             ` Richard Stallman
@ 2019-02-26 12:39                               ` Lars Ingebrigtsen
  2019-02-26 13:24                                 ` Troy Hinckley
                                                   ` (2 more replies)
  0 siblings, 3 replies; 57+ messages in thread
From: Lars Ingebrigtsen @ 2019-02-26 12:39 UTC (permalink / raw)
  To: Richard Stallman; +Cc: emacs-devel

Richard Stallman <rms@gnu.org> writes:

> If we add a regexp type, we would need to add several primitive
> functions.  I estimate it would make the Emacs Lisp Reference Manual
> 10 pages longer.  We would need a read syntax for them too.

We don't need a read syntax for all new objects.  For instance, we have
no read syntax for

(window-frame) -> #<frame emacs 0x562a456cefb0>
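
The difference between the two kinds of object can be checked with a
small helper.  This is only a sketch; `my-read-back' is a made-up name.

```elisp
(defun my-read-back (object)
  "Print OBJECT with `prin1-to-string' and try to read the text back.
Return (ok . VALUE) on success, or (unreadable . TEXT) when the printed
form has no read syntax."
  (let ((text (prin1-to-string object)))
    (condition-case nil
        (cons 'ok (car (read-from-string text)))
      (invalid-read-syntax (cons 'unreadable text)))))

(my-read-back "foo\\|bar")       ; a string reads back as an equal string
(my-read-back (selected-frame))  ; a frame prints as #<frame ...> and does not
```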

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




* Re: modern regexes in emacs
  2019-02-26 12:39                               ` Lars Ingebrigtsen
@ 2019-02-26 13:24                                 ` Troy Hinckley
  2019-02-26 13:32                                   ` Lars Ingebrigtsen
  2019-02-26 15:29                                 ` Eli Zaretskii
  2019-02-27  4:08                                 ` Richard Stallman
  2 siblings, 1 reply; 57+ messages in thread
From: Troy Hinckley @ 2019-02-26 13:24 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: emacs-devel

I don’t think you could have a regexp object without a read syntax. Reading a regexp is such a common operation, whereas reading a frame has limited usefulness.

- Troy Hinckley

> On Feb 26, 2019, at 5:39 AM, Lars Ingebrigtsen <larsi@gnus.org> wrote:
> 
> Richard Stallman <rms@gnu.org> writes:
> 
>> If we add a regexp type, we would need to add several primitive
>> functions.  I estimate it would make the Emacs Lisp Reference Manual
>> 10 pages longer.  We would need a read syntax for them too.
> 
> We don't need a read syntax for all new objects.  For instance, we have
> no read syntax for
> 
> (window-frame) -> #<frame emacs 0x562a456cefb0>
> 
> -- 
> (domestic pets only, the antidote for overdose, milk.)
>   bloggy blog: http://lars.ingebrigtsen.no
> 




* Re: modern regexes in emacs
  2019-02-26 13:24                                 ` Troy Hinckley
@ 2019-02-26 13:32                                   ` Lars Ingebrigtsen
  2019-02-26 14:33                                     ` Andreas Schwab
  0 siblings, 1 reply; 57+ messages in thread
From: Lars Ingebrigtsen @ 2019-02-26 13:32 UTC (permalink / raw)
  To: Troy Hinckley; +Cc: emacs-devel

Troy Hinckley <t.macman@gmail.com> writes:

> I don’t think you could have a regexp object without a read
> syntax. Reading a regexp is such a common operation, whereas reading
> a frame has limited usefulness.

A regexp object read syntax is a possibility, but it's orthogonal to
whether to have an object at all.

You can, after all, just write (pcre "foo|bar").

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




* Re: modern regexes in emacs
  2019-02-26 13:32                                   ` Lars Ingebrigtsen
@ 2019-02-26 14:33                                     ` Andreas Schwab
  2019-02-27 12:09                                       ` Mattias Engdegård
  0 siblings, 1 reply; 57+ messages in thread
From: Andreas Schwab @ 2019-02-26 14:33 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: Troy Hinckley, emacs-devel

On Feb 26 2019, Lars Ingebrigtsen <larsi@gnus.org> wrote:

> Troy Hinckley <t.macman@gmail.com> writes:
>
>> I don’t think you could have a regexp object without a read
>> syntax. Reading a regexp is such a common operation, whereas reading
>> a frame has limited usefulness.
>
> A regexp object read syntax is a possibility, but it's orthogonal to
> whether to have an object at all.

If you want to byte-compile a form that contains a regexp object, a
proper read syntax is required.

The object types without read syntax are rather ephemeral, unlikely to
occur in byte-compiled forms.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."




* Re: modern regexes in emacs
  2019-02-26 12:39                               ` Lars Ingebrigtsen
  2019-02-26 13:24                                 ` Troy Hinckley
@ 2019-02-26 15:29                                 ` Eli Zaretskii
  2019-02-27  4:08                                 ` Richard Stallman
  2 siblings, 0 replies; 57+ messages in thread
From: Eli Zaretskii @ 2019-02-26 15:29 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: rms, emacs-devel

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Date: Tue, 26 Feb 2019 13:39:45 +0100
> Cc: emacs-devel@gnu.org
> 
> We don't need a read syntax for all new objects.  For instance, we have
> no read syntax for
> 
> (window-frame) -> #<frame emacs 0x562a456cefb0>

That's true, but don't forget that some features that record history
of various kinds do expect to be able to read back regexps.  E.g.,
desktop.el.




* Re: modern regexes in emacs
  2019-02-26 12:39                               ` Lars Ingebrigtsen
  2019-02-26 13:24                                 ` Troy Hinckley
  2019-02-26 15:29                                 ` Eli Zaretskii
@ 2019-02-27  4:08                                 ` Richard Stallman
  2 siblings, 0 replies; 57+ messages in thread
From: Richard Stallman @ 2019-02-27  4:08 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > We don't need a read syntax for all new objects.  For instance, we have
  > no read syntax for

  > (window-frame) -> #<frame emacs 0x562a456cefb0>

That is ok for things like windows, frames, and buffers, but
I think users will expect regexps to be more like strings and numbers.

-- 
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)






* Re: modern regexes in emacs
  2019-02-26 14:33                                     ` Andreas Schwab
@ 2019-02-27 12:09                                       ` Mattias Engdegård
  2019-02-27 18:18                                         ` Daniel Pittman
  0 siblings, 1 reply; 57+ messages in thread
From: Mattias Engdegård @ 2019-02-27 12:09 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Troy Hinckley, Lars Ingebrigtsen, emacs-devel

26 feb. 2019 kl. 15.33 skrev Andreas Schwab <schwab@suse.de>:
> 
> If you want to byte-compile a form that contains a regexp object, a
> proper read syntax is required.
> 
> The object types without read syntax are rather ephemeral, unlikely to
> occur in byte-compiled forms.

Thanks for pointing that out. I'm not sure how it would work -- please bear with me.

Suppose we want to write (looking-at (pcre "a(b|c)")).
Then `pcre' is a macro returning a mutable object with the regexp in some canonical form -- a traditional Emacs regexp, perhaps, or normalised rx or something else. The object also has space for the internal compiled pattern, roughly struct re_pattern_buffer today.

As Richard pointed out, it is polite to make the object human-readable (for debugging, if nothing else). This means that we are either satisfied with the readability of the canonical form, or the original pattern is kept around for this purpose.

Then (pcre "a(b|c)") might produce #s(regexp "a\\(b\\|c\\)" nil), which can be serialised and read, even by humans.
After its first use, the last slot would have become something like #<compiled-regexp 0xabc123>, but that would not occur in byte-compiled elisp code.
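
As a sanity check of the serialisation part, the existing record type (Emacs 26 or later) already prints and reads in this #s(...) form. The sketch below uses a bare record tagged `regexp' purely as a stand-in; the function name is made up, and nothing here compiles or caches a pattern.

```elisp
(defun my-regexp-record-round-trip ()
  "Print a stand-in regexp record, read it back, and use the result.
Return a list (TEXT SAME MATCH-POSITION)."
  (let* ((obj  (record 'regexp "a\\(b\\|c\\)" nil)) ; tag, pattern, compiled slot
         (text (prin1-to-string obj))               ; printed #s(...) form
         (back (car (read-from-string text))))      ; read it back
    (list text
          (equal obj back)
          ;; The pattern slot of the object read back is still a usable regexp.
          (string-match-p (aref back 1) "xac"))))

(my-regexp-record-round-trip)
```

The compiled slot is nil both before and after the round trip here; an object holding a live compiled pattern would need that slot reset or omitted when printed.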





* Re: modern regexes in emacs
  2019-02-27 12:09                                       ` Mattias Engdegård
@ 2019-02-27 18:18                                         ` Daniel Pittman
  0 siblings, 0 replies; 57+ messages in thread
From: Daniel Pittman @ 2019-02-27 18:18 UTC (permalink / raw)
  To: Mattias Engdegård
  Cc: Troy Hinckley, Andreas Schwab, Lars Ingebrigtsen, emacs-devel

On Wed, Feb 27, 2019 at 8:53 AM Mattias Engdegård <mattiase@acm.org> wrote:

> 26 feb. 2019 kl. 15.33 skrev Andreas Schwab <schwab@suse.de>:
> >
> > If you want to byte-compile a form that contains a regexp object, a
> > proper read syntax is required.
> >
> > The object types without read syntax are rather ephemeral, unlikely to
> > occur in byte-compiled forms.
>
> Thanks for pointing that out. I'm not sure how it would work -- please
> bear with me.
>
> Suppose we want to write (looking-at (pcre "a(b|c)")).
> Then `pcre' is a macro returning a mutable object with the regexp in some
> canonical form -- a traditional Emacs regexp, perhaps, or normalised rx or
> something else. The object also has space for the internal compiled
> pattern, roughly struct re_pattern_buffer today.
>
> As Richard pointed out, it is polite to make the object human-readable
> (for debugging, if nothing else). This means that we are either satisfied
> with the readability of the canonical form, or the original pattern is
> kept around for this purpose.


This is somewhat of an outsider's opinion, but it is based on many years of
helping junior developers get up to speed with a wide range of languages,
so I hope the suggestion is useful.  Other languages express regex literals
with the equivalent of a CL reader macro, or the record literal syntax
#s(...):

Clojure: #"..."
JavaScript and many others: /.../
Racket: #rx"..." and #px"..." for basic and PCRE respectively.
Dart, and a few others: r"...", or r'...', or a tagged prefix such as
$r"..." or %r/.../

Of those the most Emacs Lisp-ish would be something like the Racket
versions for supporting both types, for example `#r"..."`, or `#pcre"..."`,
or even `#rx(...)`.

I'd personally suggest that an additional reader (macro) syntax, also used
for the printed form, is the most user-friendly option.  The S-expression
form, a printed representation of `(pcre "...")` etc., is a little less
friendly but in my eyes the best fallback.  It works, but it doesn't give
the "compiled expression" an identity distinct from the functions that
create it, and I think separating the two is the correct choice.


end of thread, newest message: 2019-02-27 18:18 UTC

Thread overview: 57+ messages
2018-06-16 16:37 modern regexes in emacs Perry E. Metzger
2018-06-16 17:45 ` Radon Rosborough
2018-06-16 18:25   ` Perry E. Metzger
2018-06-16 21:01     ` Daniel Colascione
2018-06-16 22:31 ` Jay Kamat
2019-02-09 17:20   ` Philippe Vaucher
2019-02-10  9:39   ` Elias Mårtenson
2019-02-11 22:12     ` Mattias Engdegård
2019-02-15 13:42       ` Philippe Vaucher
2019-02-15 14:10         ` Clément Pit-Claudel
2019-02-15 15:03           ` Philippe Vaucher
2019-02-15 15:13             ` Clément Pit-Claudel
2019-02-15 14:18         ` Eli Zaretskii
2019-02-15 15:28           ` Perry E. Metzger
2019-02-15 16:06             ` Stefan Monnier
2019-02-15 16:24           ` Mattias Engdegård
2019-02-15 16:47             ` Perry E. Metzger
2019-02-15 17:54               ` Alan Mackenzie
2019-02-15 18:27                 ` Drew Adams
2019-02-15 23:33                   ` Perry E. Metzger
2019-02-16  0:34                     ` Jay Kamat
2019-02-16  1:46                       ` Perry E. Metzger
2019-02-16  2:44                         ` Jay Kamat
2019-02-15 18:36                 ` Eli Zaretskii
2019-02-15 18:43                   ` Mattias Engdegård
2019-02-15 19:48                     ` Eli Zaretskii
2019-02-17  3:17                       ` Richard Stallman
2019-02-25 14:47                         ` Lars Ingebrigtsen
2019-02-25 15:46                           ` Clément Pit-Claudel
2019-02-26  2:57                             ` Richard Stallman
2019-02-26 12:39                               ` Lars Ingebrigtsen
2019-02-26 13:24                                 ` Troy Hinckley
2019-02-26 13:32                                   ` Lars Ingebrigtsen
2019-02-26 14:33                                     ` Andreas Schwab
2019-02-27 12:09                                       ` Mattias Engdegård
2019-02-27 18:18                                         ` Daniel Pittman
2019-02-26 15:29                                 ` Eli Zaretskii
2019-02-27  4:08                                 ` Richard Stallman
2019-02-26  3:47                             ` Elias Mårtenson
2019-02-26 12:00                           ` Mattias Engdegård
2019-02-15 23:35                     ` Perry E. Metzger
2019-02-17 20:01                     ` Juri Linkov
2019-02-18  0:38                       ` Stefan Monnier
2019-02-15 18:46                   ` Clément Pit-Claudel
2019-02-15 19:52                     ` Eli Zaretskii
2019-02-15 20:08                       ` Clément Pit-Claudel
2019-02-15 19:14                   ` Alan Mackenzie
2019-02-15 20:00                     ` Eli Zaretskii
2019-02-15 20:40                       ` Alan Mackenzie
2019-02-15 23:33                   ` Perry E. Metzger
2019-02-15 18:44                 ` Clément Pit-Claudel
2019-02-15 19:37                 ` Stefan Monnier
2019-02-19 12:29                   ` Van L
2019-02-17 20:47         ` Stefan Monnier
2019-02-18  8:40           ` Philippe Vaucher
2019-02-18  8:55           ` Mattias Engdegård
  -- strict thread matches above, loose matches on Subject: below --
2018-06-16 21:33 Jimmy Yuen Ho Wong

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git
