unofficial mirror of emacs-devel@gnu.org 
* Ugly regexps
@ 2021-03-03  0:32 Stefan Monnier
  2021-03-03  1:32 ` Stefan Kangas
                   ` (4 more replies)
  0 siblings, 5 replies; 42+ messages in thread
From: Stefan Monnier @ 2021-03-03  0:32 UTC (permalink / raw)
  To: emacs-devel

BTW, while this theme of ugly regexps keeps coming up, how 'bout we add
a new function `ere` which converts from the ERE style of regexps
where grouping parens are not escaped (and plain chars meant to match
an actual paren need to be escaped instead) to ELisp-style regexps?

So you can do

    (string-match (ere "\\(def(macro|un|subst) .{1,}"))

instead of

    (string-match "(def\\(macro\\|un\\|subst\\) .\\{1,\\}")

?


        Stefan


(defun ere (re)
  "Convert an ERE-style regexp RE to an Emacs-style regexp."
  (let ((pos 0)
        (last 0)
        (chunks '()))
    (while (string-match "\\\\.\\|[{}()|]" re pos)
      (let ((beg (match-beginning 0))
            (end (match-end 0)))
        (when (subregexp-context-p re beg)
          (cond
           ;; An unescaped operator char (grouping etc. in ERE): add a backslash.
           ((= (1+ beg) end)
            (push (substring re last beg) chunks) (setq last beg)
            (push "\\" chunks))
           ;; A backslash-escaped operator char (literal in ERE): drop the backslash.
           ((memq (aref re (1+ beg)) '(?\( ?\) ?\{ ?\} ?\|))
            (push (substring re last beg) chunks)
            (setq last (1+ beg)))))
        (setq pos end)))
    (mapconcat #'identity (nreverse (cons (substring re last) chunks)) "")))
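
As a hedged illustration of the intended result, here is the hand-escaped
regexp from above (the string the `ere` call is meant to produce) used in a
complete `string-match` call with its string argument.  The helper name
`demo-def-keyword` and the sample inputs are assumptions made for this
sketch, not part of the proposal.

```elisp
(defun demo-def-keyword (form-text)
  "Return the keyword suffix matched in FORM-TEXT, or nil.
Hypothetical helper for illustration only."
  ;; Emacs syntax: a plain ( is a literal paren, while \\( \\| \\) group
  ;; and alternate, and \\{1,\\} means one or more repetitions.
  (let ((emacs-re "(def\\(macro\\|un\\|subst\\) .\\{1,\\}"))
    (when (string-match emacs-re form-text)
      ;; Group 1 holds whichever alternative matched.
      (match-string 1 form-text))))

(demo-def-keyword "(defun foo ())")   ; → "un"
(demo-def-keyword "(setq x 1)")       ; → nil
```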





* Re: Ugly regexps
  2021-03-03  0:32 Ugly regexps Stefan Monnier
@ 2021-03-03  1:32 ` Stefan Kangas
  2021-03-03  2:08   ` Stefan Kangas
  2021-03-03 20:46   ` Alan Mackenzie
  2021-03-03  6:00 ` Eli Zaretskii
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 42+ messages in thread
From: Stefan Kangas @ 2021-03-03  1:32 UTC (permalink / raw)
  To: Stefan Monnier, emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> BTW, while this theme of ugly regexps keeps coming up, how 'bout we add
> a new function `ere` which converts from the ERE style of regexps
> where grouping parens are not escaped (and plain chars meant to match
> an actual paren need to be escaped instead) to ELisp-style regexps?
>
> So you can do
>
>     (string-match (ere "\\(def(macro|un|subst) .{1,}"))
>
> instead of
>
>     (string-match "(def\\(macro\\|un\\|subst\\) .\\{1,\\}")
>
> ?

Sounds good to me.

I was going to ask why not just do PCRE, but then I realized I'm not
exactly sure what the syntactical differences are.  (We obviously lack
some features.)  AFAIR, Emacs regexps don't exactly match GNU grep,
egrep, Perl, or anything else really.

So I cranked out my dusty old copy of Mastering Regular Expressions and
found this overview:

    grep           egrep          Emacs          Perl
    \? \+ \|       ? + |          ? + \|         ? + |
    \( \)          ( )            \( \)          ( )
                   \< \>          \< \> \b \B    \b \B

    (Excerpt from Mastering Regular Expressions: Table 3-3: A (Very)
    Superficial Look at the Flavor of a Few Common Tools)

This shows the differences that most commonly bite you, in my
experience.
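
To make the Emacs column concrete, here is a small sketch (the helper name
`demo-word-alt` and the sample strings are assumptions): in an Emacs regexp
`+` is an operator as written, grouping and alternation need backslashes,
and `\<` / `\>` match at word boundaries.

```elisp
(defun demo-word-alt (text)
  "Return the first whole word in TEXT made only of \"foo\"/\"bar\" repeats.
Hypothetical helper for illustration only."
  ;; \\< and \\> are word boundaries, \\( \\| \\) group and alternate,
  ;; and + repeats the group without needing a backslash.
  (when (string-match "\\<\\(foo\\|bar\\)+\\>" text)
    (match-string 0 text)))

(demo-word-alt "a foobar b")    ; → "foobar"
(demo-word-alt "a foobaz b")    ; → nil
```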

While we're at it, has it ever been discussed to add support for the
pcre library side-by-side with our homegrown regexp.c?  It would give us
sane (standard) syntax and some useful features "for free"
(e.g. lookaround).  I didn't test but a priori I would also assume the
code to be much more performant than anything we could ever cook up
ourselves.  It is used by several high-profile projects.

I would imagine we'd introduce entirely new function names for it.
Perhaps even a completely new and improved API like Lars suggested a
while back.




* Re: Ugly regexps
  2021-03-03  1:32 ` Stefan Kangas
@ 2021-03-03  2:08   ` Stefan Kangas
  2021-03-03  6:19     ` Eli Zaretskii
  2021-03-03 20:46   ` Alan Mackenzie
  1 sibling, 1 reply; 42+ messages in thread
From: Stefan Kangas @ 2021-03-03  2:08 UTC (permalink / raw)
  To: Stefan Monnier, emacs-devel

Stefan Kangas <stefankangas@gmail.com> writes:

> While we're at it, has it ever been discussed to add support for the
> pcre library side-by-side with our homegrown regexp.c?  It would give us
> sane (standard) syntax and some useful features "for free"
> (e.g. lookaround).  I didn't test but a priori I would also assume the
> code to be much more performant than anything we could ever cook up
> ourselves.  It is used by several high-profile projects.

Of course this had already been discussed.  I found this interesting
thread from 2012:

    https://lists.gnu.org/archive/html/emacs-devel/2012-01/msg00736.html

Long story short, it may be a non-trivial job.  In particular supporting
the \s and \c operators seems like a hard nut to crack.
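
For readers unfamiliar with these operators, a brief sketch (the helper
name and sample strings are assumptions): `\sC` matches by syntax class and
`\cC` by character category, and both are looked up in Emacs's own tables
at match time, which is presumably part of what makes them awkward to map
onto an external library.

```elisp
(defun demo-first-whitespace (text)
  "Return the index of the first whitespace-syntax character in TEXT, or nil.
Hypothetical helper for illustration only."
  ;; \\s- consults the syntax table; pin it to the standard one so the
  ;; result does not depend on the current buffer's major mode.
  (with-syntax-table (standard-syntax-table)
    (string-match "\\s-" text)))

(demo-first-whitespace "foo bar")   ; → 3
(demo-first-whitespace "foobar")    ; → nil

;; \\cg matches a character belonging to the Greek category, which is
;; resolved through the category table rather than the regexp itself.
(string-match "\\cg" "abc λ")
```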




* Re: Ugly regexps
  2021-03-03  0:32 Ugly regexps Stefan Monnier
  2021-03-03  1:32 ` Stefan Kangas
@ 2021-03-03  6:00 ` Eli Zaretskii
  2021-03-03 15:46   ` Stefan Monnier
  2021-03-03  7:09 ` Helmut Eller
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 42+ messages in thread
From: Eli Zaretskii @ 2021-03-03  6:00 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Tue, 02 Mar 2021 19:32:20 -0500
> 
> So you can do
> 
>     (string-match (ere "\\(def(macro|un|subst) .{1,}"))
> 
> instead of
> 
>     (string-match "(def\\(macro\\|un\\|subst\\) .\\{1,\\}")

Why not use 'rx' in those cases?  IMO it makes the regexp even
easier to write and read.




* Re: Ugly regexps
  2021-03-03  2:08   ` Stefan Kangas
@ 2021-03-03  6:19     ` Eli Zaretskii
  0 siblings, 0 replies; 42+ messages in thread
From: Eli Zaretskii @ 2021-03-03  6:19 UTC (permalink / raw)
  To: Stefan Kangas; +Cc: monnier, emacs-devel

> From: Stefan Kangas <stefankangas@gmail.com>
> Date: Tue, 2 Mar 2021 20:08:53 -0600
> 
>     https://lists.gnu.org/archive/html/emacs-devel/2012-01/msg00736.html
> 
> Long story short, it may be a non-trivial job.  In particular supporting
> the \s and \c operators seems like a hard nut to crack.

Yes.  But I think supporting non-ASCII characters is also not easy,
and is the main reason we still don't use Gnulib's regexp code in Emacs.

Another worthy goal, if we are talking about this, is to support more
of the Unicode Regular Expressions, at least at the functional level,
if not syntactically.  See https://unicode.org/reports/tr18/ for the
details.




* Re: Ugly regexps
  2021-03-03  0:32 Ugly regexps Stefan Monnier
  2021-03-03  1:32 ` Stefan Kangas
  2021-03-03  6:00 ` Eli Zaretskii
@ 2021-03-03  7:09 ` Helmut Eller
  2021-03-03 14:11   ` Stefan Kangas
  2021-03-03 15:49   ` Stefan Monnier
  2021-03-03 12:17 ` Dmitry Gutov
  2021-03-03 13:57 ` Lars Ingebrigtsen
  4 siblings, 2 replies; 42+ messages in thread
From: Helmut Eller @ 2021-03-03  7:09 UTC (permalink / raw)
  To: emacs-devel

On Tue, Mar 02 2021, Stefan Monnier wrote:

> BTW, while this theme of ugly regexps keeps coming up, how 'bout we add
> a new function `ere` which converts from the ERE style of regexps
> where grouping parens are not escaped (and plain chars meant to match
> an actual paren need to be escaped instead) to ELisp-style regexps?
>
> So you can do
>
>     (string-match (ere "\\(def(macro|un|subst) .{1,}"))

Better call it (rx (ere STRING)).  Namespace pollution may not be
prohibited by the law but mother Emacs may thank you anyway :-)

Helmut





* Re: Ugly regexps
  2021-03-03  0:32 Ugly regexps Stefan Monnier
                   ` (2 preceding siblings ...)
  2021-03-03  7:09 ` Helmut Eller
@ 2021-03-03 12:17 ` Dmitry Gutov
  2021-03-03 15:48   ` Stefan Monnier
  2021-03-03 13:57 ` Lars Ingebrigtsen
  4 siblings, 1 reply; 42+ messages in thread
From: Dmitry Gutov @ 2021-03-03 12:17 UTC (permalink / raw)
  To: Stefan Monnier, emacs-devel

On 03.03.2021 02:32, Stefan Monnier wrote:
> (defun ere (re)
>    "Convert an ERE-style regexp RE to an Emacs-style regexp."
>    (let ((pos 0)
>          (last 0)
>          (chunks '()))
>      (while (string-match "\\\\.\\|[{}()|]" re pos)
>        (let ((beg (match-beginning 0))
>              (end (match-end 0)))
>          (when (subregexp-context-p re beg)
>            (cond
>             ;; A normal paren: add a backslash.
>             ((= (1+ beg) end)
>              (push (substring re last beg) chunks) (setq last beg)
>              (push "\\" chunks))
>             ;; A grouping paren: skip the backslash.
>             ((memq (aref re (1+ beg)) '(?\( ?\) ?\{ ?\} ?\|))
>              (push (substring re last beg) chunks)
>              (setq last (1+ beg)))))
>          (setq pos end)))
>      (mapconcat #'identity (nreverse (cons (substring re last) chunks)) "")))

See also xref--regexp-to-extended, my last attempt at RE->ERE 
conversion, though woefully lacking in tests.

Its goal was to move in the other direction, but (unless I'm missing 
something about the syntax differences) this function is reversible.




* Re: Ugly regexps
  2021-03-03  0:32 Ugly regexps Stefan Monnier
                   ` (3 preceding siblings ...)
  2021-03-03 12:17 ` Dmitry Gutov
@ 2021-03-03 13:57 ` Lars Ingebrigtsen
  4 siblings, 0 replies; 42+ messages in thread
From: Lars Ingebrigtsen @ 2021-03-03 13:57 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> So you can do
>
>     (string-match (ere "\\(def(macro|un|subst) .{1,}"))
>
> instead of
>
>     (string-match "(def\\(macro\\|un\\|subst\\) .\\{1,\\}")
>
> ?

Sounds good to me.  In some cases, introducing an alternative syntax can
create confusion, but I don't think that's really the case here -- I
think everybody knows this syntax, perhaps better than the Emacs regexp
syntax.

The byte compiler can do the transformation, I guess?  (When it's a
string literal, which it usually is.)  So there should be no performance
impact.  And when Emacs finally grows support for a regexp object, then
`ere' can return one of those.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




* Re: Ugly regexps
  2021-03-03  7:09 ` Helmut Eller
@ 2021-03-03 14:11   ` Stefan Kangas
  2021-03-03 16:40     ` Stefan Monnier
  2021-03-03 15:49   ` Stefan Monnier
  1 sibling, 1 reply; 42+ messages in thread
From: Stefan Kangas @ 2021-03-03 14:11 UTC (permalink / raw)
  To: Helmut Eller, emacs-devel

Helmut Eller <eller.helmut@gmail.com> writes:

>>     (string-match (ere "\\(def(macro|un|subst) .{1,}"))
>
> Better call it (rx (ere STRING)).  Namespace pollution may not be
> prohibited by the law but mother Emacs may thank you anyway :-)

FWIW, I'd strongly prefer `ere'.

Some of us will be using this macro *a lot*.




* Re: Ugly regexps
  2021-03-03  6:00 ` Eli Zaretskii
@ 2021-03-03 15:46   ` Stefan Monnier
  2021-03-03 16:30     ` Eli Zaretskii
  0 siblings, 1 reply; 42+ messages in thread
From: Stefan Monnier @ 2021-03-03 15:46 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

>> So you can do
>>
>>     (string-match (ere "\\(def(macro|un|subst) .{1,}"))
>>
>> instead of
>>
>>     (string-match "(def\\(macro\\|un\\|subst\\) .\\{1,\\}")
>
> Why not use 'rx' in those cases?

Not sure what you mean by "those cases".  I'm thinking this `ere` would
be useful for the cases where the author finds `rx` unpalatable for
some reason.

> IMO it makes the regexp even easier to write and read.

I believe this depends on taste and circumstances.  Experience shows
that while some packages use `rx` extensively, most ELisp code doesn't.


        Stefan





* Re: Ugly regexps
  2021-03-03 12:17 ` Dmitry Gutov
@ 2021-03-03 15:48   ` Stefan Monnier
  0 siblings, 0 replies; 42+ messages in thread
From: Stefan Monnier @ 2021-03-03 15:48 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

>> (defun ere (re)
>>    "Convert an ERE-style regexp RE to an Emacs-style regexp."
>>    (let ((pos 0)
>>          (last 0)
>>          (chunks '()))
>>      (while (string-match "\\\\.\\|[{}()|]" re pos)
>>        (let ((beg (match-beginning 0))
>>              (end (match-end 0)))
>>          (when (subregexp-context-p re beg)
>>            (cond
>>             ;; A normal paren: add a backslash.
>>             ((= (1+ beg) end)
>>              (push (substring re last beg) chunks) (setq last beg)
>>              (push "\\" chunks))
>>             ;; A grouping paren: skip the backslash.
>>             ((memq (aref re (1+ beg)) '(?\( ?\) ?\{ ?\} ?\|))
>>              (push (substring re last beg) chunks)
>>              (setq last (1+ beg)))))
>>          (setq pos end)))
>>      (mapconcat #'identity (nreverse (cons (substring re last) chunks)) "")))
>
> See also xref--regexp-to-extended, my last attempt at RE->ERE conversion,
> though woefully lacking in tests.

Oh, thanks for the pointer.

> Its goal was to move in the other direction, but (unless I'm missing
> something about the syntax differences) this function is reversible.

Indeed it is.


        Stefan





* Re: Ugly regexps
  2021-03-03  7:09 ` Helmut Eller
  2021-03-03 14:11   ` Stefan Kangas
@ 2021-03-03 15:49   ` Stefan Monnier
  1 sibling, 0 replies; 42+ messages in thread
From: Stefan Monnier @ 2021-03-03 15:49 UTC (permalink / raw)
  To: Helmut Eller; +Cc: emacs-devel

>>     (string-match (ere "\\(def(macro|un|subst) .{1,}"))
> Better call it (rx (ere STRING)).

I think "every character counts" here.


        Stefan





* Re: Ugly regexps
  2021-03-03 15:46   ` Stefan Monnier
@ 2021-03-03 16:30     ` Eli Zaretskii
  2021-03-03 17:44       ` Stefan Monnier
  0 siblings, 1 reply; 42+ messages in thread
From: Eli Zaretskii @ 2021-03-03 16:30 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: emacs-devel@gnu.org
> Date: Wed, 03 Mar 2021 10:46:20 -0500
> 
> >>     (string-match (ere "\\(def(macro|un|subst) .{1,}"))
> >>
> >> instead of
> >>
> >>     (string-match "(def\\(macro\\|un\\|subst\\) .\\{1,\\}")
> >
> > Why not use 'rx' in those cases?
> 
> Not sure what you mean by "those cases".  I'm thinking this `ere` would
> be useful for the cases where the author finds `rx` unpalatable for
> some reason.

Why would someone find rx unpalatable?

> > IMO it makes the regexp even easier to write and read.
> 
> I believe this depends on taste and circumstances.  Experience shows
> that while some packages use `rx` extensively, most ELisp code doesn't.

If this is about personal preferences and tastes, then I think having
3 different flavors of regexps in our sources due to personal
preferences is not necessarily a good idea.  We have coding
conventions for a reason.




* Re: Ugly regexps
  2021-03-03 14:11   ` Stefan Kangas
@ 2021-03-03 16:40     ` Stefan Monnier
  0 siblings, 0 replies; 42+ messages in thread
From: Stefan Monnier @ 2021-03-03 16:40 UTC (permalink / raw)
  To: Stefan Kangas; +Cc: Helmut Eller, emacs-devel

> Some of us will be using this macro *a lot*.

Side note: it's a *function*


        Stefan





* Re: Ugly regexps
  2021-03-03 16:30     ` Eli Zaretskii
@ 2021-03-03 17:44       ` Stefan Monnier
  2021-03-03 18:46         ` Stefan Kangas
  0 siblings, 1 reply; 42+ messages in thread
From: Stefan Monnier @ 2021-03-03 17:44 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

>> Not sure what you mean by "those cases".  I'm thinking this `ere` would
>> be useful for the cases where the author finds `rx` unpalatable for
>> some reason.
> Why would someone find rx unpalatable?

Maybe just because of habit, but I think the main downside of `rx` is
that it's very verbose, which ends up hiding the "text".  For example in

    (rx "(def" (or "macro" "un" "subst")))

I find the `or` to get a bit in the way of my visual cortex recognizing
the "defmacro" pattern above.
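
For comparison, a sketch showing that the `rx` form and a hand-written
regexp string accept the same input.  The helper names and the sample
string are assumptions, and `group` is added here only so that the matched
alternative can be extracted.

```elisp
(require 'rx)

(defun demo-rx-keyword (text)
  "Return the def-keyword suffix matched in TEXT using an `rx' form, or nil.
Hypothetical helper for illustration only."
  ;; The string "(def" is matched literally; `or' lists the alternatives.
  (when (string-match (rx "(def" (group (or "macro" "un" "subst"))) text)
    (match-string 1 text)))

(defun demo-string-keyword (text)
  "Same as `demo-rx-keyword', but with a hand-written Emacs regexp.
Hypothetical helper for illustration only."
  (when (string-match "(def\\(macro\\|un\\|subst\\)" text)
    (match-string 1 text)))

(demo-rx-keyword "(defsubst f ())")      ; → "subst"
(demo-string-keyword "(defsubst f ())")  ; → "subst"
```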

> If this is about personal preferences and tastes, then I think having
> 3 different flavors of regexps in our sources due to personal
> preferences is not necessarily a good idea.

Yes, it's the downside.


        Stefan





* Re: Ugly regexps
  2021-03-03 17:44       ` Stefan Monnier
@ 2021-03-03 18:46         ` Stefan Kangas
  2021-03-03 19:21           ` Eli Zaretskii
  2021-03-03 19:32           ` [External] : " Drew Adams
  0 siblings, 2 replies; 42+ messages in thread
From: Stefan Kangas @ 2021-03-03 18:46 UTC (permalink / raw)
  To: Stefan Monnier, Eli Zaretskii; +Cc: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> Why would someone find rx unpalatable?
>
> Maybe just because of habit, but I think the main downside of `rx` is
> that it's very verbose, which ends up hiding the "text".  For example in
>
>     (rx "(def" (or "macro" "un" "subst")))
>
> I find the `or` to get a bit in the way of my visual cortex recognizing
> the "defmacro" pattern above.

It is also just another thing to learn.

If you're just doing some basic ELisp functions for your personal
editing you might not want to spend time parsing the docstring of `rx'
just to say "^(foo|bar)".  This applies also if you're just writing some
small package that just needs a regexp or two.

Also, `rx' does not translate to most other languages.  So if you are
learning regexps for the first time or are still struggling with them,
you will IMO probably be better off staying away from `rx' for a while.
Note also that you can't use `rx' syntax in `query-replace-regexp'.

I am not surprised that we don't see `rx' used more, even if I would
certainly wish that wasn't the case.  Especially in our own sources.
(It's too bad that we don't use it in our preloaded code, for example.)




* Re: Ugly regexps
  2021-03-03 18:46         ` Stefan Kangas
@ 2021-03-03 19:21           ` Eli Zaretskii
  2021-03-03 19:50             ` Stefan Kangas
                               ` (2 more replies)
  2021-03-03 19:32           ` [External] : " Drew Adams
  1 sibling, 3 replies; 42+ messages in thread
From: Eli Zaretskii @ 2021-03-03 19:21 UTC (permalink / raw)
  To: Stefan Kangas; +Cc: monnier, emacs-devel

> From: Stefan Kangas <stefankangas@gmail.com>
> Date: Wed, 3 Mar 2021 12:46:47 -0600
> Cc: emacs-devel@gnu.org
> 
> Stefan Monnier <monnier@iro.umontreal.ca> writes:
> 
> >> Why would someone find rx unpalatable?
> >
> > Maybe just because of habit, but I think the main downside of `rx` is
> > that it's very verbose, which ends up hiding the "text".  For example in
> >
> >     (rx "(def" (or "macro" "un" "subst")))
> >
> > I find the `or` to get a bit in the way of my visual cortex recognizing
> > the "defmacro" pattern above.
> 
> It is also just another thing to learn.

And ERE isn't?




* RE: [External] : Re: Ugly regexps
  2021-03-03 18:46         ` Stefan Kangas
  2021-03-03 19:21           ` Eli Zaretskii
@ 2021-03-03 19:32           ` Drew Adams
  1 sibling, 0 replies; 42+ messages in thread
From: Drew Adams @ 2021-03-03 19:32 UTC (permalink / raw)
  To: Stefan Kangas, Stefan Monnier, Eli Zaretskii; +Cc: emacs-devel@gnu.org

To add to what some others have said -

Is RX usable as part of our interactive use of regexps?
If so, great (assuming the UI for that is well done).

If not, I'd say that's another reason that at least
some of us might not bother with (or aren't in the
habit of using) RX.

I think that interactive use of regexps is the most
important for Emacs - more important than what is
used for Elisp.  And if that means (as it does now)
Elisp regexps, then that's what people will and should
learn: Elisp regexp syntax.

Of course, for interactive use, we already remove the
need for double backslashing etc.  But the regexp
dialect that's available interactively is (so far) the
Elisp one, not some other.  I think that alone may
explain limited use of RX in code.  (Just a thought.)



* Re: Ugly regexps
  2021-03-03 19:21           ` Eli Zaretskii
@ 2021-03-03 19:50             ` Stefan Kangas
  2021-03-03 20:16               ` Stefan Kangas
  2021-03-03 19:50             ` Stefan Kangas
  2021-03-03 19:58             ` Dmitry Gutov
  2 siblings, 1 reply; 42+ messages in thread
From: Stefan Kangas @ 2021-03-03 19:50 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> It is also just another thing to learn.
>
> And ERE isn't?

Exactly.  I mean, it




* Re: Ugly regexps
  2021-03-03 19:21           ` Eli Zaretskii
  2021-03-03 19:50             ` Stefan Kangas
@ 2021-03-03 19:50             ` Stefan Kangas
  2021-03-03 19:58             ` Dmitry Gutov
  2 siblings, 0 replies; 42+ messages in thread
From: Stefan Kangas @ 2021-03-03 19:50 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> It is also just another thing to learn.
>
> And ERE isn't?

Exactly.




* Re: Ugly regexps
  2021-03-03 19:21           ` Eli Zaretskii
  2021-03-03 19:50             ` Stefan Kangas
  2021-03-03 19:50             ` Stefan Kangas
@ 2021-03-03 19:58             ` Dmitry Gutov
  2021-03-03 20:07               ` [External] : " Drew Adams
  2021-03-04  5:47               ` Eli Zaretskii
  2 siblings, 2 replies; 42+ messages in thread
From: Dmitry Gutov @ 2021-03-03 19:58 UTC (permalink / raw)
  To: Eli Zaretskii, Stefan Kangas; +Cc: monnier, emacs-devel

On 03.03.2021 21:21, Eli Zaretskii wrote:
> And ERE isn't?

To be fair, extended regular expressions are the regular expression
flavor most commonly used in the contemporary world, in recent/popular
programming languages, etc.

So for a lot of people this won't be +1 thing to learn.




* RE: [External] : Re: Ugly regexps
  2021-03-03 19:58             ` Dmitry Gutov
@ 2021-03-03 20:07               ` Drew Adams
  2021-03-03 20:31                 ` Stefan Kangas
  2021-03-03 20:32                 ` Stefan Monnier
  2021-03-04  5:47               ` Eli Zaretskii
  1 sibling, 2 replies; 42+ messages in thread
From: Drew Adams @ 2021-03-03 20:07 UTC (permalink / raw)
  To: Dmitry Gutov, Eli Zaretskii, Stefan Kangas
  Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org

> > And ERE isn't?
> 
> To be fair, extended regular expressions are the regular expressions
> flavor most commonly used in the contemporary world, recent/popular
> programming languages, etc.
> 
> So for a lot of people this won't be +1 thing to learn.

See my previous message.  It _will_ be a +1 to learn,
in the context of Emacs, if people have to also learn
the Elisp syntax anyway, for interactive use.

Any way you look at it, I think, if the Elisp regexp
syntax is what is used interactively (modulo extra
backslashing), then adding another syntax means that
using that other syntax is a +1 - extra learning.

Not that extra learning is necessarily bad... ;-)



* Re: Ugly regexps
  2021-03-03 19:50             ` Stefan Kangas
@ 2021-03-03 20:16               ` Stefan Kangas
  0 siblings, 0 replies; 42+ messages in thread
From: Stefan Kangas @ 2021-03-03 20:16 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, emacs-devel

Stefan Kangas <stefankangas@gmail.com> writes:

> Eli Zaretskii <eliz@gnu.org> writes:
>
>>> It is also just another thing to learn.
>>
>> And ERE isn't?
>
> Exactly.  I mean, it

[ My reply was accidentally sent before it was done, sorry: ]

Exactly.  It is what is used in most other programming languages.

Whereas `rx' is specific to ELisp.




* RE: [External] : Re: Ugly regexps
  2021-03-03 20:07               ` [External] : " Drew Adams
@ 2021-03-03 20:31                 ` Stefan Kangas
  2021-03-03 22:17                   ` Drew Adams
  2021-03-03 20:32                 ` Stefan Monnier
  1 sibling, 1 reply; 42+ messages in thread
From: Stefan Kangas @ 2021-03-03 20:31 UTC (permalink / raw)
  To: Drew Adams, Dmitry Gutov, Eli Zaretskii
  Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org

Drew Adams <drew.adams@oracle.com> writes:

>> To be fair, extended regular expressions are the regular expressions
>> flavor most commonly used in the contemporary world, recent/popular
>> programming languages, etc.
>>
>> So for a lot of people this won't be +1 thing to learn.
>
> See my previous message.  It _will_ be a +1 to learn,
> in the context of Emacs, if people have to also learn
> the Elisp syntax anyway, for interactive use.

We could add an option to prefer ERE in interactive use.




* Re: [External] : Re: Ugly regexps
  2021-03-03 20:07               ` [External] : " Drew Adams
  2021-03-03 20:31                 ` Stefan Kangas
@ 2021-03-03 20:32                 ` Stefan Monnier
  1 sibling, 0 replies; 42+ messages in thread
From: Stefan Monnier @ 2021-03-03 20:32 UTC (permalink / raw)
  To: Drew Adams
  Cc: Eli Zaretskii, emacs-devel@gnu.org, Stefan Kangas, Dmitry Gutov

WRT interactive regexp syntax, I'm still hoping someone will write
a proper package that lets the user choose which regexp syntax to use.
Currently `re-builder` has such a thing, but it should really apply
"across the board", i.e. in `read-regexp`, in Isearch, and anywhere else
we read regexps from the keyboard.

This is largely orthogonal to what we use in ELisp code.


        Stefan





* Re: Ugly regexps
  2021-03-03  1:32 ` Stefan Kangas
  2021-03-03  2:08   ` Stefan Kangas
@ 2021-03-03 20:46   ` Alan Mackenzie
  2021-03-04 18:35     ` Stefan Kangas
  1 sibling, 1 reply; 42+ messages in thread
From: Alan Mackenzie @ 2021-03-03 20:46 UTC (permalink / raw)
  To: Stefan Kangas; +Cc: Stefan Monnier, emacs-devel

Hello, Stefan.

On Tue, Mar 02, 2021 at 19:32:23 -0600, Stefan Kangas wrote:
> Stefan Monnier <monnier@iro.umontreal.ca> writes:

> > BTW, while this theme of ugly regexps keeps coming up, how 'bout we add
> > a new function `ere` which converts from the ERE style of regexps
> > where grouping parens are not escaped (and plain chars meant to match
> > an actual paren need to be escaped instead) to ELisp-style regexps?

> > So you can do

> >     (string-match (ere "\\(def(macro|un|subst) .{1,}"))

> > instead of

> >     (string-match "(def\\(macro\\|un\\|subst\\) .\\{1,\\}")

> > ?

> Sounds good to me.

> I was going to ask why not just do PCRE, but then I realized I'm not
> exactly sure what the syntactical differences are.  (We obviously lack
> some features.)  AFAIR, Emacs regexps don't exactly match GNU grep,
> egrep, Perl, or anything else really.

These things don't exactly match each other, do they?

> So I cranked out my dusty old copy of Mastering Regular Expressions and
> found this overview:

>     grep           egrep          Emacs          Perl
>     \? \+ \|       ? + |          ? + \|         ? + |
>     \( \)          ( )            \( \)          ( )
>                    \< \>          \< \> \b \B    \b \B

>     (Excerpt from Mastering Regular Expressions: Table 3-3: A (Very)
>     Superficial Look at the Flavor of a Few Common Tools)

> This shows the differences that most commonly bite you, in my
> experience.

The "biting" effect is surely small.  I have little difficulty using
grep, egrep and awk, all of whose regexp notations differ somewhat.

> While we're at it, has it ever been discussed to add support for the
> pcre library side-by-side with our homegrown regexp.c?  It would give us
> sane (standard) syntax and some useful features "for free"
> (e.g. lookaround).  I didn't test but a priori I would also assume the
> code to be much more performant than anything we could ever cook up
> ourselves.  It is used by several high-profile projects.

> I would imagine we'd introduce entirely new function names for it.
> Perhaps even a completely new and improved API like Lars suggested a
> while back.

No, No, No, No!

All these tools have one overarching thing in common, and that is they
each have a single variety of regexp.  That is, with the exception of
Emacs, which also has a radically different source form, namely rx.
Somebody pointed out the relatively small use of rx, and the same might
happen for a new regexp notation.  Or it might not, and we'd have two
different notations side by side.  This is surely something to avoid.

There's not a lot wrong with Emacs's regexp notation.  It works, works
well, and we're all familiar with it.  And there are many thousands of
lines of lisp containing regexps, all of which are in the same variety.
With the exception of those written with rx.

To introduce a second (string) variety alongside Emacs regexps would
cause confusion, and suck up effort better used for productive work.
Just how is one meant to search for a regexp using grep, when one
doesn't even know whether it follows Emacs conventions or some foreign
set of conventions?

-- 
Alan Mackenzie (Nuremberg, Germany).




* RE: [External] : Re: Ugly regexps
  2021-03-03 20:31                 ` Stefan Kangas
@ 2021-03-03 22:17                   ` Drew Adams
  2021-03-03 22:32                     ` Stefan Monnier
  0 siblings, 1 reply; 42+ messages in thread
From: Drew Adams @ 2021-03-03 22:17 UTC (permalink / raw)
  To: Stefan Kangas, Dmitry Gutov, Eli Zaretskii
  Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org

> >> for a lot of people this won't be +1 thing to learn.
> >
> > See my previous message.  It _will_ be a +1 to learn,
> > in the context of Emacs, if people have to also learn
> > the Elisp syntax anyway, for interactive use.

sk> We could add an option to prefer ERE in interactive use.

Sure, if it were supported generally.

sm> WRT interactive regexp syntax, I'm still hoping someone will write
sm> a proper package that lets the user choose which regexp syntax to use.
sm> Currently `re-builder` has such a thing, but it should really apply
sm> "across the board", i.e. in `read-regexp`, in Isearch, and anywhere
sm> else we read regexps from the keyboard.

Sure, worth hoping.

And then have an option to express one's preference.

Or several options?  Depending on what's implemented,
maybe someone will prefer one thing for, say, Isearch,
query-replace*, and completion, and another thing for
some other interactive uses?

[But since Emacs (not so wisely, IMO) forbids commands
from binding options, code couldn't just bind such a
variable when calling `read-regexp'.  `read-regexp'
could accept another arg for this, of course, but then
that too could, in a sense, override a user preference.]
 
> This is largely orthogonal to what we use in ELisp code.

In one sense, sure.  And especially if we're now talking
only about different regexp dialects, and not also about
alternative ways, such as RX, to enter/create a regexp.

But as I mentioned, I don't think it's orthogonal, in
practice, to what people actually use when coding Elisp.

I think they often code based on what they're used to
using, which, at least for now, is mostly the interactive
syntax (modulo backslashing, etc. for Elisp).  Use of
something like RX seems to be less common, so far.

And then there's the question, interactively, of choosing
one or another.

Often you use a very simple regexp for searching or
completion matching - even just a substring (no special
chars).  Less often you need a more complex regexp.

Will someone want to use something like RX for simple
patterns too?  Would going through some kind of
interactive RX dialog be cumbersome for something simple?
We'd want to make sure that any dialog to be defined
keeps the simple simple.  `(rx "abc")' is simple enough -
it should be possible to type just `abc', like we do now.  

(At least among regexp dialects, as opposed to something
like RX, use of simple-vs-complex patterns shouldn't make
any difference, from one dialect to another.)

BTW, I see this in (emacs) `Rx Notation':

  The ‘rx’ notation is mainly useful in Lisp code; it
  cannot be used in most interactive situations where
  a regexp is requested, such as when running
  ‘query-replace-regexp’ or in variable customization.

I guess that's just saying that an RX-based dialog isn't
available yet, not that it's inconceivable or couldn't
be useful.


* Re: [External] : Re: Ugly regexps
  2021-03-03 22:17                   ` Drew Adams
@ 2021-03-03 22:32                     ` Stefan Monnier
  0 siblings, 0 replies; 42+ messages in thread
From: Stefan Monnier @ 2021-03-03 22:32 UTC (permalink / raw)
  To: Drew Adams
  Cc: Eli Zaretskii, emacs-devel@gnu.org, Stefan Kangas, Dmitry Gutov

> I guess that's just saying that an RX-based dialog isn't
> available yet, not that it's inconceivable or couldn't
> be useful.

Indeed if we introduce some way to choose which dialect to use for
interactive regexps, I'd fully expect the RX syntax to be one of
the options.

I could also imagine one of the options to be DWIMish (i.e. use RX if
the regexp starts and ends with a paren).


        Stefan
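
A minimal sketch of the DWIMish idea above, assuming "starts and ends with a paren" is the whole test. The name `my-dwim-regexp` is hypothetical and not part of Emacs; `rx-to-string` and `read-from-string` are the existing functions. Input that does not read as a single valid rx form falls back to being used as a plain Emacs regexp.

```elisp
(require 'rx)

(defun my-dwim-regexp (input)
  "Return an Emacs regexp for the user-typed string INPUT.
If INPUT starts with an open paren, ends with a close paren, and reads
as exactly one valid `rx' form, convert it with `rx-to-string'.
Otherwise return INPUT unchanged.
Hypothetical helper for illustration; not part of Emacs."
  (if (and (string-prefix-p "(" input)
           (string-suffix-p ")" input))
      (condition-case nil
          (let ((parsed (read-from-string input)))
            ;; Only accept INPUT if the reader consumed all of it.
            (if (= (cdr parsed) (length input))
                (rx-to-string (car parsed) t)
              input))
        ;; Unreadable text or an unknown rx form: keep the raw string.
        (error input))
    input))
```

With this sketch, typing `abc` still yields the regexp `abc`, while typing `(or "foo" "bar")` yields a regexp matching either word. A string-syntax regexp that happens to look like a list, such as `(foo)`, is left alone only because `foo` is not a known rx form, which shows how fragile such a heuristic could be.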





* Re: Ugly regexps
  2021-03-03 19:58             ` Dmitry Gutov
  2021-03-03 20:07               ` [External] : " Drew Adams
@ 2021-03-04  5:47               ` Eli Zaretskii
  2021-03-04 10:49                 ` Lars Ingebrigtsen
  2021-03-04 14:25                 ` Dmitry Gutov
  1 sibling, 2 replies; 42+ messages in thread
From: Eli Zaretskii @ 2021-03-04  5:47 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel, stefankangas, monnier

> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Wed, 3 Mar 2021 21:58:17 +0200
> Content-Language: en-US
> Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
> 
> On 03.03.2021 21:21, Eli Zaretskii wrote:
> > And ERE isn't?
> 
> To be fair, extended regular expressions are the regular expressions 
> flavor most commonly used in the contemporary world, recent/popular 
> programming languages, etc.
> 
> So for a lot of people this won't be +1 thing to learn.

So we are now going to cater to users of other programs more than we
cater to Emacs users who are used to the Emacs RE syntaxes?  How does
that make sense?  Do other programs prefer the Emacs RE syntax to
their own?

We have had rx for many years, and just recently enhanced it
significantly.  I fail to see how it would make sense to introduce yet
another RE syntax into Emacs, with all the overhead that brings with
it.  Maybe it could make sense as an ELPA add-on, but not in core.

More generally, I wish we stopped investing so much of our time and
energy in cleanups and other support tasks, and more to add
significant new applications and editing features.  That would make
more users happier, I think.




* Re: Ugly regexps
  2021-03-04  5:47               ` Eli Zaretskii
@ 2021-03-04 10:49                 ` Lars Ingebrigtsen
  2021-03-04 11:25                   ` Mattias Engdegård
                                     ` (2 more replies)
  2021-03-04 14:25                 ` Dmitry Gutov
  1 sibling, 3 replies; 42+ messages in thread
From: Lars Ingebrigtsen @ 2021-03-04 10:49 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel, stefankangas, monnier, Dmitry Gutov

Eli Zaretskii <eliz@gnu.org> writes:

> More generally, I wish we stopped investing so much of our time and
> energy in cleanups and other support tasks, and more to add
> significant new applications and editing features.  That would make
> more users happier, I think.

I think users will be very happy to be able to use the regexp syntax
they know (instead of the special Emacs regexp variants) in their .emacs
files.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




* Re: Ugly regexps
  2021-03-04 10:49                 ` Lars Ingebrigtsen
@ 2021-03-04 11:25                   ` Mattias Engdegård
  2021-03-04 11:28                   ` Alan Mackenzie
  2021-03-04 14:11                   ` Eli Zaretskii
  2 siblings, 0 replies; 42+ messages in thread
From: Mattias Engdegård @ 2021-03-04 11:25 UTC (permalink / raw)
  To: Lars Ingebrigtsen
  Cc: Eli Zaretskii, Dmitry Gutov, stefankangas, monnier, emacs-devel

4 mars 2021 kl. 11.49 skrev Lars Ingebrigtsen <larsi@gnus.org>:

> I think users will be very happy to be able to use the regexp syntax
> they know (instead of the special Emacs regexp variants) in their .emacs
> files.

Unfortunately it isn't the regexp syntax they know; it's a new variant, with subtle differences from what they may think it is. False friends include [] \d \s \w . ^ $ and so on: constructs that look like something they know well from other software but that have a slightly (or completely) different meaning in Emacs.

This doesn't mean that `ere` wouldn't be a useful addition, but that it should not be presented as "Regexps just like in Python (etc)! No Emacs quirks to worry about!" but exactly for what it is: a way to toggle the requirement for backslash-escaping (){}|, no more and no less. Someone who wants to write or understand an `ere` regexp has to read the Emacs regexp docs, then the `ere` documentation, and then mentally combine the two.
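
A few of the false friends mentioned above can be tried directly. This is only an illustration of stock Emacs regexp behaviour as I understand it, using `string-match-p`; it does not involve the proposed `ere` function, which would leave all of these unchanged.

```elisp
;; "\d" is not a digit class in Emacs: the backslash merely quotes
;; an ordinary "d".
(string-match-p "\\d" "5")             ; no match
(string-match-p "\\d" "d")             ; matches the letter d
(string-match-p "[[:digit:]]" "a5")    ; the Emacs spelling of "a digit"

;; Inside [...] a backslash is an ordinary character, not an escape.
(string-match-p "[\\d]" "\\")          ; matches the backslash itself
(string-match-p "[\\d]" "5")           ; no match

;; "$" matches before any newline, not only at the end of the string;
;; the end-of-string anchor is \' instead.
(string-match-p "foo$" "foo\nbar")     ; matches
(string-match-p "foo\\'" "foo\nbar")   ; no match
```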





* Re: Ugly regexps
  2021-03-04 10:49                 ` Lars Ingebrigtsen
  2021-03-04 11:25                   ` Mattias Engdegård
@ 2021-03-04 11:28                   ` Alan Mackenzie
  2021-03-04 14:11                   ` Eli Zaretskii
  2 siblings, 0 replies; 42+ messages in thread
From: Alan Mackenzie @ 2021-03-04 11:28 UTC (permalink / raw)
  To: Lars Ingebrigtsen
  Cc: Eli Zaretskii, Dmitry Gutov, stefankangas, monnier, emacs-devel

Hello, Lars.

On Thu, Mar 04, 2021 at 11:49:48 +0100, Lars Ingebrigtsen wrote:
> Eli Zaretskii <eliz@gnu.org> writes:

> > More generally, I wish we stopped investing so much of our time and
> > energy in cleanups and other support tasks, and more to add
> > significant new applications and editing features.  That would make
> > more users happier, I think.

> I think users will be very happy to be able to use the regexp syntax
> they know (instead of the special Emacs regexp variants) in their .emacs
> files.

Emacs users know the Emacs regexp syntax.  They may also be aware of
other variants.  There's nothing "special" about Emacs regexps - their
makeup is simply one variant amongst several.

I very much doubt users will be "very happy" about having to choose
between two regexp syntaxes.  I expect they are "happy" that each
program they use has just one regexp syntax, if they even think about
that at all.

Introducing an alternative regexp syntax would cause bloat (which Emacs
isn't short of), and impose extra work on Emacs hackers everywhere, who
at the very least would need to put something like

   (let (alternative-regexp-syntax) ....)

around all their entry points.  I don't want that extra hassle, that
extra bug source.

In short, this proposal is a proposal to increase complexity, increase
the workload on all hackers, and a source of future bugs.  What we've
got already works well enough.  Why change it?

> -- 
> (domestic pets only, the antidote for overdose, milk.)
>    bloggy blog: http://lars.ingebrigtsen.no

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: Ugly regexps
  2021-03-04 10:49                 ` Lars Ingebrigtsen
  2021-03-04 11:25                   ` Mattias Engdegård
  2021-03-04 11:28                   ` Alan Mackenzie
@ 2021-03-04 14:11                   ` Eli Zaretskii
  2 siblings, 0 replies; 42+ messages in thread
From: Eli Zaretskii @ 2021-03-04 14:11 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: dgutov, stefankangas, monnier, emacs-devel

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Date: Thu, 04 Mar 2021 11:49:48 +0100
> Cc: emacs-devel@gnu.org, stefankangas@gmail.com, monnier@iro.umontreal.ca,
>  Dmitry Gutov <dgutov@yandex.ru>
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > More generally, I wish we stopped investing so much of our time and
> > energy in cleanups and other support tasks, and more to add
> > significant new applications and editing features.  That would make
> > more users happier, I think.
> 
> I think users will be very happy to be able to use the regexp syntax
> they know (instead of the special Emacs regexp variants) in their .emacs
> files.

The suggestion was to introduce this for general use, not just for
user-private files.




* Re: Ugly regexps
  2021-03-04  5:47               ` Eli Zaretskii
  2021-03-04 10:49                 ` Lars Ingebrigtsen
@ 2021-03-04 14:25                 ` Dmitry Gutov
  2021-03-04 14:50                   ` tomas
  2021-03-04 15:11                   ` Eli Zaretskii
  1 sibling, 2 replies; 42+ messages in thread
From: Dmitry Gutov @ 2021-03-04 14:25 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel, stefankangas, monnier

On 04.03.2021 07:47, Eli Zaretskii wrote:
>> From: Dmitry Gutov <dgutov@yandex.ru>
>> Date: Wed, 3 Mar 2021 21:58:17 +0200
>> Content-Language: en-US
>> Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
>>
>> On 03.03.2021 21:21, Eli Zaretskii wrote:
>>> And ERE isn't?
>>
>> To be fair, extended regular expressions are the regular expressions
>> flavor most commonly used in the contemporary world, recent/popular
>> programming languages, etc.
>>
>> So for a lot of people this won't be +1 thing to learn.
> 
> So we are now going to cater to users of other programs more than we
> cater to Emacs users who are used to the Emacs RE syntaxes?  How does
> that make sense?  Do other programs prefer the Emacs RE syntax to
> their own?

Not "more". Just make an extra (fairly small) effort to accommodate them.

> We have had rx for many years, and just recently enhanced it
> significantly.  I fail to see how it would make sense to introduce yet
> another RE syntax into Emacs, with all the overhead that brings with
> it.  Maybe it could make sense as an ELPA add-on, but not in core.

The 'ere' function would be helpful to have in the core either way. As 
already mentioned in this thread, I have been using an equivalent of it 
for 5 years now.

> More generally, I wish we stopped investing so much of our time and
> energy in cleanups and other support tasks, and more to add
> significant new applications and editing features.  That would make
> more users happier, I think.

Perhaps if contributors didn't have to fight you about every little 
thing they need to change or fix (or if you responded to arguments, at 
least), we'll get more features over time. This is especially 
discouraging when the disagreement is over a minor change in a package I 
supposedly maintain (bug#44611 is the most glaring example).

I can't threaten to slam the door and leave every time this happens, but 
this kind of malarkey sucks out a significant amount of time and enthusiasm.




* Re: Ugly regexps
  2021-03-04 14:25                 ` Dmitry Gutov
@ 2021-03-04 14:50                   ` tomas
  2021-03-04 15:04                     ` Dmitry Gutov
  2021-03-04 15:05                     ` Dmitry Gutov
  2021-03-04 15:11                   ` Eli Zaretskii
  1 sibling, 2 replies; 42+ messages in thread
From: tomas @ 2021-03-04 14:50 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Eli Zaretskii, stefankangas, monnier, emacs-devel

On Thu, Mar 04, 2021 at 04:25:58PM +0200, Dmitry Gutov wrote:

[...]

> Perhaps if contributors didn't have to fight you about every little
> thing they need to change or fix (or if you responded to arguments,
> at least) [...]

Please, be gentle. I think this is an unfair depiction of what Eli is
doing here. Granted, he's not always easy to convince -- but I think
that's part of his job as a maintainer. And he has nearly infinite
patience in following up on discussions.

I'm sure you can bring on your (I think founded) criticism in a more
constructive way.

Cheers
 - t


* Re: Ugly regexps
  2021-03-04 14:50                   ` tomas
@ 2021-03-04 15:04                     ` Dmitry Gutov
  2021-03-05  5:45                       ` Richard Stallman
  2021-03-04 15:05                     ` Dmitry Gutov
  1 sibling, 1 reply; 42+ messages in thread
From: Dmitry Gutov @ 2021-03-04 15:04 UTC (permalink / raw)
  To: tomas; +Cc: Eli Zaretskii, stefankangas, monnier, emacs-devel

On 04.03.2021 16:50, tomas@tuxteam.de wrote:
> Please, be gentle. I think this is an unfair depiction of what Eli is
> doing here.

Yes, it was harsh, and it doesn't happen every single time, but it 
happens enough that I can feel overwhelmed and have to choose something 
else to do with my time. And it's not like it's easy to forget all the 
previous times it happened (unresolved bugs have a way of keeping one's 
attention returning to them).

> Granted, he's not always easy to convince -- but I think
> that's part of his job as a maintainer. 

If only it didn't step, repeatedly, on my job as a maintainer. And 
feeling powerless (while still bearing responsibility) is not a great 
position to be in.

 > And he has nearly infinite
 > patience in following up on discussions.

Just read the comments in that bug report.




* Re: Ugly regexps
  2021-03-04 14:50                   ` tomas
  2021-03-04 15:04                     ` Dmitry Gutov
@ 2021-03-04 15:05                     ` Dmitry Gutov
  1 sibling, 0 replies; 42+ messages in thread
From: Dmitry Gutov @ 2021-03-04 15:05 UTC (permalink / raw)
  To: tomas; +Cc: Eli Zaretskii, stefankangas, monnier, emacs-devel

On 04.03.2021 16:50, tomas@tuxteam.de wrote:
> I'm sure you can bring on your (I think founded) criticism in a more
> constructive way.

I think I have tried many different ways of doing that by now. Including 
emailing Eli directly, of course.




* Re: Ugly regexps
  2021-03-04 14:25                 ` Dmitry Gutov
  2021-03-04 14:50                   ` tomas
@ 2021-03-04 15:11                   ` Eli Zaretskii
  1 sibling, 0 replies; 42+ messages in thread
From: Eli Zaretskii @ 2021-03-04 15:11 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel, stefankangas, monnier

> Cc: stefankangas@gmail.com, monnier@iro.umontreal.ca, emacs-devel@gnu.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Thu, 4 Mar 2021 16:25:58 +0200
> 
> > More generally, I wish we stopped investing so much of our time and
> > energy in cleanups and other support tasks, and more to add
> > significant new applications and editing features.  That would make
> > more users happier, I think.
> 
> Perhaps if contributors didn't have to fight you about every little 
> thing they need to change or fix (or if you responded to arguments, at 
> least), we'll get more features over time. This is especially 
> discouraging when the disagreement is over a minor change in a package I 
> supposedly maintain (bug#44611 is the most glaring example).
> 
> I can't threaten to slam the door and leave every time this happens, but 
> this kind of malarkey sucks out a significant amount of time and enthusiasm.

Ah, okay.  So I'm the culprit.  Thanks, noted.




* Re: Ugly regexps
  2021-03-03 20:46   ` Alan Mackenzie
@ 2021-03-04 18:35     ` Stefan Kangas
  0 siblings, 0 replies; 42+ messages in thread
From: Stefan Kangas @ 2021-03-04 18:35 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Stefan Monnier, emacs-devel

Alan Mackenzie <acm@muc.de> writes:

>> I was going to ask why not just do PCRE, but then I realized I'm not
>> exactly sure what the syntactical differences are.  (We obviously lack
>> some features.)  AFAIR, Emacs regexps don't exactly match GNU grep,
>> egrep, Perl, or anything else really.
>
> These things don't exactly match each other, do they?

There is also a POSIX standard for BRE and ERE that we don't follow.

My point is that we could match one of the above, even if they don't
match each other.

> The "biting" effect is surely small.  I have little difficulty using
> grep, egrep and awk, all of whose regexp notations differ somewhat.

I am happy to hear that this works well for you.  Two decades after
writing my first regexp, I still tend to forget sometimes (oh wait is it
\+ in sed again?).  Then I have to look these stupid details up for the
Nth time.

> There's not a lot wrong with Emacs's regexp notation.  It works, works
> well, and we're all familiar with it.

Of course it gets the job done in the sense that you can write a regexp
that will match what you want.

But it is overly verbose in common cases, making regexps harder than
they need to be to read, understand and modify.
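
For comparison, here is the regexp from the opening message of this thread in today's string syntax, next to one possible `rx` spelling. The names `my-def-re-string` and `my-def-re-rx` are made up for this sketch, and the `rx` form is my own rendering rather than anything proposed in the thread.

```elisp
(require 'rx)

;; String syntax: grouping, alternation and the interval all need
;; backslashes, each doubled again inside a Lisp string.
(defconst my-def-re-string
  "(def\\(macro\\|un\\|subst\\) .\\{1,\\}"
  "The string-syntax regexp quoted at the start of the thread.")

;; An rx form intended to match the same text: a literal "(def",
;; a capturing group, a space, then one or more non-newline chars.
(defconst my-def-re-rx
  (rx "(def" (group (or "macro" "un" "subst")) " " (one-or-more nonl))
  "An `rx' spelling of `my-def-re-string'.")
```

Both should match a line such as `(defun foo`, with group 1 holding `un`. Which of the two is easier to read and modify is exactly what is being debated here.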




* Re: Ugly regexps
  2021-03-04 15:04                     ` Dmitry Gutov
@ 2021-03-05  5:45                       ` Richard Stallman
  2021-03-05 11:47                         ` Dmitry Gutov
  0 siblings, 1 reply; 42+ messages in thread
From: Richard Stallman @ 2021-03-05  5:45 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: eliz, tomas, emacs-devel, stefankangas, monnier

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

Eli has to be firm in order to avoid being bullied by insistent
contributors.  He is not supposed to give the maintainer of an
individual Lisp package carte blanche.

The maintainers of Emacs are in charge of all of Emacs.

-- 
Dr Richard Stallman
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)






* Re: Ugly regexps
  2021-03-05  5:45                       ` Richard Stallman
@ 2021-03-05 11:47                         ` Dmitry Gutov
  2021-03-06  5:11                           ` Richard Stallman
  0 siblings, 1 reply; 42+ messages in thread
From: Dmitry Gutov @ 2021-03-05 11:47 UTC (permalink / raw)
  To: rms; +Cc: eliz, tomas, emacs-devel, stefankangas, monnier

On 05.03.2021 07:45, Richard Stallman wrote:
> [[[ To any NSA and FBI agents reading my email: please consider    ]]]
> [[[ whether defending the US Constitution against all enemies,     ]]]
> [[[ foreign or domestic, requires you to follow Snowden's example. ]]]
> 
> Eli has to be firm in order to avoid being bullied by insistent
> contributors.  He is not supposed to give the maintainer of an
> individual Lisp package carte blanche.
> 
> The maintainers of Emacs are in charge of all of Emacs.

That's how it should work, yes.

But then one should exercise their better judgment to avoid bullying the 
maintainers of individual Lisp packages over lesser disagreements.

As well as recognize where their own proficiency ends and where it is 
more appropriate to delegate to others' technical opinion.




* Re: Ugly regexps
  2021-03-05 11:47                         ` Dmitry Gutov
@ 2021-03-06  5:11                           ` Richard Stallman
  0 siblings, 0 replies; 42+ messages in thread
From: Richard Stallman @ 2021-03-06  5:11 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: eliz, tomas, stefankangas, monnier, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > But then one should exercise their better judgment to avoid bullying the 
  > maintainers of individual Lisp packages over lesser disagreements.

Perhaps they are exercising their better judgment already.

-- 
Dr Richard Stallman
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)





Thread overview: 42+ messages
2021-03-03  0:32 Ugly regexps Stefan Monnier
2021-03-03  1:32 ` Stefan Kangas
2021-03-03  2:08   ` Stefan Kangas
2021-03-03  6:19     ` Eli Zaretskii
2021-03-03 20:46   ` Alan Mackenzie
2021-03-04 18:35     ` Stefan Kangas
2021-03-03  6:00 ` Eli Zaretskii
2021-03-03 15:46   ` Stefan Monnier
2021-03-03 16:30     ` Eli Zaretskii
2021-03-03 17:44       ` Stefan Monnier
2021-03-03 18:46         ` Stefan Kangas
2021-03-03 19:21           ` Eli Zaretskii
2021-03-03 19:50             ` Stefan Kangas
2021-03-03 20:16               ` Stefan Kangas
2021-03-03 19:50             ` Stefan Kangas
2021-03-03 19:58             ` Dmitry Gutov
2021-03-03 20:07               ` [External] : " Drew Adams
2021-03-03 20:31                 ` Stefan Kangas
2021-03-03 22:17                   ` Drew Adams
2021-03-03 22:32                     ` Stefan Monnier
2021-03-03 20:32                 ` Stefan Monnier
2021-03-04  5:47               ` Eli Zaretskii
2021-03-04 10:49                 ` Lars Ingebrigtsen
2021-03-04 11:25                   ` Mattias Engdegård
2021-03-04 11:28                   ` Alan Mackenzie
2021-03-04 14:11                   ` Eli Zaretskii
2021-03-04 14:25                 ` Dmitry Gutov
2021-03-04 14:50                   ` tomas
2021-03-04 15:04                     ` Dmitry Gutov
2021-03-05  5:45                       ` Richard Stallman
2021-03-05 11:47                         ` Dmitry Gutov
2021-03-06  5:11                           ` Richard Stallman
2021-03-04 15:05                     ` Dmitry Gutov
2021-03-04 15:11                   ` Eli Zaretskii
2021-03-03 19:32           ` [External] : " Drew Adams
2021-03-03  7:09 ` Helmut Eller
2021-03-03 14:11   ` Stefan Kangas
2021-03-03 16:40     ` Stefan Monnier
2021-03-03 15:49   ` Stefan Monnier
2021-03-03 12:17 ` Dmitry Gutov
2021-03-03 15:48   ` Stefan Monnier
2021-03-03 13:57 ` Lars Ingebrigtsen
