unofficial mirror of emacs-devel@gnu.org 
* icalendar.el bug fix patch
@ 2019-10-23 13:33 Rajeev Narang via Emacs development discussions.
  2019-11-01  9:40 ` Eli Zaretskii
  0 siblings, 1 reply; 24+ messages in thread
From: Rajeev Narang via Emacs development discussions. @ 2019-10-23 13:33 UTC (permalink / raw)
  To: emacs-devel

icalendar-export-region does not export multi-line Desc as it is imported by icalendar-import-file.  The following patch fixes the issue.  If acceptable, please commit. Thanks.

diff --git a/lisp/calendar/icalendar.el b/lisp/calendar/icalendar.el
index 1186ced3fb..1f4e582aa5 100644
--- a/lisp/calendar/icalendar.el
+++ b/lisp/calendar/icalendar.el
@@ -1244,7 +1244,7 @@ icalendar--parse-summary-and-rest
                      (concat "\\(" icalendar-import-format-uid "\\)??"))))
 	;; Need the \' regexp in order to detect multi-line items
         (setq s (concat "\\`"
-                        (replace-regexp-in-string "%s" "\\(.*?\\)" s nil t)
+                        (replace-regexp-in-string "%s" "\\([^z-a]*?\\)" s nil t)
                         "\\'"))
         (if (string-match s summary-and-rest)
             (let (cla des loc org sta url uid) ;; sum
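
To make the effect of the change concrete, here is a minimal sketch of the same substitution outside icalendar.el. The helper my-format-to-regexp, the variable my-entry, and the "Desc: %s" format are assumptions made up for illustration; they are not taken from icalendar.el.

```elisp
(defun my-format-to-regexp (format-string group)
  "Hypothetical helper: replace each %s in FORMAT-STRING with GROUP.
The result is anchored to the whole string with \\\\=` and \\\\='."
  (concat "\\`"
          (replace-regexp-in-string "%s" group format-string nil t)
          "\\'"))

(defvar my-entry "Desc: first line\nsecond line"
  "Hypothetical diary text with a two-line description.")

;; Old substitution: `.' never matches a newline, so the match fails.
(string-match (my-format-to-regexp "Desc: %s" "\\(.*?\\)") my-entry)
;; => nil

;; Patched substitution: [^z-a] matches any character, newline included.
(when (string-match (my-format-to-regexp "Desc: %s" "\\([^z-a]*?\\)")
                    my-entry)
  (match-string 1 my-entry))
;; => "first line\nsecond line"
```

The non-greedy group still has to reach the end-of-string anchor, so it grows across the newline only when the bracket expression is allowed to match one.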




* Re: icalendar.el bug fix patch
  2019-10-23 13:33 icalendar.el bug fix patch Rajeev Narang via Emacs development discussions.
@ 2019-11-01  9:40 ` Eli Zaretskii
  2019-11-01 10:51   ` Mattias Engdegård
  2019-11-01 11:12   ` Rajeev Narang via Emacs development discussions.
  0 siblings, 2 replies; 24+ messages in thread
From: Eli Zaretskii @ 2019-11-01  9:40 UTC (permalink / raw)
  To: Rajeev Narang; +Cc: emacs-devel

> Date: Wed, 23 Oct 2019 09:33:52 -0400
> From: Rajeev Narang via "Emacs development discussions." <emacs-devel@gnu.org>
> 
> icalendar-export-region does not export multi-line Desc as it is imported by icalendar-import-file.  The following patch fixes the issue.  If acceptable, please commit. Thanks.
> 
> diff --git a/lisp/calendar/icalendar.el b/lisp/calendar/icalendar.el
> index 1186ced3fb..1f4e582aa5 100644
> --- a/lisp/calendar/icalendar.el
> +++ b/lisp/calendar/icalendar.el
> @@ -1244,7 +1244,7 @@ icalendar--parse-summary-and-rest
>                       (concat "\\(" icalendar-import-format-uid "\\)??"))))
>  	;; Need the \' regexp in order to detect multi-line items
>          (setq s (concat "\\`"
> -                        (replace-regexp-in-string "%s" "\\(.*?\\)" s nil t)
> +                        (replace-regexp-in-string "%s" "\\([^z-a]*?\\)" s nil t)
>                          "\\'"))

Thanks, but is [^a-z] correct?  Do you want to reject only lower-case
ASCII characters?  What about non-ASCII?




* Re: icalendar.el bug fix patch
  2019-11-01  9:40 ` Eli Zaretskii
@ 2019-11-01 10:51   ` Mattias Engdegård
  2019-11-01 13:00     ` Eli Zaretskii
  2019-11-01 11:12   ` Rajeev Narang via Emacs development discussions.
  1 sibling, 1 reply; 24+ messages in thread
From: Mattias Engdegård @ 2019-11-01 10:51 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel, Rajeev Narang

1 nov. 2019 kl. 10.40 skrev Eli Zaretskii <eliz@gnu.org>:

>> -                        (replace-regexp-in-string "%s" "\\(.*?\\)" s nil t)
>> +                        (replace-regexp-in-string "%s" "\\([^z-a]*?\\)" s nil t)
>> 
> Thanks, but is [^a-z] correct?  Do you want to reject only lower-case
> ASCII characters?  What about non-ASCII?

It's [^z-a], also known as 'anychar'.





* Re: icalendar.el bug fix patch
  2019-11-01  9:40 ` Eli Zaretskii
  2019-11-01 10:51   ` Mattias Engdegård
@ 2019-11-01 11:12   ` Rajeev Narang via Emacs development discussions.
  2019-11-01 13:05     ` Eli Zaretskii
  1 sibling, 1 reply; 24+ messages in thread
From: Rajeev Narang via Emacs development discussions. @ 2019-11-01 11:12 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

[^z-a] is correct. I understand it to mean any char. It is another way of writing [.\n] and is used in rx.el, so I presumed it is preferred.




* Re: icalendar.el bug fix patch
  2019-11-01 10:51   ` Mattias Engdegård
@ 2019-11-01 13:00     ` Eli Zaretskii
  0 siblings, 0 replies; 24+ messages in thread
From: Eli Zaretskii @ 2019-11-01 13:00 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: emacs-devel, rajeev

> From: Mattias Engdegård <mattiase@acm.org>
> Date: Fri, 1 Nov 2019 11:51:49 +0100
> Cc: Rajeev Narang <rajeev@sivalik.com>, emacs-devel@gnu.org
> 
> 1 nov. 2019 kl. 10.40 skrev Eli Zaretskii <eliz@gnu.org>:
> 
> >> -                        (replace-regexp-in-string "%s" "\\(.*?\\)" s nil t)
> >> +                        (replace-regexp-in-string "%s" "\\([^z-a]*?\\)" s nil t)
> >> 
> > Thanks, but is [^a-z] correct?  Do you want to reject only lower-case
> > ASCII characters?  What about non-ASCII?
> 
> It's [^z-a], also known as 'anychar'.

OK, but I still wonder why it's TRT.




* Re: icalendar.el bug fix patch
  2019-11-01 11:12   ` Rajeev Narang via Emacs development discussions.
@ 2019-11-01 13:05     ` Eli Zaretskii
  2019-11-01 13:24       ` Mattias Engdegård
  2019-11-01 14:30       ` Richard Stallman
  0 siblings, 2 replies; 24+ messages in thread
From: Eli Zaretskii @ 2019-11-01 13:05 UTC (permalink / raw)
  To: Rajeev Narang; +Cc: emacs-devel

> From: Rajeev Narang <rajeev@sivalik.com>
> Cc: emacs-devel@gnu.org
> Date: Fri, 01 Nov 2019 07:12:08 -0400
> 
> [^z-a] is correct. I understand it to mean any char. It is another way of writing [.\n] and is used in rx.el, so I presumed it is preferred.

If we need to say [.\n], let's say that.  IMO, it expresses its intent
much more than the cryptic [^z-a].

Thanks.




* Re: icalendar.el bug fix patch
  2019-11-01 13:05     ` Eli Zaretskii
@ 2019-11-01 13:24       ` Mattias Engdegård
  2019-11-01 13:33         ` Eli Zaretskii
  2019-11-01 16:44         ` Howard Melman
  2019-11-01 14:30       ` Richard Stallman
  1 sibling, 2 replies; 24+ messages in thread
From: Mattias Engdegård @ 2019-11-01 13:24 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel, Rajeev Narang

1 nov. 2019 kl. 14.05 skrev Eli Zaretskii <eliz@gnu.org>:

> If we need to say [.\n], let's say that.  IMO, it expresses its intent
> much more than the cryptic [^z-a].

'.' is not special inside []; you would have to write "\\(?:.\\|\n\\)" which is slower and messier, although perhaps easier to understand. [^z-a] is mentioned in the manual, by the way.

If readability is important, consider (rx (group (*? anychar))).
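
The difference between the three spellings can be checked directly. This is a minimal sketch: my-whole-match-p is a hypothetical helper written for this illustration, and the anychar symbol in rx is assumed to be available (it is in Emacs 27 and later; older releases spell it anything).

```elisp
(require 'rx)

(defun my-whole-match-p (regexp string)
  "Hypothetical helper: return t if REGEXP matches all of STRING."
  (and (string-match (concat "\\`\\(?:" regexp "\\)\\'") string) t))

;; Inside brackets `.' is literal: this class holds only "." and newline.
(my-whole-match-p "[.\n]" ".")                        ; => t
(my-whole-match-p "[.\n]" "x")                        ; => nil

;; The alternation matches any single character, newline included.
(my-whole-match-p "\\(?:.\\|\n\\)" "x")               ; => t
(my-whole-match-p "\\(?:.\\|\n\\)" "\n")              ; => t

;; The rx form matches across a newline as well.
(my-whole-match-p (rx (group (*? anychar))) "a\nb")   ; => t
```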





* Re: icalendar.el bug fix patch
  2019-11-01 13:24       ` Mattias Engdegård
@ 2019-11-01 13:33         ` Eli Zaretskii
  2019-11-01 21:19           ` Paul Eggert
  2019-11-01 16:44         ` Howard Melman
  1 sibling, 1 reply; 24+ messages in thread
From: Eli Zaretskii @ 2019-11-01 13:33 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: emacs-devel, rajeev

> From: Mattias Engdegård <mattiase@acm.org>
> Date: Fri, 1 Nov 2019 14:24:31 +0100
> Cc: Rajeev Narang <rajeev@sivalik.com>, emacs-devel@gnu.org
> 
> 1 nov. 2019 kl. 14.05 skrev Eli Zaretskii <eliz@gnu.org>:
> 
> > If we need to say [.\n], let's say that.  IMO, it expresses its intent
> > much more than the cryptic [^z-a].
> 
> '.' is not special inside []; you would have to write "\\(?:.\\|\n\\)" which is slower and messier, although perhaps easier to understand.

Yes, it's easier to understand, so I prefer that we use it.

> If readability is important, consider (rx (group (*? anychar))).

What does it translate into?




* Re: icalendar.el bug fix patch
  2019-11-01 13:05     ` Eli Zaretskii
  2019-11-01 13:24       ` Mattias Engdegård
@ 2019-11-01 14:30       ` Richard Stallman
  1 sibling, 0 replies; 24+ messages in thread
From: Richard Stallman @ 2019-11-01 14:30 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel, rajeev

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > If we need to say [.\n], let's say that.

Doesn't [.\n] match either a period or a newline?
Period is a special character in regexps, but it does not have
that special meaning inside a character class.

-- 
Dr Richard Stallman
Founder, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)






* Re: icalendar.el bug fix patch
  2019-11-01 13:24       ` Mattias Engdegård
  2019-11-01 13:33         ` Eli Zaretskii
@ 2019-11-01 16:44         ` Howard Melman
  1 sibling, 0 replies; 24+ messages in thread
From: Howard Melman @ 2019-11-01 16:44 UTC (permalink / raw)
  To: emacs-devel


Mattias Engdegård <mattiase@acm.org> writes:

> '.' is not special inside []; you would have to write "\\(?:.\\|\n\\)"
> which is slower and messier, although perhaps easier to
> understand. [^z-a] is mentioned in the manual, by the way.

I did not know about this either. FWIW the manual says: 

    However, the lower bound should be at most one greater
    than the upper bound; for example, ‘[c-a]’ should be
    avoided. 

So [^b-a] would be preferred over [^z-a].

-- 

Howard





* Re: icalendar.el bug fix patch
  2019-11-01 13:33         ` Eli Zaretskii
@ 2019-11-01 21:19           ` Paul Eggert
  2019-11-01 21:38             ` Mattias Engdegård
                               ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Paul Eggert @ 2019-11-01 21:19 UTC (permalink / raw)
  To: Eli Zaretskii, Mattias Engdegård; +Cc: rajeev, emacs-devel

On 11/1/19 6:33 AM, Eli Zaretskii wrote:
>> you would have to write "\\(?:.\\|\n\\)" which is slower and messier, although perhaps easier to understand.
> Yes, it's easier to understand, so I prefer that we use it.
> 

I find "[^z-a]" to be significantly easier to understand than 
"\\(?:.\\|\n\\)".

But since the concept is useful, how about if we create an escape for 
it? For example, we could establish \! as a regexp that matches any 
single character. This would be more readable than either [^z-a] or 
\(?:.\|\n\), and would surely help performance as well as readability.




* Re: icalendar.el bug fix patch
  2019-11-01 21:19           ` Paul Eggert
@ 2019-11-01 21:38             ` Mattias Engdegård
  2019-11-02 18:39             ` Juri Linkov
  2019-11-03  3:24             ` Richard Stallman
  2 siblings, 0 replies; 24+ messages in thread
From: Mattias Engdegård @ 2019-11-01 21:38 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Eli Zaretskii, rajeev, emacs-devel

1 nov. 2019 kl. 22.19 skrev Paul Eggert <eggert@cs.ucla.edu>:

> But since the concept is useful, how about if we create an escape for it? For example, we could establish \! as a regexp that matches any single character. This would be more readable than either [^z-a] or \(?:.\|\n\), and would surely help performance as well as readability.

Some time ago I experimented with adding a regexp-engine opcode for anychar, but didn't observe any significant difference in performance from that of [^z-a] or \Sq. It is possible that gains could be had if the opcode were to be exploited on a deeper level, such as producing a fast scan loop for [^z-a]*STRING. One has to be careful with backtracking, however.

This is orthogonal to the addition of regexp string syntax for anychar (like \!); neither requires the other.





* Re: icalendar.el bug fix patch
  2019-11-01 21:19           ` Paul Eggert
  2019-11-01 21:38             ` Mattias Engdegård
@ 2019-11-02 18:39             ` Juri Linkov
  2019-11-03 13:21               ` Richard Stallman
  2019-11-03 19:55               ` Stefan Monnier
  2019-11-03  3:24             ` Richard Stallman
  2 siblings, 2 replies; 24+ messages in thread
From: Juri Linkov @ 2019-11-02 18:39 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Mattias Engdegård, Eli Zaretskii, emacs-devel, rajeev

>>> you would have to write "\\(?:.\\|\n\\)" which is slower and messier, although perhaps easier to understand.
>> Yes, it's easier to understand, so I prefer that we use it.
>
> I find "[^z-a]" to be significantly easier to understand than
> "\\(?:.\\|\n\\)".
>
> But since the concept is useful, how about if we create an escape for it?
> For example, we could establish \! as a regexp that matches any single
> character. This would be more readable than either [^z-a] or \(?:.\|\n\),
> and would surely help performance as well as readability.

Like using the dotall modifier in other regexp engines to enable single line
mode where the dot matches all characters, including newlines, e.g. in PCRE

  /regexp/s

what would be an equivalent for specifying regexp modifiers in Emacs Lisp?
Maybe something like

  (let ((regexp-modifiers "s"))
    (string-match "." string))




* Re: icalendar.el bug fix patch
  2019-11-01 21:19           ` Paul Eggert
  2019-11-01 21:38             ` Mattias Engdegård
  2019-11-02 18:39             ` Juri Linkov
@ 2019-11-03  3:24             ` Richard Stallman
  2019-11-03 16:54               ` Drew Adams
  2 siblings, 1 reply; 24+ messages in thread
From: Richard Stallman @ 2019-11-03  3:24 UTC (permalink / raw)
  To: Paul Eggert; +Cc: mattiase, eliz, emacs-devel, rajeev

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > But since the concept is useful, how about if we create an escape for 
  > it? For example, we could establish \! as a regexp that matches any 
  > single character.

+1.

-- 
Dr Richard Stallman
Founder, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)






* Re: icalendar.el bug fix patch
  2019-11-02 18:39             ` Juri Linkov
@ 2019-11-03 13:21               ` Richard Stallman
  2019-11-03 16:55                 ` Drew Adams
  2019-11-03 19:55               ` Stefan Monnier
  1 sibling, 1 reply; 24+ messages in thread
From: Richard Stallman @ 2019-11-03 13:21 UTC (permalink / raw)
  To: Juri Linkov; +Cc: rajeev, mattiase, eliz, eggert, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > what would be an equivalent for specifying regexp modifiers in Emacs Lisp?
  > Maybe something like

  >   (let ((regexp-modifiers "s"))
  >     (string-match "." string))

I don't like the idea of a global variable to alter regexp syntax.
I think that some sort of operator would be better.
It could be an escape sequence for "any character" or it could
be a kind of parenthetical grouping which alters the meaning of a period
inside it.

What would be convenient in the rx syntax?
That question might be helpful to finding the best way to
handle this in string-regexp syntax.

-- 
Dr Richard Stallman
Founder, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)






* RE: icalendar.el bug fix patch
  2019-11-03  3:24             ` Richard Stallman
@ 2019-11-03 16:54               ` Drew Adams
  0 siblings, 0 replies; 24+ messages in thread
From: Drew Adams @ 2019-11-03 16:54 UTC (permalink / raw)
  To: rms, Paul Eggert; +Cc: mattiase, eliz, rajeev, emacs-devel

>> But since the concept is useful, how about if we
>> create an escape for it? For example, we could
>> establish \! as a regexp that matches any single
>> character.
> 
> +1.

[Interesting.  I skipped this thread, based on the
Subject.  Just stumbled on this part of it, about
having a simple pattern to match what `\\(.\\|[\n]\\)'
matches.  Wouldn't have guessed that from the Subject.]

FWIW, I proposed this back in 2006.  Actually, I
proposed a toggle, so you could use the _same_ regexp
to match either any char or any char except newline.

Nothing wrong with having a separate escape, such as
`\!', to always match what `\\(.\\|[\n]\\)' matches.

But it would still be good to (also) have a toggle,
to be able to make just `.' match the same thing.

https://lists.gnu.org/archive/html/emacs-devel/2006-03/msg00162.html

In 2012 I again voiced the proposal (in the context
of mentioning how Icicles lets you do such things):

  I proposed long ago, for instance, a toggle for a
  `.' in regexp search to match also newlines, i.e.,
  any character.  IOW, `.' in one mode state of the
  toggle, would be equivalent to "\\(.\\|[\n]\\)"
  in the other (traditional) state.

https://lists.gnu.org/archive/html/emacs-devel/2012-08/msg00832.html

So how about adding an escape, such as `\!', and
_also_ adding a toggle (variable and command to
toggle it) that treats both `\!' and plain `.' the
same; that is; have them each match any char?




* RE: icalendar.el bug fix patch
  2019-11-03 13:21               ` Richard Stallman
@ 2019-11-03 16:55                 ` Drew Adams
  0 siblings, 0 replies; 24+ messages in thread
From: Drew Adams @ 2019-11-03 16:55 UTC (permalink / raw)
  To: rms, Juri Linkov; +Cc: mattiase, eliz, eggert, emacs-devel, rajeev

> I don't like the idea of a global variable to alter regexp syntax.

Why not?

> I think that some sort of operator would be better.

It need not be either-or.

> It could be an escape sequence for "any character" or it could
> be a kind of parenthetical grouping which alters the meaning of a
> period inside it.

+1 for the former (escape char).

But having also a variable to control what plain `.'
matches has an additional advantage of letting the
same regexp (using `.') have two different behaviors.

That's particularly helpful interactively, if you
can hit a key to toggle the variable value (so toggle
the behavior of `.').




* Re: icalendar.el bug fix patch
  2019-11-02 18:39             ` Juri Linkov
  2019-11-03 13:21               ` Richard Stallman
@ 2019-11-03 19:55               ` Stefan Monnier
  2019-11-03 20:54                 ` Juri Linkov
  1 sibling, 1 reply; 24+ messages in thread
From: Stefan Monnier @ 2019-11-03 19:55 UTC (permalink / raw)
  To: Juri Linkov
  Cc: rajeev, Mattias Engdegård, Eli Zaretskii, Paul Eggert,
	emacs-devel

> what would be an equivalent for specifying regexp modifiers in Emacs Lisp?
> Maybe something like
>
>   (let ((regexp-modifiers "s"))
>     (string-match "." string))

That will affect all the regexp matching that will happen during
execution of this code, so it will require changes in debug.el and
edebug.el, and probably in elp.el and trace.el as well to "reset" the
var before running innocent code.

Also, it can be problematic for cases where we combine/concatenate
several regexp chunks.


        Stefan
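
The first concern can already be observed with case-fold-search, an existing variable that changes how regexps match for as long as it is bound. The sketch below uses it as a stand-in, since regexp-modifiers is only a proposal and does not exist; my-innocent-match-p is a hypothetical function invented for the illustration.

```elisp
(defun my-innocent-match-p (string)
  "Hypothetical library function written with default matching in mind."
  (and (string-match "\\`abc\\'" string) t))

;; The callee's own regexp is unchanged, yet its result depends on a
;; variable bound by whoever happens to be calling it.
(let ((case-fold-search nil))
  (my-innocent-match-p "ABC"))   ; => nil

(let ((case-fold-search t))
  (my-innocent-match-p "ABC"))   ; => t
```

A variable that redefined `.' would leak into called code in the same way, which is presumably why debuggers and profilers would need to rebind it before running unrelated code.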





* Re: icalendar.el bug fix patch
  2019-11-03 19:55               ` Stefan Monnier
@ 2019-11-03 20:54                 ` Juri Linkov
  2019-11-03 21:10                   ` Stefan Monnier
  0 siblings, 1 reply; 24+ messages in thread
From: Juri Linkov @ 2019-11-03 20:54 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: rajeev, Mattias Engdegård, Eli Zaretskii, Paul Eggert,
	emacs-devel

>> what would be an equivalent for specifying regexp modifiers in Emacs Lisp?
>> Maybe something like
>>
>>   (let ((regexp-modifiers "s"))
>>     (string-match "." string))
>
> That will affect all the regexp matching that will happen during
> execution of this code, so it will require changes in debug.el and
> edebug.el, and probably in elp.el and trace.el as well to "reset" the
> var before running innocent code.
>
> Also, it can be problematic for cases where we combine/concatenate
> several regexp chunks.

In /regexp/s syntax the modifiers are inseparable from regexp.
What would be an equivalent in Emacs, maybe text properties?

  (string-match (propertize "." 'regexp-modifiers 'newline) string)




* Re: icalendar.el bug fix patch
  2019-11-03 20:54                 ` Juri Linkov
@ 2019-11-03 21:10                   ` Stefan Monnier
  2019-11-03 21:32                     ` Juri Linkov
       [not found]                     ` <E81C3456-834F-469D-B8CA-80B1CDD311F8@acm.org>
  0 siblings, 2 replies; 24+ messages in thread
From: Stefan Monnier @ 2019-11-03 21:10 UTC (permalink / raw)
  To: Juri Linkov
  Cc: rajeev, Mattias Engdegård, Eli Zaretskii, Paul Eggert,
	emacs-devel

> In /regexp/s syntax the modifiers are inseparable from regexp.
> What would be an equivalent in Emacs, maybe text properties?

I don't like the sound of it.
I think text markers would make more sense, like

   \(*FOO:REGEXP\)

where FOO is the "syntax modifier".  This uses a similar syntax to the
\(?..:REGEXP\) used for shy and named groups.


        Stefan





* Re: icalendar.el bug fix patch
  2019-11-03 21:10                   ` Stefan Monnier
@ 2019-11-03 21:32                     ` Juri Linkov
       [not found]                     ` <E81C3456-834F-469D-B8CA-80B1CDD311F8@acm.org>
  1 sibling, 0 replies; 24+ messages in thread
From: Juri Linkov @ 2019-11-03 21:32 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: rajeev, Mattias Engdegård, Eli Zaretskii, Paul Eggert,
	emacs-devel

> I think text markers would make more sense, like
>
>    \(*FOO:REGEXP\)
>
> where FOO is the "syntax modifier".  This uses a similar syntax to the
> \(?..:REGEXP\) used for shy and named groups.

This makes sense since there are not too many possible modifiers:

https://www.regular-expressions.info/modifiers.html

It seems 'i' should override the value of 'case-fold-search'.

Another useful modifier to add would be 'm' that changes the meaning
of ‘^’ to ‘\`’ (or vice versa).




* Re: icalendar.el bug fix patch
       [not found]                     ` <E81C3456-834F-469D-B8CA-80B1CDD311F8@acm.org>
@ 2019-11-04  0:50                       ` Paul Eggert
  2019-11-04 11:56                         ` Mattias Engdegård
  0 siblings, 1 reply; 24+ messages in thread
From: Paul Eggert @ 2019-11-04  0:50 UTC (permalink / raw)
  To: Mattias Engdegård, Stefan Monnier
  Cc: rajeev, Eli Zaretskii, emacs-devel, Juri Linkov

On 11/3/19 1:34 PM, Mattias Engdegård wrote:

> There are so many regexp engines these days that I can't keep track of them all, but there may be precedents elsewhere.

Some of the precedent discussed in this thread seems to be based on Perl, which 
introduced the "single-line" concept and which has notations like (?s:RE) to 
cause RE to operate in single-line mode which means '.' within RE matches any 
single character. However, Perl later added another escape '\N' which matches 
any single non-newline character even when single-line mode is in effect - which 
sounds a bit like piling one kludge atop another.

This reminds me of Perl's using ^ and $ to mean either the beginning and end of 
text or the beginning and end of line, depending on whether Perl's m flag is in 
effect. Emacs solves this in a different and arguably better way, by using \` 
and \' for beginning and end of text instead. If we want to keep with this 
tradition, Emacs should use a separate escape sequence for "match any single 
character including newline".

On looking into what other systems do, I find that I prefer Vim's syntax of 
'\_.' to match any single character including newline. (Vim has other uses for 
\_ which we could decide separately whether to adopt.) Partly this is because \! 
might better be used to denote regular expression negation.

> Paul mentioned \! which would do, but I'm not sure if it was an off-the-cuff suggestion or a carefully considered proposal.
A bit of both. The choice of '\!' was off-the-cuff and on second thought it'd 
probably be better to use Vim's syntax, since it's a precedent and it leaves us 
better room for future extensions.
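
The existing Emacs distinction between line anchors and text anchors can be seen in a short sketch; the sample string my-text is an assumption made up for the illustration.

```elisp
(defvar my-text "one\ntwo"
  "Hypothetical two-line sample string.")

;; ^ and $ match at line boundaries anywhere in the text.
(string-match "^two" my-text)     ; => 4
(string-match "one$" my-text)     ; => 0

;; \` and \' match only at the very beginning and end of the text.
(string-match "\\`two" my-text)   ; => nil
(string-match "one\\'" my-text)   ; => nil
(string-match "two\\'" my-text)   ; => 4
```

No mode flag is involved: each meaning has its own notation, which is the pattern a separate "any character including newline" escape would follow.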




* Re: icalendar.el bug fix patch
  2019-11-04  0:50                       ` Paul Eggert
@ 2019-11-04 11:56                         ` Mattias Engdegård
  2019-11-04 15:16                           ` Drew Adams
  0 siblings, 1 reply; 24+ messages in thread
From: Mattias Engdegård @ 2019-11-04 11:56 UTC (permalink / raw)
  To: Paul Eggert
  Cc: Juri Linkov, Eli Zaretskii, emacs-devel, Stefan Monnier, rajeev

4 nov. 2019 kl. 01.50 skrev Paul Eggert <eggert@cs.ucla.edu>:

> If we want to keep with this tradition, Emacs should use a separate escape sequence for "match any single character including newline".

I very much agree, for practicality more than tradition. An atomic notation is far superior to a modal mechanism.

> On looking into what other systems do, I find that I prefer Vim's syntax of '\_.' to match any single character including newline.

Thanks for finding that. Another advantage of '\_.' is that it does not change the behaviour of existing regexps; '\!' currently means '!', but '\_.' is a syntax error.

There doesn't seem to be much else in Vim's '\_' family worth appropriating for Emacs. If anything, I'd favour adding some (non-Vim) Unicode patterns: \p and \P for general categories and \X for graphemes.
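
The compatibility point can be checked with a small sketch. It assumes an Emacs in which neither escape has been given a new meaning, which is the situation described above; my-regexp-valid-p is a hypothetical helper written for the illustration.

```elisp
(defun my-regexp-valid-p (regexp)
  "Hypothetical helper: return t if REGEXP compiles, nil if it is invalid."
  (condition-case nil
      (progn (string-match regexp "") t)
    (invalid-regexp nil)))

;; \! is accepted today and simply matches a literal "!".
(my-regexp-valid-p "\\!")     ; => t
(string-match "\\!" "a!b")    ; => 1

;; \_. is rejected, so no working regexp can be relying on it.
(my-regexp-valid-p "\\_.")    ; => nil
```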






* RE: icalendar.el bug fix patch
  2019-11-04 11:56                         ` Mattias Engdegård
@ 2019-11-04 15:16                           ` Drew Adams
  0 siblings, 0 replies; 24+ messages in thread
From: Drew Adams @ 2019-11-04 15:16 UTC (permalink / raw)
  To: Mattias Engdegård, Paul Eggert
  Cc: rajeev, Eli Zaretskii, emacs-devel, Stefan Monnier, Juri Linkov

> > If we want to keep with this tradition, Emacs should use a separate
> escape sequence for "match any single character including newline".
> 
> I very much agree, for practicality more than tradition. An atomic
> notation is far superior to a modal mechanism.

A single escape is a good thing to have.  That's a
good start.

But it's not either-or.  There's also an advantage,
for users (interactively) and for code, to be able
to use the _same regexp_ to optionally interpret
`.' as match-any-char-including-newline.

IMO we should do both: (1) add an escape for this
and (2) provide a variable and toggle command that
makes `.' match either possibility: any char or
any char except newline.

There are lots of places in existing code that use
predefined regexps, including complex ones.  Letting
these optionally interpret `.' to include a newline
adds functionality.  Probably even more important
is letting users use `.' interactively either way.

It's also possible to limit the variable and its
toggling to interactive use (as I explained earlier).
IOW, (optionally) not let it affect regexps in code.


