unofficial mirror of emacs-devel@gnu.org 
* Regexp for c-or-c++-mode
@ 2020-06-07 16:41 Alan Mackenzie
  2020-06-07 18:07 ` Michał Nazarewicz
  0 siblings, 1 reply; 9+ messages in thread
From: Alan Mackenzie @ 2020-06-07 16:41 UTC
  To: Michał Nazarewicz; +Cc: emacs-devel

Hello, Michał.

In c-or-c++-mode--regexp, there are several occurrences of

    [ \t\r]

.  These expressions notably lack \n.  This seems strange, given how \n
is the normal line terminator in Emacs and \r is a rarely used artefact.

Is there any reason these expressions are like that, and if so please
tell me that reason.  If there is no such reason, I have a patch ready
to put the \n's into the regexp.

Thanks!

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: Regexp for c-or-c++-mode
  2020-06-07 16:41 Regexp for c-or-c++-mode Alan Mackenzie
@ 2020-06-07 18:07 ` Michał Nazarewicz
  2020-06-09 20:12   ` Alan Mackenzie
  0 siblings, 1 reply; 9+ messages in thread
From: Michał Nazarewicz @ 2020-06-07 18:07 UTC
  To: Alan Mackenzie; +Cc: emacs-devel

On Sun, 7 Jun 2020 at 17:41, Alan Mackenzie <acm@muc.de> wrote:
> In c-or-c++-mode--regexp, there are several occurrences of
>
>     [ \t\r]
>
> .  These expressions notably lack \n.  This seems strange, given how \n
> is the normal line terminator in Emacs and \r is a rarely used artefact.

I wanted the regex to match single-line rather than multi-line statements
to avoid false positives. Though, other than #include lines, either will
probably work equally well.
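
To make the #include point concrete, here is a minimal sketch in Python (used only as a stand-in for Emacs regexps). The two patterns and the helper `hits` are hypothetical and are not the actual c-or-c++-mode--regexp; they only contrast a class without \n against one with \n. A preprocessing directive ends at the newline, so the broken sample below is not a valid `#include <string>`, yet the multi-line variant still matches it.

```python
import re

# Hypothetical patterns for illustration only; not c-or-c++-mode--regexp.
# SINGLE_LINE allows only horizontal whitespace between the tokens.
SINGLE_LINE = re.compile(r"^[ \t]*#[ \t]*include[ \t]*<string>", re.MULTILINE)
# MULTI_LINE additionally lets a newline separate the tokens.
MULTI_LINE = re.compile(r"^[ \t\n]*#[ \t\n]*include[ \t\n]*<string>", re.MULTILINE)

# A directive ends at the newline, so this is not a valid #include <string>.
BROKEN = "#include\n<string>\n"
# An ordinary directive with extra horizontal whitespace.
VALID = "  #  include <string>\n"


def hits(pattern, text):
    """Return True if the compiled pattern matches anywhere in text."""
    return pattern.search(text) is not None


if __name__ == "__main__":
    for name, pattern in (("single-line", SINGLE_LINE), ("multi-line", MULTI_LINE)):
        print(name, "valid:", hits(pattern, VALID), "broken:", hits(pattern, BROKEN))
```

Under these assumptions, only the multi-line variant reports a match on the broken sample, which is the kind of false positive the single-line form avoids.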

-- 
Best regards
ミハウ “𝓶𝓲𝓷𝓪86” ナザレヴイツ
«If at first you don’t succeed, give up skydiving»




* Re: Regexp for c-or-c++-mode
  2020-06-07 18:07 ` Michał Nazarewicz
@ 2020-06-09 20:12   ` Alan Mackenzie
  2020-06-10 11:35     ` Michał Nazarewicz
  0 siblings, 1 reply; 9+ messages in thread
From: Alan Mackenzie @ 2020-06-09 20:12 UTC
  To: Michał Nazarewicz; +Cc: emacs-devel

Hello, Michał.

On Sun, Jun 07, 2020 at 19:07:12 +0100, Michał Nazarewicz wrote:
> On Sun, 7 Jun 2020 at 17:41, Alan Mackenzie <acm@muc.de> wrote:
> > In c-or-c++-mode--regexp, there are several occurrences of

> >     [ \t\r]

> > .  These expressions notably lack \n.  This seems strange, given how \n
> > is the normal line terminator in Emacs and \r is a rarely used artefact.

> I wanted the regex to match single-line rather than multi-line statements
> to avoid false positives. Though, other than #include lines, either will
> probably work equally well.

I don't fully understand.  Why have you got the \r there?

> -- 
> Best regards
> ミハウ “𝓶𝓲𝓷𝓪86” ナザレヴイツ
> «If at first you don’t succeed, give up skydiving»

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: Regexp for c-or-c++-mode
  2020-06-09 20:12   ` Alan Mackenzie
@ 2020-06-10 11:35     ` Michał Nazarewicz
  2020-06-10 11:40       ` Robert Pluim
  0 siblings, 1 reply; 9+ messages in thread
From: Michał Nazarewicz @ 2020-06-10 11:35 UTC
  To: Alan Mackenzie; +Cc: emacs-devel

On Tue, 9 Jun 2020 at 21:13, Alan Mackenzie <acm@muc.de> wrote:
> On Sun, Jun 07, 2020 at 19:07:12 +0100, Michał Nazarewicz wrote:
> > On Sun, 7 Jun 2020 at 17:41, Alan Mackenzie <acm@muc.de> wrote:
> > > In c-or-c++-mode--regexp, there are several occurrences of
>
> > >     [ \t\r]
>
> > > .  These expressions notably lack \n.  This seems strange, given how \n
> > > is the normal line terminator in Emacs and \r is a rarely used artefact.
>
> > I wanted the regex to match single-line rather than multi-line statements
> > to avoid false positives. Though, other than #include lines, either will
> > probably work equally well.
>
> I don't fully understand.  Why have you got the \r there?

I suppose you’re right. It should be [ \t\v\f] instead to catch
all non-new-line white-space characters. Or [ \t\v\f\r\n] to catch
all white-space characters.
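
As a rough check of what each candidate covers, the sketch below (Python again, as a stand-in for Emacs regexps) tabulates which of the six ASCII white-space characters each class matches. The dictionary keys and helper names are made up for this illustration; the bracket expressions themselves are the ones quoted in this thread.

```python
import re

# Candidate character classes from the thread, as Python patterns.
CLASSES = {
    "current": r"[ \t\r]",               # what the regexp uses today
    "non_newline": r"[ \t\v\f]",         # white space that does not end a line
    "all_c_whitespace": r"[ \t\v\f\r\n]",  # every ASCII white-space character
}

# The six ASCII white-space characters, with readable labels.
CHARS = {
    "space": " ",
    "tab": "\t",
    "vtab": "\v",
    "formfeed": "\f",
    "cr": "\r",
    "newline": "\n",
}


def matches_class(name, ch):
    """Return True if the single character ch is matched by the named class."""
    return re.fullmatch(CLASSES[name], ch) is not None


def class_table():
    """Map each class name to the sorted labels of the characters it matches."""
    return {
        name: sorted(label for label, ch in CHARS.items() if matches_class(name, ch))
        for name in CLASSES
    }


if __name__ == "__main__":
    for name, labels in class_table().items():
        print(f"{name}: {', '.join(labels)}")
```

The table shows that the current class accepts \r but neither \v nor \f, which is the inconsistency being discussed.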

-- 
Best regards
ミハウ “𝓶𝓲𝓷𝓪86” ナザレヴイツ
«If at first you don’t succeed, give up skydiving»




* Re: Regexp for c-or-c++-mode
  2020-06-10 11:35     ` Michał Nazarewicz
@ 2020-06-10 11:40       ` Robert Pluim
  2020-06-10 13:58         ` Michał Nazarewicz
  0 siblings, 1 reply; 9+ messages in thread
From: Robert Pluim @ 2020-06-10 11:40 UTC
  To: Michał Nazarewicz; +Cc: Alan Mackenzie, emacs-devel

>>>>> On Wed, 10 Jun 2020 12:35:19 +0100, Michał Nazarewicz <mina86@mina86.com> said:

    Michał> On Tue, 9 Jun 2020 at 21:13, Alan Mackenzie <acm@muc.de> wrote:
    >> On Sun, Jun 07, 2020 at 19:07:12 +0100, Michał Nazarewicz wrote:
    >> > On Sun, 7 Jun 2020 at 17:41, Alan Mackenzie <acm@muc.de> wrote:
    >> > > In c-or-c++-mode--regexp, there are several occurrences of
    >> 
    >> > >     [ \t\r]
    >> 
    >> > > .  These expressions notably lack \n.  This seems strange, given how \n
    >> > > is the normal line terminator in Emacs and \r is a rarely used artefact.
    >> 
    >> > I wanted the regex to match single-line rather than multi-line statements
    >> > to avoid false positives. Though, other than #include lines, either will
    >> > probably work equally well.
    >> 
    >> I don't fully understand.  Why have you got the \r there?

    Michał> I suppose you’re right. It should be [ \t\v\f] instead to catch
    Michał> all non-new-line white-space characters. Or [ \t\v\f\r\n] to catch
    Michał> all white-space characters.

[[:blank:]] ?

Robert




* Re: Regexp for c-or-c++-mode
  2020-06-10 11:40       ` Robert Pluim
@ 2020-06-10 13:58         ` Michał Nazarewicz
  2020-06-10 14:15           ` Robert Pluim
  0 siblings, 1 reply; 9+ messages in thread
From: Michał Nazarewicz @ 2020-06-10 13:58 UTC
  To: Robert Pluim; +Cc: Alan Mackenzie, emacs-devel

On Wed, 10 Jun 2020 at 12:40, Robert Pluim <rpluim@gmail.com> wrote:
>     Michał> I suppose you’re right. It should be [ \t\v\f] instead to catch
>     Michał> all non-new-line white-space characters. Or [ \t\v\f\r\n] to catch
>     Michał> all white-space characters.
>
> [[:blank:]] ?

[[:blank:]] is defined in terms of Unicode properties so that would
catch things which C does not consider white-space.
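
A small sketch of the mismatch, in Python rather than Emacs Lisp. It assumes, as a simplification, that a Unicode-based "blank" class amounts to tab plus the characters in general category Zs (space separators); the function names are hypothetical. The C set is the six characters that isspace() accepts in the "C" locale.

```python
import unicodedata

# The characters C's isspace() accepts in the "C" locale.
C_WHITESPACE = set(" \t\n\v\f\r")


def is_blank_like(ch):
    """Assumed model of a Unicode-based blank class: tab or a space separator (Zs)."""
    return ch == "\t" or unicodedata.category(ch) == "Zs"


def blank_but_not_c_whitespace(chars):
    """Return the characters that the model treats as blank but C does not."""
    return [ch for ch in chars if is_blank_like(ch) and ch not in C_WHITESPACE]


if __name__ == "__main__":
    # Space, tab, NO-BREAK SPACE, IDEOGRAPHIC SPACE, and a letter.
    sample = " \t\u00a0\u3000x"
    for ch in blank_but_not_c_whitespace(sample):
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

Under this model, NO-BREAK SPACE and IDEOGRAPHIC SPACE count as blank even though a C compiler does not treat them as white space.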

-- 
Best regards
ミハウ “𝓶𝓲𝓷𝓪86” ナザレヴイツ
«If at first you don’t succeed, give up skydiving»




* Re: Regexp for c-or-c++-mode
  2020-06-10 13:58         ` Michał Nazarewicz
@ 2020-06-10 14:15           ` Robert Pluim
  2020-06-10 23:42             ` Michał Nazarewicz
  0 siblings, 1 reply; 9+ messages in thread
From: Robert Pluim @ 2020-06-10 14:15 UTC
  To: Michał Nazarewicz; +Cc: Alan Mackenzie, emacs-devel

>>>>> On Wed, 10 Jun 2020 14:58:18 +0100, Michał Nazarewicz <mina86@mina86.com> said:

    Michał> On Wed, 10 Jun 2020 at 12:40, Robert Pluim <rpluim@gmail.com> wrote:
    Michał> I suppose you’re right. It should be [ \t\v\f] instead to catch
    Michał> all non-new-line white-space characters. Or [ \t\v\f\r\n] to catch
    Michał> all white-space characters.
    >> 
    >> [[:blank:]] ?

    Michał> [[:blank:]] is defined in terms of Unicode properties so that would
    Michał> catch things which C does not consider white-space.

[[:space:]] then, which uses the buffer's syntax table.

Robert




* Re: Regexp for c-or-c++-mode
  2020-06-10 14:15           ` Robert Pluim
@ 2020-06-10 23:42             ` Michał Nazarewicz
  2020-06-11  9:40               ` Alan Mackenzie
  0 siblings, 1 reply; 9+ messages in thread
From: Michał Nazarewicz @ 2020-06-10 23:42 UTC
  To: Robert Pluim; +Cc: Alan Mackenzie, emacs-devel

On Wed, 10 Jun 2020 at 15:15, Robert Pluim <rpluim@gmail.com> wrote:
>
> >>>>> On Wed, 10 Jun 2020 14:58:18 +0100, Michał Nazarewicz <mina86@mina86.com> said:
>
>     Michał> On Wed, 10 Jun 2020 at 12:40, Robert Pluim <rpluim@gmail.com> wrote:
>     Michał> I suppose you’re right. It should be [ \t\v\f] instead to catch
>     Michał> all non-new-line white-space characters. Or [ \t\v\f\r\n] to catch
>     Michał> all white-space characters.
>     >>
>     >> [[:blank:]] ?
>
>     Michał> [[:blank:]] is defined in terms of Unicode properties so that would
>     Michał> catch things which C does not consider white-space.
>
> [[:space:]] then, which uses the buffer's syntax table.

The regex under discussion needs to adhere to C syntax but may be
used outside of cc-mode, and thus should not rely on the syntax table
being set up for C.

-- 
Best regards
ミハウ “𝓶𝓲𝓷𝓪86” ナザレヴイツ
«If at first you don’t succeed, give up skydiving»




* Re: Regexp for c-or-c++-mode
  2020-06-10 23:42             ` Michał Nazarewicz
@ 2020-06-11  9:40               ` Alan Mackenzie
  0 siblings, 0 replies; 9+ messages in thread
From: Alan Mackenzie @ 2020-06-11  9:40 UTC
  To: Michał Nazarewicz; +Cc: Robert Pluim, emacs-devel

Hello, Michał.

On Thu, Jun 11, 2020 at 00:42:25 +0100, Michał Nazarewicz wrote:
> On Wed, 10 Jun 2020 at 15:15, Robert Pluim <rpluim@gmail.com> wrote:

> > >>>>> On Wed, 10 Jun 2020 14:58:18 +0100, Michał Nazarewicz <mina86@mina86.com> said:

> >     Michał> On Wed, 10 Jun 2020 at 12:40, Robert Pluim <rpluim@gmail.com> wrote:
> >     Michał> I suppose you’re right. It should be [ \t\v\f] instead to catch
> >     Michał> all non-new-line white-space characters. Or [ \t\v\f\r\n] to catch
> >     Michał> all white-space characters.

> >     >> [[:blank:]] ?

> >     Michał> [[:blank:]] is defined in terms of Unicode properties so that would
> >     Michał> catch things which C does not consider white-space.

> > [[:space:]] then, which uses the buffer's syntax table.

> The regex under discussion needs to adhere to C syntax but is (may be)
> used outside of cc-mode and thus should not rely on syntax table being
> set up for C.

How about using simply [ \t]?  The \r doesn't really add any utility,
just confusion, and there seem not to have been any problems with
c-or-c++-mode so far.  Like you said, including the "bigger" whitespace
characters might lead to false positives.

> -- 
> Best regards
> ミハウ “𝓶𝓲𝓷𝓪86” ナザレヴイツ
> «If at first you don’t succeed, give up skydiving»

-- 
Alan Mackenzie (Nuremberg, Germany).


