unofficial mirror of help-gnu-emacs@gnu.org
* regex: \(.*\)\([a-zA-Z]+\)$ - not working as expected
@ 2015-03-25 12:27 AngusC
  2015-03-25 12:50 ` K.P. Huang
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: AngusC @ 2015-03-25 12:27 UTC (permalink / raw)
  To: Help-gnu-emacs

I have text like this:

some description here Status1
some other description Status2
some other interesting description Status1

I am using regex like this:

\(.*\)\([a-zA-Z]+\)$

with replacement text:

\1 ZZZ \2

And what I expected was:

some description here ZZZ Status
some other description ZZZ Fault
some other interesting description ZZZ Status

But instead I got:

some description here Statu ZZZ s
some other description Faul ZZZ t
some other interesting description Statu ZZZ s

I want my expected outcome, how do I do that?

What was wrong with my regex?





* Re: regex: \(.*\)\([a-zA-Z]+\)$ - not working as expected
       [not found] <mailman.2738.1427286469.31049.help-gnu-emacs@gnu.org>
@ 2015-03-25 12:43 ` Loris Bennett
  2015-03-25 12:45   ` Loris Bennett
  0 siblings, 1 reply; 7+ messages in thread
From: Loris Bennett @ 2015-03-25 12:43 UTC (permalink / raw)
  To: help-gnu-emacs

AngusC <anguscomber@gmail.com> writes:

> I have text like this:
>
> some description here Status1
> some other description Status2
> some other interesting description Status1
>
> I am using regex like this:
>
> \(.*\)\([a-zA-Z]+\)$
>
> with replacement text:
>
> \1 ZZZ \2
>
> And what I expected was:
>
> some description here ZZZ Status
> some other description ZZZ Fault
> some other interesting description ZZZ Status
>
> But instead I got:
>
> some description here Statu ZZZ s
> some other description Faul ZZZ t
> some other interesting description Statu ZZZ s
>
> I want my expected outcome, how do I do that?
>
> What was wrong with my regex?

Your first group is matching right up to the last penultimate character
and the second is matching the last character.  Try adding a pattern to
match a space between your groups.
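
A minimal sketch of that split, assuming Python's re module treats this
pattern the same way Emacs does (Python writes groups as (...) where Emacs
writes \(...\)), and assuming the lines end in a letter rather than a digit.
The helper name split_greedy is made up for illustration:

```python
import re

# Same pattern as in the question, in Python syntax.
GREEDY = re.compile(r"(.*)([a-zA-Z]+)$")

def split_greedy(line):
    """Return what group 1 and group 2 capture for one line."""
    m = GREEDY.search(line)
    return m.group(1), m.group(2)

print(split_greedy("some description here Status"))
# -> ('some description here Statu', 's')
```

Group 1 gives back only as much as group 2 strictly needs, which is a
single letter.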

Cheers,

Loris

-- 
This signature is currently under construction.



* Re: regex: \(.*\)\([a-zA-Z]+\)$ - not working as expected
  2015-03-25 12:43 ` regex: \(.*\)\([a-zA-Z]+\)$ - not working as expected Loris Bennett
@ 2015-03-25 12:45   ` Loris Bennett
  0 siblings, 0 replies; 7+ messages in thread
From: Loris Bennett @ 2015-03-25 12:45 UTC (permalink / raw)
  To: help-gnu-emacs

"Loris Bennett" <loris.bennett@fu-berlin.de> writes:

> Your first group is matching right up to the last penultimate character

Sorry, that should just be "the penultimate character".

> and the second is matching the last character.  Try adding a pattern to
> match a space between your groups.
>
> Cheers,
>
> Loris

-- 
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin         Email loris.bennett@fu-berlin.de



* Re: regex: \(.*\)\([a-zA-Z]+\)$ - not working as expected
  2015-03-25 12:27 AngusC
@ 2015-03-25 12:50 ` K.P. Huang
  2015-03-25 12:54 ` Yuri Khan
  2015-03-25 12:57 ` tomas
  2 siblings, 0 replies; 7+ messages in thread
From: K.P. Huang @ 2015-03-25 12:50 UTC (permalink / raw)
  To: AngusC; +Cc: Help-gnu-emacs

Try this one (please ignore the double quotes):

" \([a-zA-Z]+\)$"

There is a space before the first \. Replace the match with

" ZZZ \1"

There is also a space before ZZZ.

In your regex, the greedy \(.*\) takes as many characters as it can, so
\([a-zA-Z]+\)$ is left with only the last character of the line.

You got the right result for your regex.
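
The search and replacement above can be sketched in Python, under the
assumption that re.sub behaves like Emacs' regexp replace for this pattern.
The function name add_marker is hypothetical:

```python
import re

def add_marker(line):
    # Emacs: search for " \([a-zA-Z]+\)$" and replace with " ZZZ \1".
    # The leading space anchors the match at the start of the last word.
    return re.sub(r" ([a-zA-Z]+)$", r" ZZZ \1", line)

print(add_marker("some description here Status"))
# -> some description here ZZZ Status
```

Only the last word is matched here, so the rest of the line never needs
its own group.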

-- kphuanghk









* Re: regex: \(.*\)\([a-zA-Z]+\)$ - not working as expected
  2015-03-25 12:27 AngusC
  2015-03-25 12:50 ` K.P. Huang
@ 2015-03-25 12:54 ` Yuri Khan
  2015-03-25 14:16   ` AngusC
  2015-03-25 12:57 ` tomas
  2 siblings, 1 reply; 7+ messages in thread
From: Yuri Khan @ 2015-03-25 12:54 UTC (permalink / raw)
  To: AngusC; +Cc: help-gnu-emacs@gnu.org

On Wed, Mar 25, 2015 at 6:27 PM, AngusC <anguscomber@gmail.com> wrote:
> I have text like this:
>
> some description here Status1
>
> I am using regex like this:
>
> \(.*\)\([a-zA-Z]+\)$
>
> with replacement text:
>
> \1 ZZZ \2
>
> And what I expected was:
>
> some description here ZZZ Status
>
> But instead I got:
>
> some description here Statu ZZZ s

Your regex consists of two parts. The first part matches any sequence
of characters. The second part matches any non-empty sequence of basic
Latin letters. There are many ways to match your text with this regex.
Most regular expression engines use the maximum munch rule — i.e. the
first part tries to match as much of the string as possible while
still satisfying the regex.

Others have suggested adding a space to disambiguate the match.

Alternatively, you can make the first group non-greedy so that instead
of matching as much as possible it will match as little as possible:

\(.*?\)\([a-zA-Z]+\)$

(Note the question mark in the first group.)
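
A small sketch of the non-greedy variant, again assuming Python's re module
as a stand-in for Emacs (the lazy *? operator exists in both). The names
LAZY and split_lazy are made up for illustration:

```python
import re

# Non-greedy first group: it grows only as far as it has to.
LAZY = re.compile(r"(.*?)([a-zA-Z]+)$")

def split_lazy(line):
    """Return what group 1 and group 2 capture for one line."""
    m = LAZY.search(line)
    return m.group(1), m.group(2)

print(split_lazy("some description here Status"))
# -> ('some description here ', 'Status')
```

In this sketch group 1 keeps the trailing space, so the replacement
\1 ZZZ \2 leaves two spaces before ZZZ.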




* Re: regex: \(.*\)\([a-zA-Z]+\)$ - not working as expected
  2015-03-25 12:27 AngusC
  2015-03-25 12:50 ` K.P. Huang
  2015-03-25 12:54 ` Yuri Khan
@ 2015-03-25 12:57 ` tomas
  2 siblings, 0 replies; 7+ messages in thread
From: tomas @ 2015-03-25 12:57 UTC (permalink / raw)
  To: AngusC; +Cc: Help-gnu-emacs

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wed, Mar 25, 2015 at 05:27:08AM -0700, AngusC wrote:
> I have text like this:
> 
> some description here Status1
> some other description Status2
> some other interesting description Status1

(I assume the digits at the end of the lines just don't exist:
they disappear later in your post magically anyway ;-)

> I am using regex like this:
> 
> \(.*\)\([a-zA-Z]+\)$

I'll try to read the above aloud:

"First, try to match zero-or-more of anything, call that \1".
"Then try to match at least one letter, right before the
end-of-line. Call that \2".

> with replacement text:
> 
> \1 ZZZ \2
> 
> And what I expected was:
> 
> some description here ZZZ Status
> some other description ZZZ Fault
> some other interesting description ZZZ Status

Hm. Just why? It seems you expected the *second* match to
be "as long as possible", but nothing in the regex says so:

> But instead I got:
> 
> some description here Statu ZZZ s
> some other description Faul ZZZ t
> some other interesting description Statu ZZZ s

Reading aloud the regexp as above makes those matches less
mysterious, doesn't it?
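
The reading above can be spelled out as a loop. This is only a sketch of
the idea in Python, not how Emacs implements its matcher: try the longest
candidate for the first group, then give back one character at a time
until the rest consists of letters only. The function name
first_greedy_split is hypothetical:

```python
import re

LETTERS = re.compile(r"[a-zA-Z]+")

def first_greedy_split(line):
    """Longest prefix for group 1 such that the rest is one or more letters."""
    for cut in range(len(line), -1, -1):
        head, tail = line[:cut], line[cut:]
        # The rest must be letters all the way to the end of the line.
        if LETTERS.fullmatch(tail):
            return head, tail
    return None  # no split works, e.g. when the line ends in a digit

print(first_greedy_split("some other description Fault"))
# -> ('some other description Faul', 't')
print(first_greedy_split("some description here Status1"))
# -> None
```

The first split that works is the one with a single trailing letter, and
a line that really ends in a digit does not match at all.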

Now: what could you do? Perhaps you might say that you want
the first match to be as short as possible, like so:

  \(.*?\)\([a-zA-Z]+\)$

> I want my expected outcome, how do I do that?

Is that your expected outcome?

> What was wrong with my regex?

As often (it happens to me all the time!), it's just sloppy
specification in your head :-)

Regards
- -- tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlUSsNEACgkQBcgs9XrR2kbl7ACfXC3spfTmLa7SuVQxf+GXEQSE
iQYAnRr2FrWdjn7S4lGP1ZJhEcMV4Tod
=JuV3
-----END PGP SIGNATURE-----




* Re: regex: \(.*\)\([a-zA-Z]+\)$ - not working as expected
  2015-03-25 12:54 ` Yuri Khan
@ 2015-03-25 14:16   ` AngusC
  0 siblings, 0 replies; 7+ messages in thread
From: AngusC @ 2015-03-25 14:16 UTC (permalink / raw)
  To: Help-gnu-emacs

Thanks for the explanation.  

A space was the easiest fix.

\(.*\)\( [a-zA-Z]+\)$
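
A sketch of this fix in Python, assuming re.sub behaves like Emacs' regexp
replace here and that the replacement text is still \1 ZZZ \2. The second
function is a hypothetical variant, not something proposed in the thread,
with the space placed between the groups instead of inside group 2:

```python
import re

def space_in_group(line):
    # Emacs: \(.*\)\( [a-zA-Z]+\)$ with replacement \1 ZZZ \2
    return re.sub(r"(.*)( [a-zA-Z]+)$", r"\1 ZZZ \2", line)

def space_between_groups(line):
    # Hypothetical variant: \(.*\) \([a-zA-Z]+\)$ with the same replacement.
    return re.sub(r"(.*) ([a-zA-Z]+)$", r"\1 ZZZ \2", line)

print(repr(space_in_group("some description here Status")))
# -> 'some description here ZZZ  Status'
print(repr(space_between_groups("some description here Status")))
# -> 'some description here ZZZ Status'
```

With the space inside group 2 the output carries two spaces after ZZZ;
moving it between the groups gives the single space from the expected
outcome.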



