* problem with recent change to grep-regexp-alist
From: Emanuele Giaquinta @ 2005-08-06 10:34 UTC


Hi,

In the 1.42 revision of grep.el the subpattern for the filename in the
regexps of the first two grep-regexp-alist's elements has been changed
from "\(.+?\)" to "\([^:\n]+\)". Now the matching fails if the
filename contains a colon, while the previous value worked, thanks to the
non greedy "+?" quantifier. Note that the regexp of the third
grep-regexp-alist's element is still correct, and is the one that
matches if grep-highlight-matches's value is "t" (which is the
default).
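
The difference between the two subpatterns can be sketched in Python. This is only an illustration: the patterns below are simplified stand-ins, not the actual grep-regexp-alist entries, and Python's `re` writes groups as `(...)` where Emacs writes `\(...\)`. The sample file name is hypothetical.

```python
import re

# Simplified stand-ins for the file-name/line-number part of a
# grep-regexp-alist entry (assumption: not the real grep.el regexps).
NON_GREEDY = re.compile(r"^(.+?):([0-9]+):")      # old subpattern
NO_COLON = re.compile(r"^([^:\n]+):([0-9]+):")    # subpattern from rev. 1.42

def parse(pattern, line):
    """Return (file, line_number), or None if the pattern does not match."""
    m = pattern.match(line)
    return (m.group(1), int(m.group(2))) if m else None

# Hypothetical grep -n output for a file whose name contains a colon.
sample = "dir/a:b.txt:12:some text"
print(parse(NON_GREEDY, sample))  # → ('dir/a:b.txt', 12)
print(parse(NO_COLON, sample))    # → None
```

The `[^:\n]+` version can never span the colon inside the file name, so the whole line fails to match; the non-greedy version keeps extending the file name until a `:NUMBER:` follows.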

Emanuele Giaquinta

* Re: problem with recent change to grep-regexp-alist
From: Juri Linkov @ 2005-08-06 18:11 UTC
  Cc: emacs-devel

> In the 1.42 revision of grep.el the subpattern for the filename in the
> regexps of the first two grep-regexp-alist's elements has been changed
> from "\(.+?\)" to "\([^:\n]+\)". Now the matching fails if the
> filename contains a colon, while the previous value worked, thanks to the
> non greedy "+?" quantifier. Note that the regexp of the third
> grep-regexp-alist's element is still correct, and is the one that
> matches if grep-highlight-matches's value is "t" (which is the
> default).

Ok, let's use "\(.+?\)".

You can still get wrong matches with the file names like "abc:123",
but perhaps such file names are rare.

-- 
Juri Linkov
http://www.jurta.org/emacs/

* Re: problem with recent change to grep-regexp-alist
From: David Kastrup @ 2005-08-06 19:20 UTC
  Cc: Emanuele Giaquinta, emacs-devel

Juri Linkov <juri@jurta.org> writes:

>> In the 1.42 revision of grep.el the subpattern for the filename in the
>> regexps of the first two grep-regexp-alist's elements has been changed
>> from "\(.+?\)" to "\([^:\n]+\)". Now the matching fails if the
>> filename contains a colon, while the previous value worked, thanks to the
>> non greedy "+?" quantifier. Note that the regexp of the third
>> grep-regexp-alist's element is still correct, and is the one that
>> matches if grep-highlight-matches's value is "t" (which is the
>> default).
>
> Ok, let's use "\(.+?\)".
>
> You can still get wrong matches with the file names like "abc:123",
> but perhaps such file names are rare.

Maybe \([A-Za-z]:\| ...
or something?  It is not like there are many Unix file names starting
with a single letter followed by colon.

But Linux has file names like

/proc/driver/uhci/0000:00:07.2

which might not be terribly relevant, but better safe than sorry...

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum

* Re: problem with recent change to grep-regexp-alist
From: Emanuele Giaquinta @ 2005-08-07 14:25 UTC
  Cc: emacs-devel

> > Ok, let's use "\(.+?\)".
> >
> > You can still get wrong matches with the file names like "abc:123",
> > but perhaps such file names are rare.
> 
> Maybe \([A-Za-z]:\| ...
> or something?  It is not like there are many Unix file names starting
> with a single letter followed by colon.
> 
> But Linux has file names like
> 
> /proc/driver/uhci/0000:00:07.2

For those files the matching will be fine.
The wrong matches occur when the filename contains the pattern
":[0-9]+:" somewhere or ends with the pattern ":[0-9]+" (like Juri's example).
Juri, wouldn't it be good to add a comment about this in grep.el or grep.txt?

* Re: problem with recent change to grep-regexp-alist
From: Juri Linkov @ 2005-08-08 20:38 UTC
  Cc: emacs-devel

>> > Ok, let's use "\(.+?\)".
>> >
>> > You can still get wrong matches with the file names like "abc:123",
>> > but perhaps such file names are rare.
>> 
>> Maybe \([A-Za-z]:\| ...
>> or something?  It is not like there are many Unix file names starting
>> with a single letter followed by colon.
>> 
>> But Linux has file names like
>> 
>> /proc/driver/uhci/0000:00:07.2
>
> For those files the matching will be fine.

It can't be fine, because the grep output is ambiguous.  For example,
for the grep output line:

1:2:3:4:text

there are three different interpretations:

file name `1:2:3', line number `4', source line `text'
file name `1:2', line number `3', source line `4:text'
file name `1', line number `2', source line `3:4:text'
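
A small sketch of why the line is ambiguous: every `:NUMBER:` in the line is a possible boundary between file name and line number. The helper name `interpretations` is made up for this illustration and is not part of grep.el.

```python
import re

def interpretations(line):
    """List every (file, line_number, text) reading of a grep -n output line."""
    results = []
    # A lookahead is used so that overlapping candidates such as ":2:" and
    # ":3:" in "1:2:3:4:text" are all found.
    for m in re.finditer(r"(?=:([0-9]+):)", line):
        file_name = line[:m.start()]
        if not file_name:
            continue  # a file name cannot be empty
        number = m.group(1)
        text = line[m.start() + len(number) + 2:]
        results.append((file_name, int(number), text))
    return results

for reading in interpretations("1:2:3:4:text"):
    print(reading)
# → ('1', 2, '3:4:text')
# → ('1:2', 3, '4:text')
# → ('1:2:3', 4, 'text')
```

Nothing in the line itself says which of the three readings grep meant, so no regexp over the plain output can always pick the right one.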

One way to resolve them is to match markup escape sequences that new
GNU grep puts around file names and line numbers.  You can see them in
grep.txt under the title `GNU grep 2.5.1-cvs with default colors'.

> The wrong matches occur when the filename contains the pattern
> ":[0-9]+:" somewhere or ends with the pattern ":[0-9]+" (like Juri's example).
> Juri, wouldn't it be good to add a comment about this in grep.el or grep.txt?

Since Richard doesn't want to document even the `-n' option in grep.el,
I am not sure I can add a comment in grep.el about such rare cases
with numbers and colons in the file name.

-- 
Juri Linkov
http://www.jurta.org/emacs/

* Re: problem with recent change to grep-regexp-alist
From: Emanuele Giaquinta @ 2005-08-08 22:56 UTC
  Cc: emacs-devel

>>> But Linux has file names like
>>>
>>> /proc/driver/uhci/0000:00:07.2
>>
>> For those files the matching will be fine.
>
> It can't be fine, because the grep output is ambiguous.

Yes, I misread the filename (I saw only one colon); this is the first
case I mentioned, where the filename includes the ":[0-9]+:" pattern.
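
To make the two failure cases concrete, here is a hedged sketch using a simplified non-greedy pattern (an assumption, not the real grep.el regexp) on hypothetical grep -n output lines. Python's `re` is used only for illustration.

```python
import re

# Simplified stand-in for the non-greedy file-name pattern (assumption).
NON_GREEDY = re.compile(r"^(.+?):([0-9]+):")

def split_non_greedy(line):
    """Return (file, line_number_text) as the non-greedy pattern sees it."""
    m = NON_GREEDY.match(line)
    return (m.group(1), m.group(2)) if m else None

# File name containing ":[0-9]+:" (hypothetical match on line 5 of the file).
print(split_non_greedy("/proc/driver/uhci/0000:00:07.2:5:some text"))
# → ('/proc/driver/uhci/0000', '00')

# File name ending in ":[0-9]+" (hypothetical match on line 7 of "abc:123").
print(split_non_greedy("abc:123:7:some text"))
# → ('abc', '123')
```

In both cases the pattern stops at the first `:NUMBER:` it finds, which lies inside the file name rather than after it.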

> For example, for the grep output line:
>
> 1:2:3:4:text
>
> there are three different interpretations:
>
> file name `1:2:3', line number `4', source line `text'
> file name `1:2', line number `3', source line `4:text'
> file name `1', line number `2', source line `3:4:text'

But with the non-greedy '+?' quantifier, the interpretation will
always be the last one (the one with the shortest file name).
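
This can be checked with a quick sketch. The patterns are simplified assumptions, and Python's backtracking matcher is assumed to behave like Emacs's for these simple cases.

```python
import re

# Simplified patterns: file name, line number, rest of the line.
NON_GREEDY = re.compile(r"^(.+?):([0-9]+):(.*)$")
GREEDY = re.compile(r"^(.+):([0-9]+):(.*)$")

line = "1:2:3:4:text"
print(NON_GREEDY.match(line).groups())  # → ('1', '2', '3:4:text')
print(GREEDY.match(line).groups())      # → ('1:2:3', '4', 'text')
```

The non-greedy form always settles on the shortest possible file name, the greedy form on the longest; either choice is deterministic but can be wrong for a given file.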

> One way to resolve them is to match markup escape sequences that new
> GNU grep puts around file names and line numbers.  You can see them in
> grep.txt under the title `GNU grep 2.5.1-cvs with default colors'.

I'm aware of them. Are you waiting for the 2.5.2 release to add the
corresponding element to grep-regexp-alist?

-- 
Emanuele Giaquinta

* Re: problem with recent change to grep-regexp-alist
From: Stefan Monnier @ 2005-08-09 23:57 UTC
  Cc: Emanuele Giaquinta, emacs-devel

>> In the 1.42 revision of grep.el the subpattern for the filename in the
>> regexps of the first two grep-regexp-alist's elements has been changed
>> from "\(.+?\)" to "\([^:\n]+\)". Now the matching fails if the
>> filename contains a colon, while the previous value worked, thanks to the
>> non greedy "+?" quantifier. Note that the regexp of the third
>> grep-regexp-alist's element is still correct, and is the one that
>> matches if grep-highlight-matches's value is "t" (which is the
>> default).

> Ok, let's use "\(.+?\)".

> You can still get wrong matches with the file names like "abc:123",
> but perhaps such file names are rare.

It doesn't matter whether they're rare or not: they result in ambiguous
output from grep, so in such cases Emacs can't be sure to get it right.
I.e. the .+? regexp is as good as it gets: when it's not ambiguous it works
right, and when it's ambiguous it chooses one of the possibilities
"arbitrarily".


        Stefan

* Re: problem with recent change to grep-regexp-alist
From: Juri Linkov @ 2005-08-10  4:02 UTC
  Cc: emacs-devel

>> One way to resolve them is to match markup escape sequences that new
>> GNU grep puts around file names and line numbers.  You can see them in
>> grep.txt under the title `GNU grep 2.5.1-cvs with default colors'.
>
> I'm aware of them. Are you waiting for the 2.5.2 release to add the
> corresponding element to grep-regexp-alist?

I am not sure it's good to recognize rare cases at the cost of the
overhead of processing these additional escape sequences for file
names and line numbers.  But this is possible with

GREP_COLORS='mt=01;31:fn=35:ln=32:bn=:se=:ml=36:cx=37:ne'

and the regexp

    ("^\\(\033\\[35m\\(.+?\\)\033\\[m.*?\033\\[32m\\([0-9]+\\)\033\\[m.*?\033\\[36m\\).*?\
\\(\033\\[01;31m\\(?:\033\\[K\\)?\\)\\(.*?\\)\\(\033\\[[0-9]*m\\)"
     2 3
     ;; Calculate column positions (beg . end) of first grep match on a line
     ((lambda ()
        (setq compilation-error-screen-columns nil)
        (- (match-beginning 4) (match-end 1)))
      .
      (lambda () (- (match-end 5) (match-end 1)
                    (- (match-end 4) (match-beginning 4)))))
     nil 1)

But it still doesn't fontify grep matches correctly because other
font-lock rules interfere.
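
The idea of using the escape sequences as delimiters can be sketched as follows. The color codes `35` (file name) and `32` (line number) come from the GREP_COLORS value above; the exact byte layout of the sample lines is an assumption for illustration, not captured grep output, and the pattern is a shortened Python rendering of only the first part of the regexp above.

```python
import re

ESC = "\x1b"

# File name wrapped in ESC[35m ... ESC[m, line number in ESC[32m ... ESC[m.
COLORED = re.compile(r"^\x1b\[35m(.+?)\x1b\[m.*?\x1b\[32m([0-9]+)\x1b\[m")

def colored_line(file_name, number, text):
    """Build a hypothetical colored grep -n output line (assumed layout)."""
    return f"{ESC}[35m{file_name}{ESC}[m:{ESC}[32m{number}{ESC}[m:{text}"

def parse_colored(line):
    """Return (file, line_number), or None if the markup is not found."""
    m = COLORED.match(line)
    return (m.group(1), int(m.group(2))) if m else None

# Both lines read "1:2:3:4:text" once the escape sequences are stripped.
print(parse_colored(colored_line("1:2:3", 4, "text")))   # → ('1:2:3', 4)
print(parse_colored(colored_line("1", 2, "3:4:text")))   # → ('1', 2)
```

Because the markup brackets the file name and the line number separately, two lines that are identical as plain text are told apart, at the cost of processing the extra escape sequences.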

-- 
Juri Linkov
http://www.jurta.org/emacs/
