unofficial mirror of emacs-devel@gnu.org 
* Why does not rgrep use "grep -r"?
@ 2007-11-02 21:42 Lennart Borgman (gmail)
  2007-11-02 22:44 ` Miles Bader
  2007-12-12  1:56 ` Lennart Borgman (gmail)
  0 siblings, 2 replies; 26+ messages in thread
From: Lennart Borgman (gmail) @ 2007-11-02 21:42 UTC (permalink / raw)
  To: Emacs Devel

If grep can do recursive searches then why not use that in rgrep?


* Re: Why does not rgrep use "grep -r"?
  2007-11-02 21:42 Why does not rgrep use "grep -r"? Lennart Borgman (gmail)
@ 2007-11-02 22:44 ` Miles Bader
  2007-11-02 23:29   ` Andreas Schwab
                     ` (2 more replies)
  2007-12-12  1:56 ` Lennart Borgman (gmail)
  1 sibling, 3 replies; 26+ messages in thread
From: Miles Bader @ 2007-11-02 22:44 UTC (permalink / raw)
  To: Lennart Borgman (gmail); +Cc: Emacs Devel

"Lennart Borgman (gmail)" <lennart.borgman@gmail.com> writes:
> If grep can do recursive searches then why not use that in rgrep?

I don't know the reason of the rgrep authors (though I suspect it was
portability concerns), but I have noticed something odd about "grep -r":
-- sometimes it seems _much_ slower than "find ... -type f | xargs grep"
on very large trees (I think I noticed with trees in NFS, where speed is
a perennial issue).  Dunno why this...

-Miles

-- 
Americans are broad-minded people.  They'll accept the fact that a person can
be an alcoholic, a dope fiend, a wife beater, and even a newspaperman, but if a
man doesn't drive, there is something wrong with him.  -- Art Buchwald


* Re: Why does not rgrep use "grep -r"?
  2007-11-02 22:44 ` Miles Bader
@ 2007-11-02 23:29   ` Andreas Schwab
  2007-11-02 23:56     ` Lennart Borgman (gmail)
  2007-11-03 14:01   ` Stefan Monnier
  2007-11-04  0:57   ` Kim F. Storm
  2 siblings, 1 reply; 26+ messages in thread
From: Andreas Schwab @ 2007-11-02 23:29 UTC (permalink / raw)
  To: Miles Bader; +Cc: Lennart Borgman (gmail), Emacs Devel

Miles Bader <miles@gnu.org> writes:

> I don't know the reason of the rgrep authors (though I suspect it was
> portability concerns), but I have noticed something odd about "grep -r":
> -- sometimes it seems _much_ slower than "find ... -type f | xargs grep"
> on very large trees (I think I noticed with trees in NFS, where speed is
> a perennial issue).

Probably because find has been optimized for, umm, finding files.  There
are quite a few things you can do to speed up directory traversal.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


* Re: Why does not rgrep use "grep -r"?
  2007-11-02 23:29   ` Andreas Schwab
@ 2007-11-02 23:56     ` Lennart Borgman (gmail)
  2007-11-03  1:31       ` Miles Bader
  0 siblings, 1 reply; 26+ messages in thread
From: Lennart Borgman (gmail) @ 2007-11-02 23:56 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Emacs Devel, Miles Bader

Andreas Schwab wrote:
> Miles Bader <miles@gnu.org> writes:
> 
>> I don't know the reason of the rgrep authors (though I suspect it was
>> portability concerns), but I have noticed something odd about "grep -r":
>> -- sometimes it seems _much_ slower than "find ... -type f | xargs grep"
>> on very large trees (I think I noticed with trees in NFS, where speed is
>> a perennial issue).
> 
> Probably because find has been optimized for, umm, finding files.  There
> are quite a few things you can do to speed up directory traversal.


I guess it also depends on what kind of OS you are using, is process 
creation cheap or not.


* Re: Why does not rgrep use "grep -r"?
  2007-11-02 23:56     ` Lennart Borgman (gmail)
@ 2007-11-03  1:31       ` Miles Bader
  2007-11-03  1:45         ` Lennart Borgman (gmail)
                           ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Miles Bader @ 2007-11-03  1:31 UTC (permalink / raw)
  To: Lennart Borgman (gmail); +Cc: Andreas Schwab, Emacs Devel

"Lennart Borgman (gmail)" <lennart.borgman@gmail.com> writes:
>> Probably because find has been optimized for, umm, finding files.  There
>> are quite a few things you can do to speed up directory traversal.
>
> I guess it also depends on what kind of OS you are using, is process
> creation cheap or not.

Neither method should use many processes unless the command-line
arg limit is very short (though on windows, maybe that's the case...).

-Miles
-- 
Everywhere is walking distance if you have the time.  -- Steven Wright


* Re: Why does not rgrep use "grep -r"?
  2007-11-03  1:31       ` Miles Bader
@ 2007-11-03  1:45         ` Lennart Borgman (gmail)
  2007-11-03  3:37           ` Miles Bader
  2007-11-03  4:01         ` Ken Raeburn
  2007-11-03  8:37         ` Eli Zaretskii
  2 siblings, 1 reply; 26+ messages in thread
From: Lennart Borgman (gmail) @ 2007-11-03  1:45 UTC (permalink / raw)
  To: Miles Bader; +Cc: Andreas Schwab, Emacs Devel

Miles Bader wrote:
> "Lennart Borgman (gmail)" <lennart.borgman@gmail.com> writes:
>>> Probably because find has been optimized for, umm, finding files.  There
>>> are quite a few things you can do to speed up directory traversal.
>> I guess it also depends on what kind of OS you are using, is process
>> creation cheap or not.
> 
> Neither method should use many processes unless the command-line
> arg limit is very short (though on windows, maybe that's the case...).
> 
> -Miles

I believed that grep had to be started many times. Is not that the case?


* Re: Why does not rgrep use "grep -r"?
  2007-11-03  1:45         ` Lennart Borgman (gmail)
@ 2007-11-03  3:37           ` Miles Bader
  2007-11-03  8:40             ` Eli Zaretskii
  0 siblings, 1 reply; 26+ messages in thread
From: Miles Bader @ 2007-11-03  3:37 UTC (permalink / raw)
  To: Lennart Borgman (gmail); +Cc: Andreas Schwab, Emacs Devel

"Lennart Borgman (gmail)" <lennart.borgman@gmail.com> writes:
>> Neither method should use many processes unless the command-line
>> arg limit is very short (though on windows, maybe that's the case...).
>
> I believed that grep had to be started many times. Is not that the case?

xargs invokes grep in "batches," with as many filenames as will fit on
the command line; for e.g. linux, that's many thousands at once, so
process invocation overhead will tend to be in the noise compared to
file I/O overhead.
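
As an illustration of the batching described above (not part of the
original message), the sketch below counts how many times xargs starts a
command for a given number of input names.  The helper name
count_batches is hypothetical, and `seq' is assumed to be available; the
default batch size depends on the system's argument-length limit.

```shell
#!/bin/sh
# Hypothetical helper: feed N dummy names to xargs and count how many
# times the command is started.  Each started `echo` prints one line,
# so the line count equals the number of processes xargs created.
count_batches() {
    # $1 = number of names, $2 = max names per command ("" = xargs default)
    n=$1
    per=$2
    if [ -n "$per" ]; then
        seq 1 "$n" | xargs -n "$per" echo | wc -l
    else
        seq 1 "$n" | xargs echo | wc -l
    fi
}

# Upper bound on the argument list size in bytes (system dependent).
getconf ARG_MAX

count_batches 1000 1      # one process per name
count_batches 1000 100    # batches of 100 names each
count_batches 1000 ""     # default: as many names per batch as fit
```

With -n 1 every name costs a process; with the default, the number of
processes is bounded by the total length of the names rather than by
their count, which is why the overhead tends to disappear in the noise.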

-Miles
-- 
It wasn't the Exxon Valdez captain's driving that caused the Alaskan oil spill.
It was yours.  [Greenpeace advertisement, New York Times, 25 February 1990]


* Re: Why does not rgrep use "grep -r"?
  2007-11-03  1:31       ` Miles Bader
  2007-11-03  1:45         ` Lennart Borgman (gmail)
@ 2007-11-03  4:01         ` Ken Raeburn
  2007-11-03  8:41           ` Eli Zaretskii
  2007-11-03  8:37         ` Eli Zaretskii
  2 siblings, 1 reply; 26+ messages in thread
From: Ken Raeburn @ 2007-11-03  4:01 UTC (permalink / raw)
  To: Emacs Devel

On Nov 2, 2007, at 21:31, Miles Bader wrote:
> "Lennart Borgman (gmail)" <lennart.borgman@gmail.com> writes:
>>> Probably because find has been optimized for, umm, finding files.  There
>>> are quite a few things you can do to speed up directory traversal.
>>
>> I guess it also depends on what kind of OS you are using, is process
>> creation cheap or not.
>
> Neither method should use many processes unless the command-line
> arg limit is very short (though on windows, maybe that's the case...).

There may also be the issue of parallelism.  Even on a single-cpu  
system, you can queue more i/o requests in advance, or do cpu-bound  
work with data in memory while the other process is blocked on disk  
or network i/o.  (Now, if "grep -r" uses multiple threads, it may not  
be an issue...)

Ken


* Re: Why does not rgrep use "grep -r"?
  2007-11-03  1:31       ` Miles Bader
  2007-11-03  1:45         ` Lennart Borgman (gmail)
  2007-11-03  4:01         ` Ken Raeburn
@ 2007-11-03  8:37         ` Eli Zaretskii
  2 siblings, 0 replies; 26+ messages in thread
From: Eli Zaretskii @ 2007-11-03  8:37 UTC (permalink / raw)
  To: Miles Bader; +Cc: lennart.borgman, emacs-devel

> From: Miles Bader <miles@gnu.org>
> Date: Sat, 03 Nov 2007 10:31:42 +0900
> Cc: Andreas Schwab <schwab@suse.de>, Emacs Devel <emacs-devel@gnu.org>
> 
> Neither method should use many processes unless the command-line
> arg limit is very short (though on windows, maybe that's the case...).

No, modern Windows versions (beyond Windows 9x) support up to 32KB
long command lines.

However, users may have Grep installed, but not Findutils.  Especially
since the Findutils port from the popular GnuWin32 collection is very
buggy (`locate' simply doesn't work, `xargs' dies with mysterious
error message, and `find' is abysmally slow).  I had to roll my own
port to be able to use the package.

So I think on Windows it might make sense to use recursive Grep by
default.


* Re: Why does not rgrep use "grep -r"?
  2007-11-03  3:37           ` Miles Bader
@ 2007-11-03  8:40             ` Eli Zaretskii
  2007-11-03  9:43               ` David Kastrup
  0 siblings, 1 reply; 26+ messages in thread
From: Eli Zaretskii @ 2007-11-03  8:40 UTC (permalink / raw)
  To: Miles Bader; +Cc: schwab, lennart.borgman, emacs-devel

> From: Miles Bader <miles@gnu.org>
> Date: Sat, 03 Nov 2007 12:37:47 +0900
> Cc: Andreas Schwab <schwab@suse.de>, Emacs Devel <emacs-devel@gnu.org>
> 
> > I believed that grep had to be started many times. Is not that the case?
> 
> xargs invokes grep in "batches," with as many filenames as will fit on
> the command line; for e.g. linux, that's many thousands at once, so
> process invocation overhead will tend to be in the noise compared to
> file I/O overhead.

Yes, but I believe "grep -r" will be still faster, even on GNU/Linux,
since all it does to recurse is `readdir' and `fnmatch'; the need for
writing file names to the pipe and reading them on the xargs side is
avoided.


* Re: Why does not rgrep use "grep -r"?
  2007-11-03  4:01         ` Ken Raeburn
@ 2007-11-03  8:41           ` Eli Zaretskii
  0 siblings, 0 replies; 26+ messages in thread
From: Eli Zaretskii @ 2007-11-03  8:41 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: emacs-devel

> From: Ken Raeburn <raeburn@raeburn.org>
> Date: Sat, 3 Nov 2007 00:01:38 -0400
> 
> Now, if "grep -r" uses multiple threads

It doesn't.


* Re: Why does not rgrep use "grep -r"?
  2007-11-03  8:40             ` Eli Zaretskii
@ 2007-11-03  9:43               ` David Kastrup
  2007-11-03 11:01                 ` Eli Zaretskii
  0 siblings, 1 reply; 26+ messages in thread
From: David Kastrup @ 2007-11-03  9:43 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: schwab, emacs-devel, lennart.borgman, Miles Bader

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Miles Bader <miles@gnu.org>
>> Date: Sat, 03 Nov 2007 12:37:47 +0900
>> Cc: Andreas Schwab <schwab@suse.de>, Emacs Devel <emacs-devel@gnu.org>
>> 
>> > I believed that grep had to be started many times. Is not that the case?
>> 
>> xargs invokes grep in "batches," with as many filenames as will fit on
>> the command line; for e.g. linux, that's many thousands at once, so
>> process invocation overhead will tend to be in the noise compared to
>> file I/O overhead.
>
> Yes, but I believe "grep -r" will be still faster, even on GNU/Linux,
> since all it does to recurse is `readdir' and `fnmatch'; the need for
> writing file names to the pipe and reading them on the xargs side is
> avoided.

Totally warm cache:

dak@lola:/usr/local/texlive/2007$ time find -name \*.tex|xargs grep snort
./texmf-dist/source/latex/ae/aesample.tex:and whooping and sneezing and snorting, that I could not hear myself think for

real    0m0.974s
user    0m0.372s
sys     0m0.536s
dak@lola:/usr/local/texlive/2007$ time grep -r  --include=\*.tex snort .
./texmf-dist/source/latex/ae/aesample.tex:and whooping and sneezing and snorting, that I could not hear myself think for

real    0m1.225s
user    0m0.376s
sys     0m0.764s

Totally cold cache (after umount and mount):

dak@lola:/usr/local/texlive/2007$ time grep -r  --include=\*.tex snort .
./texmf-dist/source/latex/ae/aesample.tex:and whooping and sneezing and snorting, that I could not hear myself think for

real    1m44.387s
user    0m0.508s
sys     0m3.768s


dak@lola:/usr/local/texlive/2007$ time find -name \*.tex|xargs grep snort
./texmf-dist/source/latex/ae/aesample.tex:and whooping and sneezing and snorting, that I could not hear myself think for

real    0m59.633s
user    0m0.604s
sys     0m1.484s


And, for good measure:

dak@lola:/usr/local/texlive/2007$ time find -name \*.tex -exec grep snort {} \+
./texmf-dist/source/latex/ae/aesample.tex:and whooping and sneezing and snorting, that I could not hear myself think for

real    0m55.640s
user    0m0.576s
sys     0m1.292s

In short: interspersing the directory and file search on a per-file
basis (as grep -r does) makes the whole operation much more inefficient
on a cold buffer cache.  On a warm cache, it is pretty much the same.
Using a pipe also allows for some parallelism.  In this particular case,
however, both jobs are so much I/O-bound that the last, pipeless version
using -exec ... \+ is still somewhat faster even though it is strictly
single-threaded in its operation.  The decisive factor appears to be the
large-scale bundling of directory searches without intervening file
searches in between.

This is on a
Linux lola 2.6.20-16-generic #2 SMP Sun Sep 23 19:50:39 UTC 2007 i686 GNU/Linux

single processor laptop with a fairly standard ATA disk.
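
To make the "directory search first, file search afterwards" idea
concrete, here is a minimal sketch (an illustration, not something that
was measured in this thread).  It writes the complete file list to a
temporary file before any file contents are read.  The function name
grep_two_phase is hypothetical; -print0 and -0 assume GNU or BSD find
and xargs.  Whether this ordering helps at all depends on the filesystem
and on the state of the cache.

```shell
#!/bin/sh
# Hypothetical helper: grep_two_phase DIR GLOB PATTERN
grep_two_phase() {
    dir=$1
    glob=$2
    pattern=$3
    list=$(mktemp) || return 1
    # Phase 1: directory traversal only; no file contents are read here.
    # NUL-terminated names keep spaces and newlines in file names intact.
    find "$dir" -type f -name "$glob" -print0 > "$list"
    # Phase 2: read the files, with xargs batching the names for grep.
    # -H prints the file name even when a batch holds a single file.
    xargs -0 grep -H -e "$pattern" < "$list"
    status=$?
    rm -f "$list"
    return "$status"
}
```

A call such as `grep_two_phase . '*.tex' snort` would correspond to the
find/xargs runs timed above, with the two phases fully separated instead
of overlapping through a pipe.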

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum


* Re: Why does not rgrep use "grep -r"?
  2007-11-03  9:43               ` David Kastrup
@ 2007-11-03 11:01                 ` Eli Zaretskii
  2007-11-03 11:54                   ` Lennart Borgman (gmail)
  2007-11-04 18:59                   ` David Kastrup
  0 siblings, 2 replies; 26+ messages in thread
From: Eli Zaretskii @ 2007-11-03 11:01 UTC (permalink / raw)
  To: David Kastrup; +Cc: emacs-devel, lennart.borgman, miles

> From: David Kastrup <dak@gnu.org>
> Cc: Miles Bader <miles@gnu.org>,  schwab@suse.de,  lennart.borgman@gmail.com,  emacs-devel@gnu.org
> Date: Sat, 03 Nov 2007 10:43:03 +0100
> 
> Totally warm cache:
> 
> dak@lola:/usr/local/texlive/2007$ time find -name \*.tex|xargs grep snort
> ./texmf-dist/source/latex/ae/aesample.tex:and whooping and sneezing and snorting, that I could not hear myself think for
> 
> real    0m0.974s
> user    0m0.372s
> sys     0m0.536s
> dak@lola:/usr/local/texlive/2007$ time grep -r  --include=\*.tex snort .
> ./texmf-dist/source/latex/ae/aesample.tex:and whooping and sneezing and snorting, that I could not hear myself think for
> 
> real    0m1.225s
> user    0m0.376s
> sys     0m0.764s
> [...]
> On a warm cache, it is pretty much the same.

Perhaps for the Linux filesystem, it is.  It looks as if it's quite
different on Windows:

With warm cache:

timep grep -r snort d:/gnu/gdb-CVS/src/gdb > nul

real    00h00m03.171s
user    00h00m00.234s
sys     00h00m02.312s

timep find d:/gnu/gdb-CVS/src/gdb -name "*.c" | xargs grep snort > nul

real    00h00m03.921s
user    00h00m00.015s
sys     00h00m00.015s

That's a 20% difference in elapsed time (the fact that user and sys
are zero is just an artefact of the timep command implementation on
Windows).

With cold cache:

timep grep -r snort d:/gnu/gdb-CVS/src/gdb > nul

real    00h00m15.531s
user    00h00m00.328s
sys     00h00m03.140s

timep find d:/gnu/gdb-CVS/src/gdb -name "*.c" | xargs grep snort > nul

real    00h00m13.687s
user    00h00m00.015s
sys     00h00m00.078s

That's 11%, a much smaller gain, and in the other direction.


* Re: Why does not rgrep use "grep -r"?
  2007-11-03 11:01                 ` Eli Zaretskii
@ 2007-11-03 11:54                   ` Lennart Borgman (gmail)
  2007-11-04 18:59                   ` David Kastrup
  1 sibling, 0 replies; 26+ messages in thread
From: Lennart Borgman (gmail) @ 2007-11-03 11:54 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel, miles

Another difference is the caching of output data. When I use lgrep I see
the output immediately, but with rgrep (which uses find etc.) it looks to
me like it takes a much longer time for the output to become visible. (I am
using w32.)


* Re: Why does not rgrep use "grep -r"?
  2007-11-02 22:44 ` Miles Bader
  2007-11-02 23:29   ` Andreas Schwab
@ 2007-11-03 14:01   ` Stefan Monnier
  2007-11-04  0:57   ` Kim F. Storm
  2 siblings, 0 replies; 26+ messages in thread
From: Stefan Monnier @ 2007-11-03 14:01 UTC (permalink / raw)
  To: Miles Bader; +Cc: Lennart Borgman (gmail), Emacs Devel

>> If grep can do recursive searches then why not use that in rgrep?
> I don't know the reason of the rgrep authors (though I suspect it was
> portability concerns), but I have noticed something odd about "grep -r":
> -- sometimes it seems _much_ slower than "find ... -type f | xargs grep"
> on very large trees (I think I noticed with trees in NFS, where speed is
> a perennial issue).  Dunno why this...

I don't know why that is, but the difference may come from differences in
the way the tree is traversed.  There are many tricks that one can try
to use to speed up tree traversal and unless both commands share their
tree-traversal code, it's likely that find's tree traversal has been
better optimized.

Some of the tricks that can be used:
- A dir without subdirs has a refcount of 2 (one ref from the parent,
  one from its "." entry).  If you know a dir has no subdirs you can
  avoid calling `stat' on each file to determine whether it's a subdir
  or not.
- some filesystems can provide the "dir/notdir" info without going
  through `stat', but that requires the use of a separate syscall.
- assuming the inodes numbers are related to the position of the inode
  on the drive (which is not the case for log-structured filesystems but
  is the case for many others) you can reorder your traversal to try and
  produce a more sequential scan of your disk.
- if the OS&filesystem gives you access to disk-block-numbers instead of only
  inode numbers, you can use that to do the above.
- ...
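
The first trick in the list can be sketched from the shell (an
illustration only, not part of the original message).  The helper names
are hypothetical, and `stat -c %h' assumes GNU coreutils.  Some
filesystems do not maintain the classic directory link count, so a value
other than 2 has to be read as "unknown", never as proof that
subdirectories exist.

```shell
#!/bin/sh
# Hypothetical helper: print the hard-link count of a directory.
# Assumes GNU coreutils stat; other stat implementations differ.
dir_link_count() {
    stat -c %h -- "$1"
}

# Hypothetical helper: succeed when the link count alone says "no subdirs".
# A count of 2 means only the parent's entry and "." refer to the directory.
# Any other value means the caller must check each entry individually.
is_leaf_dir() {
    [ "$(dir_link_count "$1")" -eq 2 ]
}

# Small demonstration on a throwaway tree.
tmp=$(mktemp -d)
mkdir "$tmp/leaf" "$tmp/parent" "$tmp/parent/child"
for d in "$tmp/leaf" "$tmp/parent"; do
    if is_leaf_dir "$d"; then
        echo "$d: no subdirectories (by link count)"
    else
        echo "$d: may contain subdirectories"
    fi
done
rm -rf "$tmp"
```

A traversal that trusts this hint can skip the per-entry `stat' calls in
leaf directories, which is where most files in a large tree usually live.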


        Stefan


* Re: Why does not rgrep use "grep -r"?
  2007-11-02 22:44 ` Miles Bader
  2007-11-02 23:29   ` Andreas Schwab
  2007-11-03 14:01   ` Stefan Monnier
@ 2007-11-04  0:57   ` Kim F. Storm
  2007-11-04  1:03     ` Lennart Borgman (gmail)
  2007-11-04  1:15     ` Miles Bader
  2 siblings, 2 replies; 26+ messages in thread
From: Kim F. Storm @ 2007-11-04  0:57 UTC (permalink / raw)
  To: Lennart Borgman (gmail); +Cc: Emacs Devel

Miles Bader <miles@gnu.org> writes:

> "Lennart Borgman (gmail)" <lennart.borgman@gmail.com> writes:
>> If grep can do recursive searches then why not use that in rgrep?
>
> I don't know the reason of the rgrep authors (though I suspect it was
> portability concerns), 

Yes portability was definitely a concern.

When I first wrote the code years ago, grep -r wasn't widely supported,
so I wrote it to use find, xargs, and grep.

Later when I integrated rgrep into Emacs 22.1, I reworked quite a lot
of the existing grep & find stuff so that the old grep and grep-find, and
the new lgrep and rgrep commands could share a common code base.

Since both grep-find and my rgrep code used find/xargs/grep, I decided
to continue using them, even though grep -r could have been an alternative.

But I was still concerned about portability, and since using find
worked just nicely, I saw no reason to change.  Besides, as you and
others have noted, grep -r has severe performance problems on some
platforms.

It probably wouldn't be difficult to make it use grep -r (as an
_optional alternative_), but the current code works nicely and is well
tested, so why mess with it?
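
What such an optional alternative might look like at the shell level is
sketched below.  This is an illustration under stated assumptions, not
the code rgrep uses: the wrapper name rsearch and its METHOD argument
are hypothetical, --include is a GNU grep option, and -print0 / -0
assume GNU or BSD find and xargs.

```shell
#!/bin/sh
# Hypothetical wrapper: rsearch METHOD DIR GLOB PATTERN
#   METHOD is "find"   (find | xargs grep) or
#             "grep-r" (grep -r --include).
rsearch() {
    method=$1
    dir=$2
    glob=$3
    pattern=$4
    case $method in
        find)
            # NUL-separated names survive spaces in file names; the
            # pattern is passed as one argument, so it may contain spaces.
            find "$dir" -type f -name "$glob" -print0 |
                xargs -0 grep -nH -e "$pattern"
            ;;
        grep-r)
            # --include restricts the search to matching base names.
            grep -r -nH --include="$glob" -e "$pattern" "$dir"
            ;;
        *)
            echo "rsearch: unknown method: $method" >&2
            return 2
            ;;
    esac
}
```

Both methods should report the same matches, for example with
`rsearch find . '*.el' 'free software'` and
`rsearch grep-r . '*.el' 'free software'`, though the order of the
output lines may differ.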

-- 
Kim F. Storm <storm@cua.dk> http://www.cua.dk


* Re: Why does not rgrep use "grep -r"?
  2007-11-04  0:57   ` Kim F. Storm
@ 2007-11-04  1:03     ` Lennart Borgman (gmail)
  2007-11-04  1:15     ` Miles Bader
  1 sibling, 0 replies; 26+ messages in thread
From: Lennart Borgman (gmail) @ 2007-11-04  1:03 UTC (permalink / raw)
  To: Kim F. Storm; +Cc: Emacs Devel

Kim F. Storm wrote:
> Miles Bader <miles@gnu.org> writes:
> 
>> "Lennart Borgman (gmail)" <lennart.borgman@gmail.com> writes:
>>> If grep can do recursive searches then why not use that in rgrep?
>> I don't know the reason of the rgrep authors (though I suspect it was
>> portability concerns), 
> 
> Yes portability was definitely a concern.
> 
> When I first wrote the code years ago, grep -r wasn't widely supported,
> so I wrote it to use find, xargs, and grep.
> 
> Later when I integrated rgrep into Emacs 22.1, I reworked quite a lot
> of the existing grep & find stuff so that the old grep and grep-find, and
> the new lgrep and rgrep commands could share a common code base.
> 
> Since both grep-find and my rgrep code used find/xargs/grep, I decided
> to continue using them, even though grep -r could have been an alternative.
> 
> But I was still concerned about portability, and since using find
> worked just nicely, I saw no reason to change.  Besides, as you and
> others have noted, grep -r has severe performance problems on some
> platforms.
> 
> It probably wouldn't be difficult to make it use grep -r (as an
> _optional alternative_), but the current code works nicely and is well
> tested, so why mess with it?

I thought performance was one reason, but maybe not.

However there are other problems, at least on w32. Try a search with a 
space in it. That works with lgrep, but not with rgrep. -- At least that 
is true when you are using the utilities from GnuWin32.

So I suggest that the default on w32 is changed to use grep -r.


* Re: Why does not rgrep use "grep -r"?
  2007-11-04  0:57   ` Kim F. Storm
  2007-11-04  1:03     ` Lennart Borgman (gmail)
@ 2007-11-04  1:15     ` Miles Bader
  2007-11-04 10:32       ` Jason Rumney
  1 sibling, 1 reply; 26+ messages in thread
From: Miles Bader @ 2007-11-04  1:15 UTC (permalink / raw)
  To: Kim F. Storm; +Cc: Lennart Borgman (gmail), Emacs Devel

storm@cua.dk (Kim F. Storm) writes:
> It probably wouldn't be difficult to make it use grep -r (as an
> _optional alternative_), but the current code works nicely and is well
> tested, so why mess with it?

I agree; the only reason I use grep -r in a shell is because it's much
easier to type, which isn't an issue here.

However on non-unixish platforms, maybe such issues as the availability
of findutils (or as Eli mentioned, widespread buggy versions of
findutils) might make using grep -r more attractive?

-Miles
-- 
`Cars give people wonderful freedom and increase their opportunities.
 But they also destroy the environment, to an extent so drastic that
 they kill all social life' (from _A Pattern Language_)


* Re: Why does not rgrep use "grep -r"?
  2007-11-04  1:15     ` Miles Bader
@ 2007-11-04 10:32       ` Jason Rumney
  2007-11-04 11:32         ` Lennart Borgman (gmail)
  0 siblings, 1 reply; 26+ messages in thread
From: Jason Rumney @ 2007-11-04 10:32 UTC (permalink / raw)
  To: Miles Bader; +Cc: Emacs Devel, Lennart Borgman (gmail), Kim F. Storm

Miles Bader wrote:
> However on non-unixish platforms, maybe such issues as the availability
> of findutils (or as Eli mentioned, widespread buggy versions of
> findutils) might make using grep -r more attractive?
>   

It's not so much the availability of find on Windows, more the fact that
there is a standard system command with the same name that is not
compatible. So users have to be knowledgeable enough about Windows to
order their PATH so that the findutils version is found first, and
periodically fix the reversions that can occur after system updates or
program installers move the standard windows command directories back to
the beginning of PATH.
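
One way to notice that the wrong `find' comes first on PATH is to ask
it for its version, as in the sketch below (an illustration, not part of
the original message).  It assumes a POSIX-style shell, which on Windows
means something like Cygwin or MSYS; the function name find_flavor is
hypothetical.  It relies on GNU find mentioning "GNU findutils" in its
--version output.

```shell
#!/bin/sh
# Hypothetical check: classify the first `find` on PATH.
# GNU find answers --version with a line mentioning "GNU findutils";
# anything else, including a failing or incompatible command, is
# reported as "other".
find_flavor() {
    if find --version 2>/dev/null | grep -q 'GNU findutils'; then
        echo gnu
    else
        echo other
    fi
}

command -v find    # which executable the shell would run
find_flavor
```

If the answer is "other", a find-based search is likely to fail or
misbehave until the PATH order is fixed or an explicit path to the
findutils binary is configured.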


* Re: Why does not rgrep use "grep -r"?
  2007-11-04 10:32       ` Jason Rumney
@ 2007-11-04 11:32         ` Lennart Borgman (gmail)
  2007-11-04 11:48           ` Jason Rumney
  0 siblings, 1 reply; 26+ messages in thread
From: Lennart Borgman (gmail) @ 2007-11-04 11:32 UTC (permalink / raw)
  To: Jason Rumney; +Cc: Kim F. Storm, Emacs Devel, Miles Bader

Jason Rumney wrote:
> Miles Bader wrote:
>> However on non-unixish platforms, maybe such issues as the availability
>> of findutils (or as Eli mentioned, widespread buggy versions of
>> findutils) might make using grep -r more attractive?
>>   
> 
> It's not so much the availability of find on Windows, more the fact that
> there is a standard system command with the same name that is not
> compatible. So users have to be knowledgeable enough about Windows to
> order their PATH so that the findutils version is found first, and
> periodically fix the reversions that can occur after system updates or
> program installers move the standard windows command directories back to
> the beginning of PATH.

Jason, did you see my message about the problem with spaces in the searches?


* Re: Why does not rgrep use "grep -r"?
  2007-11-04 11:32         ` Lennart Borgman (gmail)
@ 2007-11-04 11:48           ` Jason Rumney
  2007-11-04 12:01             ` Lennart Borgman (gmail)
  2007-11-05  5:22             ` Eli Zaretskii
  0 siblings, 2 replies; 26+ messages in thread
From: Jason Rumney @ 2007-11-04 11:48 UTC (permalink / raw)
  To: Lennart Borgman (gmail); +Cc: Miles Bader, Emacs Devel, Kim F. Storm

Lennart Borgman (gmail) wrote:
> Jason, did you see my message about the problem with spaces in the
> searches?

That seems more like a bug with argument quoting than a problem with the
commands used.


* Re: Why does not rgrep use "grep -r"?
  2007-11-04 11:48           ` Jason Rumney
@ 2007-11-04 12:01             ` Lennart Borgman (gmail)
  2007-11-05  5:22             ` Eli Zaretskii
  1 sibling, 0 replies; 26+ messages in thread
From: Lennart Borgman (gmail) @ 2007-11-04 12:01 UTC (permalink / raw)
  To: Jason Rumney; +Cc: Miles Bader, Emacs Devel, Kim F. Storm

Jason Rumney wrote:
> Lennart Borgman (gmail) wrote:
>> Jason, did you see my message about the problem with spaces in the
>> searches?
> 
> That seems more like a bug with argument quoting than a problem with the
> commands used.

Maybe, but the bug is there. The command looks ok to me:

   find . "(" -path "*/CVS" -o -path "*/.svn" -o -path "*/{arch}" -o 
-path "*/.hg" -o -path "*/_darcs" -o -path "*/.git" -o -path "*/.bzr" 
")" -prune -o  -type f "(" -name "*.el" ")" -print0 | xargs -0 -e grep 
-i -nH -e "free software"

This problem together with the problem with "find.exe" path makes me 
think that it is better to use grep -r on w32.


* Re: Why does not rgrep use "grep -r"?
  2007-11-03 11:01                 ` Eli Zaretskii
  2007-11-03 11:54                   ` Lennart Borgman (gmail)
@ 2007-11-04 18:59                   ` David Kastrup
  2007-11-05  5:24                     ` Eli Zaretskii
  1 sibling, 1 reply; 26+ messages in thread
From: David Kastrup @ 2007-11-04 18:59 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel, lennart.borgman, miles

Eli Zaretskii <eliz@gnu.org> writes:

>> From: David Kastrup <dak@gnu.org>
>> Cc: Miles Bader <miles@gnu.org>,  schwab@suse.de,  lennart.borgman@gmail.com,  emacs-devel@gnu.org
>> Date: Sat, 03 Nov 2007 10:43:03 +0100
>> 
>> Totally warm cache:
>> 
>> dak@lola:/usr/local/texlive/2007$ time find -name \*.tex|xargs grep snort
>> ./texmf-dist/source/latex/ae/aesample.tex:and whooping and sneezing and snorting, that I could not hear myself think for
>> 
>> real    0m0.974s
>> user    0m0.372s
>> sys     0m0.536s
>> dak@lola:/usr/local/texlive/2007$ time grep -r  --include=\*.tex snort .
>> ./texmf-dist/source/latex/ae/aesample.tex:and whooping and sneezing and snorting, that I could not hear myself think for
>> 
>> real    0m1.225s
>> user    0m0.376s
>> sys     0m0.764s
>> [...]
>> On a warm cache, it is pretty much the same.
>
> Perhaps for the Linux filesystem, it is.  It looks as if it's quite
> different on Windows:
>
> With warm cache:
>
> timep grep -r snort d:/gnu/gdb-CVS/src/gdb > nul
>
> real    00h00m03.171s
> user    00h00m00.234s
> sys     00h00m02.312s
>
> timep find d:/gnu/gdb-CVS/src/gdb -name "*.c" | xargs grep snort > nul
>
> real    00h00m03.921s
> user    00h00m00.015s
> sys     00h00m00.015s
>
> That's a 20% difference in elapsed time (the fact that user and sys
> are zero is just an artefact of the timep command implementation on
> Windows).

What sense is there in using commands doing something quite different?
The first searches all files, the second just a subset.

> With cold cache:
>
> timep grep -r snort d:/gnu/gdb-CVS/src/gdb > nul
>
> real    00h00m15.531s
> user    00h00m00.328s
> sys     00h00m03.140s
>
> timep find d:/gnu/gdb-CVS/src/gdb -name "*.c" | xargs grep snort > nul
>
> real    00h00m13.687s
> user    00h00m00.015s
> sys     00h00m00.078s
>
> That's 11%, a much smaller gain, and in the other direction.

How is this the other direction?  You mean the other direction from your
first test rather than the test using GNU/Linux?

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum


* Re: Why does not rgrep use "grep -r"?
  2007-11-04 11:48           ` Jason Rumney
  2007-11-04 12:01             ` Lennart Borgman (gmail)
@ 2007-11-05  5:22             ` Eli Zaretskii
  1 sibling, 0 replies; 26+ messages in thread
From: Eli Zaretskii @ 2007-11-05  5:22 UTC (permalink / raw)
  To: Jason Rumney; +Cc: emacs-devel, lennart.borgman, storm, miles

> Date: Sun, 04 Nov 2007 11:48:41 +0000
> From: Jason Rumney <jasonr@gnu.org>
> Cc: Miles Bader <miles@gnu.org>, Emacs Devel <emacs-devel@gnu.org>,
> 	"Kim F. Storm" <storm@cua.dk>
> 
> Lennart Borgman (gmail) wrote:
> > Jason, did you see my message about the problem with spaces in the
> > searches?
> 
> That seems more like a bug with argument quoting than a problem with the
> commands used.

Not a bug, a missing feature: the quoting of file names by quotearg is
not really compatible with Windows.


* Re: Why does not rgrep use "grep -r"?
  2007-11-04 18:59                   ` David Kastrup
@ 2007-11-05  5:24                     ` Eli Zaretskii
  0 siblings, 0 replies; 26+ messages in thread
From: Eli Zaretskii @ 2007-11-05  5:24 UTC (permalink / raw)
  To: David Kastrup; +Cc: emacs-devel, lennart.borgman, miles

> From: David Kastrup <dak@gnu.org>
> Cc: miles@gnu.org,  lennart.borgman@gmail.com,  emacs-devel@gnu.org
> Date: Sun, 04 Nov 2007 19:59:28 +0100
> What sense is there in using commands doing something quite different?
> The first searches all files, the second just a subset.

That's a typo: the --include argument got deleted somehow from the
mail.  Both commands were searching the same files.

> > That's 11%, a much smaller gain, and in the other direction.
> 
> How is this the other direction?  You mean the other direction from your
> first test rather than the test using GNU/Linux?

Yes, with warm cache Grep was faster, with cold cache it's the other
way around.


* Re: Why does not rgrep use "grep -r"?
  2007-11-02 21:42 Why does not rgrep use "grep -r"? Lennart Borgman (gmail)
  2007-11-02 22:44 ` Miles Bader
@ 2007-12-12  1:56 ` Lennart Borgman (gmail)
  1 sibling, 0 replies; 26+ messages in thread
From: Lennart Borgman (gmail) @ 2007-12-12  1:56 UTC (permalink / raw)
  To: Emacs Devel

Lennart Borgman (gmail) wrote:
> If grep can do recursive searches then why not use that in rgrep?


I believe we found that it might be best to use "grep -r" on w32, didn't
we? Has that been implemented in CVS?

