unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
From: Spencer Baugh <sbaugh@janestreet.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: Dmitry Gutov <dmitry@gutov.dev>, 71179@debbugs.gnu.org
Subject: bug#71179: [PATCH] In rgrep, check matching files before excluding files
Date: Sun, 26 May 2024 09:42:10 -0400	[thread overview]
Message-ID: <ierzfsc666l.fsf@janestreet.com> (raw)
In-Reply-To: <86y17yx74y.fsf@gnu.org> (Eli Zaretskii's message of "Sat, 25 May 2024 18:07:09 +0300")

[-- Attachment #1: Type: text/plain, Size: 846 bytes --]

Eli Zaretskii <eliz@gnu.org> writes:

>> Date: Sat, 25 May 2024 17:47:49 +0300
>> Cc: 71179@debbugs.gnu.org
>> From: Dmitry Gutov <dmitry@gutov.dev>
>> 
>> On 25/05/2024 17:13, Eli Zaretskii wrote:
>> > Anyway, I suggested an approach that should leave everyone happy.  Why
>> > are we still arguing?
>> 
>> New option has the usual problems:
>> 
>> * What to call it.
>> * General proliferation of new options makes it harder to find each one.
>> * Several days of arguing whether we can make the new behavior the 
>> default one.
>
> I already said that I'm okay with making the new behavior the default,
> provided that we document the way of getting back old behavior.
>
> The new option can be a defvar, from where I stand, if that makes
> things easier.  Just don't make it an internal variable.

OK, here's the patch, now with an option.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-In-rgrep-check-matching-files-before-excluding-files.patch --]
[-- Type: text/x-patch, Size: 6138 bytes --]

From 231813e5825e1f3a63c24ad6063eb73de7b2b361 Mon Sep 17 00:00:00 2001
From: Spencer Baugh <sbaugh@janestreet.com>
Date: Sun, 26 May 2024 09:26:09 -0400
Subject: [PATCH] In rgrep, check matching files before excluding files

There are a lot of excluding globs, and checking them all is expensive.
The files glob (i.e. the glob for files we actually want) is usually
just one or two entries, so it's quite fast to check.

If find checks the files glob first and then the excluding glob, it has
to do much less checking (since the files glob will substantially narrow
down the set of files on its own), and find performance is much better.

In my benchmarking, this takes (rgrep "foo" "*.el" "~/src/emacs/trunk/")
from ~410ms to ~130ms.

Further optimizations are possible now that the ignores and matched
files are in the same <F> argument which can be rearranged more easily
without compatibility issues; I'll do those optimizations in later
commits.

* lisp/progmodes/grep.el (rgrep-find-ignores-in-<f>): Add.
* lisp/progmodes/grep.el (rgrep-default-command): Check
rgrep-find-ignores-in-<f> and move the excluded files glob to part of
the "files" argument.
---
 lisp/progmodes/grep.el | 97 +++++++++++++++++++++++++-----------------
 1 file changed, 58 insertions(+), 39 deletions(-)

diff --git a/lisp/progmodes/grep.el b/lisp/progmodes/grep.el
index ce54c57aabc..c0b48fb17e0 100644
--- a/lisp/progmodes/grep.el
+++ b/lisp/progmodes/grep.el
@@ -214,6 +214,21 @@ grep-find-template
   :set #'grep-apply-setting
   :version "22.1")
 
+(defvar rgrep-find-ignores-in-<f> t
+  "If nil, when `rgrep' expands `grep-find-template', file ignores go in <X>.
+
+By default, the <X> placeholder contains find options for affecting the
+directory list, and the <F> placeholder contains the find options which
+affect which files are matched, both `grep-find-ignored-files' and the
+FILES argument to `rgrep'.
+
+This separation allows the two sources of file matching in <F> to be
+optimized together into a set of options which are overall faster for
+\"find\" to evaluate.
+
+If nil, <X> contains ignores both for directories and files, and <F>
+contains only the FILES argument.  This is the old behavior.")
+
 (defvar grep-quoting-style nil
   "Whether to use POSIX-like shell argument quoting.")
 
@@ -1372,45 +1387,49 @@ rgrep-find-ignored-directories
 
 (defun rgrep-default-command (regexp files dir)
   "Compute the command for \\[rgrep] to use by default."
-  (require 'find-dired)      ; for `find-name-arg'
-  (grep-expand-template
-   grep-find-template
-   regexp
-   (concat (shell-quote-argument "(" grep-quoting-style)
-           " " find-name-arg " "
-           (mapconcat
-            (lambda (x) (shell-quote-argument x grep-quoting-style))
-            (split-string files)
-            (concat " -o " find-name-arg " "))
-           " "
-           (shell-quote-argument ")" grep-quoting-style))
-   dir
-   (concat
-    (when-let ((ignored-dirs (rgrep-find-ignored-directories dir)))
-      (concat "-type d "
-              (shell-quote-argument "(" grep-quoting-style)
-              ;; we should use shell-quote-argument here
-              " -path "
-              (mapconcat
-               (lambda (d)
-                 (shell-quote-argument (concat "*/" d) grep-quoting-style))
-               ignored-dirs
-               " -o -path ")
-              " "
-              (shell-quote-argument ")" grep-quoting-style)
-              " -prune -o "))
-    (when-let ((ignored-files (grep-find-ignored-files dir)))
-      (concat (shell-quote-argument "!" grep-quoting-style) " -type d "
-              (shell-quote-argument "(" grep-quoting-style)
-              ;; we should use shell-quote-argument here
-              " -name "
-              (mapconcat
-               (lambda (ignore) (shell-quote-argument ignore grep-quoting-style))
-               ignored-files
-               " -o -name ")
-              " "
-              (shell-quote-argument ")" grep-quoting-style)
-              " -prune -o ")))))
+  (require 'find-dired)                 ; for `find-name-arg'
+  (let ((ignored-files-arg
+         (when-let ((ignored-files (grep-find-ignored-files dir)))
+           (concat (shell-quote-argument "(" grep-quoting-style)
+                   ;; we should use shell-quote-argument here
+                   " -name "
+                   (mapconcat
+                    (lambda (ignore) (shell-quote-argument ignore grep-quoting-style))
+                    ignored-files
+                    " -o -name ")
+                   " " (shell-quote-argument ")" grep-quoting-style)))))
+    (grep-expand-template
+     grep-find-template
+     regexp
+     (concat (shell-quote-argument "(" grep-quoting-style)
+             " " find-name-arg " "
+             (mapconcat
+              (lambda (x) (shell-quote-argument x grep-quoting-style))
+              (split-string files)
+              (concat " -o " find-name-arg " "))
+             " "
+             (shell-quote-argument ")" grep-quoting-style)
+             (when (and rgrep-find-ignores-in-<f> ignored-files-arg)
+               (concat " " (shell-quote-argument "!" grep-quoting-style) " " ignored-files-arg)))
+     dir
+     (concat
+      (when-let ((ignored-dirs (rgrep-find-ignored-directories dir)))
+        (concat "-type d "
+                (shell-quote-argument "(" grep-quoting-style)
+                ;; we should use shell-quote-argument here
+                " -path "
+                (mapconcat
+                 (lambda (d)
+                   (shell-quote-argument (concat "*/" d) grep-quoting-style))
+                 ignored-dirs
+                 " -o -path ")
+                " "
+                (shell-quote-argument ")" grep-quoting-style)
+                " -prune -o "))
+      (when (and (not rgrep-find-ignores-in-<f>) ignored-files-arg)
+        (concat (shell-quote-argument "!" grep-quoting-style) " -type d "
+                ignored-files-arg
+                " -prune -o "))))))
 
 (defun grep-find-toggle-abbreviation ()
   "Toggle showing the hidden part of rgrep/lgrep/zrgrep command line."
-- 
2.39.3


  reply	other threads:[~2024-05-26 13:42 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-24 20:14 bug#71179: [PATCH] In rgrep, check matching files before excluding files Spencer Baugh
2024-05-24 20:45 ` Dmitry Gutov
2024-05-24 20:54   ` Spencer Baugh
2024-05-25 12:31     ` Dmitry Gutov
2024-05-26  6:50       ` Juri Linkov
2024-05-26 12:48         ` Spencer Baugh
2024-05-26 12:56         ` Dmitry Gutov
2024-05-25  6:36 ` Eli Zaretskii
2024-05-25 12:26   ` Dmitry Gutov
2024-05-25 12:51     ` Eli Zaretskii
2024-05-25 13:03       ` Dmitry Gutov
2024-05-25 13:45         ` Eli Zaretskii
2024-05-25 13:58           ` Dmitry Gutov
2024-05-25 13:36       ` Spencer Baugh
2024-05-25 13:56         ` Eli Zaretskii
2024-05-25 14:02           ` Spencer Baugh
2024-05-25 14:13             ` Eli Zaretskii
2024-05-25 14:47               ` Dmitry Gutov
2024-05-25 15:07                 ` Eli Zaretskii
2024-05-26 13:42                   ` Spencer Baugh [this message]
2024-06-01 14:15                     ` Eli Zaretskii
2024-06-02 10:46                       ` Stefan Kangas
2024-05-25 14:51               ` Dmitry Gutov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ierzfsc666l.fsf@janestreet.com \
    --to=sbaugh@janestreet.com \
    --cc=71179@debbugs.gnu.org \
    --cc=dmitry@gutov.dev \
    --cc=eliz@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).