From: Spencer Baugh <sbaugh@janestreet.com>
To: 64735@debbugs.gnu.org
Subject: bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
Date: Wed, 19 Jul 2023 17:16:31 -0400
Message-ID: <iermszrwqj4.fsf@janestreet.com>
Several important commands and functions invoke find; for example rgrep
and project-find-regexp.
Most of these add some set of ignores to the find command, pulling from
grep-find-ignored-files in the former case. So the find command looks
like:
  find -H . \( -path \*/SCCS/\* -o -path \*/RCS/\* [...more ignores...] \)
    -prune -o -type f -print0
Alas, on my system, using GNU find, these ignores slow down find by
about 15x on a large directory tree, taking it from around 0.5 seconds
to 7.8 seconds.
This is very noticeable overhead; removing the ignores makes rgrep and
other find-invoking commands substantially faster for me.
The overhead is linear in the number of ignores - that is, each
additional ignore adds a small fixed cost. This suggests that find is
linearly scanning the list of ignores and checking each one, rather than
optimizing them to a single regexp and checking that regexp.
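One way to check the linear scaling is to time find with a varying number of ignores that can never match. The following is only a sketch: find_with_ignores and the no-such-dir-N patterns are made up for illustration, and the timings will depend on the tree and the system.

```shell
# Hypothetical helper (not part of Emacs or findutils): run find over DIR
# with N dummy -path ignores that never match, so that only the cost of
# evaluating the ignores changes between runs.
find_with_ignores() {
    dir=$1
    n=$2
    set --                      # reuse "$@" to accumulate the ignore tests
    i=1
    while [ "$i" -le "$n" ]; do
        if [ "$#" -gt 0 ]; then
            set -- "$@" -o
        fi
        set -- "$@" -path "*/no-such-dir-$i/*"
        i=$((i + 1))
    done
    if [ "$#" -gt 0 ]; then
        find -H "$dir" \( "$@" \) -prune -o -type f -print0
    else
        find -H "$dir" -type f -print0
    fi
}

# Example use under bash, where `time' is a shell keyword:
#   for n in 0 25 50 100; do time find_with_ignores . "$n" >/dev/null; done
```

If the cost really is a fixed amount per ignore, the elapsed time should grow roughly in proportion to N.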
Obviously, GNU find should be optimizing this. However, they have
previously said they will not; I commented on the bug at
https://savannah.gnu.org/bugs/index.php?58197 to request that they
rethink that. Hopefully, as a fellow GNU project, they will be
interested in helping us...
In Emacs alone, there are a few things we could do:
- we could mitigate the find bug by combining the ignores into a single
  optimized regexp before we pass them to find; this should basically
  remove all the overhead, but makes the find command uglier and harder
  to edit
- we could remove rare and likely irrelevant things from
  completion-ignored-extensions and vc-ignore-dir-regexp (which are used
  to build these lists of ignores)
- we could use our own recursive directory-tree walking implementation
  (directory-files-recursively), if we found a nice way to pipe its
  output directly to grep etc. without going through Lisp. (This could
  be nice for project-files, at least)
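To illustrate the first option, here is a rough sketch of collapsing several directory ignores into one -regex test. The helper names ignores_to_regex and find_with_merged_ignores are hypothetical, the sketch assumes GNU find with its default (Emacs-style) regexp syntax, and it assumes the names contain no regexp metacharacters. I have not measured it; whether it recovers the lost time would need to be checked with a benchmark.

```shell
# Hypothetical helper: join directory names into a single pattern for
# find's -regex test, which must match the whole path.
# Assumes the names contain no regexp metacharacters.
ignores_to_regex() {
    alternation=
    for name in "$@"; do
        if [ -n "$alternation" ]; then
            alternation="$alternation\\|$name"
        else
            alternation=$name
        fi
    done
    printf '%s\n' ".*/\\($alternation\\)/.*"
}

# Hypothetical wrapper: intended to behave like
#   find -H DIR \( -path \*/NAME1/\* -o -path \*/NAME2/\* ... \) -prune -o -type f -print0
# but with one -regex test in place of the chain of -path tests.
find_with_merged_ignores() {
    dir=$1
    shift
    find -H "$dir" -regex "$(ignores_to_regex "$@")" -prune -o -type f -print0
}
```

For example, "ignores_to_regex SCCS RCS" prints .*/\(SCCS\|RCS\)/.* and "find_with_merged_ignores . SCCS RCS" runs find with that single test. One known difference: the * in a -path glob matches any character, while . in the default regexp syntax may not match a newline, so paths containing newlines could be treated differently.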
Incidentally, I tried a find alternative, "bfs",
https://github.com/tavianator/bfs and it doesn't optimize this either,
sadly, so it also has the 15x slowdown.
In GNU Emacs 29.0.92 (build 5, x86_64-pc-linux-gnu, X toolkit, cairo
version 1.15.12, Xaw scroll bars) of 2023-07-10 built on
Repository revision: dd15432ffacbeff0291381c0109f5b1245060b1d
Repository branch: emacs-29
Windowing system distributor 'The X.Org Foundation', version 11.0.12011000
System Description: Rocky Linux 8.8 (Green Obsidian)
Configured using:
'configure --config-cache --with-x-toolkit=lucid
--with-gif=ifavailable'
Configured features:
CAIRO DBUS FREETYPE GLIB GMP GNUTLS GSETTINGS HARFBUZZ JPEG JSON
LIBSELINUX LIBXML2 MODULES NOTIFY INOTIFY PDUMPER PNG RSVG SECCOMP SOUND
SQLITE3 THREADS TIFF TOOLKIT_SCROLL_BARS X11 XDBE XIM XINPUT2 XPM LUCID
ZLIB
Important settings:
value of $LANG: en_US.UTF-8
locale-coding-system: utf-8-unix
Major mode: Shell
Memory information:
((conses 16 1939322 193013)
(symbols 48 76940 49)
(strings 32 337371 45355)
(string-bytes 1 12322013)
(vectors 16 148305)
(vector-slots 8 3180429 187121)
(floats 8 889 751)
(intervals 56 152845 1238)
(buffers 976 235)
(heap 1024 978725 465480))